How to create, index and modify a Data Frame in R?

This TechVidvan article is designed to help you create, access, and modify data frames in R.

Data frames are lists that have the class “data.frame”. They are a special case of lists in which all the components are of equal length.

In this R tutorial, we will take a look at R data frames. We will understand their nature and role in R programming. We will learn how to create and navigate them. We will also learn about modifying them and much more.

So, without any delay, let’s begin!


What are data frames in R?

Data frames store data tables in R. When you import a dataset with a function such as read.csv(), R stores it as a data frame. In the simplest of terms, data frames are lists of vectors of equal length.

In a data frame, the columns represent component variables while the rows represent observations. While the most common use of data frames in R is to import datasets into them, we will start with creating our own data frame.
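The "list of equal-length vectors" description can be checked directly. The following is a minimal sketch; the names df and result are made up for this illustration.

```r
# A small data frame with two columns of equal length.
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

is.list(df)         # TRUE: a data frame is built on top of a list
class(df)           # "data.frame"
length(df)          # 2: one list component per column
sapply(df, length)  # every component (column) has length 3

# Columns whose lengths cannot be matched are rejected with an error.
result <- tryCatch(
  data.frame(x = 1:3, y = 1:2),
  error = function(e) "error: differing number of rows"
)
result
```

Here tryCatch() only catches the error so that the script keeps running; in an interactive session the data.frame() call would simply stop with an error message.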


How to create a data frame in R?

To create a data frame, we can use the data.frame() function. For example:

Code:

> vec1 <- c("pencil","pen","eraser","notebook","compass")
> vec2 <- c(TRUE,TRUE,FALSE,FALSE,TRUE)
> vec3 <- c(2.0, 5.0, 1.0, 20.0, 10.0)
> data <- data.frame(vec1,vec2,vec3, stringsAsFactors=FALSE)
> data

Output:

[Image: creating a data frame with data.frame()]

In this example, we used three vectors vec1, vec2, and vec3. Notice that the three vectors are of the same length. The data.frame() function converts the vectors into columns of data.

Note: In R versions before 4.0.0, the data.frame() function converts character variables into factors by default. To avoid this behavior, use the stringsAsFactors = FALSE argument. From R 4.0.0 onward, FALSE is already the default. In the next section, we will learn a few important functions that help us gain more information about a data frame.
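Before moving on, the effect of stringsAsFactors can be seen by setting it explicitly both ways. This is a small sketch with illustrative names (items, prices, as_text, as_factor); it also shows that naming the arguments of data.frame() sets the column names.

```r
items  <- c("pencil", "pen", "eraser")
prices <- c(2.0, 5.0, 1.0)

# Naming the arguments sets the column names directly.
as_text   <- data.frame(item = items, price = prices, stringsAsFactors = FALSE)
as_factor <- data.frame(item = items, price = prices, stringsAsFactors = TRUE)

class(as_text$item)     # "character"
class(as_factor$item)   # "factor"
levels(as_factor$item)  # the distinct values stored by the factor
```

Passing the argument explicitly makes the result the same on every R version, whatever the default is.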

Data Frames Functions

There are a few functions in R that give us important information about objects. These functions are very useful when working with data frames. Data frames are usually very large collections of data. It is easier to use functions that give insight into them instead of looking manually. Some of these functions are:

1. str() – The str() function shows the structure of a data frame. It tells us how many variables the data frame has, what their names and data types are, and how many observations there are.

Code:

> str(data)

Output:

[Image: output of str() on a data frame]

2. names() – The names() function returns the names of the variables in a data frame, that is, the names of all the columns. We can also use the names() function to change the names of the variables.

Code:

> names(data)

Code:

> names(data) <- c("item-name","in-stock","price")
> names(data)

Output:

[Image: output of names() on a data frame]

3. nrow() – The nrow() function returns the number of rows or observations in a data frame.

4. ncol() – The ncol() function returns the number of columns or variables in a data frame.

5. length() – The length() function returns the length of a data frame. Because a data frame is a list of columns, this is the same value that ncol() returns.

Code:

> nrow(data)

Code:

> ncol(data)

Code:

> length(data)

Output:

[Image: output of nrow(), ncol() and length()]

6. head() – The head() function returns the first n rows of a data frame.

7. tail() – The tail() function returns the last n rows of a data frame.

Code:

> head(data,2)

Code:

> tail(data,2)

Output:

[Image: output of head() and tail()]
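The functions above can be tried together on a small, self-contained data frame. This is only a sketch: the data frame stock and the variables first_two and last_two are invented for the example, and dim() is an extra base R function that is not covered in the list above.

```r
stock <- data.frame(
  item  = c("pencil", "pen", "eraser", "notebook", "compass"),
  price = c(2, 5, 1, 20, 10),
  stringsAsFactors = FALSE
)

str(stock)     # compact overview: 5 obs. of 2 variables, with their types
names(stock)   # "item" "price"
nrow(stock)    # 5
ncol(stock)    # 2
length(stock)  # 2, the same as ncol(stock)
dim(stock)     # 5 2 (rows first, then columns)

first_two <- head(stock, 2)  # rows 1 and 2
last_two  <- tail(stock, 2)  # rows 4 and 5
rownames(last_two)           # "4" "5": the original row names are kept
```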

How to access elements of a data frame in R?

To access a column of a data frame, we can use the square bracket [ ] or the double square brackets [[ ]] or the dollar sign $. For example:

Code:

> data["item-name"]
> data[["item-name"]]
> data$`item-name`

Output:

[Image: indexing the columns of a data frame]

We can provide indices for rows and columns to access specific elements. For example:

Code:

> data[1:3,] #first three rows of data
> data[1:4,1] #first to fourth row and first column of data
> data[1:4,1, drop=FALSE] #first to fourth row and first column of data, kept as a data frame

Output:

[Image: indexing the elements of a data frame]

We can also use conditions as queries to select only certain rows from the data frame. For example, to find the items whose price is more than 5:

Code:

> data[data$price >5,]

Output:

[Image: indexing a data frame with conditionals]
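The access methods in this section differ mainly in what they return. The sketch below makes that visible on a small example data frame; stock and the result variables are names chosen for this illustration.

```r
stock <- data.frame(
  item  = c("pencil", "pen", "eraser", "notebook", "compass"),
  price = c(2, 5, 1, 20, 10),
  stringsAsFactors = FALSE
)

col_df  <- stock["price"]    # single brackets: a one-column data frame
col_vec <- stock[["price"]]  # double brackets: the underlying vector
col_dol <- stock$price       # dollar sign: also the underlying vector

class(col_df)                # "data.frame"
class(col_vec)               # "numeric"
identical(col_vec, col_dol)  # TRUE

dropped <- stock[1:4, 1]                # simplified to a character vector
kept    <- stock[1:4, 1, drop = FALSE]  # still a data frame with 4 rows

# A logical condition in the row position keeps the rows where it is TRUE.
expensive <- stock[stock$price > 5, ]
expensive$item               # "notebook" "compass"
```

Using drop = FALSE is a reasonable habit when later code expects a data frame, because selecting a single column otherwise returns a plain vector.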

How to change values in an R data frame?

Modifying the values

We can modify a data frame using indexing techniques and reassignment. For example:

Code:

> data

Code:

> data[3,"in-stock"] <- TRUE
> data

Output:

[Image: modifying elements of a data frame]
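The same indexing-plus-assignment pattern also works for several cells at once. The following sketch uses an invented data frame named stock, with syntactic column names so that no backticks are needed.

```r
stock <- data.frame(
  item     = c("pencil", "pen", "eraser"),
  in_stock = c(TRUE, TRUE, FALSE),
  price    = c(2, 5, 1),
  stringsAsFactors = FALSE
)

# Change one cell: row 3 of the in_stock column.
stock[3, "in_stock"] <- TRUE

# Change every row that matches a condition: raise prices below 3 to 3.
stock$price[stock$price < 3] <- 3

# Overwrite a whole column using a computation on the old values.
stock$price <- stock$price * 2

stock$price  # 6 10 6
```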

Adding rows

We can add rows to a data frame by using the rbind() function. For example:

Code:

> new_row <- list("crayons",TRUE,20.0)
> data <- rbind(data,new_row)
> data

Output:

[Image: adding a row with rbind()]
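As an alternative to an unnamed list, the new row can be built as a one-row data frame with the same column names. This is a sketch under that assumption; stock and new_row here are example objects, separate from the data frame used above.

```r
stock <- data.frame(
  item  = c("pencil", "pen"),
  price = c(2, 5),
  stringsAsFactors = FALSE
)

# A one-row data frame with the same column names as stock.
new_row <- data.frame(item = "crayons", price = 20, stringsAsFactors = FALSE)

# rbind() matches the columns of the two data frames and appends the row.
stock <- rbind(stock, new_row)

nrow(stock)  # 3
stock$item   # "pencil" "pen" "crayons"
```

Naming the values makes it harder to put a value in the wrong column by accident, which is easy to do with a positional list.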

Adding columns

Similarly, we can add a column by using the cbind() function. For example:

Code:

> quantity <- c(100,60,80,0,30,25)
> data <- cbind(data,quantity)
> data

Output:

[Image: adding a column with cbind()]
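Two variations are worth knowing: cbind() accepts a named argument that sets the new column's name, and assigning to a column name that does not exist yet also adds a column. The sketch below shows both on an example data frame; the column value is made up for the illustration.

```r
stock <- data.frame(
  item  = c("pencil", "pen", "eraser"),
  price = c(2, 5, 1),
  stringsAsFactors = FALSE
)

# cbind() with a named argument sets the new column's name explicitly.
stock <- cbind(stock, quantity = c(100, 60, 80))

# Assigning to a column name that does not exist yet also adds a column.
stock$value <- stock$price * stock$quantity

names(stock)  # "item" "price" "quantity" "value"
stock$value   # 200 300 80
```

In both cases the new column needs one value per existing row.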

Deleting columns

We can delete columns of a data frame by reassigning them as NULL.

Code:

> data$quantity <- NULL
> data

Output:

[Image: deleting columns of a data frame]
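Assigning NULL changes the data frame in place. When the original should stay untouched, one option is to select the columns to keep instead. The sketch below shows both approaches; stock, only_item and no_price are illustrative names.

```r
stock <- data.frame(
  item     = c("pencil", "pen", "eraser"),
  price    = c(2, 5, 1),
  quantity = c(100, 60, 80),
  stringsAsFactors = FALSE
)

# Assigning NULL removes the column in place.
stock$quantity <- NULL
names(stock)     # "item" "price"

# Selecting the columns to keep returns a new data frame instead.
only_item <- stock[, "item", drop = FALSE]

# setdiff() builds the vector of names to keep when dropping by name.
no_price <- stock[, setdiff(names(stock), "price"), drop = FALSE]
names(no_price)  # "item"
```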

Deleting rows

We can also remove rows from a data frame through reassignment. For example:

Code:

> data <- data[-6,]
> data

Output:

[Image: deleting rows of a data frame]

Summary

Data frames are among the most used data structures in R when it comes to data analysis and data science. They are two-dimensional data structures that store tabular data whose columns can have different types.

Today, we learned about the data frames in the R programming language. We learned how to create, access and manipulate them. We also looked at some essential functions in dealing with R data frames.

I hope our R data frame article helped you solve your problems.

Keep Learning!!
