How to create, index and modify a Data Frame in R?
This TechVidvan article is designed to help you create, access, and modify data frames in R.
Data frames are lists that have the class "data.frame". They are a special case of lists where all the components are of equal length.
In this R tutorial, we will take a look at R data frames. We will understand their nature and role in R programming. We will learn how to create and navigate them. We will also learn about modifying them and much more.
So, without any delay, let’s begin!
What are data frames in R?
Data frames store tabular data in R. When you import a dataset with a function such as read.csv(), R returns it as a data frame. In the simplest of terms, a data frame is a list of vectors of equal length.
In a data frame, the columns represent component variables while the rows represent observations. While the most common use of data frames in R is to import datasets into them, we will start with creating our own data frame.
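The list-of-vectors nature of a data frame can be checked directly. The sketch below uses a small, made-up data frame named df (it is not one of this article's examples) and only base R functions.

```r
# A small, hypothetical data frame used only for illustration
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

class(df)      # the class attribute is "data.frame"
is.list(df)    # TRUE: a data frame is built on top of a list
lengths(df)    # every component (column) has the same length
dim(df)        # number of rows (observations) and columns (variables)
```

Each column is one component of the list, which is why all of them must have the same length.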
How to create a data frame in R?
To create a data frame, we can use the data.frame() function. For example:
> vec1 <- c("pencil", "pen", "eraser", "notebook", "compass")
> vec2 <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
> vec3 <- c(2.0, 5.0, 1.0, 20.0, 10.0)
> data <- data.frame(vec1, vec2, vec3, stringsAsFactors = FALSE)
> data
In this example, we used three vectors vec1, vec2, and vec3. Notice that the three vectors are of the same length. The data.frame() function converts the vectors into columns of data.
Note: In R versions before 4.0.0, the data.frame() function converts character vectors into factors by default. To avoid this behavior, use the stringsAsFactors = FALSE argument. Since R 4.0.0, FALSE is the default, so the argument is only needed for older versions.
In the next section, we will learn a few important functions that help us gain more information about a data frame.
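To see what the stringsAsFactors argument changes, we can build the same column both ways and compare the resulting classes. This is a minimal sketch; the vector items and the data frames as_chr and as_fct are hypothetical names used only here.

```r
# Hypothetical vector, used only for this illustration
items <- c("pencil", "pen", "eraser")

# Setting the argument explicitly gives the same result in every R version
as_chr <- data.frame(items, stringsAsFactors = FALSE)
as_fct <- data.frame(items, stringsAsFactors = TRUE)

class(as_chr$items)   # "character"
class(as_fct$items)   # "factor"
levels(as_fct$items)  # the distinct values stored by the factor
```

Keeping text as character is usually more convenient when new values will be added later, because a factor only accepts values that are already among its levels.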
Data Frame Functions
There are a few functions in R that give us important information about objects. These functions are very useful when working with data frames. Data frames are usually very large collections of data. It is easier to use functions that give insight into them instead of looking manually. Some of these functions are:
str() – The str() function tells us the structure of a data frame. It shows how many observations and variables the data frame has, the name and data type of each variable, and the first few values of each.
names() – The names() function returns the names of the variables in a data frame, that is, the names of all the columns. We can also use the names() function to change the names of the variables.
> names(data) <- c("item-name", "in-stock", "price")
> names(data)
nrow() – The nrow() function returns the number of rows or observations in a data frame.
ncol() – The ncol() function returns the number of columns or variables in a data frame.
length() – The length() function returns the length of a data frame, which is the same as the number of columns returned by ncol().
head() – The head() function returns the first n rows of a data frame (six by default).
tail() – The tail() function returns the last n rows of a data frame (six by default).
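The functions listed above can be tried together on a small data frame. The sketch below rebuilds the example data under a hypothetical name, stock, with illustrative column names, so that it runs on its own.

```r
# Rebuild the example data under a hypothetical name, `stock`
stock <- data.frame(
  item     = c("pencil", "pen", "eraser", "notebook", "compass"),
  in_stock = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  price    = c(2, 5, 1, 20, 10),
  stringsAsFactors = FALSE
)

str(stock)          # compact summary: dimensions, column names and types
names(stock)        # column names
nrow(stock)         # 5 observations
ncol(stock)         # 3 variables
length(stock)       # also 3, because a data frame is a list of columns
head(stock, n = 2)  # first two rows
tail(stock, n = 2)  # last two rows
```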
How to access Elements of data frame in R?
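To access a column of a data frame, we can use single square brackets [ ], double square brackets [[ ]], or the dollar sign $. Single brackets return a data frame containing the selected column, while [[ ]] and $ return the column itself as a vector. For example: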
> data["item-name"]
> data[["item-name"]]
> data$`item-name`
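We can also select rows and individual elements by giving a row index and a column index inside the square brackets:
> data[1:3,] #first three rows of data
> data[1:4,1] #first to fourth rows of the first column, returned as a vector
> data[1:4,1, drop=FALSE] #same selection, kept as a data frame because drop=FALSE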
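The indexing forms differ mainly in what kind of object they return. A short sketch, using a hypothetical three-row data frame named stock, makes the difference visible with class().

```r
# Hypothetical data frame for illustration
stock <- data.frame(
  item  = c("pencil", "pen", "eraser"),
  price = c(2, 5, 1),
  stringsAsFactors = FALSE
)

by_single <- stock["price"]     # single brackets keep the data frame
by_double <- stock[["price"]]   # double brackets extract the column itself
by_dollar <- stock$price        # same result as [[ ]]

class(by_single)                 # "data.frame"
class(by_double)                 # "numeric"
identical(by_double, by_dollar)  # TRUE

class(stock[1:2, 1])                # "character": one column collapses to a vector
class(stock[1:2, 1, drop = FALSE])  # "data.frame": drop = FALSE prevents that
```

Using drop = FALSE is helpful in code that expects a data frame regardless of how many columns are selected.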
To find the items whose price is more than 5, we can use a logical condition as the row index:
> data[data$price > 5, ]
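Filtering of this kind works because the condition produces a logical vector with one value per row, and only the rows marked TRUE are kept. The sketch below shows that intermediate vector, using a hypothetical data frame named stock with illustrative column names.

```r
# Hypothetical copy of the example data
stock <- data.frame(
  item     = c("pencil", "pen", "eraser", "notebook", "compass"),
  in_stock = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  price    = c(2, 5, 1, 20, 10),
  stringsAsFactors = FALSE
)

expensive <- stock$price > 5   # logical vector, one value per row
expensive                      # FALSE FALSE FALSE TRUE TRUE

stock[expensive, ]                         # rows where the condition is TRUE
stock[stock$price > 5 & stock$in_stock, ]  # conditions can be combined with &
stock[stock$price > 5, "item"]             # only the item names of matching rows
```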
How to change values in the R data frame?
Modifying the values
We can modify a data frame using indexing techniques and reassignment. For example:
> data[3, "in-stock"] <- TRUE
> data
We can add rows to a data frame by using the rbind() function. For example:
> new_row <- list("crayons", TRUE, 20.0)
> data <- rbind(data, new_row)
> data
Similarly, we can add a column by using the cbind() function. The new column must have one value for each row of the data frame. For example:
> quantity <- c(100, 60, 80, 0, 30, 25)
> data <- cbind(data, quantity)
> data
We can delete columns of a data frame by assigning NULL to them.
> data$quantity <- NULL
> data
We can also remove rows from a data frame by reassigning it with a negative row index. For example:
> data <- data[-6, ]
> data
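One way to keep track of these modifications is to check the dimensions after each step. The following is a minimal, self-contained sketch on a hypothetical two-row data frame named stock; it uses the same techniques as above.

```r
# Hypothetical starting data
stock <- data.frame(
  item  = c("pencil", "pen"),
  price = c(2, 5),
  stringsAsFactors = FALSE
)
dim(stock)                                        # 2 rows, 2 columns

stock[2, "price"] <- 6                            # modify a single value
stock <- rbind(stock, list("eraser", 1))          # add a row
stock <- cbind(stock, quantity = c(100, 60, 80))  # add a column, one value per row
dim(stock)                                        # 3 rows, 3 columns

stock$quantity <- NULL                            # delete the column again
stock <- stock[-3, ]                              # drop the third row
dim(stock)                                        # back to 2 rows, 2 columns
```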
Data frames are among the most widely used data structures in R for data analysis and data science. They are two-dimensional structures that store tabular data whose columns can have different types.
Today, we learned about the data frames in the R programming language. We learned how to create, access and manipulate them. We also looked at some essential functions in dealing with R data frames.
I hope our R data frame article helped you solve your problems.