6 Inbuilt Data Structures in R with practical examples
In R, most of the time you will be dealing with collections of data and not singular elements. We hold collections of data in data structures.
Data structures are objects in R that provide a method to arrange data in the desired format.
In this article, we will take a look at the data structures in R. We will learn what they are and what their uses are. We will also explore a few features and functions of these structures.
Let’s start with understanding the basics of data structures.
Introduction to Data Structures in R
R has six basic types of data structures. We can organize these data structures according to their dimensions (1d, 2d, nd). We can also classify them as homogeneous or heterogeneous (whether or not their contents can be of different types).
Homogeneous data structures are ones that can only store a single type of data (numeric, integer, character, etc.).
Heterogeneous data structures are ones that can store more than one type of data at the same time.
R does not have a 0-dimensional, or scalar, type. Variables containing single values are vectors of length 1.
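As a quick check of this point, here is a minimal sketch (the variable name x is only illustrative) showing that a single value behaves like any other vector:

```r
# A "single value" in R is simply a vector of length 1.
x <- 5

is.vector(x)        # TRUE: x is a vector
length(x)           # 1: it holds exactly one element
x[1]                # indexing works as it does on any vector
identical(x, c(5))  # TRUE: c(5) builds the same object
```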
We discussed R data types in our previous article; now we are going to look at R data structures in detail.
R has the following basic data structures:
- Vector
- List
- Matrix
- Data frame
- Array
- Factor
So, let’s not wait anymore and get to it!
1. Vectors
Vectors are single-dimensional, homogeneous data structures. To create a vector, use the c() function.
> vec <- c(1,2,3) # creates a vector named vec
> vec
[1] 1 2 3
The assign() function is another way to create a vector.
> assign("vec2", c(4,5,6))
> vec2
[1] 4 5 6
Vectors can hold values of a single data type. Thus, they can be numeric, logical, character, integer or complex vectors.
> numeric_vec <- c(1,2,3,4,5)
> integer_vec <- c(1L,2L,3L,4L,5L)
> logical_vec <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
> complex_vec <- c(12+2i, 3i, 4+1i, 5+12i, 6i)
> character_vec <- c("techvidvan", "this", "is", "a", "character vector")
> numeric_vec
[1] 1 2 3 4 5
> integer_vec
[1] 1 2 3 4 5
> logical_vec
[1]  TRUE  TRUE FALSE FALSE FALSE
> complex_vec
[1] 12+ 2i  0+ 3i  4+ 1i  5+12i  0+ 6i
> character_vec
[1] "techvidvan"       "this"
[3] "is"               "a"
[5] "character vector"
The above code creates one vector of each type and prints its values.
Note: Technically, we can pass values of different types to c(), but R converts them to a common type to keep the vector homogeneous. This is called coercion.
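To see coercion in action, here is a small sketch (the variable names are only illustrative). R picks the most flexible type among the values, in the order logical, integer, double, complex, character:

```r
# Mixing types inside c() triggers coercion to one common type.
mixed_chr <- c(1, "a", TRUE)    # a character value is present
mixed_dbl <- c(TRUE, 2L, 3.5)   # a double value is present
mixed_int <- c(TRUE, 2L)        # logical and integer only

typeof(mixed_chr)  # "character"
typeof(mixed_dbl)  # "double"
typeof(mixed_int)  # "integer"

mixed_chr          # "1" "a" "TRUE"
mixed_int          # 1 2 (TRUE became 1)
```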
If you want to try this example on your own, I recommend installing R by following our step-by-step R tutorial.
2. Lists
Lists are heterogeneous data structures. They are very similar to vectors, except that they can store data of different types. To create a list, we use the list() function.
> test_list <- list(1, "hello", c(2,3,1), FALSE, 3+4i, 6L)
> test_list
[[1]]
[1] 1

[[2]]
[1] "hello"

[[3]]
[1] 2 3 1

[[4]]
[1] FALSE

[[5]]
[1] 3+4i

[[6]]
[1] 6

Lists are often called “recursive vectors” as you can store a list inside another list.
> test_list2 <- list(list(1,"a",TRUE), list("b",45L,"c"), list(1,2))
> str(test_list2) # shows the structure of an object
List of 3
 $ :List of 3
  ..$ : num 1
  ..$ : chr "a"
  ..$ : logi TRUE
 $ :List of 3
  ..$ : chr "b"
  ..$ : int 45
  ..$ : chr "c"
 $ :List of 2
  ..$ : num 1
  ..$ : num 2
Note: If the c() function has a list as an argument, the result will be a list. Other values inside the c() function will be coerced into lists themselves.
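The note above can be sketched as follows (the variable names are only illustrative). Each non-list value passed to c() becomes its own list element:

```r
# Combining a list with plain values yields a list.
combined <- c(list(1, "a"), 2, "b")

is.list(combined)   # TRUE
length(combined)    # 4: the two list elements plus 2 and "b"
str(combined)       # shows each element with its own type

# A vector with several elements is split into one list element per value.
split_up <- c(list(1), c(2, 3))
length(split_up)    # 3
```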
3. Matrices
Matrices are two-dimensional, homogeneous data structures with rows and columns. All values in a matrix have to be of the same type; if you supply more than one data type, coercion takes place.
By default, matrices are filled in column-wise order. The basic syntax to create a matrix is:
> matrix(data, nrow, ncol, byrow, dimnames)
Where data is the input values in the matrix given as a vector,
nrow is the number of rows,
ncol is the number of columns,
byrow is a logical value that tells the function to fill the matrix row-wise; by default it is set to FALSE,
dimnames is a list containing the names of the rows and columns.
The following code will create a matrix with 3 columns (and therefore 3 rows) holding the values 1 to 9 in column-wise order.
> test_matrix1 <- matrix(c(1:9), ncol = 3)
> test_matrix1
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
An example of a matrix with row names and column names:
> row_names <- c("row1", "row2", "row3") # named row_names to avoid masking the rownames() function
> col_names <- c("col1", "col2", "col3")
> test_matrix2 <- matrix(c(1:9), ncol = 3, dimnames = list(row_names, col_names))
> test_matrix2
     col1 col2 col3
row1    1    4    7
row2    2    5    8
row3    3    6    9
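To illustrate the byrow argument described earlier, here is a short sketch (the variable names are only illustrative) that fills the same values column-wise and row-wise:

```r
# Same data, two fill orders.
by_col <- matrix(1:6, nrow = 2)                # default: column-wise
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # row-wise

by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

dim(by_row)   # 2 3: two rows, three columns
by_row[2, 1]  # 4: element in row 2, column 1
```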
4. Data Frames
Data frames are two-dimensional, heterogeneous data structures. They are lists of vectors of equal lengths. Data frames have the following constraints placed upon them:
- A data frame must have column names, and each row should have a unique name.
- Each column should have the same number of items.
- Each item in a single column should be of the same type.
- Different columns can have different data types.
To create a data frame, use the data.frame() function.
> student_id <- c(1:5)
> student_name <- c("raj", "jacob", "iqbal", "shawn", "hitesh")
> student_rank <- c("third", "fifth", "second", "fourth", "first")
> student.data <- data.frame(student_id, student_name, student_rank)
> student.data
  student_id student_name student_rank
1          1          raj        third
2          2        jacob        fifth
3          3        iqbal       second
4          4        shawn       fourth
5          5       hitesh        first
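The constraints listed above can be checked on a small example. The following is a minimal sketch using a hypothetical data frame named scores; stringsAsFactors = FALSE is set explicitly so that the character column stays a character vector on older versions of R as well:

```r
# A small data frame whose columns have different types.
scores <- data.frame(
  id     = 1:3,
  name   = c("a", "b", "c"),
  passed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

is.list(scores)        # TRUE: a data frame is a list of columns
dim(scores)            # 3 3: three rows, three columns
sapply(scores, class)  # one type per column: integer, character, logical
scores$name            # a whole column as a vector
scores[2, "name"]      # a single cell: "b"
```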
5. Arrays
Arrays are multi-dimensional, homogeneous data structures. A three-dimensional array can be thought of as a collection of matrices stacked one on top of the other in layers.
You can create an array using the array() function. Its syntax is:
array_name <- array(data, dim, dimnames)
Where array_name is the name of the array,
data is the data that is filled inside the array,
dim is a vector containing the dimensions of the array,
and dimnames is a list containing the names of the rows, columns, and matrices inside the array.
Here is an example of the array() function:
> arr1 <- array(c(1:18), dim = c(2,3,3))
> arr1
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

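Elements of an array are addressed by row, column, and layer. Here is a minimal sketch on the same data (the name arr is only illustrative):

```r
# Same shape as the example above: 2 rows, 3 columns, 3 layers.
arr <- array(1:18, dim = c(2, 3, 3))

dim(arr)       # 2 3 3
arr[2, 3, 1]   # 6: row 2, column 3 of the first layer
arr[1, 2, 3]   # 15: row 1, column 2 of the third layer

# Leaving an index empty selects everything along that dimension,
# so this returns the whole second layer as a 2 x 3 matrix.
layer2 <- arr[, , 2]
is.matrix(layer2)  # TRUE
```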
6. Factors
Factors are vectors that can only store predefined values. They are useful for storing categorical data. Factors have two attributes:
- Class – which has the value “factor” and makes the object behave differently from a normal vector.
- Levels – which is the set of allowed values.
You can create a factor using the factor() function.
> fac <- factor(c("a", "b", "a", "b", "b"))
> fac
[1] a b a b b
Levels: a b
Factors can be created from both strings and integers. They are useful for categorizing columns that have a small set of unique values, like “TRUE” or “FALSE”, or “MALE” or “FEMALE”.
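The two attributes of a factor can be inspected directly. The following is a small sketch with a hypothetical factor named sizes; it also shows what happens when we try to store a value that is not one of the levels:

```r
# A factor with an explicit set (and order) of allowed values.
sizes <- factor(c("small", "large", "small", "medium"),
                levels = c("small", "medium", "large"))

class(sizes)       # "factor"
levels(sizes)      # "small" "medium" "large"
as.integer(sizes)  # 1 3 1 2: positions of each value in the levels
table(sizes)       # counts per level

# Assigning a value outside the levels gives NA (R also raises a warning,
# which is suppressed here to keep the output clean).
invalid <- sizes
suppressWarnings(invalid[2] <- "huge")
is.na(invalid[2])  # TRUE
```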
Summary
In this tutorial, we looked at the basic data structures available in R. R also has many more complex data structures, which we can build from these basic structures in different combinations.
In R, almost every calculation is done on structures containing many values. Calculations on singular values are very rare.
Thus, these data structures are very important building blocks in the R programming language.