6 Built-in Data Structures in R with Practical Examples

In R, most of the time you will be dealing with collections of data and not singular elements. We hold collections of data in data structures.

Data structures are objects in R that provide a method to arrange data in the desired format.

In this article, we will take a look at the data structures in R. We will learn what they are and what they are used for. We will also explore a few features and functions of these structures.

Let’s start with understanding the basics of data structures.

Introduction to Data Structures in R

R has six basic types of data structures. We can organize them according to their dimensions (1d, 2d, nd). We can also classify them as homogeneous or heterogeneous, depending on whether their contents can be of different types.

Homogeneous data structures are ones that can only store a single type of data (numeric, integer, character, etc.).

Heterogeneous data structures are ones that can store more than one type of data at the same time.

R has no 0-dimensional (scalar) type. A variable that holds a single value is a vector of length 1.
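A quick way to check this is to inspect a single value. The snippet below is a minimal sketch; the variable name x is only for illustration.

```r
x <- 5               # looks like a scalar

is.vector(x)         # TRUE: a single value is still a vector
length(x)            # 1
identical(x, c(5))   # TRUE: c(5) builds the very same object
```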

We discussed R data types in our previous article. Now we are going to look at R data structures in detail.

R has the following basic data structures:

  1. Vector
  2. List
  3. Matrix
  4. Data frame
  5. Array
  6. Factor

So, let’s not wait anymore and get to it!

1. Vectors

Vectors are single-dimensional, homogeneous data structures. To create a vector, use the c() function.

For example:

> vec <- c(1,2,3) # creates a vector named vec
> vec

Output:

[1] 1 2 3

The assign() function is another way to create a vector.

For example:

> assign("vec2", c(4,5,6))
> vec2

Output:

[1] 4 5 6

Vectors can hold values of a single data type. Thus, they can be numeric, logical, character, integer or complex vectors.

For example:

> numeric_vec <- c(1,2,3,4,5)
> integer_vec <- c(1L,2L,3L,4L,5L)
> logical_vec <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
> complex_vec <- c(12+2i, 3i, 4+1i, 5+12i, 6i)
> character_vec <- c("techvidvan", "this", "is", "a", "character vector")
> numeric_vec
> integer_vec
> logical_vec
> complex_vec
> character_vec

Output:

[1] 1 2 3 4 5

[1] 1 2 3 4 5

[1]  TRUE  TRUE FALSE FALSE FALSE

[1] 12+ 2i  0+ 3i  4+ 1i  5+12i  0+ 6i

[1] "techvidvan"       "this"
[3] "is"               "a"
[5] "character vector"

The code above creates five vectors, one of each type, holding the values shown in the output.

Note: We can pass values of different types to c(), but R converts them all to one common type to keep the vector homogeneous. This is called coercion.
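The following sketch shows coercion in action. The variable names mixed_num and mixed_chr are made up for this illustration.

```r
# Logical and integer values are promoted to double
mixed_num <- c(1L, 2.5, TRUE)
mixed_num          # 1.0 2.5 1.0
class(mixed_num)   # "numeric"

# As soon as one value is a string, everything becomes character
mixed_chr <- c(1, "a", TRUE)
mixed_chr          # "1" "a" "TRUE"
class(mixed_chr)   # "character"
```

R always picks the most general type present: logical values give way to integer, integer to numeric, numeric to complex, and everything to character.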

If you want to run these examples yourself, install R first. Our step-by-step R tutorial covers the installation.

2. Lists

Lists are heterogeneous data structures. They are very similar to vectors except they can store data of different types. To create a list, we use the list() function.

For example:

> test_list <- list(1, "hello", c(2,3,1), FALSE, 3+4i, 6L)
> test_list

Output:

[[1]]
[1] 1

[[2]]
[1] "hello"

[[3]]
[1] 2 3 1

[[4]]
[1] FALSE

[[5]]
[1] 3+4i

[[6]]
[1] 6

Lists are often called “recursive vectors” as you can store a list inside another list.

Example:

> test_list2<-list(list(1,"a",TRUE), list("b",45L,"c"), list(1,2))
> str(test_list2) #shows the structure of an object

Output:

List of 3
 $ :List of 3
  ..$ : num 1
  ..$ : chr "a"
  ..$ : logi TRUE
 $ :List of 3
  ..$ : chr "b"
  ..$ : int 45
  ..$ : chr "c"
 $ :List of 2
  ..$ : num 1
  ..$ : num 2

Note: If any argument to the c() function is a list, the result is a list. The other arguments are coerced to lists before they are combined.
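Here is a small sketch of that behaviour. The names combined and spread are only for illustration.

```r
# One list argument is enough to make the whole result a list
combined <- c(list(1, "a"), 2, "b")
class(combined)     # "list"
length(combined)    # 4
combined[[3]]       # 2, still numeric: list elements keep their own types

# A vector argument is split up, one list element per value
spread <- c(list(1), c(2, 3))
length(spread)      # 3
```

Unlike coercion inside an ordinary vector, the individual values keep their original types here, because each one becomes its own list element.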

3. Matrix

Matrices are two-dimensional, homogeneous data structures with rows and columns. All values in a matrix have to be of the same type, so coercion takes place if there is more than one data type.

By default, a matrix is filled in column-wise order. The basic syntax to create a matrix is:

matrix(data, nrow, ncol, byrow, dimnames)

Where data is the input values in the matrix given as a vector,

nrow is the number of rows,

ncol is the number of columns,

byrow is a logical value that tells the function to fill the matrix row-wise (it is FALSE by default),

dimnames is a list holding the row names and the column names of the matrix.

The following code will create a matrix with 3 columns (and therefore 3 rows) holding the values 1 to 9 in column-wise order.

For example:

> test_matrix1 <- matrix(c(1:9), ncol = 3)
> test_matrix1

Output:

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

An example of a matrix with row-names and column-names:

> row_names <- c("row1", "row2", "row3") # avoids masking the base rownames() function
> col_names <- c("col1", "col2", "col3")
> test_matrix2 <- matrix(c(1:9), ncol = 3, dimnames = list(row_names, col_names))
> test_matrix2

Output:

     col1 col2 col3
row1    1    4    7
row2    2    5    8
row3    3    6    9
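To see what the byrow argument changes, we can build the same values both ways. This is a minimal sketch; by_col and by_row are illustrative names.

```r
by_col <- matrix(1:6, nrow = 2)                 # default: fill column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # fill row by row

by_col[1, ]   # 1 3 5
by_row[1, ]   # 1 2 3
dim(by_row)   # 2 3: the shape is the same, only the fill order differs
```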

4. Data Frames

Data frames are two-dimensional, heterogeneous data structures. They are lists of vectors of equal lengths. Data frames have the following constraints placed upon them:

  1. A data frame must have column names, and each row must have a unique name.
  2. Each column should have the same number of items.
  3. Each item in a single column should be of the same type.
  4. Different columns can have different data types.

To create a data frame, use the data.frame() function.

For example:

> student_id <- c(1:5)
> student_name <- c("raj", "jacob", "iqbal", "shawn", "hitesh")
> student_rank <- c("third", "fifth", "second", "fourth", "first")
> student.data <- data.frame(student_id , student_name, student_rank)
> student.data

Output:

  student_id student_name student_rank
1          1          raj        third
2          2        jacob        fifth
3          3        iqbal       second
4          4        shawn       fourth
5          5       hitesh        first
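The sketch below rebuilds the same data frame and inspects it, to show that each column keeps its own type. Passing stringsAsFactors = FALSE is our own addition here, assumed so that the text columns stay character in every R version.

```r
student.data <- data.frame(
  student_id   = 1:5,
  student_name = c("raj", "jacob", "iqbal", "shawn", "hitesh"),
  student_rank = c("third", "fifth", "second", "fourth", "first"),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

sapply(student.data, class)    # "integer" "character" "character"
dim(student.data)              # 5 3: five rows, three columns
student.data$student_name[3]   # "iqbal"
is.list(student.data)          # TRUE: a data frame is a list of columns
```

The $ operator pulls out a single column as a vector, which we can then index like any other vector.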

5. Arrays

Arrays are n-dimensional, homogeneous data structures. A three-dimensional array can be pictured as a collection of matrices stacked one on top of the other in layers.

You can create an array using the array() function. Its syntax is:

array_name <- array(data, dim, dimnames)

Where array_name is the name of the array,

data is the data that is filled inside the array,

dim is a vector containing the dimensions of the array,

and dimnames is a list containing the names of the rows, columns, and matrices inside the array.

Here is an example of the array() function:

> arr1 <- array(c(1:18),dim=c(2,3,3))
> arr1

Output:

, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18
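The dimnames argument and indexing work much like they do for matrices, with one extra index for the layer. The following is a minimal sketch; the array name arr2 and the labels r1, c1, m1 and so on are made up for this illustration.

```r
arr2 <- array(1:18, dim = c(2, 3, 3),
              dimnames = list(c("r1", "r2"),          # row names
                              c("c1", "c2", "c3"),    # column names
                              c("m1", "m2", "m3")))   # matrix (layer) names

arr2[2, 3, 1]            # 6: row 2, column 3 of the first matrix
arr2["r1", "c2", "m2"]   # 9: the same kind of lookup, by name
dim(arr2[, , 3])         # 2 3: one whole layer is an ordinary matrix
```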

6. Factors

Factors are vectors that can only store predefined values. They are useful for storing categorical data. Factors have two attributes:

  • Class – which has the value “factor” and makes the object behave differently from a normal vector.
  • Levels – which is the set of allowed values.

You can create a factor using the factor() function.

For example:

> fac <- factor(c("a", "b", "a", "b", "b"))
> fac

Output:

[1] a b a b b
Levels: a b

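Factors can be created from both strings and integers. Internally, R stores a factor as a vector of integer codes and keeps the levels as character labels, so a factor is still homogeneous. Factors are useful for categorizing columns that hold a small set of unique values, like “TRUE” or “FALSE”, or “MALE” or “FEMALE”.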
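We can look inside a factor to see how it is stored and what happens with a value outside its levels. This is a minimal sketch; fac2 is a hypothetical copy used only for the last step.

```r
fac <- factor(c("a", "b", "a", "b", "b"))

class(fac)        # "factor"
levels(fac)       # "a" "b"
as.integer(fac)   # 1 2 1 2 2: the integer codes behind the labels

# Assigning a value that is not a level produces NA (R also raises a warning,
# which is suppressed here to keep the output clean)
fac2 <- fac
suppressWarnings(fac2[1] <- "c")
is.na(fac2[1])    # TRUE
```

To allow a new value, add it to the levels first, for example with levels(fac2) <- c(levels(fac2), "c").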

Summary

In this tutorial, we looked at the basic data structures available in R. R has many complex data structures. We can make them using these basic structures in different combinations and formations.

In R, almost every calculation is done on structures containing many values. Calculations on singular values are very rare.

Thus, these data structures are very important building blocks in the R programming language.



1 Response

  1. Kieryn says:

    Are factors heterogeneous or homogeneous? It says they can be either integers or strings – is that at the same time?
