# 6 Inbuilt Data Structures in R with practical examples

In R, most of the time you will be dealing with collections of data and not singular elements. We hold collections of data in data structures.

Data structures are objects in R that provide a method to arrange data in the desired format.

In this article, we will take a look at the data structures in R. We will learn what they are and what they are used for. We will also explore a few features and functions of these structures.

Let’s start with understanding the basics of data structures.

## Introduction to Data Structures in R

R has **six types** of basic data structures. We can organize these data structures according to their **dimensions** (1d, 2d, nd). We can also classify them as **homogeneous** or **heterogeneous** (whether or not their contents can be of different types).

Homogeneous data structures are ones that can only store a single type of data (numeric, integer, character, etc.).

Heterogeneous data structures are ones that can store more than one type of data at the same time.

R does not have a 0-dimensional, or scalar, type. Variables containing single values are vectors of length 1.
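A quick way to convince yourself of this is to inspect a single value in the console. The sketch below is only an illustration; the variable name x is arbitrary.

```r
# A "single value" in R is just a vector with one element.
x <- 42

is.vector(x)         # TRUE
length(x)            # 1
x[1]                 # indexing works exactly as it does on longer vectors
identical(x, c(42))  # TRUE: c(42) builds the same length-1 vector
```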

We have discussed **every concept of R data types** in our previous article. Now we are going to understand R data structures in detail.

R has the following basic data structures:

- Vector
- List
- Matrix
- Data frame
- Array
- Factor

So, let’s not wait anymore and get to it!

### 1. Vectors

Vectors are **single-dimensional, homogeneous** data structures. To create a vector, use the c() function.

**For example**:

> vec <- c(1,2,3) # creates a vector named vec

> vec

**Output**:

[1] 1 2 3

The assign() function is another way to create a vector.

**For example**:

> assign("vec2", c(4,5,6))

> vec2

**Output**:

[1] 4 5 6

Vectors can hold values of a single data type. Thus, they can be numeric, logical, character, integer or complex vectors.

**For example:**

> numeric_vec <- c(1,2,3,4,5)

> integer_vec <- c(1L,2L,3L,4L,5L)

> logical_vec <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

> complex_vec <- c(12+2i, 3i, 4+1i, 5+12i, 6i)

> character_vec <- c("techvidvan", "this", "is", "a", "character vector")

> numeric_vec

> integer_vec

> logical_vec

> complex_vec

> character_vec

**Output**:

[1] 1 2 3 4 5

[1] 1 2 3 4 5

[1] TRUE TRUE FALSE FALSE FALSE

[1] 12+ 2i 0+ 3i 4+ 1i 5+12i 0+ 6i

[1] "techvidvan" "this"

[3] "is" "a"

[5] "character vector"

The code above creates five vectors, one of each type, and prints their values.

**Note:** Technically, we can pass values of different types to c(), but R converts them to a common type to maintain the vector’s homogeneous nature. This is called **coercion**.
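Here is a small sketch of coercion in action. The variable names are made up for this illustration; the rule it shows is that R picks the most flexible type present (logical, then integer, then numeric, then character).

```r
# Mixing types in c() coerces every element to one common type.
mixed_num_chr <- c(1, "a", TRUE)     # a character value is present
mixed_lgl_num <- c(TRUE, 2L, 3.5)    # a double value is present
mixed_lgl_int <- c(TRUE, FALSE, 5L)  # only logical and integer values

class(mixed_num_chr)  # "character"
class(mixed_lgl_num)  # "numeric"
class(mixed_lgl_int)  # "integer"

mixed_num_chr         # "1" "a" "TRUE"
mixed_lgl_int         # 1 0 5 (TRUE becomes 1, FALSE becomes 0)
```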

If you want to try this example on your own, we recommend that you first install R by following our step-by-step R tutorial.

### 2. Lists

Lists are **heterogeneous** data structures. They are very similar to vectors except they can store data of different types. To create a list, we use the list() function.

**For example:**

> test_list <- list(1, "hello", c(2,3,1), FALSE, 3+4i, 6L)

> test_list

**Output**:

[[1]]

[1] 1

[[2]]

[1] "hello"

[[3]]

[1] 2 3 1

[[4]]

[1] FALSE

[[5]]

[1] 3+4i

[[6]]

[1] 6

Lists are often called “**recursive vectors**” as you can store a list inside another list.

**Example:**

> test_list2 <- list(list(1,"a",TRUE), list("b",45L,"c"), list(1,2))

> str(test_list2) # shows the structure of an object

**Output**:

List of 3

$ :List of 3

..$ : num 1

..$ : chr "a"

..$ : logi TRUE

$ :List of 3

..$ : chr "b"

..$ : int 45

..$ : chr "c"

$ :List of 2

..$ : num 1

..$ : num 2

**Note:** If the c() function has a list as an argument, the result will be a list. Other values inside the c() function will be coerced into lists themselves.
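The behaviour described in this note can be checked with a short sketch. The name combined is just an illustrative choice.

```r
# Combining a list with plain values: the plain values are turned into
# list elements, so the result is one flat list.
combined <- c(list(1, "a"), 2, "b")

is.list(combined)  # TRUE
length(combined)   # 4
str(combined)      # shows four elements: num, chr, num, chr
```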

### 3. Matrix

Matrices are **two-dimensional, homogeneous** data structures with rows and columns. All values in a matrix have to be of the same type, so coercion takes place if there is more than one data type.

By default, matrices are filled in **column-wise** order. The basic syntax to create a matrix is:

> matrix(data, nrow, ncol, byrow, dimnames)

Where **data** is the input values in the matrix given as a vector,

**nrow** is the number of rows,

**ncol** is the number of columns,

**byrow** is a logical value that tells the function to fill the matrix row-wise; by default, it is set to FALSE,

**dimnames** is a list containing the row names and the column names.

The following code will create a matrix with 3 columns (and therefore 3 rows), filled with the values 1 to 9 in column-wise order.

**For example:**

> test_matrix1 <- matrix(c(1:9), ncol = 3)

> test_matrix1

**Output**:

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 2 5 8

[3,] 3 6 9
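To see what the byrow argument changes, here is a small sketch that fills the same values both ways. The names by_col and by_row are made up for this illustration.

```r
# Same data, two fill orders.
by_col <- matrix(1:6, nrow = 2)                # default: fill column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # fill row by row

by_col[1, ]  # first row is 1 3 5
by_row[1, ]  # first row is 1 2 3
dim(by_row)  # 2 3 (ncol is worked out from the data length)
```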

An example of a matrix with row-names and column-names:

> row_names <- c("row1", "row2", "row3") # avoids masking the base rownames() function

> col_names <- c("col1", "col2", "col3") # avoids masking the base colnames() function

> test_matrix2 <- matrix(c(1:9), ncol = 3, dimnames = list(row_names, col_names))

> test_matrix2

**Output**:

col1 col2 col3

row1 1 4 7

row2 2 5 8

row3 3 6 9

### 4. Data Frames

Data frames are **two-dimensional, heterogeneous** data structures. They are lists of vectors of equal lengths. Data frames have the following constraints placed upon them:

- A data frame **must have column names**, and each row should have a **unique name**.
- Each column should have the **same number** of items.
- Each item in a single column should be of the **same type**.
- Different columns can have different data types.

To create a data frame, use the data.frame() function.

**For example:**

> student_id <- c(1:5)

> student_name <- c("raj", "jacob", "iqbal", "shawn", "hitesh")

> student_rank <- c("third", "fifth", "second", "fourth", "first")

> student.data <- data.frame(student_id, student_name, student_rank)

> student.data

**Output**:

student_id student_name student_rank

1 1 raj third

2 2 jacob fifth

3 3 iqbal second

4 4 shawn fourth

5 5 hitesh first
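The constraints listed earlier can be checked on a small data frame. The sketch below uses a shortened, hypothetical version of the student data (student_data, with only two columns). It passes stringsAsFactors = FALSE explicitly so that the names stay character vectors regardless of the R version.

```r
# A shortened, illustrative version of the student data (names are hypothetical).
student_data <- data.frame(
  student_id   = 1:3,
  student_name = c("raj", "jacob", "iqbal"),
  stringsAsFactors = FALSE  # keep names as character on every R version
)

sapply(student_data, class)      # one type per column: integer, character
dim(student_data)                # 3 rows, 2 columns
student_data$student_name        # a single column is an ordinary vector
student_data[2, "student_name"]  # "jacob"

# Columns whose lengths cannot be matched are rejected with an error.
result <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error: columns differ in length"
)
result
```

Because each column is a vector, everything said about vectors and coercion earlier applies within a column, while different columns stay free to hold different types.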

### 5. Arrays

Arrays are **multi-dimensional, homogeneous** data structures. A three-dimensional array can be pictured as a collection of matrices stacked one on top of the other in layers.

You can create an array using the array() function. Its syntax is:

> array_name <- array(data, dim, dimnames)

Where **array_name** is the name of the array,

**data** is the data that is filled inside the array,

**dim** is a vector containing the dimensions of the array,

and **dimnames** is a list containing the names of the rows, columns, and matrices inside the array.

Here is an **example** of the array() function:

> arr1 <- array(c(1:18), dim = c(2,3,3))

> arr1

**Output**:

, , 1

[,1] [,2] [,3]

[1,] 1 3 5

[2,] 2 4 6

, , 2

[,1] [,2] [,3]

[1,] 7 9 11

[2,] 8 10 12

, , 3

[,1] [,2] [,3]

[1,] 13 15 17

[2,] 14 16 18
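The dimnames argument and element access are easier to follow with a concrete case. This sketch builds an array with the same data and shape but with names added; arr2 and the row, column, and layer labels are hypothetical choices for illustration.

```r
# Same data and shape as before, now with names for each dimension.
arr2 <- array(
  1:18,
  dim = c(2, 3, 3),
  dimnames = list(
    c("row1", "row2"),
    c("col1", "col2", "col3"),
    c("layer1", "layer2", "layer3")
  )
)

dim(arr2)                       # 2 3 3
arr2[2, 3, 1]                   # 6: row 2, column 3 of the first layer
arr2["row1", "col2", "layer3"]  # 15: the same kind of lookup, by name
arr2[, , "layer2"]              # the whole second layer, a 2 x 3 matrix
```

Indexing takes one position per dimension, in the order row, column, layer. Leaving a position empty selects everything along that dimension.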

### 6. Factors

Factors are vectors that can only store predefined values. They are useful for storing **categorical data**. Factors have two attributes:

- **Class**: has a value of “factor”, which makes the object behave differently from a normal vector.
- **Levels**: the set of allowed values.

You can create a factor using the factor() function.

**For example:**

> fac <- factor(c("a", "b", "a", "b", "b"))

> fac

**Output**:

[1] a b a b b

Levels: a b

Factors can be created from both strings and integers. They are useful for columns that take only a few distinct values, such as “TRUE” or “FALSE”, or “MALE” or “FEMALE”.
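The two attributes, and the idea of “predefined values”, can be inspected directly. The sketch below reuses the factor from the example; fac2 is a hypothetical copy used to show what happens when a value outside the levels is assigned.

```r
fac <- factor(c("a", "b", "a", "b", "b"))

class(fac)       # "factor"
levels(fac)      # "a" "b"
as.integer(fac)  # 1 2 1 2 2: each value is stored as the position of its level
table(fac)       # counts per level: a = 2, b = 3

# Assigning a value that is not one of the levels gives NA (R also warns,
# which is silenced here to keep the output clean).
fac2 <- fac
suppressWarnings(fac2[1] <- "c")
is.na(fac2[1])   # TRUE
```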

## Summary

In this tutorial, we looked at the basic data structures available in R. R also has many more complex data structures, which are built by combining these basic structures in different ways.

In R, almost every calculation is done on structures containing many values. Calculations on single values are comparatively rare.

Thus, these data structures are very important building blocks in the R programming language.
