Vectors and lists

When working with data, we often have lists or sequences of ‘things’. For example: a list of measurements we have made.

  • When all the things are of the same type, R calls this a vector (see note 1 below).

  • When there is a mix of different things, R calls this a list.

Vectors

We can create a vector of numbers and display it like this:

# this creates a vector of heights, in cm
heights <- c(203, 148, 156, 158, 167,
             162, 172, 164, 172, 187,
             134, 182, 175)

The c() command is shorthand for combine, so the example above combines the individual elements (numbers) into a new vector.
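Because c() combines whatever it is given, it can also join existing vectors together. The short sketch below illustrates this; the variable names first.group, second.group and all.heights are made up for this example.

```r
# two small vectors of heights, in cm
first.group <- c(203, 148, 156)
second.group <- c(158, 167)

# c() joins the two vectors and one extra value into a single new vector
all.heights <- c(first.group, second.group, 162)
all.heights           # 203 148 156 158 167 162
length(all.heights)   # 6
```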

We can create a vector of text values (names, in this case) just as easily:

names <- c("Ben", "Joe", "Sue", "Rosa")
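To see the "same type" rule at work, the sketch below checks the type of each vector with class() and shows what happens when types are mixed. The names mixed.vector and mixed.list are made up for this illustration, and the heights vector is shortened to keep the example small.

```r
# a vector holds values of a single type
heights <- c(203, 148, 156)
names <- c("Ben", "Joe", "Sue", "Rosa")

class(heights)   # "numeric"
class(names)     # "character"

# mixing types inside c() converts everything to one common type;
# here the numbers are turned into text
mixed.vector <- c(1, "two", 3)
class(mixed.vector)      # "character"

# a list keeps each element's own type
# (double square brackets pull a single item out of a list)
mixed.list <- list(1, "two", 3)
class(mixed.list[[1]])   # "numeric"
class(mixed.list[[2]])   # "character"
```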

And we can check the values stored in these variables by printing them. You can either type print(heights), or just write the name of the variable alone, which will print it by default. E.g.:

heights
 [1] 203 148 156 158 167 162 172 164 172 187 134 182 175

Try creating your own vector of numbers in a new code block below (see note 2 below) using the c(...) command. Then change the name of the variable you assign it to.

Accessing elements

Once we have created a vector, we often want to access the individual elements again. We do this based on their position.

Let’s say we have created a vector:

my.vector <- c(10, 20, 30, 40)

We can display the whole vector by just typing its name, as we saw above. But if we want to show only the first element of this vector, we type:

my.vector[1]
[1] 10

Here, the square brackets specify the subset of the vector we want; in this case, just the first element.
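The same pattern works for any position. As a small sketch, the example below picks out the third element, and uses length() to find the last element without having to count by hand.

```r
my.vector <- c(10, 20, 30, 40)

my.vector[3]                   # the third element: 30
length(my.vector)              # how many elements there are: 4
my.vector[length(my.vector)]   # the last element: 40
```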

Selecting more than one element

A neat feature of subsetting is that we can grab more than one element at a time.

To do this, we give R a vector containing the positions of the elements we want.

It might seem obvious, but the first element has position 1, the second has position 2, and so on. So, if we wanted to extract the 4th and 5th elements from the vector of heights we saw above we would type:

elements.to.grab <- c(4, 5)
heights[elements.to.grab]
[1] 158 167

We can also make a subset of the original vector and assign it to a new variable:

first.two.elements <- heights[c(1, 2)]
first.two.elements
[1] 203 148
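The vector of positions does not have to be in ascending order, and a position can appear more than once. A brief sketch using the same heights vector:

```r
heights <- c(203, 148, 156, 158, 167,
             162, 172, 164, 172, 187,
             134, 182, 175)

# elements come back in the order we ask for them
heights[c(5, 4)]       # 167 158

# the same position can be requested more than once
heights[c(1, 1, 13)]   # 203 203 175
```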

Making and slicing with sequences

One common task in R is to create sequences of numbers, letters or dates.

The simplest way of doing this is to define a range, with the colon:

onetoten <- 1:10
onetoten
 [1]  1  2  3  4  5  6  7  8  9 10

This creates a vector which can be sliced like any other:

onetoten[8]
[1] 8

One common use of sequences is to slice other vectors:

onetoten[1:3]
[1] 1 2 3

Or the first 10 values in the heights vector we defined above:

heights[1:10]
 [1] 203 148 156 158 167 162 172 164 172 187

The colon can also count backwards, and works with negative numbers too:

5:-5
 [1]  5  4  3  2  1  0 -1 -2 -3 -4 -5

When you need a sequence of non-consecutive numbers, or of numbers that aren’t whole, you can use the seq function:

seq(1,10,by=2)
[1] 1 3 5 7 9
seq(0, 1, by=.2)
[1] 0.0 0.2 0.4 0.6 0.8 1.0
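Sequences made with seq or with a backwards range can be used for slicing in the same way as 1:10. The sketch below picks every other height and then the first three heights in reverse order; the variable name every.other is made up for this example.

```r
heights <- c(203, 148, 156, 158, 167,
             162, 172, 164, 172, 187,
             134, 182, 175)

# positions 1, 3, 5, ... up to the length of the vector
every.other <- seq(1, length(heights), by = 2)
every.other            # 1 3 5 7 9 11 13
heights[every.other]   # 203 156 167 172 172 134 175

# a backwards range returns the elements in reverse order
heights[3:1]           # 156 148 203
```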

Conditional slicing

One neat feature of R is that you can create a sequence of TRUE or FALSE values, by asking whether each value in a sequence matches a particular condition. For example:

1:10 > 5
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

Re-using the heights vector from above, we can then use this to select values that are above the average:

heights > mean(heights)
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE
[12]  TRUE  TRUE

And we can use the vector of TRUE and FALSE values to select from the actual heights:

heights[heights > mean(heights)]
[1] 203 172 172 187 182 175
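A vector of TRUE and FALSE values is useful in other ways too. As a rough sketch, the example below counts how many heights are above average, finds their positions with which(), and combines two conditions with &. The variable name above.average and the cut-offs 160 and 180 are chosen only for illustration.

```r
heights <- c(203, 148, 156, 158, 167,
             162, 172, 164, 172, 187,
             134, 182, 175)

above.average <- heights > mean(heights)

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
sum(above.average)     # 6

# which() gives the positions of the TRUE values
which(above.average)   # 1 7 9 10 12 13

# & keeps only the elements where both conditions are TRUE
heights[heights > 160 & heights < 180]   # 167 162 172 164 172 175
```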

  1. It’s actually a matrix if it has 2 dimensions, like a table, or an array if it has more than 2 dimensions.

  2. i.e. edit the RMarkdown document