## Learning Objectives

- Describe what a vector is.
- Create vectors of different data types.
- Use indexing to subset and modify specific portions of vectors.
- Understand how to use vectorized functions to avoid loop operations.

## Suggested readings

- Chapter 20 of “R for Data Science”, by Garrett Grolemund and Hadley Wickham
- Chapter 5.1 of “Hands-On Programming with R”, by Garrett Grolemund

So far we’ve only dealt with objects that contain one value (e.g. `x <- 1`), but R actually stores those values in a *vector* of length one:

```
x <- 1
length(x)
```

`## [1] 1`

`is.vector(x)`

`## [1] TRUE`

A *vector* is a basic data structure in R. All elements in a vector must have the same type.

The most basic way of creating a vector is to use the `c()` function (“c” is for “concatenate”):

```
x <- c(1, 2, 3)
length(x)
```

`## [1] 3`

As we saw in the loops lesson, you can also create vectors of sequences using the `:` operator or the `seq()` function:

`seq(1, 10)`

`## [1] 1 2 3 4 5 6 7 8 9 10`

`1:5`

`## [1] 1 2 3 4 5`
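`seq()` can do more than count by ones. As a small sketch (the variable names `odds` and `quarters` are only illustrative), its `by` argument sets the step size, and its `length.out` argument asks for a fixed number of evenly spaced values:

```r
# Count from 1 to 9 in steps of 2
odds <- seq(1, 9, by = 2)
odds      # 1 3 5 7 9

# Ask for exactly 5 evenly spaced values between 0 and 1
quarters <- seq(0, 1, length.out = 5)
quarters  # 0.00 0.25 0.50 0.75 1.00
```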

You can also create a vector by using the `rep()` function, which replicates the same value `n` times:

```
y <- rep(5, 10) # The number 5 ten times
z <- rep(10, 5) # The number 10 five times
```

`y`

`## [1] 5 5 5 5 5 5 5 5 5 5`

`z`

`## [1] 10 10 10 10 10`

In fact, you can use the `rep()` function to create longer vectors made up of repeated vectors:

`rep(c(1, 2), 3) # Repeat the vector c(1, 2) three times`

`## [1] 1 2 1 2 1 2`

If you add the `each` argument, `rep()` will repeat each element in the vector:

`rep(c(1, 2), each = 3) # Repeat each element of the vector c(1, 2) three times`

`## [1] 1 1 1 2 2 2`
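The two styles of repetition can also be combined. The sketch below (with made-up variable names `pattern` and `uneven`) uses the `times` and `each` arguments together, and then passes a vector to `times` to repeat each element a different number of times:

```r
# each is applied first, then the whole result is repeated `times` times
pattern <- rep(c(1, 2), times = 2, each = 3)
pattern  # 1 1 1 2 2 2 1 1 1 2 2 2

# A vector of counts repeats each element its own number of times
uneven <- rep(c('a', 'b'), times = c(3, 1))
uneven   # "a" "a" "a" "b"
```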

You can see how long a vector is using the `length()` function:

`length(y)`

`## [1] 10`

`length(z)`

`## [1] 5`

Each element in a vector must have the **same type**. If you mix types in a vector, R will *coerce* all the elements to either a numeric or character type.

If a vector has a *single* character element, R makes everything a **character**:

`c(1, 2, "3")`

`## [1] "1" "2" "3"`

`c(TRUE, FALSE, "TRUE")`

`## [1] "TRUE" "FALSE" "TRUE"`

If a vector has numeric and logical elements, R makes everything a **number**:

`c(1, 2, TRUE, FALSE)`

`## [1] 1 2 1 0`

If a vector has integers and floats (which R calls *doubles*), R makes everything a **double**:

`c(1L, 2, pi)`

`## [1] 1.000000 2.000000 3.141593`
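One way to check these rules yourself is to ask R for the type of a mixed vector with `typeof()`. The following is a minimal sketch of the coercion order (logical, then integer, then double, then character); the variable `converted` is just an illustrative name. It also shows going the other way with an explicit conversion function:

```r
typeof(c(TRUE, FALSE))   # "logical"
typeof(c(TRUE, 2L))      # "integer"
typeof(c(2L, 3.5))       # "double"
typeof(c(3.5, "a"))      # "character"

# Explicitly convert strings that look like numbers back to numbers
converted <- as.numeric(c("1", "2", "3"))
typeof(converted)        # "double"
```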

You can clear a vector by assigning `NULL` to it (the name still exists afterward but holds `NULL`; use `rm()` to remove the object entirely):

```
x <- seq(1, 10)
x
```

`## [1] 1 2 3 4 5 6 7 8 9 10`

```
x <- NULL
x
```

`## NULL`
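Assigning `NULL` leaves the name defined, whereas `rm()` removes it. To see the difference, here is a small sketch using `exists()`; the names `scratch`, `still_defined`, and `gone` are made up for illustration:

```r
scratch <- seq(1, 10)
scratch <- NULL

# The name is still defined; it just holds NULL, which has length 0
still_defined <- exists("scratch", inherits = FALSE)
still_defined     # TRUE
is.null(scratch)  # TRUE
length(scratch)   # 0

# rm() removes the object itself
rm(scratch)
gone <- !exists("scratch", inherits = FALSE)
gone              # TRUE
```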

As we saw in the loops lesson, you can create a vector of integers using the `:` operator or the `seq()` function:

`1:10`

`## [1] 1 2 3 4 5 6 7 8 9 10`

`seq(1, 10)`

`## [1] 1 2 3 4 5 6 7 8 9 10`

Numeric vectors don’t all have to be integers though - they can be any number:

```
v <- c(pi, 7, 42, 365)
v
```

`## [1] 3.141593 7.000000 42.000000 365.000000`

`typeof(v)`

`## [1] "double"`

R has many built-in functions that are designed to give *summary* information about numeric vectors. Note that these functions take a vector of numbers and return a *single value*. Here are some common ones:

Function | Description | Example |
---|---|---|
`mean(x)` | Mean of values in `x` | `mean(c(1,2,3,4,5))` returns `3` |
`median(x)` | Median of values in `x` | `median(c(1,2,2,4,5))` returns `2` |
`max(x)` | Max element in `x` | `max(c(1,2,3,4,5))` returns `5` |
`min(x)` | Min element in `x` | `min(c(1,2,3,4,5))` returns `1` |
`sum(x)` | Sums the elements in `x` | `sum(c(1,2,3,4,5))` returns `15` |
`prod(x)` | Product of the elements in `x` | `prod(c(1,2,3,4,5))` returns `120` |
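Because each of these functions returns a single value, the results can be stored and combined like any other number. A brief sketch, using a hypothetical vector called `scores`:

```r
scores <- c(1, 2, 2, 4, 5)

avg    <- mean(scores)                 # 2.8
middle <- median(scores)               # 2
spread <- max(scores) - min(scores)    # 4
total  <- sum(scores)                  # 14

# Each result is itself a vector of length one
length(avg)                            # 1
```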

Character vectors are vectors where each element is a string:

```
stringVector <- c('oh', 'what', 'a', 'beautiful', 'morning')
stringVector
```

`## [1] "oh" "what" "a" "beautiful" "morning"`

`typeof(stringVector)`

`## [1] "character"`

As we’ll see in the next lesson on strings, you can “collapse” a character vector into a single string using the `str_c()` function from the `stringr` package:

```
library(stringr)
str_c(stringVector, collapse = ' ')
```

`## [1] "oh what a beautiful morning"`
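If `stringr` isn't installed, base R's `paste()` function has a `collapse` argument that should give the same result here. A minimal sketch (the variable name `sentence` is illustrative):

```r
stringVector <- c('oh', 'what', 'a', 'beautiful', 'morning')

# Join all elements into one string, separated by spaces
sentence <- paste(stringVector, collapse = ' ')
sentence          # "oh what a beautiful morning"
length(sentence)  # 1
```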

Logical vectors contain only `TRUE` or `FALSE` elements:

```
logicalVector <- c(rep(TRUE, 3), rep(FALSE, 3))
logicalVector
```

`## [1] TRUE TRUE TRUE FALSE FALSE FALSE`

If you add a numeric type to a logical vector, the logical elements will be converted to either a `1` for `TRUE` or `0` for `FALSE`:

`c(logicalVector, 42)`

`## [1] 1 1 1 0 0 0 42`
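This conversion is also what makes it possible to count with logical vectors. As a short sketch (the names `n_big` and `share_big` are made up), `sum()` of a logical vector gives the number of `TRUE` values and `mean()` gives the proportion:

```r
x <- seq(1, 10)

# x > 5 is a logical vector; TRUE counts as 1 and FALSE as 0
n_big <- sum(x > 5)       # 5 elements are greater than 5
share_big <- mean(x > 5)  # 0.5, i.e. half of the elements
```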

**Warning**: If you add a character type to a logical vector, the logical elements will be converted to strings of `"TRUE"` and `"FALSE"`. So even though they may still *look* like logical types, they aren’t:

```
y <- c(logicalVector, 'string')
y
```

`## [1] "TRUE" "TRUE" "TRUE" "FALSE" "FALSE" "FALSE" "string"`

`typeof(y)`

`## [1] "character"`

If you want to check if two vectors are identical (in that they contain all the same elements), you can’t use the typical `==` operator by itself. This is because the `==` operator is performed element-wise, so it will return a logical vector:

```
x <- c(1,2,3)
y <- c(1,2,3)
x == y
```

`## [1] TRUE TRUE TRUE`

Instead of getting one `TRUE`, you get a vector of `TRUE`s, because the individual elements are indeed equal. To check whether *all* the elements in the two vectors are identical, wrap the comparison inside the `all()` function:

`all(x == y)`

`## [1] TRUE`

Keep in mind that there are really two steps going on here: 1) `x == y` creates a logical vector of `TRUE`s and `FALSE`s based on element-wise comparisons, and 2) the `all()` function checks whether all of the values in the logical vector are `TRUE`.

You can also use the `all()` function to check whether other types of conditions are `TRUE` for all elements in two vectors:

```
a <- c(1,2,3)
b <- -1*c(1,2,3)
all(a > b)
```

`## [1] TRUE`

In contrast to the `all()` function, the `any()` function will return `TRUE` if *any* of the elements in a vector are `TRUE`:

```
a <- c(1,2,3)
b <- c(-1,2,-3)
a == b
```

`## [1] FALSE TRUE FALSE`

`any(a == b)`

`## [1] TRUE`

For most situations, the `all()` function works just fine for comparing vectors, but it only compares the *elements* in the vectors, not their *attributes*. In some situations, you might also want to check whether the attributes of the vectors, such as their *names* and *data types*, are also the same. In this case, you should use the `identical()` function.

```
names(x) <- c('a', 'b', 'c')
names(y) <- c('one', 'two', 'three')
all(x == y) # Only compares the elements
```

`## [1] TRUE`

`identical(x, y) # Also compares the names of the elements`

`## [1] FALSE`

Notice that for the `identical()` function, you don’t need to add a conditional statement - you just provide it the two vectors you want to compare. This is because `identical()` by definition compares whether two things are the same.
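Data types matter to `identical()` as well. In this small sketch (the names `dbl` and `int` are illustrative), two vectors hold the same values but one is stored as doubles and the other as integers:

```r
dbl <- c(1, 2, 3)  # type "double"
int <- 1:3         # type "integer"

all(dbl == int)                  # TRUE: the values match
identical(dbl, int)              # FALSE: the types differ
identical(dbl, as.numeric(int))  # TRUE once the types match
```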

You can access elements from a vector using brackets `[]` and indices inside the brackets. You can use integer indices (probably the most common way), character indices (by naming each element), and logical indices.

Vector indices start from 1 (this is important - most programming languages start from 0):

```
x <- seq(1, 10)
x[1] # Returns the first element
```

`## [1] 1`

`x[3] # Returns the third element`

`## [1] 3`

`x[length(x)] # Returns the last element`

`## [1] 10`

You can access multiple elements by using a vector of indices inside the brackets:

`x[c(1:3)] # Returns the first three elements`

`## [1] 1 2 3`

`x[c(2, 7)] # Returns the 2nd and 7th elements`

`## [1] 2 7`

You can also use negative integers to *remove* elements, which returns all elements except those specified:

`x[-1] # Returns everything except the first element`

`## [1] 2 3 4 5 6 7 8 9 10`

`x[-c(2, 7)] # Returns everything except the 2nd and 7th elements`

`## [1] 1 3 4 5 6 8 9 10`

But you cannot mix positive and negative integers while indexing:

`x[c(-2, 7)]`

`## Error in x[c(-2, 7)]: only 0's may be mixed with negative subscripts`

If you try to use a float as an index, it gets truncated toward zero, so a positive index is rounded **down** to the nearest integer:

`x[3.1415] # Returns the 3rd element`

`## [1] 3`

`x[3.9999] # Still returns the 3rd element`

`## [1] 3`

You can name the elements in a vector and then use those names to access elements. To create a named vector, use the `names()` function (the vector of names must be the same length as the vector itself):

```
x <- seq(5)
names(x) <- c('a', 'b', 'c', 'd', 'e')
x
```

```
## a b c d e
## 1 2 3 4 5
```

You can also create a named vector by putting the names directly in the `c()` function:

```
x <- c('a' = 1, 'b' = 2, 'c' = 3, 'd' = 4, 'e' = 5)
x
```

```
## a b c d e
## 1 2 3 4 5
```

Once your vector has names, you can then use those names as indices:

`x['a'] # Returns the first element`

```
## a
## 1
```

`x[c('a', 'c')] # Returns the 1st and 3rd elements`

```
## a c
## 1 3
```

When using a logical vector for indexing, the elements at the positions where the logical vector is `TRUE` are returned. This is helpful for filtering vectors based on conditions:

```
x <- seq(1, 10)
x > 5 # Create logical vector
```

`## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE`

`x[x > 5] # Put logical vector in brackets to keep only the elements where it is TRUE`

`## [1] 6 7 8 9 10`
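Conditions can be combined before they are used as an index. The sketch below uses the element-wise operators `&` (and) and `|` (or); the result names `middle`, `extremes`, and `evens` are made up for illustration:

```r
x <- seq(1, 10)

middle   <- x[x > 3 & x < 8]   # 4 5 6 7
extremes <- x[x < 3 | x > 8]   # 1 2 9 10
evens    <- x[x %% 2 == 0]     # 2 4 6 8 10
```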

You can also use the `which()` function to find the numeric indices for which a condition is `TRUE`, and then use those indices to select elements:

`which(x < 5) # Returns indices of TRUE elements`

`## [1] 1 2 3 4`

`x[which(x < 5)] # Use which to select elements based on a condition`

`## [1] 1 2 3 4`
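The same kinds of indices can also go on the left side of `<-` to modify elements in place. Here is a minimal sketch with a hypothetical `scores` vector in which `-1` stands for a missing value:

```r
scores <- c(90, 85, -1, 70, -1)

# Replace a single element by position
scores[2] <- 88

# Replace every element that meets a condition
scores[scores < 0] <- NA

scores  # 90 88 NA 70 NA
```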

Most base functions in R are “vectorized”, meaning that when you give them a vector, they perform the operation on each element in the vector.

When you perform arithmetic operations on vectors, they are executed on an element-by-element basis:

```
x1 <- c(1, 2, 3)
x2 <- c(4, 5, 6)
```

```
# Addition
x1 + x2 # Returns (1+4, 2+5, 3+6)
```

`## [1] 5 7 9`

```
# Subtraction
x1 - x2 # Returns (1-4, 2-5, 3-6)
```

`## [1] -3 -3 -3`

```
# Multiplication
x1 * x2 # Returns (1*4, 2*5, 3*6)
```

`## [1] 4 10 18`

```
# Division
x1 / x2 # Returns (1/4, 2/5, 3/6)
```

`## [1] 0.25 0.40 0.50`
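Vectorization is not limited to arithmetic operators. As a quick sketch (all variable names here are illustrative), several common base functions apply themselves to every element in the same way:

```r
values <- c(1, 4, 9)

roots   <- sqrt(values)   # 1 2 3
squares <- values ^ 2     # 1 16 81

# round() and nchar() also work element by element
rounded      <- round(c(3.14159, 2.71828), digits = 2)  # 3.14 2.72
word_lengths <- nchar(c('oh', 'what'))                  # 2 4
```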

When performing vectorized operations, the vectors need to have the same length, or one of the vectors needs to be a single-value vector:

```
# Careful! Mismatched lengths will only give you a warning, but will still return a value:
x1 <- c(1, 2, 3)
x2 <- c(4, 5)
x1 + x2
```

```
## Warning in x1 + x2: longer object length is not a multiple of shorter object
## length
```

`## [1] 5 7 7`

What R does in these cases is *repeat* (or “recycle”) the shorter vector, so in the above case the last value is `3 + 4`.

If you have a single value vector, R will add it element-wise:

```
x1 <- c(1, 2, 3)
x2 <- c(4)
x1 + x2
```

`## [1] 5 6 7`
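The same repetition happens silently, with no warning at all, whenever the longer length is an exact multiple of the shorter one. A small sketch (the names `recycled` and `explicit` are made up) showing that the shorter vector is effectively repeated to match:

```r
x1 <- c(1, 2, 3, 4)
x2 <- c(10, 20)

# x2 is reused as (10, 20, 10, 20); no warning because 4 is a multiple of 2
recycled <- x1 + x2          # 11 22 13 24

# Writing the repetition out by hand gives the same result
explicit <- x1 + rep(x2, 2)
identical(recycled, explicit)  # TRUE
```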

You can reorder the arrangement of elements in a vector by using the `sort()` function:

```
a <- c(2, 4, 6, 3, 1, 5)
sort(a)
```

`## [1] 1 2 3 4 5 6`

`sort(a, decreasing = TRUE)`

`## [1] 6 5 4 3 2 1`

To get the index values of the sorted order, use the `order()` function:

`order(a)`

`## [1] 5 1 4 2 6 3`

These indices tell us that the first value in the sorted arrangement of vector `a` is element number `5` (which is a `1`), the second value is element number `1` (which is a `2`), and so on. If you use `order()` as the indices to the vector, you’ll get the sorted vector:

`a[order(a)] # Same as sort(a)`

`## [1] 1 2 3 4 5 6`
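Where `order()` becomes more useful than `sort()` is when you want to arrange one vector according to the values in another. A brief sketch with two hypothetical parallel vectors, `people` and `ages`:

```r
people <- c('Ann', 'Bob', 'Cy')
ages   <- c(30, 25, 35)

# order(ages) is 2 1 3, so the youngest person comes first
by_age <- people[order(ages)]                           # "Bob" "Ann" "Cy"

# decreasing = TRUE reverses the direction
oldest_first <- people[order(ages, decreasing = TRUE)]  # "Cy" "Ann" "Bob"
```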

As we saw in the loops lesson, you can use a loop to perform an operation on each element in a vector. For example, the following loop gets the decimal part of each element in a vector of floats:

```
x <- c(3.1415, 1.618, 2.718)
remainder <- c()
for (i in x) {
    remainder <- c(remainder, i %% 1)
}
remainder
```

`## [1] 0.1415 0.6180 0.7180`

You could achieve the same thing by just performing the operation inside the loop (the `i %% 1` bit) on the whole vector:

```
remainder <- x %% 1
remainder
```

`## [1] 0.1415 0.6180 0.7180`

In many cases, using a vector can save you a whole lot of code (and time!) by avoiding loops entirely!
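As one more side-by-side comparison, here is a sketch that converts temperatures from Fahrenheit to Celsius both ways. The vector `temps_f` and the other names are invented for this example:

```r
temps_f <- c(32, 212, 98.6)

# Loop version: pre-allocate the result, then fill it in one element at a time
temps_c_loop <- numeric(length(temps_f))
for (i in seq_along(temps_f)) {
    temps_c_loop[i] <- (temps_f[i] - 32) * 5 / 9
}

# Vectorized version: one line, applied to every element at once
temps_c <- (temps_f - 32) * 5 / 9

temps_c                                   # 0 100 37
isTRUE(all.equal(temps_c_loop, temps_c))  # TRUE
```

Both versions give the same answer, but the vectorized one is shorter and states the formula once for the whole vector.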

**Page sources**:

Some content on this page has been modified from other courses, including:

- CMU 15-112: Fundamentals of Programming, by David Kosbie & Kelly Rivers
- Danielle Navarro’s website “R for Psychological Science”
- RStudio primers

Tuesdays | 12:45 - 3:15 PM | Dr. John Paul Helveston | jph@gwu.edu

Content 2020 John Paul Helveston. See the licensing page for details.