Learning Objectives
- Be able to use R as a calculator.
- Be able to compare values in R.
- Know the distinctions between how R handles different data types (numbers, strings, and logicals).
Suggested Readings
- Chapter 3 of Danielle Navarro’s book “Learning Statistics With R”
You can do a ton of things with R, but at its core it’s basically a fancy calculator. Let’s get started with some basic arithmetic!
R handles simple arithmetic using the following arithmetic operators:
operation | operator | example input | example output |
---|---|---|---|
addition | + | 10 + 2 | 12 |
subtraction | - | 9 - 3 | 6 |
multiplication | * | 5 * 5 | 25 |
division | / | 9 / 3 | 3 |
power | ^ | 5 ^ 2 | 25 |
The first four basic operators (+, -, *, /) are pretty straightforward and behave as expected:
7 + 5 # Addition
## [1] 12
7 - 5 # Subtraction
## [1] 2
7 * 5 # Multiplication
## [1] 35
7 / 5 # Division
## [1] 1.4
Not a lot of surprises (you can ignore the [1] you see in the returned values…that’s just R labeling the first value it returns, and here there’s only one value).
Powers (i.e. \(x^n\)) are represented using the ^ symbol. For example, to calculate \(5^4\) in R, we would type:
5^4
## [1] 625
There are two other operators that are not typically as well-known as the first five but are quite common in programming:
operation | operator | example input | example output |
---|---|---|---|
integer division | %/% | 4 %/% 3 | 1 |
modulus | %% | 8 %% 3 | 2 |
Integer division is division in which the remainder is discarded. Note the difference between regular (/) and integer (%/%) division:
4 / 3 # Regular division
## [1] 1.333333
4 %/% 3 # Integer division
## [1] 1
With integer division, 3 can only go into 4 once, so 4 %/% 3 returns 1.
With integer division, dividing a positive number by a larger number will always produce 0 (because the larger number cannot go into the smaller number):
4 %/% 5 # Will return 0
## [1] 0
The modulus (aka “mod” operator) returns the remainder after doing integer division. For example:
17 %% 3
## [1] 2
This returns 2 because 17 / 3 is equal to 5 with a remainder of 2. The modulus returns any remainder, including decimals:
3.1415 %% 3
## [1] 0.1415
If you mod a number by itself, you’ll get 0 (because there’s no remainder):
17 %% 17 # Will return 0
## [1] 0
Finally, if you mod a number by a larger number, you’ll get the smaller number back since it’s the remainder:
17 %% 20 # Will return 17
## [1] 17
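To see how the two operators fit together, here is a small sketch (the numbers are just illustrative): multiplying the integer quotient by the divisor and adding the remainder should give back the number you started with.

```r
17 %/% 3                      # Quotient: 5
17 %% 3                       # Remainder: 2
(17 %/% 3) * 3 + (17 %% 3)    # Rebuilds the original number: 17
```

In other words, %/% tells you how many whole times the divisor fits, and %% tells you what is left over.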
%% and %/%
The %% and %/% operators can be really handy. Here are a few tricks.
n %% 2
You can tell if an integer n is even or odd by using n %% 2. If the result is 0, n must be even (because 2 goes evenly into even numbers with no remainder). If n is odd, you’ll get a remainder of 1. Here’s an example:
10 %% 2 # Even
## [1] 0
11 %% 2 # Odd
## [1] 1
This trick also works with negative numbers!
-42 %% 2 # Even
## [1] 0
-43 %% 2 # Odd
## [1] 1
When you use the mod operator %% on a positive number with a power of 10 (1, 10, 100, and so on), it “chops” the number and returns everything to the right of the “chop” point:
123456 %% 1 # Chops to the right of the *ones* digit
## [1] 0
123456 %% 10 # Chops to the right of the *tens* digit
## [1] 6
123456 %% 100 # Chops to the right of the *hundreds* digit
## [1] 56
Integer division %/% works the same way, except it returns everything to the left of the “chop” point:
123456 %/% 1 # Chops to the right of the *ones* digit
## [1] 123456
123456 %/% 10 # Chops to the right of the *tens* digit
## [1] 12345
123456 %/% 100 # Chops to the right of the *hundreds* digit
## [1] 1234
This trick works with non-integers too!
3.1415 %% 1
## [1] 0.1415
3.1415 %/% 1
## [1] 3
But be careful - this “trick” only works with positive numbers:
-123.456 %% 10
## [1] 6.544
-123.456 %/% 10
## [1] -13
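The reason, as far as base R's arithmetic goes, is that %/% rounds the quotient down (toward negative infinity) rather than toward zero. The sketch below compares it with floor() and trunc(), which are base R functions not otherwise covered on this page:

```r
-123.456 %/% 10          # -13: rounds down, away from zero for negatives
floor(-123.456 / 10)     # -13: same result as integer division
trunc(-123.456 / 10)     # -12: drops the decimals (rounds toward zero)
```

So if you want to “chop” a negative number the way you would a positive one, trunc() is closer to what you are after than %/%.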
Here are some mental notes to remember how this works:
- %% returns everything to the right (<chop> ->)
- %/% returns everything to the left (<- <chop>)

Example | “Chop” point | “Chop” point description |
---|---|---|
1234 %% 1 | 1234 \| | Right of the 1’s digit |
1234 %% 10 | 123 \| 4 | Right of the 10’s digit |
1234 %% 100 | 12 \| 34 | Right of the 100’s digit |
1234 %% 1000 | 1 \| 234 | Right of the 1,000’s digit |
1234 %% 10000 | \| 1234 | Right of the 10,000’s digit |
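One thing you can do by combining the two “chops” is pull a single digit out of a positive whole number. This is only a sketch using 1234 as an example: first chop off the digits to the right with %/%, then keep the last remaining digit with %% 10.

```r
(1234 %/% 10) %% 10      # Tens digit: 3
(1234 %/% 100) %% 10     # Hundreds digit: 2
(1234 %/% 1000) %% 10    # Thousands digit: 1
```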
Other than simple arithmetic, another common programming task is to compare different values to see if one is greater than, less than, or equal to the other. R handles comparisons with relational and logical operators.
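To compare two things, use the following relational operators:
- < (less than)
- <= (less than or equal to)
- >= (greater than or equal to)
- > (greater than)
- == (equal to)
- != (not equal to)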
The less than operator < can be used to test whether one number is smaller than another number:
2 < 5
## [1] TRUE
If the two values are equal, the < operator will return FALSE, while the <= operator will return TRUE:
2 < 2
## [1] FALSE
2 <= 2
## [1] TRUE
The “greater than” (>) and “greater than or equal to” (>=) operators work the same way but in reverse:
2 > 5
## [1] FALSE
2 > 2
## [1] FALSE
2 >= 2
## [1] TRUE
To assess whether two values are equal, we have to use a double equal sign (==):
(2 + 2) == 4
## [1] TRUE
(2 + 2) == 5
## [1] FALSE
To assess whether two values are not equal, we have to use an exclamation point followed by an equal sign (!=):
(2 + 2) != 4
## [1] FALSE
(2 + 2) != 5
## [1] TRUE
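One caveat worth knowing when comparing numbers with decimals: computers store most decimals approximately, so == can return FALSE for values that look equal. The sketch below shows a well-known case and one common workaround using the base R functions all.equal() and isTRUE(), which are not otherwise covered on this page:

```r
(0.1 + 0.2) == 0.3                   # FALSE: a tiny rounding error gets in the way
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: equal within a small tolerance
```

For whole numbers like the examples above, == behaves exactly as you would expect.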
It’s worth noting that you can also apply equality operations to “strings,” which is the general word to describe character values (i.e. not numbers). For example, R understands that a "penguin" is a "penguin" so you get this:
"penguin" == "penguin"
## [1] TRUE
However, R is very particular about what counts as equality. For two pieces of text to be equal, they must be precisely the same:
"penguin" == "PENGUIN" # FALSE because the case is different
## [1] FALSE
"penguin" == "p e n g u i n" # FALSE because the spacing is different
## [1] FALSE
"penguin" == "penguin " # FALSE because there's an extra space on the second string
## [1] FALSE
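If you want a comparison that ignores case or stray spaces, one option is to clean up the text before comparing. This is a minimal sketch using the base R functions tolower() and trimws(), which are not otherwise covered on this page:

```r
tolower("PENGUIN") == "penguin"    # TRUE: both sides are lowercase now
trimws("penguin ") == "penguin"    # TRUE: the trailing space is removed first
```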
To make a more complex comparison of more than just two things, use the following logical operators:
- & (and)
- | (or)
- ! (not)
And: A logical expression x & y is TRUE only if both x and y are TRUE.
(2 == 2) & (2 == 3) # FALSE because the second comparison is not TRUE
## [1] FALSE
(2 == 2) & (3 == 3) # TRUE because both comparisons are TRUE
## [1] TRUE
Or: A logical expression x | y is TRUE if either x or y (or both) is TRUE.
(2 == 2) | (2 == 3) # TRUE because the first comparison is TRUE
## [1] TRUE
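A common use of these operators is checking whether a number falls inside or outside a range. The sketch below uses 0 and 10 as arbitrary example limits:

```r
(5 > 0) & (5 < 10)      # TRUE: 5 is between 0 and 10
(15 > 0) & (15 < 10)    # FALSE: 15 is above 0 but not below 10
(15 < 0) | (15 > 10)    # TRUE: 15 is outside the range from 0 to 10
```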
Not:
The ! operator behaves like the word “not” in everyday language. If a statement is “not true”, then it must be “false”. Perhaps the simplest example is
!TRUE
## [1] FALSE
It is good practice to include parentheses to clarify the statement or comparison being made. Consider the following example:
!3 == 5
## [1] TRUE
This returns TRUE, but it’s a bit confusing. Reading from left to right, you start by saying “not 3”…what does that mean?
What is really going on here is R first evaluates whether 3 is equal to 5 (3 == 5), and then returns the “not” (!) of that. A better version of the same thing would be:
!(3 == 5)
## [1] TRUE
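As a quick illustration of why the parentheses help, the sketch below shows that “not equal” can be written either way, and that ! can flip the result of a longer combined comparison:

```r
!(3 == 5)                   # TRUE
3 != 5                      # TRUE: asks the same question more directly
!((2 == 2) & (2 == 3))      # TRUE: the combined comparison is FALSE, so "not" makes it TRUE
```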
R follows the typical BEDMAS order of operations. That is, R evaluates statements in this order¹: brackets, exponents, division and multiplication, then addition and subtraction.
For example, if I type:
1 + 2 * 4
## [1] 9
R first computes 2 * 4
and then adds 1
. If what you actually wanted was for R to first add 2
to 1
, then you should have added parentheses around 1
and 2
:
(1 + 2) * 4
## [1] 12
A helpful rule of thumb to remember is that brackets always come first. So, if you’re ever unsure about what order R will do things in, an easy solution is to enclose the thing you want it to do first in brackets.
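A few cases where the order of operations tends to surprise people are sketched below. They follow from R's operator precedence rules, and in each case brackets make the intent explicit:

```r
-2^2         # -4: the power is computed first, then the minus sign is applied
(-2)^2       # 4: brackets force the negative number to be squared
2^3^2        # 512: powers are evaluated right to left, i.e. 2^(3^2)
(2^3)^2      # 64
8 / 2 * 4    # 16: division and multiplication run left to right
```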
Every programming language has the ability to store data of different types. R recognizes several important basic data types (there are others, but these cover most cases):
Type | Description | Example |
---|---|---|
double | Number with a decimal place (aka “float”) | 3.14, 1.61803398875 |
integer | Number without a decimal place | 1, 42 |
character | Text in quotes (aka “string”) | "this is some text", "3.14" |
logical | True or False (for comparing things) | TRUE, FALSE |
If you want to check which type a value is, you can use the function typeof(). For example:
typeof("hello")
## [1] "character"
Numbers in R have the numeric data type, which is also the default computational type. There are two types of numbers: integers and doubles.
The difference is that integers don’t have decimal values. A non-integer in R has the type “double”:
typeof(3.14)
## [1] "double"
By default, R assumes all numbers have a decimal place, even if they look like integers:
typeof(3)
## [1] "double"
In this case, R assumes that 3 is really 3.0. To make sure R knows you really do mean to create an integer, you have to add an L to the end of the number²:
typeof(3L)
## [1] "integer"
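The type of a number also affects the type of the result when you do arithmetic with it. The sketch below uses typeof() to check a few combinations:

```r
typeof(3L + 2L)      # "integer": integer plus integer stays an integer
typeof(3L + 2)       # "double": mixing in a double gives a double
typeof(5L / 2L)      # "double": regular division always returns a double
typeof(5L %/% 2L)    # "integer": integer division of integers stays an integer
```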
A character value is used to represent string values in R. Anything put between single quotes ('') or double quotes ("") will be stored as a character. For example:
typeof('3')
## [1] "character"
Notice that even though the value looks like a number, because it is inside quotes R interprets it as a character. If you mistakenly treat it as a number, R will return an error when you try to do a numerical operation with it:
'3' + 7
## Error in "3" + 7: non-numeric argument to binary operator
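If you really do have a number stored as text, one way around this error is to convert it first. This is a minimal sketch using the base R conversion functions as.numeric() and as.character(), which are not otherwise covered on this page:

```r
as.numeric('3') + 7        # 10: the text is converted to a number first
typeof(as.numeric('3'))    # "double"
typeof(as.character(3))    # "character": converting in the other direction
```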
It doesn’t matter if you use single or double quotes to create a character. The only time it does matter is if the character contains a quote symbol itself. For example, if you wanted to type the word "don't", you should use double quotes so that R knows the single quote is part of the character:
typeof("don't")
## [1] "character"
If you use single quotes, you’ll get an error because R reads 'don' as a character and then doesn’t know what to do with the rest:
typeof('don't')
## Error: <text>:1:13: unexpected symbol
## 1: typeof('don't
## ^
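Another option, sketched below, is to keep the single quotes and “escape” the inner quote with a backslash so R treats it as part of the text:

```r
typeof('don\'t')        # "character": the backslash marks the quote as text
'don\'t' == "don't"     # TRUE: both ways of typing it give the same string
```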
We will go into much more detail about working with character values later on in Week 7.
Logical data only have two values: TRUE or FALSE. Note that these are not in quotes and are in all caps.
typeof(TRUE)
## [1] "logical"
typeof(FALSE)
## [1] "logical"
R uses these two special values to help answer questions about logical statements. For example, let’s compare whether 1 is greater than 2:
1 > 2
## [1] FALSE
R returns the value FALSE because 1 is not greater than 2. If I flip the question to whether 1 is less than 2, I’ll get TRUE:
1 < 2
## [1] TRUE
In addition to the four main data types mentioned, there are a few additional “special” values: Inf, NaN, NA, and NULL.
Infinity: Inf corresponds to a value that is infinitely large (or, with -Inf, infinitely large in the negative direction). The easiest way to get Inf is to divide a positive number by 0:
1/0
## [1] Inf
Not a Number: NaN is short for “not a number”, and it’s basically a reserved keyword that means “there isn’t a mathematically defined number for this.” For example:
0/0
## [1] NaN
Not available: NA indicates that the value that is “supposed” to be stored here is missing. We’ll see these much more when we start getting into data structures like vectors and data frames.
No value: NULL asserts that the variable genuinely has no value whatsoever, or does not even exist.
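Because these special values don't always behave well with ==, base R provides dedicated functions for detecting them. The sketch below shows one check for each (these functions are not otherwise covered on this page):

```r
is.infinite(1/0)    # TRUE
is.nan(0/0)         # TRUE
is.na(NA)           # TRUE
is.null(NULL)       # TRUE
NA == NA            # NA: comparing two missing values is itself "missing"
```

The last line is why is.na() is generally the safer choice than == when checking for missing values.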
Page sources:
Some content on this page has been modified from other courses, including:
¹ For a more precise statement, see the operator precedence for R.
² Why L? Well, it’s a bit complicated, but R supports complex numbers which are denoted by i, so i was already taken. A quick answer is that R uses 32-bit long integers, so L for “long”.