Learning Objectives
- Be able to use R as a calculator.
- Be able to compare values in R.
- Know the distinctions between how R handles different data types (numbers, strings, and logicals).
Suggested Readings
- Chapter 3 of Danielle Navarro’s book “Learning Statistics With R”
You can do a ton of things with R, but at its core it’s basically a fancy calculator. Let’s get started with some basic arithmetic!
R handles simple arithmetic using the following arithmetic operators:
operation | operator | example input | example output |
---|---|---|---|
addition | `+` | `10 + 2` | `12` |
subtraction | `-` | `9 - 3` | `6` |
multiplication | `*` | `5 * 5` | `25` |
division | `/` | `9 / 3` | `3` |
power | `^` | `5 ^ 2` | `25` |
The first four basic operators (`+`, `-`, `*`, `/`) are pretty straightforward and behave as expected:
7 + 5 # Addition
#> [1] 12
7 - 5 # Subtraction
#> [1] 2
7 * 5 # Multiplication
#> [1] 35
7 / 5 # Division
#> [1] 1.4
Not a lot of surprises (you can ignore the `[1]` you see in the returned values; that’s just R labeling the first value it returns, and here there is only one).
Powers (i.e. \(x^n\)) are represented using the `^` symbol. For example, to calculate \(5^4\) in R, we would type:
5^4
#> [1] 625
There are two other operators that are not typically as well-known as the first five but are quite common in programming:
operation | operator | example input | example output |
---|---|---|---|
integer division | `%/%` | `4 %/% 3` | `1` |
modulus | `%%` | `8 %% 3` | `2` |
Integer division is division in which the remainder is discarded. Note the difference between regular (`/`) and integer (`%/%`) division:
4 / 3 # Regular division
#> [1] 1.333333
4 %/% 3 # Integer division
#> [1] 1
With integer division, 3 can only go into 4 once, so `4 %/% 3` returns `1`.
With integer division, dividing a positive number by a larger number will always produce `0` (because the larger number cannot go into the smaller number):
4 %/% 5 # Will return 0
#> [1] 0
The modulus (aka “mod” operator) returns the remainder after doing integer division. For example:
17 %% 3
#> [1] 2
This returns `2` because 17 / 3 is equal to 5 with a remainder of 2. The modulus returns any remainder, including decimals:
3.1415 %% 3
#> [1] 0.1415
If you mod a number by itself, you’ll get `0` (because there’s no remainder):
17 %% 17 # Will return 0
#> [1] 0
Finally, if you mod a number by a larger number, you’ll get the smaller number back since it’s the remainder:
17 %% 20 # Will return 17
#> [1] 17
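The two operators fit together: multiplying the integer quotient by the divisor and adding the remainder gives back the number you started with. A quick check of that relationship, using 17 and 3 purely as example values:

```r
# Integer quotient and remainder of 17 divided by 3
17 %/% 3
#> [1] 5
17 %% 3
#> [1] 2

# Quotient times divisor, plus remainder, recovers the original number
(17 %/% 3) * 3 + (17 %% 3)
#> [1] 17
```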
`%%` and `%/%`
The `%%` and `%/%` operators can be really handy. Here are a few tricks.
`n %% 2`
You can tell if an integer `n` is even or odd by using `n %% 2`. If the result is `0`, `n` must be even (because 2 goes evenly into even numbers with no remainder). If `n` is odd, you’ll get a remainder of `1`. Here’s an example:
10 %% 2 # Even
#> [1] 0
11 %% 2 # Odd
#> [1] 1
This trick also works with negative numbers!
-42 %% 2 # Even
#> [1] 0
-43 %% 2 # Odd
#> [1] 1
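If you would rather get a direct yes/no answer, one option is to compare the remainder to `0` with the `==` operator (covered later on this page). This is only a small sketch of the idea:

```r
# Is the number even? Compare the remainder to 0
10 %% 2 == 0
#> [1] TRUE
11 %% 2 == 0
#> [1] FALSE
-42 %% 2 == 0
#> [1] TRUE
```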
When you use the mod operator `%%` on a positive number with a power of 10 (1, 10, 100, and so on), it “chops” the number and returns everything to the right of the “chop” point:
123456 %% 1 # Chops to the right of the *ones* digit
#> [1] 0
123456 %% 10 # Chops to the right of the *tens* digit
#> [1] 6
123456 %% 100 # Chops to the right of the *hundreds* digit
#> [1] 56
Integer division `%/%` works the same way, except it returns everything to the left of the “chop” point:
123456 %/% 1 # Chops to the right of the *ones* digit
#> [1] 123456
123456 %/% 10 # Chops to the right of the *tens* digit
#> [1] 12345
123456 %/% 100 # Chops to the right of the *hundreds* digit
#> [1] 1234
This trick works with non-integers too!
3.1415 %% 1
#> [1] 0.1415
3.1415 %/% 1
#> [1] 3
But be careful - this “trick” only works with positive numbers:
-123.456 %% 10
#> [1] 6.544
-123.456 %/% 10
#> [1] -13
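The reason is that `%/%` rounds down (toward negative infinity) rather than toward zero, and `%%` returns the remainder that goes with that rounding. The sketch below shows this with base R’s `floor()`, and then one possible workaround using `trunc()`, which rounds toward zero. The workaround is an illustration, not something these notes prescribe:

```r
# %/% gives the same answer as floor() applied to regular division
-123.456 %/% 10
#> [1] -13
floor(-123.456 / 10)
#> [1] -13

# trunc() rounds toward zero, so it "chops" negative numbers
# the same way %/% chops positive ones
trunc(-123.456 / 10)
#> [1] -12

# The remainder that goes with the trunc() version
-123.456 - 10 * trunc(-123.456 / 10)
#> [1] -3.456
```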
Here are some mental notes to remember how this works:
- `%%` returns everything to the right (`<chop> ->`)
- `%/%` returns everything to the left (`<- <chop>`)

Example | “Chop” point | “Chop” point description |
---|---|---|
`1234 %% 1` | `1234 \|` | Right of the `1`’s digit |
`1234 %% 10` | `123 \| 4` | Right of the `10`’s digit |
`1234 %% 100` | `12 \| 34` | Right of the `100`’s digit |
`1234 %% 1000` | `1 \| 234` | Right of the `1,000`’s digit |
`1234 %% 10000` | `\| 1234` | Right of the `10,000`’s digit |
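Combining the two kinds of “chop” lets you pull out a single digit: chop off everything to the right of the digit with `%/%`, then keep only the last remaining digit with `%% 10`. A small sketch using 1234 as an example value:

```r
# Chop off the last two digits, leaving 12
1234 %/% 100
#> [1] 12

# Then keep only the last digit of what remains: the hundreds digit
(1234 %/% 100) %% 10
#> [1] 2

# Same idea for the tens digit
(1234 %/% 10) %% 10
#> [1] 3
```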
Other than simple arithmetic, another common programming task is to compare different values to see if one is greater than, less than, or equal to the other. R handles comparisons with relational and logical operators.
To compare two things, use the following relational operators:
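- `<`: less than
- `<=`: less than or equal to
- `>=`: greater than or equal to
- `>`: greater than
- `==`: equal to
- `!=`: not equal to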
The less than operator `<` can be used to test whether one number is smaller than another number:
2 < 5
#> [1] TRUE
If the two values are equal, the `<` operator will return `FALSE`, while the `<=` operator will return `TRUE`:
2 < 2
#> [1] FALSE
2 <= 2
#> [1] TRUE
The “greater than” (`>`) and “greater than or equal to” (`>=`) operators work the same way but in reverse:
2 > 5
#> [1] FALSE
2 > 2
#> [1] FALSE
2 >= 2
#> [1] TRUE
To assess whether two values are equal, we have to use a double equal sign (`==`):
(2 + 2) == 4
#> [1] TRUE
(2 + 2) == 5
#> [1] FALSE
To assess whether two values are not equal, we have to use an exclamation point followed by an equal sign (`!=`):
(2 + 2) != 4
#> [1] FALSE
(2 + 2) != 5
#> [1] TRUE
It’s worth noting that you can also apply equality operations to “strings,” which is the general word to describe character values (i.e. not numbers). For example, R understands that a `"penguin"` is a `"penguin"` so you get this:
"penguin" == "penguin"
#> [1] TRUE
However, R is very particular about what counts as equality. For two pieces of text to be equal, they must be precisely the same:
"penguin" == "PENGUIN" # FALSE because the case is different
#> [1] FALSE
"penguin" == "p e n g u i n" # FALSE because the spacing is different
#> [1] FALSE
"penguin" == "penguin " # FALSE because there's an extra space on the second string
#> [1] FALSE
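If differences in case or stray spaces should not count, one option is to tidy up the text before comparing it. The sketch below uses two base R functions that these notes have not introduced, `tolower()` (converts to lower case) and `trimws()` (removes leading and trailing whitespace), so treat it as a preview rather than the recommended approach:

```r
# Convert to lower case before comparing, so case no longer matters
tolower("PENGUIN") == "penguin"
#> [1] TRUE

# Strip leading and trailing spaces before comparing
trimws("penguin ") == "penguin"
#> [1] TRUE
```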
To make a more complex comparison of more than just two things, use the following logical operators:
- `&`: and
- `|`: or
- `!`: not

And:
A logical expression `x & y` is `TRUE` only if both `x` and `y` are `TRUE`.
(2 == 2) & (2 == 3) # FALSE because the second comparison is not TRUE
#> [1] FALSE
(2 == 2) & (3 == 3) # TRUE because both comparisons are TRUE
#> [1] TRUE
Or:
A logical expression `x | y` is `TRUE` if either `x` or `y` (or both) is `TRUE`.
(2 == 2) | (2 == 3) # TRUE because the first comparison is TRUE
#> [1] TRUE
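To round out the picture for `|`, here are the two remaining cases, plus a small example that combines relational and logical operators. The numbers are arbitrary example values:

```r
# | is FALSE only when both sides are FALSE
(2 == 3) | (3 == 4)
#> [1] FALSE

# | is still TRUE when both sides are TRUE
(2 == 2) | (3 == 3)
#> [1] TRUE

# Combining relational and logical operators: is 5 between 1 and 10?
(5 > 1) & (5 < 10)
#> [1] TRUE
```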
Not:
The `!` operator behaves like the word “not” in everyday language. If a statement is “not true”, then it must be “false”. Perhaps the simplest example is:
!TRUE
#> [1] FALSE
It is good practice to include parentheses to clarify the statement or comparison being made. Consider the following example:
!3 == 5
#> [1] TRUE
This returns `TRUE`, but it’s a bit confusing. Reading from left to right, you start by saying “not 3”…what does that mean? What is really going on here is R first evaluates whether 3 is equal to 5 (`3 == 5`), and then returns the “not” (`!`) of that. A better version of the same thing would be:
!(3 == 5)
#> [1] TRUE
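To see why the parentheses matter, compare what happens if the “not” is forced to apply first. This is a small sketch; the third expression is included only to show the contrast:

```r
# R applies == first, then !, so these two are equivalent
!3 == 5
#> [1] TRUE
!(3 == 5)
#> [1] TRUE

# Forcing the "not" to happen first changes the answer:
# !3 is FALSE (any non-zero number counts as TRUE), and FALSE == 5 is FALSE
(!3) == 5
#> [1] FALSE
```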
R follows the typical BEDMAS order of operations. That is, R evaluates statements in this order [1]: brackets, exponents, division and multiplication, then addition and subtraction.
For example, if I type:
1 + 2 * 4
#> [1] 9
R first computes `2 * 4` and then adds `1`. If what you actually wanted was for R to first add `2` to `1`, then you should have added parentheses around `1` and `2`:
(1 + 2) * 4
#> [1] 12
A helpful rule of thumb to remember is that brackets always come first. So, if you’re ever unsure about what order R will do things in, an easy solution is to enclose the thing you want it to do first in brackets.
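Exponents are a common place where the order matters. The short sketch below shows two cases where adding brackets changes the result, including a leading minus sign, which R applies after the exponent:

```r
# Exponents come before multiplication
2 * 3^2
#> [1] 18
(2 * 3)^2
#> [1] 36

# Exponents also come before a leading minus sign
-2^2
#> [1] -4
(-2)^2
#> [1] 4
```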
Note that for logical operators, the order of precedence is `!` > `&` > `|`.
For example, consider the following statement:
TRUE | FALSE & FALSE
#> [1] TRUE
This returns `TRUE` because the `&` statement (`FALSE & FALSE`) is evaluated first, so the whole statement simplifies to `TRUE | FALSE`, which returns `TRUE`. If you put parentheses around the `|` statement, it would evaluate first and the whole statement would return `FALSE`:
(TRUE | FALSE) & FALSE
#> [1] FALSE
Similarly, consider the following statement:
! TRUE | TRUE
#> [1] TRUE
This returns `TRUE` because the `!` statement is evaluated first (`! TRUE` is `FALSE`), and the simplified statement `FALSE | TRUE` returns `TRUE`. Again, if you put parentheses around the `|` statement the whole statement becomes `FALSE`:
! (TRUE | TRUE)
#> [1] FALSE
Every programming language has the ability to store data of different types. R recognizes several important basic data types (there are others, but these cover most cases):
Type | Description | Example |
---|---|---|
`double` | Number with a decimal place (aka “float”) | `3.14`, `1.61803398875` |
`integer` | Number without a decimal place | `1`, `42` |
`character` | Text in quotes (aka “string”) | `"this is some text"`, `"3.14"` |
`logical` | True or False (for comparing things) | `TRUE`, `FALSE` |
If you want to check which type a value is, you can use the function `typeof()`. For example:
typeof("hello")
#> [1] "character"
Numbers in R have the `numeric` data type, which is also the default computational type. There are two types of numbers: integers and non-integers (doubles).
The difference is that integers don’t have decimal values. A non-integer in R has the type “`double`”:
typeof(3.14)
#> [1] "double"
By default, R assumes all numbers have a decimal place, even if they look like integers:
typeof(3)
#> [1] "double"
In this case, R assumes that `3` is really `3.0`. To make sure R knows you really do mean to create an integer, you have to add an `L` to the end of the number [2]:
typeof(3L)
#> [1] "integer"
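The distinction shows up when you do arithmetic. The sketch below uses `typeof()` to check the result of a few operations on example values. Mixing an integer with a double gives a double, and regular division returns a double even when both inputs are integers:

```r
typeof(3L + 1L)    # integer + integer stays an integer
#> [1] "integer"
typeof(3L + 1)     # integer + double becomes a double
#> [1] "double"
typeof(6L / 3L)    # regular division returns a double
#> [1] "double"
typeof(6L %/% 3L)  # integer division of integers stays an integer
#> [1] "integer"
3L == 3            # the values still compare as equal
#> [1] TRUE
```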
A character value is used to represent string values in R. Anything put between single quotes (`''`) or double quotes (`""`) will be stored as a character. For example:
typeof('3')
#> [1] "character"
Notice that even though the value looks like a number, because it is inside quotes R interprets it as a character. If you mistakenly treat it as a number, R will return an error when you try to do a numerical operation with it:
'3' + 7
#> Error in "3" + 7: non-numeric argument to binary operator
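If the text really does hold a number, one way around the error is to convert it first. The sketch below uses base R’s `as.numeric()`, which these notes have not introduced yet, so it is shown only as a preview:

```r
# Convert the character '3' to a number, then add
as.numeric('3') + 7
#> [1] 10

# The converted value is a double, not a character
typeof(as.numeric('3'))
#> [1] "double"
```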
It doesn’t matter if you use single or double quotes to create a character. The only time it does matter is if the character contains a quote symbol itself. For example, if you wanted to type the word `"don't"`, you should use double quotes so that R knows the single quote is part of the character:
typeof("don't")
#> [1] "character"
If you use single quotes, you’ll get an error because R reads `'don'` as the complete character and does not know what to do with the rest:
typeof('don't')
#> Error: <text>:1:13: unexpected symbol
#> 1: typeof('don't
#> ^
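Another option, if you want to stick with single quotes, is to “escape” the inner quote with a backslash so R treats it as part of the text. A minimal sketch:

```r
# Escape the inner single quote with a backslash
typeof('don\'t')
#> [1] "character"

# Both ways of writing it produce the same value
'don\'t' == "don't"
#> [1] TRUE
```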
We will go into much more detail about working with character values later on in Week 7.
Logical data only have two values: `TRUE` or `FALSE`. Note that these are not in quotes and are in all caps.
typeof(TRUE)
#> [1] "logical"
typeof(FALSE)
#> [1] "logical"
R uses these two special values to help answer questions about logical statements. For example, let’s compare whether `1` is greater than `2`:
1 > 2
#> [1] FALSE
R returns the value `FALSE` because 1 is not greater than 2. If I flip the question to whether `1` is less than `2`, I’ll get `TRUE`:
1 < 2
#> [1] TRUE
In addition to the four main data types mentioned, there are a few additional “special” values: `Inf`, `NaN`, `NA` and `NULL`.
Infinity: `Inf` corresponds to a value that is infinitely large (or, with `-Inf`, infinitely large in the negative direction). The easiest way to get `Inf` is to divide a positive number by 0:
1/0
#> [1] Inf
Not a Number: `NaN` is short for “not a number”, and it’s basically a reserved keyword that means “there isn’t a mathematically defined number for this.” For example:
0/0
#> [1] NaN
Not available: `NA` indicates that the value that is “supposed” to be stored here is missing. We’ll see these much more when we start getting into data structures like vectors and data frames.
No value: `NULL` asserts that the variable genuinely has no value whatsoever, or does not even exist.
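These special values do not always behave well with `==`, so base R provides dedicated functions for checking them: `is.nan()`, `is.na()`, and `is.null()`. The sketch below is a quick preview of those functions, which these notes have not covered yet:

```r
-1 / 0           # Negative infinity
#> [1] -Inf
is.nan(0 / 0)    # Test for NaN
#> [1] TRUE
is.na(NA)        # Test for a missing value
#> [1] TRUE
NA == NA         # Comparing with NA gives NA, not TRUE
#> [1] NA
is.null(NULL)    # Test for NULL
#> [1] TRUE
```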
Page sources:
Some content on this page has been modified from other courses, including:
[1] For a more precise statement, see the operator precedence for R.
[2] Why `L`? Well, it’s a bit complicated, but R supports complex numbers which are denoted by `i`, so `i` was already taken. A quick answer is that R uses 32-bit long integers, so `L` for “long”.