## R Basics

R History
R is derived from S, a language developed at Bell Labs. S was originally built on top of Fortran libraries and was later rewritten in C.
R Types
The base types in R are called “atomic” types.
The atomic types in R are:
• character
• numeric (real numbers)
• integer
• complex
• logical (boolean)
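To see the list above in action, here is a small sketch that builds one value of each atomic type and asks R for its class. The variable names (examples, type_names) are just made up for illustration.

```r
# One example value per atomic type, in the same order as the list above
examples <- list("hello", 3.14, 42L, 2 + 3i, TRUE)

# Ask R for the class of each value
type_names <- sapply(examples, class)
print(type_names)
# [1] "character" "numeric"   "integer"   "complex"   "logical"
```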
Vector
The vector is the most basic object in R.
A vector contains objects of the same class.
The one exception is the list, a vector whose elements can have different types. A list could hold a character, an integer, a logical, etc.
To create an empty vector, use vector()
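A quick sketch of the difference between a plain vector and a list. The names empty, nums and mixed are only for illustration.

```r
empty <- vector()              # an empty vector (logical by default)
nums  <- c(1.5, 2.5, 3.5)      # every element has the same class
mixed <- list("a", 1L, TRUE)   # a list may hold different types

length(empty)          # 0
class(nums)            # "numeric"
sapply(mixed, class)   # "character" "integer" "logical"
```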
Numbers
Adding L after a number makes it an integer: 1 is numeric, 1L is an integer.
Inf is a special numeric value representing infinity; for example, 1/0 returns Inf.
NaN is a value that is “not a number.” For example, 0/0 results in NaN.
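The three points above can be checked directly. This is a minimal sketch, and the variable names are made up.

```r
whole        <- 1L      # the L suffix makes this an integer
real         <- 1       # without it, the value is numeric (double)
pos_inf      <- 1 / 0   # Inf
not_a_number <- 0 / 0   # NaN

class(whole)             # "integer"
class(real)              # "numeric"
is.infinite(pos_inf)     # TRUE
is.nan(not_a_number)     # TRUE
is.na(not_a_number)      # TRUE: NaN also counts as missing
```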
Attributes
• names, dimnames (dimension names)
• dimensions
• class (not like a class in OO terms, but the type of the symbol/variable: integer, character, etc.)
• length
Attributes are accessed with the attributes() function.
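As a rough illustration, a small matrix with named rows and columns carries several of these at once. The matrix m and its row/column names are invented for the example; note that length and class are read with their own functions, length() and class().

```r
# A 2 x 3 matrix with made-up row and column names
m <- matrix(1:6, nrow = 2, ncol = 3)
dimnames(m) <- list(c("r1", "r2"), c("a", "b", "c"))

attrs <- attributes(m)   # a list holding the dim and dimnames attributes
print(attrs$dim)         # [1] 2 3
print(attrs$dimnames)

length(m)                # 6
class(m)                 # the class of the object
```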
Expressions
A little weird at first… but instead of =, R uses a reverse rocket for assignment: <-
For example:
x <- 1
assigns 1 to the symbol x.
You can check the class or type of a symbol with the class() function like so:
class(x)
which will return "numeric" if x holds a numeric value.
You can print output with print(object).
Comments are done with #, as in Ruby or Python.
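Putting assignment, class(), print() and comments together in one short sketch (x_class and y are names introduced only for this example):

```r
# Everything after a # is a comment
x <- 1                # assign 1 to the symbol x
x_class <- class(x)   # look up its class
print(x_class)        # [1] "numeric"

y = 2                 # = also assigns at the top level, but <- is the usual style
print(x + y)          # [1] 3
```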
Ranges
In R, ranges are called sequences. They are created like so:
x <- 1:20
That would be a sequence from 1 to 20.
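A small sketch of the colon operator, plus seq() for when a step other than 1 is wanted. The names evens and first_five are just for illustration.

```r
x <- 1:20
length(x)   # 20
class(x)    # "integer"

# seq() builds sequences with a custom step
evens <- seq(2, 20, by = 2)   # 2 4 6 ... 20

# seq_len(n) is equivalent to 1:n for n >= 1
first_five <- seq_len(5)      # 1 2 3 4 5
```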
Concatenate
To create vectors of objects you can use the c() function.
If we set:
x <- c(0.5, 0.6)
x would be
0.5 0.6
This is like an array, without the delimiters.  For example, if I did
x[1] I would get 0.5
x[2] would return 0.6
Notice in this case the indexing doesn’t start at 0, but at 1.
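Here is a minimal sketch of 1-based indexing on the same vector, including what happens at index 0 and past the end (first and second are made-up names):

```r
x <- c(0.5, 0.6)

first  <- x[1]   # 0.5: indexing starts at 1
second <- x[2]   # 0.6

x[0]             # numeric(0): index 0 selects nothing
x[3]             # NA: past the end of the vector
```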
Boolean
TRUE and FALSE can be written as
x <- TRUE
or simply x <- T (T and F are shorthand for TRUE and FALSE)
Length
x <- vector("numeric", length = 10)
would assign ten zeros, 0 0 0 0 0 0 0 0 0 0, to x
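The same call works for other modes too, each with its own default fill value. A quick sketch (zeros, blanks and flags are illustrative names):

```r
zeros  <- vector("numeric", length = 10)    # ten 0s
blanks <- vector("character", length = 3)   # three empty strings ""
flags  <- vector("logical", length = 2)     # FALSE FALSE

print(zeros)
print(blanks)
print(flags)
```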
Mixing Objects
If you mix objects like so:
y <- c(TRUE, 2)
the logical value is converted to a number: 1 for TRUE, 0 for FALSE.
However, if you use
y <- c("a", TRUE)
the vector defaults to character and TRUE is converted to the string "TRUE", not kept as a boolean value.
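Both coercion cases can be checked in a few lines. The sketch also shows explicit coercion with the as.* functions, which the notes above don't cover but which follow the same rules; y1 and y2 are made-up names.

```r
y1 <- c(TRUE, 2)      # logical is coerced to numeric: 1 2
y2 <- c("a", TRUE)    # everything is coerced to character: "a" "TRUE"

class(y1)             # "numeric"
class(y2)             # "character"

# Explicit coercion
as.numeric(c(TRUE, FALSE))   # 1 0
as.logical("a")              # NA: "a" has no logical meaning
```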
Read Data
source
dget
unserialize
Write Data
write.table
writeLines
dump
save
serialize
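To get a feel for how the read and write functions pair up, here is a hedged round-trip sketch. The data frame df and the temporary file paths are invented for the example, and only a few of the pairs are shown (write.table with read.table, dput with dget, serialize with unserialize).

```r
# A tiny made-up data frame to write out and read back
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# Tabular text: write.table() out, read.table() back in
csv_path <- tempfile(fileext = ".csv")
write.table(df, csv_path, sep = ",", row.names = FALSE)
df_table <- read.table(csv_path, sep = ",", header = TRUE, stringsAsFactors = FALSE)

# R code representation: dput() out, dget() back in
r_path <- tempfile(fileext = ".R")
dput(df, file = r_path)
df_dget <- dget(r_path)

# Binary form: serialize() to a raw vector, unserialize() back
raw_bytes <- serialize(df, connection = NULL)
df_raw <- unserialize(raw_bytes)

all.equal(df, df_table)
all.equal(df, df_dget)
identical(df, df_raw)
```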
Examples:
A call such as data <- read.csv("my_data.csv") will read the tabular data in the CSV file and assign it to the symbol data.
You can set the colClasses argument on the read function so that R doesn’t have to try and figure out what data type is in each column.  For example, if all data in the table is numeric, then you could do
data <- read.csv("my_data.csv", colClasses = "numeric")
Setting the number of rows with the nrows argument will also help speed up the import.  This way R doesn’t have to work out the row count itself.
nrows can also be used to pull a segment of rows. If the document had 40,000 rows, you could do
data <- read.csv("my_data.csv", nrows = 100)
to grab only the first 100 rows.
Then, if you wanted to find out what kind of data types are in there, you could do:
classes <- sapply(data, class)
This assigns the class of each column to the classes symbol.
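Since my_data.csv isn't available here, the sketch below first writes a small made-up CSV to a temporary file and then reads it back with colClasses and nrows. The column names a and b and the file path are assumptions for the example only.

```r
# Write a tiny all-numeric CSV to a temporary file (stand-in for my_data.csv)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c(1.5, 2.5, 3.5), b = c(10, 20, 30)),
          csv_path, row.names = FALSE)

# Read only the first 2 rows and tell R every column is numeric
data <- read.csv(csv_path, colClasses = "numeric", nrows = 2)

# Check the class of each column
classes <- sapply(data, class)
print(classes)
print(nrow(data))   # 2
```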
Selecting a row by number:
x[47,]
Selecting a row by name (the string is matched against the row names, not the cell values):
x["my string value in table",]
Counting a value: if I had a data frame x with a column called “Ozone” and wanted a count of the NA values in that column, I would do:
sum(is.na(x$Ozone))
To compute the mean of a column, omitting NA (missing) values, you can do:
mean(x$Ozone, na.rm = TRUE)
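Here is a small self-contained sketch of row selection, NA counting and an NA-free mean. The data frame, its values and the weekday row names are all made up so the example can run on its own.

```r
# Made-up data frame with named rows and some missing Ozone values
x <- data.frame(
  Ozone = c(41, NA, 12, NA, 28),
  Temp  = c(67, 72, 74, 62, 66),
  row.names = c("mon", "tue", "wed", "thu", "fri")
)

by_number <- x[3, ]       # third row, selected by position
by_name   <- x["wed", ]   # the same row, selected by row name

na_count   <- sum(is.na(x$Ozone))          # 2 missing values
ozone_mean <- mean(x$Ozone, na.rm = TRUE)  # (41 + 12 + 28) / 3 = 27
```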
Let’s say someone has given us a data table with columns like Ozone, Temp, and Solar.R (solar radiation).
They ask us for the mean of solar radiation where Ozone is > 31 and Temp is > 90.
We can assign a new symbol/variable called x.sub:
x.sub <- subset(x, Ozone > 31 & Temp > 90)
This assigns to x.sub the subset of rows from the data frame x where Ozone is greater than 31 and Temp is greater than 90.
Then we can take the mean of a column of x.sub:
mean(x.sub$Solar.R, na.rm = TRUE)
The na.rm = TRUE just tells it to leave missing values out of the calculation.
Other examples…
Temperature by month… you want the mean for June:
x.sub2 <- subset(x, Month == 6)
then
mean(x.sub2$Temp)
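R ships with a built-in data set called airquality that has columns named Ozone, Solar.R, Temp and Month. Treating it as a stand-in for x is an assumption on my part, but it makes the subset examples runnable. The x.by_index line shows a roughly equivalent way to filter with logical indexing.

```r
# Assumption: the built-in airquality data set stands in for the data frame x
x <- airquality

# Rows where Ozone > 31 and Temp > 90 (subset() drops rows where the test is NA)
x.sub <- subset(x, Ozone > 31 & Temp > 90)
solar_mean <- mean(x.sub$Solar.R, na.rm = TRUE)
print(solar_mean)

# The same filter with logical indexing; which() discards the NA comparisons
x.by_index <- x[which(x$Ozone > 31 & x$Temp > 90), ]

# Mean temperature for June
x.sub2 <- subset(x, Month == 6)
june_mean <- mean(x.sub2$Temp)
print(june_mean)
```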
In boolean logic, an OR expression (|) is true if either piece is true, i.e.
6 == 1 | 5 == 5
The first comparison is false, the second is true, therefore the whole expression is TRUE.
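For contrast, here is the same pair of comparisons with OR and with AND, plus the element-wise behaviour on vectors (the variable names are only illustrative):

```r
either <- 6 == 1 | 5 == 5   # TRUE: one side being true is enough for OR
both   <- 6 == 1 & 5 == 5   # FALSE: AND needs both sides to be true

# On vectors, | and & work element by element
pairwise <- c(TRUE, FALSE) | c(FALSE, FALSE)   # TRUE FALSE
```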
Combining character vectors…
my_string <- c("My", "Name", "is")
We are creating a vector that will be
"My" "Name" "is"
If we do paste(my_string, collapse = " ") it will become
"My Name is"
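A short sketch of paste() with collapse, next to the related sep argument and paste0(). The names sentence, joined and labels are made up.

```r
my_string <- c("My", "Name", "is")

# collapse joins the elements of one vector into a single string
sentence <- paste(my_string, collapse = " ")   # "My Name is"

# sep sets the separator between the separate arguments
joined <- paste("a", "b", sep = "-")           # "a-b"

# paste0() is paste() with no separator, applied element by element
labels <- paste0("x", 1:3)                     # "x1" "x2" "x3"
```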
Sample Data
If y <- rnorm(1000)
and z <- rep(NA, 1000)
we can use this to sample 100 random items:
my_data <- sample(c(y, z), 100)
Another way to remove NAs is to keep only the elements that are not missing:
y <- x[!is.na(x)]
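Putting the sampling and the NA removal together in one runnable sketch. set.seed() is added only so the example is repeatable, and clean is a made-up name.

```r
set.seed(42)            # make the random draw repeatable

y <- rnorm(1000)        # 1000 random normal values
z <- rep(NA, 1000)      # 1000 missing values

# Draw 100 items from the combined pool of numbers and NAs
my_data <- sample(c(y, z), 100)

# Keep only the elements that are not NA
clean <- my_data[!is.na(my_data)]

length(my_data)   # 100
anyNA(clean)      # FALSE
```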
Using the identical function:
x <- "hi"
y <- "hi"
identical(x, y)
will produce TRUE
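identical() is stricter than ==, which is worth seeing side by side. A minimal sketch:

```r
x <- "hi"
y <- "hi"
same_strings <- identical(x, y)    # TRUE

# identical() also compares types, so an integer and a double differ
same_number <- identical(1L, 1)    # FALSE
equal_value <- 1L == 1             # TRUE: == compares values after coercion
```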