Lecture 6
Dr. Elijah Meyer
Duke University
STA 199 - Spring 2023
September 14, 2022
– Clone ae-04
– HW-1 Due Tonight on Gradescope
– AEs are being graded
– Keep posting on Slack
– Feedback for Lab-0 is live
– Continue practicing with dplyr functions
– Change variable types
– Understand variable types
Identify whether each dplyr function chooses rows or changes columns of an existing data set
– filter() - row
– select() - column
– slice() - row
– arrange() - row
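A minimal sketch of the row/column distinction, assuming the dplyr package is installed and using the built-in mtcars data (the object names below are illustrative only):

```r
library(dplyr)

# Row operations: the columns stay the same, the rows change.
efficient   <- mtcars |> filter(mpg > 30)     # keep rows that meet a condition
first_three <- mtcars |> slice(1:3)           # keep rows by position
sorted      <- mtcars |> arrange(desc(mpg))   # reorder rows, highest mpg first

# Column operation: the rows stay the same, the columns change.
two_cols <- mtcars |> select(mpg, cyl)        # keep only the named columns

dim(efficient)   # fewer rows, all 11 columns
dim(two_cols)    # all 32 rows, only 2 columns
```

Comparing dim() before and after each call is a quick way to check whether a function acted on rows or on columns.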
– “Do everything in one go”
Ex.
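The original slide examples are not preserved here. As a sketch only, assuming "in one go" refers to chaining several dplyr functions with the pipe instead of saving intermediate objects, a pipeline on mtcars might look like this:

```r
library(dplyr)

# Each step passes its result to the next, so no intermediate objects are needed.
result <- mtcars |>
  filter(cyl == 4) |>        # rows: keep 4-cylinder cars
  select(mpg, wt) |>         # columns: keep mpg and weight
  arrange(desc(mpg)) |>      # rows: sort from highest to lowest mpg
  slice(1:3)                 # rows: keep the top three

result
```

The native pipe |> requires R 4.1 or later; the %>% pipe exported by dplyr works the same way in this example.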
Type is how an object is stored in memory.
– glimpse() is a great way to check data types
– Can also use typeof()
– glimpse(mtcars)
– typeof(mtcars$mpg)
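Run together, the two calls above can be sketched as follows, assuming dplyr is installed (it provides glimpse()):

```r
library(dplyr)

# One line per column: name, abbreviated type (e.g. <dbl>), and the first few values.
glimpse(mtcars)

# typeof() reports how a single column is stored.
mpg_type <- typeof(mtcars$mpg)
mpg_type   # "double"

# Check every column at once; in mtcars they are all stored as doubles.
sapply(mtcars, typeof)
```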
Some of the types of variables include:
– “logical”
– “integer”
– “double”
– “character”
– “factor” (strictly a class rather than a storage type: typeof() on a factor returns “integer”)
– lgl in glimpse (logi in str)
– The logical data type in R is also known as the boolean data type. It takes two values, TRUE and FALSE (missing values appear as NA).
– as.logical() can turn a variable into a logical. For numbers, 0 becomes FALSE and everything else becomes TRUE.
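A short sketch of the numeric-to-logical rule (the vector x is made up for illustration):

```r
x <- c(0, 1, 2, -3.5)

# Only 0 maps to FALSE; any other number maps to TRUE.
as_lgl <- as.logical(x)
as_lgl          # FALSE TRUE TRUE TRUE

typeof(as_lgl)  # "logical"
```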
– int in glimpse
– Integers are whole numbers (numbers without a decimal point)
– as.integer() can turn a double into an integer. It drops the decimal part rather than rounding, so 22.8 becomes 22.
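A quick illustration of how as.integer() treats decimals, compared with round():

```r
truncated <- as.integer(22.8)   # drops the decimal part
truncated                       # 22

as.integer(-22.8)               # -22 (truncates toward zero)
round(22.8)                     # 23, for comparison

# The L suffix creates an integer; a bare number is a double.
typeof(22L)                     # "integer"
typeof(22)                      # "double"
```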
– dbl in glimpse
– Real numbers (can include decimals)
– as.double() can force a column to be a double. Identical to as.numeric().
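A small sketch showing that as.double() and as.numeric() give the same result (the input values are made up):

```r
from_int <- as.double(5L)
typeof(from_int)                 # "double"

# Text that looks like a number can be converted too.
a <- as.double("3.14")
b <- as.numeric("3.14")
identical(a, b)                  # TRUE
```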
– chr in glimpse
– Character string (text)
– as.character() attempts to coerce its argument to character type
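For example, numbers and logicals become text once converted (a minimal sketch):

```r
num_text <- as.character(22.8)
num_text                 # "22.8"

lgl_text <- as.character(TRUE)
lgl_text                 # "TRUE"

typeof(num_text)         # "character"
```

Once a value is text, arithmetic on it no longer works until it is converted back to a number.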
– fct in glimpse
– A factor is the categorical variable type in R. It stores each value as an integer code that maps to a text label, called a level.
– factor() attempts to coerce its argument to factor type
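A minimal sketch of what a factor stores, using a made-up character vector:

```r
sizes <- factor(c("low", "high", "low"))

# Levels are the distinct labels, sorted alphabetically by default.
levels(sizes)        # "high" "low"

# Underneath, each value is an integer code pointing at a level.
as.integer(sizes)    # 2 1 2

class(sizes)         # "factor"
typeof(sizes)        # "integer"
```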
– Plotting
– Summary statistics
– Can you identify variable types?
– Often need to turn something into a factor to make it categorical
– Often need to turn something into a double (numeric) to make it quantitative
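Both conversions can be sketched as follows, assuming dplyr is installed. The cars object and the prices vector are made up for illustration:

```r
library(dplyr)

# cyl is stored as a double in mtcars, but it only takes the values 4, 6, and 8.
# Converting it to a factor tells R to treat it as categorical.
cars <- mtcars |> mutate(cyl = as.factor(cyl))
levels(cars$cyl)      # "4" "6" "8"
cars |> count(cyl)    # one row per category

# Numbers read in as text must become doubles before any arithmetic.
prices <- c("1.5", "2", "10.25")
prices_num <- as.double(prices)
mean(prices_num)
```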
– Data types matter. Get in the habit of checking them at the beginning of analysis
– Have the tools to create new variables, calculate summary statistics, etc. that accompany strong visualizations