We will work through this Quarto document together during class. The activity covers using R as a calculator, creating R objects, and exploring the features of a data set.
First Load the Tidyverse Package
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# logical
# testing each element in the vector and outputting TRUE or FALSE
over100 <- x > 100
table(over100)
over100
FALSE TRUE
100 125
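To make the logical test above concrete, here is a minimal sketch on a small made-up vector. The names `scores` and `over100_toy` are hypothetical and stand in for the activity's own `x` and `over100`, which are defined earlier in the document. Because TRUE counts as 1 and FALSE as 0, `sum()` and `mean()` give the count and the share of values that pass the test.

```r
# Hypothetical toy vector (not the activity's `x`)
scores <- c(42, 101, 250, 100, 99)

# Comparing a vector to a number tests every element at once
over100_toy <- scores > 100   # 100 itself is not greater than 100

sum(over100_toy)      # how many elements are over 100
mean(over100_toy)     # what proportion of elements are over 100
scores[over100_toy]   # keep only the elements that passed the test
```

The same logical vector can be used for counting (`table()`, `sum()`) or for subsetting with square brackets.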
Your Turn
A. Create a vector of all the homeworlds in starwars using the starwars data.
# create vector called "homeworlds" and assign it the value "homeworld" from the starwars data set
# how many worlds are there? hint: use the unique function
# is there a world called "Ohio"? how would you test this with code?
# How many characters live on Naboo?
# Who lives on Naboo? (hint: use the "name" variable in the starwars data and the "which" function)
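The hints in part A rely on a handful of base R tools. The sketch below shows them on a made-up pair of vectors rather than the starwars data, so the exercise itself is left to you. The names `planets` and `residents` and their contents are hypothetical.

```r
# Hypothetical toy data: one planet per resident, with one missing value
planets   <- c("Mars", "Venus", "Mars", "Earth", NA, "Venus", "Mars")
residents <- c("Ann", "Bo", "Cy", "Di", "Ed", "Flo", "Gus")

# unique() drops repeats; note that NA is kept as its own entry
unique(planets)
length(unique(planets))

# %in% tests whether a value appears anywhere in the vector
"Ohio" %in% planets

# Count matches; na.rm = TRUE ignores the missing value
sum(planets == "Mars", na.rm = TRUE)

# which() returns the positions of TRUE (NA is skipped),
# and those positions can index a second vector
residents[which(planets == "Mars")]
```

When counting distinct values in real data, check whether NA is among them, since `unique()` does not remove it.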
B. Import and explore the dataframe called “taylor” from the csv “taylorswift.csv”
library(taylor)
taylor <- taylor_all_songs
# what is the "class" of the object taylor?
# what types of data are in the object taylor?
# change the "album_name" from class "character" to class "factor"
# How many albums are in the data set & how many songs on each album?
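The questions in part B can be approached with `class()`, `sapply()`, `as.factor()`, and `table()`. Here is a minimal sketch on a made-up data frame. The object `songs` and its columns are hypothetical, and the real `taylor` object may report different classes.

```r
# Hypothetical toy data frame (not the taylor data)
songs <- data.frame(
  album = c("A", "A", "B", "C", "C", "C"),
  tempo = c(120, 98, 140, 110, 105, 132),
  stringsAsFactors = FALSE
)

class(songs)           # class of the whole object
sapply(songs, class)   # class of each column

# Convert a character column to a factor
songs$album <- as.factor(songs$album)

nlevels(songs$album)   # number of distinct albums
table(songs$album)     # number of rows (songs) per album
```

`str(songs)` gives a compact overview of the same information in a single call.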
C. Which numeric song features are correlated with one another? Hint: create a correlation matrix.
# pick what features of the data you want to explore
# create a matrix of those features
# evaluate the correlation matrix
mat |> GGally::ggpairs()
Registered S3 method overwritten by 'GGally':
method from
+.gg ggplot2
Error: object 'mat' not found
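The error above appears because `mat` has not been created yet: the object has to exist before it can be piped into `ggpairs()`. As a minimal sketch of the workflow, here is the same idea on R's built-in `mtcars` data rather than the taylor data. The names `mat_demo` and `cor_demo` and the choice of columns are illustrative assumptions.

```r
# Pick a few numeric columns from the built-in mtcars data set
mat_demo <- mtcars[, c("mpg", "hp", "wt")]

# cor() returns a square matrix of pairwise correlations
cor_demo <- cor(mat_demo)
round(cor_demo, 2)

# Negative entries: the two features move in opposite directions
# Positive entries: the two features rise and fall together
cor_demo["mpg", "wt"] < 0   # heavier cars tend to have lower mpg
cor_demo["hp", "wt"] > 0    # heavier cars tend to have more horsepower
```

If the selected columns contain missing values, `cor()` returns NA for those pairs unless you pass an argument such as `use = "complete.obs"`.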
# which values show a positive correlation? Which values show a negative correlation?
#var1 <-
#var2 <-
#var3 <-
#var4 <-
#ggplot(taylor, aes(x=var1, y=var2)) + geom_point(size=3)
#ggplot(taylor, aes(x=var3, y=var4)) + geom_point(size=3)
#library(taylor)
#ggplot(taylor, aes(x=loudness, y=energy, color = album_name)) + geom_point(size=3) + scale_color_albums() + facet_wrap(~album_name)
#ggplot(taylor, aes(x=acousticness, y=energy, color = album_name)) + geom_point(size=3) + scale_color_albums() + facet_wrap(~album_name)