Dimension Reduction with tidymodels

Author
Affiliation

Department of Cell Biology, OUHSC

Published

July 14, 2025

Pre-lecture activities

Before class, watch this video from data scientist Julia Silge of RStudio. She works through a live data analysis using Principal Component Analysis (PCA) on the best hip hop songs of all time according to critics' ratings. The video is not a detailed explanation of what PCA is or how it works (there is plenty of that math online if you want it). Instead, it is a live, real-time analysis that asks which song features make a hip hop song the best. I want you to see that PCA is often a first line of inquiry when exploring a dataset.

  • For background, TidyTuesday is a weekly podcast and community activity that brings an interesting dataset to the data science community each week for plotting or analysis. It is a good source of data for teaching or code testing.

  • This video goes pretty fast, but all the code she uses is below the video.

Lecture

Learning objectives

At the end of class you will be able to:

  • Define dimension reduction
  • Explain why dimension reduction is used
  • Build a model to perform PCA on a dataset
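
As a preview of the last objective, here is a minimal sketch of PCA using the recipes package from tidymodels. It is not the code from the video: the built-in iris data, the choice of two components, and the object names (pca_rec, pca_prep, pca_scores, pca_loadings) are assumptions made purely for illustration.

```r
library(recipes)

# Illustrative data only: iris has four numeric measurements and a Species label.
pca_rec <- recipe(~ ., data = iris) |>
  # Keep Species as an identifier so it is not used in the PCA
  update_role(Species, new_role = "id") |>
  # Center and scale the measurements so no variable dominates by its units
  step_normalize(all_predictors()) |>
  # Replace the measurements with the first two principal components
  step_pca(all_predictors(), num_comp = 2)

# Estimate the normalization statistics and the PCA rotation from the data
pca_prep <- prep(pca_rec)

# Component scores for each row of the training data (Species, PC1, PC2)
pca_scores <- bake(pca_prep, new_data = NULL)

# Loadings: how much each original variable contributes to each component
# (number = 2 selects the second step in the recipe, step_pca)
pca_loadings <- tidy(pca_prep, number = 2)
```

Plotting PC1 against PC2 from pca_scores, colored by Species, is a common first look at how the samples group, and pca_loadings shows which measurements drive each component.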

Slides

Post-lecture Practice

  • If that wasn’t enough dimension reduction, you can read more here, here, or here.