STA 199: Introduction to Data Science and Statistical Thinking

This page contains an outline of the topics, content, and assignments for the semester. Note that the schedule, including the timeline of topics and assignments, may be updated as the semester progresses.

WEEK DATE TOPIC PREPARE MATERIALS DUE
1 Mon, Aug 25 Lab 0: Mise en place
💻 lab 0 Lab 0 due at the end of lab session (not graded)

Tue, Aug 26 Hello World and Hello STA 199! 📝 Syllabus 🖥️ slides 01
🗒️ notes 01


Thu, Aug 28 Meet the toolkit 📗 r4ds - intro
📘 ims - chp 1
🎥 Meet the toolkit :: R and RStudio
🎥 Meet the toolkit :: Quarto
🎥 Code along :: First data viz with UN Votes
🖥️ slides 02
🗒️ notes 02
⌨️ ae 01
ae 01

2 Mon, Sep 1 No lab - Labor Day



Tue, Sep 2 ARC Presentation
Grammar of data visualization
📗 r4ds - chp 1
📘 ims - chp 4
🎥 Visualizing data
🎥 Building a plot step-by-step with ggplot2
🎥 Grammar of graphics
🎥 Code along :: First look at Palmer Penguins
🖥️ slides 03
🗒️ notes 03


Thu, Sep 4 Grammar of data transformation 📗 r4ds - chp 2
📗 r4ds - chp 3.1-3.5
🎥 Grammar of data transformation
🎥 Code along :: Flights and pipes
🖥️ slides 04
🗒️ notes 04
⌨️ ae 02
ae 02

3 Mon, Sep 8 Lab 1: Exploring NC Counties
💻 lab 1
📝 hw 1
Lab 1 due at the end of the lab session

Tue, Sep 9 Exploratory data analysis I 📗 r4ds - chp 3.6-3.7
🎥 Visualizing and summarizing categorical data
🎥 Visualizing and summarizing numerical data
🎥 Visualizing and summarizing relationships
🎥 Code along :: Star Wars characters
🖥️ slides 05
🗒️ notes 05
⌨️ ae 03
ae 03


Thu, Sep 11 Exploratory data analysis II 📘 ims - chp 5
📘 ims - chp 6
🎥 Code along :: Diving deeper with Palmer Penguins
🖥️ slides 06
🗒️ notes 06
⌨️ ae 04
ae 04


Sun, Sep 14


HW 1 due at 11:59 pm
4 Mon, Sep 15 Lab 2: Get in teams then group_by() 📗 r4ds - chp 4 💻 lab 2
📝 hw 2
Lab 2 due at the end of the lab session

Tue, Sep 16 Tidying data 🎥 Tidy data
🎥 Tidying data
🎥 Code along :: Country populations over time
📗 r4ds - chp 5
🖥️ slides 07
🗒️ notes 07
⌨️ ae 05
ae 05


Thu, Sep 18 Joining data 🎥 Joining data
🎥 Code along :: Continent populations
📗 r4ds - chp 19.1-19.3
🖥️ slides 08
🗒️ notes 08
⌨️ ae 06
ae 06


Sun, Sep 21


HW 2 due at 11:59 pm
5 Mon, Sep 22 Lab 3: Inflation everywhere
💻 lab 3
📝 hw 3
Lab 3 due at the end of the lab session

Tue, Sep 23 Data types and classes 🎥 Data types
🎥 Data classes
🎥 Code along :: That’s my type
📗 r4ds - chp 16
🖥️ slides 09
🗒️ notes 09
⌨️ ae 07
ae 07


Thu, Sep 25 Importing and recoding data 🎥 Importing data
🎥 Code along :: Halving CO2 emissions
🎥 Code along :: Student survey
📗 r4ds - chp 7
📗 r4ds - chp 17.1 - 17.3
🖥️ slides 10
🗒️ notes 10
⌨️ ae 08
ae 08


Sun, Sep 28


HW 3 due at 11:59 pm
6 Mon, Sep 29 Lab 4: Changes in college athletics
💻 lab 4 Lab 4 due at the end of lab session

Tue, Sep 30 Exam 1 review
🖥️ slides 11
🗒️ notes 11
📝 exam 1 review
exam 1 review


Thu, Oct 2 Exam 1 - In-class + take-home released



Sat, Oct 4


Exam 1 take-home due at 12 pm (noon)
7 Mon, Oct 6 Project milestone 1 - Working collaboratively [45 minutes]
Project milestone 2 - Project proposals [30 minutes]
📝 Pre-read: Merge conflicts
📝 Project description
📓 project milestone 1
📓 project milestone 2
Project milestone 1 due at the end of lab session

Tue, Oct 7 Web scraping a single page 🎥 Web scraping basics
🎥 Code along :: Scraping an eCommerce page
📗 r4ds - chp 24.1 - 24.6
🖥️ slides 12
🗒️ notes 12
⌨️ ae 09
ae 09


Thu, Oct 9 Web scraping many pages 🎥 Code along :: Scraping many eCommerce pages
🎥 Web scraping considerations
📗 r4ds - chp 25.1 - 25.2
🖥️ slides 13
🗒️ notes 13
⌨️ ae 09
ae 09
Midterm course evaluation due at 11:59 pm (optional)
8 Mon, Oct 13 No lab - Fall Break



Tue, Oct 14 No lecture - Fall Break



Thu, Oct 16 Data science ethics 🎥 Misrepresentation
🎥 Data privacy
🎥 Algorithmic bias
🎥 Code along :: Sectors and services
🖥️ slides 14
🗒️ notes 14
Project milestone 2 due at 11:59 pm
Peer evaluation 1 due at 11:59 pm
9 Mon, Oct 20 Project milestone 3 - Improve and progress 📝 Tidyverse style guide - Chp 1-5 📓 project milestone 3 Project milestone 3 due at the end of lab session

Tue, Oct 21 The language of models 🎥 The language of models
🎥 Linear regression with a numerical predictor
📘 ims - chp 7.1
🖥️ slides 15
🗒️ notes 15
⌨️ ae 10
ae 10


Thu, Oct 23 Linear regression with a single predictor 🎥 Linear regression with a categorical predictor
🎥 Outliers in linear regression
🎥 Code along :: Modeling fish
📘 ims - chp 7.2
🖥️ slides 16
🗒️ notes 16
⌨️ ae 11
ae 11
Peer evaluation 2 due at 11:59 pm
10 Mon, Oct 27 Project milestone 4 - Peer review [30 minutes]
Lab 5: Make up your other half [45 minutes]

📓 project milestone 4
💻 lab 5
📝 hw 4
Project milestone 4 due at the end of lab session
Lab 5 due at the end of the lab session

Tue, Oct 28 Linear regression with multiple predictors 🎥 Linear regression with multiple predictors
🎥 Main and interaction effects
📘 ims - chp 8.1-8.2
📘 ims - chp 8.3-8.5
🖥️ slides 17
🗒️ notes 17
⌨️ ae 12
ae 12


Thu, Oct 30 Model selection and overfitting 🎥 Code along :: Modeling interest rates 🖥️ slides 18
🗒️ notes 18
Peer evaluation 3 due at 11:59 pm

Sun, Nov 2


HW 4 due at 11:59 pm
11 Mon, Nov 3 Project milestone 5 - Work on writeup and presentations
📓 project milestone 5 Project milestone 5 due at the end of lab session

Tue, Nov 4 Developing and communicating data science results 📘 ims - chp 6
📗 r4ds - chp 10
🖥️ slides 19
🗒️ notes 19


Thu, Nov 6 Logistic regression 🎥 Logistic regression
🎥 Code along :: Building a spam filter
📘 ims - chp 9
🖥️ slides 20
🗒️ notes 20
⌨️ ae 13
ae 13

12 Mon, Nov 10 Project milestone 6 - Presentation
📓 project milestone 6 Project milestone 6 presentation due at the start of lab session

Tue, Nov 11 Spending your data 🎥 Classification and decision errors
🎥 Overfitting and spending your data
🖥️ slides 21
🗒️ notes 21


Thu, Nov 13 Evaluating models 🎥 Code along :: Forest classification 🖥️ slides 22
🗒️ notes 22
⌨️ ae 14
ae 14
Project milestone 6 write-up due at 11:59 pm

Fri, Nov 14


Peer evaluation 4 due at 11:59 pm
13 Mon, Nov 17 Lab 6: Everything so far II
💻 lab 6 Lab 6 due at the end of lab session

Tue, Nov 18 Exam 2 review
🖥️ slides 23
🗒️ notes 23
📝 exam 2 review
exam 2 review


Thu, Nov 20 Exam 2 - In-class + take-home released



Sat, Nov 22


Exam 2 take-home due at 12 pm (noon)
14 Mon, Nov 24 Lab 7: Leavin’ on a jet plane
💻 lab 7
📝 hw 5
Lab 7 due at the end of lab session

Tue, Nov 25 Quantifying uncertainty with bootstrap intervals 🎥 Quantifying uncertainty
🎥 Bootstrapping
🎥 Code along :: Bootstrapping Duke Forest houses
📘 ims - chp 11
📘 ims - chp 12
🖥️ slides 24
🗒️ notes 24
⌨️ ae 15
ae 15


Thu, Nov 27 No lecture - Thanksgiving



Sun, Nov 30


HW 5 due at 11:59 pm (will be accepted until Wed, Dec 3 at 11:59 pm without penalty)
15 Mon, Dec 1 Lab 8: Inference
💻 lab 8
📝 hw 6
Lab 8 due at the end of lab session

Tue, Dec 2 Making decisions with randomization tests 🎥 Hypothesis testing
📘 ims - chp 11
🖥️ slides 25
🗒️ notes 25


Thu, Dec 4 Looking further
🖥️ slides 26
🗒️ notes 26


Fri, Dec 5


HW 6 due at 11:59 pm (will be accepted until Sun, Dec 7 at 11:59 pm without penalty)

NA Final review (time TBD, location TBD)
📝 final review
final review

16 Fri, Dec 12 Final (2 pm - 5 pm)