Principles and Techniques of Data Science

UC Berkeley, Summer 2020

This schedule is tentative and likely to change. See the Calendar for the timing of our weekly events.

Week 1

Jun 22

Lecture 1 Course Overview (slides) (code) (video)

Homework 1 Prerequisites (due Jun. 24)

Discussion 1 Prerequisite Review (video) (solutions)

Jun 23

Lecture 2 Data Sampling and Probability I

Lab 1 Prerequisite Coding (due Jun. 23)

Jun 24

Lecture 3 Data Sampling and Probability II

Discussion 2 Random Variables

Homework 2 Trump Sampling

Jun 25

Lecture 4 SQL

Lab 2 SQL (due Jun. 25)

Week 2

Jun 29

Lecture 5 Pandas I

Project 1 Food Safety

Discussion 3 SQL

Jun 30

Lecture 6 Pandas II

Lab 3 Pandas I

Jul 1

Lecture 7 Data Cleaning and EDA

Discussion 4 Pandas II

Jul 2

Lecture 8 Regular Expressions

Lab 4 Data Cleaning and EDA

Week 3

Jul 6

Lecture 9 Visualization I

Homework 3 Bike Sharing

Discussion 5 Regex

Jul 7

Lecture 10 Visualization II

Lab 5 Visualization & KDE

Jul 8

Lecture 11 Modeling

Discussion 6 Visualizations and Transformations

Homework 4 Trump

Jul 9

Exam Midterm I

Lab 6 Modeling and Loss Functions

Week 4

Jul 13

Lecture 12 Simple Linear Regression

Discussion 7 Correlation

Jul 14

Lecture 13 Ordinary Least Squares

Lab 7 Regression

Jul 15

Lecture 14 Feature Engineering

Discussion 8 Geometric Least Squares & One Hot Encoding

Homework 5 Regression

Jul 16

Lecture 15 Bias-Variance Tradeoff

Lab 8 Feature Engineering

Week 5

Jul 20

Lecture 16 Regularization & Cross-Validation

Homework 6 Housing

Discussion 9 Bias-Variance & Cross-Validation

Jul 21

Lecture 17 Gradient Descent

Lab 9 Cross-Validation

Jul 22

Lecture 18 Logistic Regression I

Discussion 10 Gradient Descent & Logistic Regression

Homework 7 Gradient Descent & Logistic Regression

Jul 23

Lecture 19 Logistic Regression II and Classification

Lab 10 Logistic Regression

Week 6

Jul 27

Exam Midterm II

Discussion 11 Cross Entropy Loss and Classification

Jul 28

Lecture 20 Inference for Modeling

Lab 11 Bootstrap the Model Parameters

Jul 29

Lecture 21 Decision Trees

Discussion 12 Decision Trees & Random Forests

Project 2 Spam/Ham

Jul 30

Lecture 22 Dimensionality Reduction & PCA

Lab 12 Decision Trees

Week 7

Aug 3

Lecture 23 PCA

Discussion 13 PCA

Aug 4

Lecture 24 Clustering

Lab 13 Clustering

Aug 5

Lecture 25 Guest Lecture

Discussion 14 Clustering

Homework 8 PCA

Aug 6

Lecture 26 Conclusion

Week 8

Aug 10

Lecture Review

Aug 11

Lecture Review

Aug 12

Exam Final Part I

Aug 13

Exam Final Part II