Materials

Labs and Discussion Notebooks

Labs

Lab 2: Pandas Overview

Lab 3: Data Cleaning and Seaborn

Lab 4: Plotting, Smoothing, Transformation

Lab 5: Regular Expressions

Lab 6: Modeling and Estimation

Lab 7: Bootstrap

Lab 8: SQL and Database Setup

Lab 9: Spark

Lab 10: Least Squares Regression

Lab 11: Feature Engineering & Cross Validation

Lab 12: TensorFlow & Logistic/Softmax Regression

Lab 13: XPath

Practice Exam Questions

Midterm

Discussions

Discussion 1: Python, NumPy, Matrix Operations, Calculus

Discussion 2: Sampling

Discussion 3: HW2 Recap, Data Cleaning, and EDA

Discussion 4: Visualizations / Slides

Discussion 5: Regex

Discussion 6: Estimation & Convexity

Discussion 7: SQL

Discussion 8: P-values & P-hacking

Discussion 9: Regression and Featurization

Discussion 10: Linear Algebra and Gradient Review

Discussion 12: Precision/Recall and Logistic Regression

Previous Semester’s Materials

You can take a look at materials from the previous semester's GitHub repository. You can also obtain all the original slides from last semester on Google Drive, and all datasets are available here.