Schedule

Author: Davi Moreira

Course Schedule

| Day | Date | Topic | Videos | Notebook | Assessment | Materials |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Pre-course | Launchpad: Welcome, course setup, and Colab orientation | 2 videos | Open In Colab | Colab Readiness Check | Google Colab docs |
| 1 | Mon May 18 | Predictive analytics fundamentals, EDA, and data splitting | 4 videos | Open In Colab | Concept Quiz | ISLP Ch 2; sklearn: cross-validation; Kaggle Learn: data leakage |
| 2 | Tue May 19 | Data setup and preprocessing pipelines (the professional way) | 3 videos | Open In Colab | Concept Quiz; Participation | sklearn: pipelines, ColumnTransformer; Pedregosa et al.: scikit-learn paper |
| 3 | Wed May 20 | Regression metrics and baseline modeling (with test-set lockbox discipline) | 3 videos | Open In Colab | Concept Quiz; 3-sentence Evaluation Note | ISLP: Model Assessment; sklearn: regression metrics |
| 4 | Thu May 21 | Linear regression that actually works: features, interactions, diagnostics | 3 videos | Open In Colab | Concept Quiz; Participation | ISLP Ch 3: Linear Regression; sklearn: LinearRegression, PolynomialFeatures |
| 5 | Fri May 22 | Regularization (Ridge/Lasso) + Project proposal sprint | Synchronous lecture + 3 videos | Open In Colab | Concept Quiz; PROJECT MILESTONE 1: Proposal + Dataset | ISLP Ch 6: Regularization; sklearn: Ridge/Lasso/ElasticNet |
| 6 | Mon May 25 | Logistic regression: probabilities, decision boundaries, and pipelines | 3 videos | Open In Colab | Concept Quiz; Participation | ISLP Ch 4: Classification; sklearn: LogisticRegression |
| 7 | Tue May 26 | Classification metrics: confusion matrix, ROC and PR curves, and business costs | 3 videos | Open In Colab | Concept Quiz; Threshold Recommendation | Fawcett: ROC analysis; Saito & Rehmsmeier: PR curves; sklearn: classification metrics |
| 8 | Wed May 27 | Resampling and CV: how to compare models without fooling yourself (k-fold + Student’s t CIs) | 3 videos | Open In Colab | Concept Quiz; Participation | ISLP Ch 5: Resampling; sklearn: cross-validation utilities |
| 9 | Thu May 28 | Hyperparameter tuning + feature engineering + leakage detection (and Project baseline build) | 8 videos | Open In Colab | Concept Quiz; Project Baseline Draft | sklearn: GridSearchCV, RandomizedSearchCV; Provost & Fawcett: evaluation framing |
| 10 | Fri May 29 | Midterm: Business-case predictive strategy practicum + Project baseline submission | Synchronous lecture + 3 videos | Open In Colab | MIDTERM (graded); PROJECT MILESTONE 2: Baseline Model + Evaluation Plan | Provost & Fawcett: business framing; sklearn: common pitfalls |
| 11 | Mon Jun 1 | Decision trees: interpretable models with sharp edges | 6 videos | Open In Colab | Concept Quiz; Participation | ISLP Ch 8: Tree-Based Methods; sklearn: DecisionTree estimators |
| 12 | Tue Jun 2 | Random forests: bagging, OOB intuition, and feature importance | 6 videos | Open In Colab | Concept Quiz; Participation | Breiman: Random Forests paper; sklearn: RandomForest, permutation importance |
| 13 | Wed Jun 3 | Gradient boosting: performance with discipline (and leakage avoidance) | 6 videos | Open In Colab | Concept Quiz; Participation | Friedman: Gradient Boosting Machine; sklearn: gradient boosting estimators |
| 14 | Thu Jun 4 | Model selection and comparison: making the call like a professional | 6 videos | Open In Colab | Concept Quiz; Participation | ISLP: Model Assessment; sklearn: model evaluation best practices |
| 15 | Fri Jun 5 | Final Project Milestone 03 Walkthrough: Complex Model + Hyperparameter Tuning + Draft Abstract | Synchronous lecture + 6 videos | Open In Colab | Concept Quiz; PROJECT MILESTONE 3: More Complex Model + Hyperparameter Tuning + Draft Abstract + saved champion_pipeline.joblib | M3 rubric: complex-model tuning, CI-overlap rule, draft abstract; sklearn: GridSearchCV, joblib persistence |
| 16 | Mon Jun 8 | Time-series forecasting: walk-forward CV, lag features, and baseline models | 6 videos | Open In Colab | Concept Quiz; Participation | Hyndman & Athanasopoulos: FPP3 (otexts.com/fpp3); sklearn: TimeSeriesSplit |
| 17 | Tue Jun 9 | Data communication and poster design: six principles (context, visualization, less-is-more, hierarchy, beauty, story) applied to the eleven-section research-poster architecture | 6 videos | Open In Colab | Concept Quiz; Draft Poster Outline + Abstract | Tufte: data-ink ratio; Healy: Data Visualization; Knaflic: Storytelling with Data |
| 18 | Wed Jun 10 | Competition workflow: end-to-end pipeline from notebook to Kaggle submission | 6 videos | Open In Colab | Concept Quiz; submission.csv to Kaggle | Chip Huyen: Designing ML Systems; sklearn: model persistence (joblib) |
| 19 | Thu Jun 11 | Special topic: deep learning (awareness, when-to-use, and one tabular demo) | 6 videos | Open In Colab | Concept Quiz; Four-Question Rubric | ISLP Ch 10: Deep Learning; Goodfellow et al.: Deep Learning Book |
| 20 | Fri Jun 12 | Course end and reflection: project package submission + peer review + reflection survey | Synchronous lecture + 6 videos | Open In Colab | PROJECT MILESTONE 4: Final Research Poster (single PDF named <group-number>.pdf) + intra-group Peer Evaluation form; KAGGLE COMPETITION DEADLINE (11:59 PM); REFLECTION SURVEY (required for course completion) | Course rubric for M4 final poster; Purdue URC poster guidelines |

Core Course References

  • James, Witten, Hastie, Tibshirani, Taylor. An Introduction to Statistical Learning with Applications in Python (ISLP) + Python labs. Download: https://www.statlearning.com/
  • Hastie, Tibshirani, Friedman. The Elements of Statistical Learning (ESL).
  • Provost, Fawcett. Data Science for Business.
  • Pedregosa et al. “Scikit-learn: Machine Learning in Python.” JMLR.
  • scikit-learn User Guide (pipelines, preprocessing, model selection, metrics, inspection).
  • Chip Huyen. Designing Machine Learning Systems (deployment thinking, monitoring).