05 Supervised Learning and Ensembles
In block 05 we cover (a short illustrative sketch follows this list):
- Introduction to Classification:
  - Classification in context
  - Logistic Regression
  - Receiver Operating Characteristic and Precision-Recall Curves (ROC and PR Curves)
  - Area Under the Curve (AUC)
  - k-Nearest Neighbour (kNN) classification
  - Linear Discriminant Analysis (the ‘first’ LDA)
  - Support Vector Machines (SVM)
- Ensemble methods/Meta-methods, including:
  - Bootstrap Aggregating (Bagging)
  - Boosting
  - Stacking
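To make the classification topics concrete, here is a minimal sketch, written in Python with scikit-learn purely as an assumed illustrative toolkit (the workshops may use different tools); the synthetic dataset and every setting below are arbitrary choices, not course material. It fits the classifiers listed above and compares them by cross-validated ROC AUC and average precision (a summary of the precision-recall curve).

```python
# Minimal illustrative sketch (assumes scikit-learn; dataset and settings are arbitrary).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class problem, purely for illustration.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "kNN (k=5)": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "LDA": LinearDiscriminantAnalysis(),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC()),
}

for name, model in models.items():
    # ROC AUC and average precision, each averaged over 5 cross-validation folds.
    roc_auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    avg_prec = cross_val_score(model, X, y, cv=5, scoring="average_precision").mean()
    print(f"{name}: ROC AUC = {roc_auc:.3f}, average precision = {avg_prec:.3f}")
```

Whether ROC or PR is the more informative summary (particularly for imbalanced classes) is exactly the question the Davis and Goadrich reference below addresses.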
Lectures:
- Introduction to Classification:
- Ensemble Methods:
Worksheets:
Workshop:
Assessments:
- Assessment 2 will be set this week; see Assessments. This is a summative assessment (i.e. it does contribute to your grade) and will be due in Week 12.
References
- General Classification:
  - Rob Schapire’s ML Classification features a Batman Example…
  - Chapter 4 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Friedman, Hastie and Tibshirani).
- ROC and PR:
  - Stack Exchange Discussion of ROC vs PR curves.
  - Davis and Goadrich, “The Relationship Between Precision-Recall and ROC Curves”, ICML 2006.
- k-Nearest Neighbours:
  - Chapter 13.3 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Friedman, Hastie and Tibshirani).
- Linear Discriminant Analysis:
  - Sebastian Raschka’s PCA vs LDA article with Python Examples
  - Chapter 4.3 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Friedman, Hastie and Tibshirani).
- SVMs:
  - Jason Weston’s SVMs tutorial
  - e1071 Package for SVMs in R
  - Chapter 12 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Friedman, Hastie and Tibshirani).
- Ensemble learning in general (a short illustrative sketch follows the reference list):
  - Vadim Smolyakov, MIT: ML-perspective on Ensemble Methods
  - Stacked Ensembles by H2O, a Commercial AI Company focussing on Deployable AI
  - StackExchange: Stacking vs Bagging vs Boosting
  - Super Learners: van der Laan M, Polley E and Hubbard A, “Super Learner” (2007), Statistical Applications in Genetics and Molecular Biology, Volume 6.
- Boosting:
  - AdaBoost paper: “Experiments with a New Boosting Algorithm”, Freund and Schapire (1996).
  - “Explaining AdaBoost”, Rob Schapire, Empirical Inference (2013), pp 37-52.
  - xgboost: “XGBoost: A Scalable Tree Boosting System”, Chen T and Guestrin C, KDD 2016.
  - xgboost explained, a blog post about Didrik Nielsen’s paper Tree Boosting With XGBoost: Why Does XGBoost Win “Every” Machine Learning Competition?
- Kroese et al.’s Data Science & Machine Learning free ebook looks pretty helpful.
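As flagged in the ensemble entry above, here is a companion sketch of the three meta-methods from this block: bagging, boosting and stacking. It again assumes Python/scikit-learn rather than any package named in the references (AdaBoost stands in for the boosting references, and StackingClassifier for the stacking / super-learner idea); all settings are illustrative only.

```python
# Minimal illustrative sketch of bagging, boosting and stacking (assumes scikit-learn).
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class problem, purely for illustration.
X, y = make_classification(n_samples=500, n_features=10, random_state=1)

ensembles = {
    # Bagging: average many trees, each fit to a bootstrap resample of the data.
    "Bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100),
    # Boosting: fit weak learners sequentially, upweighting misclassified points (AdaBoost).
    "Boosting": AdaBoostClassifier(n_estimators=100),
    # Stacking: combine heterogeneous base learners through a meta-learner.
    "Stacking": StackingClassifier(
        estimators=[("tree", DecisionTreeClassifier(max_depth=3)),
                    ("svm", SVC(probability=True))],
        final_estimator=LogisticRegression(),
    ),
}

for name, model in ensembles.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

The boosting references above cover both AdaBoost and gradient-boosted trees (xgboost); the sketch uses AdaBoost only because it is the simpler of the two to demonstrate, and the shape of the comparison is the same either way.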