02 Regression and Statistical Testing
This is an important, theory-heavy block: the lectures are longer than usual, and the workshop is correspondingly shorter.
In Block 02, we cover:
- Classical Regression:
  - How to implement regression in practice
  - How to interpret regression outputs
  - Standard and Logistic regression
- Modern Regression:
  - Matrix formulation of multivariate regression
  - Elements of multivariate calculus
- Statistical Testing:
  - Classical statistical testing
  - Resampling approaches to statistical testing
  - Model evaluation using Cross Validation
- Implementations in R:
  - Regression analyses and limitations when applied to cyber security data
Lectures:
Regression:
Statistical Testing and Model Selection:
- 2.2.1 - Statistical Testing - Classical Testing (25.50)
- 2.2.2 - Statistical Testing - Empirical Testing (24.53)
- 2.2.3 - Model Selection (36.26)
- Reference R code
Worksheets:
Preparation:
If you have not completed the Block 1 Preparation, please do so.
Workshop:
Mastery:
This pair of lectures forms the core theoretical content for the Unit, so it is essential that you understand this material. Revisit it several times and reflect on how it connects to future blocks. Applying these ideas to more complex models is the core of Data Science.
Reference material:
For Regression:
- Cosma Shalizi’s Modern Regression Lectures (Lectures 4-9 for basic material; Lectures 13-14 for Linear Algebra approach)
- Matrix Multiplication Cheat Sheet
- A complete reference is The Matrix Cookbook
- Sam Roweis’ Matrix Identities
- Further reading: Sections 2.3 and 3.2 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Hastie, Tibshirani and Friedman)
For Statistical Testing:
- Cosma Shalizi’s Modern Regression Lectures (Lecture 21)
- Chapter 4 of Statistical Data Analysis by Glen Cowan
- Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations by Greenland et al.
For Model Comparison:
- Cosma Shalizi’s Modern Regression Lectures (Lectures 26 and 28)
- Further reading: Sections 2.3 and 7.10 of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Hastie, Tibshirani and Friedman)