03 Latent Structures, PCA, and Clustering
In Block 3 we cover:
- Motivation for latent structures
- Principal Components Analysis
- How to calculate PCA
- What PCA is good for
- Relationship to SVD and other Spectral Embeddings
- Clustering
- Algorithmic Clustering
- Hierarchical Clustering
- Model based clustering
- Implementations in R:
- Spectral Clustering as a pipeline element for classification
Lectures:
Workshop:
Assessments:
- Assessment 1 will be set this week; see Assessments. This is a summative assessment (i.e. it contributes to your grade) and is due in Week 7.
- Portfolio 03 of the full Portfolio.
- Block03 on Noteable via Blackboard
Reference material:
For PCA:
- Cosma Shalizi’s Advanced Data Analysis, Lecture 18
- Boyd and Vandenberghe: Convex Optimization is an excellent and thorough resource.
- I showed Kalman: Leveling with Lagrange: An Alternate View of Constrained Optimization
For Clustering:
- Tibshirani’s Data Mining lecture notes (Lecture 2 and Lecture 5)
- 5 clustering algorithms you need to know
- The fastcluster packages for R and Python implement the “fastest” \(O(N^2)\) versions of hierarchical clustering.
- Python resources comparing hdbscan
- Scikit-learn Diagram