MATH 251: Statistical and Machine Learning Classification
Fall 2018, San Jose State University
Course description [Syllabus]
This is an advanced topics course on classification, a core area of machine learning, with the goal of introducing
- Dimensionality Reduction
- Instance-based Methods
- Discriminant Analysis
- Logistic Regression
- Support Vector Machine
- Kernel Methods
- Ensemble Methods
- Neural Networks
all based on the benchmark dataset of MNIST Handwritten Digits. Such a teaching strategy was partly inspired by Michael Nielsen's free online book - Neural Networks and Deep Learning, which notes explicitly that this dataset hits a "sweet spot" - it is challenging, but "not so difficult as to require an extremely complicated solution, or tremendous computational power". In addition, the digit recognition problem is very easy to understand, yet practically important.
Course progress
Date | Slides | Further Reading |
---|---|---|
8/22 | Review | |
8/27 | Introduction | Final project instructions |
9/5 | Instance-based classification | Chapter 2 of textbook 1 |
9/12 | PCA [Matrix algebra] | Section 10.2 of textbook 1 |
9/24 | LDA (for dimensionality reduction) | Prof. Olga Veksler’s lecture |
10/8 | Bayes classifiers | Section 4.4 of textbook 1 |
10/15 | Midterm | Midterm solution |
10/22 | Logistic regression | Section 4.3 of textbook 1 |
10/29 | Support vector machines [Lagrange Dual] | Chapter 9 of textbook 1 |
11/19 | Ensemble learning | [Trevor Hastie's slides] [Adele Cutler's lecture] [Chapter 8 of textbook] |
11/26 | Neural networks | [Michael Nielsen’s book] [Olga Veksler’s lecture] [Perceptron] |
12/5 | Course summary and project information | |
12/10 | Final project presentations | |
12/12 | Final project presentations (cont'd) | |
More learning resources
Programming languages
- MATLAB:
- Common Matlab commands;
- Online tutorials (see here for a simple one);
- Statistics and Machine Learning Toolbox Documentation;
- Python:
- R:
Useful course websites
- Prof. Veksler's CS9840a Learning and Computer Vision at University of Western Ontario
- Andrew Ng's CS 229 Machine Learning at Stanford University
- Manik's CSL 864 - Special Topics in AI: Classification at Microsoft
Data sets
- USPS Zip Code Data
- UCI Machine Learning Repository
- LibSVM data sets
- Extended Yale Face Database B
- Oxford Flowers Category Datasets