课程名称 (Course Name): Statistical Learning and Inference
课程代码 (Course Code): X033524
学分/学时 (Credits/Credit Hours): 3/54
开课时间 (Course Term): Autumn
开课学院(School Providing the Course): Electronic Information and Electrical Engineering
任课教师(Teacher): Zhang Liqing
课程讨论时数(Course Discussion Hours): 6
课程实验数(Lab Hours): 16
课程内容简介(Course Introduction):
Statistical Learning and Inference focuses on the statistical aspects of machine learning and inference. The course introduces the basic theory and methods for extracting rules, structures, and patterns from large-scale data, and requires students to master system modeling, parameter identification, and model inference based on statistical models. Statistical learning methods apply to a broad range of areas, such as data mining, artificial intelligence, and natural language processing. The course includes project practice on large-scale data so that students develop the ability to solve large-scale practical problems through modeling and learning.
The course is suitable for master's degree students working on intelligent information processing, pattern recognition, data mining, and bioinformatics.
教学大纲(Course Teaching Outline):
第1章 绪论 (Introduction) 1学时 (1 hour)
第2章 有指导学习概述 (Overview of Supervised Learning) 3学时 (3 hours)
第3章 回归的线性方法 (Linear Methods for Regression) 4学时 (4 hours)
第4章 分类的线性方法 (Linear Methods for Classification) 4学时 (4 hours)
第5章 基展开与正则化 (Basis Expansions and Regularization) 4学时 (4 hours)
第6章 核方法 (Kernel Methods) 4学时 (4 hours)
第7章 模型评估与选择 (Model Assessment and Selection) 4学时 (4 hours)
第8章 模型推理和平均 (Model Inference and Averaging) 4学时 (4 hours)
第9章 加法模型、树和相关方法 (Additive Models, Trees, and Related Methods) 4学时 (4 hours)
第10章 提升和加法树 (Boosting and Additive Trees) 4学时 (4 hours)
课程进度计划(Course Schedule):
Week 1
1 Introduction
2 Overview of Supervised Learning
2.1 Introduction
2.2 Variable Types and Terminology
2.3 Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors
2.4 Statistical Decision Theory
2.5 Local Methods in High Dimensions
2.6 Statistical Models, Supervised Learning and Function Approximation
Week 2
2.7 Structured Regression Models
2.8 Classes of Restricted Estimators
2.9 Model Selection and the Bias–Variance Tradeoff
3 Linear Methods for Regression
3.1 Introduction
3.2 Linear Regression Models and Least Squares
3.3 Multiple Regression from Simple Univariate Regression
3.4 Subset Selection and Coefficient Shrinkage
Week 3
4 Linear Methods for Classification
4.1 Introduction
4.2 Linear Regression of an Indicator Matrix
4.3 Linear Discriminant Analysis
4.4 Logistic Regression
Week 4
4.5 Separating Hyperplanes
5 Basis Expansions and Regularization
5.1 Introduction
5.2 Piecewise Polynomials and Splines
5.3 Filtering and Feature Extraction
5.4 Smoothing Splines
5.5 Automatic Selection of the Smoothing Parameters
Week 5
5.6 Nonparametric Logistic Regression
5.7 Multidimensional Splines
5.8 Regularization and Reproducing Kernel Hilbert Spaces
5.9 Wavelet Smoothing
Week 6
6 Kernel Methods
6.1 One-Dimensional Kernel Smoothers
6.2 Selecting the Width of the Kernel
6.3 Local Regression in ℝ^p
6.4 Structured Local Regression Models in ℝ^p
6.5 Local Likelihood and Other Models
Week 7
6.6 Kernel Density Estimation and Classification
6.7 Radial Basis Functions and Kernels
6.8 Mixture Models for Density Estimation and Classification
7 Model Assessment and Selection
7.1 Introduction
7.2 Bias, Variance and Model Complexity
7.3 The Bias–Variance Decomposition
Week 8
7.4 Optimism of the Training Error Rate
7.5 Estimates of In-Sample Prediction Error
7.6 The Effective Number of Parameters
7.7 The Bayesian Approach and BIC
7.8 Minimum Description Length
7.9 Vapnik–Chervonenkis Dimension
7.10 Cross-Validation
7.11 Bootstrap Methods
Week 9
8 Model Inference and Averaging
8.1 Introduction
8.2 The Bootstrap and Maximum Likelihood Methods
8.3 Bayesian Methods
8.4 Relationship Between the Bootstrap and Bayesian Inference
8.5 The EM Algorithm
Week 10
8.6 MCMC for Sampling from the Posterior
8.7 Bagging
8.8 Model Averaging and Stacking
9 Additive Models, Trees, and Related Methods
9.1 Generalized Additive Models
Week 11
9.2 Tree-Based Methods
9.3 PRIM: Bump Hunting
9.4 MARS: Multivariate Adaptive Regression Splines
9.5 Hierarchical Mixtures of Experts
9.6 Missing Data
Week 12
12 Support Vector Machines and Flexible Discriminants
12.1 Introduction
12.2 The Support Vector Classifier
12.3 Support Vector Machines
12.4 Generalizing Linear Discriminant Analysis
12.5 Flexible Discriminant Analysis
Week 13
12.6 Penalized Discriminant Analysis
12.7 Mixture Discriminant Analysis
13 Prototype Methods and Nearest-Neighbors
13.1 Introduction
13.2 Prototype Methods
13.3 k-Nearest-Neighbor Classifiers
13.4 Adaptive Nearest-Neighbor Methods
Week 14
14 Unsupervised Learning
14.1 Introduction
14.2 Association Rules
14.3 Cluster Analysis
Week 15
14.4 Self-Organizing Maps
14.5 Principal Components, Curves and Surfaces
14.6 Independent Component Analysis and Exploratory Projection Pursuit
14.7 Multidimensional Scaling
Week 16
Review
课程考核要求(Course Assessment Requirements):
Students are graded on their understanding of the course, as reflected in their homework, class participation, examinations, and projects, weighted approximately as follows:
Homework 20% + Projects 40% + Final Exam 40%
参考文献(Course References):
1. The Elements of Statistical Learning, Second Edition, Hastie, T., Tibshirani, R., and Friedman, J., Springer, 2009
2. The Nature of Statistical Learning Theory, Vapnik, V., Springer-Verlag, New York, 1996
预修课程(Prerequisite Course):
Matrix Theory; Optimization Method; Probability and Statistics