Model selection via penalization, resampling and cross-validation, with application to change-point detection
For most estimation or prediction tasks in statistics, many estimators are available, and each estimator usually itself depends on one or several parameters, whose calibration is crucial for optimizing statistical performance.
These lectures will address the problem of data-driven estimator selection, focusing mostly (but not only) on the model selection problem, where all estimators are least-squares estimators. We will in particular tackle the problem of detecting changes in the mean of a noisy signal, which is a particular instance of change-point detection.
We will focus on two main kinds of questions: Which theoretical results can be proved for these selection procedures, and how can these results help practitioners choose a selection procedure for a given statistical problem? How can theory help design new selection procedures that improve on existing ones?
The series of lectures will be split into three main parts:
1. Model selection via penalization, with application to change-point detection
2. Resampling methods for penalization, and robustness to heteroscedasticity in regression
3. Cross-validation for model/estimator selection, with application to detecting changes in the mean of a signal
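As a minimal illustration of the change-point problem mentioned above, the following sketch estimates a single change in the mean of a noisy signal by least squares: for each candidate break, the signal is fitted by two constant pieces, and the break minimizing the residual sum of squares is selected. The data, the function name, and the single-change restriction are illustrative assumptions, not the lecturer's procedure.

```python
import numpy as np

def best_single_change(y):
    """Least-squares estimate of a single change in the mean of y.

    For each candidate break t, the signal is modeled as constant on
    y[:t] and on y[t:]; the t minimizing the residual sum of squares
    is returned together with that RSS.
    """
    n = len(y)
    best_t, best_rss = None, np.inf
    for t in range(1, n):
        left, right = y[:t], y[t:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if rss < best_rss:
            best_t, best_rss = t, rss
    return best_t, best_rss

# Synthetic noisy signal with a mean shift at index 50 (illustrative data).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 0.5, 50), rng.normal(2.0, 0.5, 50)])
t_hat, _ = best_single_change(y)
print(t_hat)  # close to the true break at index 50
```

Detecting several changes amounts to minimizing the same criterion over segmentations, where the number of segments must then be chosen by penalization or cross-validation, as discussed in the lectures.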
Model selection/aggregation of models for autoregression
Statistical learning theory aims at:
- giving empirical bounds on the risk of prediction procedures;
- providing sharp oracle inequalities.
These tools are usually developed in the iid setting. In this course, we try to extend these results to the context of time series prediction.
The main tools are extensions of Hoeffding's and Bernstein's inequalities; we will particularly focus on Rio's version of Hoeffding's inequality (2000) and on PAC-Bayesian bounds (e.g., Catoni 2004).
2 Hoeffding's inequality for dependent random variables
3 PAC-Bayesian inequalities
4 Fast rates
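To fix ideas on the concentration inequalities underlying these bounds, here is a small Monte Carlo check (in the i.i.d. setting, before any extension to dependent data) of the classical Hoeffding bound P(|mean − μ| ≥ t) ≤ 2 exp(−2nt²/(b−a)²); the uniform sample and the chosen constants are illustrative assumptions.

```python
import numpy as np

# Hoeffding: for n i.i.d. variables in [a, b] with mean mu,
#   P(|sample mean - mu| >= t) <= 2 * exp(-2 * n * t**2 / (b - a)**2).
# Monte Carlo check for Uniform(0, 1) samples (a=0, b=1, mu=0.5).
rng = np.random.default_rng(1)
n, t, reps = 100, 0.1, 20000
means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
empirical = np.mean(np.abs(means - 0.5) >= t)     # estimated tail probability
bound = 2 * np.exp(-2 * n * t ** 2)               # Hoeffding upper bound
print(empirical, bound)
```

The empirical tail probability falls well below the bound, which is distribution-free and therefore conservative; the dependent-data extensions discussed in the course pay an additional price for the lack of independence.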
Change-point detection for time series
We first present some usual time series. Then a general model is defined and studied. Finally, the case of a non-stationary process composed of a sequence of several dependent time series is considered. An estimator of the break instants, the parameters and the number of changes is defined, and its asymptotic and numerical properties are studied.
Locally stationary processes
- Gaussian likelihood theory and spectral density methods.
- Wavelet-based methods, bootstrap methods, multivariate processes, random fields and long memory.
- Asymptotic properties of parameter estimates: general concepts, derivative processes.
Bootstrap methods play an important role when knowledge of the distribution of some statistic is required. Typical applications are tests of hypotheses, where the determination of critical values requires at least some approximate knowledge of the distribution of the test statistic under the null hypothesis.
I first review bootstrap methods for independent data and describe approaches to prove the asymptotic correctness of these approximations.
In a second part, I discuss a few examples of bootstrap methods for dependent data, including model-based methods and a recently proposed version of the wild bootstrap for such data. Since the course is intended to be an introductory one, I do not present the methods in their greatest generality but explain the main ideas through simple examples.
Lectures 1 & 2: Bootstrap for independent data (Efron's bootstrap and wild bootstrap, applied to linear and von Mises statistics)
Lectures 3 & 4: Bootstrap for dependent data (autoregressive bootstrap and dependent wild bootstrap)
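The use of the bootstrap for critical values described above can be sketched as follows for independent data: resample with replacement from the observed sample, recompute the studentized mean on each resample, and read the critical value off the bootstrap distribution. The exponential data, sample sizes, and 95% level are illustrative assumptions.

```python
import numpy as np

# Efron's bootstrap, sketched: approximate the distribution of the
# studentized mean and read off a two-sided 95% critical value.
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=200)   # observed sample (illustrative)
n, B = len(x), 2000
x_bar = x.mean()

# Resample with replacement; center the bootstrap statistic at the
# observed mean so it mimics the test statistic under the null.
idx = rng.integers(0, n, size=(B, n))
boot = x[idx]
t_star = np.sqrt(n) * (boot.mean(axis=1) - x_bar) / boot.std(axis=1, ddof=1)
crit = np.quantile(np.abs(t_star), 0.95)   # bootstrap critical value
print(round(crit, 2))
```

For dependent data this naive resampling destroys the dependence structure, which is precisely what the model-based and dependent wild bootstrap methods of Lectures 3 & 4 are designed to handle.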