- Thesis:
- Inaugural dissertation for the degree of Doctor, submitted to the Combined Faculty of Natural Sciences and Mathematics of Heidelberg University (Ruprecht-Karls-Universität Heidelberg)
- urn:nbn:de:bsz:16-heidok-318910
- Author:
- Nathawut Phandoidaen (Heidelberg University)
- Title:
- Statistical analysis of dependent data
- Supervisor and examiner:
- Rainer Dahlhaus (Heidelberg University)
- Second examiner:
- Jan Johannes
- Abstract:
- In this doctoral dissertation we investigate dependence structures in three different settings.
We first provide a framework for empirical process theory of (locally) stationary processes for classes of either smooth or nonsmooth functions. The theory uses the so-called functional dependence measure to quantify dependence. This work extends known results for stationary Markov chains and mixing sequences while accounting for additional time dependence. The main contributions consist of functional central limit theorems and nonasymptotic maximal inequalities. These can be employed to show, for example, uniform convergence rates for nonparametric regression with locally stationary noise. We further derive rates for kernel density estimators in the case of stationary and locally stationary observations. A special focus is placed on the functional convergence of the empirical distribution function (EDF). Comparisons with results based on other measures of dependence are also carried out.
In a subsequent step, we consider high-dimensional stationary processes where new observations are generated by a noisy transformation of past observations. By means of our previous results we prove oracle inequalities for the empirical risk minimizer if the data is generated by either an absolutely regular mixing sequence (β-mixing) or a Bernoulli shift process under functional dependence. Assuming that the underlying transformation of our data follows an encoder-decoder structure, we construct an encoder-decoder neural network estimator for the prediction of future time steps. We give upper bounds for the expected forecast error under specific structural and sparsity conditions on the network architecture. In a quantitative simulation we discuss the behavior of network estimators under different model assumptions and provide a weather forecast for German cities using data made available by the German Meteorological Service (Deutscher Wetterdienst).
Moving on to a different setting, we study the nonparametric estimation of an unknown survival function with support on the positive real line based on a sample with multiplicative measurement errors. The proposed fully data-driven procedure involves an estimation step of the survival function’s Mellin transform and a regularization of the Mellin transform’s inverse by a spectral cut-off. A data-driven choice of the cut-off parameter balances bias and variance. In order to discuss the bias term, we consider Mellin-Sobolev spaces which characterize the regularity of the unknown survival function by the decay behavior of its Mellin transform. When analyzing the variance term we consider the standard i.i.d. case and incorporate dependent observations in the form of Bernoulli shift processes and absolutely regular mixing sequences. In the i.i.d. setting we are able to show minimax optimality over Mellin-Sobolev spaces for the spectral cut-off estimator.
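
The spectral cut-off idea in the last paragraph can be illustrated with a minimal sketch. It is not the dissertation's implementation, and it rests on several assumptions that do not come from the abstract: the observations are Y = X · U with X and U independent, the error U is uniform on [0.5, 1.5] with known distribution, the true survival function is that of a standard exponential, the development point is c = 0.5, and the cut-off k is fixed by hand rather than chosen from the data. The function names `error_mellin` and `survival_cutoff_estimate` are hypothetical. The sketch uses the identity that the Mellin transform of S(x) = P(X > x) at s = c + it equals E[X^s] / s for c > 0, together with E[Y^s] = E[X^s] · E[U^s].

```python
import numpy as np


def error_mellin(s, a=0.5, b=1.5):
    """E[U^s] for U ~ Uniform[a, b]; the error law is ASSUMED known here."""
    return (b ** (s + 1) - a ** (s + 1)) / ((s + 1) * (b - a))


def survival_cutoff_estimate(y, x, k, c=0.5, n_grid=401):
    """Spectral cut-off estimate of S(x) = P(X > x) from data Y = X * U.

    k is a fixed cut-off (an assumption; no data-driven selection is done).
    """
    t = np.linspace(-k, k, n_grid)
    s = c + 1j * t
    # Empirical Mellin transform of the observations: sample mean of Y^(c + it).
    emp = np.exp(np.outer(s, np.log(y))).mean(axis=1)
    # Mellin transform of S: E[X^s] / s, where E[X^s] = E[Y^s] / E[U^s].
    mellin_s = emp / (s * error_mellin(s))
    # Inverse Mellin transform restricted to [-k, k] (trapezoidal rule).
    x = np.atleast_1d(np.asarray(x, dtype=float))
    integrand = np.exp(-np.outer(np.log(x), s)) * mellin_s
    dt = t[1] - t[0]
    integral = (integrand[:, 1:] + integrand[:, :-1]).sum(axis=1) * dt / 2.0
    return integral.real / (2.0 * np.pi)


# Simulated example: X ~ Exp(1), so the true survival function is exp(-x).
rng = np.random.default_rng(0)
n = 5000
x_true = rng.exponential(1.0, size=n)
u = rng.uniform(0.5, 1.5, size=n)
y = x_true * u

grid = np.array([0.5, 1.0, 2.0])
est = survival_cutoff_estimate(y, grid, k=4.0)
print(np.column_stack([grid, est, np.exp(-grid)]))
```

A larger k reduces the truncation bias but amplifies the sampling noise of the empirical Mellin transform, which is the bias-variance trade-off that a data-driven choice of the cut-off is meant to balance.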