Supersmooth functional data analysis and PCA-preprocessing
DFG Research Units
Disciplines
Computer Sciences (20%); Mathematics (80%)
Keywords
PCA, Preprocessing, Robustness, Dependence, Statistical efficiency
Over the past decades, both the amount of available data and the range of methods for analysing it have increased massively. In particular, methods that treat data as high-dimensional or even infinite-dimensional objects have become viable. At the same time, some classical concepts in statistical data analysis have been hugely successful, a key example being regression analysis. Given an input X and an output Y, regression analysis seeks the simplest and most influential connections between X and Y. The aim of this project is to transfer this idea to high-dimensional or even infinite-dimensional objects. This poses challenging mathematical and conceptual problems: it is not clear what good notions of "simple" and "influential" are in this context. Moreover, solutions that are optimal in a mathematical, theoretical sense may be impossible to compute in practice under realistic time constraints.

Therefore, in a first step, we will carry out a complexity analysis and approach the problem from a so-called information-theoretic perspective. Put simply, this means that we wish to determine the optimal procedures under ideal scenarios and with limitless computational resources. In the second phase of the project, we will look for feasible solutions that approximate the theoretical ones as closely as possible. Here, feasibility refers to realistic computational time constraints, but also to properties that real data typically display: there may be outliers or data contamination (meaning that parts of the data differ from what is expected), or additional dependence relations that have to be accounted for. Our ultimate goal is a computationally feasible procedure that adapts automatically to all these kinds of problems in a purely data-driven way: in other words, a user-friendly statistical tool that requires no tuning by the user and no additional external sources or knowledge.
- Universität Wien - 100%
- Aurore Delaigle, The University of Melbourne - Australia
- Martin Wahl, Humboldt-Universität zu Berlin - Germany
- Alexander Meister, Universität Rostock - Germany
- Wei-Biao Wu, University of Chicago - USA
Research Output
- 12 Citations
- 4 Publications
2023
- Title: Relative perturbation bounds with applications to empirical covariance operators; DOI: 10.1016/j.aim.2022.108808; Type: Journal Article; Author: Jirak M; Journal: Advances in Mathematics; Pages: 108808
2025
- Title: Weak dependence and optimal quantitative self-normalized central limit theorems; DOI: 10.4171/jems/1573; Type: Journal Article; Author: Jirak M; Journal: Journal of the European Mathematical Society
2024
- Title: Quantitative limit theorems and bootstrap approximations for empirical spectral projectors; DOI: 10.1007/s00440-024-01290-4; Type: Journal Article; Author: Jirak M; Journal: Probability Theory and Related Fields; Pages: 119-177
2025
- Title: Robust signal recovery in Hadamard spaces; DOI: 10.1016/j.jmva.2025.105469; Type: Journal Article; Author: Köstenberger G; Journal: Journal of Multivariate Analysis; Pages: 105469