Analysing functional data by approximate factor models
Disciplines
Mathematics (100%)
Keywords
Approximate Factor Models,
Functional Data,
High Dimensional Inference,
Preprocessing
Due to rapid technical progress, we are able to collect and store ever larger amounts of data in many areas of our lives. In order to extract useful information from this flood of data, adequate preparation and evaluation are essential. Hence, it is not surprising that, in line with this rapid growth in data, many new scientific disciplines addressing precisely these problems have developed. Functional data analysis is one of them. The term functional data is used when statistical observations can be viewed as continuous-time processes. Intraday pollution level curves at a specific measuring station provide an example. In practice, however, it is not possible to measure such curves continuously. Instead, we have a sequence of measurements at a certain time resolution (e.g. half-hourly measurements). Moreover, measurements are often flawed. If, for example, we measure the ambient particulate matter concentration, then, on the one hand, the underlying measurement technology is extremely complex and therefore prone to errors, and, on the other hand, local external influences (e.g. cigarette smoke in the immediate vicinity of the measuring station) can considerably disturb these measurements. In terms of this example, a challenging question is how to estimate the latent but actual pollutant concentration (we refer to this as the "signal") and to separate it from the "errors". This problem occurs in a large number of functional data sets and is usually dealt with by smoothing approaches during preprocessing. In doing so, each noisy curve is smoothed individually. This implies that the model is not "trained" on the entire data set. In addition, the adequate degree of smoothing is difficult to assess in practice. In this project we are exploring an alternative and innovative approach.
With the help of so-called approximate factor models, which so far have mainly been used in macroeconometrics to model high-dimensional financial data, we "learn" from all the available data and may reconstruct the signal efficiently without making specific assumptions. Preliminary results show that this method works extremely well in practice and, at the same time, can be theoretically justified. Our alternative approach gives rise to a number of open research questions and several extensions, which we will explore and develop in this project.
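To illustrate the idea of learning the signal from all curves jointly, the following is a minimal sketch and not the project's actual method. It assumes that the curves are observed on a common grid, that the number of factors is known, and that the factor structure is estimated by principal components (a truncated singular value decomposition), which is one common way of fitting approximate factor models. The function name `factor_model_denoise`, the simulated data and all parameter values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def factor_model_denoise(Y, n_factors):
    """Estimate the signal from noisy curves with a principal-components fit.

    Y is an (n_curves, n_gridpoints) array of noisy observations.
    n_factors is the assumed (here: known) number of common factors.
    """
    mean = Y.mean(axis=0)            # cross-sectional mean curve
    Yc = Y - mean                    # centre the curves
    # The truncated SVD gives the best rank-n_factors approximation of Yc.
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    common = (U[:, :n_factors] * s[:n_factors]) @ Vt[:n_factors]
    return common + mean


# Simulated example (assumption): 200 daily curves on a half-hourly grid.
rng = np.random.default_rng(0)
n_curves, n_grid = 200, 48
t = np.linspace(0.0, 1.0, n_grid)
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
scores = rng.normal(size=(n_curves, 2)) * np.array([3.0, 1.5])

signal = 10.0 + scores @ basis                                # latent curves
noisy = signal + rng.normal(scale=1.0, size=signal.shape)     # observed curves

estimate = factor_model_denoise(noisy, n_factors=2)

mse_raw = np.mean((noisy - signal) ** 2)      # error of the raw measurements
mse_est = np.mean((estimate - signal) ** 2)   # error after the factor fit
print(f"MSE raw: {mse_raw:.3f}, MSE factor estimate: {mse_est:.3f}")
```

In contrast to curve-by-curve smoothing, the loadings in `Vt` are estimated from all curves at once, and no smoothing parameter has to be chosen for each curve. In this sketch the number of factors takes over the role of the tuning parameter.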
- Technische Universität Graz - 100%