Project detail

Grant DOI 10.55776/KIN4464025
Funding program International - Multilateral Initiatives
Status Ongoing
Start April 20, 2026
End April 19, 2030
Funding amount € 230,764

DFG Research Units

Disciplines

Computer Sciences (20%); Mathematics (80%)

Keywords

Discriminant Analysis, High-Dimension, Regularization, Gradient Descent

Abstract

Despite the unprecedented success of modern artificial intelligence, the precise reasons for the effectiveness of these complex methods are still far from being fully understood. Given the widespread use and application of AI, however, a systematic understanding of the strengths, weaknesses, and safety of these technologies is of great societal interest. The aim of this project is therefore to provide a mathematically precise description and analysis of a particular type of AI, namely so-called classification algorithms, with respect to their statistical reliability and computational feasibility. In addition, statistical methods will be developed that ensure the protection of individual privacy when such algorithms are applied.

A classification algorithm is a computational procedure that assigns digitized observational units (e.g., hospital patients, texts, or videos) to an appropriate class (e.g., healthy or ill, spam or legitimate email, or film genre). Modern classification problems are primarily characterized by their high dimensionality. This means that the observational units can be very complex digital objects such as images or videos, and that the algorithms used are themselves characterized by a large number of free parameters. This enormous complexity is very difficult to capture and analyze in a mathematically rigorous manner.

For this reason, in this project we will initially focus on a particular form of high-dimensional classification algorithm, namely linear discriminant analysis. When the dimension becomes too high, however, the problem of interpolation arises, in which every available data point is assigned to its class without error. We are interested in the predictive performance of such an interpolating classifier. Classical statistical theory suggests that this performance will be rather inadequate and instead recommends the approach of so-called ℓ2-regularization for stabilization. It is largely unknown, however, whether this approach remains effective in the high-dimensional setting and how its computational implementation can be designed efficiently. To address this, new methodological approaches based on gradient methods will be developed and their statistical accuracy will be investigated.

With regard to the privacy protection problem mentioned above, the widely discussed statistical paradigm of differential privacy (DP) is particularly relevant. In particular, the recent development of f-DP provides intuitively interpretable statements about the type of protection that can be guaranteed. However, it is largely unclear how to select an appropriate randomization strategy from the vast number of possible options. In this project, we will investigate to what extent existing methods can be extended to higher dimensions and to the new framework of f-DP.
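
To make the combination of linear discriminant analysis, ℓ2-regularization, and gradient methods mentioned in the abstract more concrete, the following is a minimal sketch and not the project's method. It assumes one common textbook formulation: the regularized discriminant direction is taken as the minimizer of the quadratic objective ½·wᵀ(S + λI)w − wᵀδ, where S is the pooled within-class sample covariance, δ is the difference of the class means, and λ > 0 is the regularization strength. All function names (`centered_rows`, `cov_times`, `fit_ridge_lda`, `predict`), the step-size rule, and the synthetic data are illustrative assumptions.

```python
import random


def centered_rows(X, y):
    """Return the two class means and the rows centered by their own class mean."""
    p = len(X[0])
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    centered = [[x[j] - means[t][j] for j in range(p)] for x, t in zip(X, y)]
    return means, centered


def cov_times(centered, w):
    """Compute S @ w for the pooled covariance S without forming the p x p matrix."""
    n, p = len(centered), len(w)
    proj = [sum(r[j] * w[j] for j in range(p)) for r in centered]
    return [sum(centered[i][j] * proj[i] for i in range(n)) / n for j in range(p)]


def fit_ridge_lda(X, y, lam=1.0, steps=500):
    """Gradient descent on 0.5 * w'(S + lam*I)w - w'delta (assumed objective)."""
    means, centered = centered_rows(X, y)
    p = len(X[0])
    delta = [means[1][j] - means[0][j] for j in range(p)]
    # trace(S) bounds the largest eigenvalue of S, so this step size is safe.
    trace = sum(v * v for r in centered for v in r) / len(centered)
    eta = 1.0 / (trace + lam)
    w = [0.0] * p
    for _ in range(steps):
        sw = cov_times(centered, w)
        grad = [sw[j] + lam * w[j] - delta[j] for j in range(p)]
        w = [w[j] - eta * grad[j] for j in range(p)]
    # Threshold at the midpoint between the two class means.
    midpoint = [(means[0][j] + means[1][j]) / 2 for j in range(p)]
    b = -sum(w[j] * midpoint[j] for j in range(p))
    return w, b


def predict(w, b, x):
    """Assign class 1 if the linear score is positive, otherwise class 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


# Synthetic high-dimensional example (assumption): p = 20 features, n = 10 points.
random.seed(0)
y_demo = [i % 2 for i in range(10)]
X_demo = [[random.gauss(0, 1) + (2.0 if t == 1 and j == 0 else 0.0)
           for j in range(20)] for t in y_demo]
w_demo, b_demo = fit_ridge_lda(X_demo, y_demo, lam=1.0)
train_acc = sum(predict(w_demo, b_demo, x) == t
                for x, t in zip(X_demo, y_demo)) / len(y_demo)
print("training accuracy:", train_acc)
```

In this sketch the dimension exceeds the sample size, so S is singular and the unregularized direction is not uniquely defined; the term λI makes the objective strictly convex, which is why plain gradient descent converges here. How such a procedure behaves statistically in high dimensions is exactly the kind of question the abstract describes as open.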

Research institution(s)

Universität Wien - 100%

International project participants

Angelika Rohde, Universität Freiburg - Germany, project partner

High-dimensional data sets in discriminant analysis
