Disciplines
Computer Sciences (10%); Mathematics (90%)
Keywords
Acceleration,
Supervised Learning,
First-Order Methods,
Implicit Bias,
Adaptive Step Sizes
Abstract
When training modern machine learning models, the number of parameters frequently exceeds the
number of samples or observations. In such scenarios, classical statistical theory suggests that the
resulting model might perform poorly due to overfitting: many solutions to the
optimization problem may exist, and not all of them generalize well to unseen data. In
practice, however, we observe that when these models are trained with optimization methods based on
gradient information, such overfitting often does not occur, and the models tend to perform well. This
phenomenon is commonly referred to as implicit regularization. We intend to study the effects of
two commonly used optimization techniques on the obtained solution. The two techniques in
question are momentum, referring to the use of not just the current but also past gradients, and
adaptive step sizes, meaning that the step size (or learning rate) is not supplied externally but is
computed by the algorithm itself.
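As a minimal illustration of the two techniques (a sketch, not the algorithms studied in this proposal), heavy-ball momentum blends the current gradient with an exponentially weighted sum of past gradients, while an AdaGrad-style rule computes per-coordinate step sizes from the history of squared gradients. The toy quadratic objective below is purely illustrative.

```python
import numpy as np

def gd_momentum(grad, x0, lr=0.05, beta=0.9, steps=100):
    """Gradient descent with heavy-ball momentum: the update direction
    accumulates past gradients, not just the current one."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v + grad(x)   # exponentially weighted gradient history
        x = x - lr * v
    return x

def adagrad(grad, x0, lr=0.5, eps=1e-8, steps=200):
    """AdaGrad-style adaptive step sizes: each coordinate's step size is
    computed from the running sum of its squared gradients, rather than
    being supplied externally."""
    x = np.asarray(x0, dtype=float)
    g2 = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        g2 += g ** 2             # accumulate squared gradients
        x = x - lr * g / (np.sqrt(g2) + eps)
    return x

# Toy objective f(x) = 0.5 * ||A x||^2 with gradient A^T A x; both
# methods should drive the iterates toward the minimizer x = 0.
A = np.array([[3.0, 0.0], [0.0, 0.5]])
grad = lambda x: A.T @ A @ x
x_mom = gd_momentum(grad, [1.0, 1.0])
x_ada = adagrad(grad, [1.0, 1.0])
```

Note the difference highlighted by the ill-conditioned `A`: momentum uses one global learning rate, whereas the adaptive rule effectively rescales each coordinate by its own gradient history.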