Depth and Discriminability in Deep Learning Architectures
Disciplines
Computer Sciences (60%); Mathematics (40%)
Keywords
Discriminatory Properties, Scattering Networks, Deep Learning, Deep Neural Networks
Research in artificial intelligence and machine learning aims to develop computational systems that match or exceed human performance in highly complex cognitive tasks. In recent years, these fields have seen a number of impressive breakthroughs. In 2016, for the first time, a world-class human Go player was defeated by a computer. Advances in speech and object recognition have far exceeded previous expectations and are greatly contributing to the applicability and robustness of new technologies such as voice control and autonomous driving. Most of these breakthroughs are due to a particular method of designing and training smart computational systems known as deep learning. The basic idea of deep learning is to adapt strongly simplified models of the layered network structures in the human nervous system so that they succeed at a specific task, for example digit recognition.
While the success of deep neural networks is evident, the underlying mechanisms responsible for their groundbreaking performance remain largely mysterious. This has sparked renewed interest in the rigorous mathematical analysis of deep neural network architectures, with the goal of uncovering these mechanisms.
In the project "Depth and Discriminability in Deep Learning Architectures", we will perform a fundamental mathematical analysis of a highly important property of deep learning architectures, namely their discriminatory behavior. Imagine a neural network that is supposed to correctly recognize handwritten letters. To succeed at this task, it has to be capable of distinguishing the letter "u" from the letter "v", even though the two look very similar in many handwriting styles. Our goal is to better understand how depth, that is, the number of layers, and other characteristics of a neural network influence its capability of distinguishing between elements from different classes that, at first sight, are very similar.
We hope that our work will not only contribute to a better understanding of why deep learning works so well but also provide guidelines on how to optimally design deep learning architectures to succeed at a given task.
- Universität Wien - 100%
- Helmut Bölcskei, ETH Hönggerberg - Switzerland
Research Output
- 86 Citations
- 9 Publications
- 1 Dissemination
2019
- Title: Edge, Ridge, and Blob Detection with Symmetric Molecules; DOI: 10.48550/arxiv.1901.09723; Type: Preprint; Author: Reisenhofer R
2023
- Title: Assessing the heterogeneity in the transmission of infectious diseases from time series of epidemiological data; DOI: 10.1371/journal.pone.0286012; Type: Journal Article; Author: Herrmann L; Journal: PloS one
2022
- Title: HARNet: A Convolutional Neural Network for Realized Volatility Forecasting; DOI: 10.2139/ssrn.4116642; Type: Preprint; Author: Reisenhofer R
2019
- Title: Design for Extremes: A Contour Method for Defining Requirements Based on Multivariate Extremes; DOI: 10.1017/dsi.2019.149; Type: Journal Article; Author: Haselsteiner A; Journal: Proceedings of the Design Society: International Conference on Engineering Design; Pages: 1433-1442
2019
- Title: Edge, Ridge, and Blob Detection with Symmetric Molecules; DOI: 10.1137/19m1240861; Type: Journal Article; Author: Reisenhofer R; Journal: SIAM Journal on Imaging Sciences; Pages: 1585-1626
2022
- Title: Assessing the heterogeneity in the transmission of infectious diseases from time series of epidemiological data; DOI: 10.1101/2022.02.21.22271241; Type: Preprint; Author: Schneckenreither G; Pages: 2022.02.21.22271241
2022
- Title: Solving the electronic Schrödinger equation for multiple nuclear geometries with weight-sharing deep neural networks; DOI: 10.1038/s43588-022-00228-x; Type: Journal Article; Author: Scherbela M; Journal: Nature Computational Science; Pages: 331-341
2022
- Title: HARNet: A Convolutional Neural Network for Realized Volatility Forecasting; DOI: 10.48550/arxiv.2205.07719; Type: Preprint; Author: Reisenhofer R
2021
- Title: Solving the electronic Schrödinger equation for multiple nuclear geometries with weight-sharing deep neural networks; DOI: 10.48550/arxiv.2105.08351; Type: Preprint; Author: Scherbela M