FWF — Austrian Science Fund

Cross-layer language models for conversational speech

Barbara Schuppler (ORCID: 0000-0003-4009-0832)
  • Grant DOI 10.55776/P32700
  • Funding program Principal Investigator Projects
  • Status ended
  • Start November 1, 2019
  • End October 31, 2024
  • Funding amount € 593,189

Disciplines

Electrical Engineering, Electronics, Information Engineering (40%); Linguistics and Literature (60%)

Keywords

    Conversational Speech, Automatic Speech Recognition, Language Modeling, Speech Perception, Prosody, Communicative Functions

Abstract

Whereas speech scientists long focused on carefully pronounced speech, interest has increasingly shifted to language as it occurs in natural conversations. There are two reasons for this. First, from a technological point of view, there is an increasing demand for social robots, which need to use language naturally in order to become more interactional and social. Second, linguists have become more interested in natural conversations, as these offer insights beyond those gained from controlled experiments into how speech is processed in the brain. In this project, we aim to improve the automatic recognition of conversational speech, to increase our knowledge about the human production and perception of conversational speech, and to expand our knowledge of and resources for conversational Austrian German. On the basis of conversational speech and chat corpora from German and Austrian speakers, we develop cross-layer language models which include acoustic and semantic contextual information the way humans do. These models will be informed by quantitative phonetic corpus studies and tested in automatic speech recognition (ASR) and speech perception experiments. For the linguistic studies, speech technology will be used for automatic annotation, acoustic feature extraction and data analysis. The linguistic knowledge gained will in turn be incorporated into the language models. This approach requires an interdisciplinary team of engineers and linguists working closely together. The PI, Dr. Barbara Schuppler (Graz University of Technology), is a young interdisciplinary speech scientist who has shown in two previous FWF projects that her cross-layer principle achieves good results for pronunciation and prosody modelling. The proposed project gives her the opportunity to expand the cross-layer concept to language models and to establish a research group on conversational speech in Austria. The national partners, Prof. Dina El Zarka (Department of Linguistics, University of Graz) and Dr. Roman Kern (Know-Center GmbH), bring long-standing experience to the project. Together, they cover the disciplines of speech technology, linguistics, phonetics and natural language processing.

Final report

In the last decade, conversational speech has received a lot of attention among speech scientists. On the one hand, accurate automatic speech recognition (ASR) systems are essential for conversational systems and social robots, as these are meant to become more interactional and social rather than solely transactional. On the other hand, linguists study natural conversations, as these offer insights beyond those gained from controlled experiments into how speech processing works. This project investigates conversational speech to enhance our linguistic knowledge of conversational Austrian German and to use this knowledge to improve ASR systems. For this purpose, the GRASS corpus, a large-scale database of Austrian German conversations, has been annotated for communicative functions, with annotations suitable for a newly introduced method for the quantitative analysis of conversational dynamics. Our work demonstrates that prosodic variation in conversational speech is systematic and linked to semantic and pragmatic context. But how sensitive are ASR systems to prosodic cues and conversational context? Our work suggests that integrating both data-driven and theory-driven components, including linguistic knowledge, can improve ASR, particularly for short utterances. When comparing how ASR systems transcribe conversational speech with how humans transcribe the same utterances, we find that both struggle with the same characteristics of conversational speech (e.g., disfluent sentences, dialectal pronunciation, fast speech rate), though to different degrees. Finally, the project delivers valuable methods for speech technologists working with low-resource languages and dialects and with small datasets that show a high degree of variation (e.g., pathological speech, child speech).

Research institution(s)
  • Technische Universität Graz - 50%
  • Universität Graz - 22%
  • Technische Universität Graz - 28%
Project participants
  • Roman Kern, Technische Universität Graz, associated research partner
  • Dina El Zarka, Universität Graz, associated research partner
International project participants
  • Benno Maria Stein, Bauhaus-Universität Weimar - Germany
  • Bogdan Ludusan, Universität Bielefeld - Germany
  • Margaret Zellers, University of Stockholm - Sweden
  • Dimitra Vergyri, SRI International - USA

Research Output

  • 5 Citations
  • 44 Publications
  • 1 Methods & Materials
  • 3 Software
  • 12 Disseminations
  • 6 Scientific Awards
  • 6 Fundings
Publications
  • 0
    Title (When) Does it harm to be incomplete? Human and automatic speech recognition of syntactically disfluent structures
    Type Journal Article
    Author Lennkh S
    Journal Speech Communication
  • 0
    Title What the Filler? Both ASR Systems and Humans Struggle More With Other Kinds of Disfluencies Than With Filler Particles
    Type Conference Proceeding Abstract
    Author Eckert L
    Conference Interspeech 2025
  • 0
    Title Prominence-aware automatic speech recognition for conversational speech
    Type Conference Proceeding Abstract
    Author Kubin G.
    Conference Interspeech 2024
  • 0
    Title Context is all you need? Low-resource conversational ASR profits from context, coming from the same or from the other speaker
    Type Conference Proceeding Abstract
    Author Linke J.
    Conference Interspeech 2024
  • 0
    Title Continuous prediction of backchannel timing for human-robot interaction
    Type Conference Proceeding Abstract
    Author Hagmueller M.
    Conference Interspeech 2024
  • 2022
    Title Information-theoretic approaches in model reduction and machine learning
    Type Postdoctoral Thesis
    Author Bernhard Geiger
  • 2025
    Title Slicer - A Tool for Efficient Stimuli Extraction from Large Speech Corpora
    Type Conference Proceeding Abstract
    Author Eckert L
    Conference Forum Acusticum Euronoise 2025
  • 2025
    Title Uncertainty prediction for prominence classification with chroma features
    Type Conference Proceeding Abstract
    Author Linke J.
    Conference ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    Pages 1 - 5
    Link Publication
  • 2025
    Title Turn-taking annotation for quantitative and qualitative analyses of conversation
    Type Other
    Author Kelterer A.
    Pages 1 - 41
    Link Publication
  • 2024
    Title On the Role of Priors in Bayesian Causal Learning
    DOI 10.1109/tai.2024.3522867
    Type Journal Article
    Author Geiger B
    Journal IEEE Transactions on Artificial Intelligence
    Pages 1439-1445
    Link Publication
  • 2024
    Title On Disfluency and Non-lexical Sound Labeling for End-to-end Automatic Speech Recognition
    DOI 10.21437/interspeech.2024-2157
    Type Conference Proceeding Abstract
    Author Meng Y
    Pages 1270-1274
  • 2024
    Title Towards causal data science for non-independent data
    Type Postdoctoral Thesis
    Author Roman Kern
  • 2023
    Title Reconsidering Read and Spontaneous Speech: Causal Perspectives on the Generation of Training Data for Automatic Speech Recognition
    DOI 10.3390/info14020137
    Type Journal Article
    Author Gabler P
    Journal Information
    Pages 137
    Link Publication
  • 2023
    Title Using Kaldi for Automatic Speech Recognition of Conversational Austrian German
    DOI 10.48550/arxiv.2301.06475
    Type Preprint
    Author Linke J
  • 2025
    Title Uncertainty prediction for prominence classification with chroma features
    DOI 10.1109/icassp49660.2025.10887992
    Type Conference Proceeding Abstract
    Author Linke J
    Pages 1-5
  • 2025
    Title Cross-layer models for conversational speech
    Type Postdoctoral Thesis
    Author Barbara Schuppler
  • 2025
    Title What’s so complex about conversational speech? A comparison of HMM-based and transformer-based ASR architectures
    DOI 10.1016/j.csl.2024.101738
    Type Journal Article
    Author Linke J
    Journal Computer Speech & Language
    Pages 101738
    Link Publication
  • 2025
    Title What's so complex about conversational speech? Prosodic Prominence and Speech Recognition Challenges
    Type PhD Thesis
    Author Julian Linke
  • 2022
    Title Analyzing the different meanings of laughter in conversational speech
    Type Other
    Author Schmallegger E.
    Link Publication
  • 2022
    Title Speaker interpolation based data augmentation for Automatic Speech Recognition
    Type Other
    Author Kerle L.
    Link Publication
  • 2022
    Title Text Complexity in the Digital Humanities - A Case Study on 18th Century Periodicals
    Type Other
    Author Geiger B
    Link Publication
  • 2024
    Title Breath sounds and their relationship to turn-taking in conversational speech
    Type Other
    Author Menrath A.
    Link Publication
  • 2024
    Title Modelling Backchannels for Human-Robot Interaction
    Type Other
    Author Paierl M.
    Link Publication
  • 2024
    Title Towards Improving ASR Outputs of Spontaneous Speech with LLMs
    Type Conference Proceeding Abstract
    Author Karner M.
    Conference 20th Conference on Natural Language Processing (KONVENS 2024)
    Pages 339 - 348
    Link Publication
  • 2024
    Title Version Control for Speech Corpora
    Type Conference Proceeding Abstract
    Author Boehm M.
    Conference 20th Conference on Natural Language Processing (KONVENS 2024)
    Pages 303 - 308
    Link Publication
  • 2023
    Title creapy: A Python-based tool for the detection of creak in conversational speech
    Type Conference Proceeding Abstract
    Author Paierl M
    Conference 20th International Congress on Phonetic Sciences (ICPhS)
    Pages 1716-1720
    Link Publication
  • 2023
    Title Points of maximum grammatical control - The prosody of a turn-holding practice
    Type Conference Proceeding Abstract
    Author Kelterer A
    Conference 20th International Congress on Phonetic Sciences (ICPhS)
    Pages 3467-3471
    Link Publication
  • 2023
    Title Speaker interpolation based data augmentation for automatic speech recognition
    Type Conference Proceeding Abstract
    Author Kerle L.
    Conference Proceedings of the 20th International Congress of Phonetic Sciences - ICPhS 2023
    Pages 3126 - 3130
    Link Publication
  • 2023
    Title Speechcake: Version control for speech corpora
    Type Other
    Author Dumitru V.A.
    Link Publication
  • 2023
    Title 10 Years of GRASS development: Experiences from annotating a large corpus of conversational Austrian German
    Type Conference Proceeding Abstract
    Author Kelterer A.
    Conference Österreichische Linguistiktagung: Austrian Meeting on Digital Linguistics: Recent Developments in Austria - Institut fuer Linguistik, Graz, Austria
    Link Publication
  • 2023
    Title Using word-level features for prosodic prominence detection in conversational speech
    Type Conference Proceeding Abstract
    Author Kubin G.
    Conference Proceedings of the 20th International Congress of Phonetic Sciences - ICPhS 2023
    Pages 3101 - 3105
    Link Publication
  • 2023
    Title Single Channel Source Separation in the Wild -- Conversational Speech in Realistic Environments
    Type Conference Proceeding Abstract
    Author Berger E.
    Conference ITG-Fachbericht 312: Speech Communication
    Pages 96 - 100
    Link Publication
  • 2023
    Title What do self-supervised speech representations encode? An analysis of languages, varieties, speaking styles and speakers
    DOI 10.21437/interspeech.2023-951
    Type Conference Proceeding Abstract
    Author Kadar M
    Pages 5371-5375
  • 2023
    Title (Dis)agreement and Preference Structure are Reflected in Matching Along Distinct Acoustic-prosodic Features
    DOI 10.21437/interspeech.2023-1538
    Type Conference Proceeding Abstract
    Author Kelterer A
    Pages 4768-4772
  • 2023
    Title Exploring Graph Theory Methods For the Analysis of Pronunciation Variation in Spontaneous Speech
    DOI 10.21437/interspeech.2023-1398
    Type Conference Proceeding Abstract
    Author Geiger B
    Pages 596-600
  • 2022
    Title An analysis of prosodic boundaries across speaking styles in two varieties of German
    DOI 10.1016/j.specom.2022.05.002
    Type Journal Article
    Author Ludusan B
    Journal Speech Communication
    Pages 93-106
  • 2022
    Title How prosody affects ASR performance in conversational Austrian German
    DOI 10.21437/speechprosody.2022-40
    Type Conference Proceeding Abstract
    Author Schuppler B
    Pages 195-199
  • 2022
    Title To laugh or not to laugh? The use of laughter to mark discourse structure
    DOI 10.18653/v1/2022.sigdial-1.8
    Type Conference Proceeding Abstract
    Author Ludusan B
    Pages 76-82
  • 2021
    Title Developing an Annotation System for Communicative Functions for a Cross-Layer ASR System
    Type Conference Proceeding Abstract
    Author Kelterer A.
    Conference ESSLLI Workshop "Integrating Perspectives on Discourse Annotation" (DiscAnn)
    Link Publication
  • 2021
    Title Prosodic cues to agreement and disagreement prefaces in Austrian German conversations
    DOI 10.21437/tai.2021-22
    Type Conference Proceeding Abstract
    Author Kelterer A
    Pages 107-111
  • 2020
    Title Towards automatic annotation of prosodic prominence levels in Austrian German
    DOI 10.21437/speechprosody.2020-204
    Type Conference Proceeding Abstract
    Author Linke J
    Pages 1000-1004
  • 2020
    Title Automatic Speech Segmentation using KALDI
    Type Other
    Author Wasserfall S.
    Link Publication
  • 2020
    Title An Analysis of Prosodic Prominence Cues to Information Structure in Egyptian Arabic
    DOI 10.21437/interspeech.2020-2322
    Type Conference Proceeding Abstract
    Author Kelterer A
    Pages 1883-1887
Methods & Materials
  • 2023
    Title Tool for Analysis of Self-supervised Speech Representations
    DOI 10.21437/interspeech.2023-951
    Type Improvements to research infrastructure
    Public Access
Software
  • 2025
    Title pvlex
  • 2024
    Title speechcake
  • 2023
    Title creapy
Disseminations
  • 2024
    Title Newspaper Article on AI for Austrian German: Der Standard
    Type A press release, press conference or response to a media enquiry/interview
  • 2023
    Title GEED Graz Electrical Engineering Days
    Type Participation in an open day or visit at my research institution
  • 2021
    Title Special Session at "Phonetics and Phonology in Europe" 2021
    Type Participation in an activity, workshop or similar
  • 2021
    Title Initiation of the "Graz-Vienna Speechworkshop" Series
    Type Participation in an activity, workshop or similar
  • 2025
    Title Podcast about our work on Conversational Speech
    Type A broadcast e.g. TV/radio/film/podcast (other than news/press)
  • 2025
    Title Newspaper Article in Kleine Zeitung on AI for Styrian Dialect
    Type A press release, press conference or response to a media enquiry/interview
  • 2023
    Title MINKT Labor, a super science space for children
    Type Participation in an activity, workshop or similar
  • 2025
    Title Newspaper Article on AI for Dialect: Klipp Das Magazin
    Type A press release, press conference or response to a media enquiry/interview
  • 2025
    Title AI and Dialect? Radio Interview in Oe3
    Type A broadcast e.g. TV/radio/film/podcast (other than news/press)
  • 2025
    Title Speech AI for Styrian Dialect on "Radio Steiermark"
    Type A broadcast e.g. TV/radio/film/podcast (other than news/press)
  • 2025
    Title Podcast in Oe1 DIGITAL Leben
    Type A broadcast e.g. TV/radio/film/podcast (other than news/press)
  • 2024
    Title Invited talk at Bielefeld University
    Type A talk or presentation
Scientific Awards
  • 2023
    Title Jury member of "Das österreichische Wort des Jahres"
    Type Prestigious/honorary/advisory position to an external body
    Level of Recognition National (any country)
  • 2023
    Title Guest Professorship teaching the course: Speaker charisma: Analysis and training of acoustic-prosodic features within a sex-sensitive framework
    Type Attracted visiting staff or user to your research group
    Level of Recognition Regional (any country)
  • 2023
    Title Invited Speaker at "Ringvorlesung: Vielfalt im Zentrum der Forschung"
    Type Personally asked as a key note speaker to a conference
    Level of Recognition Regional (any country)
  • 2021
    Title Invited participant in the student-meets-experts event at DAGA 47. Jahrestagung fuer Akustik 2021
    Type Personally asked as a key note speaker to a conference
    Level of Recognition Continental/International
  • 2019
    Title Guest Professorship teaching the course: Experimental Methods in Phonetics
    Type Attracted visiting staff or user to your research group
    Level of Recognition Regional (any country)
  • 2019
    Title Speech Communication Editor
    Type Appointed as the editor/advisor to a journal or book series
    Level of Recognition Continental/International
Fundings
  • 2024
    Title ERASMUS+ Short-Term Mobility WASP Summer School 2024
    Type Studentship
    Start of Funding 2024
    Funder ERASMUS+ Short-Term Mobility International Office - Welcome Center, TU Graz
  • 2023
    Title ICPhS 2023 Reisekostenübernahme Land Steiermark
    Type Studentship
    Start of Funding 2023
    Funder Land Steiermark
  • 2024
    Title Prof. Margaret Zellers - teaching
    Type Fellowship
    Start of Funding 2024
    Funder University of Graz
  • 2024
    Title Doktoratsfertigstellungsstipendium
    Type Research grant (including intramural programme)
    Start of Funding 2024
    Funder Literar Mechana
  • 2023
    Title Reisekostenzuschuss für Interspeech 2023
    Type Studentship
    Start of Funding 2023
    Funder Austrian Research Association
  • 2023
    Title Förderungsbeitrag für die Tagungsteilnahme
    Type Studentship
    Start of Funding 2023
    Funder Land Steiermark

© Österreichischer Wissenschaftsfonds FWF