FWF — Austrian Science Fund

Cross-layer pronunciation modeling for conversational speech

Barbara Schuppler (ORCID: 0000-0003-4009-0832)
  • Grant DOI 10.55776/T572
  • Funding program Hertha Firnberg
  • Status ended
  • Start September 1, 2012
  • End April 30, 2017
  • Funding amount € 206,340

Disciplines

Computer Sciences (40%); Linguistics and Literature (60%)

Keywords

    Automatic Speech Recognition, Spontaneous Speech, Pronunciation Variation, Austrian German, Linguistic Models, Dutch

Abstract Final report

Automatic speech recognition (ASR) systems were originally designed to cope with carefully pronounced speech. As a consequence, they do not deal well with spontaneous, conversational speech. Read and conversational speech differ in many respects. On the linguistic level, conversational speech contains disfluencies and many utterances that might be considered 'ungrammatical'. On the phonetic level, a much higher degree of pronunciation variation is observed in spontaneous than in read speech. Words are more often acoustically reduced compared to their full pronunciations, such that a word like yesterday may sound like yeshay, or a German word like haben may sound like ham.

Since most real-world applications of ASR systems require the recognition of spontaneous speech (e.g., dialogue systems, voice input aids for physically disabled users, medical dictation systems), the investigation of new methods to model everyday speech has received a lot of attention among speech technologists. In linguistics and psycholinguistics, too, casual conversations are studied in search of an answer to how everyday speech production and comprehension work. These studies have indicated that certain higher-level linguistic functions and structures of utterances condition the details of their pronunciation. It is likely that the kind of analysis that is becoming feasible with the growing availability of large speech corpora will bring to light as yet unknown factors that affect pronunciation variation.

The research envisioned in this proposal is designed to increase our knowledge about spontaneous, conversational speech and to use this knowledge to improve ASR systems. The first objective is to identify, by means of quantitative phonetic analyses, which higher-level linguistic structures and functions condition pronunciation variation. Studies will be carried out on Dutch and on Austrian German material, which will allow us to draw conclusions about which findings are language-specific and which are characteristic of conversational speech in general.

The second objective is to improve ASR technology by incorporating the knowledge gained about the conditions for pronunciation variation. Most ASR systems still deal with acoustic and linguistic information independently of each other. In contrast, I propose a cross-layer pronunciation modeling technique, in which (1) the recognizer makes use of the knowledge gained about the effects of several layers of linguistic structures and functions on pronunciation variation, and (2) the recognizer makes use of lexicons in more than one layer of its architecture. Additional deliverables of this project are the collected speech material and the tools created for its automatic annotation, both of which would be of great value for future studies by linguists and engineers.

Final report

The Problem

Automatic speech recognition (ASR) systems were originally designed to cope with carefully pronounced speech. Most real-world applications of ASR systems, however, require the recognition of spontaneous, conversational speech (e.g., dialogue systems, voice input aids for physically disabled users, medical dictation systems). Compared to prepared or read speech, conversational speech contains disfluencies and utterances that might be considered 'ungrammatical', such as "...oh, well, I think ahhm exactly". The pronunciation of words may depend, for instance, on the regional background of the speakers, the formality of the situation, or the frequency of the word. A highly frequent word like yesterday may sound like yeshay, and the German word haben (to have) may sound like ham. This project focused on investigating interdisciplinary methods (drawing on linguistics, phonetics, and speech technology) to model the factors on which pronunciation variation depends in everyday speech.

The Methods

In this project, we collected and annotated the first large-scale speech database of Austrian German. It is a rich resource on pronunciation variation in Austrian German, containing approximately 1900 minutes of speech from 38 speakers from 5 provinces in 3 different speaking styles (read speech, spontaneous commands, and conversational speech). Moreover, it is one of the largest German speech databases with completely unconstrained and casual conversations, and is thus also relevant to speech scientists outside of Austria. We have also developed transcription tools for the corpus and have made both the speech material and the tools available to other researchers.

The Findings

Based on Dutch, German, and the collected Austrian German speech material, we found that pronunciation variation depends not only on well-known factors such as the regional background of the speaker and the speaking style, but also on, for example, the grammatical and morphological properties of the words. For instance, in spontaneous speech the German word der is pronounced differently depending on whether it is an article, a demonstrative pronoun, or a relative pronoun, whereas in read speech it is always pronounced the same way. These linguistic findings on pronunciation variation were used to develop methods to improve ASR systems. Most importantly, our work not only demonstrates novel methods for ASR, it also introduces a new perspective: whereas the high degree of pronunciation variation in spontaneous speech was previously seen primarily as a problem for ASR, we view it as an additional resource that is not present in read speech. This change in perspective will guide our future research plans.
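The kind of comparison described for der (pronunciation by word class, in read versus conversational speech) can be sketched as a simple tabulation over annotated tokens. This is a minimal, hypothetical Python example and not the project's analysis: the token tuples, the placeholder variant labels "A" and "B", and the function name are all invented, and the toy counts carry no empirical meaning.

```python
from collections import Counter, defaultdict

# Invented toy annotations: (word, word_class, speaking_style, variant_label).
# "A" and "B" are placeholder labels, not real transcriptions or results.
TOKENS = [
    ("der", "article", "conversational", "A"),
    ("der", "article", "conversational", "A"),
    ("der", "demonstrative", "conversational", "B"),
    ("der", "relative", "conversational", "A"),
    ("der", "relative", "conversational", "B"),
    ("der", "article", "read", "A"),
    ("der", "demonstrative", "read", "A"),
    ("der", "relative", "read", "A"),
]


def variant_rates(tokens, word, style):
    """Relative frequency of each variant per word class for one word and style."""
    counts = defaultdict(Counter)
    for w, word_class, s, variant in tokens:
        if w == word and s == style:
            counts[word_class][variant] += 1
    return {
        word_class: {v: n / sum(c.values()) for v, n in c.items()}
        for word_class, c in counts.items()
    }


for style in ("read", "conversational"):
    print(style, variant_rates(TOKENS, "der", style))
```

With real corpus annotations, a table like this would be the starting point for a statistical test of whether word class predicts the pronunciation variant within each speaking style.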

Research institution(s)
  • Technische Universität Graz - 100%
International project participants
  • Mirjam Ernestus, Radboud University - Netherlands

Research Output

  • 40 Citations
  • 13 Publications
Publications
  • 2017
    Title Rethinking classification results based on read speech, or: why improvements do not always transfer to other speaking styles
    DOI 10.1007/s10772-017-9436-y
    Type Journal Article
    Author Schuppler B
    Journal International Journal of Speech Technology
    Pages 699-713
  • 2017
    Title A corpus of read and conversational Austrian German
    DOI 10.1016/j.specom.2017.09.003
    Type Journal Article
    Author Schuppler B
    Journal Speech Communication
    Pages 62-74
  • 2017
    Title Acoustic correlates of stress and accent in Standard Austrian German.
    Type Book Chapter
    Author El Zarka D
  • 2013
    Title Informal speech processes can be categorical in nature, even if they affect many different words
    DOI 10.1121/1.4790352
    Type Journal Article
    Author Hanique I
    Journal The Journal of the Acoustical Society of America
    Pages 1644-1655
  • 2018
    Title On the use of acoustic features for automatic disambiguation of homophones in spontaneous German
    DOI 10.1016/j.csl.2017.12.011
    Type Journal Article
    Author Schuppler B
    Journal Computer Speech & Language
    Pages 209-224
  • 2014
    Title Pronunciation Variation in Read and Conversational Austrian German.
    Type Conference Proceeding Abstract
    Author Morales-Cordovilla JA et al.
    Conference Proceedings of Interspeech
  • 2014
    Title How extra-linguistic factors affect pronunciation variation in different speaking styles.
    Type Conference Proceeding Abstract
    Author Schuppler B
    Conference 22nd Czech-German Workshop on Speech Communication.
  • 2014
    Title GRASS: The Graz Corpus of Read and Spontaneous Speech.
    Type Conference Proceeding Abstract
    Author Pessentheiner H et al.
    Conference Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14).
  • 2013
    Title The challenge of manner classification in conversational speech.
    Type Conference Proceeding Abstract
    Author Boves L et al.
    Conference Proceedings of the Workshop on Speech Production in Automatic Speech Recognition, Satellite Workshop of Interspeech
  • 2015
    Title Automatic detection of uncertainty in spontaneous German dialogue.
    Type Conference Proceeding Abstract
    Author Schrank T
    Conference Proceedings of Interspeech
  • 2014
    Title Statistical Language and Speech Processing, Second International Conference, SLSP 2014, Grenoble, France, October 14-16, 2014, Proceedings
    DOI 10.1007/978-3-319-11397-5
    Type Book
    Publisher Springer Nature
  • 2014
    Title Automatic Phonetic Transcription in Two Steps: Forced Alignment and Burst Detection
    DOI 10.1007/978-3-319-11397-5_10
    Type Book Chapter
    Author Schuppler B
    Publisher Springer Nature
    Pages 132-143
  • 2014
    Title Where /aR/ the /R/s in Standard Austrian German?
    Type Conference Proceeding Abstract
    Author Jackschina A
    Conference Proceedings of Interspeech
