FWF — Austrian Science Fund
Acoustic modeling and transformation of varieties for speech synthesis

Michael Pucher (ORCID: 0000-0002-5374-1342)
  • Grant DOI 10.55776/P23821
  • Funding program Principal Investigator Projects
  • Status ended
  • Start February 1, 2012
  • End June 30, 2016
  • Funding amount € 296,510

Disciplines

Computer Sciences (95%); Linguistics and Literature (5%)

Keywords

    Speech Synthesis, Hidden Markov Model, Dialect, Machine Learning, Adaptation

Abstract

Our main goal in this research project is to advance variety modeling for speech synthesis by making optimal use of the available data resources. Social varieties (sociolects) and regional varieties (dialects) contain phonetically similar data, and statistical parametric synthesis can adapt background models with a relatively small amount of data. Together, these properties will yield synthesis methods that can model varieties using only a few minutes of adaptation speech data.

To reach this overall goal, we focus on three topics that are highly relevant for variety modeling and that represent new scientific challenges: average voice models for varieties, modeling of variety transformation, and modeling of varieties with incomplete training data. In average voice modeling for varieties, we will investigate variety- and speaker-adaptive training as a new training method for average voice models. In variety transformation, we will develop techniques to create a speaker's voice in a certain variety when only speech data of that speaker in a similar variety are available. Furthermore, we will investigate methods to create a speaker's voice from incomplete speech data sets, which can be used to synthesize historic dialect states.

Speech synthesis is becoming more and more important as an output interface in cognitive user interfaces. While today's technology can achieve natural-sounding synthesis of neutral-style speech, fast adaptation of speech synthesis systems to different contexts and situations, something both speakers and listeners are used to, remains a problem. While emotional speech and natural intonation are areas of active research, relatively little research is devoted to language varieties. Within this project we will develop the methods needed to build speech synthesis systems that can be easily adapted to social and regional language varieties. To achieve this, we search for optimal ways to use the available training data by exploiting similarities within social and regional varieties at different layers of abstraction.

Final report

Our main goal in this research project was the advancement of variety modeling for speech synthesis. To reach this overall goal, we focused on three topics that are highly relevant for variety modeling and that represented new scientific challenges: modeling of variety transformation, average voice models for varieties, and modeling of varieties with incomplete training data.

In modeling of variety transformation, we developed a method for unsupervised interpolation of language varieties that automatically creates in-between varieties by generating gradual transitions between two varieties, be it two dialects or sociolects, or a dialect and a standard variety. Furthermore, we developed a cross-variety speaker transformation method that can create a speaker's voice in a certain variety even if only speech data of that speaker in another variety are available.

In average voice modeling, we investigated different adaptation methods, such as dialect-adaptive training and dialect clustering, that exploit the phone sets shared by the dialects and the standard variety. We also applied to Albanian dialects an adaptive modeling method that uses one variety as the background variety and another as the adaptation variety.

In modeling of varieties with incomplete training data, we evaluated the perception of foreign-accented natural and synthetic speech in comparison to automatically accent-reduced synthetic speech. The applied method does not use an average voice model but only the phonetically incomplete accented speech data.

Speech synthesis is becoming increasingly important as an output interface in cognitive user interfaces. While emotional speech and natural intonation are areas of active research, less attention has been paid to language varieties in the context of speech synthesis. Within this project we developed methods for speech synthesis systems that can be easily adapted to social and regional language varieties.
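
The idea of generating gradual transitions between two varieties can be illustrated with a minimal sketch. It is not the project's published method: it assumes that the states of the two variety models are already aligned one-to-one, that each state has a Gaussian output distribution with diagonal covariance, and that means and variances are blended linearly. The names `GaussianState`, `interpolate_state`, and `transition` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GaussianState:
    """Output distribution of one model state (diagonal covariance assumed)."""
    mean: List[float]
    var: List[float]


def interpolate_state(a: GaussianState, b: GaussianState, alpha: float) -> GaussianState:
    """Blend two aligned states; alpha=0 returns variety a, alpha=1 variety b.

    Linear blending of both means and variances is an assumption of this
    sketch, not a description of the project's actual interpolation rule.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(a.mean) != len(b.mean) or len(a.var) != len(b.var):
        raise ValueError("states must have the same dimensionality")
    w_a, w_b = 1.0 - alpha, alpha
    mean = [w_a * ma + w_b * mb for ma, mb in zip(a.mean, b.mean)]
    var = [w_a * va + w_b * vb for va, vb in zip(a.var, b.var)]
    return GaussianState(mean, var)


def transition(a: GaussianState, b: GaussianState, steps: int) -> List[GaussianState]:
    """Return steps + 1 in-between states, moving gradually from a to b."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [interpolate_state(a, b, i / steps) for i in range(steps + 1)]


# Toy two-dimensional parameters; the values carry no phonetic meaning.
standard = GaussianState(mean=[1.0, 2.0], var=[0.5, 0.5])
dialect = GaussianState(mean=[3.0, 6.0], var=[1.5, 0.5])
halfway = interpolate_state(standard, dialect, 0.5)
print(halfway.mean, halfway.var)
```

A real system would additionally need to decide which states of the two varieties correspond to each other before any such blending can take place; the sketch takes that correspondence as given.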

Research institution(s)
  • Österreichische Akademie der Wissenschaften - 100%
International project participants
  • Sebastian Möller, Technische Universität Berlin - Germany
  • Junichi Yamagishi, National Institute of Informatics - Japan

Research Output

  • 19 Citations
  • 17 Publications
Publications
  • 2017
    Title Influence of speaker familiarity on blind and visually impaired children’s and young adults’ perception of synthetic voices
    DOI 10.1016/j.csl.2017.05.010
    Type Journal Article
    Author Pucher M
    Journal Computer Speech & Language
    Pages 179-195
  • 2013
    Title Cross-variety speaker transformation in HSMM-based speech synthesis
    Type Conference Proceeding Abstract
    Author Schabus D
    Conference 8th ISCA Speech Synthesis Workshop (SSW8)
  • 2013
    Title Structural KLD for Cross-Variety Speaker Adaptation in HMM-based Speech Synthesis
    DOI 10.2316/p.2013.798-069
    Type Conference Proceeding Abstract
    Author Toman M
  • 2015
    Title Efficient Pitch Estimation on Natural Opera-Singing by a Spectral Correlation based Strategy
    Type Journal Article
    Author Pucher M et al.
    Journal IPSJ SIG Technical Report
  • 2015
    Title Visio-articulatory to acoustic conversion of speech
    DOI 10.1145/2813852.2813858
    Type Conference Proceeding Abstract
    Author Pucher M
    Pages 1-2
  • 2015
    Title Comparison of dialect models and phone mappings in HSMM-based visual dialect speech synthesis
    Type Conference Proceeding Abstract
    Author Pucher M
    Conference 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing (FAAVSP)
  • 2016
    Title Development of a statistical parametric synthesis system for operatic singing in German
    DOI 10.21437/ssw.2016-11
    Type Conference Proceeding Abstract
    Author Pucher M
    Pages 64-69
  • 2013
    Title Multi-variety adaptive acoustic modeling in HSMM-based speech synthesis
    Type Conference Proceeding Abstract
    Author Schabus D et al.
    Conference 8th ISCA Speech Synthesis Workshop (SSW8)
  • 2016
    Title Aufnahme von hochwertigen authentischen Dialektdaten im Feld
    Type Conference Proceeding Abstract
    Author Pucher M
    Conference 13. Bayerisch-österreichische Dialektologentagung
  • 2015
    Title Influence of speaker familiarity on blind and visually impaired children's perception of synthetic voices in audio games
    Type Conference Proceeding Abstract
    Author Pucher M
    Conference 16th Annual Conference of the International Speech Communication Association
  • 2015
    Title Adaptive Speech Synthesis of Albanian Dialects
    DOI 10.1007/978-3-319-24033-6_18
    Type Book Chapter
    Author Pucher M
    Publisher Springer Nature
    Pages 158-164
  • 2015
    Title Evaluation of state mapping based foreign accent conversion
    Type Conference Proceeding Abstract
    Author Pucher M
    Conference 16th Annual Conference of the International Speech Communication Association
  • 2015
    Title An Open Source Speech Synthesis Frontend for HTS
    DOI 10.1007/978-3-319-24033-6_33
    Type Book Chapter
    Author Toman M
    Publisher Springer Nature
    Pages 291-298
  • 2015
    Title Unsupervised and phonologically controlled interpolation of Austrian German language varieties for speech synthesis
    DOI 10.1016/j.specom.2015.06.005
    Type Journal Article
    Author Toman M
    Journal Speech Communication
    Pages 176-193
  • 0
    Title MMASCS multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech
    Type Other
    Author Pucher M
  • 0
    Title GIDS Bad Goisern and Innervillgraten Audio-Visual Dialect Speech Corpus, a collection of audiovisual speech recordings for research purposes
    Type Other
    Author Pucher M
  • 0
    Title FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing
    Type Other
    Author Davis C et al.
