FWF — Austrian Science Fund

Objective differentiation of dysphonic voice quality types

Philipp Aichinger (ORCID: 0000-0003-4353-4996)
  • Grant DOI 10.55776/KLI722
  • Funding program Clinical Research
  • Status ended
  • Start November 1, 2018
  • End April 30, 2024
  • Funding amount € 402,531

Disciplines

Electrical Engineering, Electronics, Information Engineering (60%); Clinical Medicine (40%)

Keywords

    Voice disorders, Dysphonia, Voice assessment, Laryngeal high-speed videos, Speech processing, Acoustics

Abstract

Voice problems are not yet completely understood, partly because conventional methods of clinical examination are limited. The aim of this research project is to create computerized methods that automatically and reliably recognize and identify voice problems in microphone recordings of patients' voices. Videos of the throat and microphone recordings of 230 patients with voice problems are obtained. Super-slow-motion videos with 4000 images per second are used because the vocal folds, which are located in the larynx, normally vibrate very fast during voice production, more than 100 times per second. To visualize irregularities, videos showing two seconds of vocal fold vibration are slowed down to a playing time of five minutes. The researchers investigate how the vocal fold vibration relates to the sound of the voice. In particular, three voice types receive attention. First, so-called vocal fry is a very low-pitched type of voice. This voice type may be evocative of the Strohbass singing register. Sometimes, it is possible to hear individual pulses of the vocal folds in these voices. Second, extrapulsed voices are investigated. These may be compared to a common phenomenon in the human heartbeat, i.e., extrasystoles. Just as a human heart may stumble every now and then, extra pulses may occur in the voice. Frequently occurring extra pulses may be a sign of a voice problem and may be perceived as sounding raspy. Third, phase differences between the left and the right vocal fold may occur in voice disorders. To understand phase differences, one could try to bounce two basketballs with the left and the right hand simultaneously and notice that the balls hardly ever hit the ground at exactly the same time, so that two separate bouncing sounds are audible every time the balls hit the ground. Such timing differences also occur in vocal folds and are a sign of a voice problem.
Interestingly, instead of hearing the two vocal folds separately, listeners have reported hearing a certain type of rumbling in voices with phase differences. This is mainly because the vocal folds vibrate at a much higher frequency than basketballs bounce. A multifaceted approach is chosen to investigate how the vocal folds vibrate and how abnormal voices sound to listeners. The project comprises computer-based science, patient data, and auditory experiments. The results may be applied to improve clinical voice quality assessment.
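
The slow-motion figures quoted on this page can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the project's software; the function names are hypothetical. It uses only numbers stated on this page: 4000 images per second, two seconds shown in five minutes, a vibration rate of 100 cycles per second as the lower bound, and the slowdown factor of 160 mentioned in the final report.

```python
def slowdown_factor(real_seconds: float, playback_seconds: float) -> float:
    """Ratio of playback duration to the real duration of the recorded event."""
    return playback_seconds / real_seconds


def playback_fps(capture_fps: float, factor: float) -> float:
    """Playback frame rate that produces the given slowdown factor."""
    return capture_fps / factor


def frames_per_cycle(capture_fps: float, vibration_hz: float) -> float:
    """Number of video frames that sample one vocal fold vibration cycle."""
    return capture_fps / vibration_hz


CAPTURE_FPS = 4000.0  # images per second, as stated in the abstract

# Two seconds of vibration shown over five minutes of playing time.
print(slowdown_factor(2.0, 5 * 60.0))        # 150.0

# A slowdown factor of 160 (final report) corresponds to this playback rate.
print(playback_fps(CAPTURE_FPS, 160.0))      # 25.0

# At 100 vibrations per second, each cycle is covered by this many frames.
print(frames_per_cycle(CAPTURE_FPS, 100.0))  # 40.0
```

The two descriptions are roughly consistent: a factor of 160 would stretch two seconds to 320 seconds, slightly more than the "five minutes" quoted in the abstract. Faster vibration leaves proportionally fewer frames per cycle.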

Final report

Verbal communication is one of the most significant human achievements, relying on the functioning of the voice box, particularly the vibration of the vocal cords. This vibration gives the voice its tone, similar to how a vibrating string gives a guitar its sound. Voice disorders may disrupt this normal vibration, making speaking difficult. Clinicians use cameras to examine patients' vocal folds and listen to the nuances of the voice. However, vocal cords vibrate rapidly, making them difficult to see, and both visual and auditory assessments can be subjective. This project aimed to address these challenges through innovative technology and methods.

The techniques and main findings are the following. First, researchers used high-speed cameras to slow down recordings of vocal fold vibration by a factor of 160. This allowed for observation of minute details otherwise missed, especially features of irregularity. Second, microphone recordings were analyzed to understand how vocal sounds relate to vibrations. This led to an improved understanding of how different vocal fold conditions affect voice sound. Third, the project involved advanced simulations of vocal fold vibrations and the auditory process, further pinpointing critical features of phonatory dysfunctions. Finally, the project leveraged artificial intelligence and machine learning (AIML). In particular, recent advances in speech technology (cf. Siri, Alexa, etc.) were adapted to create more realistic simulations of pathological voices, and AIML mimicking human vision was used to automate video analysis, reducing the need for manual review and facilitating the implementation of slow-motion video analysis in clinical settings.

Specific voice types were investigated. First, diplophonia is a condition in which different regions of the vocal folds vibrate at distinct rates, causing a doubled voice. Software was developed to measure objectively how often this occurs in speech. Second, vocal fry and creaky voice are characterized by separated sound pulses, similar to the sounds of a frying pan, a creaky door, or the making of popcorn. This research clarified that such voices either have a low vibration rate or exhibit other disturbances that create only the illusion of pulse separation. Third, researchers investigated timing differences between vocal fold regions and extra pulses similar to extrasystoles in heartbeats.

In summary, this research has greatly enhanced our understanding of vocal fold mechanics and voice perception. By combining high-speed video technology, computer simulations, and AI, the project tackled key challenges in diagnosing and treating voice disorders. The findings have the potential to transform clinical practices, providing more accurate and reliable diagnostics via digital twinning and decision support, ultimately leading to better treatment outcomes and an improved quality of life for individuals with voice problems.

Research institution(s)
  • Medizinische Universität Wien - 100%
International project participants
  • Jean Schoentgen, Université Libre de Bruxelles - Belgium

Research Output

  • 21 Citations
  • 29 Publications
  • 2 Methods & Materials
  • 6 Scientific Awards
Publications
  • 2024
    Title Auditory perception of impulsiveness and tonality in vocal fry
    DOI 10.61782/fa.2023.0426
    Type Conference Proceeding Abstract
    Author Devaraj V
    Pages 4719-4724
  • 2021
    Title Modelling of Amplitude Modulated Vocal Fry Glottal Area Waveforms Using an Analysis-by-Synthesis Approach
    DOI 10.3390/app11051990
    Type Journal Article
    Author Devaraj V
    Journal Applied Sciences
    Pages 1990
    Link Publication
  • 2021
    Title Fitting synthetic to clinical kymographic images for deriving kinematic vocal fold parameters: Application to left-right vibratory phase differences
    DOI 10.1016/j.bspc.2020.102253
    Type Journal Article
    Author Bulusu S
    Journal Biomedical Signal Processing and Control
    Pages 102253
    Link Publication
  • 2021
    Title Modelling sagittal and vertical phase differences in a lumped and distributed elements vocal fold model
    DOI 10.1016/j.bspc.2020.102309
    Type Journal Article
    Author Drioli C
    Journal Biomedical Signal Processing and Control
    Pages 102309
    Link Publication
  • 2021
    Title Synthesis and Analysis-By-Synthesis of Modulated Diplophonic Glottal Area Waveforms
    DOI 10.1109/taslp.2021.3053387
    Type Journal Article
    Author Aichinger P
    Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing
    Pages 914-926
    Link Publication
  • 2019
    Title Detection of extra pulses in synthesized glottal area waveforms of dysphonic voices
    DOI 10.1016/j.bspc.2019.01.007
    Type Journal Article
    Author Aichinger P
    Journal Biomedical Signal Processing and Control
    Pages 158-167
    Link Publication
  • 2019
    Title Analysis and Synthesis of Vocal Flutter and Vocal Jitter
    DOI 10.21437/interspeech.2019-1998
    Type Conference Proceeding Abstract
    Author Schoentgen J
    Pages 2518-2522
  • 2024
    Title Deep Learning-Based Detection of Glottis Segmentation Failures
    DOI 10.3390/bioengineering11050443
    Type Journal Article
    Author Aichinger P
    Journal Bioengineering (Basel, Switzerland)
  • 2018
    Title Detection of Diplophonation in Audio Recordings of German Standard Text Readings
    DOI 10.1016/j.jvoice.2018.06.009
    Type Journal Article
    Author Aichinger P
    Journal Journal of Voice
  • 2019
    Title Tracking of multiple fundamental frequencies in standard text readings of diplophonic speakers
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 125-128
    Link Publication
  • 2019
    Title Perturbation of cycle lengths and cycle peak amplitudes in diplophonic voices
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 121-124
    Link Publication
  • 2019
    Title A glottal area waveform model for multi-pulsed vocal fry
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 133-136
    Link Publication
  • 2019
    Title Extracting kinematic vocal fold parameters from videokymograms via simulation of clinical data
    Type Conference Proceeding Abstract
    Author Bulusu S
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 141-144
    Link Publication
  • 2019
    Title Modelling longitudinal phase differences in a lumped and distributed elements vocal fold model
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 137-140
    Link Publication
  • 2019
    Title Analysis and synthesis of vocal flutter and vocal jitter
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference Annual Conference of the International Speech Communication Association, INTERSPEECH
    Pages 2518-2522
  • 2019
    Title Aerodynamics and Lumped-Masses Combined with Delay Lines for Modeling Vertical and Anterior-Posterior Phase Differences in Pathological Vocal Fold Vibration
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference Annual Conference of the International Speech Communication Association, INTERSPEECH
    Pages 2503-2507
  • 2019
    Title Characterization of turbulence noise in breathy human phonation
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference ICA 2019 and EAA Euroregio
    Pages 3139-3146
  • 2021
    Title Neural network based estimation of vocal fold kinematic parameters from digital videokymograms
    Type Conference Proceeding Abstract
    Author Bulusu S
    Conference Advances in Quantitative Laryngology, Voice and Speech Research (AQL)
  • 2021
    Title Artificial high-speed videos of normal and dysphonic vocal fold vibration
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 93-96
    Link Publication
  • 2019
    Title Characterization of turbulence noise in breathy human phonation
    DOI 10.18154/rwth-conv-239381
    Type Other
    Author Aichinger P
    Link Publication
  • 2022
    Title A Modelling Study on the Comparison of Predicted Auditory Nerve Firing Rates for the Personalized Indication of Cochlear Implantation
    DOI 10.3390/app12105168
    Type Journal Article
    Author Aichinger P
    Journal Applied Sciences
    Pages 5168
    Link Publication
  • 2022
    Title Simulated Laryngeal High-Speed Videos for the Study of Normal and Dysphonic Vocal Fold Vibration
    DOI 10.1044/2022_jslhr-21-00673
    Type Journal Article
    Author Aichinger P
    Journal Journal of Speech, Language, and Hearing Research (JSLHR)
    Pages 2431-2445
    Link Publication
  • 2021
    Title Fitting a biomechanical model of the folds to oscillatory patterns with AP and LR asymmetries observed in high speed video data
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 89-92
    Link Publication
  • 2021
    Title Objective detection of amplitude modulation in glottal area waveforms
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 15-18
    Link Publication
  • 2023
    Title Performance evaluation of 3D neural networks applied to high-speed videos for glottis segmentation in difficult cases
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
    Pages 87-90
    Link Publication
  • 2023
    Title Sinusoidal Modelling of Vocal Fold Medial Surface Vibration Trajectories
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference Conference on Advances in Quantitative Laryngology, Voice and Speech Research (AQL)
  • 2023
    Title Kinematics of Vocal Fold Vibration in Double-Pulsed Phonation
    Type Conference Proceeding Abstract
    Author Aichinger P
    Conference Annual Symposium of the Voice Foundation
  • 2023
    Title Biomechanics and acoustics of voice production
    Type PhD Thesis
    Author Lehoux, Sarah
  • 2023
    Title Auditory Perception of Impulsiveness and Tonality in Vocal Fry
    DOI 10.3390/app13074186
    Type Journal Article
    Author Devaraj V
    Journal Applied Sciences
Methods & Materials
  • 2022
    Title Synthesizer for videos of vocal fold vibration
    Type Model of mechanisms or symptoms - human
    Public Access
  • 2019
    Title Diplophonia rate (DR) extractor
    Type Physiological assessment or outcome measure
    Public Access
Scientific Awards
  • 2023
    Title Becoming Ap.Professor
    Type Honorary Degree
    Level of Recognition Regional (any country)
  • 2023
    Title Senior member of IEEE
    Type Awarded honorary membership, or a fellowship, of a learned society
    Level of Recognition Continental/International
  • 2023
    Title Associate Editor for IEEE/ACM Transactions on Audio Speech and Language Processing
    Type Appointed as the editor/advisor to a journal or book series
    Level of Recognition Continental/International
  • 2022
    Title Sarah Lehoux visited the lab for one week in 2022
    Type Attracted visiting staff or user to your research group
    Level of Recognition Continental/International
  • 2022
    Title Guest editor for Biomedical Signal Processing and Control
    Type Appointed as the editor/advisor to a journal or book series
    Level of Recognition Continental/International
  • 2018
    Title Attracted Jean Schoentgen to temporarily join the lab in Vienna
    Type Attracted visiting staff or user to your research group
    Level of Recognition Continental/International

Contact

Austrian Science Fund (FWF)
Georg-Coch-Platz 2
(Entrance Wiesingerstraße 4)
1010 Vienna

office(at)fwf.ac.at
+43 1 505 67 40

© Österreichischer Wissenschaftsfonds FWF