Phase-Aware Signal Processing for Speech Transmission
Disciplines
Electrical Engineering, Electronics, Information Engineering (75%); Computer Sciences (25%)
Keywords
Phase-aware signal processing,
Speech enhancement,
Perceived signal quality,
Speech transmission,
Signal processing,
Phase spectrum estimation
Many everyday applications depend on successful speech transmission and speech communication, for example smart homes with voice commands, hands-free mobile telephony, and speech recognition by machines. In all of these applications, performance must remain high and robust to background noise and room reverberation. A pre-processing stage in the form of signal enhancement is therefore needed to remove the undesired background noise sources. While state-of-the-art technology for speech transmission mainly focuses on filtering the signal in the amplitude domain, we aim to push the limits of the achievable performance by extending the enhancement solution to both the amplitude and phase components, leading to the new concept of phase-aware signal processing. The contributions in this proposal are divided into four lines: i) develop methods for estimating phase information about the desired source signal observed in noise, ii) develop novel speech enhancement algorithms in the complex domain to circumvent the limits of conventional methods, iii) extend single-channel source separation and artificial bandwidth extension using the concept of phase-aware signal processing, and iv) derive new quality estimators that reflect how humans perceive speech in noise and reduce or avoid tedious listening tests.
Many everyday applications depend heavily on successful speech transmission and speech communication, for example voice-controlled smart homes, hands-free mobile telephony, and robust automatic speech recognition. All of these applications are required to perform with sufficient accuracy in the presence of background noise or reverberation. A pre-processing stage in the form of speech enhancement or source separation is required to remove the undesired interfering component from the desired one before the target application. While state-of-the-art technology mainly focuses on filtering the signal in the amplitude domain, we aim to push the limits of the achievable performance by extending the enhancement problem to both the amplitude and phase parts, leading to the new concept of phase-aware signal processing. The contributions in this proposal are threefold: i) The first goal is to find solutions for estimating the phase of the desired signals from their mixed observation. The knowledge about phase is then further utilized to improve the amplitude estimation accuracy, leading to a full complex signal enhancement framework. ii) The second goal is to extend the phase-aware signal processing tools to a speaker-dependent scenario. While previous works trained statistical models on spectral amplitudes extracted from training sentences, we extend the dictionary learning idea to phase information as well. To this end, a new harmonic model plus phase distortion is employed to obtain a useful representation of phase. iii) Finally, the last research question addressed in this proposal is to find a reliable predictor for evaluating the performance of the phase-aware signal processing tools developed in the previous steps. This is important to quantify how successful the proposed methods are in achieving an accurate estimation of the amplitude and phase parts when used for different applications: speech enhancement, source separation, and robust automatic speech recognition.
- Technische Universität Graz - 100%
- Paavo Alku, Aalto University Helsinki - Finland
- Rahim Saeidi, University of Eastern Finland - Finland
- Gilles Degottex, Centre Georges Pompidou - France
- Tim Fingscheidt, Technische Universität Braunschweig - Germany
Research Output
- 351 Citations
- 18 Publications
- 1 Software
- 2 Scientific Awards
2018
Title: Single-channel speech enhancement using inter-component phase relations; DOI: 10.1016/j.specom.2018.03.009; Type: Journal Article; Author: Barysenka S; Journal: Speech Communication; Pages: 144-160
2016
Title: Fixed Points of Belief Propagation -- An Analysis via Polynomial Homotopy Continuation; DOI: 10.48550/arxiv.1605.06451; Type: Preprint; Author: Knoll C
2016
Title: On the Importance of Harmonic Phase Modification for Improved Speech Signal Reconstruction; DOI: 10.1109/icassp.2016.7471742; Type: Conference Proceeding Abstract; Author: Maly A; Pages: 584-588
2016
Title: Phase-Processing for Voice Activity Detection: A Statistical Approach; DOI: 10.1109/eusipco.2016.7760439; Type: Conference Proceeding Abstract; Author: Stahl J; Pages: 1202-1206
2014
Title: Phase Estimation in Single Channel Speech Enhancement Using Phase Decomposition; DOI: 10.1109/lsp.2014.2365040; Type: Journal Article; Author: Kulmer J; Journal: IEEE Signal Processing Letters; Pages: 598-602
2019
Title: Exploiting temporal correlation in pitch-adaptive speech enhancement; DOI: 10.1016/j.specom.2019.05.001; Type: Journal Article; Author: Stahl J; Journal: Speech Communication; Pages: 1-13
2019
Title: Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability; DOI: 10.1109/taslp.2019.2937174; Type: Journal Article; Author: Wood S; Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Pages: 2150-2161
2017
Title: New Results in Modulation-Domain Single-Channel Speech Enhancement; DOI: 10.1109/taslp.2017.2747082; Type: Journal Article; Author: Mowlaee P; Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Pages: 2125-2137
2017
Title: Fixed Points of Belief Propagation—An Analysis via Polynomial Homotopy Continuation; DOI: 10.1109/tpami.2017.2749575; Type: Journal Article; Author: Knoll C; Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence; Pages: 2124-2136
2017
Title: Impact of phase estimation on single-channel speech separation based on time-frequency masking; DOI: 10.1121/1.4986647; Type: Journal Article; Author: Mayer F; Journal: The Journal of the Acoustical Society of America; Pages: 4668-4679
2017
Title: Iterative joint MAP single-channel speech enhancement given non-uniform phase prior; DOI: 10.1016/j.specom.2016.11.008; Type: Journal Article; Author: Mowlaee P; Journal: Speech Communication; Pages: 85-96
2020
Title: Single-channel speech enhancement with correlated spectral components: Limits-potential; DOI: 10.1016/j.specom.2020.05.002; Type: Journal Article; Author: Mowlaee P; Journal: Speech Communication; Pages: 58-69
2019
Title: Maximum a posteriori Speech Enhancement Based on Double Spectrum; DOI: 10.21437/interspeech.2019-1197; Type: Conference Proceeding Abstract; Author: Mowlaee P; Pages: 2738-2742
2015
Title: Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential; DOI: 10.1109/taslp.2015.2430820; Type: Journal Article; Author: Mowlaee P; Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Pages: 1283-1294
2015
Title: Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information; DOI: 10.1109/taslp.2015.2439038; Type: Journal Article; Author: Mowlaee P; Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Pages: 1521-1532
2018
Title: A Simple and Effective Framework for a Priori SNR Estimation; DOI: 10.1109/icassp.2018.8461787; Type: Conference Proceeding Abstract; Author: Stahl J; Pages: 5644-5648
2018
Title: A Pitch-Synchronous Simultaneous Detection-Estimation Framework for Speech Enhancement; DOI: 10.1109/taslp.2017.2779405; Type: Journal Article; Author: Stahl J; Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Pages: 436-450
2016
Title: Advances in phase-aware signal processing in speech communication; DOI: 10.1016/j.specom.2016.04.002; Type: Journal Article; Author: Mowlaee P; Journal: Speech Communication; Pages: 1-29
2016
Title: Editor for special issue; Type: Appointed as the editor/advisor to a journal or book series; Level of Recognition: Continental/International
2016
Title: IEEE Senior membership; Type: Medal; Level of Recognition: Continental/International