Disciplines
Electrical Engineering, Electronics, Information Engineering (60%); Computer Sciences (15%); Medical-Theoretical Sciences, Pharmacy (25%)
Keywords
Speech and Audio Coding,
Neural Pulse Code,
Quantizer Design,
Auditory Model Inversion,
Auditory Representation,
Perceptual Domain
Abstract
The goal of the proposed research is the development of a new and efficient source coder for speech and audio
signals based on the approach of coding in the perceptual domain. In this approach the signal is transformed into an
auditory representation by passing it through a model of the human peripheral auditory system. The auditory
representation is quantized and encoded for efficient digital transmission or storage. After decoding, the auditory
representation is transformed back into the acoustic domain using an inverse of the auditory model.
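
The processing chain described above (analysis, quantization, decoding, inversion) can be illustrated with a minimal sketch. It is not the proposed coder: the "auditory model" here is a hypothetical stand-in (a Fourier spectrum with power-law magnitude compression, chosen only because it is trivially invertible), the quantizer is a plain uniform scalar quantizer, and all function names, the compression exponent, and the step sizes are assumptions made for illustration. Entropy coding of the indices is omitted.

```python
import numpy as np


def to_auditory(signal, exponent=0.3):
    """Toy stand-in for an auditory model (assumption, not the real model):
    Fourier spectrum with power-law compressed magnitude plus phase."""
    spectrum = np.fft.rfft(signal)
    return np.abs(spectrum) ** exponent, np.angle(spectrum)


def from_auditory(compressed_mag, phase, n_samples, exponent=0.3):
    """Inverse of the toy model: undo the compression, return to the time domain."""
    magnitude = compressed_mag ** (1.0 / exponent)
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n_samples)


def quantize(values, step):
    """Uniform scalar quantizer: map real values to integer indices."""
    return np.round(values / step).astype(np.int64)


def dequantize(indices, step):
    """Map integer indices back to reconstruction levels."""
    return indices.astype(np.float64) * step


def code_and_decode(signal, mag_step=0.05, phase_step=0.05):
    """Run the full chain: analysis, quantization, decoding, inversion."""
    mag, phase = to_auditory(signal)
    mag_idx = quantize(mag, mag_step)        # indices that would be transmitted
    phase_idx = quantize(phase, phase_step)  # indices that would be transmitted
    return from_auditory(
        dequantize(mag_idx, mag_step),
        dequantize(phase_idx, phase_step),
        len(signal),
    )


def snr_db(reference, estimate):
    """Signal-to-noise ratio of the reconstruction in decibels."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))


if __name__ == "__main__":
    t = np.arange(1024) / 8000.0
    tone = np.sin(2.0 * np.pi * 440.0 * t)
    for step in (0.5, 0.05, 0.001):
        decoded = code_and_decode(tone, mag_step=step, phase_step=step)
        print(f"step={step}: SNR = {snr_db(tone, decoded):.1f} dB")
```

In this sketch the quantization step size trades bit rate against reconstruction quality. SNR is used only as a simple waveform-domain check; a perceptual-domain coder would instead aim to keep the quantization error small in the auditory representation itself.
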
Auditory modeling and research on perceptual-domain coding provide insight into human perception and
facilitate the extraction of the signal features that are most relevant to the listener. The resulting findings not only
yield a new coding method for transmission and storage but also, importantly, support the development of
next-generation hearing aids and cochlear implants.
The interdisciplinary nature of perceptual-domain coding calls for consultation and cooperation with experts in
information theory as well as hearing physiology. In collaboration with Professor Bastiaan Kleijn and his research
group in Stockholm, an optimum quantizer for the encoding of auditory representations will be designed. In
cooperation with Professor Roy Patterson in Cambridge, a more accurate auditory model will be investigated
and incorporated into the perceptual-domain coder.