Neuromorphic model for sound source segregation


Author(s): Krishnan, Lakshmi
Contributor(s)

Shamma, Shihab

Digital Repository at the University of Maryland

University of Maryland (College Park, Md.)

Electrical Engineering

Date(s)

22/06/2016

22/06/2016

2015

Abstract

While humans can easily segregate and track a speaker's voice in a loud, noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixture. Other methods model the characteristics of the noise. Source segregation of simultaneous speech mixtures from a single microphone recording, with no knowledge of the target speaker, remains a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features belonging to different sources to perform unsupervised monaural source segregation. While it requires no prior information about the target speaker, the method can gracefully incorporate knowledge about the target speaker to further enhance the segregation. Through a series of EEG experiments, we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses about the physiological mechanisms underlying the remarkable perceptual ability of humans to segregate acoustic sources, and about its psychophysical manifestations in navigating complex sensory environments.
Results from the EEG experiments provide further insights into the assumptions behind the model and motivate future single-unit studies that could provide more direct evidence for the principle of temporal coherence.
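The core idea of temporal coherence can be illustrated with a minimal toy sketch: feature channels driven by the same source share a common temporal envelope, so correlating channel envelopes over time groups them by source. This is an illustrative simplification of the idea, not the dissertation's actual model; the envelopes, channel counts, and correlation threshold below are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2, 2000)

# Two toy "sources", each driving a disjoint set of feature channels
# with its own temporal envelope (4 Hz vs. 7 Hz amplitude modulation).
env_a = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
env_b = 0.5 * (1 + np.sin(2 * np.pi * 7 * t))

# Channels 0-2 follow source A, channels 3-5 follow source B,
# each with a random gain plus a little additive noise.
channels = np.stack(
    [env_a * rng.uniform(0.5, 1.5) + 0.05 * rng.standard_normal(t.size)
     for _ in range(3)]
    + [env_b * rng.uniform(0.5, 1.5) + 0.05 * rng.standard_normal(t.size)
       for _ in range(3)]
)

# Temporal coherence: correlate every pair of channels over time.
C = np.corrcoef(channels)

# Channels whose envelopes co-vary strongly with channel 0 are
# grouped as belonging to the same source (threshold is arbitrary).
group = np.where(C[0] > 0.5)[0]
print(group)  # → [0 1 2]
```

Because the two modulation rates are uncorrelated over the analysis window, the correlation matrix cleanly separates the two channel groups, which is the intuition the model builds on at the level of cortical feature channels.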

Identifier

doi:10.13016/M2NB68

http://hdl.handle.net/1903/18155

Language(s)

en

Keywords #Electrical engineering #Neurosciences #auditory EEG #auditory scene analysis #cocktail party problem #sound source segregation #temporal coherence
Type

Dissertation