Channel selection in the short-time modulation domain for distant speech recognition


Author(s): Himawan, Ivan; Motlicek, Petr; Sridharan, Sridha; Dean, David; Tjondronegoro, Dian
Date(s)

01/09/2015

Abstract

Automatic speech recognition from multiple distant microphones poses significant challenges because of noise and reverberation. The quality of speech acquisition may vary between microphones because of speaker movement and channel distortions. This paper proposes a channel selection approach that selects reliable channels using a selection criterion operating in the short-term modulation spectrum domain. The proposed approach quantifies the relative strength of speech from each microphone and speech obtained from beamforming modulations. The new technique is evaluated experimentally under real reverberant conditions in terms of perceptual evaluation of speech quality (PESQ) measures and word error rate (WER). An overall improvement in recognition rate is observed using delay-sum and superdirective beamformers compared to the case when the channel is selected randomly using circular microphone arrays.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/87652/

Relation

http://eprints.qut.edu.au/87652/1/Paper%20for%20Channel%20Selection%20in%20the%20Short-time%20Modulation%20Domain%20for%20Distant%20Speech%20Recognition.pdf

Himawan, Ivan, Motlicek, Petr, Sridharan, Sridha, Dean, David, & Tjondronegoro, Dian (2015) Channel selection in the short-time modulation domain for distant speech recognition. In Interspeech 2015: 16th Annual Conference of the International Speech Communication Association, 6-10 September 2015, Maritim International Congress Center, Dresden, Germany.

Rights

Copyright 2015 [please consult the authors]

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords: #channel selection #signal quality #microphone arrays #reverberation

Type

Conference Paper