Audio visual automatic speech recognition in vehicles


Author(s): Navarathna, Rajitha; Dean, David B.; Lucey, Patrick J.; Sridharan, Sridha
Date(s)

21/07/2010

Abstract

Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, using this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Automatic Speech Recognition (AVASR). Research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, practical deployment of AVASR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVASR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality is a better approach for improving the robustness and effectiveness of voice-only recognition systems in car cabin environments.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/33212/

Relation

http://eprints.qut.edu.au/33212/1/c33212.pdf

Navarathna, Rajitha, Dean, David B., Lucey, Patrick J., & Sridharan, Sridha (2010) Audio visual automatic speech recognition in vehicles. In AutoCRC2010 Conference, 27th July 2010, Cliftons, Melbourne, Vic.

Navarathna, Rajitha, Dean, David B., Lucey, Patrick J., & Sridharan, Sridha (2010) Audio visual automatic speech recognition in vehicles. In Proceedings of AutoCRC2010 Conference, Cliftons, Melbourne. (In Press)

Rights

Copyright 2010 [please consult the authors]

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords #090609 Signal Processing #AVASR #AVICAR database
Type

Conference Paper