FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition


Author(s): Ye, Harvey; Whittington, Jim; Himawan, Ivan; Kleinschmidt, Tristan; Mason, Michael

Date(s)

05/03/2009

Abstract

In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
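The delay-and-sum idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' FPGA design: the function names `delay_and_sum` and `quantize_q15`, the integer-sample delay, and the 16-bit (Q15) word length are all assumptions chosen for illustration, since the record does not state the paper's actual parameters. The sketch delays one microphone channel so that speech from the target direction lines up in both channels, then averages them; the quantizer hints at how a fixed-point output might be compared with a floating-point reference.

```python
def delay_and_sum(mic_a, mic_b, delay_b):
    """Average two channels after delaying mic_b by delay_b samples.

    Assumption: delay_b is a non-negative whole number of samples.
    A fractional delay would need interpolation, which is not shown here.
    """
    if delay_b < 0:
        raise ValueError("delay_b must be non-negative")
    n = min(len(mic_a), len(mic_b))
    out = []
    for i in range(n):
        j = i - delay_b
        # Samples before the start of the recording are treated as silence.
        b = mic_b[j] if j >= 0 else 0.0
        out.append(0.5 * (mic_a[i] + b))
    return out


def quantize_q15(x):
    """Round a float to a 16-bit Q15 value (assumed word length) and back."""
    q = max(-32768, min(32767, round(x * 32768)))
    return q / 32768


# Hypothetical example: the speech reaches mic_b one sample before mic_a.
speech = [0.0, 0.5, -0.25, 0.125]
mic_b = speech
mic_a = [0.0] + speech[:-1]

enhanced = delay_and_sum(mic_a, mic_b, delay_b=1)
fixed_point = [quantize_q15(v) for v in enhanced]
```

With the channels aligned, the speech component adds coherently while uncorrelated noise in the two channels partially cancels in the average, which is the source of the enhancement.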

Format

application/pdf

Identifier

http://eprints.qut.edu.au/31339/

Publisher

Cooperative Research Centre for Advanced Automotive Technology

Relation

http://eprints.qut.edu.au/31339/1/c31339.pdf

Ye, Harvey, Whittington, Jim, Himawan, Ivan, Kleinschmidt, Tristan, & Mason, Michael (2009) FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition. In AutoCRC Conference 2009 : Conference Proceedings, Cooperative Research Centre for Advanced Automotive Technology, Melbourne Convention and Exhibition Centre, Melbourne, Victoria.

Rights

Copyright 2009 [please consult the authors]

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords

090609 Signal Processing; 090601 Circuits and Systems; field programmable gate arrays; array signal processing; speech enhancement; speech recognition

Type

Conference Paper