Clustered blind beamforming from ad-hoc microphone arrays


Author(s): Himawan, Ivan; McCowan, Iain; Sridharan, Sridha
Date(s)

2010

Abstract

Microphone arrays have been used in various applications to capture conversations, such as in meetings and teleconferences. In many cases, the microphone and likely source locations are known a priori, and calculating beamforming filters is therefore straightforward. In ad-hoc situations, however, when the microphones have not been systematically positioned, this information is not available and beamforming must be achieved blindly. In achieving this, a commonly neglected issue is whether it is optimal to use all of the available microphones, or only an advantageous subset of these. This paper commences by reviewing different approaches to blind beamforming, characterising them by the way they estimate the signal propagation vector and the spatial coherence of noise in the absence of prior knowledge of microphone and speaker locations. Following this, a novel clustered approach to blind beamforming is motivated and developed. Without using any prior geometrical information, microphones are first grouped into localised clusters, which are then ranked according to their relative distance from a speaker. Beamforming is then performed using either the closest microphone cluster, or a weighted combination of clusters. The clustered algorithms are compared to the full set of microphones in experiments on a database recorded on different ad-hoc array geometries. These experiments evaluate the methods in terms of signal enhancement as well as performance on a large vocabulary speech recognition task.
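
The three stages named in the abstract (cluster, rank, beamform) can be illustrated with a minimal Python sketch. This record does not state which clustering criterion, ranking measure, or beamformer the paper uses, so everything below is an assumption made for illustration: microphones are grouped by peak normalised cross-correlation, clusters are ranked by estimated time delay of arrival relative to a reference microphone, and the closest cluster is combined with a plain delay-and-sum beamformer. The function names, the threshold, and the synthetic signals in `make_demo` are all hypothetical.

```python
import numpy as np

def peak_xcorr(a, b, max_lag):
    """Peak normalised circular cross-correlation and the lag (in samples) by which b trails a."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.array([np.mean(a * np.roll(b, -lag)) for lag in lags])
    best = int(np.argmax(scores))
    return float(scores[best]), int(lags[best])

def cluster_mics(x, max_lag, threshold):
    """Greedy grouping (assumed criterion): a mic joins the first cluster whose seed it correlates with."""
    clusters = []
    for m in range(x.shape[0]):
        for c in clusters:
            if peak_xcorr(x[c[0]], x[m], max_lag)[0] >= threshold:
                c.append(m)
                break
        else:
            clusters.append([m])
    return clusters

def rank_clusters(x, clusters, max_lag, ref=0):
    """Order clusters by mean arrival delay relative to mic `ref` (earliest first, assumed proxy for distance)."""
    def mean_delay(c):
        return np.mean([peak_xcorr(x[ref], x[m], max_lag)[1] for m in c])
    return sorted(clusters, key=mean_delay)

def delay_and_sum(x, cluster, max_lag):
    """Time-align every mic in the cluster to the cluster's first mic, then average."""
    ref = x[cluster[0]]
    aligned = [np.roll(x[m], -peak_xcorr(ref, x[m], max_lag)[1]) for m in cluster]
    return np.mean(aligned, axis=0)

def make_demo(n=4000, seed=0):
    """Synthetic scene (hypothetical): mics 0-1 near the source, mics 2-3 far away in shared local noise."""
    rng = np.random.default_rng(seed)
    s = rng.standard_normal(n)            # source signal
    local = rng.standard_normal(n)        # noise common to the distant pair
    e = 0.3 * rng.standard_normal((2, n)) # independent sensor noise for the near pair
    x = np.stack([
        s + e[0],
        np.roll(s, 2) + e[1],
        0.3 * np.roll(s, 10) + local,
        0.3 * np.roll(s, 12) + local,
    ])
    return s, x

s, x = make_demo()
clusters = cluster_mics(x, max_lag=20, threshold=0.5)
closest = rank_clusters(x, clusters, max_lag=20)[0]
enhanced = delay_and_sum(x, closest, max_lag=20)
```

In this toy setting the distant pair is dominated by its own local noise, which is the kind of situation in which using only the closest cluster, rather than every available microphone, may be preferable. A weighted combination of clusters, also mentioned in the abstract, is not sketched here.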

Format

application/pdf

Identifier

http://eprints.qut.edu.au/34235/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/34235/1/c34235.pdf

DOI:10.1109/TASL.2010.2055560

Himawan, Ivan, McCowan, Iain, & Sridharan, Sridha (2010) Clustered blind beamforming from ad-hoc microphone arrays. IEEE Transactions on Audio, Speech, and Language Processing, 19(4), pp. 661-676.

Rights

Copyright 2010 IEEE

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords

#090609 Signal Processing #Array Signal Processing #Speech Enhancement #Speech Recognition

Type

Journal Article