Using Reverberation to Improve Range and Elevation Discrimination for Small Array Sound Source Localization


Author(s): RIBEIRO, Flavio; ZHANG, Cha; FLORENCIO, Dinei A.; BA, Demba Elimane
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

18/10/2012

18/10/2012

2010

Abstract

Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms for small arrays still have two significant limitations: lack of range resolution, and accuracy degradation with increasing reverberation. The latter is natural and expected, given that strong reflections can have amplitudes similar to that of the direct signal, but different directions of arrival. Therefore, correctly modeling the room and compensating for the reflections should reduce the degradation due to reverberation. In this paper, we show a stronger result. If modeled correctly, early reflections can be used to provide more information about the source location than would have been available in an anechoic scenario. The modeling not only compensates for the reverberation, but also significantly increases resolution for range and elevation. Thus, we show that under certain conditions and limitations, reverberation can be used to improve SSL performance. Prior attempts to compensate for reverberation tried to model the room impulse response (RIR). However, RIRs change quickly with speaker position, and are nearly impossible to track accurately. Instead, we build a 3-D model of the room, which we use to predict early reflections, which are then incorporated into the SSL estimation. Simulation results with real and synthetic data show that even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.
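
The idea of predicting early reflections from a room model can be illustrated with the image method named in the keywords. The following is a minimal sketch, not the authors' implementation: it assumes a rectangular ("shoebox") room with one corner at the origin, first-order reflections only, and a speed of sound of 343 m/s. The function names, room dimensions, and positions are hypothetical and chosen only for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def first_order_images(source, room):
    """Mirror the source across each of the six walls of a shoebox room.

    `room` holds the (x, y, z) dimensions in meters; the walls are assumed
    to lie at 0 and at room[axis] along each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflect across the wall plane
            images.append(tuple(image))
    return images


def path_delays(source, mic, room):
    """Return the direct-path delay and the six first-order reflection delays (seconds)."""
    direct = math.dist(source, mic) / SPEED_OF_SOUND
    reflected = [
        math.dist(image, mic) / SPEED_OF_SOUND
        for image in first_order_images(source, room)
    ]
    return direct, reflected


# Hypothetical setup: a 5 m x 4 m x 3 m room and one microphone.
room = (5.0, 4.0, 3.0)
mic = (2.5, 2.0, 1.0)
source = (3.5, 2.0, 1.0)

direct, reflected = path_delays(source, mic, room)
```

In this sketch each reflection is treated as a direct path from a mirrored copy of the source. Because the floor and ceiling images sit at different heights than the source itself, their delays depend on the source's range and elevation, which is the kind of extra information the abstract says early reflections can provide.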

Identifier

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, v.18, n.7, p.1781-1792, 2010

1558-7916

http://producao.usp.br/handle/BDPI/18790

10.1109/TASL.2010.2052250

http://dx.doi.org/10.1109/TASL.2010.2052250

Language(s)

eng

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Relation

IEEE Transactions on Audio, Speech, and Language Processing

Rights

restrictedAccess

Copyright IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

#Array processing #circular microphone array #distance discrimination #image method #range estimation #sound source localization (SSL) #TIME-DELAY #ACOUSTICS #MEETINGS #Acoustics #Engineering, Electrical & Electronic

Type

article

original article

publishedVersion