Using the interaural time difference and cross-correlation to localise short-term complex noises


Author(s): Wall, Julie; McGinnity, Martin; Maguire, Liam
Date(s)

2011

Abstract

The mammalian binaural cue of interaural time difference (ITD) and cross-correlation have long been used to determine the point of origin of a sound source. The ITD is the difference between the times at which a sound from a single location arrives at each ear [1]. From this time difference, the brain can calculate the angle of the sound source in relation to the head [2]. Cross-correlation measures the similarity of the two channels of a binaural waveform and yields the time lag, or offset, required to bring both channels into phase with one another. This offset corresponds to the maximum value of the cross-correlation function and can be used to determine the ITD and thus the azimuthal angle θ of the original sound source. However, in indoor environments, cross-correlation is known to have problems with both sound reflections and reverberation. Cross-correlation also has difficulty localising short-term complex noises that occur during a longer-duration waveform, i.e. in the presence of background noise: because the cross-correlation algorithm processes the entire waveform, the short-term complex noise can be overlooked. This paper presents a thresholding technique that improves the localisation of short-term complex sounds in the midst of background noise. To determine the success of this thresholding technique, twenty-five sounds were recorded in a dynamic and echoic environment. The twenty-five sounds consist of hand-claps, finger-clicks and speech. The proposed technique was compared to the regular cross-correlation function for the same waveforms, and an average of the azimuthal angles was determined for each individual sample. The sound localisation ability for all twenty-five sound samples is as follows: average of the sampled angles using cross-correlation: 44%; cross-correlation technique with thresholding: 84%.
These results show that the proposed technique is very successful for the localisation of short-term complex sounds in the midst of background noise and in a dynamic and echoic indoor environment.
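
The cross-correlation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The speed of sound, the ear spacing, the far-field relation ITD = (d / c) sin θ, and all function names are assumptions made for illustration. The abstract does not say how the thresholding is applied, so `threshold_segment` is only one hypothetical reading: it keeps the part of the waveform whose amplitude exceeds a fraction of the peak before cross-correlating.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
EAR_DISTANCE = 0.18     # m, assumed spacing between the two microphones/ears


def estimate_itd(left, right, fs):
    """Estimate the ITD in seconds from the peak of the cross-correlation.

    A positive value means the right channel lags the left channel.
    """
    corr = np.correlate(right, left, mode="full")
    # In "full" mode, index len(left) - 1 corresponds to zero lag.
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return lag / fs


def azimuth_from_itd(itd, ear_distance=EAR_DISTANCE, c=SPEED_OF_SOUND):
    """Convert an ITD to an azimuthal angle in degrees (assumed far-field model)."""
    # Clip so that measurement noise cannot push arcsin out of its domain.
    ratio = np.clip(c * itd / ear_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


def threshold_segment(left, right, fraction=0.5):
    """Hypothetical thresholding: keep the span where either channel's
    absolute amplitude reaches `fraction` of the overall peak."""
    envelope = np.maximum(np.abs(left), np.abs(right))
    above = np.flatnonzero(envelope >= fraction * envelope.max())
    start, stop = above[0], above[-1] + 1
    return left[start:stop], right[start:stop]


# Usage on synthetic data: a short noise burst that reaches the right
# channel 10 samples after the left, on top of weak background noise.
fs = 44100
rng = np.random.default_rng(0)
burst = rng.standard_normal(200)
left = 0.01 * rng.standard_normal(4000)
right = 0.01 * rng.standard_normal(4000)
left[1000:1200] += burst
right[1010:1210] += burst

seg_left, seg_right = threshold_segment(left, right)
itd = estimate_itd(seg_left, seg_right, fs)
angle = azimuth_from_itd(itd)
```

With this sign convention, a positive ITD (and therefore a positive angle) indicates a source closer to the left channel. Restricting the correlation to the thresholded span keeps the long stretch of background noise out of the estimate, which is the effect the abstract attributes to thresholding.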

Format

text

Identifier

http://roar.uel.ac.uk/4515/1/AICS2011.pdf

Wall, Julie and McGinnity, Martin and Maguire, Liam (2011) ‘Using the interaural time difference and cross-correlation to localise short-term complex noises’, Artificial Intelligence and Cognitive Science (AICS). Derry, UK, 31st August - 2nd September 2011. University of Ulster, Intelligent Systems Research Centre.

Publisher

University of Ulster, Intelligent Systems Research Centre

Relation

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.592.7969&rep=rep1&type=pdf

http://roar.uel.ac.uk/4515/

Type

Conference or Event Item

PeerReviewed