Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach


Author(s): Ghaemmaghami, Houman; Dean, David; Vogt, Robbie; Sridharan, Sridha
Date(s)

25/03/2012

Abstract

In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.
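The difference between the single-linkage and complete-linkage criteria mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the function name `linkage_cluster`, the stopping `threshold`, and the toy distance matrix are all hypothetical, and the scoring used in the paper to compare speaker segments is not reproduced here. The sketch only shows that clusters are merged from a fixed pairwise distance matrix, with no model merging or retraining between steps.

```python
from itertools import combinations


def linkage_cluster(dist, threshold, linkage="complete"):
    """Agglomerative clustering over a fixed pairwise distance matrix.

    Hypothetical helper for illustration only. `dist` is a symmetric
    list-of-lists of distances; merging stops once the closest pair of
    clusters is farther apart than `threshold`.
    """
    if linkage not in ("complete", "single"):
        raise ValueError("linkage must be 'complete' or 'single'")
    # Complete linkage scores a cluster pair by its farthest members,
    # single linkage by its nearest members.
    agg = max if linkage == "complete" else min

    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = agg(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        # Merge by pooling members; the distance matrix is never updated.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (assumed): four segments spaced evenly along a line.
points = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(p - q) for q in points] for p in points]

complete = linkage_cluster(dist, threshold=1.5, linkage="complete")
single = linkage_cluster(dist, threshold=1.5, linkage="single")
print(complete)
print(single)
```

On this toy input, single linkage chains all four segments into one cluster because each neighbour is within the threshold, whereas complete linkage keeps two compact clusters because the farthest members of a merged cluster would exceed it.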

Format

application/pdf

Identifier

http://eprints.qut.edu.au/57372/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/57372/1/SPEAKER_ATTRIBUTION_OF_MULTIPLE_TELEPHONE_CONVERSATIONS_USING_A_COMPLETE-LINKAGE_CLUSTERING_APPROACH.pdf

DOI:10.1109/ICASSP.2012.6288841

Ghaemmaghami, Houman, Dean, David, Vogt, Robbie, & Sridharan, Sridha (2012) Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In ICASSP 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Kyoto International Conference Centre, Kyoto, Japan, pp. 4185-4188.

http://purl.org/au-research/grants/ARC/LP0991238

Rights

Copyright 2012 Institute of Electrical and Electronics Engineers, Inc.

Source

Science & Engineering Faculty

Keywords

#080100 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING #080107 Natural Language Processing #090000 ENGINEERING #speaker diarization #speaker linking #speaker attribution #complete-linkage #joint factor analysis

Type

Conference Paper