Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach
Date(s) |
25/03/2012 |
---|---|
Abstract |
In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach. |
Format |
application/pdf |
Identifier | |
Publisher |
IEEE |
Relation |
http://eprints.qut.edu.au/57372/1/SPEAKER_ATTRIBUTION_OF_MULTIPLE_TELEPHONE_CONVERSATIONS_USING_A_COMPLETE-LINKAGE_CLUSTERING_APPROACH.pdf DOI:10.1109/ICASSP.2012.6288841 Ghaemmaghami, Houman, Dean, David, Vogt, Robbie, & Sridharan, Sridha (2012) Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In ICASSP 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Kyoto International Conference Centre, Kyoto, Japan, pp. 4185-4188. http://purl.org/au-research/grants/ARC/LP0991238 |
Rights |
Copyright 2012 Institute of Electrical and Electronics Engineers, Inc. |
Source |
Science & Engineering Faculty |
Keywords | #080100 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING #080107 Natural Language Processing #090000 ENGINEERING #speaker diarization #speaker linking #speaker attribution #complete-linkage #joint factor analysis |
Type |
Conference Paper |
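
The abstract contrasts complete-linkage clustering, which needs no model retraining, with single-linkage clustering. The following is a minimal Python sketch of generic complete-linkage agglomerative clustering over a fixed pairwise distance matrix. It is not the authors' system: the function name `complete_linkage`, the stopping `threshold`, and the toy matrix `DIST` are assumptions made for illustration, and the paper's actual segment scoring is not reproduced here.

```python
def complete_linkage(dist, threshold):
    """Cluster items with complete linkage on a fixed distance matrix.

    dist: symmetric list of lists, dist[i][j] = distance between items i and j
          (hypothetical stand-in for pairwise speaker-segment scores).
    threshold: stop once the closest pair of clusters is farther apart than this.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under the complete-linkage distance.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break
        # Merge by pooling indices; pairwise distances are reused, never recomputed.
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return sorted(clusters)


# Toy distances (made up): items 0-2 are mutually close, items 3-4 are close,
# and item 2 is moderately close to item 3, which would chain under single linkage.
DIST = [
    [0.0, 0.1, 0.2, 0.9, 0.9],
    [0.1, 0.0, 0.15, 0.9, 0.9],
    [0.2, 0.15, 0.0, 0.3, 0.9],
    [0.9, 0.9, 0.3, 0.0, 0.1],
    [0.9, 0.9, 0.9, 0.1, 0.0],
]

print(complete_linkage(DIST, threshold=0.5))
```

Because each merge only pools indices and reuses the precomputed distances, no model is merged or retrained between steps. With a threshold of 0.5 on this toy matrix, the 0.3 link between items 2 and 3 is not enough to join the two groups, since complete linkage looks at their farthest members (0.9).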