Speaker attribution of Australian broadcast news data
Date(s) |
15/08/2013
|
---|---|
Abstract |
Speaker attribution is the task of annotating a spoken audio archive based on speaker identities. This can be achieved using speaker diarization and speaker linking. In our previous work, we proposed an efficient attribution system, using complete-linkage clustering, for conducting attribution of large sets of two-speaker telephone data. In this paper, we build on that approach to obtain a robust system applicable to multiple recording domains. To do this, we first extend the diarization module of our system to accommodate recordings with more than two speakers. We achieve this by using a robust cross-likelihood ratio (CLR) threshold as the stopping criterion for clustering, in place of the original stopping criterion of two speakers used for telephone data. We evaluate this baseline diarization module on a dataset of Australian broadcast news recordings, showing that diarization accuracy suffers significantly without prior knowledge of the true number of speakers in a recording. We therefore propose applying an additional pass of complete-linkage clustering to the diarization module, demonstrating an absolute improvement of 20% in diarization error rate (DER). We then evaluate our proposed multi-domain attribution system on the broadcast news data, demonstrating attribution error rates (AER) as low as 17%. |
Format |
application/pdf |
Identifier | |
Publisher |
Sun SITE Central Europe |
Relation |
http://eprints.qut.edu.au/63498/1/SLAM13_eprints.pdf http://ceur-ws.org/Vol-1012/ Ghaemmaghami, Houman, Dean, David, & Sridharan, Sridha (2013) Speaker attribution of Australian broadcast news data. In Proceedings of the First Workshop on Speech, Language and Audio in Multimedia (SLAM): CEUR Workshop Proceedings, Volume 1012, Sun SITE Central Europe, Marseille, France, pp. 72-77. http://purl.org/au-research/grants/ARC/LP0991238 |
Rights |
Copyright 2013 [please consult the author] |
Source |
School of Electrical Engineering & Computer Science; Faculty of Built Environment and Engineering; Information Security Institute |
Keywords | #090609 Signal Processing |
Type |
Conference Paper |
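
The abstract describes complete-linkage clustering that stops at a score threshold rather than at a fixed number of speakers. The following is a minimal sketch of that general idea only, not the authors' implementation. The function name `complete_linkage`, the toy distance matrix, and the threshold values are all hypothetical, and a plain pairwise distance stands in for the CLR score, which the paper derives from speaker models.

```python
from itertools import combinations


def complete_linkage(dist, threshold):
    """Agglomerative complete-linkage clustering with a threshold stop.

    dist: symmetric matrix (list of lists) of pairwise distances between
          segments. Assumption: lower means "more likely the same speaker".
    threshold: merging stops once the closest pair of clusters is farther
          apart than this value (hypothetical stand-in for a CLR threshold).
    Returns a sorted list of clusters, each a sorted list of segment indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[a], clusters[b]) > threshold:
            break  # no remaining pair is close enough to merge
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Toy data: segments 0 and 1 are close, segments 2 and 3 are close.
toy = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.95],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.95, 0.20, 0.00],
]
print(complete_linkage(toy, threshold=0.5))
```

With a threshold stop, the number of resulting clusters (speakers) is an output of the procedure rather than an input, which is what allows the same routine to handle recordings with an unknown number of speakers.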