Extending the task of diarization to speaker attribution


Author(s): Ghaemmaghami, Houman; Dean, David; Vogt, Robbie; Sridharan, Sridha
Date(s)

28/08/2011

Abstract

In this paper we extend the concept of speaker annotation within a single recording, or speaker diarization, to a collection-wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering the expectedly homogeneous inter-session clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. An attribution system is proposed that uses mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix, and the complete-linkage algorithm is employed to cluster the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
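
The pairwise scoring and clustering step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the commonly used symmetric form of the normalized cross-likelihood ratio, a hypothetical input matrix `loglik` holding the total log-likelihood of each cluster's frames under every cluster model, and a hypothetical stopping `threshold`. The function names `nclr_matrix` and `attribute` are illustrative only, and the UBM/JFA modelling itself is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def nclr_matrix(loglik, n_frames):
    """Build a symmetric pairwise NCLR matrix.

    loglik[i, j] is assumed to be the total log-likelihood of cluster i's
    frames under the model of cluster j (hypothetical input layout);
    n_frames[i] is the number of frames in cluster i.
    """
    loglik = np.asarray(loglik, dtype=float)
    n_frames = np.asarray(n_frames, dtype=float)
    # Per-frame log-likelihood ratio: own model versus the other model.
    own = np.diag(loglik)[:, None]
    ratio = (own - loglik) / n_frames[:, None]
    # NCLR(i, j) sums the ratio in both directions, so it is symmetric.
    return ratio + ratio.T


def attribute(loglik, n_frames, threshold):
    """Complete-linkage clustering of the NCLR matrix.

    Returns one integer speaker label per input cluster. The threshold is an
    assumed tuning parameter, not a value taken from the paper.
    """
    dist = nclr_matrix(loglik, n_frames)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="complete")
    return fcluster(tree, t=threshold, criterion="distance")
```

With complete linkage, two groups of clusters are merged only if every pair across them scores below the threshold, which tends to keep the resulting speaker groups compact.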

Format

application/pdf

Identifier

http://eprints.qut.edu.au/43351/

Relation

http://eprints.qut.edu.au/43351/4/46031.pdf

http://www.interspeech2011.org/

Ghaemmaghami, Houman, Dean, David, Vogt, Robbie, & Sridharan, Sridha (2011) Extending the task of diarization to speaker attribution. In Interspeech 2011, 28-31 August 2011, Florence, Italy.

http://purl.org/au-research/grants/ARC/LP0991238

Rights

Copyright 2011 (please consult the authors).

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords

#090609 Signal Processing #speaker attribution #diarization #clustering #cross likelihood ratio #joint factor analysis

Type

Conference Paper