Bayes Factor based speaker segmentation for speaker diarization


Author(s): Wang, David; Vogt, Robert J.; Sridharan, Sridha
Date(s)

2010

Abstract

This paper proposes the use of the Bayes Factor as a distance metric for speaker segmentation within a speaker diarization system. The proposed approach uses a pair of constant-sized sliding windows to compute the Bayes Factor between adjacent windows over the entire audio recording. Results obtained on the 2002 Rich Transcription Evaluation dataset show improved segmentation performance compared to previous approaches reported in the literature that use the Generalized Likelihood Ratio. When applied in a speaker diarization system, this approach yields a 5.1% relative improvement in the overall Diarization Error Rate compared to the baseline.
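
The sliding-window scheme described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's Bayes Factor formulation, so the code below swaps in the Gaussian Generalized Likelihood Ratio (the baseline metric the abstract names) as the distance. The function names `gaussian_glr` and `sliding_distance`, the single full-covariance Gaussian per window, and the small regularization constant are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def gaussian_glr(x, y):
    """Log GLR distance between two windows of feature frames.

    Assumption: each window (rows = frames) is modelled by one
    full-covariance Gaussian with maximum-likelihood parameters.
    This is a stand-in metric, not the paper's Bayes Factor.
    """
    def half_n_logdet(w):
        # ML covariance, lightly regularized so the determinant stays positive.
        cov = np.cov(w, rowvar=False, bias=True) + 1e-6 * np.eye(w.shape[1])
        _, logdet = np.linalg.slogdet(cov)
        return 0.5 * len(w) * logdet

    z = np.vstack([x, y])  # both windows pooled into one model
    return half_n_logdet(z) - half_n_logdet(x) - half_n_logdet(y)


def sliding_distance(features, win, step, distance=gaussian_glr):
    """Score every boundary between two adjacent fixed-size windows.

    Returns the frame index of each candidate boundary and the distance
    between the window ending there and the window starting there.
    """
    starts = range(0, len(features) - 2 * win + 1, step)
    centers = np.array([s + win for s in starts])
    scores = np.array([
        distance(features[s:s + win], features[s + win:s + 2 * win])
        for s in starts
    ])
    return centers, scores
```

Peaks in the returned scores mark candidate speaker change points. Because `distance` is passed in as a parameter, a Bayes Factor implementation could in principle replace `gaussian_glr` without changing the windowing logic.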

Format

application/pdf

Identifier

http://eprints.qut.edu.au/40855/

Publisher

International Speech Communication Association

Relation

http://eprints.qut.edu.au/40855/1/40855.pdf

http://www.interspeech2010.org/

Wang, David, Vogt, Robert J., & Sridharan, Sridha (2010) Bayes Factor based speaker segmentation for speaker diarization. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), International Speech Communication Association, Makuhari Messe International Convention Complex, Chiba, Makuhari, Japan, pp. 1405-1408.

http://purl.org/au-research/grants/ARC/LP0991238

Rights

Copyright 2010 International Speech Communication Association. All rights reserved.

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords

#080107 Natural Language Processing #080109 Pattern Recognition and Data Mining #090609 Signal Processing #Speaker Diarization #Speaker Segmentation #Bayes Factor #Generalized Likelihood Ratio

Type

Conference Paper