PLDA based speaker recognition on short utterances


Author(s): Kanagasundaram, Ahilan; Vogt, Robert J.; Dean, David B.; Sridharan, Sridha
Date(s)

28/06/2012

Abstract

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification systems for real-world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance; however, the robustness of HTPLDA to limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze speaker verification performance with regard to the duration of the utterances used both for speaker evaluation (enrolment and verification) and for score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and in conditions for which insufficient speech resources are available for adequate system development. The results presented in this paper, using the NIST 2008 Speaker Recognition Evaluation dataset, suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and in conditions for which insufficient speech resources are available for adequate system development.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/51213/

Publisher

ISCA

Relation

http://eprints.qut.edu.au/51213/1/028-033-15.pdf

http://www.odyssey2012.org/

Kanagasundaram, Ahilan, Vogt, Robert J., Dean, David B., & Sridharan, Sridha (2012) PLDA based speaker recognition on short utterances. In The Speaker and Language Recognition Workshop (Odyssey 2012), ISCA, Singapore.

Rights

Copyright 2012 [please consult the author]

Source

School of Electrical Engineering & Computer Science; Information Security Institute; Science & Engineering Faculty

Keywords

#080000 INFORMATION AND COMPUTING SCIENCES #090000 ENGINEERING #Text-Independent Speaker Recognition #Short utterances #Commercial Applications

Type

Conference Paper