Data-driven background dataset selection for SVM-based speaker verification


Author(s): McLaren, Mitchell L.; Vogt, Robert J.; Baker, Brendan J.; Sridharan, Sridha
Date(s)

2010

Abstract

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly informative background datasets. Data-driven dataset refinement individually evaluates the suitability of each candidate impostor example for the SVM background and then selects the highest-ranking examples as a refined background dataset. The characteristics of the refined dataset were then analysed to identify the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to contain large amounts of active speech and to exhibit distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more dispersed representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mismatched to the evaluation conditions.
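
The rank-and-select step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how suitability is measured, so the sketch assumes one plausible metric: the number of development speaker models in which a candidate impostor example ends up as a support vector. The function names `support_vector_frequency` and `refine_background`, and the input layout, are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def support_vector_frequency(
    sv_indices_per_model: Sequence[Sequence[int]], n_candidates: int
) -> List[int]:
    """Count, for each candidate impostor example, how many trained
    models selected it as a support vector.

    ASSUMPTION: this count is used as the suitability score; the
    abstract itself does not name the metric.
    """
    counts = [0] * n_candidates
    for sv_indices in sv_indices_per_model:
        # set() guards against an index being listed twice for one model
        for idx in set(sv_indices):
            counts[idx] += 1
    return counts


def refine_background(suitability: Sequence[float], n_keep: int) -> List[int]:
    """Return the indices of the n_keep highest-ranking candidates.

    Ties keep their original order because Python's sort is stable.
    """
    ranked = sorted(
        range(len(suitability)), key=lambda i: suitability[i], reverse=True
    )
    return ranked[:n_keep]


# Hypothetical example: 5 candidates, 4 development speaker models.
# Each inner list holds the candidate indices chosen as support vectors.
sv_sets = [[0, 2], [2, 3], [2], [0, 2, 4]]
scores = support_vector_frequency(sv_sets, n_candidates=5)
refined = refine_background(scores, n_keep=2)
```

In this toy run, candidate 2 is a support vector in all four models and candidate 0 in two, so they form the refined background. Candidates that are rarely or never selected contribute little to the decision boundary and are dropped.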

Format

application/pdf

Identifier

http://eprints.qut.edu.au/32293/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/32293/1/c32293.pdf

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05308408

DOI:10.1109/TASL.2009.2035786

McLaren, Mitchell L., Vogt, Robert J., Baker, Brendan J., & Sridharan, Sridha (2010) Data-driven background dataset selection for SVM-based speaker verification. IEEE Transactions on Audio, Speech and Language Processing, 18(6), pp. 1496-1506.

Rights

Copyright 2010 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.

Source

Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems

Keywords

#080109 Pattern Recognition and Data Mining #080107 Natural Language Processing #Speaker Verification #Support Vector Machines

Type

Journal Article