Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation


Author(s): Pardo Muñoz, José Manuel; Barra Chicote, Roberto; San Segundo Hernández, Rubén; Córdoba Herralde, Ricardo de; Martínez González, Beatriz
Date(s)

2012

Abstract

The Universidad Politécnica de Madrid proposed two new features for the Rich Transcription Evaluation 2009 (RT09); with them, the system outperforms the baseline. The first is the intensity channel contribution, a feature related to the location of the speaker. The second is the logarithm of the interpolated fundamental frequency. This is the first time that these features have been applied to the clustering stage of diarization for meetings recorded with multiple distant microphones. Including both features improves the baseline results by a relative 15.36% on the development set and 16.71% on the RT09 set. Considering speaker errors only, the relative improvement is 23% on the development set and 32.83% on the RT09 set.

Format

application/pdf

Identifier

http://oa.upm.es/11896/

Language(s)

eng

Publisher

E.T.S.I. Telecomunicación (UPM)

Relation

http://oa.upm.es/11896/2/INVE_MEM_2011_107717.pdf

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5893922&tag=1

info:eu-repo/semantics/altIdentifier/doi/10.1109/TASL.2011.2159971

Rights

http://creativecommons.org/licenses/by-nc-nd/3.0/es/

info:eu-repo/semantics/openAccess

Source

IEEE Transactions on Audio, Speech and Language Processing, ISSN 1558-7916, 2012, Vol. 20, No. 2

Keywords

#Telecommunications #Social Sciences

Type

info:eu-repo/semantics/article

Article

PeerReviewed