Phonotactic language recognition using i-vectors and phoneme posteriogram counts

**Author(s):** D'haro Enríquez, Luis Fernando; Glembek, Ondřej; Plchot, Oldřich; Matějka, Pavel; Soufifar, Mehdi; Córdoba Herralde, Ricardo de; Černocký, Jan
Date(s)	2012
Abstract	This paper describes a novel approach to phonotactic language identification (LID) in which, instead of using soft-counts based on phoneme lattices, we use posteriograms to obtain n-gram counts. The high-dimensional vectors of counts are reduced to low-dimensional units, for which we adapted the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set and gave better results than a system based on soft-counts (Cavg on 30s: 3.15% vs 3.43%), and very good results when fused with an acoustic i-vector LID system (Cavg on 30s: 2.4% for the acoustic system alone vs 1.25% after fusion). The proposed technique is also compared with another low-dimensional projection system based on PCA. Compared with the original soft-counts, the proposed technique provides better results, reduces the problems caused by sparse counts, and avoids the need for pruning when creating the lattices.
Format	application/pdf
Identifier	http://oa.upm.es/20403/
Language(s)	eng
Publisher	E.T.S.I. Telecomunicación (UPM)
Relation	http://oa.upm.es/20403/1/INVE_MEM_2012_134401.pdf
Rights	http://creativecommons.org/licenses/by-nc-nd/3.0/es/ info:eu-repo/semantics/openAccess
Source	InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association | 09/09/2012 - 13/09/2012 | Portland, Oregon
Keywords	#Telecommunications
Type	info:eu-repo/semantics/conferenceObject Conference or Workshop Paper PeerReviewed
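
As a rough illustration of the posteriogram-based counting mentioned in the abstract, the sketch below accumulates expected bigram counts by multiplying the phoneme posteriors of adjacent frames. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `bigram_soft_counts`, the frame-to-frame product, and the toy input are all hypothetical, and the paper may segment or smooth the posteriogram differently. The multinomial subspace (i-vector) extraction is not shown.

```python
import numpy as np


def bigram_soft_counts(posteriogram):
    """Expected bigram counts from a (frames x phonemes) posteriogram.

    Assumption (illustrative only): adjacent frames are treated as
    independent, so the expected count of bigram (a, b) is the sum over
    frames t of post[t, a] * post[t + 1, b].
    """
    post = np.asarray(posteriogram, dtype=float)
    if post.ndim != 2 or post.shape[0] < 2:
        raise ValueError("need a 2-D posteriogram with at least two frames")
    # Sum of outer products of consecutive frames, written as one matmul.
    return post[:-1].T @ post[1:]


# Toy posteriogram: 4 frames, 2 phonemes, each row sums to 1.
toy = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.7, 0.3],
])
counts = bigram_soft_counts(toy)  # shape (2, 2)
flat_counts = counts.ravel()      # high-dimensional count vector for a reducer
```

Because every row of the posteriogram sums to one, the entries of `counts` sum to the number of frame pairs (here 3). No lattice is built in this sketch, so no lattice pruning step is involved.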
