Digital Library

**Author(s):** Lopez Ludeña, Veronica; San Segundo Hernández, Rubén; Montero Martínez, Juan Manuel; Barra Chicote, Roberto; Lorenzo Trueba, Jaime
Date(s)	2012
Abstract	This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text-to-speech conversion system. The main goal is a language-independent, data-driven text normalization module that is flexible enough to handle all the situations that arise in this task. The proposed architecture comprises three main modules: a tokenizer that splits the input text into a token graph (tokenization), a phrase-based translation module (token translation), and a post-processing module that removes some tokens. The paper presents initial experiments on numbers and abbreviations. The very good results obtained validate the proposed architecture.
Format	application/pdf
Identifier	http://oa.upm.es/20353/
Language(s)	eng
Publisher	E.T.S.I. Telecomunicación (UPM)
Relation	http://oa.upm.es/20353/1/INVE_MEM_2012_133658.pdf info:eu-repo/semantics/altIdentifier/doi/null
Rights	http://creativecommons.org/licenses/by-nc-nd/3.0/es/ info:eu-repo/semantics/openAccess
Source	Jornadas en Tecnología del Habla and III Iberian SLTech \| VII Jornadas en Tecnología del Habla and III Iberian SLTech \| 21/11/2012 - 22/11/2012 \| Madrid, Spain
Keywords	#Telecommunications
Type	info:eu-repo/semantics/conferenceObject Conference or Workshop Paper PeerReviewed
