Digital Library

**Author(s):** Santhosh Kumar, G; Mary, Priya Sebastian; Sheena Kurian, K
Date(s)	19/07/2014 19/07/2014 2010
Abstract	In this paper we describe the methodology and the structural design of a system that translates English into Malayalam using statistical models. A monolingual Malayalam corpus and a bilingual English/Malayalam corpus are the main resources in building this Statistical Machine Translator. The training strategy adopted has been enhanced by PoS tagging, which helps to get rid of the insignificant alignments. Moreover, incorporating units like a suffix separator and a stop word eliminator has proven to be effective in bringing about better training results. In the decoder, order conversion rules are applied to reduce the structural difference between the language pair. The quality of the statistical outcome of the decoder is further improved by applying mending rules. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
Proceedings of Fourth International Conference on Information Processing, Bangalore, India
Cochin University of Science and Technology
Identifier	http://dyuthi.cusat.ac.in/purl/4138
Language(s)	en
Keywords	#Alignment #English Malayalam Translation #PoS Tagging #Statistical Machine Translation #Suffix Separation
Type	Article