A Framework of Statistical Machine Translator from English to Malayalam
Date(s) |
19/07/2014
19/07/2014
2010
|
---|---|
Abstract |
In this paper we describe the methodology and structural design of a system that translates English into Malayalam using statistical models. A monolingual Malayalam corpus and a bilingual English/Malayalam corpus are the main resources for building this Statistical Machine Translator. The training strategy is enhanced by PoS tagging, which helps eliminate insignificant alignments. Moreover, incorporating units such as a suffix separator and a stop word eliminator has proven effective in improving training results. In the decoder, order conversion rules are applied to reduce the structural difference between the language pair. The quality of the decoder's statistical output is further improved by applying mending rules. Experiments conducted on a sample corpus generated reasonably good Malayalam translations, and the results were verified with the F measure, BLEU and WER evaluation metrics. Proceedings of Fourth International Conference on Information Processing, Bangalore, India. Cochin University of Science and Technology |
Identifier | |
Language(s) |
en |
Keywords | #Alignment #English Malayalam Translation #PoS Tagging #Statistical Machine Translation #Suffix Separation |
Type |
Article |