Development of a POS Tagger for Malayalam: An Experience
Date(s) |
17/07/2014
17/07/2014
2009
|
---|---|
Abstract |
A stochastic part-of-speech tagger for Malayalam is proposed. The tagger uses word frequencies and bigram statistics from a corpus. Because no annotated corpus is available for Malayalam, a morphological analyzer is used to generate the tagged corpus. Although the experiments were performed on a very small corpus, the results show that the statistical approach works well for a highly agglutinative language like Malayalam. 2009 International Conference on Advances in Recent Technologies in Communication and Computing. Cochin University of Science and Technology |
Identifier | |
Language(s) |
en |
Publisher |
IEEE |
Keywords | #Dravidian Language #Morphemes #HMM #Viterbi #Tagset |
Type |
Article |
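
The abstract describes a tagger built from word frequencies and bigram statistics, and the keywords mention HMM and Viterbi. A minimal sketch of that general technique might look like the following. It is not the authors' implementation: the function names, the add-one smoothing, and the toy English corpus are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Collect tag-bigram counts and per-tag word counts from a tagged corpus."""
    transitions = defaultdict(Counter)  # previous tag -> counts of the next tag
    emissions = defaultdict(Counter)    # tag -> counts of words seen with that tag
    for sentence in tagged_sentences:
        previous = "<s>"  # hypothetical sentence-start marker
        for word, tag in sentence:
            transitions[previous][tag] += 1
            emissions[tag][word] += 1
            previous = tag
    return transitions, emissions

def log_prob(counter, key, n_outcomes):
    # Add-one smoothing (an assumption) keeps unseen events at a nonzero probability.
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, transitions, emissions):
    """Return the most probable tag sequence under a bigram HMM."""
    if not words:
        return []
    tags = list(emissions)
    vocabulary = {w for counts in emissions.values() for w in counts}
    n_words = len(vocabulary) + 1  # one extra slot for unknown words
    n_tags = len(tags)
    # best[i][tag] = (log score of the best path ending in tag at position i, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions["<s>"], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][p][0] + log_prob(transitions[p], tag, n_tags) + emit, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy corpus, purely illustrative; the paper generates its tagged corpus
# with a morphological analyzer instead.
toy_corpus = [
    [("dog", "N"), ("runs", "V")],
    [("cat", "N"), ("sleeps", "V")],
    [("dog", "N"), ("sleeps", "V")],
]
transitions, emissions = train(toy_corpus)
predicted = viterbi(["cat", "runs"], transitions, emissions)
```

Working in log space avoids numeric underflow on longer sentences, and the smoothing choice matters most when the training corpus is very small, as it is in the experiments the abstract describes.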