Development of a POS Tagger for Malayalam - An Experience


Author(s): Sumam, Mary Idicula; Soumya, S; Manju, K
Date(s)

17/07/2014

17/07/2014

2009

Abstract

A part-of-speech (POS) tagger for Malayalam that uses a stochastic approach is proposed. The tagger makes use of word frequencies and bigram statistics from a corpus. Because no annotated corpus is available for Malayalam, a morphological analyzer is used to generate the tagged corpus. Although the experiments were performed on a very small corpus, the results show that the statistical approach works well for a highly agglutinative language like Malayalam.

2009 International Conference on Advances in Recent Technologies in Communication and Computing

Cochin University of Science and Technology
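
The abstract mentions word frequencies and bigram statistics, and the keywords mention HMM and Viterbi. The following is a minimal sketch of how such a bigram tagger could be put together, not the authors' implementation. The toy corpus (English placeholder words and the tags DET, NOUN, VERB), the function names, and the add-one smoothing are all assumptions made for illustration; in the paper the tagged corpus comes from a morphological analyzer.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for analyzer-generated tagged text.
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
START = "<s>"  # assumed sentence-start pseudo-tag


def train(corpus):
    """Collect word-per-tag frequencies and tag bigram counts."""
    emit = defaultdict(Counter)   # tag -> word counts
    trans = defaultdict(Counter)  # previous tag -> next tag counts
    for sentence in corpus:
        prev = START
        for word, tag in sentence:
            emit[tag][word] += 1
            trans[prev][tag] += 1
            prev = tag
    return emit, trans


def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (smoothing is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))


def viterbi(words, emit, trans):
    """Return the most probable tag sequence under the bigram model."""
    if not words:
        return []
    tags = sorted(emit)
    vocab = {w for counts in emit.values() for w in counts}
    v_size = len(vocab) + 1  # one extra slot for unseen words
    # best[i][t] = (log score of best path ending in tag t, previous tag)
    best = [{}]
    for t in tags:
        score = (log_prob(trans[START], t, len(tags))
                 + log_prob(emit[t], words[0], v_size))
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], v_size)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, len(tags)) + e, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


emit_counts, trans_counts = train(CORPUS)
print(viterbi(["the", "dog", "sleeps"], emit_counts, trans_counts))
```

Working in log space avoids numeric underflow on longer sentences, and the smoothing keeps unseen words and unseen tag pairs from zeroing out a path, which matters when the training corpus is very small.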

Identifier

http://dyuthi.cusat.ac.in/purl/4090

Language(s)

en

Publisher

IEEE

Keywords

#Dravidian Language #Morphemes #HMM #Viterbi #Tagset

Type

Article