Digital Library

Language models for online handwritten Tamil word recognition

**Author(s):** Sundaram, Suresh; Urala, Bhargava K; Ramakrishnan, AG
Date(s)	2012
Abstract	N-gram language models and lexicon-based word recognition are popular methods in the literature for improving recognition accuracy on online and offline handwritten data. However, very few works apply these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus, and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word (8%) recognition accuracies. Lexicon methods offer much greater improvements (30%) in word recognition, but they depend heavily on choosing the right lexicon. For comparison with lexicon and language model based methods, we have also explored re-evaluation techniques, which use expert classifiers to improve symbol and word recognition accuracies.
Format	application/pdf
Identifier	http://eprints.iisc.ernet.in/46547/1/Pro_Wor_Doc_Ana_Rec_42_2012.pdf Sundaram, Suresh and Urala, Bhargava K and Ramakrishnan, AG (2012) Language models for online handwritten Tamil word recognition. In: Proceeding of the workshop on Document Analysis and Recognition, Dec. 16, 2012, New York, NY, USA.
Publisher	ACM, Inc
Relation	http://dx.doi.org/10.1145/2432553.2432562 http://eprints.iisc.ernet.in/46547/
Keywords	#Electrical Engineering
Type	Conference Paper PeerReviewed
