Recognition of open vocabulary, online handwritten pages in Tamil script


Author(s): Urala, Bhargava K; Ramakrishnan, AG; Mohamed, Sahil
Date(s)

2014

Abstract

In this work, we describe a system that recognizes open-vocabulary, isolated, online handwritten Tamil words, and we extend it to recognize a paragraph of writing. We explain in detail each step involved in the process: segmentation, preprocessing, feature extraction, classification and bigram-based post-processing. On our database of 45,000 handwritten words collected on a tablet PC, we obtain symbol-level accuracies of 78.5% without and 85.3% with post-processing using symbol-level language models. The corresponding word-level accuracies are 40.1% and 59.6%. We also propose a line- and word-level segmentation strategy, which gives promising results of 100% line segmentation accuracy and 98.1% word segmentation accuracy in our initial trials on 40 handwritten paragraphs. The two modules have been combined to obtain a full-fledged page recognition system for online handwritten Tamil data. To the knowledge of the authors, this is the first attempt at recognition of open-vocabulary, online handwritten paragraphs in any Indian language.
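
The abstract does not say how the bigram-based post-processing is implemented. As a rough illustration only, the sketch below shows one common way to combine per-symbol classifier scores with a symbol-level bigram language model: a Viterbi search over the top candidates at each symbol position. The function name `viterbi_bigram`, the parameters `lm_weight` and `floor`, and the toy symbols and probabilities are all assumptions made for this example; none of them come from the paper.

```python
import math


def viterbi_bigram(candidates, bigram_logp, lm_weight=1.0, floor=-10.0):
    """Pick the best symbol sequence under classifier + bigram scores.

    candidates  : list of positions; each position is a list of
                  (symbol, classifier_log_score) pairs (hypothetical format).
    bigram_logp : dict mapping (previous_symbol, symbol) -> log probability.
    lm_weight   : assumed weight on the language-model term.
    floor       : assumed log probability for unseen bigrams.
    """
    if not candidates:
        return []

    # best[i][s] = (score of the best path ending in s at position i,
    #               symbol chosen at position i - 1 on that path)
    best = [{sym: (score, None) for sym, score in candidates[0]}]

    for position in candidates[1:]:
        column = {}
        for sym, score in position:
            # Choose the predecessor that maximises path score + bigram term.
            prev_sym, prev_total = max(
                (
                    (p, total + lm_weight * bigram_logp.get((p, sym), floor))
                    for p, (total, _) in best[-1].items()
                ),
                key=lambda item: item[1],
            )
            column[sym] = (prev_total + score, prev_sym)
        best.append(column)

    # Backtrack from the best final symbol.
    sym = max(best[-1], key=lambda s: best[-1][s][0])
    path = [sym]
    for column in reversed(best[1:]):
        sym = column[sym][1]
        path.append(sym)
    path.reverse()
    return path


# Toy data: placeholder symbols, not real Tamil symbols or measured scores.
toy_candidates = [
    [("a", math.log(0.6)), ("b", math.log(0.4))],
    [("x", math.log(0.55)), ("y", math.log(0.45))],
]
toy_bigrams = {
    ("a", "x"): math.log(0.1),
    ("a", "y"): math.log(0.9),
    ("b", "x"): math.log(0.5),
    ("b", "y"): math.log(0.5),
}

print(viterbi_bigram(toy_candidates, toy_bigrams))  # ['a', 'y']
```

In this toy case the classifier alone would choose "a" followed by "x", but the bigram term makes "a" followed by "y" the more likely sequence. This is the kind of correction that symbol-level language-model post-processing is meant to provide.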

Format

application/pdf

Identifier

http://eprints.iisc.ernet.in/52983/1/2014_Int_Con_Sig_Pro_Com_2014.pdf

Urala, Bhargava K and Ramakrishnan, AG and Mohamed, Sahil (2014) Recognition of open vocabulary, online handwritten pages in Tamil script. In: International Conference on Signal Processing and Communications (SPCOM), JUL 22-25, 2014, Bangalore, INDIA.

Publisher

IEEE

Relation

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6984002

http://eprints.iisc.ernet.in/52983/

Keywords

#Electrical Communication Engineering #Electrical Engineering

Type

Conference Proceedings

NonPeerReviewed