Power-law transformation for enhanced recognition of born-digital word images
Date(s) |
2012
|
---|---|
Abstract |
In this paper, we discuss the issues related to word recognition in born-digital word images. We introduce a novel method that applies a power-law transformation to the word image for binarization. We show the improvement in image binarization and the consequent increase in the recognition performance of an OCR engine on the word image. Our algorithm automatically chooses the optimal value of gamma for each word image, using a fixed stroke width threshold. We have experimented exhaustively with our algorithm by varying the gamma and stroke width threshold values. By varying the gamma value, we found that our algorithm performed better than the results reported in the literature. On the ICDAR Robust Reading Systems Challenge-1: Word Recognition Task on the born-digital dataset, compared to the recognition rate of 61.5% achieved by TH-OCR after suitable pre-processing by Yang et al. and 63.4% by ABBYY FineReader (used as the baseline by the competition organizers without any preprocessing), we achieved 82.9% using OmniPage OCR applied to the images after processing by our algorithm. |
Format |
application/pdf |
Identifier |
http://eprints.iisc.ernet.in/46543/1/Int_Con_Sig_Pro_Comm_1_2012.pdf Kumar, Deepak and Ramakrishnan, AG (2012) Power-law transformation for enhanced recognition of born-digital word images. In: 2012 International Conference on Signal Processing and Communications (SPCOM), 22-25 July 2012, Bangalore, Karnataka, India. |
Publisher |
IEEE |
Relation |
http://dx.doi.org/10.1109/SPCOM.2012.6290009 http://eprints.iisc.ernet.in/46543/ |
Keywords | #Electrical Engineering |
Type |
Conference Paper PeerReviewed |
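
The abstract describes applying a power-law (gamma) transformation to a word image before binarization. As a rough illustration only, the sketch below maps each normalized intensity r to r^gamma and then binarizes the result. The function names (`power_law`, `otsu_threshold`, `binarize`) are hypothetical, and the use of Otsu's global threshold is an assumption made for this example; the record does not say which thresholding rule the authors use, and their automatic gamma selection based on a stroke width threshold is not reproduced here.

```python
import numpy as np


def power_law(gray, gamma):
    """Map 8-bit intensities r to 255 * (r / 255) ** gamma."""
    norm = gray.astype(np.float64) / 255.0
    return np.round(255.0 * norm ** gamma).astype(np.uint8)


def otsu_threshold(gray):
    """Return Otsu's threshold t (assumed here; not taken from the paper)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)                 # between-class variance per t
    valid = denom > 1e-12                   # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def binarize(gray, gamma):
    """Apply the power-law transform, then mark pixels above the threshold."""
    transformed = power_law(gray, gamma)
    return transformed > otsu_threshold(transformed)


# Toy "image" with two dark and two bright pixels.
img = np.array([[50, 50, 200, 200]], dtype=np.uint8)
mask = binarize(img, gamma=2.0)
```

A gamma above 1 darkens mid-tones and a gamma below 1 brightens them, which changes where a global threshold separates text from background. Selecting gamma per image, as the abstract describes, would require an additional criterion that this sketch leaves out.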