Evaluation of document binarization using eigen value decomposition


Author(s): Kumar, Deepak; Prasad, MN Anil; Ramakrishnan, AG
Date(s)

2013

Abstract

A necessary step in the recognition of scanned documents is binarization, which is essentially the segmentation of the document. Several algorithms for binarizing a scanned document can be found in the literature. Which binarization result is the best for a given document image? To answer this question, a user needs to check different binarization algorithms for suitability, since different algorithms may work better for different types of documents. Manually choosing the best from a set of binarized documents is time-consuming. To automate the selection of the best segmented document, we need either to use the ground truth of the document or to propose an evaluation metric. If ground truth is available, then precision and recall can be used to choose the best binarized document. What if ground truth is not available? Can we come up with a metric that evaluates these binarized documents? To this end, we propose a metric to evaluate binarized document images using eigen value decomposition. We have evaluated this measure on the DIBCO and H-DIBCO datasets. The proposed method chooses the best binarized document, one that is close to the ground truth of the document.
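
The ground-truth case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it does not implement the proposed eigen value based metric, which this record does not describe. It assumes images are nested lists of 0/1 values with 1 marking foreground (text) pixels, and it combines precision and recall into an F-measure to rank candidates; that combination, and the function names `precision_recall`, `f_measure` and `choose_best`, are assumptions made for illustration.

```python
def precision_recall(binarized, ground_truth):
    """Pixel-level precision and recall of foreground (value 1) pixels."""
    tp = fp = fn = 0
    for row_b, row_g in zip(binarized, ground_truth):
        for b, g in zip(row_b, row_g):
            if b and g:
                tp += 1          # foreground correctly detected
            elif b and not g:
                fp += 1          # background marked as foreground
            elif g and not b:
                fn += 1          # foreground missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed ranking score)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def choose_best(candidates, ground_truth):
    """Return the name of the candidate with the highest F-measure.

    `candidates` maps a (hypothetical) algorithm name to its binarized image.
    """
    return max(
        candidates,
        key=lambda name: f_measure(*precision_recall(candidates[name], ground_truth)),
    )


# Toy example: three candidate binarizations of a 2x3 image.
ground_truth = [[1, 1, 0],
                [0, 1, 0]]
candidates = {
    "algo_a": [[1, 1, 0], [0, 1, 0]],   # matches the ground truth
    "algo_b": [[1, 1, 1], [1, 1, 1]],   # over-segments: everything is foreground
    "algo_c": [[1, 0, 0], [0, 0, 0]],   # under-segments: most text is missed
}
best = choose_best(candidates, ground_truth)
```

In this toy example `algo_b` has full recall but low precision and `algo_c` has full precision but low recall, so a single combined score is needed to rank them; any other way of combining the two numbers could be substituted for `f_measure`.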

Format

application/pdf

Identifier

http://eprints.iisc.ernet.in/48006/1/Doc_rec_ret-8658.1_2013.pdf

Kumar, Deepak and Prasad, MN Anil and Ramakrishnan, AG (2013) Evaluation of document binarization using eigen value decomposition. In: 20th Conference on Document Recognition and Retrieval (DRR) held as part of the IS&T/SPIE Symposium on Electronic Imaging, FEB 05-07, 2013, San Francisco, CA.

Publisher

SPIE-INT SOC OPTICAL ENGINEERING

Relation

http://dx.doi.org/10.1117/12.2008502

http://eprints.iisc.ernet.in/48006/

Keywords #Electrical Engineering
Type

Conference Proceedings

NonPeerReviewed