Texture for Script Identification
Date(s) |
2005
|
---|---|
Abstract |
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed. |
Identifier | |
Publisher |
IEEE |
Relation |
DOI:10.1109/TPAMI.2005.227 Boles, Wageeh, Busch, Andrew, & Sridharan, Subramanian (2005) Texture for Script Identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(11), pp. 1720-1731. |
Source |
Faculty of Built Environment and Engineering |
Keywords | #080109 Pattern Recognition and Data Mining #080199 Artificial Intelligence and Image Processing not elsewhere classified #Script Identification, Wavelets and Fractals, Texture, Document Analysis, Clustering, Classification and Association Rules |
Type |
Journal Article |