Semantic information extraction from images of complex documents
Contributor(s) |
UNIVERSIDADE DE SÃO PAULO |
---|---|
Date(s) |
08/08/2013
08/08/2013
01/12/2012
|
Abstract |
Even though the digital processing of documents is increasingly widespread in industry, printed documents are still widely used. To process the contents of printed documents electronically, information must be extracted from digital images of those documents. In complex documents, the contents of different regions and fields can be highly heterogeneous with respect to layout, printing quality, and the use of fonts and typing standards, so reconstructing the contents from digital images can be a difficult problem. In this article we present an efficient solution to this problem, in which the semantic contents of the fields in a complex document are extracted from a digital image. Opus Software |
Identifier |
APPLIED INTELLIGENCE, DORDRECHT, v. 37, n. 4, supl. 1, Part 1, pp. 543-557, DEC, 2012 0924-669X http://www.producao.usp.br/handle/BDPI/32527 10.1007/s10489-012-0348-x |
Language(s) |
eng |
Publisher |
SPRINGER DORDRECHT |
Relation |
APPLIED INTELLIGENCE |
Rights |
openAccess Copyright SPRINGER |
Keywords | #DOCUMENT IMAGE PROCESSING #INFORMATION EXTRACTION FROM DOCUMENTS #COMPUTATIONAL APPROACH #SYSTEM #COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE |
Type |
article original article publishedVersion |