2 results for Document analysis

in Nottingham eTheses


Relevance: 100.00%

Abstract:

A strategy for document analysis is presented which uses Portable Document Format (PDF, the underlying file structure for Adobe Acrobat software) as its starting point. This strategy examines the appearance and geometric position of text and image blocks distributed over an entire document. A blackboard system is used to tag the blocks as a first stage in deducing the fundamental relationships existing between them. PDF is shown to be a useful intermediate stage in the bottom-up analysis of document structure. Its information on line spacing and font usage gives important clues for bridging the semantic gap between the scanned bitmap page and its fully analysed, block-structured form. Analysis of PDF can yield not only accurate page decomposition but also sufficient document information for the later stages of structural analysis and document understanding.
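The block-tagging stage described above can be illustrated with a minimal sketch. It is not the authors' system: the `Block` fields, the tag names, the 1.2 heading threshold and the four knowledge sources are all hypothetical, chosen only to show how a blackboard lets independent rules annotate shared blocks until no rule has anything left to add.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Block:
    # All fields are illustrative assumptions, not taken from the thesis.
    text: str
    y_top: float             # distance from the top of the page, in points
    font_size: float = 0.0   # left at 0.0 for image blocks
    kind: str = "text"       # "text" or "image"
    tags: set = field(default_factory=set)


class Blackboard:
    """Shared state that every knowledge source reads and annotates."""

    def __init__(self, blocks):
        self.blocks = sorted(blocks, key=lambda b: b.y_top)  # top-to-bottom order
        sizes = [b.font_size for b in self.blocks if b.kind == "text"]
        # Assumption: the median text size is the body-text size.
        self.body_size = median(sizes)


def _add(block, tag):
    """Add a tag and report whether the blackboard changed."""
    if tag in block.tags:
        return False
    block.tags.add(tag)
    return True


def tag_caption(bb):
    # Small text directly below a block already tagged "figure".
    changed = False
    for prev, cur in zip(bb.blocks, bb.blocks[1:]):
        if "figure" in prev.tags and cur.kind == "text" and cur.font_size < bb.body_size:
            changed |= _add(cur, "caption")
    return changed


def tag_figure(bb):
    changed = False
    for b in bb.blocks:
        if b.kind == "image":
            changed |= _add(b, "figure")
    return changed


def tag_heading(bb):
    # Hypothetical threshold: at least 20% larger than body text.
    changed = False
    for b in bb.blocks:
        if b.kind == "text" and b.font_size >= 1.2 * bb.body_size:
            changed |= _add(b, "heading")
    return changed


def tag_body(bb):
    changed = False
    for b in bb.blocks:
        if b.kind == "text" and abs(b.font_size - bb.body_size) < 0.5:
            changed |= _add(b, "body")
    return changed


# tag_caption is listed first on purpose: it cannot fire until tag_figure
# has posted its result, so the control loop needs a second pass.
SOURCES = [tag_caption, tag_figure, tag_heading, tag_body]


def run_blackboard(blocks, sources=SOURCES):
    """Apply every knowledge source repeatedly until none adds a new tag."""
    bb = Blackboard(blocks)
    progress = True
    while progress:
        progress = False
        for source in sources:
            if source(bb):
                progress = True
    return bb


if __name__ == "__main__":
    page = [
        Block("Document Analysis", 50, 18),
        Block("A strategy for document analysis is presented...", 90, 10),
        Block("", 300, kind="image"),
        Block("Figure 1: page layout", 420, 8),
    ]
    for b in run_blackboard(page).blocks:
        print(sorted(b.tags), repr(b.text[:30]))
```

In this sketch the caption rule depends on a tag posted by the figure rule, which is the kind of inter-block relationship a blackboard is suited to: rules stay independent and the control loop resolves their ordering.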

Relevance: 40.00%

Abstract:

The purpose of this paper is twofold. Firstly, it presents a preliminary and ethnomethodologically-informed analysis of the way in which the growing structure of a particular program's code was ongoingly derived from its earliest stages. This was motivated by an interest in how the detailed structure of a completed program 'emerged from nothing' as a product of the concrete practices of the programmer within the framework afforded by the language. The analysis is broken down into three sections that discuss: the beginnings of the program's structure; the incremental development of structure; and finally the code productions that constitute the structure and the importance of the programmer's stock of knowledge. The discussion attempts to understand and describe the emerging structure of code rather than focus on generating 'requirements' for supporting the production of that structure. Due to time and space constraints, however, only a relatively cursory examination of these features was possible. Secondly, the paper presents some thoughts on the difficulties associated with the analytic, and in particular ethnographic, study of code, drawing on general problems as well as issues arising from the difficulties and failings encountered as part of the analysis presented in the first section.