2 resultados para Machine Typed Document

em Aston University Research Archive


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Owing to the rise in the volume of literature, problems arise in the retrieval of required information. Various retrieval strategies have been proposed, but most of that are not flexible enough for their users. Specifically, most of these systems assume that users know exactly what they are looking for before approaching the system, and that users are able to precisely express their information needs according to l aid- down specifications. There has, however, been described a retrieval program THOMAS which aims at satisfying incompletely- defined user needs through a man- machine dialogue which does not require any rigid queries. Unlike most systems, Thomas attempts to satisfy the user's needs from a model which it builds of the user's area of interest. This model is a subset of the program's "world model" - a database in the form of a network where the nodes represent concepts since various concepts have various degrees of similarities and associations, this thesis contends that instead of models which assume equal levels of similarities between concepts, the links between the concepts should have values assigned to them to indicate the degree of similarity between the concepts. Furthermore, the world model of the system should be structured such that concepts which are related to one another be clustered together, so that a user- interaction would involve only the relevant clusters rather than the entire database such clusters being determined by the system, not the user. This thesis also attempts to link the design work with the current notion in psychology centred on the use of the computer to simulate human cognitive processes. In this case, an attempt has been made to model a dialogue between two people - the information seeker and the information expert. The system, called Thomas-II, has been implemented and found to require less effort from the user than Thomas.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circles/lines exclusion and word-block level segmentation. We then align and match characters in a flexible sized gallery with the segmented regions, using parallelised normalised cross-correlation. The experimental results over the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded and noisy samples, in addition to those with significant high missing data. The algorithm, which gives 84.0% classification rate with false positive rate 0.16 over the dataset, does not require training samples and generates compelling results as opposed to the training-based approaches, which have used the same benchmark.