960 results for document categorization
Abstract:
Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161.] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on candidate models. However, this kind of evaluation may suffer from under-fitting or over-fitting, because the effectiveness scores are restricted by the learning capacities of the classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive from negative instances while still providing satisfactory effectiveness with a small feature subset. Our experiments demonstrate how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.
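As an illustration of the setting this abstract describes (documents modelled as computable features, then a hypothesis learned from labelled examples), the following is a minimal sketch only. It assumes a bag-of-words document model and a multinomial naive Bayes classifier, neither of which is stated in the abstract, and it does not implement the proposed model fitness evaluation method. The function names and the tiny training set are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Assumed document model: lowercase, whitespace-separated terms."""
    return text.lower().split()


def train_naive_bayes(labelled_docs):
    """Collect class and term counts from (text, label) pairs."""
    class_counts = Counter()
    term_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_docs:
        class_counts[label] += 1
        for term in tokenize(text):
            term_counts[label][term] += 1
            vocabulary.add(term)
    return class_counts, term_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-posterior score."""
    class_counts, term_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_terms = sum(term_counts[label].values())
        for term in tokenize(text):
            if term not in vocabulary:
                continue  # terms never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            smoothed = term_counts[label][term] + 1
            score += math.log(smoothed / (total_terms + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled training set, for illustration only.
TRAINING_SET = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins championship match", "sport"),
    ("striker scores in final match", "sport"),
]
MODEL = train_naive_bayes(TRAINING_SET)
```

Calling `classify(MODEL, "interest rates and shares")` scores each label against the counted features. Swapping `tokenize` for another feature extractor changes the document model without touching the classifier, which is the separation the abstract's argument relies on.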
Abstract:
International audience
Abstract:
During development, children become capable of categorically associating stimuli and of using these relationships for memory recall. Brain damage in childhood can interfere with this development. This study investigated categorical association of stimuli and recall in four children with brain damage. The etiology, topography and timing of the lesions were diverse. Tasks included naming and immediate recall of 30 perceptually and semantically related figures, free sorting, delayed recall, and cued recall of the same material. Traditional neuropsychological tests were also employed. Two children with brain damage sustained in middle childhood relied on perceptual rather than categorical associations when relating figures and showed deficits in delayed or cued recall, in contrast to those with perinatal lesions. One child exhibited normal performance in recall despite categorical association deficits. The present results suggest that brain-damaged children show deficits in categorization and recall that are not usually identified in traditional neuropsychological tests.
Abstract:
Owing to the widespread, multipurpose use of document images and the large number of document image repositories now available, demand for robust information retrieval mechanisms and systems has grown. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes the content of document images, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents with that of the relationships created among their respective document images. Using the same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of common OCR misrecognition, which reinforces the feasibility of using LinkDI to relate highly degraded OCR output.
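The idea of relating OCR output through latent semantics can be sketched as follows. This is not the LinkDI implementation, which the abstract does not detail; it is a toy rank-k LSI built on raw term counts, with the singular vectors obtained by power iteration on the document Gram matrix. The sample texts, the simulated misrecognition "indexlng", and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def term_document_matrix(texts):
    """Rows are vocabulary terms, columns are documents (raw counts)."""
    vocabulary = sorted({t for text in texts for t in text.lower().split()})
    counts = [Counter(text.lower().split()) for text in texts]
    return [[c[term] for c in counts] for term in vocabulary]


def gram_matrix(matrix):
    """Document-by-document inner products (A^T A)."""
    n = len(matrix[0])
    return [[sum(r[i] * r[j] for r in matrix) for j in range(n)]
            for i in range(n)]


def top_eigenpairs(sym, k, iterations=500):
    """Power iteration with deflation. The uniform start vector is a
    simplification and can miss eigenvectors orthogonal to it."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        vec = [1.0 / math.sqrt(n)] * n
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n))
                   for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(n))
                    for i in range(n))
        pairs.append((value, vec))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def latent_document_vectors(texts, k):
    """Document coordinates in the rank-k latent space (Sigma_k V_k^T)."""
    pairs = top_eigenpairs(gram_matrix(term_document_matrix(texts)), k)
    return [[math.sqrt(max(value, 0.0)) * vec[j] for value, vec in pairs]
            for j in range(len(texts))]


def raw_document_vectors(texts):
    """Plain term-count vectors, one per document."""
    matrix = term_document_matrix(texts)
    return [[row[j] for row in matrix] for j in range(len(texts))]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical OCR output; "indexlng" simulates a misrecognised character.
TEXTS = [
    "latent semantic indexing retrieval",
    "latent semantic indexlng retrieval",
    "memory training older adults",
    "memory training episodic recall",
]
RAW = raw_document_vectors(TEXTS)
LATENT = latent_document_vectors(TEXTS, k=2)
```

Comparing `cosine(RAW[0], RAW[1])` with `cosine(LATENT[0], LATENT[1])` shows the intended effect on this toy data: the misrecognised term lowers the raw similarity, while the two documents fall on the same latent dimension. A link between images could then be created whenever the latent similarity exceeds a chosen threshold, which is itself an assumption here.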
Abstract:
This study aimed to describe the benefits of memory training for older adults with low education. Twenty-nine healthy older adults with zero to two years of formal education participated. Sixteen participants received training based on categorization (categorization group = CATG) and 13 received training based on mental images (imagery group = IMG). One group served as control for the other because they trained with different strategies. Training was offered in eight sessions of 90 minutes. The participants were evaluated pre- and posttraining. IMG improved performance in episodic memory tests and had reduced depressive symptoms. CATG increased the use of categorization but did not increase performance in episodic memory tests. Results suggest that the strategy based on the creation of mental images was more effective for older adults with low formal education.
Abstract:
Application of geographic information system (GIS) and global positioning system (GPS) technology in the Hlabisa community-based tuberculosis treatment programme documents the increase in accessibility to treatment after the expansion of the service from health facilities to include community workers and volunteers.
Abstract:
Although aspects of social identity theory are familiar to organizational psychologists, its elaboration, through self-categorization theory, of how social categorization and prototype-based depersonalization actually produce social identity effects is less well known. We describe these processes, relate self-categorization theory to social identity theory, describe new theoretical developments in detail, and show how these developments can address a range of organizational phenomena. We discuss cohesion and deviance, leadership, subgroup and sociodemographic structure, and mergers and acquisitions.