14 results for document categorization
in University of Queensland eSpace - Australia
Abstract:
Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on model candidates. However, this kind of evaluation may suffer from under-fitting or over-fitting, because the effectiveness scores are restricted by the learning capacities of the classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive and negative instances while still providing satisfactory effectiveness with a small feature subset. Our experiments demonstrated how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.
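To make the idea of a document model concrete, the following minimal Python sketch represents documents as bag-of-words feature vectors and checks whether a very small feature subset still separates positive from negative training instances. It is an illustration only and not the fitness evaluation method proposed in the paper: the function names, the toy corpus and the document-frequency weighting are all assumptions introduced here.

```python
from collections import Counter

def bag_of_words(text):
    """Model a document as a term-frequency feature vector."""
    return Counter(text.lower().split())

def select_features(docs, labels, k):
    """Keep the k terms whose document frequency differs most between classes.

    Returns a dict mapping each kept term to a signed weight
    (positive-class document frequency minus negative-class document frequency).
    """
    pos_df, neg_df = Counter(), Counter()
    for doc, label in zip(docs, labels):
        target = pos_df if label else neg_df
        target.update(set(bag_of_words(doc)))
    weights = {t: pos_df[t] - neg_df[t] for t in set(pos_df) | set(neg_df)}
    # Sort by absolute weight, breaking ties alphabetically for determinism.
    top = sorted(weights, key=lambda t: (-abs(weights[t]), t))[:k]
    return {t: weights[t] for t in top}

def classify(doc, weights):
    """Label a document positive when its weighted term score is above zero."""
    tf = bag_of_words(doc)
    return sum(w * tf[t] for t, w in weights.items()) > 0

# Hypothetical toy corpus: True marks the positive category.
docs = [
    "stock market shares rise",
    "market shares fall sharply",
    "team wins football match",
    "football team loses match",
]
labels = [True, True, False, False]

weights = select_features(docs, labels, k=2)
predictions = [classify(d, weights) for d in docs]
```

With this toy corpus, two features are enough to reproduce the training labels. A real evaluation would use held-out documents and a proper effectiveness measure.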
Abstract:
Application of geographic information system (GIS) and global positioning system (GPS) technology in the Hlabisa community-based tuberculosis treatment programme documents the increase in accessibility to treatment after the expansion of the service from health facilities to include community workers and volunteers.
Abstract:
Although aspects of social identity theory are familiar to organizational psychologists, its elaboration, through self-categorization theory, of how social categorization and prototype-based depersonalization actually produce social identity effects is less well known. We describe these processes, relate self-categorization theory to social identity theory, describe new theoretical developments in detail, and show how these developments can address a range of organizational phenomena. We discuss cohesion and deviance, leadership, subgroup and sociodemographic structure, and mergers and acquisitions.
Abstract:
It has been hypothesized that the brain categorizes stressors and utilizes neural response pathways that vary in accordance with the assigned category. If this is true, stressors should elicit patterns of neuronal activation within the brain that are category-specific. Data from previous immediate-early gene expression mapping studies have hinted that this is the case, but interstudy differences in methodology render conclusions tenuous. In the present study, immunolabelling for the expression of c-fos was used as a marker of neuronal activity elicited in the rat brain by haemorrhage, immune challenge, noise, restraint and forced swim. All stressors elicited c-fos expression in 25-30% of hypothalamic paraventricular nucleus corticotrophin-releasing-factor cells, suggesting that these stimuli were of comparable strength, at least with regard to their ability to activate the hypothalamic-pituitary-adrenal axis. In the amygdala, haemorrhage and immune challenge both elicited c-fos expression in a large number of neurons in the central nucleus of the amygdala, whereas noise, restraint and forced swim primarily elicited recruitment of cells within the medial nucleus of the amygdala. In the medulla, all stressors recruited similar numbers of noradrenergic (A1 and A2) and adrenergic (C1 and C2) cells. However, haemorrhage and immune challenge elicited c-fos expression in subpopulations of A1 and A2 noradrenergic cells that were significantly more rostral than those recruited by noise, restraint or forced swim. The present data support the suggestion that the brain recognizes at least two major categories of stressor, which we have referred to as 'physical' and 'psychological'. Moreover, the present data suggest that the neural activation footprint that is left in the brain by stressors can be used to determine the category to which they have been assigned by the brain.
Abstract:
By spliced alignment of human DNA and transcript sequence data, we constructed a data set of transcript-confirmed exons and introns from 2793 genes, 796 of which (28%) were seen to have multiple isoforms. We find that over one-third of human exons can translate in more than one frame, and that this is highly correlated with G+C content. Introns containing adenosine at donor site position +3 (A3), rather than guanosine (G3), are more common in low G+C regions, while the converse is true in high G+C regions. These two classes of introns are shown to have distinct lengths, consensus sequences and correlations among splice signals, leading to the hypothesis that A3 donor sites are associated with exon definition, and G3 donor sites with intron definition. Minor classes of introns, including GC-AG, U12-type GT-AG, weak, and putative AG-dependent introns, are identified and characterized. Cassette exons are more prevalent in low G+C regions, while exon isoforms are more prevalent in high G+C regions. Cassette exon events outnumber other alternative events, while exon isoform events involve truncation twice as often as extension, and occur at acceptor sites twice as often as at donor sites. Alternative splicing is usually associated with weak splice signals, and in a majority of cases, preserves the coding frame. The reported characteristics of constitutive and alternative splice signals, and the hypotheses offered regarding alternative splicing and genome organization, have important implications for experimental research into RNA processing. The 'AltExtron' data sets are available at http://www.bit.uq.edu.au/altExtron/ and http://www.ebi.ac.uk/~thanaraj/altExtron/.
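As a small illustration of the two quantities this abstract relies on, the Python sketch below computes the G+C fraction of a sequence and labels a donor site as A3 or G3 from the third intron base. It assumes the intron sequence is given 5' to 3' starting at donor position +1. The function names and example sequences are hypothetical and are not taken from the AltExtron data sets.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def donor_class(intron):
    """Classify a donor site by the base at intron position +3.

    Position +1 is the first intron base, so +3 is index 2.
    """
    base = intron.upper()[2]
    return {"A": "A3", "G": "G3"}.get(base, "other")

# Hypothetical intron starts, for illustration only.
a3_example = donor_class("GTAAGT")
g3_example = donor_class("GTGAGT")
gc_example = gc_content("GGCCAT")
```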
Abstract:
This paper discusses a document discovery tool based on Conceptual Clustering by Formal Concept Analysis. The program allows users to navigate e-mail using a visual lattice metaphor rather than a tree. It implements a virtual file structure over e-mail where files and entire directories can appear in multiple positions. The content and shape of the lattice formed by the conceptual ontology can assist in e-mail discovery. The system described provides more flexibility in retrieving stored e-mails than what is normally available in e-mail clients. The paper discusses how conceptual ontologies can leverage traditional document retrieval systems and aid knowledge discovery in document collections.
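The lattice idea can be sketched in a few lines of Python. The example below enumerates the formal concepts (extent, intent pairs) of a tiny e-mail context by closing the set of attribute sets under intersection. It is a naive illustration of Formal Concept Analysis, not the tool described in the paper: the message identifiers, attribute names and function name are assumptions made here.

```python
def formal_concepts(context):
    """Enumerate all formal concepts of a context {object: set of attributes}.

    Every concept intent is either the full attribute set or an intersection
    of object attribute sets, so we close the intents under intersection and
    then look up the objects that carry each intent.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {intent & attrs for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(o for o, a in context.items() if intent <= a)
        concepts.append((extent, intent))
    # Order from the smallest extent to the largest for readable output.
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))

# Hypothetical e-mails described by folder-like attributes.
emails = {
    "m1": frozenset({"work", "alice"}),
    "m2": frozenset({"work", "bob"}),
    "m3": frozenset({"family", "alice"}),
}

concepts = formal_concepts(emails)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

In this toy context, message m1 belongs both to the concept with intent {work} and to the concept with intent {alice}, which mirrors how a lattice lets one e-mail appear in multiple positions where a tree would force a single folder.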