24 results for Text categorization

in University of Queensland eSpace - Australia


Relevance:

60.00%

Publisher:

Abstract:

Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161.] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on candidate models. However, this kind of evaluation method may encounter either under-fitting or over-fitting problems, because the effectiveness scores are restricted by the learning capacities of the classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive and negative instances while still being able to provide satisfactory effectiveness with a small feature subset. Our experiments demonstrate how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.

Relevance:

20.00%

Publisher:

Abstract:

One of the goals of the ARC-funded e-Research project called Sharing access and analytical tools for ethnographic digital media using high speed networks, or simply EthnoER, is to take the outputs of normal linguistic analytical processes and present them online in a system we have called the EthnoER online presentation and annotation system, or EOPAS.

Relevance:

20.00%

Publisher:

Abstract:

Although aspects of social identity theory are familiar to organizational psychologists, its elaboration, through self-categorization theory, of how social categorization and prototype-based depersonalization actually produce social identity effects is less well known. We describe these processes, relate self-categorization theory to social identity theory, describe new theoretical developments in detail, and show how these developments can address a range of organizational phenomena. We discuss cohesion and deviance, leadership, subgroup and sociodemographic structure, and mergers and acquisitions.

Relevance:

20.00%

Publisher: