101 resultados para Text categorization

em Queensland University of Technology - ePrints Archive


Relevância:

70.00% 70.00%

Publicador:

Resumo:

This article examines manual textual categorisation by human coders with the hypothesis that the law of total probability may be violated for difficult categories. An empirical evaluation was conducted to compare a one step categorisation task with a two step categorisation task using crowdsourcing. It was found that the law of total probability was violated. Both a quantum and classical probabilistic interpretations for this violation are presented. Further studies are required to resolve whether quantum models are more appropriate for this task.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this article, we take a close look at the literacy demands of one task from the ‘Marvellous Micro-organisms Stage 3 Life and Living’ Primary Connections unit (Australian Academy of Science, 2005). One lesson from the unit, ‘Exploring Bread’, (pp 4-8) asks students to ‘use bread labels to locate ingredient information and synthesise understanding of bread ingredients’. We draw upon a framework offered by the New London Group (2000), that of linguistic, visual and spatial design, to consider in more detail three bread wrappers and from there the complex literacies that students need to interrelate to undertake the required task. Our findings are that although bread wrappers are an example of an everyday science text, their linguistic, visual and spatial designs and their interrelationship are not trivial. We conclude by reinforcing the need for teachers of science to also consider how the complex design elements of everyday science texts and their interrelated literacies are made visible through instructional practice.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The recent focus on literacy in Social Studies has been on linguistic design, particularly that related to the grammar of written and spoken text. When students are expected to produce complex hybridized genres such as timelines, a focus on the teaching and learning of linguistic design is necessary but not sufficient to complete the task. Theorizations of new literacies identify five interrelated meaning making designs for text deconstruction and reproduction: linguistic, spatial, visual, gestural, and audio design. Honing in on the complexity of timelines, this paper casts a lens on the linguistic, visual, spatial, and gestural designs of three pairs of primary school aged Social Studies learners. Drawing on a functional metalanguage, we analyze the linguistic, visual, spatial, and gestural designs of their work. We also offer suggestions of their effect, and from there consider the importance of explicit instruction in text design choices for this Social Studies task. We conclude the analysis by suggesting the foci of explicit instruction for future lessons.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objective: To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event resulting in presentation to a health care provider for treatment of an injury, and to highlight best practise approaches to conducting narrative text interrogation for injury surveillance purposes.----- Design: Systematic review----- Data sources: Electronic databases searched included CINAHL, Google Scholar, Medline, Proquest, PubMed and PubMed Central.. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.----- Selection criteria: Papers were selected if the study used a health-related database and if the study objectives were to a) use text field to identify injury cases or use text fields to extract additional information on injury circumstances not available from coded data or b) use text fields to assess accuracy of coded data fields for injury-related cases or c) describe methods/approaches for extracting injury information from text fields.----- Methods: The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Due to heterogeneity between studies metaanalysis was not performed.----- Results: The majority of papers reviewed focused on describing injury epidemiology trends using coded data and text fields to supplement coded data (28 papers), with these studies demonstrating the value of text data for providing more specific information beyond what had been coded to enable case selection or provide circumstantial information. Caveats were expressed in terms of the consistency and completeness of recording of text information resulting in underestimates when using these data. Four coding validation papers were reviewed with these studies showing the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, with a combination of manual and semi-automated methods used to refine and develop algorithms for extraction and classification of coded data from text. Quality assurance approaches to assessing the robustness of the methods for extracting text data was only discussed in 8 of the epidemiology papers, and 1 of the coding validation papers. All of the text interrogation methodology papers described systematic approaches to ensuring the quality of the approach.----- Conclusions: Manual review and coding approaches, text search methods, and statistical tools have been utilised to extract data from narrative text and translate it into useable, detailed injury event information. These techniques can and have been applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods which were used for text mining and less than half of the studies which were reviewed used/described quality assurance methods for ensuring the robustness of the approach. New techniques utilising semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traffic safety is a major concern world-wide. It is in both the sociological and economic interests of society that attempts should be made to identify the major and multiple contributory factors to those road crashes. This paper presents a text mining based method to better understand the contextual relationships inherent in road crashes. By examining and analyzing the crash report data in Queensland from year 2004 and year 2005, this paper identifies and reports the major and multiple contributory factors to those crashes. The outcome of this study will support road asset management in reducing road crashes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many data mining techniques have been proposed for mining useful patterns in databases. However, how to effectively utilize discovered patterns is still an open research issue, especially in the domain of text mining. Most existing methods adopt term-based approaches. However, they all suffer from the problems of polysemy and synonymy. This paper presents an innovative technique, pattern taxonomy mining, to improve the effectiveness of using discovered patterns for finding useful information. Substantial experiments on RCV1 demonstrate that the proposed solution achieves encouraging performance.