898 results for Text mining, Classificazione, Stemming, Text categorization
Abstract:
Background: A major challenge for assessing students’ conceptual understanding of STEM subjects is the capacity of assessment tools to reliably and robustly evaluate student thinking and reasoning. Multiple-choice tests are typically used to assess student learning and are designed to include distractors that can indicate students’ incomplete understanding of a topic or concept based on which distractor the student selects. However, these tests fail to provide the critical information uncovering the how and why of students’ reasoning for their multiple-choice selections. Open-ended or structured response questions are one method for capturing higher level thinking, but are often costly in terms of the time and attention needed to properly assess student responses. Purpose: The goal of this study is to evaluate methods for automatically assessing open-ended responses, e.g. students’ written explanations and reasoning for multiple-choice selections. Design/Method: We incorporated an open response component into an online signals and systems multiple-choice test to capture written explanations of students’ selections. The effectiveness of an automated approach for identifying and assessing student conceptual understanding was evaluated by comparing results of lexical analysis software packages (Leximancer and NVivo) to expert human analysis of student responses. In order to understand and delineate the process for effectively analysing text provided by students, the researchers evaluated the strengths and weaknesses of both the human and automated approaches. Results: Human and automated analyses revealed both correct and incorrect associations for certain conceptual areas. For some questions, the analyses surfaced associations that were not anticipated or included in the distractor selections, showing how multiple-choice questions alone fail to capture the comprehensive picture of student understanding.
The comparison of textual analysis methods revealed the capability of automated lexical analysis software to assist in the identification of concepts and their relationships for large textual data sets. We also identified several challenges to using automated analysis as well as manual and computer-assisted analysis. Conclusions: This study highlighted the usefulness of incorporating and analysing students’ reasoning or explanations in understanding how students think about certain conceptual ideas. The ultimate value of automating the evaluation of written explanations is that it can be applied more frequently and at various stages of instruction to formatively evaluate conceptual understanding and engage students in reflective
Abstract:
Evidence is needed on the acceptability of, and user preferences for, skin cancer-related text messages. We prepared 27 questions to evaluate attitudes, satisfaction with program characteristics such as timing and spacing, and overall satisfaction with the Healthy Text program in young adults. Within this randomised controlled trial (age 18-42 years), 546 participants were assigned to one of three Healthy Text message groups: sun protection, skin self-examination, or attention-control. Over a 12-month period, 21 behaviour-specific text messages were sent to each group. Participants’ preferences were compared between the two intervention groups and the control group at the 12-month follow-up telephone interview. In all three groups, participants reported the messages were easy to understand (98%), provided good suggestions or ideas (88%), and were encouraging (86%) and informative (85%), with little difference between the groups. The timing of the texts was received positively (92%); however, 8% of participants offered suggestions about the frequency or time of day of the messages. Participants in the two intervention groups found their messages more informative, and more likely to trigger behaviour change, than participants in the control group did. Text messages about skin cancer prevention and early detection are novel and acceptable to induce behaviour change in young adults.
Abstract:
Reflective writing is an important learning task to help foster reflective practice, but even when assessed it is rarely analysed or critically reviewed due to its subjective and affective nature. We propose a process for capturing subjective and affective analytics based on the identification and recontextualisation of anomalous features within reflective text. We evaluate two human-supervised trials of the process, and so demonstrate the potential for an automated Anomaly Recontextualisation process for Learning Analytics.
Abstract:
On our first day in Kalgoorlie, a local woman in her mid-thirties tells us that ‘Kal wouldn’t exist if it wasn’t for mining and prostitution’. In the ensuing days many others would tell us the same thing. More explicitly, in the words of another local resident, ‘The town was founded on brothels. [Without them] the men wouldn’t have been happy and they wouldn’t have got as much gold.’ These two phenomena – mining and prostitution – and their seemingly natural and straightforward connection to each other are also routinely invoked in tourist and popular culture depictions of Kalgoorlie. The Lonely Planet, for example, notes that ‘historically, mineworkers would come straight to town to spend disposable income at Kalgoorlie’s infamous brothels, or at pubs staffed by “skimpies” (scantily clad female bar staff)’.
Abstract:
Narrative text is a useful way of identifying injury circumstances from the routine emergency department data collections. Automatically classifying narratives with machine learning is a promising approach that can reduce the tedious manual classification process. Existing work focuses on Naive Bayes, which does not always offer the best performance. This paper proposes Matrix Factorization approaches along with a learning enhancement process for this task. The results are compared with the performance of various other classification approaches. The impact of parameter settings on the classification results for a medical text dataset is discussed. With the right choice of dimension k, the Non-negative Matrix Factorization method achieves a 10-fold cross-validation accuracy of 0.93.
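To make the role of the dimension k concrete, the following is a minimal sketch of non-negative matrix factorization using the standard multiplicative update rules. It is not the paper's method: the learning enhancement process is not described in the abstract, and the term-count matrix `V`, the function names, and the iteration count are all hypothetical choices made for illustration.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into W (n x k) and H (k x m) with non-negative entries."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    initial = frob_error(v, w, h)
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h, initial, frob_error(v, w, h)

# Hypothetical term counts: four narratives (rows) over five terms (columns).
V = [
    [3, 2, 0, 0, 0],
    [2, 3, 1, 0, 0],
    [0, 0, 0, 3, 2],
    [0, 0, 1, 2, 3],
]
W, H, err_before, err_after = nmf(V, k=2)
print(f"reconstruction error: {err_before:.3f} -> {err_after:.3f}")
```

Each row of `W` is a k-dimensional representation of one narrative that could then be passed to any classifier; choosing k trades reconstruction quality against the size of that representation.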
Abstract:
This article presents and evaluates a model to automatically derive word association networks from text corpora. Two aspects were evaluated: To what degree can corpus-based word association networks (CANs) approximate human word association networks with respect to (1) their ability to quantitatively predict word associations and (2) their structural network characteristics. Word association networks are the basis of the human mental lexicon. However, extracting such networks from human subjects is laborious, time consuming and thus necessarily limited in relation to the breadth of human vocabulary. Automatic derivation of word associations from text corpora would address these limitations. In both evaluations corpus-based processing provided vector representations for words. These representations were then employed to derive CANs using two measures: (1) the well-known cosine metric, which is a symmetric measure, and (2) a new asymmetric measure computed from orthogonal vector projections. For both evaluations, the full set of 4068 free association networks (FANs) from the University of South Florida word association norms was used as baseline human data. Two corpus-based models were benchmarked for comparison: a latent topic model and latent semantic analysis (LSA). We observed that CANs constructed using the asymmetric measure were slightly less effective than the topic model in quantitatively predicting free associates, and slightly better than LSA. The structural network analysis revealed that CANs do approximate the FANs to an encouraging degree.
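The difference between a symmetric and an asymmetric association measure can be illustrated with a small sketch. The abstract does not give the formula of the asymmetric measure, so the projection coefficient used here, (u·v)/(v·v), is only an assumed stand-in chosen because it is derived from an orthogonal projection; the word vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cosine(u, v):
    """Symmetric: cosine(u, v) == cosine(v, u)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def projection_score(u, v):
    """Assumed asymmetric measure: coefficient of the orthogonal
    projection of u onto v, i.e. (u.v) / (v.v)."""
    return dot(u, v) / dot(v, v)

# Hypothetical corpus-derived vectors for two words.
vectors = {"dog": [4.0, 2.0, 0.0], "puppy": [1.0, 1.0, 0.0]}
dog, puppy = vectors["dog"], vectors["puppy"]

print(cosine(dog, puppy), cosine(puppy, dog))  # identical in both directions
print(projection_score(dog, puppy))            # 6 / 2  = 3.0
print(projection_score(puppy, dog))            # 6 / 20 = 0.3
```

Because the score depends on which vector is projected onto which, it can model the fact that human free association is directional: a cue may evoke a target more strongly than the target evokes the cue.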
Abstract:
This report identifies the outcomes of a program evaluation of the five-year Workplace Health and Safety Strategy (2012-2017), specifically the engagement component within the Queensland Ambulance Service (QAS). As part of the former Department of Community Safety, their objective was to work towards harmonising the occupational health and safety policies and processes to improve the workplace culture. The report examines and assesses the process paths and resource inputs into the strategy, provides feedback on progress towards achieving identified goals, and identifies opportunities for improvement and barriers to progress. Consultations were held with key stakeholders within QAS and focus groups were facilitated with managers and health and safety representatives of each Local Area Service Network.
Abstract:
In this paper we present a robust method to detect handwritten text from unconstrained drawings on normal whiteboards. Unlike printed text on documents, free-form handwritten text has no pattern in terms of size, orientation and font, and it is often mixed with other drawings such as lines and shapes. Unlike handwriting on paper, handwriting on a normal whiteboard cannot be scanned, so the detection has to be based on photos. Our work traces straight edges on photos of the whiteboard and builds a graph representation of connected components. We use geometric properties such as edge density, graph density, aspect ratio and neighborhood similarity to differentiate handwritten text from other drawings. The experimental results show that our method achieves satisfactory precision and recall. Furthermore, the method is robust and efficient enough to be deployed on a mobile device. This is an important enabler of business applications that support whiteboard-centric visual meetings in enterprise scenarios. © 2012 IEEE.
Abstract:
Experience shows that developing business applications based on text analysis normally requires a lot of time and expertise in the field of computational linguistics. Several approaches to integrating text analysis systems with business applications have been proposed, but so far there has been no coordinated approach which would enable building scalable and flexible applications of text analysis in enterprise scenarios. In this paper, a service-oriented architecture for text processing applications in the business domain is introduced. It comprises various groups of processing components and knowledge resources. The architecture, created as a result of our experiences with building natural language processing applications in business scenarios, allows for the reuse of text analysis and other components, and facilitates the development of business applications. We verify our approach by showing how the proposed architecture can be applied to create a text analytics enabled business application that addresses a concrete business scenario. © 2010 IEEE.
Abstract:
Assessing students’ conceptual understanding of technical content is important for instructors, and for students as they learn content and apply knowledge in various contexts. Concept inventories that identify possible misconceptions through validated multiple-choice questions are helpful in identifying misconceptions that may exist, but do not provide a meaningful assessment of why they exist or the nature of the students’ understanding. We conducted a case study with undergraduate students in an electrical engineering course by testing a validated multiple-choice response concept inventory that we augmented with a component for students to provide written explanations for their multiple-choice selection. Results revealed that correctly chosen multiple-choice selections did not always match correct conceptual understanding for questions testing a specific concept. The addition of a text response to multiple-choice concept inventory questions provided an enhanced and meaningful assessment of students’ conceptual understanding and highlighted variables associated with current concept inventories or multiple-choice questions.
Abstract:
Concept mapping involves determining relevant concepts from a free-text input, where concepts are defined in an external reference ontology. This is an important process that underpins many applications for clinical information reporting, derivation of phenotypic descriptions, and a number of state-of-the-art medical information retrieval methods. Concept mapping can be cast into an information retrieval (IR) problem: free-text mentions are treated as queries and concepts from a reference ontology as the documents to be indexed and retrieved. This paper presents an empirical investigation applying general-purpose IR techniques for concept mapping in the medical domain. A dataset used for evaluating medical information extraction is adapted to measure the effectiveness of the considered IR approaches. Standard IR approaches used here are contrasted with the effectiveness of two established benchmark methods specifically developed for medical concept mapping. The empirical findings show that the IR approaches are comparable with one benchmark method but well below the best benchmark.
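The casting of concept mapping as retrieval can be sketched as follows: concept descriptions are indexed as documents, a free-text mention is the query, and concepts are ranked by TF-IDF cosine similarity. This is only an illustrative baseline under stated assumptions, not one of the systems evaluated in the paper; the concept identifiers, descriptions, whitespace tokenizer, and smoothed IDF formula are all hypothetical choices.

```python
import math
from collections import Counter

# Hypothetical toy ontology: concept id -> textual description.
CONCEPTS = {
    "C1": "myocardial infarction heart attack",
    "C2": "fracture of femur broken thigh bone",
    "C3": "type 2 diabetes mellitus high blood sugar",
}

def tokenize(text):
    return text.lower().split()

def build_index(concepts):
    """Index concepts as documents: per-concept term counts plus smoothed IDF."""
    docs = {cid: Counter(tokenize(text)) for cid, text in concepts.items()}
    n = len(docs)
    df = Counter(term for tf in docs.values() for term in tf)
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    return docs, idf

def weigh(tf, idf):
    """TF-IDF weights; terms unseen in the ontology are dropped."""
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    num = sum(w * b.get(t, 0.0) for t, w in a.items())
    den = (math.sqrt(sum(w * w for w in a.values()))
           * math.sqrt(sum(w * w for w in b.values())))
    return num / den if den else 0.0

def map_mention(mention, docs, idf):
    """Treat the free-text mention as a query and rank all concepts."""
    query = weigh(Counter(tokenize(mention)), idf)
    scored = [(cosine(query, weigh(tf, idf)), cid) for cid, tf in docs.items()]
    return sorted(scored, reverse=True)

docs, idf = build_index(CONCEPTS)
ranking = map_mention("patient had a heart attack", docs, idf)
print(ranking[0][1])  # best-matching concept id
```

A purely lexical ranker like this cannot match a mention that shares no terms with the concept description, which is one plausible reason general-purpose IR might trail methods built specifically for medical concept mapping.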
Abstract:
In my master’s thesis I analyse mystical Islamic poetry in its ritualistic performance context, samā` , focusing on the poetry used by the Chishti Sufis. The work is based on both literary sources and ethnographic material collected in India. The central textual source is Surūd-i Rūhānī, a compilation of mystical poetry. Textual sources, however, can be understood properly only in relation to the living performance context, and therefore I also utilise interviews with Sufis and performers of mystical music and recordings of samā` assemblies along with texts. The first part of the thesis concentrates on a thematic overview of the poems and the process of selecting a suitable text for performance. The poems are written in three languages, viz. Persian, Urdu and Hindi. Among the authors are both Sufis and non-Sufis. The poems, mystical and non-mystical alike, share the same poetic images and they acquire a mystical meaning when they are set to qawwali music and performed in samā` assemblies. My work includes several translations of verses not previously translated. The latter part of the thesis analyses the musical idiom of qawwali and the ways in which the impact of text on listeners is intensified in performance. Typically the intensification is accomplished at the level of a single poem through three different techniques: using introductory verses, inserting verses between the verses of the main poem and repeating individual units of text. The first two techniques are tied to creating a mystical state in the listeners while the third aims at sustaining it. It is customary that a listener enraptured by mystical experience offers a monetary contribution to the performers. Thus, intensification of the text’s impact aims at enabling the listeners to experience mystical states.