3 results for Short-text clustering

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance: 30.00%

Abstract:

The main aim of this article is to propose an exercise in stylistic analysis which can be employed in the teaching of English language. It details the design and results of a workshop activity on narrative carried out with undergraduates in a university department of English. The methods proposed are intended to enable students to obtain insights into aspects of cohesion and narrative structure; insights, it is suggested, which are not as readily obtainable through more traditional techniques of stylistic analysis. The text chosen for analysis is a short story by Ernest Hemingway comprising only 11 sentences. A jumbled version of this story is presented to students who are asked to assemble a cohesive and well-formed version of the story. Their (re)constructions are then compared with the original Hemingway version. Much interest, it is argued, lies in the ways in which the students justify their own versions in terms of their expectations about well-formedness in narrative. The activity is also intended to encourage students to see literary texts as a valuable means of providing insights into the subtleties of linguistic form and function.

Relevance: 30.00%

Abstract:

Clusters of text documents output by clustering algorithms are often hard to interpret. We describe motivating real-world scenarios that necessitate reconfigurability and high interpretability of clusters and outline the problem of generating clusterings with interpretable and reconfigurable cluster models. We develop two clustering algorithms toward the outlined goal of building interpretable and reconfigurable cluster models. They generate clusters with associated rules that are composed of conditions on word occurrences or nonoccurrences. The proposed approaches vary in the complexity of the format of the rules; RGC employs disjunctions and conjunctions in rule generation, whereas RGC-D rules are simple disjunctions of conditions signifying the presence of various words. In both cases, each cluster consists of precisely the set of documents that satisfy the corresponding rule. Rules of the latter kind are easy to interpret, whereas those of the former kind lead to more accurate clustering. We show that our approaches outperform the unsupervised decision tree approach for rule-generating clustering and also an approach we provide for generating interpretable models for general clusterings, both by significant margins. We empirically show that the purity and F-measure losses incurred to achieve interpretability can be as little as 3% and 5%, respectively, using the algorithms presented herein.
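The idea of a cluster defined by a disjunctive word-presence rule, as described for RGC-D, can be illustrated with a minimal sketch. This is not the authors' algorithm: it does not generate rules, it only shows how a given rule determines cluster membership and how purity could then be measured. The documents, labels, rule words, whitespace tokenization, and function names are all hypothetical, and the rules are hand-picked so that every document satisfies exactly one of them.

```python
from collections import Counter

def satisfies_disjunction(doc_words, rule_words):
    """True if the document contains at least one of the rule's words."""
    return not rule_words.isdisjoint(doc_words)

def cluster_by_rule(docs, rule_words):
    """Indices of the documents that satisfy the disjunctive rule."""
    return [
        i for i, text in enumerate(docs)
        if satisfies_disjunction(set(text.lower().split()), rule_words)
    ]

def purity(clusters, labels):
    """Fraction of clustered documents that carry their cluster's majority label."""
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(labels[i] for i in c).most_common(1)[0][1]
        for c in clusters if c
    )
    return majority / total

# Hypothetical toy data (not taken from the paper).
docs = [
    "cheap flights to paris",
    "hotel deals in rome",
    "football match tonight",
    "tennis match score",
    "hotel gym opens",
]
labels = ["travel", "travel", "sport", "sport", "sport"]

# Hand-written rules: a document joins a cluster if it contains ANY listed word.
rules = [{"flights", "hotel"}, {"football", "tennis"}]

clusters = [cluster_by_rule(docs, rule) for rule in rules]
score = purity(clusters, labels)
```

Here the first rule captures documents 0, 1, and 4, one of which is labeled "sport", so the purity is 4/5. The abstract does not say how documents matching several rules, or none, are handled; this sketch simply avoids that case.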

Relevance: 30.00%

Abstract:

Public health risk communication during emergencies should be rapid and accurate in order to allow the audience to take steps to prevent adverse outcomes. Delays to official communications may cause unnecessary anxiety due to uncertainty or inaccurate information circulating within the at-risk group. Modern electronic communications present opportunities for rapid, targeted public health risk communication. We present a case report of a cluster of invasive meningococcal disease in a primary school in which we used the school's mass short message service (SMS) text message system to inform parents and guardians of pupils about the incident, to tell them that chemoprophylaxis would be offered to all pupils and staff, and to advise them when to attend the school to obtain further information and antibiotics. Following notification to public health on a Saturday, an incident team met on Sunday, sent the SMS messages that afternoon, and administered chemoprophylaxis to 93% of 404 pupils on Monday. The use of mass SMS messages enabled rapid communication from an official source and greatly aided the public health response to the cluster.