76 results for text analytics
Abstract:
The continued use of traditional lecturing across Higher Education as the main teaching and learning approach in many disciplines must be challenged. An increasing number of studies suggest that this approach, compared to more active learning methods, is the least effective. In counterargument, the use of traditional lectures is often justified as necessary given a large student population. By analysing the implementation of a web-based broadcasting approach that replaced the traditional lecture within a programming-based module, thereby removing the student-population rationale, it was hoped that the student learning experience would become more active and ultimately enhance learning on the module. The implemented model replaces the traditional approach of students attending an on-campus lecture theatre with a web-based live broadcast that focuses on students being active learners rather than passive recipients. Students ‘attend’ by viewing a live broadcast of the lecturer, presented as a talking head, and the lecturer’s desktop, via a web browser. Video and audio communication is primarily from tutor to students, with text-based comments used to provide communication from students to tutor. This approach promotes active learning by allowing students to perform activities on their own computers rather than the passive viewing and listening commonly encountered in large lecture classes. Analysis of this approach over two years (n = 234 students) indicates that 89.6% of students rated it as offering a highly positive learning experience. Comparing student performance across three academic years also indicates a positive change. A small data-analytic study of student participation levels suggests that the cohort's willingness to engage with the broadcast lecture material is high.
Abstract:
Social media channels, such as Facebook or Twitter, allow people to express their views and opinions about any public topic. Public sentiment related to future events, such as demonstrations or parades, indicates public attitude and may therefore be used to estimate the level of disruption and disorder during such events. Consequently, sentiment analysis of social media content may be of interest to different organisations, especially in the security and law enforcement sectors. This paper presents a new lexicon-based sentiment analysis algorithm designed with a focus on real-time Twitter content analysis. The algorithm consists of two key components, namely sentiment normalisation and an evidence-based combination function, which are used to estimate the intensity of the sentiment rather than a positive/negative label and to support the classification of mixed sentiment. Finally, we illustrate a case study examining the relation between the negative sentiment of Twitter posts related to the English Defence League and the level of disorder during events related to the organisation.
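A minimal sketch of how such a lexicon-based scorer with normalisation and an evidence-style combination might look, in Python. The lexicon, the normalisation scale, and the combination rule below are illustrative assumptions, not the authors' algorithm:

```python
# Illustrative lexicon: word -> raw polarity strength (assumed values).
LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "riot": -3.0, "calm": 1.0}

def token_scores(tweet):
    """Look up each token's polarity strength in the lexicon."""
    return [LEXICON[t] for t in tweet.lower().split() if t in LEXICON]

def normalise(scores, scale=3.0):
    """Map raw lexicon strengths into [-1, 1] so tweets are comparable."""
    return [max(-1.0, min(1.0, s / scale)) for s in scores]

def combine(scores):
    """Evidence-style combination: weigh positive and negative evidence
    separately and return an intensity rather than a hard label, so mixed
    sentiment shows up as a value near zero."""
    pos = sum(s for s in scores if s > 0)
    neg = -sum(s for s in scores if s < 0)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total  # intensity in [-1, 1]

# Mixed-sentiment example: prints roughly -0.25, not a bare "negative" label.
print(combine(normalise(token_scores("Great turnout but a bad riot broke out"))))
```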
Abstract:
We address the problem of mining interesting phrases from subsets of a text corpus, where the subset is specified using a set of features, such as keywords, that form a query. Previous algorithms for the problem have proposed solutions that involve sifting through a phrase-dictionary-based index or a document-based index, where the solution is linear in either the phrase dictionary size or the size of the document subset. We propose using an independence assumption between query keywords given the top correlated phrases, whereby pre-processing can be reduced to discovering phrases from among the top phrases per feature in the query. We then outline an indexing mechanism in which per-keyword phrase lists are stored either on disk or in memory, so that popular aggregation algorithms such as No Random Access and Sort-Merge Join may be adapted to score candidates in real time and identify the top interesting phrases. Though such an approach is expected to be approximate, we empirically illustrate that very high accuracies (over 90%) are achieved against the results of exact algorithms. Due to the simplified list aggregation, we are also able to provide response times that are orders of magnitude better than state-of-the-art algorithms. Interestingly, our disk-based approach outperforms the in-memory baselines by up to a hundred times and sometimes more, confirming the superiority of the proposed method.
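A minimal sketch of the per-keyword list-aggregation idea: each query keyword has a precomputed list of (phrase, score) pairs, and a phrase's query score is aggregated across the keyword lists. The lists, scores, and sum aggregation are illustrative assumptions; the early-termination bounds that No Random Access adds over a plain scan are omitted for brevity:

```python
import heapq
from collections import defaultdict

# Assumed pre-built index: keyword -> list of (phrase, score), sorted by score.
phrase_lists = {
    "jaguar": [("big cat", 0.9), ("luxury car", 0.8), ("rain forest", 0.3)],
    "engine": [("luxury car", 0.7), ("search index", 0.5), ("big cat", 0.1)],
}

def top_k_phrases(query_keywords, k=2):
    """Aggregate each phrase's per-keyword scores (here: a full scan with a
    sum), then keep the k highest. NRA-style variants would exploit the
    score ordering of the lists to stop before reading them fully."""
    totals = defaultdict(float)
    for kw in query_keywords:
        for phrase, score in phrase_lists.get(kw, []):
            totals[phrase] += score
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])

print(top_k_phrases(["jaguar", "engine"]))  # [('luxury car', 1.5), ('big cat', 1.0)]
```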
Abstract:
We consider the problem of segmenting text documents that have a two-part structure, such as a problem part and a solution part. Documents of this genre include incident reports, which typically involve a description of events relating to a problem followed by those pertaining to the solution that was tried. Segmenting such documents into their two component parts would render them usable in knowledge reuse frameworks such as Case-Based Reasoning. This segmentation problem presents a hard case for traditional text segmentation due to the lexical inter-relatedness of the segments. We develop a two-part segmentation technique that can harness a corpus of similar documents to model the behavior of the two segments and their inter-relatedness using language models and translation models respectively. In particular, we use separate language models for the problem and solution segment types, whereas the inter-relatedness between segment types is modeled using an IBM Model 1 translation model. We model documents as being generated starting from the problem part, which comprises words sampled from the problem language model, followed by the solution part, whose words are sampled either from the solution language model or from a translation model conditioned on the words already chosen in the problem part. We show, through an extensive set of experiments on real-world data, that our approach outperforms state-of-the-art text segmentation algorithms in the accuracy of segmentation, and that such improved accuracy translates well to improved usability in Case-Based Reasoning systems. We also analyze the robustness of our technique to varying amounts and types of noise and empirically illustrate that our technique is quite noise-tolerant and degrades gracefully with increasing amounts of noise.
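A minimal sketch of this generative scoring, assuming toy unigram language models and a toy IBM Model 1-style translation table (a real system would estimate these from a corpus, e.g. via EM), and a simple uniform mixture of the two emission routes for solution words:

```python
import math

# Toy models (assumed values, not learned from data).
problem_lm  = {"printer": 0.3, "jams": 0.3, "paper": 0.2, "tray": 0.2}
solution_lm = {"replaced": 0.4, "cleaned": 0.3, "roller": 0.3}
# t[solution_word][problem_word] ~ IBM Model 1 translation probability.
t = {"paper": {"jams": 0.6, "printer": 0.4}, "tray": {"printer": 1.0}}

EPS = 1e-6  # smoothing for unseen words

def log_p_problem(w):
    return math.log(problem_lm.get(w, EPS))

def log_p_solution(w, problem_words):
    """A solution word is emitted either by the solution LM or by translating
    a problem word already generated; take a uniform mixture of the routes."""
    lm = solution_lm.get(w, EPS)
    tr = sum(t.get(w, {}).get(p, 0.0) for p in problem_words) / len(problem_words)
    return math.log(0.5 * lm + 0.5 * tr + EPS)

def best_split(words):
    """Try every split point; return the one with the highest likelihood."""
    best_i, best_score = None, -math.inf
    for i in range(1, len(words)):
        prob_part, sol_part = words[:i], words[i:]
        score = sum(log_p_problem(w) for w in prob_part) \
              + sum(log_p_solution(w, prob_part) for w in sol_part)
        if score > best_score:
            best_i, best_score = i, score
    return best_i

doc = "printer jams paper tray cleaned roller replaced".split()
i = best_split(doc)
print(doc[:i], "|", doc[i:])  # splits after "tray" on this toy input
```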
Abstract:
We make a case for studying the impact of intra-node parallelism on the performance of data analytics. We identify four performance optimizations that are enabled by an increasing number of processing cores on a chip. We discuss the performance impact of these optimizations on two analytics operators and identify how these optimizations affect one another.
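As one concrete, purely illustrative example of the kind of optimization more cores enable, the sketch below partitions a grouped aggregation across workers and merges worker-local partial results, avoiding contention on a shared hash table; the operator, data, and worker count are assumptions, not the paper's setup:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partial_group_count(rows):
    """Each worker builds its own partial aggregate (no shared state)."""
    return Counter(key for key, _ in rows)

def parallel_group_count(rows, workers=4):
    # Round-robin partitioning of the input across workers.
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_group_count, chunks)
    total = Counter()
    for p in partials:  # cheap merge step at the end
        total.update(p)
    return total

# Note: CPython's GIL limits true parallelism for pure-Python work; the same
# partition-then-merge structure is what a native engine would run per core.
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4)] * 1000
print(parallel_group_count(rows)["a"])  # 2000
```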
Abstract:
Public health risk communication during emergencies should be rapid and accurate in order to allow the audience to take steps to prevent adverse outcomes. Delays to official communications may cause unnecessary anxiety due to uncertainty or inaccurate information circulating within the at-risk group. Modern electronic communications present opportunities for rapid, targeted public health risk communication. We present a case report of a cluster of invasive meningococcal disease in a primary school in which we used the school's mass short message service (SMS) text message system to inform parents and guardians of pupils about the incident, to tell them that chemoprophylaxis would be offered to all pupils and staff, and to advise them when to attend the school to obtain further information and antibiotics. Following notification to public health on a Saturday, an incident team met on Sunday, sent the SMS messages that afternoon, and administered chemoprophylaxis to 93% of 404 pupils on Monday. The use of mass SMS messages enabled rapid communication from an official source and greatly aided the public health response to the cluster.
Abstract:
The proliferation of video lecture capture in universities worldwide presents an opportunity to analyse video-watching patterns in an attempt to quantify and qualify how students engage and learn with the videos. It also presents an opportunity to investigate whether there are similar student learning patterns during the equivalent physical lecture. The goal of this action-research project was to capture and quantitatively analyse the viewing behaviours and patterns across a series of video lecture captures from several university Java programming modules. It sought to study whether a quantitative analysis of viewing behaviours of lecture-capture videos, coupled with a qualitative evaluation from students and lecturers, could be correlated to provide generalised patterns. Such patterns could then be used to understand the learning experience of students during videos, and potentially during face-to-face lectures, thereby presenting opportunities to reflectively enhance lecturer performance and the students’ overall learning experience. The report establishes a baseline understanding of the analytics of videos covering several commonly used pedagogical teaching methods in the delivery of programming courses. It reflects on possible concurrences with live lecture delivery, with the potential to inform and improve lecturing performance.
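One way such viewing behaviours might be quantified, as a sketch: turn per-student (start, end) watch intervals into a per-second view-count profile whose peaks mark re-watched segments. The log format and the one-second granularity are illustrative assumptions:

```python
from collections import Counter

def view_profile(sessions, video_len):
    """sessions: iterable of (start_sec, end_sec) watch intervals.
    Returns the number of views covering each second of the video."""
    counts = Counter()
    for start, end in sessions:
        for sec in range(max(0, start), min(end, video_len)):
            counts[sec] += 1
    return [counts[s] for s in range(video_len)]

# Three sessions overlapping on seconds 30-59: that region peaks at 3 views.
profile = view_profile([(0, 60), (30, 90), (30, 60)], video_len=120)
print(max(range(len(profile)), key=profile.__getitem__))  # most re-watched second
```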
Abstract:
Learning from visual representations is enhanced when learners appropriately integrate corresponding visual and verbal information. This study examined the effects of two methods of promoting integration, color coding and labeling, on learning about probabilistic reasoning from a table and text. Undergraduate students (N = 98) were randomly assigned to learn about probabilistic reasoning from one of four computer-based lessons generated from a 2 (color coding/no color coding) × 2 (labeling/no labeling) between-subjects design. Learners added the labels or color coding at their own pace by clicking buttons in a computer-based lesson. Participants' eye movements were recorded while viewing the lesson. Labeling was beneficial for learning, but color coding was not. In addition, labeling, but not color coding, increased attention to important information in the table and time spent with the lesson. Both labeling and color coding increased looks between the text and corresponding information in the table. The findings provide support for the multimedia principle, and they suggest that providing labeling enhances learning about probabilistic reasoning from text and tables.