876 results for text analytics
Abstract:
Objective To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Results The results show that the Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as when using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous work on classifying free-text radiology reports. Discussion Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion This investigation shows promising early results; future work will further validate and strengthen the proposed approaches.
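As a rough illustration of the kind of features this abstract names, the sketch below builds stemmed token bigrams plus a simple negation feature and feeds them to a small multinomial Naive Bayes classifier with add-one smoothing. It is not the authors' implementation: the negation cue list, the crude plural-stripping stemmer, the toy reports, and all function and class names are assumptions made for the example, and SNOMED CT concept features are omitted.

```python
import math
import re
from collections import Counter, defaultdict

NEGATION_CUES = {"no", "not", "without"}  # assumed cue list, not from the paper


def crude_stem(token):
    # Placeholder stemmer (assumption): strips a plural "s" only.
    # A real system would use a proper stemmer such as Porter's.
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "us")):
        return token[:-1]
    return token


def features(report):
    tokens = [crude_stem(t) for t in re.findall(r"[a-z]+", report.lower())]
    pairs = list(zip(tokens, tokens[1:]))
    feats = [f"{a}_{b}" for a, b in pairs]  # stemmed token bigrams
    # Negation feature: mark the token that directly follows a negation cue.
    feats += [f"NEG_{b}" for a, b in pairs if a in NEGATION_CUES]
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reports, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for report, label in zip(reports, labels):
            self.feat_counts[label].update(features(report))
        self.vocab = {f for counts in self.feat_counts.values() for f in counts}
        return self

    def predict(self, report):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for f in features(report):
                if f in self.vocab:  # ignore features never seen in training
                    score += math.log((self.feat_counts[label][f] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for the example.
train = [
    ("There is a fracture of the distal radius.", "abnormal"),
    ("Fracture of the distal fibula with dislocation.", "abnormal"),
    ("No fracture seen. Normal alignment.", "normal"),
    ("No fracture or dislocation identified.", "normal"),
]
model = NaiveBayes().fit([r for r, _ in train], [l for _, l in train])
```

On this toy data, the negation feature and the "no_fracture" bigram are what let the classifier separate a negated mention of a fracture from a positive one.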
Abstract:
Social Media Analytics is a new research field in which interdisciplinary methods are combined, extended, and adapted in order to analyse social media data. Besides answering research questions, a further goal is to provide architectural designs for the development of new information systems and applications that are based on social media. The article presents the most important aspects of Social Media Analytics and points to the need for an interdisciplinary research agenda, in whose creation and pursuit the discipline of information systems (Wirtschaftsinformatik) has an important role to play.
Abstract:
Social Media Analytics is an emerging interdisciplinary research field that aims to combine, extend, and adapt methods for the analysis of social media data. On the one hand, it can support IS and other research disciplines in answering their research questions; on the other hand, it helps to provide architectural designs as well as solution frameworks for new social media-based applications and information systems. The authors suggest that IS should contribute to this field and help to develop and process an interdisciplinary research agenda.
Abstract:
Enterprises, both public and private, have rapidly commenced using the benefits of enterprise resource planning (ERP) combined with business analytics and “open data sets” which are often outside the control of the enterprise to gain further efficiencies, build new service operations and increase business activity. In many cases, these business activities are based around relevant software systems hosted in a “cloud computing” environment. “Garbage in, garbage out”, or “GIGO”, is a term long used to describe problems in unqualified dependency on information systems, dating from the 1960s. However, a more pertinent variation arose sometime later, namely “garbage in, gospel out” signifying that with large scale information systems, such as ERP and usage of open datasets in a cloud environment, the ability to verify the authenticity of those data sets used may be almost impossible, resulting in dependence upon questionable results. Illicit data set “impersonation” becomes a reality. At the same time the ability to audit such results may be an important requirement, particularly in the public sector. This paper discusses the need for enhancement of identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment and analyses some current technologies that are offered and which may be appropriate. However, severe limitations to addressing these requirements have been identified and the paper proposes further research work in the area.
Abstract:
Background Timely diagnosis and reporting of patient symptoms in hospital emergency departments (ED) is a critical component of health services delivery. However, due to dispersed information resources and a vast amount of manual processing of unstructured information, accurate point-of-care diagnosis is often difficult. Aims The aim of this research is to report an initial experimental evaluation of a clinician-informed automated method that addresses initial misdiagnoses associated with delayed receipt of unstructured radiology reports. Method A method was developed that resembles clinical reasoning for identifying limb abnormalities. The method consists of a gazetteer of keywords related to radiological findings; it classifies an X-ray report as abnormal if the report contains evidence listed in the gazetteer. A set of 99 narrative reports of radiological findings was sourced from a tertiary hospital. Reports were manually assessed by two clinicians and discrepancies were validated by a third expert ED clinician; the final manual classification generated by the expert ED clinician was used as ground truth to empirically evaluate the approach. Results The automated method, which attempts to identify limb abnormalities by searching for keywords supplied by clinicians, achieved an F-measure of 0.80 and an accuracy of 0.80. Conclusion While the automated clinician-driven method achieved promising performance, a number of avenues for improvement were identified using advanced natural language processing (NLP) and machine learning techniques.
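The gazetteer approach described in this abstract, and the two metrics it reports, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the keyword set, the four toy reports, and the function names are all hypothetical, since the abstract does not list the gazetteer's contents.

```python
import re

# Hypothetical gazetteer: the abstract does not list the actual keywords.
GAZETTEER = {"fracture", "dislocation", "subluxation", "avulsion"}


def is_abnormal(report, gazetteer=GAZETTEER):
    # A report is flagged abnormal if any gazetteer keyword occurs in it.
    tokens = set(re.findall(r"[a-z]+", report.lower()))
    return bool(tokens & gazetteer)


def f_measure_and_accuracy(predicted, truth):
    pairs = list(zip(predicted, truth))
    tp = sum(p and t for p, t in pairs)
    fp = sum(p and not t for p, t in pairs)
    fn = sum(not p and t for p, t in pairs)
    tn = sum(not p and not t for p, t in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return f1, accuracy


# Toy reports with invented ground-truth labels (True = abnormal).
reports = [
    ("Transverse fracture of the distal radius.", True),
    ("No fracture seen.", False),  # negated mention: keyword matching flags it anyway
    ("Normal alignment. No bony injury.", False),
    ("Anterior dislocation of the shoulder.", True),
]
predicted = [is_abnormal(text) for text, _ in reports]
truth = [label for _, label in reports]
f1, accuracy = f_measure_and_accuracy(predicted, truth)
```

The second toy report shows one plausible weakness of plain keyword matching: a negated finding still triggers the gazetteer, which is the kind of case that negation handling or learned classifiers could address.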
Abstract:
The ability to identify and assess user engagement with transmedia productions is vital to the success of individual projects and the sustainability of this mode of media production as a whole. It is essential that industry players have access to tools and methodologies that offer the most complete and accurate picture of how audiences/users engage with their productions and which assets generate the most valuable returns on investment. Drawing upon research conducted with Hoodlum Entertainment, a Brisbane-based transmedia producer, this chapter outlines an initial assessment of the way engagement tends to be understood, why standard web analytics tools are ill-suited to measuring it, how a customised tool could offer solutions, and why this question of measuring engagement is so vital to the future of transmedia as a sustainable industry.
Abstract:
It might still sound strange to dedicate an entire journal issue exclusively to a single internet platform. But it is not the company Twitter Inc. that draws our attention; this issue is not about a platform and its features and services. It is about its users and the ways in which they interact with one another via the platform, about the situations that motivate people to share their thoughts publicly, using Twitter as a means to reach out to one another. And it is about the digital traces people leave behind when interacting with Twitter, and most of all about the ways in which these traces – as a new type of research data – can also enable new types of research questions and insights.
Abstract:
This research proposes the development of interfaces to support collaborative, community-driven inquiry into data, which we refer to as Participatory Data Analytics. Since the investigation is led by local communities, it is not possible to anticipate which data will be relevant and what questions are going to be asked. Therefore, users have to be able to construct and tailor visualisations to their own needs. The poster presents early work towards defining a suitable compositional model, which will allow users to mix, match, and manipulate data sets to obtain visual representations with little-to-no programming knowledge. Following a user-centred design process, we are subsequently planning to identify appropriate interaction techniques and metaphors for generating such visual specifications on wall-sized, multi-touch displays.
Abstract:
This publication arose from the interests of the chapter authors, ‘a small group of thoughtful people’, almost all of whom participated in one or both Transnational Dialogues in Research in Early Childhood Education for Sustainability, held in Stavanger, Norway in 2010 and Brisbane, Australia in 2011 (refer to Appendix 1 for a list of participants). These meetings were the first time that a critical mass of researchers from vastly different parts of the globe - Norway, Sweden, Australia and New Zealand at the inaugural meeting, with additional participants from Korea, Japan and Singapore attending the second - had come together to debate, discuss and share ideas about research and theory in the emerging field of Early Childhood Education for Sustainability (ECEfS). Some of the researchers who joined these Transnational Dialogues had met serendipitously at earlier conferences and meetings, or corresponded via email, but many had never met face-to-face. Now a significant number are contributing authors in this text. It is a testament to these researchers’ interest in this agenda that they mostly self-funded their travel and other costs to attend the Transnational Dialogues research meetings. While most chapter authors come from the field of early childhood education, a few are more aligned with education for sustainability/environmental education, while a much smaller number are already working at the intersection of early childhood education and education for sustainability. What we share as a group is a range of perspectives and orientations to research and to the research focus at the heart of this book - young children and their actual and potential capabilities as agents of change for sustainability.
As researchers, regardless of experience and perspectives, participants knew they had something extra to offer - their expertise as researchers - providing scholarly insights into the work of practitioners, applying critically reflective lenses to curricula, pedagogies and assumptions, testing of ideas and theories, and presenting a sense for where ECEfS might fit or, indeed, go beyond norms and orthodoxies. This is a text, then, for both researchers and those whose primary interests lie in daily interactions with children, families and communities.
Abstract:
Theorists of multiliteracies, social semiotics, and the New Literacy Studies have drawn attention to the potential changing nature of writing and literacy in the context of networked communications. This article reports findings from a design-based research project in Year 4 classrooms (students aged 8.5-10 years) in a low socioeconomic status school. A new writing program taught students how to design multimodal and digital texts across a range of genres and text types, such as web pages, online comics, video documentaries, and blogs. The authors use Bernstein’s theory of the pedagogic device to theorize the pedagogic struggles and resolutions in remaking English through the specialization of time, space, and text. The changes created an ideological struggle as new writing practices were adapted from broader societal fields to meet the instructional and regulative discourses of a conventional writing curriculum.
Abstract:
Background Breastfeeding is recognised as the optimal method for feeding infants, with health gains made by reducing infectious diseases in infancy and chronic diseases, including obesity, in childhood, adolescence and adulthood. Despite this, exclusivity and duration in developed countries remain resistant to improvement. The objectives of this research were to test whether an automated mobile phone text messaging intervention, delivering one text message a week, could increase “any” breastfeeding rates and improve breastfeeding self-efficacy and coping. Methods Women were eligible to participate if they were over eighteen years of age, had an infant less than three months old, were currently breastfeeding, had no diagnosed mental illness, and used a mobile phone. Women in the intervention group received MumBubConnect, a text messaging service with automated responses delivered once a week for 8 weeks. Women in the comparison group received their usual care and were sampled two years after the intervention group. Data collection included online surveys at two time points, week zero and week nine, to measure breastfeeding exclusivity and duration, coping, emotions, accountability and self-efficacy. A range of statistical analyses was used to test for differences between groups. Hierarchical regression was used to investigate change in breastfeeding outcome between groups, adjusting for covariates. Results The intervention group had 120 participants at commencement and 114 at completion; the comparison group had 114 participants at commencement and 86 at completion. MumBubConnect had a positive impact on the primary outcome of breastfeeding behaviors, with women receiving the intervention more likely to continue exclusive breastfeeding: there was a 6% decrease in exclusive breastfeeding in the intervention group, compared to a 14% decrease in the comparison group (p < 0.001).
This remained significant after controlling for infant age, mother’s income, education and delivery type (p = 0.04). Women in the intervention group demonstrated active coping and were less likely to display emotion-focussed coping (p < 0.001). There was no discernible statistical effect on self-efficacy or accountability. Conclusions A fully automated text messaging service appears to improve exclusive breastfeeding duration. The service provides a well-accepted, personalised support service that empowers women to actively resolve breastfeeding issues. Trial registration Australian New Zealand Clinical Trials Registry: ACTRN12614001091695.
Abstract:
Descriptions of patients' injuries are recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. Extensive pre-processing is carried out to ensure the quality of the data and, hence, the quality of the classification outcome. Records with a null entry in the injury description are removed. Misspellings are corrected by finding each misspelt word and replacing it with a sound-alike word. Meaningful phrases have been identified and kept, instead of removing part of a phrase as a stop word. Abbreviations appearing in many forms of entry are manually identified and only one form of each abbreviation is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically, from about 28,000 to 5000. The medical narrative text injury dataset under consideration is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers have been built on this reduced feature space. In experiments, a set of tests is conducted to determine which classification method is best for medical text classification.
The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision, which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting, which works well for long text classification, is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as their removal affects the classification performance.
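To make the two term-weighting schemes compared in this abstract concrete, the sketch below computes binary and TF/IDF vectors for a few short documents. It is only an illustration: the toy injury descriptions, the unsmoothed idf formula log(N/df), and the function names are assumptions, and the abstract does not say which TF/IDF variant the authors used.

```python
import math
from collections import Counter


def binary_vector(tokens, vocab):
    # 1.0 if the term occurs in the document at all, otherwise 0.0.
    present = set(tokens)
    return [1.0 if term in present else 0.0 for term in vocab]


def tfidf_vectors(docs, vocab):
    # Assumed variant: raw term frequency times unsmoothed idf = log(N / df).
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * math.log(n / df[t]) if df[t] else 0.0 for t in vocab])
    return vectors


# Toy short documents, invented for the example.
docs = [
    "fell off ladder fractured wrist".split(),
    "fell down stairs sprained ankle".split(),
    "cut finger with kitchen knife".split(),
]
vocab = sorted({t for d in docs for t in d})
binary = [binary_vector(d, vocab) for d in docs]
tfidf = tfidf_vectors(docs, vocab)
```

In documents this short, almost every term occurs exactly once, so the term-frequency part of TF/IDF carries little information and the weights are driven by document frequency alone. The sketch shows the mechanics of the two schemes only; it does not reproduce or explain the reported difference in classification performance.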
Abstract:
This paper evaluates the performance of different text recognition techniques for a mobile robot in an indoor (university campus) environment. We compared four different methods: our own approach using existing text detection methods (Maximally Stable Extremal Regions detector and Stroke Width Transform) combined with a convolutional neural network, two modes of the open source program Tesseract, and the experimental mobile app Google Goggles. The results show that a convolutional neural network combined with the Stroke Width Transform gives the best performance in correctly matched text on images with single characters, whereas Google Goggles gives the best performance on images with multiple words. The dataset used for this work is released as well.
Abstract:
Heterogeneous health data is a critical issue when managing health information for quality decision making processes. In this paper we examine the efficient aggregation of lifestyle information through a data warehousing architecture lens. We present a proof of concept for a clinical data warehouse architecture that enables evidence based decision making processes by integrating and organising disparate data silos in support of healthcare services improvement paradigms.