765 results for Sentiment Analysis, Opinion Mining, Twitter
Abstract:
This thesis presents an association rule mining approach, association hierarchy mining (AHM). Unlike traditional two-step bottom-up rule mining, AHM adopts a one-step top-down rule mining strategy to improve the efficiency and effectiveness of mining association rules from datasets. The thesis also presents a novel approach to evaluating the quality of the knowledge discovered by AHM, which focuses on the information difference between the discovered knowledge and the original datasets. Experiments on a real application, characterising network traffic behaviour, show that AHM achieves encouraging performance.
Abstract:
This research is a step forward in improving the accuracy of detecting anomalies in a data graph representing connectivity between people in an online social network. The proposed hybrid methods are based on fuzzy machine learning techniques utilising different types of structural input features. The methods are presented within a multi-layered framework that covers the full set of requirements for finding anomalies in data graphs generated from online social networks, including data modelling and analysis, labelling, and evaluation.
Abstract:
This paper presents a single-pass algorithm for mining discriminative itemsets in data streams using a novel data structure and the tilted-time window model. Discriminative itemsets are defined as itemsets that are frequent in one data stream and whose frequency in that stream is much higher than in the rest of the streams in the dataset. To keep the size of the data structure manageable, we propose a pruning process that results in a compact tree structure containing only the discriminative itemsets. Empirical analysis shows that the proposed method has sound time and space complexity.
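The definition of a discriminative itemset can be illustrated with a brute-force sketch. This is not the paper's single-pass algorithm: it uses no tilted-time window and no tree structure, and the names `min_support` and `min_ratio`, their default values, and the toy transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def itemset_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items across the transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def discriminative_itemsets(target, others, min_support=0.5, min_ratio=2.0, max_size=2):
    """Itemsets frequent in `target` and at least min_ratio times more frequent there than in `others`."""
    target_counts = itemset_counts(target, max_size)
    other_counts = itemset_counts(others, max_size)
    result = {}
    for itemset, count in target_counts.items():
        freq_target = count / len(target)
        if freq_target < min_support:
            continue  # not frequent in the target stream
        freq_other = other_counts.get(itemset, 0) / len(others)
        # Assumption: an itemset absent from the other streams counts as infinitely discriminative.
        ratio = float("inf") if freq_other == 0 else freq_target / freq_other
        if ratio >= min_ratio:
            result[itemset] = (freq_target, freq_other)
    return result


# Hypothetical toy data: one target stream and the pooled remaining streams.
target_stream = [["a", "b"], ["a", "b", "c"], ["a", "b"], ["b", "c"]]
other_streams = [["a"], ["b", "c"], ["c"], ["a", "c"]]
found = discriminative_itemsets(target_stream, other_streams)
```

With these toy inputs, `("a",)` is frequent in the target stream but is rejected because it is only 1.5 times as frequent there as in the other streams.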
Abstract:
Background: Surprisingly, opinion about whether men are suited to the nursing profession remains divided. Men enter the profession for a multitude of reasons, yet barriers, whether emotional, verbal or sexual, are still present. Aim: The aim of this study was to examine the experience of men “training” to be registered nurses within a regional New Zealand context. Design: A Narrative Analysis approach was used. Participants: Five New Zealand men currently undertaking their Bachelor of Nursing degree at a regional tertiary institute were interviewed about their experiences of what it meant to be a man in “training”. Method: A thematic analysis was undertaken, guided by an understanding of the way personal narratives inform the human sciences, especially within the context of nursing praxis. Results: Four key themes were identified: a career with flexibility and promise; perceived gender inequality in providing care; developing professional boundaries with female colleagues; and being unique has its advantages. Conclusion: The men in this study were attracted to the profession by career stability and advancement; the opportunities for travel also figured highly. At times they felt excluded and marginalised because of their minority status within their group and the feminine nature of the curriculum. The men attempted to dispel the myths surrounding sexual stereotypes of male nurses. Some of the students behaved in ways intended to assert their heterosexuality. The students in this study sensed their vulnerability in choosing nursing as a career. However, all the participants saw nursing as a viable and portable career in terms of advancement and travel.
Abstract:
Strain-based failure criteria have several advantages over stress-based failure criteria: they can account for elastic and inelastic strains, they rely on directly observable effects instead of inferred effects (strain gauges vs. stress estimates), and they can model complete stress-strain curves, including pre-peak non-linear elasticity and post-peak strain weakening. In this study, a strain-based failure criterion derived from thermodynamic first principles, utilising the concepts of continuum damage mechanics, is presented. Furthermore, implementation of this failure criterion in a finite-element simulation is demonstrated and applied to the stability of coal pillars in underground mining. In numerical studies, pillar strength is usually expressed in terms of critical stresses or stress-based failure criteria, where scaling with pillar width and height is common. Previous publications have employed the finite-element method for pillar stability analysis using stress-based failure criteria such as Mohr-Coulomb and Hoek-Brown, or stress-based scalar damage models. A novel constitutive material model, which takes into consideration anisotropy as well as elastic strain and damage as state variables, has been developed and is presented in this paper. The damage threshold and its evolution are strain-controlled, and coupling of the state variables is achieved through the damage-induced degradation of the elasticity tensor. This material model is implemented in the finite-element software ABAQUS and can be applied to 3D problems. Initial results show that this new material model is capable of describing the non-linear behaviour of geomaterials commonly observed before peak strength is reached, as well as post-peak strain softening. Furthermore, it is demonstrated that the model can account for directional dependency of failure behaviour (i.e. anisotropy) and has the potential to be extended to environmental controls such as temperature or moisture.
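The idea of a strain-controlled damage threshold that degrades stiffness can be sketched in one dimension. This is a textbook scalar damage law with linear softening, not the anisotropic model of the paper; the threshold strain `kappa0`, the failure strain `kappa_f`, the modulus `E`, and the strain history are hypothetical values chosen for illustration.

```python
def damage(kappa, kappa0, kappa_f):
    """Scalar damage D in [0, 1] driven by the largest strain reached so far (linear softening)."""
    if kappa <= kappa0:
        return 0.0  # below the strain threshold: undamaged, purely elastic
    if kappa >= kappa_f:
        return 1.0  # fully damaged: no remaining stiffness
    return kappa_f * (kappa - kappa0) / (kappa * (kappa_f - kappa0))


def stress_path(strains, E, kappa0, kappa_f):
    """Stress history sigma = (1 - D) * E * eps for a prescribed 1D strain history."""
    kappa = 0.0  # history variable: largest strain magnitude seen so far
    stresses = []
    for eps in strains:
        kappa = max(kappa, abs(eps))  # damage never heals on unloading
        D = damage(kappa, kappa0, kappa_f)
        stresses.append((1.0 - D) * E * eps)  # damage degrades the elastic stiffness
    return stresses


E = 2.0e9        # Young's modulus in Pa (hypothetical)
kappa0 = 0.002   # strain at damage onset (hypothetical)
kappa_f = 0.010  # strain at complete failure (hypothetical)
strains = [0.001, 0.002, 0.006, 0.010, 0.004]
stresses = stress_path(strains, E, kappa0, kappa_f)
```

In this sketch the stress peaks when the strain reaches `kappa0` and then falls linearly to zero at `kappa_f`, and unloading after failure carries no stress because the history variable keeps the damage at 1.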
Abstract:
Protein adsorption at solid-liquid interfaces is critical to many applications, including biomaterials, protein microarrays and lab-on-a-chip devices. Despite this general interest, and a large amount of research over the last half century, protein adsorption cannot be predicted with engineering-level, design-oriented accuracy. Here we describe a Biomolecular Adsorption Database (BAD), freely available online, which archives published protein adsorption data. Piecewise linear regression with a breakpoint, applied to the data in the BAD, suggests that the input variables to protein adsorption, i.e., protein concentration in solution; protein descriptors derived from primary structure (number of residues, global protein hydrophobicity and range of amino acid hydrophobicity, isoelectric point); surface descriptors (contact angle); and fluid environment descriptors (pH, ionic strength), correlate well with the output variable, the protein concentration on the surface. Furthermore, neural network analysis revealed that the size of the BAD makes it sufficiently representative, with a neural network-based predictive error of 5% or less. Interestingly, a consistently better fit is obtained if the BAD is divided into two separate subsets representing protein adsorption on hydrophilic and hydrophobic surfaces, respectively. Based on these findings, selected entries from the BAD have been used to construct neural network-based estimation routines, which predict the amount of adsorbed protein, the thickness of the adsorbed layer and the surface tension of the protein-covered surface. While the BAD is of general interest, the predictions of the thickness and the surface tension of the protein-covered layers are of particular relevance to the design of microfluidic devices.
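A minimal sketch of piecewise linear regression with a single breakpoint, for one input variable, follows. The abstract does not state which variant was applied to the BAD, so this is an assumed one: each candidate split of the sorted data gets two independent least-squares lines, and the split with the lowest total squared error is kept. The data points are synthetic and are not taken from the database.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_with_breakpoint(xs, ys, min_points=2):
    """Try every split of the x-sorted data and keep the one with the lowest total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:i], pairs[i:]
        left_fit = fit_line(*zip(*left))
        right_fit = fit_line(*zip(*right))
        total = left_fit[2] + right_fit[2]
        if best is None or total < best["sse"]:
            best = {
                # Assumption: report the breakpoint midway between the two neighbouring x values.
                "breakpoint": (left[-1][0] + right[0][0]) / 2,
                "left": left_fit[:2],    # (intercept, slope) below the breakpoint
                "right": right_fit[:2],  # (intercept, slope) above the breakpoint
                "sse": total,
            }
    return best


# Synthetic data: y = 2x up to x = 4, then y = 20 - x.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 2, 4, 6, 8, 15, 14, 13, 12, 11]
model = fit_with_breakpoint(xs, ys)
```

Fitting the two segments independently allows a jump at the breakpoint; a continuous variant would instead fit a single model with a hinge term.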
Abstract:
Over the last few years, investigations of human epigenetic profiles have identified the key elements of change as histone modifications, stable and heritable DNA methylation, and chromatin remodelling. These factors determine gene expression levels and characterise conditions leading to disease. Data mining and pattern recognition tools are widely used to extract information embedded in long DNA sequences, but to date there have been only limited efforts to analyse epigenetic changes and their role as catalysts in disease onset. Useful insight, however, can be gained by investigating the associated dinucleotide distributions. The focus of this paper is to explore the frequencies of specific dinucleotides across defined regions within the human genome, and to identify new patterns linking epigenetic mechanisms and DNA content. Signal processing methods, including Fourier and wavelet transforms, are employed and the principal results are reported.
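As an illustration of the first step, the sketch below counts overlapping dinucleotides and turns the frequency of one dinucleotide into a windowed signal along a sequence, which is the kind of series a Fourier or wavelet transform could then take as input. The window size, step, and the sequence itself are made up, and the paper's actual regions and preprocessing are not described in the abstract.

```python
from collections import Counter


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide in seq (pairs with non-ACGT symbols are skipped)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    counts = Counter(pairs)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def windowed_frequency(seq, dinucleotide, window, step):
    """Frequency of one dinucleotide in successive windows, giving a signal along the sequence."""
    return [
        dinucleotide_frequencies(seq[start:start + window]).get(dinucleotide, 0.0)
        for start in range(0, len(seq) - window + 1, step)
    ]


region = "ACGCGCGTATATATCGCG"  # made-up sequence, not genomic data
freqs = dinucleotide_frequencies(region)
cg_signal = windowed_frequency(region, "CG", window=6, step=6)
```

Here the CG-rich ends and the AT-rich middle of the toy sequence show up as high and zero values in `cg_signal`.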
Abstract:
Background: The requirement for dual screening of titles and abstracts to select papers for full-text examination can create a huge workload, not least when the topic is complex and a broad search strategy is required, resulting in a large number of results. An automated system to reduce this burden, while still assuring high accuracy, has the potential to provide huge efficiency savings within the review process. Objectives: To undertake a direct comparison of manual screening with a semi-automated process (priority screening) using a machine classifier. The research is being carried out as part of the current update of a population-level public health review. Methods: Authors have hand-selected studies for the review update, in duplicate, using the standard Cochrane Handbook methodology. A retrospective analysis, simulating a quasi-‘active learning’ process (whereby a classifier is repeatedly trained on ‘manually’ labelled data), will be completed using different starting parameters. Tests will be carried out to see how far different training sets, and the size of the training set, affect classification performance, i.e. what percentage of papers would need to be manually screened to locate 100% of the papers included by the traditional manual method. Results: From a search retrieval set of 9555 papers, authors excluded 9494 papers at title/abstract and 52 at full text, leaving 9 papers for inclusion in the review update. The ability of the machine classifier to reduce the percentage of papers that need to be manually screened to identify all the included studies, under different training conditions, will be reported. Conclusions: The findings of this study will be presented along with an estimate of any efficiency gains for the author team if the screening process can be semi-automated using text mining methodology, and a discussion of the implications for text mining in screening papers within complex health reviews.
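The outcome measure described in the Methods, the percentage of papers that must be screened to locate 100% of the included ones, can be computed from any classifier ranking. The sketch below assumes the classifier has already produced a relevance score per paper; the scores and labels are hypothetical, and the retraining loop of the quasi-active-learning simulation is not reproduced.

```python
def screening_needed_for_full_recall(scores, included):
    """Fraction of records to screen, in descending score order, before every included record is found."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Rank (1-based) of the lowest-scored included record; raises ValueError if nothing is included.
    last_hit = max(rank for rank, i in enumerate(order, start=1) if included[i])
    return last_hit / len(scores)


# Hypothetical classifier scores and manual inclusion decisions for ten records.
scores = [0.91, 0.15, 0.78, 0.05, 0.62, 0.33, 0.08, 0.54, 0.21, 0.11]
included = [True, False, True, False, False, True, False, False, False, False]
fraction = screening_needed_for_full_recall(scores, included)
```

In this toy ranking the last included record sits at rank 5 of 10, so half of the records would need manual screening to reach full recall.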
Abstract:
This article analyses and compares Twitter activity for the niche sport of netball over the 2013 trans-Tasman ANZ Championship competition and the international Commonwealth Games event in 2014. Patterns within the Twitter data that were discovered through an analysis of the 2013 ANZ Championship season are considered in terms of the Commonwealth Games, and thus compared between a quasi-domestic and an international context. In particular, we highlight the extent to which niche sports such as netball attempt to capitalise on the opportunities provided by social media, and the challenges involved in coordinating event-specific hashtags, such as the #netball2014 hashtag promoted by the Commonwealth Games Federation.
Abstract:
This research contributes novel techniques for identifying and evaluating business process risks and analysing human resource behaviour. The developed techniques use predefined indicators to identify process risks in individual process instances, evaluate overall process risk, predict process outcomes and analyse human resource behaviour based on the analysis of information about process executions recorded in event logs by information systems. The results of this research can help managers to more accurately evaluate the risk exposure of their business processes, to more objectively evaluate the performance of their employees, and to identify opportunities for improvement of resource and process performance.
Abstract:
In this paper we illustrate a set of features of the Apromore process model repository for analyzing business process variants. Two types of analysis are provided: one is static and based on differences in the process control flow; the other is dynamic and based on differences in process behavior between the variants. These features combine techniques for the management of large process model collections with those for mining process knowledge from process execution logs. The tool demonstration will be useful for researchers and practitioners working on large process model collections and process execution logs, and specifically for those with an interest in understanding, managing and consolidating business process variants both within and across organizational boundaries.
Abstract:
This article investigates whether participation on Twitter during Toronto’s 2014 WorldPride festival facilitated challenges to heteronormativity through increased visibility, connections, and messages about LGBTQ people. Analysis of 68,231 tweets found that surges in activity using WorldPride hashtags, connections among users, and the circulation of affective content with common symbols made celebrations visible. However, the platform’s features catered to politicians, celebrities, and advertisers in ways that accentuated self-promotional, local, and often banal content, overshadowing individual users and the festival’s global mandate. By identifying Twitter’s limits in fostering the visibility of users and messages that circulate nonnormative discourses, this study makes way for future research identifying alternative platform dynamics that can enhance the visibility of diversity.
Abstract:
This paper proposes the Clinical Pathway Analysis Method (CPAM) approach that enables the extraction of valuable organisational and medical information on past clinical pathway executions from the event logs of healthcare information systems. The method deals with the complexity of real-world clinical pathways by introducing a perspective-based segmentation of the date-stamped event log. CPAM enables the clinical pathway analyst to effectively and efficiently acquire a profound insight into the clinical pathways. By comparing the specific medical conditions of patients with the factors used for characterising the different clinical pathway variants, the medical expert can identify the best therapeutic option. Process mining-based analytics enables the acquisition of valuable insights into clinical pathways, based on the complete audit traces of previous clinical pathway instances. Additionally, the methodology is suited to assess guideline compliance and analyse adverse events. Finally, the methodology provides support for eliciting tacit knowledge and providing treatment selection assistance.
Abstract:
Rolling-element bearing failures are the most frequent problems in rotating machinery; they can be catastrophic and cause major downtime. Hence, providing advance failure warning and precise fault detection in such components is pivotal and cost-effective. The vast majority of past research has focused on signal processing and spectral analysis for fault diagnostics in rotating components. In this study, a data mining approach using a machine learning technique called anomaly detection (AD) is presented. This method employs classification techniques to discriminate between defect examples. Two features, kurtosis and the Non-Gaussianity Score (NGS), are extracted to develop anomaly detection algorithms. The performance of the developed algorithms was examined using real data from a test-to-failure bearing experiment. Finally, anomaly detection is compared with a popular method, the Support Vector Machine (SVM), to investigate the sensitivity and accuracy of this approach and its ability to detect anomalies at an early stage.
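The kurtosis feature can be used in a very simple anomaly detector, sketched below. This is an assumed scheme, not the algorithm of the study: it fits the mean and standard deviation of kurtosis over windows presumed healthy and flags new windows that deviate by more than `z_threshold` standard deviations. The NGS feature and the SVM comparison are not reproduced because the abstract does not define them, and the vibration windows are toy values.

```python
from statistics import mean, pstdev


def kurtosis(xs):
    """Population kurtosis (fourth standardised moment); about 3 for Gaussian data."""
    m = mean(xs)
    variance = sum((x - m) ** 2 for x in xs) / len(xs)
    fourth_moment = sum((x - m) ** 4 for x in xs) / len(xs)
    return fourth_moment / variance ** 2  # undefined for a constant window (zero variance)


def flag_anomalies(baseline_windows, new_windows, z_threshold=3.0):
    """Flag windows whose kurtosis lies more than z_threshold baseline standard deviations from the baseline mean."""
    baseline = [kurtosis(w) for w in baseline_windows]
    mu, sigma = mean(baseline), pstdev(baseline)
    return [abs(kurtosis(w) - mu) > z_threshold * sigma for w in new_windows]


# Toy vibration windows (hypothetical): smooth baseline signals and one impulsive window.
healthy = [[-1, 1, -1, 1], [-2, -1, 1, 2], [-1, 0, 0, 1]]
incoming = [[-1, 1, -1, 1], [0, 0, 0, 0, 0, 0, -5, 5]]
flags = flag_anomalies(healthy, incoming)
```

Impulsive content, such as the spikes a developing bearing defect produces, raises kurtosis sharply, which is why the second incoming window is flagged while the first is not.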
Abstract:
Analysing wastewater samples is an innovative approach that overcomes many limitations of traditional surveys in identifying and measuring a range of chemicals that people living in a sewer catchment area have consumed or been exposed to. Since the approach was first conceptualised in 2001, much progress has been made in making wastewater analysis (WWA) a reliable and robust tool for measuring chemical consumption and/or exposure. At the moment, the most popular application of WWA, sometimes referred to as sewage epidemiology, is to monitor the consumption of illicit drugs in communities around the globe, including China. The approach has been largely adopted by law enforcement agencies as a means of monitoring the temporal and geographical patterns of drug consumption. In the future, the methodology can be extended to other chemicals, including biomarkers of population health (e.g. environmental or oxidative stress biomarkers, lifestyle indicators or medications that are taken by different demographic groups) and pollutants that people are exposed to (e.g. polycyclic aromatic hydrocarbons, perfluorinated chemicals, and toxic pesticides). The extension of WWA to a huge range of chemicals may give rise to a field called sewage chemical-information mining (SCIM), with unexplored potential. China has many densely populated cities with thousands of sewage treatment plants, which are favourable for applying WWA/SCIM to help relevant authorities gather information about illicit drug consumption and population health status. However, some prerequisites and uncertainties of the methodology should be addressed for SCIM to reach its full potential in China.