962 results for Emerging pattern mining
Abstract:
Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees, called the balanced-optimal-search canonical form (BOCF), that handles the isomorphism problem efficiently. Using BOCF, we define a tree-structure-guided, scheme-based enumeration approach that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees in a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER with that of two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
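To illustrate why a canonical form settles the isomorphism question for unordered trees, the sketch below uses a simple sorted-string encoding. This is not BOCF, whose definition is not given in the abstract; the `(label, children)` tuple representation and the function name `canonical_form` are assumptions made for this example, and labels are assumed not to contain parentheses.

```python
from typing import List, Tuple

# Hypothetical representation: a rooted labelled tree is (label, [child trees]).
Tree = Tuple[str, List["Tree"]]


def canonical_form(tree: Tree) -> str:
    """Return a string that is identical for isomorphic unordered trees."""
    label, children = tree
    # Encode each child first, then sort the encodings so that the order
    # in which siblings were stored no longer affects the result.
    encoded = sorted(canonical_form(child) for child in children)
    return label + "(" + "".join(encoded) + ")"


# t1 and t2 are the same unordered tree stored with different sibling order.
t1 = ("a", [("b", []), ("c", [("d", []), ("e", [])])])
t2 = ("a", [("c", [("e", []), ("d", [])]), ("b", [])])
# t3 has the same labels but a different parent-child structure.
t3 = ("a", [("b", [("d", [])]), ("c", [("e", [])])])

print(canonical_form(t1) == canonical_form(t2))  # True
print(canonical_form(t1) == canonical_form(t3))  # False
```

With such an encoding, two candidate subtrees can be compared by string equality, so a miner can count occurrences of a pattern without testing isomorphism pair by pair.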
Abstract:
This paper presents an algorithm for mining unordered embedded subtrees using the balanced-optimal-search canonical form (BOCF). A tree-structure-guided, scheme-based enumeration approach is defined using BOCF to systematically enumerate only the valid subtrees. Based on this canonical form and enumeration technique, the balanced optimal search embedded subtree mining algorithm (BEST) is introduced for mining embedded subtrees from a database of labelled rooted unordered trees. Extensive experiments on both synthetic and real datasets demonstrate the efficiency of BEST relative to two state-of-the-art algorithms for mining embedded unordered subtrees, SLEUTH and U3.
Abstract:
Identifying product families is considered an effective way to accommodate increasing product variety across diverse market niches. In this paper, we propose a novel framework for identifying product families that applies a similarity measure to a common form of product design data, the Bill of Materials (BOM), using data mining techniques such as frequent pattern mining and clustering. To calculate the similarity between BOMs, we introduce a novel Extended Augmented Adjacency Matrix (EAAM) representation that captures not only the content and topology of a product design but also the frequent structural dependencies among its parts. The EAAM representations of BOMs are compared to calculate the similarity between products, and the resulting similarities are used as clustering input to group the products into families. When applied to real-life manufacturing data, the proposed framework outperforms a current baseline that uses orthogonal Procrustes analysis for grouping product families.
Abstract:
Guaranteeing the quality of relevance features discovered in text documents to describe user preferences is a major challenge because of the large number of terms and data patterns involved. Most popular text mining and classification methods adopt term-based approaches, all of which suffer from the problems of polysemy and synonymy. It has long been hypothesised that pattern-based methods should perform better than term-based ones in describing user preferences; yet how to use large-scale patterns effectively remains a hard problem in text mining. To address this issue, this paper presents an innovative model for relevance feature discovery. It discovers both positive and negative patterns in text documents as higher-level features and deploys them over low-level features (terms). It also classifies terms into categories and updates term weights based on their specificity and their distributions in patterns. Substantial experiments using this model on RCV1, TREC topics and Reuters-21578 show that the proposed model significantly outperforms both state-of-the-art term-based methods and pattern-based methods.
Abstract:
Smart Card Automated Fare Collection (AFC) data has been extensively exploited to understand passenger behaviour, passenger segments and trip purpose, and to improve transit planning through spatial travel pattern analysis. The literature has evolved from simple to more sophisticated methods, such as from aggregated to individual travel pattern analysis and from stop-to-stop to flexible stop aggregation. However, high computational complexity has limited the practical application of these methods. This paper proposes a new algorithm named Weighted Stop Density Based Scanning Algorithm with Noise (WS-DBSCAN), based on the classical Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, to detect and update daily changes in travel patterns. WS-DBSCAN converts the quadratic computational complexity of classical DBSCAN into a problem of sub-quadratic complexity. A numerical experiment using real AFC data from South East Queensland, Australia shows that the algorithm requires only 0.45% of the computation time of classical DBSCAN while producing the same clustering results.
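The idea of weighting stops can be sketched as follows. This is not WS-DBSCAN as published, and it does not reproduce the sub-quadratic complexity claimed in the abstract; it is a minimal weighted variant of DBSCAN, under the assumption that repeated boardings at the same stop can be collapsed into one point with a weight. The function name `weighted_dbscan`, the parameter `min_weight` and the sample coordinates are all hypothetical.

```python
from collections import Counter
from math import hypot


def weighted_dbscan(points, weights, eps, min_weight):
    """Cluster 2-D points; a point is a core point when the total weight
    within distance eps (including itself) is at least min_weight.
    Returns one cluster label per point, with -1 marking noise."""
    n = len(points)

    def neighbours(i):
        xi, yi = points[i]
        return [j for j in range(n)
                if hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if sum(weights[j] for j in nbrs) < min_weight:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs_j = neighbours(j)
            if sum(weights[k] for k in nbrs_j) >= min_weight:
                queue.extend(nbrs_j)  # j is also a core point: keep expanding
    return labels


# Hypothetical boarding records: one coordinate pair per smart card tap.
boardings = [(0.0, 0.0)] * 5 + [(0.1, 0.0)] * 3 + [(5.0, 5.0)] * 4 + [(9.0, 0.0)]
stop_counts = Counter(boardings)  # 13 records collapse to 4 weighted stops
stops = list(stop_counts)
weights = [stop_counts[s] for s in stops]

labels = weighted_dbscan(stops, weights, eps=0.5, min_weight=4)
print(dict(zip(stops, labels)))
```

Collapsing duplicates reduces the number of points whose neighbourhoods must be searched, which is one plausible source of savings when many trips share the same stops.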
Abstract:
The mining industry presents a number of ideal applications for sensor-based machine control because of the unstructured environment that exists within each mine. The aim of the research presented here is to increase the productivity of existing large compliant mining machines by retrofitting them with enhanced sensing and control technology. The current research focusses on the automatic control of the swing motion cycle of a dragline and on an automated roof bolting system. We have achieved: (1) closed-loop swing control of a one-tenth scale model dragline; and (2) single-degree-of-freedom closed-loop visual control of an electro-hydraulic manipulator, developed in the lab from standard components.
Abstract:
In recent years a significant amount of research has been undertaken in collision avoidance and personnel location technology in order to reduce the number of incidents involving pedestrians and mobile plant equipment, which are a high risk in underground coal mines. Improving the visibility of pedestrians to drivers would potentially reduce the likelihood of these incidents. In the road safety context, a variety of approaches have been used to make pedestrians more conspicuous to drivers at night (including vehicle and roadway lighting technologies and night vision enhancement systems). However, emerging research from our group and others has demonstrated that clothing incorporating retroreflective markers on the moveable joints as well as the torso can provide highly significant improvements in pedestrian visibility in reduced illumination. Importantly, retroreflective markers are most effective when positioned on the moveable joints, creating a sensation of “biological motion”. Based only on the motion of points on the moveable joints of an otherwise invisible body, observers can quickly recognise a walking human form, and even correctly judge characteristics such as gender and weight. An important and as yet unexplored question is whether the benefits of these retroreflective clothing configurations translate to the context of mining, where workers operate under low light conditions. The fact that biomotion clothing benefits both young and older drivers, as well as those with various eye conditions common in people over 50 years of age, reinforces its potential application in the mining industry, which employs many workers in this age bracket. This paper will summarise the visibility benefits of retroreflective markers in a biomotion configuration for the mining industry, highlighting that this form of clothing has the potential to be an affordable and convenient way to provide a sizeable safety benefit. It does not involve modifications to vehicles, drivers, or infrastructure. Instead, adding biomotion markings to standard retroreflective vests can enhance the night-time conspicuity of mining workers by capitalising on perceptual capabilities that have already been well documented.
Abstract:
This chapter addresses a topic of growing significance to green criminology - the harmful effects of mining on local communities and the environment (Ruggiero and South 2013; White 2013a). While mining has long been recognised as an agent of environmental harm (White 2013a), less recognised is that its global expansion also has harmful effects on localised patterns of violence, work and community life in mining towns. Australia provides an excellent case study for exploring some of these mining impacts.
Abstract:
Introduction and Aims: Wastewater analysis provides a non-intrusive way of measuring drug use within a population. We used this approach to determine daily use of conventional illicit drugs [cannabis, cocaine, methamphetamine and 3,4-methylenedioxymethamphetamine (MDMA)] and emerging illicit psychostimulants (benzylpiperazine, mephedrone and methylone) in two consecutive years (2010 and 2011) at an annual music festival. Design and Methods: Daily composite wastewater samples, representative of the festival, were collected from the on-site wastewater treatment plant and analysed for drug metabolites. Data from the 2 years were compared using the Wilcoxon matched-pair test. Data from the 2010 festival were compared with data collected at the same time from a nearby urban community using equivalent methods. Results: Conventional illicit drugs were detected in all samples, whereas emerging illicit psychostimulants were found only on specific days. The estimated per capita consumption of MDMA, cocaine and cannabis was similar between the two festival years. Statistically significant (P < 0.05; Z = −2.0–2.2) decreases were observed in the use of methamphetamine and one emerging illicit psychostimulant (benzylpiperazine). Only consumption of MDMA was elevated at the festival compared with the nearby urban community. Discussion and Conclusions: Rates of substance use at this festival remained relatively consistent over the two monitoring years. Compared with the urban community, drug use among festival goers was elevated only for MDMA, confirming its popularity in music settings. Our study demonstrated that wastewater analysis can objectively capture changes in substance use at a music setting without raising major ethical issues. It could potentially allow effective assessment of drug prevention strategies in such settings in the future.
Abstract:
Big Data and predictive analytics have received significant attention from the media and the academic literature over the past few years, and it is likely that these emerging technologies will materially impact the mining sector. This short communication argues, however, that these technological forces will probably unfold differently in the mining industry than they have in many other sectors because of significant differences in the marginal cost of data capture and storage. To this end, we offer a brief overview of what Big Data and predictive analytics are, and explain how they are bringing about changes in a broad range of sectors. We discuss the “N=all” approach to data collection being promoted by many consultants and technology vendors but, by considering the economic and technical realities of data acquisition and storage, we then explain why a “n « all” data collection strategy probably makes more sense for the mining sector. Finally, to help shape the industry’s policies with regard to technology-related investments in this area, we conclude by putting forward a conceptual model for leveraging Big Data tools and analytical techniques that is a more appropriate fit for the mining sector.
Abstract:
Cat's claw creeper, Dolichandra unguis-cati (L.) L.G. Lohman (syn: Macfadyena unguis-cati (L.) A.H. Gentry) (Bignoniaceae), a major environmental weed in Queensland and New South Wales, is a Weed of National Significance and an approved target for biological control. A leaf-mining jewel beetle, Hylaeogena jureceki Obenberger (Coleoptera: Buprestidae), first collected in 2002 from D. unguis-cati in Brazil and Argentina, was imported from South Africa into a quarantine facility in Brisbane in 2009 for host-specificity testing. H. jureceki adults chew holes in leaves and lay eggs on leaf margins and the emerging larvae mine within the leaves of D. unguis-cati. The generation time (egg to adult) of H. jureceki under quarantine conditions was 55.4 ± 0.2 days. Host-specificity trials conducted in Australia on 38 plant species from 11 families supplement and support South African studies which indicated that H. jureceki is highly host-specific and does not pose a risk to any non-target plant species in Australia. In no-choice tests, adults survived significantly longer (>32 weeks) on D. unguis-cati than on non-target test plant species (<3 weeks). Oviposition occurred on D. unguis-cati and one non-target test plant species, Citharexylum spinosum (Verbenaceae), but no larval development occurred on the latter species. In choice tests involving D. unguis-cati, C. spinosum and Avicennia marina (Avicenniaceae), feeding and oviposition were evident only on D. unguis-cati. The insect was approved for field release in Australia in May 2012.
Abstract:
With the development of wearable and mobile computing technology, more and more people are using sleep-tracking tools to collect personal sleep data on a daily basis, with the aim of understanding and improving their sleep. Although sleep quality is influenced by many factors in a person’s lifestyle context, such as exercise, diet and steps walked, existing tools simply visualise sleep data on a dashboard rather than analysing those data in combination with contextual factors. Hence many people find it difficult to make sense of their sleep data. In this paper, we present a cloud-based intelligent computing system named SleepExplorer that incorporates sleep domain knowledge and association rule mining for automated analysis of personal sleep data in light of contextual factors. Experiments show that the same contextual factors can play different roles in the sleep of different people, and that SleepExplorer can help users discover the factors most relevant to their personal sleep.
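Association rule mining over daily records can be sketched as below. This is a generic support and confidence calculation, not the SleepExplorer implementation; the item names, the `mine_rules` function, the thresholds and the six sample days are all invented for illustration.

```python
from itertools import combinations


def mine_rules(transactions, target, min_support, min_confidence, max_len=2):
    """Find rules 'antecedent -> target' over a list of item sets.

    support    = fraction of all days containing antecedent and target
    confidence = fraction of days with the antecedent that also have target
    """
    n = len(transactions)
    items = sorted({i for t in transactions for i in t if i != target})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            a = set(antecedent)
            n_a = sum(1 for t in transactions if a <= t)
            if n_a == 0:
                continue  # antecedent never occurs; confidence is undefined
            n_both = sum(1 for t in transactions if a <= t and target in t)
            support = n_both / n
            confidence = n_both / n_a
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules


# Hypothetical data: each set lists what was recorded on one day.
days = [
    {"exercise", "good_sleep"},
    {"exercise", "low_caffeine", "good_sleep"},
    {"late_caffeine"},
    {"exercise", "late_caffeine"},
    {"low_caffeine", "good_sleep"},
    {"exercise", "low_caffeine", "good_sleep"},
]

rules = mine_rules(days, "good_sleep", min_support=0.3, min_confidence=0.7)
for antecedent, support, confidence in rules:
    print(antecedent, round(support, 2), round(confidence, 2))
```

Running the same procedure separately on each person's records is one way the same factor could surface as a strong rule for one user and not for another.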
Abstract:
Acoustics is a rich source of environmental information that can reflect ecological dynamics. To deal with escalating volumes of acoustic data, a variety of automated classification techniques have been used for acoustic pattern or scene recognition, covering urban soundscapes such as streets and restaurants and natural soundscapes such as rain and thunder. Acoustic patterns are commonly classified under the assumption that a single type of soundscape is present in an audio clip. This assumption is reasonable for some carefully selected audio clips. However, few experiments have focused on classifying simultaneous acoustic patterns in long-duration recordings. This paper proposes a binary-relevance-based multi-label classification approach to recognise simultaneous acoustic patterns in one-minute audio clips. By using acoustic indices as global features and a multilayer perceptron as the base classifier, we achieve good classification performance on in-the-field data. Compared with single-label classification, the multi-label approach provides more detailed information about the distributions of various acoustic patterns in long-duration recordings. These results merit further biodiversity investigations, such as bird species surveys.
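Binary relevance decomposes a multi-label problem into one independent binary problem per label. The sketch below shows that decomposition only. Where the abstract uses a multilayer perceptron as the base classifier, this example substitutes a simple nearest-centroid classifier to stay self-contained; the class and function names, the two-dimensional feature vectors and the label names are assumptions.

```python
class CentroidClassifier:
    """Stand-in binary base classifier (the abstract uses an MLP instead).
    Predicts the class whose mean feature vector is closer to the input.
    Assumes the training data contains both positive and negative examples."""

    def fit(self, X, y):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, label in zip(X, y) if label == cls]
            self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda cls: dist2(self.centroids[cls]))


def fit_binary_relevance(X, Y, label_names):
    """Train one independent binary classifier per label (column of Y)."""
    return {
        name: CentroidClassifier().fit(X, [row[k] for row in Y])
        for k, name in enumerate(label_names)
    }


def predict_labels(models, x):
    """Return the set of labels whose binary classifier fires on x."""
    return {name for name, model in models.items() if model.predict_one(x) == 1}


# Hypothetical clips: two made-up acoustic index values per one-minute clip,
# with a 0/1 indicator per label showing which patterns are present.
label_names = ["birds", "rain"]
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8],
     [0.9, 0.9], [0.8, 0.8], [0.1, 0.1], [0.2, 0.2]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1],
     [1, 1], [1, 1], [0, 0], [0, 0]]

models = fit_binary_relevance(X, Y, label_names)
print(predict_labels(models, [0.85, 0.85]))  # both patterns present
print(predict_labels(models, [0.9, 0.15]))   # birds only
print(predict_labels(models, [0.1, 0.1]))    # neither pattern
```

Because each label gets its own classifier, a clip can receive zero, one or several labels, which is what allows simultaneous patterns to be reported.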