775 results for mining data streams
Abstract:
Road networks are a national critical infrastructure. The road assets need to be monitored and maintained efficiently as their conditions deteriorate over time. The condition of one such asset, road pavement, plays a major role in road network maintenance programmes. Pavement conditions depend upon many factors such as pavement type, traffic and environmental conditions. This paper presents a data analytics case study for assessing the factors affecting the pavement deflection values measured by the traffic speed deflectometer (TSD) device. The analytics process includes acquisition and integration of data from multiple sources, data pre-processing, mining useful information from the data and utilising data mining outputs for knowledge deployment. Data mining techniques are able to show how TSD outputs vary across different road, traffic and environmental conditions. The generated data mining models map the TSD outputs to a set of classes and define correction factors for each class.
Abstract:
Term-based approaches can extract many features from text documents, but most include noise. Many popular text-mining strategies have been adapted to reduce noisy information in extracted features; however, these techniques suffer from the low-frequency problem. The key issue is how to discover relevance features in text documents to fulfil user information needs. To address this issue, we propose a new method to extract specific features from user relevance feedback. The proposed approach includes two stages. The first stage extracts topics (or patterns) from text documents to focus on interesting topics. In the second stage, topics are deployed to lower-level terms to address the low-frequency problem and find specific terms. The specific terms are determined based on their appearances in relevance feedback and their distribution in topics or high-level patterns. We test our proposed method with extensive experiments on the Reuters Corpus Volume 1 dataset and TREC topics. Results show that our proposed approach significantly outperforms the state-of-the-art models.
Abstract:
The practices and public reputation of mining have been changing over time. In the past, mining operations frequently stood accused of being socially and environmentally disruptive, whereas mining today invests heavily in ‘socially responsible’ and ‘sustainable’ business practices. Changes such as these can be witnessed internationally as well as in places like Western Australia (WA), where the mining sector has matured into an economic pillar of the state, and indeed the nation in the context of the recent resources boom. This paper explores the role of mining in WA, presenting a multi-disciplinary perspective on the sector's contribution to sustainable development in the state. The perspectives offered here are drawn from community-based research and the associated academic literature as well as data derived from government sources and the not-for-profit sector. Findings suggest that despite noteworthy attitudinal and operational improvements in the industry, social, economic and environmental problem areas remain. As mining in WA is expected to grow in the years to come, these problem areas require the attention of business and government alike to ensure the long-term sustainability of development as well as people and place.
Abstract:
This paper draws upon Hubbard's (1999, p. 57) term ‘scary heterosexualities,’ that is, non-normative heterosexuality, in the context of the rural, drawing on data from fieldwork in the remote Western Australian mining town of Kalgoorlie. Our focus is ‘the skimpie’, a barmaid who serves in her underwear and who, in both historical and contemporary times, is strongly associated with rural mining communities. Interviews with skimpies and local residents as well as participant observation reveal how potential fears and anxieties about skimpies are managed. We identify the discursive and spatial processes by which skimpie work is contained in Kalgoorlie so that the potential scariness ‘the skimpie’ represents to the rural is muted and buttressed in terms of a more conventional and less threatening rural heterosexuality.
Abstract:
This study takes a step towards improving the performance of discovering useful knowledge in databases, specifically association rules. The thesis proposes an approach that uses granules instead of patterns to represent knowledge implicitly contained in relational databases, and a multi-tier structure to interpret association rules in terms of granules. Association mappings are proposed for the construction of the multi-tier structure. With these tools, association rules can be quickly assessed and meaningless association rules can be justified according to the association mappings. The experimental results indicate that the proposed approach is promising.
Abstract:
In studies using macroinvertebrates as indicators for monitoring rivers and streams, species level identifications in comparison with lower resolution identifications can have greater information content and result in more reliable site classifications and better capacity to discriminate between sites, yet many such programmes identify specimens to the resolution of family rather than species. This is often because it is cheaper to obtain family level data than species level data. Choice of appropriate taxonomic resolution is a compromise between the cost of obtaining data at high taxonomic resolutions and the loss of information at lower resolutions. Optimum taxonomic resolution should be determined by the information required to address programme objectives. Costs saved in identifying macroinvertebrates to family level may not be justified if family level data cannot give the answers required, and expending the extra cost to obtain species level data may not be warranted if cheaper family level data retains sufficient information to meet objectives. We investigated the influence of taxonomic resolution and sample quantification (abundance vs. presence/absence) on the representation of aquatic macroinvertebrate species assemblage patterns and species richness estimates. The study was conducted in a physically harsh dryland river system (Condamine-Balonne River system, located in south-western Queensland, Australia), characterised by low macroinvertebrate diversity. Our 29 study sites covered a wide geographic range and a diversity of lotic conditions, and this was reflected by differences between sites in macroinvertebrate assemblage composition and richness. The usefulness of expending the extra cost necessary to identify macroinvertebrates to species was quantified via the benefits this higher resolution data offered in its capacity to discriminate between sites and give accurate estimates of site species richness.
We found that very little information (<6%) was lost by identifying taxa to family (or genus), as opposed to species, and that quantifying the abundance of taxa provided greater resolution for pattern interpretation than simply noting their presence/absence. Species richness was very well represented by genus, family and order richness, so that each of these could be used as surrogates of species richness if, for example, surveying to identify diversity hot-spots. It is suggested that sharing of common ecological responses among species within higher taxonomic units is the most plausible mechanism for the results. Based on a cost/benefit analysis, family level abundance data is recommended as the best resolution for resolving patterns in macroinvertebrate assemblages in this system. The relevance of these findings is discussed in the context of other low diversity, harsh, dryland river systems.
Abstract:
The human right to water has recently been recognised by both the United Nations General Assembly and the Human Rights Council. As the mining industry interacts with water on multiple levels, it is important that these interactions respect the human right to water. Currently, a disconnect exists between mine site water management practices and the recognition of water from a human rights perspective. The Minerals Council of Australia (MCA) Water Accounting Framework (WAF) has previously been used to strengthen the connection between water management and human rights. This article extends this connection through the use of a Social Water Assessment Protocol (SWAP). The SWAP is a scoping tool consisting of a set of questions classified into taxonomic themes under leading topics, with suggested sources of data, that enables mine sites to better understand the local water context in which they operate. Three of the themes contained in the SWAP (gender, Indigenous peoples and health) are discussed to demonstrate how the protocol may be useful in assisting mining companies to consider their impacts on the human right to water.
Abstract:
This paper evaluates the suitability of sequence classification techniques for analyzing deviant business process executions based on event logs. Deviant process executions are those that deviate in a negative or positive way with respect to normative or desirable outcomes, such as non-compliant executions or executions that undershoot or exceed performance targets. We evaluate a range of feature types and classification methods in terms of their ability to accurately discriminate between normal and deviant executions both when deviances are infrequent (unbalanced) and when deviances are as frequent as normal executions (balanced). We also analyze the ability of the discovered rules to explain potential causes and contributing factors of observed deviances. The evaluation results show that feature types extracted using pattern mining techniques only slightly outperform those based on individual activity frequency. The results also suggest that more complex feature types ought to be explored to achieve higher levels of accuracy.
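The baseline feature type mentioned above, individual activity frequency, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `activity_frequency_features` and the example event log are hypothetical, and a trace is assumed to be a plain list of activity names.

```python
from collections import Counter


def activity_frequency_features(traces):
    """Map each trace (a list of activity names) to a vector of activity counts."""
    # Fixed, sorted vocabulary so every trace gets a vector of the same length.
    vocabulary = sorted({activity for trace in traces for activity in trace})
    vectors = []
    for trace in traces:
        counts = Counter(trace)  # missing activities count as 0
        vectors.append([counts[activity] for activity in vocabulary])
    return vocabulary, vectors


# Hypothetical event log: each inner list is one process execution.
traces = [
    ["register", "check", "approve"],
    ["register", "check", "check", "reject"],
]
vocabulary, vectors = activity_frequency_features(traces)
```

Vectors of this kind could then be passed to any standard classifier together with a normal/deviant label per trace.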
Abstract:
Although the collection of player and ball tracking data is fast becoming the norm in professional sports, large-scale mining of such spatiotemporal data has yet to surface. In this paper, given an entire season's worth of player and ball tracking data from a professional soccer league (approx 400,000,000 data points), we present a method which can conduct both individual player and team analysis. Due to the dynamic, continuous and multi-player nature of team sports like soccer, a major issue is aligning player positions over time. We present a "role-based" representation that dynamically updates each player's relative role at each frame and demonstrate how this captures the short-term context to enable both individual player and team analysis. We discover role directly from data by utilizing a minimum entropy data partitioning method and show how this can be used to accurately detect and visualize formations, as well as analyze individual player behavior.
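The idea of updating each player's role at every frame can be sketched in a simplified form. The abstract states that roles are discovered from data by minimum entropy data partitioning; the sketch below does not do that. It assumes the role positions (`role_centroids`, a hypothetical name) are already known and assigns players to roles by minimising total Euclidean distance with a brute-force search, which is only practical for a handful of players.

```python
from itertools import permutations
from math import dist


def assign_roles(player_positions, role_centroids):
    """Return a tuple where entry i is the role index assigned to player i.

    Brute-force search over all one-to-one assignments; assumes the two
    lists have the same length and that role_centroids are given.
    """
    n = len(player_positions)
    best_cost, best_assignment = float("inf"), None
    for perm in permutations(range(n)):
        cost = sum(dist(player_positions[i], role_centroids[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return best_assignment


# Hypothetical data: three roles on a line and one frame of player (x, y) positions.
role_centroids = [(-10.0, 0.0), (0.0, 0.0), (10.0, 0.0)]
frame = [(9.0, 1.0), (-8.0, -1.0), (1.0, 2.0)]
roles = assign_roles(frame, role_centroids)
```

Repeating the assignment for every frame yields a role label per player over time, so that positions are aligned by role rather than by player identity.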
Abstract:
Experts with a trained eye can often identify a team based on its unique style of play, as shown in its movement, passing and interactions. In this paper, we present a method which can accurately determine the identity of a team from spatiotemporal player tracking data. We do this by utilizing a formation descriptor which is found by minimizing the entropy of role-specific occupancy maps. We show how our approach is significantly better at identifying different teams compared to standard measures (e.g., shots and passes). We demonstrate the utility of our approach using an entire season of Prozone player tracking data from a top-tier professional soccer league.
Abstract:
This thesis presents an association rule mining approach, association hierarchy mining (AHM). Unlike traditional two-step bottom-up rule mining, AHM adopts a one-step top-down rule mining strategy to improve the efficiency and effectiveness of mining association rules from datasets. The thesis also presents a novel approach to evaluating the quality of knowledge discovered by AHM, which focuses on the information difference between the discovered knowledge and the original datasets. Experiments performed on a real application, characterizing network traffic behaviour, have shown that AHM achieves encouraging performance.
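For contrast, the traditional two-step bottom-up strategy that AHM is compared against can be sketched as follows. This is a simplified, Apriori-style illustration and not the AHM algorithm; the function names, thresholds and transactions are hypothetical, and the candidate generation is deliberately naive.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: level-wise (bottom-up) search for itemsets meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    size = 1
    candidates = [frozenset([item]) for item in items]
    while candidates:
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        size += 1
        # Naive candidate generation: combine items that survived this level.
        survivors = sorted({item for itemset in level for item in itemset})
        candidates = [frozenset(c) for c in combinations(survivors, size)]
    return frequent


def association_rules(frequent, min_confidence):
    """Step 2: split each frequent itemset into antecedent => consequent."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), confidence))
    return rules


# Hypothetical transactions; each one is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.9)
```

The two passes are visible in the structure: itemsets are grown from small to large first, and rules are only derived afterwards from the itemsets that survived.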
Abstract:
This paper examines the social licence to operate (SLO) of Western Australia's (WA's) mining industry in the context of the state's ‘developmentalist’ agenda. We draw on the findings of a multi-disciplinary body of new research on the risks and challenges posed by WA's mining industry for environmental, social and economic sustainability. We synthesise the findings of this work against the backdrop of the broader debates on corporate social responsibility (CSR) and resource governance. In light of the data presented, this paper takes issue with the mining sector's SLO and its assessment of social and environmental impacts in WA for three inter-related reasons. A state government ideologically wedded to resource-led growth is seen to offer the resource sector a political licence to operate and to give insufficient attention to its potential social and environmental impacts. As a result, the resource sector can adopt a self-serving CSR agenda built on a limited win–win logic and operate with a ‘quasi social licence’ that is restricted to mere economic legitimacy. Overall, this paper problematises the political-cum-commercial construction and neoliberalisation of the SLO and raises questions about the impact of mining in WA.
Abstract:
Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees called the balanced-optimal-search canonical form (BOCF) that can handle the isomorphism problem efficiently. Using BOCF, we define a tree structure guided scheme based enumeration approach that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees from a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER against two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
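To illustrate why a canonical form helps with the isomorphism problem for unordered trees, here is a minimal sketch of a generic canonical string. It is not BOCF, whose definition is not given in the abstract; it is a common textbook-style encoding that sorts child encodings recursively. The tree representation `(label, [children])` is an assumption, as is the requirement that labels contain no parentheses.

```python
def canonical_form(tree):
    """Return a string that is identical for isomorphic unordered trees.

    A tree is a (label, children) pair, where children is a list of trees.
    Sorting the child encodings removes any dependence on sibling order.
    """
    label, children = tree
    encoded_children = sorted(canonical_form(child) for child in children)
    return "(" + label + "".join(encoded_children) + ")"


# t1 and t2 differ only in sibling order; t3 has a different structure.
t1 = ("a", [("b", []), ("c", [("d", [])])])
t2 = ("a", [("c", [("d", [])]), ("b", [])])
t3 = ("a", [("b", [("d", [])]), ("c", [])])
```

With such an encoding, two candidate subtrees can be compared by string equality, so a miner can count each distinct subtree once regardless of the order in which its children were stored.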
Abstract:
This paper presents an algorithm for mining unordered embedded subtrees using the balanced-optimal-search canonical form (BOCF). A tree structure guided scheme based enumeration approach is defined using BOCF for systematically enumerating only the valid subtrees. Based on this canonical form and enumeration technique, the balanced optimal search embedded subtree mining algorithm (BEST) is introduced for mining embedded subtrees from a database of labelled rooted unordered trees. Extensive experiments on both synthetic and real datasets demonstrate the efficiency of BEST over two state-of-the-art algorithms for mining embedded unordered subtrees, SLEUTH and U3.