984 results for Mining extraction
Abstract:
Over the last decade, most existing search techniques have been either keyword-based or category-based, resulting in unsatisfactory effectiveness. Meanwhile, studies have shown that more than 80% of users prefer personalized search results. As a result, many studies have devoted a great deal of effort (referred to as collaborative filtering) to investigating personalized notions for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the capability to express their needs properly, whereas existing retrieval systems are highly sensitive to vocabulary. Researchers have increasingly proposed ontology-based techniques to improve current mining approaches. These techniques can not only refine search intentions within specific generic domains, but also access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles from discovered user background knowledge. The knowledge is drawn from both global and local analyses, which aim to produce tailored ontologies from a group of concepts. However, a key problem that has not been addressed is how to accurately match diverse local information to universal global knowledge. This research conducts a theoretical study on the use of personalized ontologies to enhance text mining performance. The objective is to understand user information needs by a "bag-of-concepts" rather than "words". The concepts are gathered from a general world knowledge base, the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and learn personalized ontologies as user profiles.
The approach can not only pinpoint users' individual intentions in a rough hierarchical structure, but can also interpret their needs by a set of acknowledged concepts. Along with the global and local analyses, a solid concept matching approach is carried out to address the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model are taken as representatives of local information. These features have been shown to be the best alternative to user queries for avoiding ambiguity, and they consistently outperform the features extracted by other filtering models. The two proposed approaches are both evaluated scientifically on the standard Reuters Corpus Volume 1 testing set. A comprehensive comparison is made with a number of state-of-the-art baseline models, including TF-IDF, Rocchio, Okapi BM25, the deploying Pattern Taxonomy Model, and an ontology-based model. The results indicate that top precision can be improved remarkably with the proposed ontology mining approach, and that the matching approach is successful and achieves significant improvements in most information filtering measurements. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. The related outputs are critical when systems are expected to return proper mining results and provide personalized services. The scientific findings have the potential to facilitate the design of advanced preference mining models that have an impact on people's daily lives.
Abstract:
A people-to-people matching system (or a match-making system) refers to a system in which users join with the objective of meeting other users with a common need. Some real-world examples of these systems are employer-employee (in job search networks), mentor-student (in university social networks), consumer-to-consumer (in marketplaces) and male-female (in an online dating network). The network underlying these systems consists of two groups of users, and the relationships between users need to be captured in order to develop an efficient match-making system. Most existing studies utilize information either about each user in isolation or about the users' interactions, and develop recommender systems using only one form of information. It is imperative to understand the linkages among the users in the network and use them in developing a match-making system. This study utilizes several social network analysis methods, such as graph theory, the small world phenomenon, centrality analysis and density analysis, to gain insight into the entities and relationships present in this network. This paper also proposes a new type of graph called the “attributed bipartite graph”. Using these analyses and the proposed type of graph, an efficient hybrid recommender system is developed that generates recommendations for new users and shows improved accuracy over the baseline methods.
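As a rough illustration of the density and centrality analyses mentioned in this abstract, the sketch below computes them for a tiny two-group network. The node names and links are hypothetical, and this is a generic bipartite calculation under those assumptions, not the paper's attributed bipartite graph or its recommender.

```python
from collections import defaultdict

# Hypothetical two-group network: employers on one side, candidates on the other.
employers = {"emp_a", "emp_b"}
candidates = {"cand_1", "cand_2", "cand_3"}
edges = [("emp_a", "cand_1"), ("emp_a", "cand_2"), ("emp_b", "cand_2")]


def bipartite_density(edges, left, right):
    """Share of the possible left-right links that are actually present."""
    return len(set(edges)) / (len(left) * len(right))


def degree_centrality(edges, left, right):
    """Degree of each node divided by the size of the opposite group."""
    degree = defaultdict(int)
    for u, v in set(edges):
        degree[u] += 1
        degree[v] += 1
    centrality = {}
    for u in left:
        centrality[u] = degree[u] / len(right)
    for v in right:
        centrality[v] = degree[v] / len(left)
    return centrality


density = bipartite_density(edges, employers, candidates)  # 3 of 6 possible links
centrality = degree_centrality(edges, employers, candidates)
```

In a bipartite network a node can only link to the opposite group, which is why each degree is normalised by the size of that group rather than by the total number of nodes.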
Abstract:
Commuting in the mining industry
- Background
- The problem
- Journey management
- The structure of the legislative framework
Legislation and Regulation
- Workplace safety in Queensland mining
- Risk management
- Mining legislation and journey management
- Commuting and employee responsibilities
- Queensland Workers’ Compensation Scheme
Industry standards
- Industry standards and journey management
Regulated and organisational policy documents
- Policy documents and journey management
Observations & Conclusions
Abstract:
Much publicity has been given to the problem of high levels of environmental contaminants, most notably high blood lead concentration levels among children in the city of Mount Isa because of mining and smelting activities. The health impacts from mining-related pollutants are now well documented. This includes published research being discussed in an editorial of the Medical Journal of Australia (see Munksgaard et al. 2010). On the other hand, negative impacts on property prices, although mentioned, have not been examined to date. This study rectifies this research gap. This study uses a hedonic property price approach to examine the impact of mining- and smelting-related pollution on nearby property prices. The hypothesis is that those properties closer to the lead and copper smelters have lower property (house) prices than those farther away. The results of the study show that the marginal willingness to pay to be farther from the pollution source is AUS $13 947 per kilometre within the 4 km radius selected. The study has several policy implications, which are discussed briefly. We used ordinary least squares, geographically weighted regression, spatial error and spatial autoregressive or spatial lag models for this analysis.
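To make the willingness-to-pay figure concrete, a hedonic price model can be written in its simplest form as below. The linear specification and the symbols ($P_i$ for house price, $d_i$ for distance to the smelters in kilometres, $x_{ik}$ for other property attributes) are assumptions for illustration; the abstract does not state the functional form used in the study.

```latex
P_i = \beta_0 + \beta_d\, d_i + \sum_{k} \beta_k\, x_{ik} + \varepsilon_i,
\qquad
\text{MWTP} = \frac{\partial P_i}{\partial d_i} = \beta_d
```

Under this assumed form, the reported estimate corresponds to $\beta_d \approx$ AUS $13 947 per kilometre, so a property 2 km farther from the source (within the 4 km radius) would be priced roughly 2 × 13 947 = AUS $27 894 higher, other attributes held equal.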
Abstract:
This thesis describes the development of a robust and novel prototype to address the data quality problems that relate to the dimension of outlier data. It thoroughly investigates the associated problems with regards to detecting, assessing and determining the severity of the problem of outlier data; and proposes granule-mining based alternative techniques to significantly improve the effectiveness of mining and assessing outlier data.
Abstract:
The design of applications for dynamic ridesharing or carpooling is often formulated as a matching problem of connecting people with an aligned set of transport needs within a reasonable interval of time and space. This problem formulation relegates social connections to being secondary factors. Technology assisted ridesharing applications that put the matching problem first have revealed that they suffer from being unable to address the factor of social comfort, even after adding friend features or piggybacking on social networking sites. This research aims to understand the fabric of social interactions through which ridesharing happens. We take an online observation approach in order to understand the fabric of social interactions for ridesharing that is happening in highly subscribed online groups of local residents. This understanding will help researchers to identify design challenges and opportunities to support ridesharing in local communities. This paper contributes a fundamental understanding of how social interactions and social comfort precede rideshare requests in local communities.
Abstract:
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models that represent multiple topics in a collection of documents, and it has been widely used in fields such as machine learning and information retrieval. However, its effectiveness in information filtering is largely unknown. Patterns are generally thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed to represent text documents using not only topic distributions at a general level but also semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
Abstract:
Australia’s mining boom
Global demand for minerals and energy products has fuelled Australia’s recent resources boom and has led to the rapid expansion of mining projects, not only in remote locations but increasingly in settled, traditionally agricultural rural areas. A fundamental shift has also occurred in the provisioning of skilled and semi-skilled workers. The huge acceleration in industry demand for labour has been accompanied by the entrenchment of workforce arrangements largely dependent on fly-in, fly-out (FIFO) and drive-in, drive-out (DIDO) non-resident workers (NRWs). While NRWs are working away from their homes, they are usually accommodated in work camps or ‘villages’ for the duration of their work cycle, which normally comprises many consecutive days of 12-hour day and night shifts. The health effects of this form of employment and the accompanying lifestyle are becoming increasingly contentious. Impacts on personal wellness, wellbeing and quality of life remain essentially under-researched and thus poorly understood.
Sodexo in Australia
Sodexo began operations in Australia in 1982, and has since become a leader in providing Quality of Life (QOL) services to businesses across the country. The 6,000 Australian employees are part of a global Sodexo team of 413,000 people. Sodexo in Australia designs, delivers and manages its on-site QOL services at 320 diverse sites, including remote sites. Sodexo operates in a range of sectors, including the mining industry. Service plans are tailored to suit the individual needs of organisations. Sodexo Remote Sites has previously conducted unpublished research among mining workers in Australia. The results highlighted the needs and expectations of Australian mining workers. The main insights about workers’ requirements concerned:
• contact with those closest to them;
• warm rest time around proper and varied meals;
• additional services to help them better enjoy their life onsite and/or make the most of it;
• organising their transportation;
• promoting community living; and
• finding balance between professional and personal life.
The brief for this current research is aimed at building upon this knowledge.
Research brief
Expectations for quality of life, wellness and wellbeing services are increasing dramatically. It is becoming costlier and more difficult to retain valuable employees. This is particularly the case in the Australian mining sector. Given the level of interest in ensuring healthy workplaces in Australia, Sodexo has commissioned QUT to conduct a literature review. The objectives as specified by Sodexo are:
Objective 1: To define the concepts of wellness and wellbeing and quality of life in Australia.
Objective 2: To examine how wellness and wellbeing are developed within organisations in Australia and how they impact on employee and organisational performance. More specifically, to review the literature that could be sourced about:
• challenges of the mining environment;
• the mining lifestyle and its implications for health, wellness and daily life;
• personal health and wellness of Australian mining workers;
• factors affecting health in mines and perceived support for health and wellness; and
• the impact of employer investment in health on perceptions and behaviour of employees.
Objective 3: To determine what impact employee wellness and wellbeing have on the performance of mining workers. More specifically, to review the literature that could be sourced about:
• the impact of obesity, alcohol and tobacco use on companies; and
• links between employee engagement and satisfaction and company productivity.
Accordingly, this review has attempted to ascertain what factors an organisation should focus on in order to reduce absenteeism and turnover and increase commitment, satisfaction, safety and productivity, with specific reference to the mining industry in Australia. The structure of the report aligns with the stated objectives in that each of the first three parts addresses an objective. Part IV summarises prominent issues that have arisen and offers some concluding observations and comments.
Abstract:
Business process analysis and process mining, particularly within the health care domain, remain under-utilised. Applied research that employs such techniques to routinely collected, health care data enables stakeholders to empirically investigate care as it is delivered by different health providers. However, cross-organisational mining and the comparative analysis of processes present a set of unique challenges in terms of ensuring population and activity comparability, visualising the mined models and interpreting the results. Without addressing these issues, health providers will find it difficult to use process mining insights, and the potential benefits of evidence-based process improvement within health will remain unrealised. In this paper, we present a brief introduction on the nature of health care processes; a review of the process mining in health literature; and a case study conducted to explore and learn how health care data, and cross-organisational comparisons with process mining techniques may be approached. The case study applies process mining techniques to administrative and clinical data for patients who present with chest pain symptoms at one of four public hospitals in South Australia. We demonstrate an approach that provides detailed insights into clinical (quality of patient health) and fiscal (hospital budget) pressures in health care practice. We conclude by discussing the key lessons learned from our experience in conducting business process analysis and process mining based on the data from four different hospitals.
Abstract:
This is a practice-led project consisting of a Young Adult novel, Open Cut, and an exegesis, 'I Wouldn't Say That': Finding a Young Adult, Female Voice in a Queensland Mining Town. The thesis investigates the use of first person narration in order to create immediate, engaging, realist Young Adult Fiction. The research design is bound by a feminist interpretative paradigm. The methodologies employed are practice-led research, auto-ethnography, and participant observation. Particular characteristics of first person narration used in Australian Young Adult Fiction are identified in an analysis of Dust, by Christine Bongers, and Jasper Jones, by Craig Silvey. The exegesis also contains a reflection on the researcher's creative work, and the process used to draft, edit, plot and construct the novel. The research contributes to knowledge in the field of Young Adult Literature because it offers a graphic portrayal of an Australian mining town that has not been heard before.
Abstract:
Textual document sets have become an important and rapidly growing information source on the web. Text classification is one of the crucial technologies for information organisation and management, and it has attracted wide attention from researchers in different fields. In this paper, feature selection methods, implementation algorithms and applications of text classification are first introduced. However, because the knowledge extracted by current data-mining techniques for text classification contains much noise, considerable uncertainty arises in the classification process from both knowledge extraction and knowledge usage; therefore, more innovative techniques and methods are needed to improve the performance of text classification. Further improving the process of knowledge extraction and the effective utilization of the extracted knowledge is a critical and challenging step. A Rough Set decision-making approach is proposed that uses Rough Set decision techniques to classify more precisely the textual documents that are difficult to separate with classic text classification methods. The purpose of this paper is to give an overview of existing text classification technologies; to demonstrate the Rough Set concepts and the decision-making approach based on Rough Set theory for building a more reliable and effective text classification framework with higher precision; to set up an innovative evaluation metric named CEI, which is very effective for assessing the performance of similar research; and to propose a promising research direction for addressing the challenging problems in text classification, text mining and other related fields.
Abstract:
In recent years, Web 2.0 has provided considerable facilities for people to create, share and exchange information and ideas. As a result, user-generated content, such as reviews, has exploded. Such data provide a rich source to exploit in order to identify the information associated with specific reviewed items. Opinion mining has been widely used to identify the significant features of items (e.g., cameras) based upon user reviews. Feature extraction is the most critical step in identifying useful information from texts. Most existing approaches only find individual features of a product without revealing the structural relationships that usually exist between the features. In this paper, we propose an approach to extract features and feature relationships, represented as a tree structure called a feature taxonomy, based on frequent patterns and associations between patterns derived from user reviews. The generated feature taxonomy profiles the product at multiple levels and provides more detailed information about the product. Our experimental results on several widely used review datasets show that the proposed approach is able to capture the product features and relations effectively.
Abstract:
Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular existing term-based feature selection methods suffer from noisy feature extraction; that is, they extract features that are irrelevant to the user's needs. One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. The method then uses an extended random set to accurately weight n-grams based on their distribution in the documents and the distribution of their terms within the n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves performance. The experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms state-of-the-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
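For readers unfamiliar with n-gram features, the sketch below shows how n-grams can be extracted and given a plain tf*idf weight, one of the baseline weightings named in this abstract. It is a minimal illustration under stated assumptions: the function names and the three toy documents are hypothetical, and it does not implement the paper's extended random set weighting.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_ngram_weights(docs, n=2):
    """Weight each n-gram in each document by tf * log(N / df)."""
    grams_per_doc = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: the number of documents that contain each n-gram.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))
    num_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        tf = Counter(grams)
        weights.append(
            {g: count * math.log(num_docs / df[g]) for g, count in tf.items()}
        )
    return weights


# Toy documents (hypothetical), used only to exercise the functions.
docs = [
    "text mining extracts useful features",
    "text mining reduces noisy features",
    "process mining analyses event logs",
]
weights = tfidf_ngram_weights(docs, n=2)
```

An n-gram that occurs in many documents, such as "text mining" here, receives a lower weight than one confined to a single document. This is the basic signal a plain tf*idf baseline relies on.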
Abstract:
Process mining has developed into a popular research discipline and nowadays its associated techniques are widely applied in practice. What is currently ill-understood is how the success of a process mining project can be measured and what the antecedent factors of process mining success are. We consider an improved, grounded understanding of these aspects of value to better manage the effectiveness and efficiency of process mining projects in practice. As such, we advance a model, tailored to the characteristics of process mining projects, which identifies and relates success factors and measures. We draw inspiration from the literature from related fields for the construction of a theoretical, a priori model. That model has been validated and re-specified on the basis of a multiple case study, which involved four industrial process mining projects. The unique contribution of this paper is that it presents the first set of success factors and measures on the basis of an analysis of real process mining projects. The presented model can also serve as a basis for further extension and refinement using insights from additional analyses.
Abstract:
This paper uses innovative content analysis techniques to map how the death of Oscar Pistorius' girlfriend, Reeva Steenkamp, was framed in Twitter conversations. Around 1.5 million posts from a two-week timeframe are analyzed with a combination of syntactic and semantic methods. This analysis is grounded in the frame analysis perspective and differs from sentiment analysis. Instead of looking for explicit evaluations, such as “he is guilty” or “he is innocent”, we show through the results how opinions can be identified from complex articulations of more implicit symbolic devices, such as repeatedly mentioned examples and metaphors. Users adopt different frames as more information about the case is revealed: from a more episodic frame, heavily used at the very beginning, to more systemic approaches that highlight the association of the event with urban violence, gun control issues, and violence against women. A detailed timeline of the discussions is provided.