279 results for Knowledge Discovery Tools
Abstract:
Key decisions at the collection, pre-processing, transformation, mining and interpretation phases of any knowledge discovery from databases (KDD) process depend heavily on assumptions and theoretical perspectives relating to the type of task to be performed and the characteristics of the data sourced. In this article, we compare and contrast the theoretical perspectives and assumptions taken in data mining exercises in the legal domain with those adopted in data mining in Traditional Chinese Medicine (TCM) and allopathic medicine. The juxtaposition yields insights for the application of KDD to TCM.
Abstract:
Process mining is the research area concerned with knowledge discovery from information system event logs. Within this area, two prominent tasks can be discerned. First, process discovery deals with the automatic construction of a process model from an event log. Second, conformance checking assesses the quality of a discovered or designed process model with respect to the actual behavior captured in event logs. To this end, multiple techniques and metrics have been developed and described in the literature. However, the process mining domain still lacks a comprehensive framework for assessing the goodness of a process model from a quantitative perspective. In this study, we describe the architecture of an extensible framework within ProM that allows for the consistent, comparative and repeatable calculation of conformance metrics. Such a framework is considered highly valuable for the development and assessment of both process discovery and conformance techniques.
Abstract:
Supervision in the creative arts is a topic of growing significance since the increase in creative practice PhDs across universities in Australasia. This presentation will provide context for existing discussions in creative practice and supervision. Creative practice, encompassing practice-based or practice-led research, now has a rich history of research surrounding it. Although it is a comparatively new area of knowledge, great advances have been made in terms of how practice can influence, generate, and become research. The practice of supervision is also a topic of interest, perhaps unsurprisingly considering its necessity within the university environment. Many scholars have written about supervision practices and the importance of the supervisory role, both in academic and more informal forms. However, there is an obvious gap between the two: there is very little research on supervision practices within creative practice higher degrees, especially at PhD or doctorate level. Despite the existence of creative practice PhD programs, and thus the inherent necessity for successful supervisors, there remain minimal publications and limited resources available. Creative Intersections explores the existing publications and resources, and illustrates that a space for new published knowledge and tools exists.
Abstract:
It is a big challenge to find useful associations in databases for user-specific needs. The essential issue is how to provide efficient methods for describing meaningful associations and pruning false or meaningless discoveries. One major obstacle is the overwhelmingly large volume of discovered patterns. This paper discusses an alternative approach, called multi-tier granule mining, to improve frequent association mining. Rather than using patterns, it uses granules to represent knowledge implicitly contained in databases. It also uses multi-tier structures and association mappings to represent association rules in terms of granules. Consequently, association rules can be quickly accessed and meaningless association rules can be justified according to the association mappings. Moreover, the proposed structure is also a precise compression of patterns which can restore the original supports. The experimental results show that the proposed approach is promising.
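As a rough illustration of the granule idea in the abstract above, the following Python sketch groups transactions into granules (rows that share the same values on a set of attributes) and keeps a two-tier mapping from condition granules to decision granules, from which rule confidences can be read off. It is not the paper's algorithm: the function names, the split into condition and decision attributes, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_association_mapping(rows, condition_attrs, decision_attrs):
    """Map each condition granule to a Counter of decision granules.

    A granule is represented here (by assumption) as the tuple of values
    a row takes on the chosen attributes; the counts are supports.
    """
    mapping = defaultdict(Counter)
    for row in rows:
        cond = tuple(row[a] for a in condition_attrs)
        dec = tuple(row[a] for a in decision_attrs)
        mapping[cond][dec] += 1
    return mapping


def rule_confidences(mapping):
    """Confidence of each rule (condition granule -> decision granule)."""
    rules = {}
    for cond, decisions in mapping.items():
        total = sum(decisions.values())  # support of the condition granule
        for dec, count in decisions.items():
            rules[(cond, dec)] = count / total
    return rules


# Hypothetical transactions: 1 means the item is present.
ROWS = [
    {"bread": 1, "milk": 1, "butter": 1},
    {"bread": 1, "milk": 1, "butter": 0},
    {"bread": 1, "milk": 1, "butter": 1},
    {"bread": 0, "milk": 1, "butter": 0},
]

MAPPING = build_association_mapping(ROWS, ["bread", "milk"], ["butter"])
CONFIDENCES = rule_confidences(MAPPING)

if __name__ == "__main__":
    for (cond, dec), conf in sorted(CONFIDENCES.items()):
        print(cond, "->", dec, round(conf, 3))
```

Because the mapping stores the support of every granule pair, the original supports can be recovered by summing counts, which is the sense in which such a structure acts as a compression in this toy setting.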
Abstract:
Topic modeling has been widely used in information retrieval, text mining, and text classification. Most existing statistical topic modeling methods, such as LDA and pLSA, generate a term-based representation of a topic by selecting single words from the multinomial word distribution of that topic. This has two main shortcomings: first, popular or common words occur very often across different topics, which makes the topics ambiguous; second, single words lack the coherent semantic meaning needed to represent topics accurately. To overcome these problems, we propose a two-stage model that combines text mining and pattern mining with statistical modeling to generate more discriminative and semantically rich topic representations. Experiments show that the optimized topic representations generated by the proposed methods outperform the typical statistical topic modeling method LDA in terms of accuracy and certainty.
Abstract:
Process mining encompasses the research area which is concerned with knowledge discovery from event logs. One common process mining task focuses on conformance checking, comparing discovered or designed process models with actual real-life behavior as captured in event logs in order to assess the “goodness” of the process model. This paper introduces a novel conformance checking method to measure how well a process model performs in terms of precision and generalization with respect to the actual executions of a process as recorded in an event log. Our approach differs from related work in the sense that we apply the concept of so-called weighted artificial negative events towards conformance checking, leading to more robust results, especially when dealing with less complete event logs that only contain a subset of all possible process execution behavior. In addition, our technique offers a novel way to estimate a process model’s ability to generalize. Existing literature has focused mainly on the fitness (recall) and precision (appropriateness) of process models, whereas generalization has been much more difficult to estimate. The described algorithms are implemented in a number of ProM plugins, and a Petri net conformance checking tool was developed to inspect process model conformance in a visual manner.
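To make the notions of fitness, precision and artificial negative events in the abstract above more concrete, here is a deliberately simplified Python sketch. It makes several assumptions that are not taken from the paper: the process model is a plain set of allowed directly-follows pairs instead of a Petri net, negative events are unweighted (an activity counts as negative at a position if it was never observed after the same prefix anywhere in the log), and precision is computed as TP / (TP + FP) over positive and negative events. All names and the sample log are hypothetical.

```python
from collections import defaultdict

START, END = "START", "END"


def trace_fits(model, trace):
    """True if every step of the trace is an allowed directly-follows pair."""
    prev = START
    for activity in trace:
        if (prev, activity) not in model:
            return False
        prev = activity
    return (prev, END) in model


def fitness(model, log):
    """Share of traces in the log that the model can replay completely."""
    return sum(trace_fits(model, t) for t in log) / len(log)


def negative_events(log):
    """For each observed prefix, the activities never seen right after it."""
    alphabet = {a for trace in log for a in trace}
    seen = defaultdict(set)
    for trace in log:
        for i, activity in enumerate(trace):
            seen[tuple(trace[:i])].add(activity)
    return {prefix: alphabet - nxt for prefix, nxt in seen.items()}


def precision(model, log):
    """TP / (TP + FP): logged events the model allows versus negative
    events the model wrongly allows (unweighted simplification)."""
    negatives = negative_events(log)
    tp = fp = 0
    for trace in log:
        prev = START
        for i, activity in enumerate(trace):
            if (prev, activity) in model:
                tp += 1
            for neg in negatives[tuple(trace[:i])]:
                if (prev, neg) in model:
                    fp += 1
            prev = activity
    return tp / (tp + fp) if tp + fp else 0.0


# Hypothetical log and model; the ("b", "b") loop is never seen in the log.
LOG = [("a", "b", "c"), ("a", "b", "c"), ("a", "c"), ("a", "b")]
MODEL = {(START, "a"), ("a", "b"), ("a", "c"),
         ("b", "c"), ("b", "b"), ("c", END)}

FITNESS = fitness(MODEL, LOG)
PRECISION = precision(MODEL, LOG)

if __name__ == "__main__":
    print("fitness:", FITNESS)
    print("precision:", round(PRECISION, 3))
```

In this toy setting the unobserved loop on `b` lowers precision, while the trace that stops at `b` lowers fitness. The weighting scheme and the generalization estimate described in the abstract are not reproduced here.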
Abstract:
Road surface skid resistance has been shown to have a strong relationship to road crash risk. However, the current method of using investigatory levels to identify crash-prone roads is problematic, as these levels may fail to identify risky roads outside of the norm. The proposed method uses data mining to analyse a complex and formerly impenetrable volume of data from roads and crashes. It rapidly identifies roads with an elevated crash rate, potentially due to a skid resistance deficit, for investigation. A hypothetical skid resistance/crash risk curve is developed for each road segment, driven by the model deployed in a novel regression tree extrapolation method. The method potentially solves the problem of missing skid resistance values that occurs during network-wide crash analysis, and allows risk assessment of the major proportion of roads without skid resistance values.
Abstract:
The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.
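As one small example of how XML-based markup can feed a data mining step, the Python sketch below flattens records from a made-up domain-specific document into dictionaries and counts one field. The markup (`<materials>`, `<material>`, `<class>`) and the function name are invented for illustration and do not correspond to any real markup language discussed in the article.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical domain-specific markup, invented for this example.
SAMPLE = """<materials>
  <material id="m1"><name>Steel</name><class>metal</class></material>
  <material id="m2"><name>Copper</name><class>metal</class></material>
  <material id="m3"><name>Nylon</name><class>polymer</class></material>
</materials>"""


def extract_records(xml_text, record_tag):
    """Turn each <record_tag> element into a flat dict of its attributes
    and the text of its direct child elements."""
    root = ET.fromstring(xml_text)
    records = []
    for element in root.iter(record_tag):
        record = dict(element.attrib)
        for child in element:
            record[child.tag] = (child.text or "").strip()
        records.append(record)
    return records


RECORDS = extract_records(SAMPLE, "material")
CLASS_COUNTS = Counter(record["class"] for record in RECORDS)

if __name__ == "__main__":
    print(RECORDS)
    print(CLASS_COUNTS)
```

Because the tags carry the semantics, the flattened records can go straight into a tabular mining tool without the scraping heuristics that HTML pages usually need.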
Abstract:
One of the most common ways to share project knowledge is to capture the positive and negative aspects of projects in the form of lessons learned (LL). If effectively used, this process can assist project managers in reusing project knowledge and preventing future projects from repeating mistakes. Nevertheless, the process of capturing, storing, reviewing and reusing LL often remains suboptimal. Despite the potential for rich knowledge capture, lessons are often documented as simple, line-item statements devoid of context. Findings from an empirical investigation across four cases revealed a range of reasons related to the perceived quality, process and visibility of LL that lead to their limited use and application. Drawn from the cross-case analysis, this paper investigates an integrated approach to LL involving the use of a collaborative Web-based tool, which is easily accessible, intelligible and user-friendly, allowing more effective sharing of project knowledge and overcoming existing problems with LL.
Abstract:
Term-based approaches can extract many features from text documents, but most of these features include noise. Many popular text-mining strategies have been adapted to reduce noisy information in the extracted features; however, text-mining techniques suffer from the low-frequency problem. The key issue is how to discover relevance features in text documents to fulfil user information needs. To address this issue, we propose a new method to extract specific features from user relevance feedback. The proposed approach includes two stages. The first stage extracts topics (or patterns) from text documents to focus on interesting topics. In the second stage, topics are deployed to lower-level terms to address the low-frequency problem and find specific terms. The specific terms are determined based on their appearances in relevance feedback and their distribution in topics or high-level patterns. We test our proposed method with extensive experiments on the Reuters Corpus Volume 1 dataset and TREC topics. Results show that our proposed approach significantly outperforms the state-of-the-art models.
Abstract:
Although recommender systems and reputation systems have quite different theoretical and technical bases, both types of systems have the purpose of providing advice for decision making in e-commerce and online service environments. The similarity in purpose makes it natural to integrate both types of systems in order to produce better online advice, but their difference in theory and implementation makes the integration challenging. In this paper, we propose mapping the values produced by recommender systems, as well as the scores produced by reputation systems, to subjective opinions, and combining the resulting opinions within the framework of subjective logic.
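The Python sketch below shows, under stated assumptions, what combining such opinions in subjective logic could look like. The evidence-to-opinion mapping (with prior weight W = 2) and the cumulative fusion operator follow the standard binomial-opinion formulas of subjective logic; the `from_score` mapping from a recommender value in [0, 1] to pseudo-evidence is a hypothetical stand-in, not the mapping proposed in the paper, and both opinions are assumed to share the same base rate.

```python
from dataclasses import dataclass

W = 2.0  # non-informative prior weight used for binomial opinions


@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def expected(self):
        """Projected probability: belief plus the base-rate share of uncertainty."""
        return self.belief + self.base_rate * self.uncertainty


def from_evidence(positive, negative, base_rate=0.5):
    """Map counts of positive and negative observations to an opinion."""
    total = positive + negative + W
    return Opinion(positive / total, negative / total, W / total, base_rate)


def from_score(score, weight, base_rate=0.5):
    """Hypothetical mapping: treat a score in [0, 1] as `weight`
    pseudo-observations split between positive and negative."""
    return from_evidence(score * weight, (1.0 - score) * weight, base_rate)


def cumulative_fusion(x, y):
    """Cumulative fusion of two opinions with the same base rate."""
    k = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if k == 0:
        raise ValueError("fusion undefined when both opinions are dogmatic")
    return Opinion(
        (x.belief * y.uncertainty + y.belief * x.uncertainty) / k,
        (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / k,
        (x.uncertainty * y.uncertainty) / k,
        x.base_rate,
    )


# Hypothetical inputs: a recommender value of 0.8 and 18 positive / 2 negative ratings.
RECOMMENDER = from_score(0.8, weight=10)
REPUTATION = from_evidence(18, 2)
FUSED = cumulative_fusion(RECOMMENDER, REPUTATION)

if __name__ == "__main__":
    print(FUSED, round(FUSED.expected(), 3))
```

Cumulative fusion is equivalent to adding up the underlying evidence, so the fused opinion is less uncertain than either input. Whether that operator suits a given pairing of recommender and reputation system is a design decision this sketch does not settle.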
Abstract:
IUCN's core work involves generating knowledge and tools to influence policy and practice for nature conservation. Whilst it appears that we are collectively making progress in some areas, we acknowledge the need to improve our communication processes and practices to 'move to action' in this regard. We need to extend the influence of the science and the knowledge beyond the documents to achieve effective impact and action. The training course will focus on the process of getting conservation messages out to a wider audience. This interactive and participatory training course will develop the skills and knowledge needed to communicate effective conservation messages to a range of IUCN internal and external audiences. The course will cover:
• what is communication for conservation?
• the communication planning process (developing your communication objectives)
• identifying and understanding your target audiences
• developing your conservation message
• choosing your communication media
• evaluating the effectiveness of your communication strategies.
A unique feature of the training course will be the use of Web 2.0 tools in innovative conservation communications, e.g. the use of social media in concept branding and social marketing. In the spirit of the Forum's objective of 'Sharing know how', each participant will bring a current conservation issue to the training course and will leave with their own communication plan. Potentially, the training course adopts a cross-thematic approach, as the issues addressed could be drawn from any of IUCN's program themes. Primarily, though, the training course's best fit is with the 'Valuing and Conserving Biodiversity' theme, since it will provide concrete and pragmatic solutions for enhancing the implementation of conservation measures through participatory planning and capacity building.
Abstract:
Introduction: The professional doctorate is specifically designed for professionals investigating real-world problems and relevant issues for a profession, industry, and/or the community. The focus is scholarly research into professional practices. The research programme bridges academia and the professions, and offers doctoral candidates the opportunity to investigate issues relevant to their own practices and to apply these understandings to their professional contexts. The study on which this article is based sought to track the scholarly skill development of a cohort of professional doctoral students who commenced the course in January 2008 at an Australian university. Because they hold positions of responsibility and are time-poor, many doctoral students have difficulty transitioning from professional practitioner to researcher and scholar. The struggle many experience is in the development of a theoretical or conceptual standpoint for argumentation (Lesham, 2007; Weese et al., 1999). It was thought that the use of a scaffolded learning environment that drew upon a blended learning approach, incorporating face-to-face intensive blocks and collaborative knowledge-building tools such as wikis, would provide a data source for understanding the development of scholarly skills. Wikis, weblogs and similar social networking software have the potential to support communities to share, learn, create and collaborate. The development of a wiki page by each candidate in the 2008 cohort was encouraged to provide the participants and the teaching team members with textual indicators of progress. Learning tasks were scaffolded with the expectation that the candidates would complete these tasks via the wikis. The expectation was that cohort members would comment on each other's work, together with the supervisor and/or teaching team member who was allocated to each candidate.
The supervisor is responsible for supervising the candidate’s work through to submission of the thesis for examination and the teaching team member provides support to both the supervisor and the candidate through to confirmation. This paper reports on the learning journey of a cohort of doctoral students during the first seven months of their professional doctoral programme to determine if there had been any qualitative shifts in understandings, expectations and perceptions regarding their developing knowledge and skills. The paper is grounded in the literature pertaining to doctoral studies and examines the structure of the professional doctoral programme. Following this is a discussion of the qualitative study that helped to unearth key themes regarding the participants’ learning journey.