26 resultados para data-mining application
Resumo:
Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experiential results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
Resumo:
This thesis introduces a flexible visual data exploration framework which combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user to explore and understand effectively large multi-dimensional datasets. The advantage of such a framework to other techniques currently available to the domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows them to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms to develop a guided mixture of local experts algorithm which provides robust prediction and a model to estimate feature saliency simultaneously with the training of a projection algorithm.Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process which is suitable for an intuition and domain knowledge driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach which estimates feature saliency simultaneously with the training of a visualisation model.
Resumo:
This paper explores the use of the optimization procedures in SAS/OR software with application to the ordered weight averaging (OWA) operators of decision-making units (DMUs). OWA was originally introduced by Yager (IEEE Trans Syst Man Cybern 18(1):183-190, 1988) has gained much interest among researchers, hence many applications such as in the areas of decision making, expert systems, data mining, approximate reasoning, fuzzy system and control have been proposed. On the other hand, the SAS is powerful software and it is capable of running various optimization tools such as linear and non-linear programming with all type of constraints. To facilitate the use of OWA operator by SAS users, a code was implemented. The SAS macro developed in this paper selects the criteria and alternatives from a SAS dataset and calculates a set of OWA weights. An example is given to illustrate the features of SAS/OWA software. © Springer-Verlag 2009.
Resumo:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
Resumo:
Forests play a pivotal role in timber production, maintenance and development of biodiversity and in carbon sequestration and storage in the context of the Kyoto Protocol. Policy makers and forest experts therefore require reliable information on forest extent, type and change for management, planning and modeling purposes. It is becoming increasingly clear that such forest information is frequently inconsistent and unharmonised between countries and continents. This research paper presents a forest information portal that has been developed in line with the GEOSS and INSPIRE frameworks. The web portal provides access to forest resources data at a variety of spatial scales, from global through to regional and local, as well as providing analytical capabilities for monitoring and validating forest change. The system also allows for the utilisation of forest data and processing services within other thematic areas. The web portal has been developed using open standards to facilitate accessibility, interoperability and data transfer.
Resumo:
This paper proposes a novel framework of incorporating protein-protein interactions (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It is built upon the existing work on relation extraction using the Hidden Vector State (HVS) model. The HVS model belongs to the category of statistical learning methods. It can be trained directly from un-annotated data in a constrained way whilst at the same time being able to capture the underlying named entity relationships. However, it is difficult to incorporate background knowledge or non-local information into the HVS model. This paper proposes to represent the HVS model as a conditionally trained undirected graphical model in which non-local features derived from PPI ontology through inference would be easily incorporated. The seamless fusion of ontology inference with statistical learning produces a new paradigm to information extraction.
Resumo:
Mobile technology has not yet achieved widespread acceptance in the Architectural, Engineering, and Construction (AEC) industry. This paper presents work that is part of an ongoing research project focusing on the development of multimodal mobile applications for use in the AEC industry. This paper focuses specifically on a context-relevant lab-based evaluation of two input modalities – stylus and soft-keyboard v. speech-based input – for use with a mobile data collection application for concrete test technicians. The manner in which the evaluation was conducted as well as the results obtained are discussed in detail.
Resumo:
We are the first to examine the market reaction to 13 announcement dates related to IFRS 9 for over 5400 European listed firms. We find an overall positive reaction to the introduction of IFRS 9. The regulation is particularly beneficial to shareholders of firms in countries with weaker rule of law and a smaller divergence between local GAAP and IAS 39. Bootstrap simulations rule out the possibility that sampling error or data mining are driving our findings. Our main findings are also robust to confounding events and the extent of the media coverage for each event. These results suggest that investors perceive the new regulation as shareholder-wealth enhancing and support the view that stronger comparability across accounting standards of European firms is beneficial to international investors and outweighs the costs of poorer firm-specific information.
Resumo:
Product recommender systems are often deployed by e-commerce websites to improve user experience and increase sales. However, recommendation is limited by the product information hosted in those e-commerce sites and is only triggered when users are performing e-commerce activities. In this paper, we develop a novel product recommender system called METIS, a MErchanT Intelligence recommender System, which detects users' purchase intents from their microblogs in near real-time and makes product recommendation based on matching the users' demographic information extracted from their public profiles with product demographics learned from microblogs and online reviews. METIS distinguishes itself from traditional product recommender systems in the following aspects: 1) METIS was developed based on a microblogging service platform. As such, it is not limited by the information available in any specific e-commerce website. In addition, METIS is able to track users' purchase intents in near real-time and make recommendations accordingly. 2) In METIS, product recommendation is framed as a learning to rank problem. Users' characteristics extracted from their public profiles in microblogs and products' demographics learned from both online product reviews and microblogs are fed into learning to rank algorithms for product recommendation. We have evaluated our system in a large dataset crawled from Sina Weibo. The experimental results have verified the feasibility and effectiveness of our system. We have also made a demo version of our system publicly available and have implemented a live system which allows registered users to receive recommendations in real time. © 2014 ACM.
Resumo:
The Semantic Web has come a long way since its inception in 2001, especially in terms of technical development and research progress. However, adoption by non- technical practitioners is still an ongoing process, and in some areas this process is just now starting. Emergency response is an area where reliability and timeliness of information and technologies is of essence. Therefore it is quite natural that more widespread adoption in this area has not been seen until now, when Semantic Web technologies are mature enough to support the high requirements of the application area. Nevertheless, to leverage the full potential of Semantic Web research results for this application area, there is need for an arena where practitioners and researchers can meet and exchange ideas and results. Our intention is for this workshop, and hopefully coming workshops in the same series, to be such an arena for discussion. The Extended Semantic Web Conference (ESWC - formerly the European Semantic Web conference) is one of the major research conferences in the Semantic Web field, whereas this is a suitable location for this workshop in order to discuss the application of Semantic Web technology to our specific area of applications. Hence, we chose to arrange our first SMILE workshop at ESWC 2013. However, this workshop does not focus solely on semantic technologies for emergency response, but rather Semantic Web technologies in combination with technologies and principles for what is sometimes called the "social web". Social media has already been used successfully in many cases, as a tool for supporting emergency response. The aim of this workshop is therefore to take this to the next level and answer questions like: "how can we make sense of, and furthermore make use of, all the data that is produced by different kinds of social media platforms in an emergency situation?" For the first edition of this workshop the chairs collected the following main topics of interest: • Semantic Annotation for understanding the content and context of social media streams. • Integration of Social Media with Linked Data. • Interactive Interfaces and visual analytics methodologies for managing multiple large-scale, dynamic, evolving datasets. • Stream reasoning and event detection. • Social Data Mining. • Collaborative tools and services for Citizens, Organisations, Communities. • Privacy, ethics, trustworthiness and legal issues in the Social Semantic Web. • Use case analysis, with specific interest for use cases that involve the application of Social Media and Linked Data methodologies in real-life scenarios. All of these, applied in the context of: • Crisis and Disaster Management • Emergency Response • Security and Citizen Journalism The workshop received 6 high-quality paper submissions and based on a thorough review process, thanks to our program committee, the decision was made to accept four of these papers for the workshop (67% acceptance rate). These four papers can be found later in this proceedings volume. Three out of four of these papers particularly discuss the integration and analysis of social media data, using Semantic Web technologies, e.g. for detecting complex events in social media streams, for visualizing and analysing sentiments with respect to certain topics in social media, or for detecting small-scale incidents entirely through the use of social media information. Finally, the fourth paper presents an architecture for using Semantic Web technologies in resource management during a disaster. Additionally, the workshop featured an invited keynote speech by Dr. Tomi Kauppinen from Aalto university. Dr. Kauppinen shared experiences from his work on applying Semantic Web technologies to application fields such as geoinformatics and scientific research, i.e. so-called Linked Science, but also recent ideas and applications in the emergency response field. His input was also highly valuable for the roadmapping discussion, which was held at the end of the workshop. A separate summary of the roadmapping session can be found at the end of these proceedings. Finally, we would like to thank our invited speaker Dr. Tomi Kauppinen, all our program committee members, as well as the workshop chair of ESWC2013, Johanna Völker (University of Mannheim), for helping us to make this first SMILE workshop a highly interesting and successful event!
Resumo:
This paper proposes a new thermography-based maximum power point tracking (MPPT) scheme to address photovoltaic (PV) partial shading faults. Solar power generation utilizes a large number of PV cells connected in series and in parallel in an array, and that are physically distributed across a large field. When a PV module is faulted or partial shading occurs, the PV system sees a nonuniform distribution of generated electrical power and thermal profile, and the generation of multiple maximum power points (MPPs). If left untreated, this reduces the overall power generation and severe faults may propagate, resulting in damage to the system. In this paper, a thermal camera is employed for fault detection and a new MPPT scheme is developed to alter the operating point to match an optimized MPP. Extensive data mining is conducted on the images from the thermal camera in order to locate global MPPs. Based on this, a virtual MPPT is set out to find the global MPP. This can reduce MPPT time and be used to calculate the MPP reference voltage. Finally, the proposed methodology is experimentally implemented and validated by tests on a 600-W PV array.