15 results for Decision Tree
at Universidade do Minho
Abstract:
Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened using a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. This extracted knowledge confirmed that the obtained predictive model is credible and has potential value for supporting the decisions of hospital managers.
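The coefficient of determination reported above can be illustrated with a minimal sketch. The data below are made up for illustration and are not from the study; the function name `r_squared` and both prediction lists are assumptions. The sketch also shows why an Average Prediction baseline scores zero by construction.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up lengths of stay in days (illustration only, not the hospital data).
actual = [2.0, 5.0, 3.0, 10.0, 7.0]

# Baseline in the spirit of "Average Prediction": always predict the mean.
average_prediction = [mean(actual)] * len(actual)

# Hypothetical output of some better regression model.
model_prediction = [2.5, 4.5, 3.5, 9.0, 7.5]

print(r_squared(actual, average_prediction))
print(r_squared(actual, model_prediction))
```

Under this definition, a value of 0.81 means the model's squared error is 19% of that of the mean-only baseline on the evaluated data.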
Abstract:
The RMR system is still widely applied in rock mechanics engineering. It is based on the evaluation of six weights to obtain a final rating. Obtaining the final rating requires a considerable amount of information about the rock mass, which can be difficult to obtain, at least accurately, in some projects or project stages. In 2007 an alternative classification scheme based on the RMR, the Hierarchical Rock Mass Rating (HRMR), was presented. The main feature of this system was that it adapted the classification to the level of knowledge available about the rock mass, since it followed a decision tree approach. However, the HRMR was only valid for hard rock granites with low fracturing degrees. In this work, the database was enlarged with approximately 40% more cases covering other types of granite rock masses, including weathered granites, and the system was updated based on this enlarged database. Rock formations in the north of Portugal, including the city of Porto, are predominantly granites. Some years ago a light rail infrastructure was built in the city of Porto and surrounding municipalities, which involved considerable challenges due to the high heterogeneity of the granite formations and the difficulties involved in their geomechanical characterization. This work also aims to contribute to improving the characterization of these formations, with special emphasis on the weathered horizons. A specific subsystem applicable to the weathered formations was developed. The results of the validation of these systems are presented and show acceptable performance in identifying the correct class using less information than the RMR system.
Abstract:
This study aimed to investigate the role of ascorbate peroxidase (APX), guaiacol peroxidase (GPX), polysaccharides, and protein contents associated with the early events of postharvest physiological deterioration (PPD) in cassava roots. Increases in APX and GPX activity, as well as in total protein contents, occurred from 3 to 5 days of storage and were correlated with the delay of PPD. Cassava samples stained with periodic acid-Schiff (PAS) highlighted the presence of starch and cellulose. Degradation of starch granules during PPD was also detected. A slight metachromatic reaction with toluidine blue indicates an increase in acidic polysaccharides, which may play an important role in PPD delay. Principal component analysis (PCA) classified samples according to their levels of enzymatic activity, based on the decision tree model, which showed GPX and total protein amounts to be correlated with PPD. The Oriental (ORI) cultivar was more susceptible to PPD.
Abstract:
Author proof
Abstract:
The selective collection of municipal solid waste for recycling is a very complex and expensive process, in which a major issue is designing cost-efficient waste collection routes. Although commercial fleet management software is abundant, it often lacks the capability to deal properly with sequencing problems and with the dynamic revision of plans and schedules during process execution. Our approach to achieving better solutions for the waste collection process is to model it as a vehicle routing problem, more specifically as a team orienteering problem in which capacity constraints on the vehicles are considered, as well as time windows for the waste collection points and for the vehicles. The final model is called the capacitated team orienteering problem with double time windows (CTOPdTW). We developed a genetic algorithm to solve routing problems in waste collection modelled as a CTOPdTW. The results achieved suggest possible reductions of logistic costs in selective waste collection.
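The constraints named in this abstract (vehicle capacity, time windows at collection points, and a time window for the vehicle) can be sketched as a route feasibility check of the kind a genetic algorithm might apply to candidate routes. This is an illustrative sketch only, not the authors' implementation: the `Stop` fields, the waiting rule for early arrivals, and the depot at index 0 are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Stop:
    load: float     # amount of waste collected at this point (assumed unit)
    open_t: float   # earliest time service may start
    close_t: float  # latest time service may start
    service: float  # time spent servicing the point


def route_is_feasible(route, stops, travel, capacity, vehicle_window):
    """Check one vehicle's route; location 0 is the (assumed) depot."""
    start, end = vehicle_window
    time, load, here = start, 0.0, 0
    for stop_id in route:
        time += travel[here][stop_id]
        stop = stops[stop_id]
        time = max(time, stop.open_t)      # assumed: wait if arriving early
        if time > stop.close_t:            # point's time window missed
            return False
        load += stop.load
        if load > capacity:                # vehicle capacity exceeded
            return False
        time += stop.service
        here = stop_id
    time += travel[here][0]                # return to the depot
    return time <= end                     # vehicle's own time window


# Made-up instance: symmetric travel times between depot (0) and two points.
travel = [[0, 10, 15], [10, 0, 5], [15, 5, 0]]
stops = {
    1: Stop(load=4, open_t=0, close_t=30, service=5),
    2: Stop(load=5, open_t=20, close_t=40, service=5),
}

print(route_is_feasible([1, 2], stops, travel, capacity=10, vehicle_window=(0, 60)))
```

In an orienteering setting not every point has to be visited, so a check like this would be paired with an objective that rewards the points a feasible route does cover.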
Abstract:
To solve a health and safety problem at a waste treatment facility, different multicriteria decision methods were used, including the PROV Exponential decision method. Four alternatives and ten attributes were considered. We found a congruent solution, validated by the different methods. The AHP and the PROV Exponential decision method led to the same ordering of options, but the latter method reinforced one of the options as the best-performing one and set apart the worst-performing option. The ELECTRE I method also led to the same ordering, which allowed us to identify the best solution with reasonable confidence. This paper demonstrates the potential of multicriteria decision methods to support decision making on complex problems such as risk control and accident prevention.
Abstract:
Given the current economic situation of Portuguese municipalities, it is necessary to identify priority investments in order to achieve more efficient financial management. Classifying the municipal road network according to the occurrence of traffic accidents is fundamental to setting priorities for road interventions. This paper presents a model for road network classification based on traffic accidents, integrated in a geographic information system. Its practical application was developed through a case study in the municipality of Barcelos. An equation was defined to obtain a road safety index through the combination of the following indicators: severity, property damage only and accident costs. In addition to the road network classification, the model makes it possible to analyze the spatial coverage of accidents in order to determine the centrality and dispersion of the locations with the highest incidence of road accidents. This analysis can be further refined according to the nature of the accidents, namely collisions, run-off-road crashes and pedestrian crashes.
Abstract:
"Lecture notes in computer science series, ISSN 0302-9743, vol. 9273"
Abstract:
Business Intelligence (BI) can be seen as a method that gathers information and data from information systems in order to help companies be more accurate in their decision-making process. Traditionally, BI systems were associated with the use of Data Warehouses (DW). The prime purpose of a DW is to serve as a repository that stores all the relevant information required for making the correct decision. Integrating streaming data became crucial with the need to improve the efficiency and effectiveness of the decision process. In primary and secondary education, there is a lack of BI solutions. Given this reality in schools, the main purpose of this study is to provide a Pervasive BI solution able to monitor school and student data anywhere and anytime in real time, as well as to disseminate the information through ubiquitous devices. The first task consisted of gathering data on the different choices made by a student from enrolment in a given school year until the end of it. A dimensional model was then developed to make it possible to build a BI platform. This paper presents the dimensional model, a set of pre-defined indicators, the Pervasive Business Intelligence characteristics and the prototype designed. The main contribution of this study was to offer schools a tool that could help them make accurate decisions in real time. Data dissemination was achieved through a localized application that can be accessed anywhere and anytime.
Abstract:
Children are an especially vulnerable population, particularly with respect to drug administration. It is estimated that neonatal and pediatric patients are at least three times more vulnerable than adults to damage due to adverse events and medication errors. With the development of this framework, the intention is to provide a Clinical Decision Support System based on a prototype already tested in a real environment. The framework will include features such as the preparation of Total Parenteral Nutrition prescriptions, a table of pediatric and neonatal emergency drugs, medical scales of morbidity and mortality, anthropometry percentiles (weight, length/height, head circumference and BMI), utilities for supporting medical decisions on the treatment of neonatal jaundice and anemia, support for technical procedures, and other calculators and widely used tools. The solution under development is an extension of the INTCare project. The main goal is to make the functionality available at all times during clinical practice, and outside the hospital environment for dissemination, education and the simulation of hypothetical situations. The aim is also to develop an area for the study and analysis of information and the extraction of knowledge from the data collected through the use of the system. This paper presents the architecture, its requirements and functionalities, and a SWOT analysis of the proposed solution.
Abstract:
The occurrence of Barotrauma is identified as a major concern for health professionals, since it can be fatal for patients. In order to support the decision process and to predict the risk of barotrauma occurring, Data Mining models were induced. On this basis, the present study addresses the Data Mining process with the aim of providing the hourly probability of a patient having Barotrauma. The process of discovering implicit knowledge in data collected from Intensive Care Unit patients followed the Cross Industry Standard Process for Data Mining. With the goal of making predictions according to the classification approach, several DM techniques were selected: Decision Trees, Naive Bayes and Support Vector Machine. The study focused on assessing the validity and viability of predicting a composite variable. To predict Barotrauma, two classes were created: “risk” and “no risk”. This target comes from combining two variables: Plateau Pressure and PCO2. The best models presented a sensitivity between 96.19% and 100%. In terms of accuracy, the values varied between 87.5% and 100%. This study and the results achieved demonstrated the feasibility of predicting the risk of a patient having Barotrauma by presenting the associated probability.
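The two metrics reported above can be made concrete with a minimal sketch. The labels below are made up for illustration and are not data or results from the study; the helper names are assumptions. Sensitivity is the share of true "risk" cases the model catches, while accuracy counts all correct predictions.

```python
def confusion_counts(actual, predicted, positive="risk"):
    """Return (tp, tn, fp, fn) for a two-class problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p != positive:
            tn += 1
        elif a != positive and p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up hourly labels (illustration only).
actual = ["risk", "risk", "no risk", "no risk",
          "risk", "no risk", "no risk", "no risk"]
predicted = ["risk", "risk", "risk", "no risk",
             "risk", "no risk", "no risk", "no risk"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(sensitivity(tp, fn), accuracy(tp, tn, fp, fn))
```

In this toy case every "risk" hour is caught, yet one false alarm lowers accuracy, which is one way sensitivity can exceed accuracy for the same model.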
Abstract:
This paper presents an improved version of an application whose goal is to provide a simple and intuitive way to use multicriteria decision methods in day-to-day decision problems. The application allows comparisons between several alternatives with several criteria, always keeping a permanent backup of both model and results, and provides a framework to incorporate new methods in the future. Developed in C#, the application implements the AHP, SMART and Value Functions methods.
Abstract:
A high-resolution mtDNA phylogenetic tree allowed us to look backward in time to investigate purifying selection. Purifying selection was very strong in the last 2,500 years, continuously eliminating pathogenic mutations back until the end of the Younger Dryas (∼11,000 years ago), when a large population expansion likely relaxed selection pressure. This was preceded by a phase of stable selection until another relaxation occurred in the out-of-Africa migration. Demography and selection are closely related: expansions led to relaxation of selection, and higher pathogenicity mutations significantly decreased the growth of descendants. The only detectable positive selection was the recurrence of highly pathogenic nonsynonymous mutations (m.3394T>C-m.3397A>G-m.3398T>C) at interior branches of the tree, preventing the formation of a dinucleotide STR (TATATA) in the MT-ND1 gene. At the most recent time scale, in 124 mother-children transmissions, purifying selection was detectable through the loss of mtDNA variants with high predicted pathogenicity. A few haplogroup-defining sites were also heteroplasmic, agreeing with a significant propensity in 349 positions in the phylogenetic tree to revert to the ancestral variant. This nonrandom mutation property explains the observation of heteroplasmic mutations at some haplogroup-defining sites in sequencing datasets, which may not indicate poor quality as has been claimed.
Abstract:
Football is nowadays considered one of the most popular sports. In the betting world, it has acquired an outstanding position, moving millions of euros during a single football match. The lack of profitability for football betting users has been stressed as a problem. This gave rise to the present research proposal, which analyses whether there is a way to help users increase the profits from their bets. Data mining models were induced with the purpose of helping gamblers increase their profits in the medium/long term. While the models can fail, the results achieved for four of the seven targets are encouraging and suggest that the system can help to increase profits. All defined targets have two possible classes to predict, for example, whether there are more or fewer than 7.5 corners in a single game. The data mining models for the targets of more or fewer than 7.5 corners, 8.5 corners, 1.5 goals and 3.5 goals achieved the pre-defined thresholds. The models were implemented in a prototype, which is a pervasive decision support system. This system was developed to serve as an interface for any user, both expert users and users with no knowledge of football.
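The two-class targets described above can be sketched as a simple labelling step applied to historical matches before any model is trained. The four lines (7.5 and 8.5 corners, 1.5 and 3.5 goals) come from the abstract; the function names, the dictionary keys and the example match totals are assumptions for illustration.

```python
def over_under(total, line):
    """Binary class for a half-point line; totals are integers, so no ties."""
    return "over" if total > line else "under"


# The four targets the abstract reports as reaching their thresholds.
LINES = {
    "corners_7.5": ("corners", 7.5),
    "corners_8.5": ("corners", 8.5),
    "goals_1.5": ("goals", 1.5),
    "goals_3.5": ("goals", 3.5),
}


def label_match(match):
    """Turn raw match totals into one class label per target."""
    return {name: over_under(match[field], line)
            for name, (field, line) in LINES.items()}


# Hypothetical match with 8 corners and 3 goals in total.
labels = label_match({"corners": 8, "goals": 3})
print(labels)
```

Because every line ends in .5, each match falls cleanly into one of the two classes, which is what makes these targets binary classification problems.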
Abstract:
Patient blood pressure is an important vital sign that helps physicians make decisions and better understand the patient's condition. In Intensive Care Units, blood pressure can be monitored because the patient is under continuous monitoring through bedside monitors and sensors. Intensivists only have access to vital sign values when they look at the monitor or consult the values collected hourly. Most important is the sequence of the values collected, i.e., a set of high or low values can signify a critical event and bring future complications to a patient, such as Hypotension or Hypertension. These complications can lead to a set of dangerous diseases and side effects. The main goal of this work is to predict the probability of a patient having a blood pressure critical event in the next hours by combining a set of patient data collected in real time and using Data Mining classification techniques. As output, the models indicate the probability (%) of a patient having a Blood Pressure Critical Event in the next hour. The results achieved are very promising, with a sensitivity of around 95%.