6 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations
at Universidade do Minho
Abstract:
Studies in Computational Intelligence, 616
Abstract:
Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at extracting useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened using a sensitivity analysis procedure, which revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. This extracted knowledge confirmed that the obtained predictive model is credible and has potential value for supporting the decisions of hospital managers.
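As a rough illustration of the coefficient of determination reported above, the sketch below computes R² for a set of predictions and shows why an "Average Prediction" baseline scores 0. The function name and the length-of-stay values are hypothetical and are not taken from the study, which may have used a different evaluation protocol.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay, in days (illustrative values only).
actual_days = [2.0, 5.0, 3.0, 10.0]

# An "Average Prediction" baseline always predicts the mean of the targets.
mean_days = sum(actual_days) / len(actual_days)
baseline_prediction = [mean_days] * len(actual_days)

baseline_r2 = r_squared(actual_days, baseline_prediction)
perfect_r2 = r_squared(actual_days, actual_days)
```

Under this definition, a model with R² = 0.81 explains about 81% of the variance in the target that the mean-only baseline leaves unexplained.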
Abstract:
We are living in the era of Big Data, a time characterized by the continuous creation of vast amounts of data, originating from different sources and in different formats. First with the rise of social networks and, more recently, with the advent of the Internet of Things (IoT), in which everyone and (eventually) everything is linked to the Internet, data with enormous potential for organizations are being continuously generated. In order to be more competitive, organizations want to access and explore all the richness that is present in those data. Indeed, Big Data is only as valuable as the insights organizations gather from it to make better decisions, which is the main goal of Business Intelligence. In this paper we describe an experiment in which data obtained from a NoSQL data source (a database technology explicitly developed to deal with the specificities of Big Data) are used to feed a Business Intelligence solution.
Abstract:
This paper describes the concept, technical realisation and validation of a largely data-driven method to model events with Z→ττ decays. In Z→μμ events selected from proton-proton collision data recorded at √s = 8 TeV with the ATLAS experiment at the LHC in 2012, the Z decay muons are replaced by τ leptons from simulated Z→ττ decays at the level of reconstructed tracks and calorimeter cells. The τ lepton kinematics are derived from the kinematics of the original muons. Thus, only the well-understood decays of the Z boson and τ leptons as well as the detector response to the τ decay products are obtained from simulation. All other aspects of the event, such as the Z boson and jet kinematics as well as effects from multiple interactions, are given by the actual data. This so-called τ-embedding method is particularly relevant for Higgs boson searches and analyses in ττ final states, where Z→ττ decays constitute a large irreducible background that cannot be obtained directly from data control samples.
Abstract:
Football is nowadays considered one of the most popular sports. In the betting world, it has acquired an outstanding position, moving millions of euros during a single football match. The lack of profitability of football betting users has been stressed as a problem. This gave rise to this research proposal, which analyses whether there is a way to help users increase the profits on their bets. Data mining models were induced with the purpose of helping gamblers increase their profits in the medium/long term. While the models can fail, the results achieved for four of the seven targets are encouraging and suggest that the system can help to increase profits. All defined targets have two possible classes to predict, for example, whether there are more or fewer than 7.5 corners in a single game. The data mining models for the targets more or fewer than 7.5 corners, 8.5 corners, 1.5 goals and 3.5 goals achieved the pre-defined thresholds. The models were implemented in a prototype, which is a pervasive decision support system. This system was developed to serve as an interface for any user, whether an expert or someone with no knowledge of football.
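The two-class targets described above can be sketched as a simple thresholding step. The helper name `over_under_label` and the corner counts are hypothetical, shown only to make the "more or fewer than 7.5 corners" encoding concrete; the study's actual preprocessing is not given in the abstract.

```python
def over_under_label(count, line):
    """Return 'over' if the match total exceeds the betting line, else 'under'.

    Lines such as 7.5 or 8.5 are half-integers, so an integer count
    can never equal the line and every match gets exactly one class.
    """
    return "over" if count > line else "under"


# Hypothetical corner totals for four matches (illustrative values only).
corners_per_match = [6, 9, 8, 11]

labels_7_5 = [over_under_label(c, 7.5) for c in corners_per_match]
labels_8_5 = [over_under_label(c, 8.5) for c in corners_per_match]
```

The same match can fall in different classes for different lines (here, the match with 8 corners), which is why each line is treated as a separate prediction target.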
Abstract:
Healthcare organizations often benefit from information technologies as well as embedded decision support systems, which improve the quality of services and help prevent complications and adverse events. In Centro Materno Infantil do Norte (CMIN), the maternal and perinatal care unit of Centro Hospitalar of Oporto (CHP), an intelligent pre-triage system is implemented, aiming to prioritize patients in need of gynaecology and obstetrics care into two classes: urgent and consultation. The system is designed to avoid emergency problems such as incorrect triage outcomes and extensive triage waiting times. The current study intends to improve the triage system and thereby optimize the patient workflow through the emergency room by predicting the triage waiting time, defined as the time between patient triage and medical admission. For this purpose, data mining (DM) techniques are applied to selected information provided by the information technologies implemented in CMIN. The DM models achieved accuracy values of approximately 94% with a five-range target distribution, which not only yields confident prediction models but also identifies the variables that directly influence triage waiting times.
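A five-range target of the kind mentioned above could be built by binning the waiting time into intervals, turning the prediction into a classification problem. The sketch below is only an assumption about how such a target might be constructed: the boundaries in `RANGE_EDGES_MIN` and the function name are hypothetical, since the abstract does not state the ranges used in the study.

```python
from bisect import bisect_right

# Hypothetical upper boundaries, in minutes, separating five ranges:
# [0, 15), [15, 30), [30, 60), [60, 120), [120, inf).
RANGE_EDGES_MIN = [15, 30, 60, 120]


def wait_time_class(minutes):
    """Map a waiting time in minutes to a class index from 0 to 4."""
    return bisect_right(RANGE_EDGES_MIN, minutes)


# Hypothetical waiting times between triage and medical admission.
waits = [10, 15, 45, 120, 200]
classes = [wait_time_class(w) for w in waits]
```

With a discretized target like this, accuracy measures how often the predicted range matches the observed one, which is the kind of figure the abstract reports.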