22 results for Data Mining and its Application

at Universidade do Minho


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Earthworks aim to level the ground surface at a target construction area and precede any kind of structural construction (e.g., road and railway construction). They comprise sequential tasks, such as excavation, transportation, spreading and compaction, and rely strongly on heavy mechanical equipment and repetitive processes. In this context, it is essential to optimize the usage of all available resources under two key criteria: the cost and the duration of earthwork projects. In this paper, we present an integrated system that uses two artificial intelligence based techniques: data mining and evolutionary multi-objective optimization. The former is used to build data-driven models capable of providing realistic estimates of resource productivity, while the latter is used to optimize resource allocation considering the two main earthwork objectives (duration and cost). Experiments using real-world data from a construction site have shown that the proposed system is competitive when compared with current manual earthwork design.
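To illustrate what optimizing two objectives at once means here, the following is a minimal sketch of Pareto filtering over (cost, duration) pairs. It is not the evolutionary algorithm used by the authors; the function names and the candidate values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate allocation summarized as (cost, duration); both are minimized.
Plan = Tuple[float, float]


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical (cost, duration) values, not taken from the paper.
candidates = [(100.0, 30.0), (120.0, 25.0), (130.0, 25.0), (90.0, 40.0), (110.0, 35.0)]
front = pareto_front(candidates)
```

A multi-objective optimizer returns a set of such non-dominated trade-offs rather than a single best plan, leaving the final choice between cheaper and faster designs to the engineer.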

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened up using a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. The extracted knowledge confirmed that the obtained predictive model is credible and has potential value for supporting the decisions of hospital managers.
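The coefficient of determination quoted above can be computed as in the following minimal sketch. The function name and the sample values are assumptions made for illustration; they are not data or code from the study.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days, not taken from the hospital data.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
score = r_squared(actual, predicted)

# Always predicting the mean of the targets scores exactly zero.
baseline = r_squared(actual, [5.0, 5.0, 5.0, 5.0])
```

Under this definition, a model that always predicts the mean of the evaluated targets scores 0, which is why an average-prediction baseline is a natural reference point for the other learners.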

Relevance:

100.00% 100.00%

Publisher:

Abstract:

...intelligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used to search for relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used in order to group articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping the articles, enabling hypotheses regarding research directions to emerge. To confirm these hypotheses, relevant articles were collected and scrutinized, which validated the text mining procedure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry only for evaluation and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.
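The dictionary-based step described above can be pictured with the following minimal sketch, which counts whole-word occurrences of dictionary terms in a text. It covers only the term matching and does not implement latent Dirichlet allocation; the function name, the terms and the sample sentence are hypothetical.

```python
import re
from typing import Dict, Iterable


def count_dictionary_terms(text: str, dictionary: Iterable[str]) -> Dict[str, int]:
    """Count case-insensitive, whole-word occurrences of each dictionary term."""
    lowered = text.lower()
    counts: Dict[str, int] = {}
    for term in dictionary:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


# Hypothetical dictionary and text, not taken from the reviewed articles.
terms = ["credit", "fraud", "churn"]
sample = "Credit risk models support credit approval; fraud detection is also studied."
counts = count_dictionary_terms(sample, terms)
```

Per-document counts of this kind form the document-term matrix that a topic model would then take as input.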

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Rockburst is characterized by the violent explosion of a block, causing a sudden rupture in the rock, and is quite common in deep tunnels. It is critical to understand the phenomenon of rockburst, focusing on its patterns of occurrence, so that these events can be avoided and/or managed, saving costs and possibly lives. The failure mechanism of rockburst needs to be better understood. Laboratory experiments are under way at the Laboratory for Geomechanics and Deep Underground Engineering (SKLGDUE) of Beijing, and the system is described. A large number of rockburst tests were performed and their information was collected, stored in a database and analyzed. Data Mining (DM) techniques were applied to the database in order to develop predictive models for the rockburst maximum stress (σRB) and the rockburst risk index (IRB), which otherwise require the results of such tests to be determined. With the developed models it is possible to predict these parameters with high accuracy using data from the rock mass and the specific project.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Integrated master's dissertation in Engineering and Management of Information Systems

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, inconsistencies concerning gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored on PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Football is nowadays considered one of the most popular sports. In the betting world, it has acquired an outstanding position, moving millions of euros during a single football match. The lack of profitability for football betting users has been stressed as a problem. This lack gave rise to this research proposal, which analyses whether there is a way to help users increase the profits from their bets. Data mining models were induced with the purpose of helping gamblers increase their profits in the medium to long term. While aware that the models can fail, the results achieved for four of the seven targets are encouraging and suggest that the system can help to increase profits. All defined targets have two possible classes to predict, for example, whether there are more or fewer than 7.5 corners in a single game. The data mining models for the targets of more or fewer than 7.5 corners, 8.5 corners, 1.5 goals and 3.5 goals achieved the pre-defined thresholds. The models were implemented in a prototype, which is a pervasive decision support system. This system was developed to serve as an interface for any user, whether an expert or someone with no knowledge of football.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The excavations carried out under the rescue “Project of Bracara Augusta” have generated significant amounts of data that enabled the reconstruction of the urban evolution of Bracara Augusta and the characterization of its buildings and blocks. This paper aims to enhance the existing data related to the domestic architecture of the Roman town, which was mainly represented by houses of the domus type.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Worldwide, around 9% of children are born before 37 weeks of gestation, which puts the premature child at risk, as the child is not prepared to develop a number of basic functions that begin soon after birth. In order to ensure that these at-risk pregnancies are properly monitored by obstetricians in time to avoid such problems, Data Mining (DM) models were induced in this study to predict preterm births in a real environment, using data from 3376 patients (women) admitted to the maternal and perinatal care unit of Centro Hospitalar of Oporto. A sensitive metric to predict preterm deliveries was developed, assisting physicians in the decision-making process regarding patient observation. Promising results were obtained, with sensitivity and specificity values of 96% and 98%, respectively.
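For readers unfamiliar with the two metrics reported above, the following minimal sketch computes sensitivity and specificity from binary labels. The function name and the label vectors are hypothetical and do not reproduce the study's data or results.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = preterm birth, 0 = term birth.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Sensitivity is the share of actual preterm cases that the model flags, while specificity is the share of term births that it correctly leaves unflagged.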

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Lecture Notes in Computer Science, 9273

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In Maternity Care, a quick decision has to be made about the most suitable delivery type for the current patient. Physicians follow guidelines to support that decision; however, those practice recommendations are limited and underused. In recent years, caesarean delivery has been pursued in over 28% of pregnancies, and other operative techniques for specific problems have also been excessively employed. This study identifies obstetric and pregnancy factors that can be used to predict the most appropriate delivery technique, through the induction of data mining models using real data gathered in the perinatal and maternal care unit of Centro Hospitalar of Oporto (CHP). Predicting the type of birth is expected to support high-quality services and to increase the safety and effectiveness of specific practices, helping to guide maternity care decisions and facilitate optimal outcomes for mother and child. In this work it was possible to obtain good results, achieving sensitivity and specificity values of 90.11% and 80.05%, respectively, providing the CHP with a model capable of correctly identifying caesarean sections and vaginal deliveries.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study aims to analyse the relationship between safety climate and the level of risk acceptance, as well as its relationship with workplace safety performance. The sample includes 14 companies and 403 workers. The safety climate assessment was performed by the application of a Safety Climate in Wood Industries questionnaire and safety performance was assessed with a checklist. Judgements about risk acceptance were measured through questionnaires together with four other variables: trust, risk perception, benefit perception and emotion. Safety climate was found to be correlated with workgroup safety performance, and it also plays an important role in workers’ risk acceptance levels. Risk acceptance tends to be lower when safety climate scores of workgroups are high, and subsequently, their safety performance is better. These findings seem to be relevant, as they provide Occupational, Safety and Health practitioners with a better understanding of workers’ risk acceptance levels and of the differences among workgroups.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The jet energy scale (JES) and its systematic uncertainty are determined for jets measured with the ATLAS detector using proton–proton collision data with a centre-of-mass energy of $\sqrt{s} = 7$ TeV corresponding to an integrated luminosity of 4.7 fb$^{-1}$. Jets are reconstructed from energy deposits forming topological clusters of calorimeter cells using the anti-$k_t$ algorithm with distance parameters $R = 0.4$ or $R = 0.6$, and are calibrated using MC simulations. A residual JES correction is applied to account for differences between data and MC simulations. This correction and its systematic uncertainty are estimated using a combination of in situ techniques exploiting the transverse momentum balance between a jet and a reference object such as a photon or a Z boson, for $20 \le p_T^{\mathrm{jet}} < 1000$ GeV and pseudorapidities $|\eta| < 4.5$. The effect of multiple proton–proton interactions is corrected for, and an uncertainty is evaluated using in situ techniques. The smallest JES uncertainty of less than 1% is found in the central calorimeter region ($|\eta| < 1.2$) for jets with $55 \le p_T^{\mathrm{jet}} < 500$ GeV. For central jets at lower $p_T$, the uncertainty is about 3%. A consistent JES estimate is found using measurements of the calorimeter response of single hadrons in proton–proton collisions and test-beam data, which also provide the estimate for $p_T^{\mathrm{jet}} > 1$ TeV. The calibration of forward jets is derived from dijet $p_T$ balance measurements. The resulting uncertainty reaches its largest value of 6% for low-$p_T$ jets at $|\eta| = 4.5$. Additional JES uncertainties due to specific event topologies, such as close-by jets or selections of event samples with an enhanced content of jets originating from light quarks or gluons, are also discussed. The magnitude of these uncertainties depends on the event sample used in a given physics analysis, but typically amounts to 0.5–3%.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the present study we aimed to analyze the relationship between infants' behavior and their visual evoked potential (VEP) response. Specifically, we wanted to verify whether the VEP response differs between sleeping and awake infants and whether an association between VEP components and neurobehavioral outcome could be identified in both groups. To do so, thirty-two full-term, healthy infants, approximately 1 month of age, were assessed through an unpatterned flashlight VEP stimulus paradigm, presented at two different intensities, and were also assessed using a neurobehavioral scale. However, only 18 infants had both assessments, and therefore this is the total included in both analyses. Infants displayed a mature neurobehavioral outcome, as expected for their age. We observed that the P2 and N3 components were present in both sleeping and awake infants. Differences between intensities were found in the P2 amplitude, but only in awake infants. Regression analysis showed that N3 amplitude predicted adequate social interactive and internal regulatory behavior in infants who were awake during stimulus presentation. Taking into account that social orientation and regulatory behaviors are fundamental for social-like behavior in 1-month-old infants, this study provides an important approach for assessing physiological biomarkers (VEPs) and their relation to social behavior very early in postnatal development. Moreover, we provide evidence of the importance of the infant's state when studying differences in visual threshold processing and its association with behavioral outcome.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Integrated master's dissertation in Engineering and Management of Information Systems