27 results for mining algorithm


Abstract:

This paper reports the precipitation process of Al3Sc structures in an aluminum-scandium alloy, simulated with a synchronous parallel kinetic Monte Carlo (spkMC) algorithm. The spkMC implementation is based on the vacancy diffusion mechanism. To filter the raw data generated by the spkMC simulations, the density-based spatial clustering of applications with noise (DBSCAN) method was employed. The spkMC and DBSCAN algorithms were implemented in C using the MPI library. The simulations were run on the SeARCH cluster at the University of Minho. The Al3Sc precipitation was successfully simulated at the atomistic scale with spkMC. DBSCAN proved to be a valuable aid in identifying the precipitates through a cluster analysis of the simulation results. The simulation results are in good agreement with those reported in the literature for sequential kinetic Monte Carlo (kMC) simulations. The parallel implementation of kMC provided a 4x speedup over the sequential version.
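As an illustration of the clustering step described in this abstract, the following is a minimal single-process sketch of DBSCAN in Python. It is not the authors' C/MPI implementation; the function name `dbscan`, the parameters `eps` and `min_pts`, and the brute-force neighbor search are assumptions made for the example.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force search: indices of all points within eps of point i,
        # including i itself (an assumption of this sketch, O(n) per query).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
    return labels
```

With atom coordinates as `points`, clusters would correspond to candidate precipitates and `NOISE` to isolated solute atoms, under the assumptions above.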

Abstract:

Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at extracting useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained with the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened by means of a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. The extracted knowledge confirmed that the predictive model is credible and of potential value for supporting the decisions of hospital managers.

Abstract:

[...] intelligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used to group the articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping the articles, allowing hypotheses about research directions to emerge. To confirm these hypotheses, relevant articles were collected and scrutinized, which validated the text mining procedure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry just for evaluation and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.

Abstract:

Earthworks aim at levelling the ground surface of a target construction area and precede any kind of structural construction (e.g., road and railway construction). They comprise sequential tasks, such as excavation, transportation, spreading and compaction, and rely heavily on heavy mechanical equipment and repetitive processes. In this context, it is essential to optimize the usage of all available resources under two key criteria: the cost and duration of earthwork projects. In this paper, we present an integrated system that uses two artificial intelligence based techniques: data mining and evolutionary multi-objective optimization. The former is used to build data-driven models capable of providing realistic estimates of resource productivity, while the latter is used to optimize resource allocation considering the two main earthwork objectives (duration and cost). Experiments conducted with real-world data from a construction site have shown that the proposed system is competitive when compared with current manual earthwork design.

Abstract:

Worldwide, around 9% of children are born before 37 weeks of gestation, which puts the premature child at risk because he or she is not yet prepared to develop a number of basic functions that begin soon after birth. To ensure that at-risk pregnancies are properly monitored by obstetricians in time to avoid those problems, Data Mining (DM) models were induced in this study to predict preterm births in a real environment, using data from 3376 patients (women) admitted to the maternal and perinatal care unit of Centro Hospitalar of Oporto. A sensitive metric to predict preterm deliveries was developed, assisting physicians in the decision-making process regarding patient observation. The results were promising, with sensitivity and specificity values of 96% and 98%, respectively.
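For readers unfamiliar with the two metrics quoted above, the sketch below shows how sensitivity and specificity are computed from binary labels. The function name and the toy labels are hypothetical; they are not taken from the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of actual positives that are detected.
    specificity = TN / (TN + FP): share of actual negatives that are rejected.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Return NaN when a class is absent, instead of dividing by zero.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical example: 1 = preterm birth, 0 = term birth.
sens, spec = sensitivity_specificity(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
)
```

In this toy example two of the three positives are detected and four of the five negatives are correctly rejected.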

Abstract:

Lecture Notes in Computer Science, 9273

Abstract:

In maternity care, a quick decision has to be made about the most suitable delivery type for the current patient. Physicians follow guidelines to support that decision; however, those practice recommendations are limited and underused. In recent years, caesarean delivery has been pursued in over 28% of pregnancies, and other operative techniques for specific problems have also been employed excessively. This study identifies obstetric and pregnancy factors that can be used to predict the most appropriate delivery technique, through the induction of data mining models using real data gathered in the perinatal and maternal care unit of Centro Hospitalar of Oporto (CHP). Predicting the type of birth aims at high-quality services and at increased safety and effectiveness of specific practices, helping to guide maternity care decisions and facilitate optimal outcomes for mother and child. Good results were obtained, with sensitivity and specificity values of 90.11% and 80.05%, respectively, providing the CHP with a model capable of correctly identifying caesarean sections and vaginal deliveries.

Abstract:

Rockburst is characterized by the violent explosion of a block, causing a sudden rupture in the rock, and is quite common in deep tunnels. It is critical to understand the phenomenon of rockburst, focusing on its patterns of occurrence, so that these events can be avoided and/or managed, saving costs and possibly lives. The failure mechanism of rockburst needs to be better understood. Laboratory experiments are under way at the Laboratory for Geomechanics and Deep Underground Engineering (SKLGDUE) of Beijing, and the system is described. A large number of rockburst tests were performed and their results collected, stored in a database and analyzed. Data Mining (DM) techniques were applied to the database in order to develop predictive models for the rockburst maximum stress (σRB) and the rockburst risk index (IRB), which otherwise require the results of such tests to be determined. With the developed models it is possible to predict these parameters with high accuracy using data from the rock mass and the specific project.

Abstract:

Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several interactions that occur within a cell. Recent studies have provided information that helps researchers build and understand these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This has heightened the importance of Biomedical Text Mining approaches for this task. Also, owing to the lack of adequate standards, inconsistencies concerning gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides several methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.

Abstract:

The artificial fish swarm algorithm has recently emerged in continuous global optimization. It uses points of a population in space to identify the position of fish in the school. Many real-world optimization problems are described by 0-1 multidimensional knapsack problems, which are NP-hard. In recent decades, several exact as well as heuristic methods have been proposed for solving these problems. In this paper, a new simplified binary version of the artificial fish swarm algorithm is presented, in which a point/fish is represented by a binary string of 0/1 bits. Trial points are created by using crossover and mutation in the different fish behaviors, which are randomly selected by using two user-defined probability values. To make the points feasible, the presented algorithm uses a random heuristic drop-item procedure followed by an add-item procedure that aims to increase the profit by adding more items to the knapsack. A cyclic reinitialization of 50% of the population, together with a simple local search that moves a small percentage of points towards optimality and then refines the best point in the population, greatly improves the quality of the solutions. The presented method is tested on a set of benchmark instances, and a comparison with other methods available in the literature is shown. The comparison shows that the proposed method can be an alternative for solving these problems.
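The drop-item/add-item repair mentioned in this abstract can be sketched as follows for a 0-1 multidimensional knapsack. This is an illustrative reading of the abstract, not the authors' procedure: the function names are hypothetical, and trying items in order of decreasing profit during the add phase is an assumption (the abstract does not state the order).

```python
import random


def is_feasible(x, weights, capacities):
    """True if the 0/1 vector x satisfies every capacity constraint."""
    return all(
        sum(w * bit for w, bit in zip(row, x)) <= cap
        for row, cap in zip(weights, capacities)
    )


def repair(x, profits, weights, capacities, rng=random):
    """Make x feasible by random drops, then greedily add items that fit.

    Assumes non-negative weights and capacities, so that the empty
    knapsack is always feasible and the drop phase terminates.
    """
    x = list(x)
    # Drop phase: remove randomly chosen items until all constraints hold.
    while not is_feasible(x, weights, capacities):
        chosen = [j for j, bit in enumerate(x) if bit == 1]
        x[rng.choice(chosen)] = 0
    # Add phase (order is an assumption): try items by decreasing profit
    # and keep each one only if the point stays feasible.
    for j in sorted(range(len(x)), key=lambda k: -profits[k]):
        if x[j] == 0:
            x[j] = 1
            if not is_feasible(x, weights, capacities):
                x[j] = 0
    return x


# Hypothetical instance: 4 items, 2 capacity constraints.
profits = [10, 7, 4, 3]
weights = [[5, 4, 3, 2], [4, 3, 2, 1]]
capacities = [8, 6]
repaired = repair([1, 1, 1, 1], profits, weights, capacities, random.Random(0))
```

The returned point is feasible and maximal, in the sense that no further item can be added without violating a constraint.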

Abstract:

The Electromagnetism-like (EM) algorithm is a population-based stochastic global optimization algorithm that uses an attraction-repulsion mechanism to move sample points towards the optimum. In this paper, an implementation of the EM algorithm in the Matlab environment is proposed as a useful function for practitioners and for those who want to experiment with a new global optimization solver. A set of benchmark problems is solved in order to evaluate the performance of the implemented method when compared with other stochastic methods available in the Matlab environment. The results confirm that our implementation is a competitive alternative in terms of both numerical results and performance. Finally, a case study based on a parameter estimation problem of a biological system shows that the EM implementation could be applied with promising results in the control optimization area.

Abstract:

In this paper, we propose an extension of the firefly algorithm (FA) to multi-objective optimization. FA is a swarm intelligence optimization algorithm inspired by the flashing behavior of fireflies at night and is capable of computing global solutions to continuous optimization problems. Our proposal relies on a fitness assignment scheme that gives lower fitness values to the positions of fireflies that correspond to non-dominated points with a smaller aggregation of objective function distances to the minimum values. Furthermore, FA randomness is based on the spread metric to reduce the gaps between consecutive non-dominated solutions. The results of preliminary computational experiments show that our proposal gives a dense and well-distributed approximation of the Pareto front with a large number of points.
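The notion of non-dominated points used in this abstract can be illustrated with a short sketch. It assumes every objective is minimized; the function names and sample points are hypothetical, and the code shows only the dominance test, not the paper's fitness assignment or spread-based randomness.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical bi-objective values (f1, f2).
front = non_dominated([(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)])
```

Here `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` by `(1, 5)`, so neither belongs to the approximated Pareto front.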

Abstract:

This paper presents a single-phase Series Active Power Filter (Series APF) for mitigating the harmonic content of the load voltage while keeping the DC-side voltage regulated without the support of a voltage source. The proposed control algorithm eliminates the additional voltage source otherwise needed to regulate the DC voltage, and with the adopted topology no coupling transformer is used to interface the series active power filter with the electrical power grid. The paper describes the control strategy, which comprises the grid synchronization scheme, the compensation voltage calculation, the damping algorithm and the dead-time compensation. The topology and control strategy of the series active power filter were evaluated in simulation software, and simulation results are presented. Experimental results, obtained with a laboratory prototype, validate the theoretical assumptions and are within the harmonic spectrum limits imposed by the international recommendations of the IEEE-519 Standard.

Abstract:

Natural selection favors the survival and reproduction of organisms that are best adapted to their environment. The selection mechanism in evolutionary algorithms mimics this process, aiming to create environmental conditions in which artificial organisms can evolve to solve the problem at hand. This paper proposes a new selection scheme for evolutionary multiobjective optimization. The similarity measure that defines the concept of neighborhood is a key feature of the proposed selection. In contrast to commonly used approaches, which are usually defined on the basis of distances between either individuals or weight vectors, similarity and neighborhood are defined here by the angle between individuals in the objective space. The smaller the angle, the more similar the individuals. This notion is exploited during mating and environmental selection. Convergence is ensured by minimizing the distances from individuals to a reference point, whereas diversity is preserved by maximizing the angles between neighboring individuals. Experimental results reveal highly competitive performance and useful characteristics of the proposed selection. Its strong diversity-preserving ability yields significantly better performance on some problems when compared with state-of-the-art algorithms.
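The angle-based similarity described in this abstract can be sketched as below. Measuring the angle between objective vectors after translating them by a reference point is an assumption of this example, as are the function name and the sample values; the abstract does not specify these details.

```python
from math import acos, pi, sqrt


def angle(f_a, f_b, ref):
    """Angle in radians between two objective vectors, seen from ref.

    Assumes neither f_a nor f_b coincides with ref (non-zero vectors).
    """
    u = [a - r for a, r in zip(f_a, ref)]
    v = [b - r for b, r in zip(f_b, ref)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return acos(max(-1.0, min(1.0, dot / norm)))


# Hypothetical bi-objective individuals and reference point.
ref = (0.0, 0.0)
similar = angle((1.0, 1.0), (2.0, 2.0), ref)     # same direction
dissimilar = angle((1.0, 0.0), (0.0, 1.0), ref)  # orthogonal directions
```

Individuals lying in the same direction from the reference point give an angle close to zero, so under this measure they count as neighbors even when their distances to the reference point differ.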