59 resultados para EVOLUTIONARY ALGORITHMS (EAS)
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined creating a pool of rules and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed and the experimental results show an improvement in the classification performance measured as the area under the receiver operating characteristics (ROC) curve.
Resumo:
This paper proposes the use of the q-Gaussian mutation with self-adaptation of the shape of the mutation distribution in evolutionary algorithms. The shape of the q-Gaussian mutation distribution is controlled by a real parameter q. In the proposed method, the real parameter q of the q-Gaussian mutation is encoded in the chromosome of individuals and hence is allowed to evolve during the evolutionary process. In order to test the new mutation operator, evolution strategy and evolutionary programming algorithms with self-adapted q-Gaussian mutation generated from anisotropic and isotropic distributions are presented. The theoretical analysis of the q-Gaussian mutation is also provided. In the experimental study, the q-Gaussian mutation is compared to Gaussian and Cauchy mutations in the optimization of a set of test functions. Experimental results show the efficiency of the proposed method of self-adapting the mutation distribution in evolutionary algorithms.
Resumo:
The power loss reduction in distribution systems (DSs) is a nonlinear and multiobjective problem. Service restoration in DSs is even computationally hard since it additionally requires a solution in real-time. Both DS problems are computationally complex. For large-scale networks, the usual problem formulation has thousands of constraint equations. The node-depth encoding (NDE) enables a modeling of DSs problems that eliminates several constraint equations from the usual formulation, making the problem solution simpler. On the other hand, a multiobjective evolutionary algorithm (EA) based on subpopulation tables adequately models several objectives and constraints, enabling a better exploration of the search space. The combination of the multiobjective EA with NDE (MEAN) results in the proposed approach for solving DSs problems for large-scale networks. Simulation results have shown the MEAN is able to find adequate restoration plans for a real DS with 3860 buses and 632 switches in a running time of 0.68 s. Moreover, the MEAN has shown a sublinear running time in function of the system size. Tests with networks ranging from 632 to 5166 switches indicate that the MEAN can find network configurations corresponding to a power loss reduction of 27.64% for very large networks requiring relatively low running time.
Resumo:
This paper presents a new methodology to estimate unbalanced harmonic distortions in a power system, based on measurements of a limited number of given sites. The algorithm utilizes evolutionary strategies (ES), a development branch of evolutionary algorithms. The problem solving algorithm herein proposed makes use of data from various power quality meters, which can either be synchronized by high technology GPS devices or by using information from a fundamental frequency load flow, what makes the overall power quality monitoring system much less costly. The ES based harmonic estimation model is applied to a 14 bus network to compare its performance to a conventional Monte Carlo approach. It is also applied to a 50 bus subtransmission network in order to compare the three-phase and single-phase approaches as well as the robustness of the proposed method. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
This paper presents a new methodology to estimate harmonic distortions in a power system, based on measurements of a limited number of given sites. The algorithm utilizes evolutionary strategies (ES), a development branch of evolutionary algorithms. The main advantage in using such a technique relies upon its modeling facilities as well as its potential to solve fairly complex problems. The problem-solving algorithm herein proposed makes use of data from various power-quality (PQ) meters, which can either be synchronized by high technology global positioning system devices or by using information from a fundamental frequency load flow. This second approach makes the overall PQ monitoring system much less costly. The algorithm is applied to an IEEE test network, for which sensitivity analysis is performed to determine how the parameters of the ES can be selected so that the algorithm performs in an effective way. Case studies show fairly promising results and the robustness of the proposed method.
Resumo:
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms under interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms under interest are provided. Examples with computational experiments and statistical analyses are also presented.
Resumo:
Model trees are a particular case of decision trees employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end-user to get more confidence in the prediction and providing the basis for the end-user to have new insight about the data, confirming or rejecting hypotheses previously formed. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-Complete problem, traditional model tree induction algorithms make use of a greedy top-down divide-and-conquer strategy, which may not converge to the global optimal solution. In this paper, we propose a novel algorithm based on the use of the evolutionary algorithms paradigm as an alternate heuristic to generate model trees in order to improve the convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model trees induction algorithms, as well as to other evolutionary approaches. Results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications. (C) 2010 Elsevier Inc. All rights reserved.
Resumo:
The purpose of this paper is to propose a multiobjective optimization approach for solving the manufacturing cell formation problem, explicitly considering the performance of this said manufacturing system. Cells are formed so as to simultaneously minimize three conflicting objectives, namely, the level of the work-in-process, the intercell moves and the total machinery investment. A genetic algorithm performs a search in the design space, in order to approximate to the Pareto optimal set. The values of the objectives for each candidate solution in a population are assigned by running a discrete-event simulation, in which the model is automatically generated according to the number of machines and their distribution among cells implied by a particular solution. The potential of this approach is evaluated via its application to an illustrative example, and a case from the relevant literature. The obtained results are analyzed and reviewed. Therefore, it is concluded that this approach is capable of generating a set of alternative manufacturing cell configurations considering the optimization of multiple performance measures, greatly improving the decision making process involved in planning and designing cellular systems. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
In this paper a computational implementation of an evolutionary algorithm (EA) is shown in order to tackle the problem of reconfiguring radial distribution systems. The developed module considers power quality indices such as long duration interruptions and customer process disruptions due to voltage sags, by using the Monte Carlo simulation method. Power quality costs are modeled into the mathematical problem formulation, which are added to the cost of network losses. As for the EA codification proposed, a decimal representation is used. The EA operators, namely selection, recombination and mutation, which are considered for the reconfiguration algorithm, are herein analyzed. A number of selection procedures are analyzed, namely tournament, elitism and a mixed technique using both elitism and tournament. The recombination operator was developed by considering a chromosome structure representation that maps the network branches and system radiality, and another structure that takes into account the network topology and feasibility of network operation to exchange genetic material. The topologies regarding the initial population are randomly produced so as radial configurations are produced through the Prim and Kruskal algorithms that rapidly build minimum spanning trees. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
When building genetic maps, it is necessary to choose from several marker ordering algorithms and criteria, and the choice is not always simple. In this study, we evaluate the efficiency of algorithms try (TRY), seriation (SER), rapid chain delineation (RCD), recombination counting and ordering (RECORD) and unidirectional growth (UG), as well as the criteria PARF (product of adjacent recombination fractions), SARF (sum of adjacent recombination fractions), SALOD (sum of adjacent LOD scores) and LHMC (likelihood through hidden Markov chains), used with the RIPPLE algorithm for error verification, in the construction of genetic linkage maps. A linkage map of a hypothetical diploid and monoecious plant species was simulated containing one linkage group and 21 markers with fixed distance of 3 cM between them. In all, 700 F(2) populations were randomly simulated with and 400 individuals with different combinations of dominant and co-dominant markers, as well as 10 and 20% of missing data. The simulations showed that, in the presence of co-dominant markers only, any combination of algorithm and criteria may be used, even for a reduced population size. In the case of a smaller proportion of dominant markers, any of the algorithms and criteria (except SALOD) investigated may be used. In the presence of high proportions of dominant markers and smaller samples (around 100), the probability of repulsion linkage increases between them and, in this case, use of the algorithms TRY and SER associated to RIPPLE with criterion LHMC would provide better results. Heredity (2009) 103, 494-502; doi:10.1038/hdy.2009.96; published online 29 July 2009
Resumo:
Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided into multiple binary subproblems, whose results are combined. Generally, the performance of SVM classifiers is affected by the selection of values for their parameters. This paper investigates the use of genetic algorithms (GAs) to tune the parameters of the binary SVMs in common multiclass decompositions. The developed GA may search for a set of parameter values common to all binary classifiers or for differentiated values for each binary classifier. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
Aims. We determine the age and mass of the three best solar twin candidates in open cluster M 67 through lithium evolutionary models. Methods. We computed a grid of evolutionary models with non-standard mixing at metallicity [Fe/H] = 0.01 with the Toulouse-Geneva evolution code for a range of stellar masses. We estimated the mass and age of 10 solar analogs belonging to the open cluster M 67. We made a detailed study of the three solar twins of the sample, YPB637, YPB1194, and YPB1787. Results. We obtained a very accurate estimation of the mass of our solar analogs in M 67 by interpolating in the grid of evolutionary models. The three solar twins allowed us to estimate the age of the open cluster, which is 3.87(-0.66)(+0.55) Gyr, which is better constrained than former estimates. Conclusions. Our results show that the 3 solar twin candidates have one solar mass within the errors and that M 67 has a solar age within the errors, validating its use as a solar proxy. M 67 is an important cluster when searching for solar twins.
Resumo:
Context. Compact groups of galaxies are entities that have high densities of galaxies and serve as laboratories to study galaxy interactions, intergalactic star formation and galaxy evolution. Aims. The main goal of this study is to search for young objects in the intragroup medium of seven compact groups of galaxies: HCG 2, 7, 22, 23, 92, 100 and NGC 92 as well as to evaluate the stage of interaction of each group. Methods. We used Fabry-Perot velocity fields and rotation curves together with GALEX NUV and FUV images and optical R-band and HI maps. Results. (i) HCG 7 and HCG 23 are in early stages of interaction; (ii) HCG 2 and HCG 22 are mildly interacting; and (iii) HCG 92, HCG 100 and NGC 92 are in late stages of evolution. We find that all three evolved groups contain populations of young blue objects in the intragroup medium, consistent with ages < 100 Myr, of which several are younger than < 10 Myr. We also report the discovery of a tidal dwarf galaxy candidate in the tail of NGC 92. These three groups, besides containing galaxies that have peculiar velocity fields, also show extended HI tails. Conclusions. Our results indicate that the advanced stage of evolution of a group, together with the presence of intragroup HI clouds, may lead to star formation in the intragroup medium. A table containing all intergalactic HII regions and tidal dwarf galaxies confirmed to date is appended.