42 results for estimation of distribution algorithms
at Universidad Politécnica de Madrid
Abstract:
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics applications. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
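As a concrete illustration of the select-model-sample loop that defines EDAs, the following minimal sketch implements the simplest variant, a univariate marginal distribution algorithm (UMDA), on a binary OneMax problem. The problem and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def umda_onemax(n_vars=50, pop_size=100, n_select=50, generations=60, seed=0):
    """Minimal UMDA: the probabilistic model is a vector of independent
    Bernoulli marginals estimated from the selected (promising) solutions."""
    rng = np.random.default_rng(seed)
    probs = np.full(n_vars, 0.5)                      # initial model: uniform marginals
    for _ in range(generations):
        pop = rng.random((pop_size, n_vars)) < probs  # sample population from model
        fitness = pop.sum(axis=1)                     # OneMax fitness
        best = pop[np.argsort(fitness)[-n_select:]]   # truncation selection
        probs = best.mean(axis=0).clip(0.05, 0.95)    # refit the model
    return probs, fitness.max()

model, best_fitness = umda_onemax()
print(best_fitness)  # should approach n_vars as the marginals converge
```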
Abstract:
One of the most promising areas in which probabilistic graphical models have shown incipient activity is the field of heuristic optimization and, in particular, Estimation of Distribution Algorithms. Owing to the inherent parallelism of these algorithms, different research lines have been studied with the aim of improving Estimation of Distribution Algorithms in terms of execution time and/or accuracy. Among these proposals, we focus on the so-called distributed or island-based models. This approach defines several islands (algorithm instances) that run independently and exchange information with a given frequency. The information sent by the islands can be either a set of individuals or a probabilistic model. This paper presents a comparative study of a distributed univariate Estimation of Distribution Algorithm and a multivariate version, paying special attention to the comparison of two alternative methods for exchanging information, over a wide set of parameters and problems: the standard benchmark developed for the IEEE Workshop on Evolutionary Algorithms and other Metaheuristics for Continuous Optimization Problems of the ISDA 2009 Conference. Several analyses from different points of view have been conducted to study both the influence of the parameters and the relationships between them, including a characterization of the configurations according to their behavior on the proposed benchmark.
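To make the two exchange policies concrete, here is a minimal sketch in which UMDA islands on a ring periodically exchange either their best individual or their probability-vector model. The topology, objective, and all parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def island_umda(n_islands=4, n_vars=40, pop=60, sel=30, gens=50,
                interval=10, exchange="model", seed=1):
    """Ring-topology island UMDA on OneMax; islands exchange either their
    best individuals or their probability vectors (the probabilistic model)."""
    rng = np.random.default_rng(seed)
    models = [np.full(n_vars, 0.5) for _ in range(n_islands)]
    elites = [None] * n_islands
    for g in range(1, gens + 1):
        for i in range(n_islands):
            P = rng.random((pop, n_vars)) < models[i]
            if elites[i] is not None:                 # inject migrant/elite
                P[0] = elites[i]
            fit = P.sum(axis=1)
            best = P[np.argsort(fit)[-sel:]]
            models[i] = best.mean(axis=0).clip(0.05, 0.95)
            elites[i] = P[np.argmax(fit)]
        if g % interval == 0:                         # migration step
            if exchange == "individuals":
                elites = elites[-1:] + elites[:-1]    # pass elites around the ring
            else:                                     # blend the neighbor's model
                models = [0.5 * (models[i] + models[i - 1])
                          for i in range(n_islands)]
    return max(e.sum() for e in elites)

print(island_umda(exchange="model"), island_umda(exchange="individuals"))
```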
Abstract:
This paper proposes a new multi-objective estimation of distribution algorithm (EDA) based on joint modeling of objectives and variables. This EDA uses the multi-dimensional Bayesian network as its probabilistic model. In this way it can capture dependencies between objectives and between variables and objectives, in addition to the dependencies between variables captured in other Bayesian network-based EDAs. This model leads to a problem decomposition that helps the proposed algorithm find better trade-off solutions to the multi-objective problem. In addition to Pareto set approximation, the algorithm is also able to estimate the structure of the multi-objective problem. To apply the algorithm to many-objective problems, it includes four different ranking methods proposed in the literature for this purpose. The algorithm is applied to the set of walking fish group (WFG) problems, and its optimization performance is compared with an evolutionary algorithm and another multi-objective EDA. The experimental results show that the proposed algorithm performs significantly better on many of the problems and for different objective space dimensions, and achieves results comparable to the other algorithms on the remaining problems.
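The ranking step is central to any multi-objective EDA. As background, the sketch below implements plain non-dominated sorting, the baseline on which the four many-objective ranking methods mentioned above build; it is a generic routine, not the paper's specific rankers.

```python
import numpy as np

def nondominated_sort(F):
    """Return the Pareto rank (0 = non-dominated front) of each row of F,
    where F[i, j] is objective j of solution i and all objectives are minimized."""
    n = len(F)
    ranks = np.full(n, -1)
    remaining = np.arange(n)
    rank = 0
    while remaining.size:
        front = []
        for i in remaining:
            dominated = any(
                np.all(F[j] <= F[i]) and np.any(F[j] < F[i])
                for j in remaining if j != i)
            if not dominated:
                front.append(i)
        ranks[front] = rank                      # peel off the current front
        remaining = np.array([i for i in remaining if i not in set(front)])
        rank += 1
    return ranks

F = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0]])
print(nondominated_sort(F))   # first three points form front 0
```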
Abstract:
Finding the degree-constrained minimum spanning tree (DCMST) of a graph is a widely studied NP-hard problem. One of its most important applications is network design. Here we deal with a new variant of the DCMST problem: finding not only the degree- but also the role-constrained minimum spanning tree (DRCMST), i.e., we add constraints that restrict the role of each node in the tree to root, intermediate, or leaf. Furthermore, we do not limit the number of root nodes to one, thereby generally building a forest of DRCMSTs. The modeling of network design problems can benefit from the possibility of generating more than one tree and determining the role of the nodes in the network. We propose a novel permutation-based representation to encode these forests, in which one permutation simultaneously encodes all the trees to be built. We simulate a wide variety of DRCMST problems and optimize them using eight different evolutionary computation algorithms whose individuals are encoded with the proposed representation: estimation of distribution algorithm, generational genetic algorithm, steady-state genetic algorithm, covariance matrix adaptation evolution strategy, differential evolution, elitist evolution strategy, non-elitist evolution strategy and particle swarm optimization. The best results are obtained by the estimation of distribution algorithm and both types of genetic algorithm, although the genetic algorithms are significantly faster.
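The paper's exact encoding is not detailed in this abstract, so the following is a purely hypothetical sketch of how a single permutation might decode into a role- and degree-constrained forest: the first entries seed the trees as roots, and each later node attaches greedily to the cheapest already-placed node with remaining degree capacity. All conventions here (roots-first, greedy attachment) are assumptions for illustration.

```python
import numpy as np

def decode_forest(perm, coords, n_roots=2, max_degree=3):
    """Hypothetical decoder: the first n_roots entries of the permutation seed
    the trees; every later node attaches to the cheapest already-placed node
    that still has degree capacity. Roles follow from the final structure."""
    placed = list(perm[:n_roots])
    parent = {r: None for r in placed}
    degree = {v: 0 for v in perm}
    for v in perm[n_roots:]:
        candidates = [u for u in placed if degree[u] < max_degree]
        u = min(candidates, key=lambda u: np.linalg.norm(coords[v] - coords[u]))
        parent[v] = u
        degree[u] += 1          # edge consumes capacity at both endpoints
        degree[v] += 1
        placed.append(v)
    roles = {v: ("root" if parent[v] is None else
                 "leaf" if degree[v] == 1 else "intermediate") for v in perm}
    return parent, roles

rng = np.random.default_rng(0)
coords = rng.random((8, 2))                      # node positions define edge costs
parent, roles = decode_forest(rng.permutation(8), coords)
print(parent, roles, sep="\n")
```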
Abstract:
Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision-making tasks, especially in uncertain domains. Their capabilities, such as representation, inference and learning, if used effectively, can greatly help to build intelligent systems that are able to act accordingly in different problem domains. Evolutionary computation is one such discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance in solving complex problems. Specifically, we give a survey of probabilistic model-building evolutionary algorithms, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms.
Abstract:
Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs), which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property, which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of the EDA scales logarithmically with the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs.

We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method.

Our study also covers multi-objective optimization, and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired by multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure.

Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
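A minimal sketch of the central idea for the continuous case: fit an ℓ1-regularized Gaussian (graphical lasso) to the selected solutions and sample the next population from it. It uses scikit-learn's GraphicalLasso, a Sphere objective, and illustrative parameters, none of which are prescribed by the thesis.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def regularized_gaussian_eda(n_vars=20, pop=80, sel=40, gens=30, seed=0):
    """EDA whose model is an l1-regularized Gaussian (graphical lasso), which
    keeps the estimated precision matrix sparse when the population is small
    relative to n_vars. Objective: Sphere function (minimize sum of x_i^2)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 5.0, size=(pop, n_vars))       # initial population
    for _ in range(gens):
        f = np.sum(X**2, axis=1)
        best = X[np.argsort(f)[:sel]]                  # truncation selection
        model = GraphicalLasso(alpha=0.1).fit(best)    # sparse Gaussian fit
        X = rng.multivariate_normal(model.location_,
                                    model.covariance_, size=pop)
    return np.sum(X**2, axis=1).min()

print(regularized_gaussian_eda())   # best objective value should shrink toward 0
```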
Abstract:
Through the use of the Distributed Fiber Optic Temperature Measurement (DFOT) method, it is possible to measure temperature at small intervals (on the order of centimeters) over long distances (on the order of kilometers) with high temporal frequency and great accuracy. The heat pulse method consists of applying a known amount of heat to the soil and monitoring the temperature evolution, which primarily depends on the soil moisture content. The combination of both methods, called the active heat pulse method with fiber optic temperature sensing (AHFO), allows accurate soil moisture content measurements. In order to experimentally study the wetting patterns, i.e., the shape, size, and water distribution from a drip irrigation emitter, a soil column 0.5 m in diameter and 0.6 m high was built. Inside the column, a fiber optic cable with a stainless steel sheath was placed forming three concentric helices of diameters 0.2 m, 0.4 m and 0.6 m, leading to a network of 148 measurement points. Before, during, and after the irrigation event, heat pulses were applied by supplying electrical power of 20 W/m to the steel sheath. The soil moisture content was measured with a capacitive sensor in one location at depths of 0.1 m, 0.2 m, 0.3 m and 0.4 m during the irrigation. It was also determined by the gravimetric method at several locations and depths before and right after the irrigation. The dimensions and shape evolution of the emitter bulb were satisfactorily measured during infiltration. Furthermore, some bulb characteristics that are difficult to predict (e.g., preferential flow) were detected. The results indicate that the AHFO is a useful tool to estimate the wetting pattern of drip irrigation emitters in soil columns and show a high potential for its use in the field.
Abstract:
This paper describes new approaches to improve the local and global approximation (matching) and modeling capability of the Takagi–Sugeno (T-S) fuzzy model. The main aim is to obtain high function approximation accuracy and fast convergence. The main problem encountered is that the T-S identification method cannot be applied when the membership functions are overlapped by pairs. This restricts the application of the T-S method, because this type of membership function has been widely used during the last two decades in the stability analysis and controller design of fuzzy systems, and is popular in industrial control applications. The approach developed here can be considered a generalized version of the T-S identification method with optimized performance in approximating nonlinear functions. We propose a noniterative method based on a parameter-weighting approach, and an iterative algorithm that applies the extended Kalman filter, built on the same parameter-weighting idea. We show that the Kalman filter is an effective tool in the identification of the T-S fuzzy model. A fuzzy controller based on a linear quadratic regulator is proposed in order to show the effectiveness of the estimation method developed here in control applications. An illustrative example of an inverted pendulum is chosen to evaluate the robustness and performance of the proposed method, locally and globally, in comparison with the original T-S model. Simulation results indicate the potential, simplicity, and generality of the algorithm, and we prove that these algorithms converge very fast, making them practical to use.
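As background for the identification setting, the sketch below fits a standard two-rule Takagi–Sugeno model by global least squares on membership-weighted regressors. It illustrates the classical scheme that the paper generalizes, not the proposed weighting or Kalman filter algorithms themselves; the memberships, target function, and rule count are illustrative assumptions.

```python
import numpy as np

def ts_identify(x, y, centers=(-1.0, 1.0), width=1.0):
    """Fit a two-rule Takagi-Sugeno model y = sum_i w_i(x) * (a_i x + b_i)
    by global least squares on the membership-weighted regressors."""
    mu = np.exp(-0.5 * ((x[:, None] - np.array(centers)) / width) ** 2)
    w = mu / mu.sum(axis=1, keepdims=True)        # normalized firing strengths
    # Regressor matrix [w1*x, w1, w2*x, w2] -> parameters [a1, b1, a2, b2]
    Phi = np.column_stack([w[:, 0] * x, w[:, 0], w[:, 1] * x, w[:, 1]])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta, w

x = np.linspace(-2, 2, 200)
y = np.tanh(2 * x)                                # nonlinear target function
theta, w = ts_identify(x, y)
y_hat = w[:, 0] * (theta[0] * x + theta[1]) + w[:, 1] * (theta[2] * x + theta[3])
print(np.max(np.abs(y - y_hat)))                  # global approximation error
```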
Abstract:
This paper presents a time-domain stochastic system identification method based on maximum likelihood estimation (MLE) with the expectation maximization (EM) algorithm. The effectiveness of this structural identification method is evaluated through numerical simulation in the context of the ASCE benchmark problem on structural health monitoring. The benchmark structure is a four-story, two-bay by two-bay steel-frame scale model built in the Earthquake Engineering Research Laboratory at the University of British Columbia, Canada. This paper focuses on Phase I of the analytical benchmark studies. A MATLAB-based finite element analysis code obtained from the IASC-ASCE SHM Task Group web site is used to calculate the dynamic response of the prototype structure, and 100 simulations were run with this code to evaluate the proposed identification method. Among the several techniques available for system identification, the stochastic subspace identification (SSI) method, a well-known approach that computes accurate estimates of the modal parameters, is used for comparison. The principles of the SSI method are introduced in the paper, and the proposed MLE with EM algorithm is then explained in detail. The advantages of the proposed structural identification method can be summarized as follows: (i) the method is based on maximum likelihood, which implies minimum-variance estimates; (ii) EM is a computationally simpler estimation procedure than other optimization algorithms; and (iii) it estimates more parameters than SSI, and these estimates are accurate. Conversely, the main disadvantages of the method are: (i) the EM algorithm is an iterative procedure and takes time to reach convergence; and (ii) the method needs starting values for the parameters. Modal parameters (eigenfrequencies, damping ratios and mode shapes) of the benchmark structure have been estimated using both the SSI method and the proposed MLE + EM method. The numerical results show that the proposed method identifies eigenfrequencies, damping ratios and mode shapes reasonably well even in the presence of 10% measurement noise, and these modal parameters are more accurate than those estimated by SSI.
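To show the structure of MLE with EM in a state-space setting, here is a deliberately scalar sketch: EM for the local level model, where the E-step is a Kalman filter plus RTS smoother and the M-step updates the two noise variances in closed form (the classical Shumway-Stoffer scheme). The benchmark application involves a far richer structural model; this toy shows only the skeleton of the iteration.

```python
import numpy as np

def em_local_level(y, q=1.0, r=1.0, m0=0.0, p0=10.0, iters=50):
    """EM for the scalar local level model x_t = x_{t-1}+w, y_t = x_t+v.
    E-step: Kalman filter + RTS smoother; M-step: closed-form updates of
    the process noise variance q and measurement noise variance r."""
    T = len(y)
    for _ in range(iters):
        # --- E-step: forward Kalman filter ---
        xp, Pp = np.empty(T), np.empty(T)     # predicted mean/variance
        xf, Pf = np.empty(T), np.empty(T)     # filtered mean/variance
        K = np.empty(T)
        for t in range(T):
            xp[t] = m0 if t == 0 else xf[t - 1]
            Pp[t] = p0 if t == 0 else Pf[t - 1] + q
            K[t] = Pp[t] / (Pp[t] + r)
            xf[t] = xp[t] + K[t] * (y[t] - xp[t])
            Pf[t] = (1 - K[t]) * Pp[t]
        # --- E-step: backward (RTS) smoother ---
        xs, Ps = xf.copy(), Pf.copy()
        J = np.empty(T - 1)
        for t in range(T - 2, -1, -1):
            J[t] = Pf[t] / Pp[t + 1]
            xs[t] = xf[t] + J[t] * (xs[t + 1] - xf[t])
            Ps[t] = Pf[t] + J[t] ** 2 * (Ps[t + 1] - Pp[t + 1])
        # Lag-one smoothed covariances C[t] = Cov(x_t, x_{t-1} | all data)
        C = np.empty(T)
        C[T - 1] = (1 - K[T - 1]) * Pf[T - 2]
        for t in range(T - 2, 0, -1):
            C[t] = Pf[t] * J[t - 1] + J[t] * (C[t + 1] - Pf[t]) * J[t - 1]
        # --- M-step: closed-form variance updates ---
        dx = xs[1:] - xs[:-1]
        q = np.mean(dx**2 + Ps[1:] + Ps[:-1] - 2 * C[1:])
        r = np.mean((y - xs) ** 2 + Ps)
    return q, r

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(0, np.sqrt(0.3), 500))       # true q = 0.3
y = x + rng.normal(0, np.sqrt(1.5), 500)              # true r = 1.5
print(em_local_level(y))                              # estimates near (0.3, 1.5)
```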
Abstract:
The purpose of this paper is to present a program written in Matlab-Octave for the simulation of the time evolution of student curricula, i.e., how students pass their subjects over time until graduation. From the simulations, the program computes the academic performance rates for the subjects of the study plan for each semester as well as the overall rates, which are (a) the efficiency rate, defined as the ratio of the number of students passing the exam to the number of students who registered for it, and (b) the success rate, defined as the ratio of the number of students passing the exam to the number of students who not only registered for it but actually took it. Additionally, we compute the rates for the bachelor degree established for Spain by the National Quality Evaluation and Accreditation Agency (ANECA), which are the graduation rate (measured as the percentage of students who finish as scheduled in the plan or with one extra year) and the efficiency rate (measured as the percentage of credits that a student who graduated has actually taken). The simulation is done in terms of the probabilities of passing all the subjects in the study plan. The application of the simulator to Polytech students in Madrid, where requirements for passing are especially strict in first- and second-year subjects, is particularly relevant for analyzing student cohorts and the probabilities of students finishing in the minimum of four years, taking an extra year or two extra years, and so forth. It is a very useful tool when designing new study plans. The probability distribution of the random variable "number of semesters a student takes to complete the curriculum and graduate" is difficult or even unfeasible to obtain analytically, and this is even truer when we incorporate uncertainty in parameter estimation. This is why we apply Monte Carlo simulation, which not only illustrates the stochastic process but also provides a method for computation. The stochastic simulator is proving to be a useful tool for identifying the subjects most critical in the distribution of the number of semesters needed to complete the curriculum, and subsequently for decision making in terms of curriculum planning and passing standards at the University. Simulations are performed through a graphical interface, where the results are also presented in appropriate figures. The project has been funded by the Call for Innovation in Education Projects of Universidad Politécnica de Madrid (UPM) through a project of its school Escuela Técnica Superior de Ingenieros Industriales (ETSII) during the period September 2010-September 2011.
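A minimal sketch of the Monte Carlo idea, with made-up pass probabilities and none of the simulator's registration rules or semester scheduling: simulate many students, each retrying pending subjects until all are passed, and tabulate the empirical distribution of semesters to graduation.

```python
import numpy as np

def semesters_to_graduate(pass_prob, rng, max_semesters=30):
    """Simulate one student: each semester, every still-pending subject is
    attempted and passed independently with its own probability."""
    pending = np.ones(len(pass_prob), dtype=bool)
    for s in range(1, max_semesters + 1):
        pending &= rng.random(len(pass_prob)) >= pass_prob  # fail -> stay pending
        if not pending.any():
            return s
    return max_semesters

rng = np.random.default_rng(0)
pass_prob = np.array([0.9, 0.8, 0.6, 0.5, 0.85, 0.7])   # illustrative values
samples = [semesters_to_graduate(pass_prob, rng) for _ in range(10_000)]
values, counts = np.unique(samples, return_counts=True)
for v, c in zip(values, counts):                        # empirical distribution
    print(f"{v} semesters: {c / len(samples):.3f}")
```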
Abstract:
The problem of channel estimation for multicarrier communications is addressed. We focus on systems employing the even Discrete Cosine Transform Type-I (DCT1) at both the transmitter and the receiver, and present an algorithm that achieves an accurate estimation of symmetric channel filters using only a small number of training symbols. The solution is obtained by using either matrix inversion or compressed sensing algorithms. We provide the theoretical results which guarantee the validity of the proposed technique for the DCT1. Numerical simulations illustrate the good behaviour of the proposed algorithm.
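The paper's algorithm works in the DCT1 domain; as a generic illustration of why channel symmetry helps, the sketch below estimates a symmetric FIR channel by plain least squares, letting the taps h[k] and h[-k] share one unknown so the training requirement is roughly halved. This is a time-domain stand-in, not the paper's method.

```python
import numpy as np

def estimate_symmetric_channel(x, y, L):
    """Least-squares estimate of a symmetric FIR channel h[-L..L] with
    h[k] = h[-k], from training input x and received y = conv(x, h)."""
    N = len(y)
    X = np.zeros((N, L + 1))
    xp = np.pad(x, L)                       # zero-pad so indexing stays in range
    for n in range(N):
        X[n, 0] = xp[n + L]                 # tap h[0]
        for k in range(1, L + 1):           # taps h[k] and h[-k] share a column
            X[n, k] = xp[n + L - k] + xp[n + L + k]
    half, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.concatenate([half[:0:-1], half])   # rebuild the full symmetric h

rng = np.random.default_rng(0)
h = np.array([0.2, 0.5, 1.0, 0.5, 0.2])          # symmetric channel, L = 2
x = rng.choice([-1.0, 1.0], size=32)             # short training sequence
y = np.convolve(x, h, mode="same")
print(estimate_symmetric_channel(x, y, L=2).round(3))   # recovers h
```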
Abstract:
The optimum quality that can be asymptotically achieved in the estimation of a probability p using inverse binomial sampling is addressed. A general definition of quality is used in terms of the risk associated with a loss function that satisfies certain assumptions. It is shown that the limit superior of the risk for p asymptotically small has a minimum over all (possibly randomized) estimators. This minimum is achieved by certain non-randomized estimators. The model includes commonly used quality criteria as particular cases. Applications to the non-asymptotic regime are discussed considering specific loss functions, for which minimax estimators are derived.
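A concrete instance of the setting: draw Bernoulli(p) trials until r successes occur, so the trial count n is the observation. The sketch compares the naive estimator r/n with the classical unbiased (Haldane) estimator (r-1)/(n-1); the general loss functions and minimax estimators of the paper are not reproduced here.

```python
import numpy as np

# Inverse binomial (negative binomial) sampling: observe Bernoulli(p) trials
# until r successes occur; the total trial count n is the random observation.
rng = np.random.default_rng(0)
p, r = 0.02, 10                                    # small p: the asymptotic regime
ns = r + rng.negative_binomial(r, p, size=20_000)  # n = r successes + failures
naive = r / ns                                     # biased estimator of p
haldane = (r - 1) / (ns - 1)                       # classical unbiased estimator
print(f"naive mean:   {naive.mean():.5f}   (true p = {p})")
print(f"haldane mean: {haldane.mean():.5f}")
```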
Abstract:
Triple-Play (3P) and Quadruple-Play (4P) services are being widely offered by telecommunication service providers. Such services must be able to offer quality levels equal to or higher than those obtained with traditional systems, especially for the most demanding services such as broadcast IPTV. This paper presents a matrix-based model, defined in terms of service components, user perceptions, agent capabilities, performance indicators and evaluation functions, which makes it possible to estimate the overall quality of a set of convergent services, as perceived by the users, from a set of performance and/or Quality of Service (QoS) parameters of the convergent IP transport network.
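The abstract names the model's ingredients without giving formulas, so the following is a purely hypothetical rendering of a matrix-based quality estimate: per-component indicator values pass through simple evaluation functions and are aggregated with perception weights. All numbers, thresholds, and weights are invented for illustration.

```python
import numpy as np

# Hypothetical rendering of a matrix-based quality model: rows are service
# components (video, audio, data), columns are performance indicators
# (loss %, delay ms, jitter ms). Evaluation functions map raw indicator
# values to [0, 1] quality scores; perception weights aggregate them.
indicators = np.array([[0.02,  80.0,  5.0],    # video
                       [0.10,  60.0,  8.0],    # audio
                       [0.50, 120.0, 20.0]])   # data
thresholds = np.array([1.0, 150.0, 30.0])      # worst acceptable value per column
scores = np.clip(1.0 - indicators / thresholds, 0.0, 1.0)  # evaluation functions
perception_w = np.array([0.5, 0.3, 0.2])       # per-indicator perceptual weight
component_q = scores @ perception_w            # quality per service component
service_w = np.array([0.5, 0.3, 0.2])          # weight of each component
print(component_q, float(service_w @ component_q))  # overall perceived quality
```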
Abstract:
The aim of this paper is to accurately estimate the local truncation error of partial differential equations that are numerically solved using a finite difference or finite volume approach on structured and unstructured meshes. In this work, we approximate the local truncation error using the τ-estimation procedure, which compares the residuals on a sequence of grids with different spacing. First, we focus the analysis on one-dimensional scalar linear and non-linear test cases to examine the accuracy of the estimation of the truncation error for both finite difference and finite volume approaches on different grid topologies. Then, we extend the analysis to two-dimensional problems: first on linear and non-linear scalar equations and finally on the Euler equations. We demonstrate that this approach yields a highly accurate estimation of the truncation error if some conditions are fulfilled. These conditions are related to the accuracy of the restriction operators, the choice of the boundary conditions, the distortion of the grids and the magnitude of the iteration error.
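A minimal sketch of τ-estimation on a 1D Poisson problem: solve the fine-grid system exactly, restrict the solution to the coarse grid by injection, apply the coarse operator, and compare with the coarse right-hand side. For a second-order scheme this estimates the coarse-grid truncation error up to a known factor (about 3/4), since the fine grid carries its own error; the problem and grids are illustrative choices.

```python
import numpy as np

def poisson_matrix(n, h):
    """Standard second-order operator A u = -u'' on n interior points."""
    return (np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1)
            + np.diag(np.full(n - 1, -1.0), -1)) / h**2

def f(x):
    return np.pi**2 * np.sin(np.pi * x)          # so that u_exact = sin(pi x)

N = 64                                           # fine-grid intervals (coarse: N // 2)
h, H = 1.0 / N, 2.0 / N
xf = np.linspace(0, 1, N + 1)[1:-1]              # fine interior nodes
xc = np.linspace(0, 1, N // 2 + 1)[1:-1]         # coarse interior nodes

u_h = np.linalg.solve(poisson_matrix(N - 1, h), f(xf))  # exact discrete solution
u_restricted = u_h[1::2]                         # injection onto the coarse grid
tau_est = poisson_matrix(N // 2 - 1, H) @ u_restricted - f(xc)
tau_exact = poisson_matrix(N // 2 - 1, H) @ np.sin(np.pi * xc) - f(xc)
print(np.abs(tau_est).max(), np.abs(tau_exact).max())   # similar magnitudes
```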