46 results for Alonso Bermejo, Antonio

in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco


Relevance: 80.00%

Abstract:

In this paper we empirically investigate which structural characteristics can help predict the complexity of NK-landscape instances for estimation of distribution algorithms. To this end, we evolve instances that maximize the complexity of the estimation of distribution algorithm, measured by its success rate. Similarly, we evolve instances that minimize the algorithm's complexity. We then identify network measures, computed from the structures of the NK-landscape instances, that differ with statistical significance between the sets of easy and hard instances. The features identified are consistently significant for different values of N and K.
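As a rough illustration of what an NK-landscape instance is, the sketch below builds a random instance and evaluates a bit string on it. It assumes the common textbook formulation (each of the N variables has a subfunction over itself and K other variables, and fitness is the mean of the subfunction values); the function names are hypothetical, and the instance generator and network measures used in the paper are not described in the abstract.

```python
import random

def make_nk_instance(n, k, seed=0):
    """Build a random NK-landscape: neighbour lists plus lookup tables.

    Assumption: each variable i interacts with itself and k other
    variables chosen uniformly at random (the "random neighbourhood" model).
    """
    rng = random.Random(seed)
    neighbours = []
    tables = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbours.append([i] + rng.sample(others, k))
        # One random value in [0, 1) per assignment of the k + 1 bits.
        tables.append([rng.random() for _ in range(2 ** (k + 1))])
    return neighbours, tables

def nk_fitness(x, neighbours, tables):
    """Mean of the subfunction values selected by the bit string x."""
    total = 0.0
    for nb, table in zip(neighbours, tables):
        index = 0
        for j in nb:
            index = (index << 1) | x[j]  # pack the relevant bits
        total += table[index]
    return total / len(x)
```

The neighbour lists returned here define the interaction graph of the instance, which is the kind of structure from which network measures could be computed.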

Relevance: 80.00%

Abstract:

Providing online travel time information to commuters has become an important issue for Advanced Traveler Information Systems and Route Guidance Systems in recent years, due to the increasing traffic volume and congestion in road networks. Travel time is one of the most useful traffic variables because it is more intuitive than other traffic variables such as flow, occupancy or density, and is useful for travelers in decision making. The aim of this paper is to present a global view of the literature on the modeling of travel time, introducing crucial concepts and giving a thorough classification of the existing techniques. Most of the attention will focus on travel time estimation and travel time prediction, which are generally not presented together. The main goals of these models, the study areas and the methodologies used to carry out these tasks will be further explored and categorized.

Relevance: 80.00%

Abstract:

Thesis defended as part of the Master's programme "Ingeniería Computacional y Sistemas Inteligentes"

Relevance: 80.00%

Abstract:

Methods for generating a new population are a fundamental component of estimation of distribution algorithms (EDAs). They serve to transfer the information contained in the probabilistic model to the newly generated population. In EDAs based on Markov networks, methods for generating new populations usually discard information contained in the model to gain efficiency. Other methods, like Gibbs sampling, use information about all interactions in the model but are computationally very costly. In this paper we propose new methods for generating new solutions in EDAs based on Markov networks. We introduce approaches based on inference methods for computing the most probable configurations and model-based template recombination. We show that the application of different variants of inference methods can increase the EDAs’ convergence rate and reduce the number of function evaluations needed to find the optimum of binary and non-binary discrete functions.

Relevance: 80.00%

Abstract:

The Linear Ordering Problem (LOP) is a popular combinatorial optimisation problem which has been extensively addressed in the literature. However, in spite of its popularity, little is known about the characteristics of this problem. This paper studies a procedure to extract static information from an instance of the problem, and proposes a method to incorporate the obtained knowledge in order to improve the performance of local search-based algorithms. The procedure introduced identifies the positions where the indexes cannot generate local optima for the insert neighbourhood, and thus cannot yield globally optimal solutions. This information is then used to propose a restricted insert neighbourhood that discards the insert operations which move indexes to positions where optimal solutions are not generated. In order to measure the efficiency of the proposed restricted insert neighbourhood system, two state-of-the-art algorithms for the LOP that include local search procedures have been modified. The experiments conducted confirm that the restricted versions of the algorithms systematically outperform the classical designs. The statistical test included in the experimentation reports significant differences in all cases, which validates the efficiency of our proposal.
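For orientation, the following minimal sketch shows the standard LOP objective and the plain (unrestricted) insert neighbourhood. The function names and the example matrix are hypothetical; the rule that the paper uses to discard insert operations is not given in the abstract and is not reproduced here.

```python
def lop_value(matrix, perm):
    """Sum of the entries above the diagonal after reordering rows and
    columns of `matrix` according to `perm` (the usual LOP objective)."""
    n = len(perm)
    return sum(matrix[perm[i]][perm[j]]
               for i in range(n) for j in range(i + 1, n))

def insert_move(perm, src, dst):
    """Remove the index at position `src` and reinsert it at position `dst`."""
    result = list(perm)
    item = result.pop(src)
    result.insert(dst, item)
    return result

def insert_neighbourhood(perm):
    """Yield every permutation reachable with one insert operation.

    Adjacent moves (src, src + 1) and (src + 1, src) give the same
    permutation, so a few neighbours are yielded twice in this sketch.
    """
    n = len(perm)
    for src in range(n):
        for dst in range(n):
            if src != dst:
                yield insert_move(perm, src, dst)

# Hypothetical 3 x 3 instance used only for illustration.
example_matrix = [[0, 5, 1],
                  [2, 0, 3],
                  [4, 6, 0]]
best_neighbour = max(insert_neighbourhood([0, 1, 2]),
                     key=lambda p: lop_value(example_matrix, p))
```

A restricted neighbourhood in the spirit of the abstract would filter the `(src, dst)` pairs before evaluating them, which reduces the number of objective evaluations per local search step.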

Relevance: 80.00%

Abstract:

The Mallows and Generalized Mallows models are compact yet powerful and natural ways of representing a probability distribution over the space of permutations. In this paper we deal with the problems of sampling and learning (estimating) such distributions when the metric on permutations is the Cayley distance. We propose new methods for both operations, whose performance is shown through several experiments. We also introduce novel procedures to count and randomly generate permutations at a given Cayley distance both with and without certain structural restrictions. An application in the field of biology is given to motivate the interest of this model.
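As a point of reference, the Cayley distance between two permutations is the minimum number of (not necessarily adjacent) swaps needed to turn one into the other, which equals n minus the number of cycles of their composition. The sketch below computes it that way; the function name is hypothetical, permutations are assumed to be sequences over 0..n-1, and it does not reproduce the sampling, learning or counting procedures proposed in the paper.

```python
def cayley_distance(sigma, pi):
    """Minimum number of swaps transforming `sigma` into `pi`."""
    n = len(sigma)
    # Position of each item in pi, so sigma can be relabelled relative to pi.
    pos_in_pi = [0] * n
    for idx, item in enumerate(pi):
        pos_in_pi[item] = idx
    composed = [pos_in_pi[item] for item in sigma]

    # Count the cycles of the composed permutation.
    visited = [False] * n
    cycles = 0
    for start in range(n):
        if not visited[start]:
            cycles += 1
            j = start
            while not visited[j]:
                visited[j] = True
                j = composed[j]
    return n - cycles
```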

Relevance: 80.00%

Abstract:

In this paper we deal with distributions over permutation spaces. The Mallows model is the model in use. The associated distance for permutations is the Hamming distance.

Relevance: 80.00%

Abstract:

In this paper we deal with probability distributions over permutation spaces. The probability model in use is the Mallows model. The distance for permutations that the model uses is the Ulam distance.
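For readers unfamiliar with it, the Ulam distance between two permutations of n items is n minus the length of their longest common subsequence. The sketch below computes it by relabelling one permutation relative to the other and finding a longest increasing subsequence; the function name is hypothetical and the code is only an illustration of the distance, not of the methods in the paper.

```python
from bisect import bisect_left

def ulam_distance(sigma, pi):
    """n minus the length of the longest common subsequence of the
    two permutations (given as sequences over the same items)."""
    pos_in_pi = {item: idx for idx, item in enumerate(pi)}
    relabelled = [pos_in_pi[item] for item in sigma]

    # Patience-sorting style longest increasing subsequence, O(n log n).
    tails = []
    for value in relabelled:
        k = bisect_left(tails, value)
        if k == len(tails):
            tails.append(value)
        else:
            tails[k] = value
    return len(sigma) - len(tails)
```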

Relevance: 80.00%

Abstract:

Probability models on permutations associate a probability value with each of the permutations of n items. This paper considers two popular probability models, the Mallows model and the Generalized Mallows model. We describe methods for inference, sampling and learning of such distributions, some of which are novel in the literature. This paper also describes operations for permutations, with special attention to those related to the Kendall and Cayley distances and to the random generation of permutations. These operations are of key importance for the efficient computation of the operations on distributions. These algorithms are implemented in the associated R package, whose internal code is written in C++.
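To make the Mallows model concrete: it assigns each permutation σ a probability proportional to exp(-θ · d(σ, σ0)), where σ0 is a central permutation, θ a spread parameter and d a distance. The sketch below normalises by brute-force enumeration under the Kendall distance, which is only feasible for very small n. The function names are hypothetical and this is not the interface of the R package mentioned in the abstract, which relies on efficient algorithms instead of enumeration.

```python
import math
from itertools import permutations

def kendall_distance(sigma, pi):
    """Number of item pairs that the two orderings rank in opposite order."""
    pos = {item: idx for idx, item in enumerate(pi)}
    seq = [pos[item] for item in sigma]
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if seq[i] > seq[j])

def mallows_pmf(central, theta, distance=kendall_distance):
    """Probability of every permutation under a Mallows model.

    Brute-force normalisation over all n! permutations: illustration only.
    """
    weights = {p: math.exp(-theta * distance(p, central))
               for p in permutations(central)}
    z = sum(weights.values())
    return {p: w / z for p, w in weights.items()}
```

With `theta = 0` the distribution is uniform, and as `theta` grows the probability mass concentrates on the central permutation.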

Relevance: 80.00%

Abstract:

This paper describes Mateda-2.0, a MATLAB package for estimation of distribution algorithms (EDAs). This package can be used to solve single and multi-objective discrete and continuous optimization problems using EDAs based on undirected and directed probabilistic graphical models. The implementation contains several methods commonly employed by EDAs. It is also conceived as an open package to allow users to incorporate different combinations of selection, learning, sampling, and local search procedures. Additionally, it includes methods to extract, process and visualize the structures learned by the probabilistic models. This way, it can unveil previously unknown information about the optimization problem domain. Mateda-2.0 also incorporates a module for creating and validating function models based on the probabilistic models learned by EDAs.

Relevance: 80.00%

Abstract:

Techniques for oncogenomic analyses, such as array comparative genomic hybridization, messenger RNA expression arrays and mutational screens, have come to the fore in modern cancer research. Studies utilizing these techniques are able to highlight panels of genes that are altered in cancer. However, these candidate cancer genes must then be scrutinized to reveal whether they contribute to oncogenesis or are coincidental and non-causative. We present a computational method for the prioritization of candidate (i) proto-oncogenes and (ii) tumour suppressor genes from oncogenomic experiments. We constructed computational classifiers using different combinations of sequence and functional data including sequence conservation, protein domains and interactions, and regulatory data. We found that these classifiers are able to distinguish between known cancer genes and other human genes. Furthermore, the classifiers also discriminate candidate cancer genes from a recent mutational screen from other human genes. We provide a web-based facility through which cancer biologists may access our results and we propose computational cancer gene classification as a useful method of prioritizing candidate cancer genes identified in oncogenomic studies.

Relevance: 80.00%

Abstract:

The learning of probability distributions from data is a ubiquitous problem in the fields of Statistics and Artificial Intelligence. During the last decades several learning algorithms have been proposed to learn probability distributions based on decomposable models due to their advantageous theoretical properties. Some of these algorithms can be used to search for a maximum likelihood decomposable model with a given maximum clique size, k, which controls the complexity of the model. Unfortunately, the problem of learning a maximum likelihood decomposable model given a maximum clique size is NP-hard for k > 2. In this work, we propose a family of algorithms which approximates this problem with a computational complexity of O(k · n^2 log n) in the worst case, where n is the number of random variables involved. The structures of the decomposable models that solve the maximum likelihood problem are called maximal k-order decomposable graphs. Our proposals, called fractal trees, construct a sequence of maximal i-order decomposable graphs, for i = 2, ..., k, in k − 1 steps. At each step, the algorithms follow a divide-and-conquer strategy based on the particular features of this type of structure. Additionally, we propose a prune-and-graft procedure which transforms a maximal k-order decomposable graph into another one, increasing its likelihood. We have implemented two particular fractal tree algorithms called parallel fractal tree and sequential fractal tree. These algorithms can be considered a natural extension of Chow and Liu’s algorithm, from k = 2 to arbitrary values of k. Both algorithms have been compared against other efficient approaches in artificial and real domains, and they have shown competitive behavior on the maximum likelihood problem. Due to their low computational complexity they are especially recommended for high dimensional domains.
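Since the abstract presents fractal trees as an extension of Chow and Liu's algorithm, a minimal sketch of that k = 2 base case may help: estimate the mutual information of every pair of variables and keep a maximum-weight spanning tree. This is the classic Chow-Liu procedure for discrete data, written with hypothetical function names and a simple Kruskal-style union-find; it is not the fractal tree algorithm, whose steps are not detailed in the abstract.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information between columns i and j of `data`."""
    n = len(data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    cij = Counter((row[i], row[j]) for row in data)
    return sum((c / n) * math.log(c * n / (ci[a] * cj[b]))
               for (a, b), c in cij.items())

def chow_liu_tree(data):
    """Edges of a maximum-weight spanning tree over pairwise mutual information."""
    n_vars = len(data[0])
    edges = sorted(((mutual_information(data, i, j), i, j)
                    for i, j in combinations(range(n_vars), 2)),
                   reverse=True)
    parent = list(range(n_vars))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

The resulting tree is the maximum likelihood structure among trees, which corresponds to a maximum clique size of 2 in the terminology of the abstract.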

Relevance: 80.00%

Abstract:

Recently, probability models on rankings have been proposed in the field of estimation of distribution algorithms in order to solve permutation-based combinatorial optimisation problems. Particularly, distance-based ranking models, such as Mallows and Generalized Mallows under the Kendall’s-τ distance, have demonstrated their validity when solving this type of problem. Nevertheless, there are still many trends that deserve further study. In this paper, we extend the use of distance-based ranking models in the framework of EDAs by introducing new distance metrics such as Cayley and Ulam. In order to analyse the performance of the Mallows and Generalized Mallows EDAs under the Kendall, Cayley and Ulam distances, we run them on a benchmark of 120 instances from four well-known permutation problems. The experiments conducted showed that no single metric performs best on all the problems. However, the statistical test pointed out that the Mallows-Ulam EDA is the most stable algorithm among the studied proposals.

Relevance: 80.00%

Abstract:

In recent years, the performance of semi-supervised learning has been theoretically investigated. However, most of this theoretical development has focussed on binary classification problems. In this paper, we take it a step further by extending the work of Castelli and Cover [1] [2] to the multi-class paradigm. Particularly, we consider the key problem in semi-supervised learning of classifying an unseen instance x into one of K different classes, using a training dataset sampled from a mixture density distribution and composed of l labelled records and u unlabelled examples. Even under the assumption of identifiability of the mixture and having infinite unlabelled examples, labelled records are needed to determine the K decision regions. Therefore, in this paper, we first investigate the minimum number of labelled examples needed to accomplish that task. Then, we propose an optimal multi-class learning algorithm which is a generalisation of the optimal procedure proposed in the literature for binary problems. Finally, we make use of this generalisation to study the probability of error when the binary class constraint is relaxed.