10 results for random search algorithms

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

90.00%

Publisher:

Abstract:

Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a substantial training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification. Results show that classification accuracy still requires improvement, but that models generated from one domain can be effectively reused in a different one.
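
The nearest centroid idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's actual method: the bag-of-words dictionaries, the cosine similarity, the stopping rule, and every function name here (`cosine`, `centroid`, `classify`, `adapt_centroids`) are assumptions made for illustration. Category profiles are built from labeled source-domain documents, then repeatedly rebuilt from the target-domain documents assigned to each category.

```python
import math
from collections import Counter, defaultdict

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Average of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds the weights term by term
    return {t: w / len(vectors) for t, w in total.items()}

def classify(doc, centroids):
    """Label of the most similar category profile."""
    return max(centroids, key=lambda c: cosine(doc, centroids[c]))

def adapt_centroids(source_docs, source_labels, target_docs, rounds=5):
    """Build profiles on the source domain, then adapt them to the target domain."""
    by_label = defaultdict(list)
    for doc, label in zip(source_docs, source_labels):
        by_label[label].append(doc)
    centroids = {c: centroid(docs) for c, docs in by_label.items()}
    for _ in range(rounds):
        assigned = defaultdict(list)
        for doc in target_docs:
            assigned[classify(doc, centroids)].append(doc)
        # Keep the old profile for a category that received no target documents.
        updated = {c: centroid(assigned[c]) if assigned[c] else centroids[c]
                   for c in centroids}
        if updated == centroids:  # assignments are stable, stop early
            break
        centroids = updated
    return centroids

# Hypothetical toy data: the target domain introduces words unseen in the source.
source_docs = [{"match": 1, "goal": 1}, {"goal": 1, "team": 1},
               {"cpu": 1, "code": 1}, {"code": 1, "bug": 1}]
source_labels = ["sports", "sports", "tech", "tech"]
target_docs = [{"goal": 1, "league": 1}, {"team": 1, "league": 1, "striker": 1},
               {"code": 1, "compiler": 1}, {"bug": 1, "compiler": 1, "linker": 1}]
adapted = adapt_centroids(source_docs, source_labels, target_docs)
```

In this toy setting the adapted profiles contain target-only terms such as "league" and "linker", so a document made only of those terms can still be assigned to a category.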

Relevance:

80.00%

Publisher:

Abstract:

While imperfect information games are an excellent model of real-world problems and tasks, they are often difficult for computer programs to play at a high level of proficiency, especially if they involve major uncertainty and a very large state space. Kriegspiel, a variant of chess that makes it similar to a wargame, is a perfect example: while the game was studied for decades from a game-theoretical viewpoint, it was only very recently that the first practical algorithms for playing it began to appear. This thesis presents, documents and tests a multi-sided effort towards making a strong Kriegspiel player, using heuristic search, retrograde analysis and Monte Carlo tree search algorithms to achieve increasingly higher levels of play. The resulting program is currently the strongest computer player in the world and plays at an above-average human level.

Relevance:

40.00%

Publisher:

Abstract:

The inherent stochastic character of most of the physical quantities involved in engineering models has led to an ever-increasing interest in probabilistic analysis. Many approaches to stochastic analysis have been proposed. However, it is widely acknowledged that the only universal method available to solve accurately any kind of stochastic mechanics problem is Monte Carlo Simulation. One of the key parts in the implementation of this technique is the accurate and efficient generation of samples of the random processes and fields involved in the problem at hand. In the present thesis an original method for the simulation of homogeneous, multi-dimensional, multi-variate, non-Gaussian random fields is proposed. The algorithm has proved to be very accurate in matching both the target spectrum and the marginal probability. The computational efficiency and robustness are very good too, even when dealing with strongly non-Gaussian distributions. What is more, the resulting samples possess all the relevant, well-defined and desired properties of "translation fields", including crossing rates and distributions of extremes. The topic of the second part of the thesis lies in the field of non-destructive parametric structural identification. Its objective is to evaluate the mechanical characteristics of constituent bars in existing truss structures, using static loads and strain measurements. In the cases of missing data and of damage that affects only a small portion of the bar, Genetic Algorithms have proved to be an effective tool to solve the problem.

Relevance:

40.00%

Publisher:

Abstract:

Combinatorial Optimization is a branch of optimization that deals with problems where the set of feasible solutions is discrete. The routing problem is a well-studied branch of Combinatorial Optimization that concerns the process of deciding the best way of visiting the nodes (customers) in a network. Routing problems appear in many real-world applications, including transportation, telephone and electronic data networks. Over the years, many solution procedures have been introduced for different routing problems. Some of them are based on exact approaches that solve the problems to optimality, while others are based on heuristic or metaheuristic search to find optimal or near-optimal solutions. There is also a less studied method, which combines both heuristic and exact approaches to tackle different problems, including those in the Combinatorial Optimization area. The aim of this dissertation is to develop solution procedures based on the combination of heuristic and Integer Linear Programming (ILP) techniques for some important problems in Routing Optimization. In this approach, given an initial feasible solution to be possibly improved, the method follows a destruct-and-repair paradigm, where the given solution is randomly destroyed (i.e., customers are removed in a random way) and repaired by solving an ILP model, in an attempt to find a new improved solution.
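
The destruct-and-repair loop described in the abstract above might be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's procedure: it handles a single closed tour over points in the plane, and the repair step uses greedy cheapest insertion as a stand-in for the ILP model. All names (`tour_length`, `destroy`, `greedy_repair`, `destroy_and_repair`) and parameter values are hypothetical.

```python
import math
import random

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def destroy(tour, k, rng):
    """Remove k randomly chosen customers; tour[0] (the depot) is never removed."""
    removed = rng.sample(tour[1:], k)
    kept = [n for n in tour if n not in removed]
    return kept, removed

def greedy_repair(partial, removed, pts):
    """Stand-in for the ILP repair: insert each customer at its cheapest position."""
    tour = list(partial)
    for node in removed:
        best_pos = min(range(1, len(tour) + 1),
                       key=lambda p: tour_length(tour[:p] + [node] + tour[p:], pts))
        tour.insert(best_pos, node)
    return tour

def destroy_and_repair(tour, pts, iterations=200, k=3, seed=0):
    """Repeat destroy and repair, keeping a candidate only if it is shorter."""
    rng = random.Random(seed)
    best = list(tour)
    best_len = tour_length(best, pts)
    for _ in range(iterations):
        partial, removed = destroy(best, k, rng)
        candidate = greedy_repair(partial, removed, pts)
        cand_len = tour_length(candidate, pts)
        if cand_len < best_len:
            best, best_len = candidate, cand_len
    return best

# Hypothetical instance: 10 random points, node 0 acts as the depot.
point_rng = random.Random(1)
pts = [(point_rng.random(), point_rng.random()) for _ in range(10)]
start = list(range(10))
improved = destroy_and_repair(start, pts)
```

Because a candidate is accepted only when it is strictly shorter, the returned tour is never longer than the initial one. Replacing `greedy_repair` with a call to an ILP solver would bring the sketch closer to the approach the abstract describes.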

Relevance:

40.00%

Publisher:

Abstract:

In this thesis we made the first steps towards the systematic application of a methodology for automatically building formal models of complex biological systems. Such a methodology could also be useful for designing artificial systems possessing desirable properties such as robustness and evolvability. The approach we follow in this thesis is to manipulate formal models by means of adaptive search methods called metaheuristics. In the first part of the thesis we develop state-of-the-art hybrid metaheuristic algorithms to tackle two important problems in genomics, namely, the Haplotype Inference by parsimony and the Founder Sequence Reconstruction Problem. We compare our algorithms with other effective techniques in the literature, we show the strengths and limitations of our approaches on various problem formulations and, finally, we propose further enhancements that could possibly improve the performance of our algorithms and widen their applicability. In the second part, we concentrate on Boolean network (BN) models of gene regulatory networks (GRNs). We detail our automatic design methodology and apply it to four use cases which correspond to different design criteria and address some limitations of GRN modeling by BNs. Finally, we tackle the Density Classification Problem with the aim of showing the learning capabilities of BNs. Experimental evaluation of this methodology shows its efficacy in producing networks that meet our design criteria. Our results, consistent with what has been found in other works, also suggest that networks manipulated by a search process exhibit a mixture of characteristics typical of different dynamical regimes.

Relevance:

30.00%

Publisher:

Abstract:

Large-scale wireless ad hoc networks of computers, sensors, PDAs etc. (i.e. nodes) are revolutionizing connectivity and leading to a paradigm shift from centralized systems to highly distributed and dynamic environments. An example of ad hoc networks is sensor networks, which are usually composed of small units able to sense and transmit to a sink elementary data which are subsequently processed by an external machine. Recent improvements in the memory and computational power of sensors, together with the reduction of energy consumption, are rapidly changing the potential of such systems, moving the attention towards data-centric sensor networks. A plethora of routing and data management algorithms have been proposed for network path discovery, ranging from broadcasting/flooding-based approaches to those using global positioning systems (GPS). We studied WGrid, a novel decentralized infrastructure that organizes wireless devices in an ad hoc manner, where each node has one or more virtual coordinates through which both message routing and data management occur without reliance on either flooding/broadcasting operations or GPS. The resulting ad hoc network does not suffer from the dead-end problem, which happens in geographic-based routing when a node is unable to locate a neighbor closer to the destination than itself. WGrid allows multidimensional data management, since nodes' virtual coordinates can act as a distributed database without needing either special implementation or reorganization. Any kind of data (both single and multidimensional) can be distributed, stored and managed. We will show how a location service can be easily implemented so that any search is reduced to a simple query, as for any other data type. WGrid has then been extended by adopting a replication methodology. We called the resulting algorithm WRGrid. Just like WGrid, WRGrid acts as a distributed database without needing either special implementation or reorganization, and any kind of data can be distributed, stored and managed. We have evaluated the benefits of replication on data management, finding from experimental results that it can halve the average number of hops in the network. The direct consequences of this fact are a significant improvement in energy consumption and a balancing of the workload among sensors (number of messages routed by each node). Finally, thanks to the replicas, whose number can be arbitrarily chosen, the resulting sensor network can face sensor disconnections/connections, due to failures of sensors, without data loss. Another extension to WGrid is W*Grid, which strongly improves network recovery performance from link and/or device failures that may happen due to crashes or battery exhaustion of devices or to temporary obstacles. W*Grid guarantees, by construction, at least two disjoint paths between each pair of nodes. This implies that recovery in W*Grid occurs without broadcasting transmissions, guaranteeing robustness while drastically reducing the energy consumption. An extensive number of simulations shows the efficiency, robustness and traffic load of the resulting networks under several scenarios of device density and number of coordinates. Performance analyses have been compared with existing algorithms in order to validate the results.

Relevance:

30.00%

Publisher:

Abstract:

The Capacitated Location-Routing Problem (CLRP) is an NP-hard problem since it generalizes two well-known NP-hard problems: the Capacitated Facility Location Problem (CFLP) and the Capacitated Vehicle Routing Problem (CVRP). The Multi-Depot Vehicle Routing Problem (MDVRP) is known to be NP-hard since it is a generalization of the well-known Vehicle Routing Problem (VRP), which arises with one depot. This thesis addresses heuristic algorithms based on the well-known granular search idea introduced by Toth and Vigo (2003) to solve the CLRP and the MDVRP. Extensive computational experiments on benchmark instances for both problems have been performed to determine the effectiveness of the proposed algorithms. This work is organized as follows: Chapter 1 gives a detailed overview and a methodological review of the literature for the Capacitated Location-Routing Problem (CLRP) and the Multi-Depot Vehicle Routing Problem (MDVRP). Chapter 2 describes a two-phase hybrid heuristic algorithm to solve the CLRP. Chapter 3 shows a computational comparison of heuristic algorithms for the CLRP. Chapter 4 presents a hybrid granular tabu search approach for solving the MDVRP.

Relevance:

30.00%

Publisher:

Abstract:

In the field of vibration qualification testing, with the popular Random Control mode of shakers, the specimen is excited by random vibrations typically set in the form of a Power Spectral Density (PSD). The corresponding signals are stationary and Gaussian, i.e. featuring a normal distribution. Conversely, real-life excitations are frequently non-Gaussian, exhibiting high peaks and/or burst signals and/or deterministic harmonic components. The so-called kurtosis is a parameter often used to statistically describe the occurrence and significance of high peak values in a random process. Since the similarity between test input profiles and real-life excitations is fundamental for qualification test reliability, some methods of kurtosis control can be implemented to synthesize realistic (non-Gaussian) input signals. Durability tests are performed to check the resistance of a component to vibration-based fatigue damage. A procedure to synthesize test excitations which starts from measured data and preserves both the damage potential and the characteristics of the reference signals is desirable. The Fatigue Damage Spectrum (FDS) is generally used to quantify the fatigue damage potential associated with the excitation. The signal synthesized for accelerated durability tests (i.e. with a limited duration) must feature the same FDS as the reference vibration computed for the component's expected lifetime. Current standard procedures are efficient in synthesizing signals in the form of a PSD, but prove inaccurate if reference data are non-Gaussian. This work presents novel algorithms for the synthesis of accelerated durability test profiles with prescribed FDS and a non-Gaussian distribution. An experimental campaign is conducted to validate the algorithms, by testing their accuracy, robustness, and practical effectiveness. Moreover, an original procedure is proposed for the estimation of the fatigue damage potential, aiming to minimize the computational time. The research is thus expected to improve both the effectiveness and the efficiency of excitation profile synthesis for accelerated durability tests.
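
As a small illustration of the kurtosis mentioned in the abstract above, the sketch below computes the standard (non-excess) sample kurtosis, the fourth central moment divided by the squared variance, which is close to 3 for Gaussian data. The two test signals are assumptions made for illustration only: the "bursty" one simply amplifies every 50th Gaussian sample and is not a model of any measured excitation.

```python
import random

def kurtosis(samples):
    """Sample kurtosis m4 / m2**2 (about 3 for a Gaussian signal)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / m2 ** 2

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
# Hypothetical non-Gaussian signal: every 50th sample is amplified 5x,
# mimicking occasional high peaks.
bursty = [x * 5.0 if i % 50 == 0 else x for i, x in enumerate(gaussian)]

k_gaussian = kurtosis(gaussian)
k_bursty = kurtosis(bursty)
```

The occasional high peaks raise the kurtosis of the second signal well above the Gaussian value, which is the property the abstract says kurtosis is used to describe.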

Relevance:

30.00%

Publisher:

Abstract:

The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for split-merge reversible jump moves typically associated with poor performance, we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes, and we study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children. Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit circle.

Relevance:

30.00%

Publisher:

Abstract:

Latency can be defined as the sum of the arrival times at the customers. Minimum latency problems are especially relevant in applications related to humanitarian logistics. This thesis presents algorithms for solving a family of vehicle routing problems with minimum latency. First, the latency location routing problem (LLRP) is considered. It consists of determining the subset of depots to be opened, and the routes that a set of homogeneous capacitated vehicles must perform in order to visit a set of customers such that the sum of the demands of the customers assigned to each vehicle does not exceed the capacity of the vehicle. For solving this problem, three metaheuristic algorithms combining simulated annealing and variable neighborhood descent, and an iterated local search (ILS) algorithm, are proposed. Furthermore, the multi-depot cumulative capacitated vehicle routing problem (MDCCVRP) and the multi-depot k-traveling repairman problem (MDk-TRP) are solved with the proposed ILS algorithm. The MDCCVRP is a special case of the LLRP in which all the depots can be opened, and the MDk-TRP is a special case of the MDCCVRP in which the capacity constraints are relaxed. Finally, an LLRP with stochastic travel times is studied. A two-stage stochastic programming model and a variable neighborhood search algorithm are proposed for solving the problem. Furthermore, a sampling method is developed for tackling instances with an infinite number of scenarios. Extensive computational experiments show that the proposed methods are effective for solving the problems under study.
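
The latency objective defined at the start of the abstract above, the sum of the arrival times at the customers, can be made concrete with a short sketch for a single route. The function name `route_latency`, the dictionary-based travel-time table, and the three-customer instance are hypothetical; the depot is assumed to be the first node and is not counted as a customer.

```python
from itertools import accumulate

def route_latency(route, travel_time):
    """Sum of arrival times at the customers on one route.

    route: node ids in visiting order, starting at the depot.
    travel_time: dict mapping (from_node, to_node) to a travel time.
    """
    legs = [travel_time[(a, b)] for a, b in zip(route, route[1:])]
    arrivals = list(accumulate(legs))  # arrival time at each customer
    return sum(arrivals)

# Hypothetical instance: arrivals are 2, 5 and 6, so the latency is 13.
travel_time = {("depot", "a"): 2, ("a", "b"): 3, ("b", "c"): 1}
latency = route_latency(["depot", "a", "b", "c"], travel_time)
```

Unlike total travel time, this objective weights early legs more heavily, since the time of the first leg is counted in the arrival time of every later customer.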