7 results for Practical problems

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

70.00%

Abstract:

In this thesis we study three combinatorial optimization problems belonging to the classes of Network Design and Vehicle Routing problems, which are strongly linked in the context of the design and management of transportation networks: the Non-Bifurcated Capacitated Network Design Problem (NBP), the Period Vehicle Routing Problem (PVRP) and the Pickup and Delivery Problem with Time Windows (PDPTW). These problems are NP-hard and contain as special cases some well-known difficult problems, such as the Traveling Salesman Problem and the Steiner Tree Problem. Moreover, they model the core structure of many practical problems arising in logistics and telecommunications.

The NBP is the problem of designing the optimum network to satisfy a given set of traffic demands. Given a set of nodes, a set of potential links and a set of point-to-point demands called commodities, the objective is to select the links to install and dimension their capacities so that all the demands can be routed between their respective endpoints, and the sum of link fixed costs and commodity routing costs is minimized. The problem is called non-bifurcated because the solution network must allow each demand to follow a single path, i.e., the flow of each demand cannot be split. Although this is the case in many real applications, the NBP has received significantly less attention in the literature than other capacitated network design problems that allow bifurcation. We describe an exact algorithm for the NBP, based on solving with an integer programming solver a formulation of the problem strengthened by simple valid inequalities, together with four new heuristic algorithms. One of these heuristics is an adaptive memory metaheuristic, based on partial enumeration, that could be applied to a wider class of structured combinatorial optimization problems.

In the PVRP a fleet of vehicles of identical capacity must be used to service a set of customers over a planning period of several days. Each customer specifies a service frequency, a set of allowable day-combinations and a quantity of product that must be delivered at every visit. For example, a customer may require to be visited twice during a 5-day period, imposing that these visits take place on Monday-Thursday, Monday-Friday or Tuesday-Friday. The problem consists in simultaneously assigning a day-combination to each customer and designing the vehicle routes for each day so that each customer is visited the required number of times, the number of routes on each day does not exceed the number of vehicles available, and the total cost of the routes over the period is minimized. We also consider a tactical variant of this problem, called the Tactical Planning Vehicle Routing Problem, where customers require to be visited on a specific day of the period but a penalty cost, called service cost, can be paid to postpone the visit to a later day. To the best of our knowledge, all the algorithms proposed in the literature for the PVRP are heuristics. In this thesis we present for the first time an exact algorithm for the PVRP, based on different relaxations of a set partitioning-like formulation. The effectiveness of the proposed algorithm is tested on a set of instances from the literature and on a new set of instances.

Finally, the PDPTW consists in servicing a set of transportation requests using a fleet of identical vehicles of limited capacity located at a central depot. Each request specifies a pickup location and a delivery location, and requires that a given quantity of load be transported from the former to the latter. Moreover, each location can be visited only within an associated time window. Each vehicle can perform at most one route, and the problem is to satisfy all the requests using the available vehicles so that each request is serviced by a single vehicle, the load on each vehicle never exceeds its capacity, and all locations are visited within their time windows. We formulate the PDPTW as a set partitioning-like problem with additional cuts, and we propose an exact algorithm based on different relaxations of the mathematical formulation and on a branch-and-cut-and-price algorithm. The new algorithm is tested on two classes of problems from the literature and compared with a recent branch-and-cut-and-price algorithm from the literature.
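To make the NBP structure concrete, here is a minimal sketch of a compact arc-based formulation on a toy instance, written with the PuLP modeling library; the instance data, the uniform routing cost and the library choice are illustrative assumptions, and the valid inequalities the thesis uses to strengthen the formulation are omitted.

```python
# A compact arc-based MILP for a toy NBP instance (illustrative sketch).
import pulp

nodes = ["A", "B", "C", "D"]
# arc: (fixed installation cost, capacity) -- invented numbers
arcs = {("A", "B"): (10, 8), ("B", "D"): (12, 8),
        ("A", "C"): (7, 5), ("C", "D"): (9, 5), ("B", "C"): (4, 6)}
# commodity: (origin, destination, demand) -- invented numbers
commodities = {"k1": ("A", "D", 4), "k2": ("A", "D", 3)}
route_cost = 1  # hypothetical uniform per-arc, per-unit routing cost

prob = pulp.LpProblem("NBP_toy", pulp.LpMinimize)
y = pulp.LpVariable.dicts("install", arcs.keys(), cat="Binary")
x = pulp.LpVariable.dicts("route",
                          [(k, a) for k in commodities for a in arcs],
                          cat="Binary")

# objective: link fixed costs plus commodity routing costs
prob += (pulp.lpSum(arcs[a][0] * y[a] for a in arcs)
         + pulp.lpSum(route_cost * commodities[k][2] * x[(k, a)]
                      for k in commodities for a in arcs))

# flow conservation per commodity and node
for k, (o, t, d) in commodities.items():
    for n in nodes:
        rhs = 1 if n == o else (-1 if n == t else 0)
        prob += (pulp.lpSum(x[(k, a)] for a in arcs if a[0] == n)
                 - pulp.lpSum(x[(k, a)] for a in arcs if a[1] == n)) == rhs

# routed demand cannot exceed the capacity of installed links
for a, (f, u) in arcs.items():
    prob += pulp.lpSum(commodities[k][2] * x[(k, a)]
                       for k in commodities) <= u * y[a]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([a for a in arcs if y[a].value() > 0.5])
```

Since the routing variables are binary, each commodity is forced onto a single path, which is exactly the non-bifurcated requirement.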

Relevance:

60.00%

Abstract:

Water distribution network optimization is a challenging problem due to the dimension and the complexity of these systems, and the field has been investigated by many authors since the second half of the twentieth century. Recently, to overcome the discrete nature of the variables and the nonlinearity of the equations, research has focused on the development of heuristic algorithms. These algorithms do not require continuity and linearity of the problem functions because they are linked to an external hydraulic simulator that solves the mass continuity and energy conservation equations of the network.

In this work, NSGA-II (Non-dominated Sorting Genetic Algorithm II) has been used. This is a heuristic multi-objective genetic algorithm based on the analogy of evolution in nature. Starting from an initial random set of solutions, called a population, it evolves them towards a front of solutions that minimize all the objectives simultaneously. This can be very useful in practical problems where multiple and conflicting goals are common. Usually, one of the main drawbacks of these algorithms is the computational cost: being a stochastic search, many solutions must be analyzed before good ones are found.

The results of this thesis on the classical optimal design problem show that it is possible to improve the outcome by modifying the mathematical definition of the objective functions and the survival criterion, inserting good solutions created by a Cellular Automaton, and using rules created by a classifier algorithm (C4.5). This part has been tested using the version of NSGA-II supplied by the Centre for Water Systems (University of Exeter, UK) in the MATLAB® environment. Even though guiding the search in this way can constrain the algorithm, with the risk of missing the optimal set of solutions, it can greatly improve the results. Subsequently, thanks to the support of CINECA, a version of NSGA-II has been implemented in the C language and parallelized: the results on global parallelization show the achievable speed-up, while the results on island parallelization show that communication among islands can improve the optimization. Finally, some tests on the optimization of pump scheduling have been carried out. In this case, good results are found for a small network, while the solutions of a large problem suffer from the lack of constraints on the number of pump switches. Possible future research concerns the insertion of further constraints and the guidance of the evolution. In the end, the optimization of water distribution systems is still far from a definitive solution, but improvements in this field can be very useful in reducing the cost of solutions to practical problems, where the high number of variables makes manual management very difficult.
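As a concrete illustration of the algorithm's core mechanism, here is a minimal Python sketch of the non-dominated sorting step that gives NSGA-II its name; it is illustrative only (crowding distance, selection and the link to the hydraulic simulator are omitted, and the objective values are made up).

```python
# A minimal sketch of non-dominated sorting, assuming minimization
# of all objectives (not the Exeter MATLAB implementation cited above).
def dominates(a, b):
    """True if a dominates b: no worse in every objective, better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(population):
    """Partition a list of objective vectors into successive Pareto fronts."""
    fronts, remaining = [], list(range(len(population)))
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(population[j], population[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# e.g. two objectives: network cost vs. pressure deficit (invented values)
pop = [(3, 9), (5, 4), (4, 10), (6, 5), (2, 12)]
print(non_dominated_sort(pop))  # [[0, 1, 4], [2, 3]]: front 0, then front 1
```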

Relevance:

60.00%

Abstract:

Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into previously defined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a sizable training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving the tuning of parameters.

A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times.

A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification. Results show that classification accuracy still requires improvement, but models generated from one domain prove to be effectively reusable in a different one.
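As a rough illustration of the first contribution, the following Python sketch mimics nearest-centroid classification with iterative adaptation of category profiles to an unlabeled target domain; the tf-idf representation, cosine similarity, the fixed iteration count and the absence of empty-class handling are assumptions of this sketch, not details taken from the thesis.

```python
# A minimal sketch of cross-domain nearest-centroid classification with
# iterative centroid adaptation (illustrative; empty-class handling and
# convergence checks are omitted for brevity).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def adapt_centroids(source_docs, source_labels, target_docs, n_iter=5):
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(source_docs + target_docs)
    X_s, X_t = X[:len(source_docs)], X[len(source_docs):]
    classes = sorted(set(source_labels))
    # initial category profiles from the labeled (source) domain
    centroids = np.vstack([
        np.asarray(X_s[[i for i, l in enumerate(source_labels)
                        if l == c]].mean(axis=0)) for c in classes])
    for _ in range(n_iter):
        # label target documents by their most similar profile ...
        pred = cosine_similarity(X_t, centroids).argmax(axis=1)
        # ... then rebuild each profile from the documents it attracted
        centroids = np.vstack([np.asarray(X_t[pred == k].mean(axis=0))
                               for k in range(len(classes))])
    return [classes[k] for k in pred]
```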

Relevance:

30.00%

Abstract:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology, which allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes strongly correlated with the groups of individuals. Even though the methods to analyze such data are well developed today and close to reaching a standard organization (through the effort of international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method exists. The contribution of this dissertation to deciphering disease is the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs.

Chapter 1, starting from a necessary biological introduction, reviews the microarray technologies and all the important steps of an experiment, from the production of the array through the quality controls to the preprocessing steps used in the data analysis of the rest of the dissertation. Chapter 2 provides a critical review of standard analysis methods, stressing their main open problems.

Chapter 3 introduces a method to address the issue of unbalanced designs in microarray experiments. The experimental design is a crucial starting point for obtaining reasonable results: in a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists in a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application; after 1,000 reiterations, each probe is given a "score" ranging from 0 to 1,000 based on its recurrence as differentially expressed in the 1,000 lists. The performance of MultiSAM was compared to that of SAM and LIMMA [3] over two simulated data sets generated via beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe and SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score > 300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data.

Chapter 4 describes a method to address similarity evaluation in a three-class problem by means of the Relevance Vector Machine [4]. Looking at microarray data in a prognostic and diagnostic clinical framework, not only differences can play a crucial role: in some cases similarities can give useful, and sometimes even more important, information. Given three classes, the goal could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership represents a key feature for addressing the similarity issue: a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a tumor-grade three-class problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class G2 as a test set, obtaining for each G2 sample the probability of membership in class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to that of grade 1 samples. From the literature this result had been conjectured, but no measure of significance had been given before.
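A minimal sketch of the resampling scheme behind MultiSAM is given below, with a plain two-sample t-test standing in for the SAM statistic; the significance threshold, the random seed and the use of scipy are assumptions of this illustration, not the dissertation's implementation.

```python
# A minimal sketch of the MultiSAM resampling scheme (illustrative;
# a t-test approximates the SAM statistic used in the dissertation).
import numpy as np
from scipy.stats import ttest_ind

def multisam_scores(lpc, mpc, n_iter=1000, alpha=0.01, seed=0):
    """lpc, mpc: (samples x probes) expression matrices.
    Returns, per probe, how often it tests as differentially expressed
    against random MPC subsamples of LPC size."""
    rng = np.random.default_rng(seed)
    scores = np.zeros(lpc.shape[1], dtype=int)
    for _ in range(n_iter):
        # draw a random MPC subsample matching the LPC sample size
        sub = mpc[rng.choice(mpc.shape[0], size=lpc.shape[0], replace=False)]
        _, pvals = ttest_ind(lpc, sub, axis=0)
        scores += (pvals < alpha)  # recurrence in the per-iteration list
    return scores                  # score ranges from 0 to n_iter

# probes whose score exceeds a cutoff (e.g. 300 above) are retained
```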

Relevance:

30.00%

Abstract:

Mixed integer programming is today one of the most widely used techniques for dealing with hard optimization problems. On the one hand, many practical optimization problems arising from real-world applications (such as scheduling, project planning, transportation, telecommunications, economics and finance, and timetabling) can be easily and effectively formulated as Mixed Integer linear Programs (MIPs). On the other hand, more than 50 years of intensive research have dramatically improved the capability of the current generation of MIP solvers to tackle hard problems in practice. However, many questions are still open and not fully understood, and the mixed integer programming community is more than active in trying to answer some of them; as a consequence, a huge number of papers appear continuously and new intriguing questions arise every year.

When dealing with MIPs, we have to distinguish between two different scenarios. The first one occurs when we are asked to handle a general MIP and cannot assume any special structure for the given problem. In this case, a Linear Programming (LP) relaxation and some integrality requirements are all we have for tackling the problem, and we are "forced" to use general purpose techniques. The second one occurs when mixed integer programming is used to address a somehow structured problem. In this context, polyhedral analysis and other theoretical and practical considerations are typically exploited to devise special purpose techniques. This thesis tries to give some insights into both of the above situations.

The first part of the work is focused on general purpose cutting planes, which are probably the key ingredient behind the success of the current generation of MIP solvers. Chapter 1 presents a quick overview of the main ingredients of a branch-and-cut algorithm, while Chapter 2 recalls some results from the literature in the context of disjunctive cuts and their connections with Gomory mixed integer cuts. Chapter 3 presents a theoretical and computational investigation of disjunctive cuts. In particular, we analyze the connections between different normalization conditions (i.e., conditions to truncate the cone associated with disjunctive cutting planes) and other crucial aspects such as cut rank, cut density and cut strength. We give a theoretical characterization of the weak rays of the disjunctive cone that lead to dominated cuts, and propose a practical method to strengthen the cuts arising from such weak extremal solutions. Further, we point out how redundant constraints can affect the quality of the generated disjunctive cuts, and discuss possible ways to cope with them. Finally, Chapter 4 presents some preliminary ideas in the context of multiple-row cuts. Very recently, a series of papers has drawn attention to the possibility of generating cuts using more than one row of the simplex tableau at a time. Several interesting theoretical results have been presented in this direction, often revisiting and recalling important results discovered more than 40 years ago. However, it is not at all clear how these results can be exploited in practice. As stated, the chapter is still a work in progress and simply presents a possible way of generating two-row cuts, based on lattice-free triangles, from the simplex tableau, together with some preliminary computational results.

The second part of the thesis is instead focused on the heuristic and exact exploitation of integer programming techniques for hard combinatorial optimization problems in the context of routing applications. Chapters 5 and 6 present an integer linear programming local search algorithm for Vehicle Routing Problems (VRPs). The overall procedure follows a general destroy-and-repair paradigm (i.e., the current solution is first randomly destroyed and then repaired in an attempt to find a new improved solution) in which a class of exponential neighborhoods is iteratively explored by heuristically solving an integer programming formulation through a general purpose MIP solver. Chapters 7 and 8 deal with exact branch-and-cut methods. Chapter 7 presents an extended formulation for the Traveling Salesman Problem with Time Windows (TSPTW), a generalization of the well-known TSP where each node must be visited within a given time window. The polyhedral approaches proposed for this problem in the literature typically follow the one that has proven extremely effective in the classical TSP context. Here we present a (quite) general idea based on a relaxed discretization of the time windows. This idea leads to a stronger formulation and to stronger valid inequalities, which are then separated within the classical branch-and-cut framework. Finally, Chapter 8 addresses branch-and-cut in the context of Generalized Minimum Spanning Tree Problems (GMSTPs), a class of NP-hard generalizations of the classical minimum spanning tree problem. In this chapter, we show how some basic ideas (and, in particular, the use of general purpose cutting planes) can improve on the branch-and-cut methods proposed in the literature.
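As an illustration of the destroy-and-repair paradigm of Chapters 5 and 6, here is a minimal Python skeleton of the loop; `repair_with_mip` is a hypothetical callback standing in for the heuristic solution of the integer programming neighborhood formulation, and the random destruction rule and parameters are made-up simplifications.

```python
# A minimal skeleton of a destroy-and-repair loop (illustrative; the
# repair callback stands in for heuristically solving an integer
# programming formulation of the reinsertion neighborhood).
import random

def destroy_and_repair(solution, repair_with_mip, cost,
                       n_iter=100, destroy_fraction=0.2, seed=0):
    rng = random.Random(seed)
    best = list(solution)
    for _ in range(n_iter):
        # destroy: randomly remove a fraction of the customers
        removed = set(rng.sample(range(len(best)),
                                 int(destroy_fraction * len(best))))
        partial = [c for i, c in enumerate(best) if i not in removed]
        # repair: reinsert the removed customers by (heuristically)
        # solving a MIP over the induced exponential neighborhood
        candidate = repair_with_mip(partial, [best[i] for i in removed])
        if cost(candidate) < cost(best):  # keep only improvements
            best = candidate
    return best
```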

Relevance:

30.00%

Abstract:

This work presents exact, hybrid algorithms for mixed resource Allocation and Scheduling problems; in general terms, these consist in assigning finite-capacity resources over time to a set of precedence-connected activities. The proposed methods have broad applicability, but are mainly motivated by applications in the field of Embedded System Design. In particular, high-performance embedded computing has recently witnessed the shift from single-CPU platforms with application-specific accelerators to programmable Multi-Processor Systems-on-Chip (MPSoCs). These allow higher flexibility, real-time performance and low energy consumption, but the programmer must be able to effectively exploit the platform parallelism. This raises interest in the development of algorithmic techniques to be embedded in CAD tools; in particular, given a specific application and platform, the objective is to perform an optimal allocation of hardware resources and to compute an execution schedule. In this regard, since embedded systems tend to run the same set of applications for their entire lifetime, off-line, exact optimization approaches are particularly appealing. Quite surprisingly, the use of exact algorithms has not been well investigated so far; this is in part explained by the complexity of integrated allocation and scheduling, which sets tough challenges for "pure" combinatorial methods. The use of hybrid CP/OR approaches presents the opportunity to exploit the mutual advantages of different methods, while compensating for their weaknesses.

In this work, we first consider an Allocation and Scheduling problem over the Cell BE processor by Sony, IBM and Toshiba; we propose three different solution methods, leveraging decomposition, cut generation and heuristically guided search. Next, we face the Allocation and Scheduling of so-called Conditional Task Graphs, explicitly accounting for branches whose outcome is not known at design time; we extend the CP scheduling framework to effectively deal with the introduced stochastic elements. Finally, we address Allocation and Scheduling with uncertain, bounded execution times via conflict-based tree search; we introduce a simple and flexible time model to take duration variability into account and provide an efficient conflict detection method. The proposed approaches achieve good results on practical-size problems, thus demonstrating that the use of exact approaches for system design is feasible. Furthermore, the developed techniques bring significant contributions to combinatorial optimization methods.
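To give a flavor of the CP side of such hybrid methods, here is a minimal cumulative scheduling sketch with precedence constraints, written with Google OR-Tools CP-SAT as an illustrative stand-in; the toy tasks, capacity and horizon are invented, and the platform-specific allocation decisions of the thesis are omitted.

```python
# A minimal CP sketch: schedule precedence-connected tasks on one
# finite-capacity resource, minimizing the makespan (illustrative).
from ortools.sat.python import cp_model

tasks = {"t1": (4, 2), "t2": (3, 1), "t3": (2, 2)}  # name: (duration, demand)
precedences = [("t1", "t3")]                         # t1 ends before t3 starts
capacity, horizon = 3, 9

model = cp_model.CpModel()
start, end, interval = {}, {}, {}
for name, (dur, _) in tasks.items():
    start[name] = model.NewIntVar(0, horizon, f"s_{name}")
    end[name] = model.NewIntVar(0, horizon, f"e_{name}")
    interval[name] = model.NewIntervalVar(start[name], dur, end[name], f"i_{name}")

for a, b in precedences:
    model.Add(end[a] <= start[b])

# the resource hosts at most `capacity` units of demand at any time
model.AddCumulative([interval[n] for n in tasks],
                    [d for _, d in tasks.values()], capacity)

makespan = model.NewIntVar(0, horizon, "makespan")
model.AddMaxEquality(makespan, [end[n] for n in tasks])
model.Minimize(makespan)

solver = cp_model.CpSolver()
if solver.Solve(model) == cp_model.OPTIMAL:
    print({n: solver.Value(start[n]) for n in tasks}, solver.Value(makespan))
```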

Relevance:

30.00%

Abstract:

This work presents hybrid Constraint Programming (CP) and metaheuristic methods for the solution of Large Scale Optimization Problems; it aims at integrating concepts and mechanisms from metaheuristic methods into a CP-based tree search environment, in order to exploit the advantages of both approaches. The modeling and solution of large-scale combinatorial optimization problems is a topic that has aroused the interest of many researchers in the Operations Research field; combinatorial optimization problems are widespread in everyday life, and the need to solve difficult problems is more and more urgent. Metaheuristic techniques have been developed over the last decades to effectively handle the approximate solution of combinatorial optimization problems; we examine metaheuristics in detail, focusing on the aspects common to different techniques. Each metaheuristic approach possesses its own peculiarities in designing and guiding the solution process; our work aims at recognizing components that can be extracted from metaheuristic methods and reused in different contexts. In particular, we focus on the possibility of porting metaheuristic elements to constraint programming based environments, as constraint programming deals with the feasibility issues of optimization problems in a very effective manner. Moreover, CP offers a general paradigm that allows any type of problem to be easily modeled and solved within a problem-independent framework, unlike local search and metaheuristic methods, which are highly problem-specific.

In this work we describe the implementation of the Local Branching framework, originally developed for Mixed Integer Programming, in a CP-based environment. Constraint programming specific features are used to ease the search process, while maintaining the full generality of the approach. We also propose a search strategy called Sliced Neighborhood Search (SNS), which iteratively explores slices of large neighborhoods of an incumbent solution by performing CP-based tree search, and which encloses concepts from metaheuristic techniques. SNS can be used as a stand-alone search strategy, but it can alternatively be embedded in existing strategies as an intensification and diversification mechanism; in particular, we show its integration within CP-based local branching. We provide an extensive experimental evaluation of the proposed approaches on instances of the Asymmetric Traveling Salesman Problem and of the Asymmetric Traveling Salesman Problem with Time Windows. The proposed approaches achieve good results on practical-size problems, thus demonstrating the benefit of integrating metaheuristic concepts in CP-based frameworks.
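For reference, the MIP form of the local branching constraint that the thesis ports to CP can be sketched in a few lines; the PuLP encoding below on binary variables is an illustrative assumption, not the thesis' CP realization.

```python
# A minimal sketch of the local branching constraint: restrict search to
# solutions within Hamming distance k of the incumbent (binary case).
import pulp

def add_local_branching(prob, x, incumbent, k):
    """x: dict of binary LpVariables; incumbent: dict of 0/1 values."""
    distance = pulp.lpSum((1 - x[j]) if incumbent[j] == 1 else x[j]
                          for j in x)
    prob += distance <= k, "local_branching"
    return prob
```

In a branching scheme, the left branch imposes this constraint with radius k while the right branch imposes its reverse (distance at least k + 1), partitioning the search space around the incumbent.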