17 resultados para Scientific Algorithms. Evolutionary Computation. Metaheuristics. Car Renter Salesman Problem
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
Combinatorial optimization problems have been strongly addressed throughout history. Their study involves highly applied problems that must be solved in reasonable times. This doctoral Thesis addresses three Operations Research problems: the first deals with the Traveling Salesman Problem with Pickups and Delivery with Handling cost, which was approached with two metaheuristics based on Iterated Local Search; the results show that the proposed methods are faster and obtain good results respect to the metaheuristics from the literature. The second problem corresponds to the Quadratic Multiple Knapsack Problem, and polynomial formulations and relaxations are presented for new instances of the problem; in addition, a metaheuristic and a matheuristic are proposed that are competitive with state of the art algorithms. Finally, an Open-Pit Mining problem is approached. This problem is solved with a parallel genetic algorithm that allows excavations using truncated cones. Each of these problems was computationally tested with difficult instances from the literature, obtaining good quality results in reasonable computational times, and making significant contributions to the state of the art techniques of Operations Research.
Resumo:
In this thesis we study three combinatorial optimization problems belonging to the classes of Network Design and Vehicle Routing problems that are strongly linked in the context of the design and management of transportation networks: the Non-Bifurcated Capacitated Network Design Problem (NBP), the Period Vehicle Routing Problem (PVRP) and the Pickup and Delivery Problem with Time Windows (PDPTW). These problems are NP-hard and contain as special cases some well known difficult problems such as the Traveling Salesman Problem and the Steiner Tree Problem. Moreover, they model the core structure of many practical problems arising in logistics and telecommunications. The NBP is the problem of designing the optimum network to satisfy a given set of traffic demands. Given a set of nodes, a set of potential links and a set of point-to-point demands called commodities, the objective is to select the links to install and dimension their capacities so that all the demands can be routed between their respective endpoints, and the sum of link fixed costs and commodity routing costs is minimized. The problem is called non- bifurcated because the solution network must allow each demand to follow a single path, i.e., the flow of each demand cannot be splitted. Although this is the case in many real applications, the NBP has received significantly less attention in the literature than other capacitated network design problems that allow bifurcation. We describe an exact algorithm for the NBP that is based on solving by an integer programming solver a formulation of the problem strengthened by simple valid inequalities and four new heuristic algorithms. One of these heuristics is an adaptive memory metaheuristic, based on partial enumeration, that could be applied to a wider class of structured combinatorial optimization problems. In the PVRP a fleet of vehicles of identical capacity must be used to service a set of customers over a planning period of several days. Each customer specifies a service frequency, a set of allowable day-combinations and a quantity of product that the customer must receive every time he is visited. For example, a customer may require to be visited twice during a 5-day period imposing that these visits take place on Monday-Thursday or Monday-Friday or Tuesday-Friday. The problem consists in simultaneously assigning a day- combination to each customer and in designing the vehicle routes for each day so that each customer is visited the required number of times, the number of routes on each day does not exceed the number of vehicles available, and the total cost of the routes over the period is minimized. We also consider a tactical variant of this problem, called Tactical Planning Vehicle Routing Problem, where customers require to be visited on a specific day of the period but a penalty cost, called service cost, can be paid to postpone the visit to a later day than that required. At our knowledge all the algorithms proposed in the literature for the PVRP are heuristics. In this thesis we present for the first time an exact algorithm for the PVRP that is based on different relaxations of a set partitioning-like formulation. The effectiveness of the proposed algorithm is tested on a set of instances from the literature and on a new set of instances. Finally, the PDPTW is to service a set of transportation requests using a fleet of identical vehicles of limited capacity located at a central depot. Each request specifies a pickup location and a delivery location and requires that a given quantity of load is transported from the pickup location to the delivery location. Moreover, each location can be visited only within an associated time window. Each vehicle can perform at most one route and the problem is to satisfy all the requests using the available vehicles so that each request is serviced by a single vehicle, the load on each vehicle does not exceed the capacity, and all locations are visited according to their time window. We formulate the PDPTW as a set partitioning-like problem with additional cuts and we propose an exact algorithm based on different relaxations of the mathematical formulation and a branch-and-cut-and-price algorithm. The new algorithm is tested on two classes of problems from the literature and compared with a recent branch-and-cut-and-price algorithm from the literature.
Resumo:
In this thesis we present some combinatorial optimization problems, suggest models and algorithms for their effective solution. For each problem,we give its description, followed by a short literature review, provide methods to solve it and, finally, present computational results and comparisons with previous works to show the effectiveness of the proposed approaches. The considered problems are: the Generalized Traveling Salesman Problem (GTSP), the Bin Packing Problem with Conflicts(BPPC) and the Fair Layout Problem (FLOP).
Resumo:
One of the most interesting challenge of the next years will be the Air Space Systems automation. This process will involve different aspects as the Air Traffic Management, the Aircrafts and Airport Operations and the Guidance and Navigation Systems. The use of UAS (Uninhabited Aerial System) for civil mission will be one of the most important steps in this automation process. In civil air space, Air Traffic Controllers (ATC) manage the air traffic ensuring that a minimum separation between the controlled aircrafts is always provided. For this purpose ATCs use several operative avoidance techniques like holding patterns or rerouting. The use of UAS in these context will require the definition of strategies for a common management of piloted and piloted air traffic that allow the UAS to self separate. As a first employment in civil air space we consider a UAS surveillance mission that consists in departing from a ground base, taking pictures over a set of mission targets and coming back to the same ground base. During all mission a set of piloted aircrafts fly in the same airspace and thus the UAS has to self separate using the ATC avoidance as anticipated. We consider two objective, the first consists in the minimization of the air traffic impact over the mission, the second consists in the minimization of the impact of the mission over the air traffic. A particular version of the well known Travelling Salesman Problem (TSP) called Time-Dependant-TSP has been studied to deal with traffic problems in big urban areas. Its basic idea consists in a cost of the route between two clients depending on the period of the day in which it is crossed. Our thesis supports that such idea can be applied to the air traffic too using a convenient time horizon compatible with aircrafts operations. The cost of a UAS sub-route will depend on the air traffic that it will meet starting such route in a specific moment and consequently on the avoidance maneuver that it will use to avoid that conflict. The conflict avoidance is a topic that has been hardly developed in past years using different approaches. In this thesis we purpose a new approach based on the use of ATC operative techniques that makes it possible both to model the UAS problem using a TDTSP framework both to use an Air Traffic Management perspective. Starting from this kind of mission, the problem of the UAS insertion in civil air space is formalized as the UAS Routing Problem (URP). For this reason we introduce a new structure called Conflict Graph that makes it possible to model the avoidance maneuvers and to define the arc cost function of the departing time. Two Integer Linear Programming formulations of the problem are proposed. The first is based on a TDTSP formulation that, unfortunately, is weaker then the TSP formulation. Thus a new formulation based on a TSP variation that uses specific penalty to model the holdings is proposed. Different algorithms are presented: exact algorithms, simple heuristics used as Upper Bounds on the number of time steps used, and metaheuristic algorithms as Genetic Algorithm and Simulated Annealing. Finally an air traffic scenario has been simulated using real air traffic data in order to test our algorithms. Graphic Tools have been used to represent the Milano Linate air space and its air traffic during different days. Such data have been provided by ENAV S.p.A (Italian Agency for Air Navigation Services).
Resumo:
This work presents hybrid Constraint Programming (CP) and metaheuristic methods for the solution of Large Scale Optimization Problems; it aims at integrating concepts and mechanisms from the metaheuristic methods to a CP-based tree search environment in order to exploit the advantages of both approaches. The modeling and solution of large scale combinatorial optimization problem is a topic which has arisen the interest of many researcherers in the Operations Research field; combinatorial optimization problems are widely spread in everyday life and the need of solving difficult problems is more and more urgent. Metaheuristic techniques have been developed in the last decades to effectively handle the approximate solution of combinatorial optimization problems; we will examine metaheuristics in detail, focusing on the common aspects of different techniques. Each metaheuristic approach possesses its own peculiarities in designing and guiding the solution process; our work aims at recognizing components which can be extracted from metaheuristic methods and re-used in different contexts. In particular we focus on the possibility of porting metaheuristic elements to constraint programming based environments, as constraint programming is able to deal with feasibility issues of optimization problems in a very effective manner. Moreover, CP offers a general paradigm which allows to easily model any type of problem and solve it with a problem-independent framework, differently from local search and metaheuristic methods which are highly problem specific. In this work we describe the implementation of the Local Branching framework, originally developed for Mixed Integer Programming, in a CP-based environment. Constraint programming specific features are used to ease the search process, still mantaining an absolute generality of the approach. We also propose a search strategy called Sliced Neighborhood Search, SNS, that iteratively explores slices of large neighborhoods of an incumbent solution by performing CP-based tree search and encloses concepts from metaheuristic techniques. SNS can be used as a stand alone search strategy, but it can alternatively be embedded in existing strategies as intensification and diversification mechanism. In particular we show its integration within the CP-based local branching. We provide an extensive experimental evaluation of the proposed approaches on instances of the Asymmetric Traveling Salesman Problem and of the Asymmetric Traveling Salesman Problem with Time Windows. The proposed approaches achieve good results on practical size problem, thus demonstrating the benefit of integrating metaheuristic concepts in CP-based frameworks.
Resumo:
Mixed integer programming is up today one of the most widely used techniques for dealing with hard optimization problems. On the one side, many practical optimization problems arising from real-world applications (such as, e.g., scheduling, project planning, transportation, telecommunications, economics and finance, timetabling, etc) can be easily and effectively formulated as Mixed Integer linear Programs (MIPs). On the other hand, 50 and more years of intensive research has dramatically improved on the capability of the current generation of MIP solvers to tackle hard problems in practice. However, many questions are still open and not fully understood, and the mixed integer programming community is still more than active in trying to answer some of these questions. As a consequence, a huge number of papers are continuously developed and new intriguing questions arise every year. When dealing with MIPs, we have to distinguish between two different scenarios. The first one happens when we are asked to handle a general MIP and we cannot assume any special structure for the given problem. In this case, a Linear Programming (LP) relaxation and some integrality requirements are all we have for tackling the problem, and we are ``forced" to use some general purpose techniques. The second one happens when mixed integer programming is used to address a somehow structured problem. In this context, polyhedral analysis and other theoretical and practical considerations are typically exploited to devise some special purpose techniques. This thesis tries to give some insights in both the above mentioned situations. The first part of the work is focused on general purpose cutting planes, which are probably the key ingredient behind the success of the current generation of MIP solvers. Chapter 1 presents a quick overview of the main ingredients of a branch-and-cut algorithm, while Chapter 2 recalls some results from the literature in the context of disjunctive cuts and their connections with Gomory mixed integer cuts. Chapter 3 presents a theoretical and computational investigation of disjunctive cuts. In particular, we analyze the connections between different normalization conditions (i.e., conditions to truncate the cone associated with disjunctive cutting planes) and other crucial aspects as cut rank, cut density and cut strength. We give a theoretical characterization of weak rays of the disjunctive cone that lead to dominated cuts, and propose a practical method to possibly strengthen those cuts arising from such weak extremal solution. Further, we point out how redundant constraints can affect the quality of the generated disjunctive cuts, and discuss possible ways to cope with them. Finally, Chapter 4 presents some preliminary ideas in the context of multiple-row cuts. Very recently, a series of papers have brought the attention to the possibility of generating cuts using more than one row of the simplex tableau at a time. Several interesting theoretical results have been presented in this direction, often revisiting and recalling other important results discovered more than 40 years ago. However, is not clear at all how these results can be exploited in practice. As stated, the chapter is a still work-in-progress and simply presents a possible way for generating two-row cuts from the simplex tableau arising from lattice-free triangles and some preliminary computational results. The second part of the thesis is instead focused on the heuristic and exact exploitation of integer programming techniques for hard combinatorial optimization problems in the context of routing applications. Chapters 5 and 6 present an integer linear programming local search algorithm for Vehicle Routing Problems (VRPs). The overall procedure follows a general destroy-and-repair paradigm (i.e., the current solution is first randomly destroyed and then repaired in the attempt of finding a new improved solution) where a class of exponential neighborhoods are iteratively explored by heuristically solving an integer programming formulation through a general purpose MIP solver. Chapters 7 and 8 deal with exact branch-and-cut methods. Chapter 7 presents an extended formulation for the Traveling Salesman Problem with Time Windows (TSPTW), a generalization of the well known TSP where each node must be visited within a given time window. The polyhedral approaches proposed for this problem in the literature typically follow the one which has been proven to be extremely effective in the classical TSP context. Here we present an overall (quite) general idea which is based on a relaxed discretization of time windows. Such an idea leads to a stronger formulation and to stronger valid inequalities which are then separated within the classical branch-and-cut framework. Finally, Chapter 8 addresses the branch-and-cut in the context of Generalized Minimum Spanning Tree Problems (GMSTPs) (i.e., a class of NP-hard generalizations of the classical minimum spanning tree problem). In this chapter, we show how some basic ideas (and, in particular, the usage of general purpose cutting planes) can be useful to improve on branch-and-cut methods proposed in the literature.
Resumo:
Let’s put ourselves in the shoes of an energy company. Our fleet of electricity production plants mainly includes gas, hydroelectric and waste-to-energy plants. We also sold contracts for the supply of gas and electricity. For each year we have to plan the trading of the volumes needed by the plants and customers: better to fix the price of these volumes in advance with the so-called forward contracts, instead of waiting for the delivery months, exposing ourselves to price uncertainty. Here’s the thing: trying to keep uncertainty under control in a market that has never shown such extreme scenarios as in recent years: a pandemic, a worsening climate crisis and a war that is affecting economies around the world have made the energy market more volatile than ever. How to make decisions in such uncertain contexts? There is an optimization problem: given a year, we need to choose the optimal planning of volume trading times, to meet the needs of our portfolio at the best prices, taking into account the liquidity constraints given by the market and the risk constraints imposed by the company. Algorithms are needed for the generation of market scenarios over a finite time horizon, that is, a probabilistic distribution that allows a view of all the dates between now and the end of the year of interest. Algorithms are needed to solve the optimization problem: we have proposed more than one and compared them; a very simple one, which avoids considering part of the complexity, moving on to a scenario approach and finally a reinforcement learning approach.
Resumo:
In this thesis we made the first steps towards the systematic application of a methodology for automatically building formal models of complex biological systems. Such a methodology could be useful also to design artificial systems possessing desirable properties such as robustness and evolvability. The approach we follow in this thesis is to manipulate formal models by means of adaptive search methods called metaheuristics. In the first part of the thesis we develop state-of-the-art hybrid metaheuristic algorithms to tackle two important problems in genomics, namely, the Haplotype Inference by parsimony and the Founder Sequence Reconstruction Problem. We compare our algorithms with other effective techniques in the literature, we show strength and limitations of our approaches to various problem formulations and, finally, we propose further enhancements that could possibly improve the performance of our algorithms and widen their applicability. In the second part, we concentrate on Boolean network (BN) models of gene regulatory networks (GRNs). We detail our automatic design methodology and apply it to four use cases which correspond to different design criteria and address some limitations of GRN modeling by BNs. Finally, we tackle the Density Classification Problem with the aim of showing the learning capabilities of BNs. Experimental evaluation of this methodology shows its efficacy in producing network that meet our design criteria. Our results, coherently to what has been found in other works, also suggest that networks manipulated by a search process exhibit a mixture of characteristics typical of different dynamical regimes.
Resumo:
The Peer-to-Peer network paradigm is drawing the attention of both final users and researchers for its features. P2P networks shift from the classic client-server approach to a high level of decentralization where there is no central control and all the nodes should be able not only to require services, but to provide them to other peers as well. While on one hand such high level of decentralization might lead to interesting properties like scalability and fault tolerance, on the other hand it implies many new problems to deal with. A key feature of many P2P systems is openness, meaning that everybody is potentially able to join a network with no need for subscription or payment systems. The combination of openness and lack of central control makes it feasible for a user to free-ride, that is to increase its own benefit by using services without allocating resources to satisfy other peers’ requests. One of the main goals when designing a P2P system is therefore to achieve cooperation between users. Given the nature of P2P systems based on simple local interactions of many peers having partial knowledge of the whole system, an interesting way to achieve desired properties on a system scale might consist in obtaining them as emergent properties of the many interactions occurring at local node level. Two methods are typically used to face the problem of cooperation in P2P networks: 1) engineering emergent properties when designing the protocol; 2) study the system as a game and apply Game Theory techniques, especially to find Nash Equilibria in the game and to reach them making the system stable against possible deviant behaviors. In this work we present an evolutionary framework to enforce cooperative behaviour in P2P networks that is alternative to both the methods mentioned above. Our approach is based on an evolutionary algorithm inspired by computational sociology and evolutionary game theory, consisting in having each peer periodically trying to copy another peer which is performing better. The proposed algorithms, called SLAC and SLACER, draw inspiration from tag systems originated in computational sociology, the main idea behind the algorithm consists in having low performance nodes copying high performance ones. The algorithm is run locally by every node and leads to an evolution of the network both from the topology and from the nodes’ strategy point of view. Initial tests with a simple Prisoners’ Dilemma application show how SLAC is able to bring the network to a state of high cooperation independently from the initial network conditions. Interesting results are obtained when studying the effect of cheating nodes on SLAC algorithm. In fact in some cases selfish nodes rationally exploiting the system for their own benefit can actually improve system performance from the cooperation formation point of view. The final step is to apply our results to more realistic scenarios. We put our efforts in studying and improving the BitTorrent protocol. BitTorrent was chosen not only for its popularity but because it has many points in common with SLAC and SLACER algorithms, ranging from the game theoretical inspiration (tit-for-tat-like mechanism) to the swarms topology. We discovered fairness, meant as ratio between uploaded and downloaded data, to be a weakness of the original BitTorrent protocol and we drew inspiration from the knowledge of cooperation formation and maintenance mechanism derived from the development and analysis of SLAC and SLACER, to improve fairness and tackle freeriding and cheating in BitTorrent. We produced an extension of BitTorrent called BitFair that has been evaluated through simulation and has shown the abilities of enforcing fairness and tackling free-riding and cheating nodes.
Resumo:
The aim of this Doctoral Thesis is to develop a genetic algorithm based optimization methods to find the best conceptual design architecture of an aero-piston-engine, for given design specifications. Nowadays, the conceptual design of turbine airplanes starts with the aircraft specifications, then the most suited turbofan or turbo propeller for the specific application is chosen. In the aeronautical piston engines field, which has been dormant for several decades, as interest shifted towards turboaircraft, new materials with increased performance and properties have opened new possibilities for development. Moreover, the engine’s modularity given by the cylinder unit, makes it possible to design a specific engine for a given application. In many real engineering problems the amount of design variables may be very high, characterized by several non-linearities needed to describe the behaviour of the phenomena. In this case the objective function has many local extremes, but the designer is usually interested in the global one. The stochastic and the evolutionary optimization techniques, such as the genetic algorithms method, may offer reliable solutions to the design problems, within acceptable computational time. The optimization algorithm developed here can be employed in the first phase of the preliminary project of an aeronautical piston engine design. It’s a mono-objective genetic algorithm, which, starting from the given design specifications, finds the engine propulsive system configuration which possesses minimum mass while satisfying the geometrical, structural and performance constraints. The algorithm reads the project specifications as input data, namely the maximum values of crankshaft and propeller shaft speed and the maximal pressure value in the combustion chamber. The design variables bounds, that describe the solution domain from the geometrical point of view, are introduced too. In the Matlab® Optimization environment the objective function to be minimized is defined as the sum of the masses of the engine propulsive components. Each individual that is generated by the genetic algorithm is the assembly of the flywheel, the vibration damper and so many pistons, connecting rods, cranks, as the number of the cylinders. The fitness is evaluated for each individual of the population, then the rules of the genetic operators are applied, such as reproduction, mutation, selection, crossover. In the reproduction step the elitist method is applied, in order to save the fittest individuals from a contingent mutation and recombination disruption, making it undamaged survive until the next generation. Finally, as the best individual is found, the optimal dimensions values of the components are saved to an Excel® file, in order to build a CAD-automatic-3D-model for each component of the propulsive system, having a direct pre-visualization of the final product, still in the engine’s preliminary project design phase. With the purpose of showing the performance of the algorithm and validating this optimization method, an actual engine is taken, as a case study: it’s the 1900 JTD Fiat Avio, 4 cylinders, 4T, Diesel. Many verifications are made on the mechanical components of the engine, in order to test their feasibility and to decide their survival through generations. A system of inequalities is used to describe the non-linear relations between the design variables, and is used for components checking for static and dynamic loads configurations. The design variables geometrical boundaries are taken from actual engines data and similar design cases. Among the many simulations run for algorithm testing, twelve of them have been chosen as representative of the distribution of the individuals. Then, as an example, for each simulation, the corresponding 3D models of the crankshaft and the connecting rod, have been automatically built. In spite of morphological differences among the component the mass is almost the same. The results show a significant mass reduction (almost 20% for the crankshaft) in comparison to the original configuration, and an acceptable robustness of the method have been shown. The algorithm here developed is shown to be a valid method for an aeronautical-piston-engine preliminary project design optimization. In particular the procedure is able to analyze quite a wide range of design solutions, rejecting the ones that cannot fulfill the feasibility design specifications. This optimization algorithm could increase the aeronautical-piston-engine development, speeding up the production rate and joining modern computation performances and technological awareness to the long lasting traditional design experiences.
Resumo:
In a large number of problems the high dimensionality of the search space, the vast number of variables and the economical constrains limit the ability of classical techniques to reach the optimum of a function, known or unknown. In this thesis we investigate the possibility to combine approaches from advanced statistics and optimization algorithms in such a way to better explore the combinatorial search space and to increase the performance of the approaches. To this purpose we propose two methods: (i) Model Based Ant Colony Design and (ii) Naïve Bayes Ant Colony Optimization. We test the performance of the two proposed solutions on a simulation study and we apply the novel techniques on an appplication in the field of Enzyme Engineering and Design.
Resumo:
Images of a scene, static or dynamic, are generally acquired at different epochs from different viewpoints. They potentially gather information about the whole scene and its relative motion with respect to the acquisition device. Data from different (in the spatial or temporal domain) visual sources can be fused together to provide a unique consistent representation of the whole scene, even recovering the third dimension, permitting a more complete understanding of the scene content. Moreover, the pose of the acquisition device can be achieved by estimating the relative motion parameters linking different views, thus providing localization information for automatic guidance purposes. Image registration is based on the use of pattern recognition techniques to match among corresponding parts of different views of the acquired scene. Depending on hypotheses or prior information about the sensor model, the motion model and/or the scene model, this information can be used to estimate global or local geometrical mapping functions between different images or different parts of them. These mapping functions contain relative motion parameters between the scene and the sensor(s) and can be used to integrate accordingly informations coming from the different sources to build a wider or even augmented representation of the scene. Accordingly, for their scene reconstruction and pose estimation capabilities, nowadays image registration techniques from multiple views are increasingly stirring up the interest of the scientific and industrial community. Depending on the applicative domain, accuracy, robustness, and computational payload of the algorithms represent important issues to be addressed and generally a trade-off among them has to be reached. Moreover, on-line performance is desirable in order to guarantee the direct interaction of the vision device with human actors or control systems. This thesis follows a general research approach to cope with these issues, almost independently from the scene content, under the constraint of rigid motions. This approach has been motivated by the portability to very different domains as a very desirable property to achieve. A general image registration approach suitable for on-line applications has been devised and assessed through two challenging case studies in different applicative domains. The first case study regards scene reconstruction through on-line mosaicing of optical microscopy cell images acquired with non automated equipment, while moving manually the microscope holder. By registering the images the field of view of the microscope can be widened, preserving the resolution while reconstructing the whole cell culture and permitting the microscopist to interactively explore the cell culture. In the second case study, the registration of terrestrial satellite images acquired by a camera integral with the satellite is utilized to estimate its three-dimensional orientation from visual data, for automatic guidance purposes. Critical aspects of these applications are emphasized and the choices adopted are motivated accordingly. Results are discussed in view of promising future developments.
Resumo:
The identification of people by measuring some traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis is focused on improving fingerprint recognition systems considering three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. An effective extraction of salient fingerprint features depends on the quality of the input fingerprint. If the fingerprint is very noisy, we are not able to detect a reliable set of features. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands like wildfire toward low-quality ones. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. The fingerprint orientation extraction is improved following two directions. First, after the introduction of a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, pointing out the role of pre- and post- processing, we show how to improve the extraction. Second, the introduction of a new hybrid orientation extraction method, which follows an adaptive scheme, allows to improve significantly the orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules and therefore an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that determine an actual progress in the state-of-the-art. The lack of a publicly available framework to compare fingerprint orientation extraction algorithms, motivates the introduction of a new benchmark area called FOE (including fingerprints and manually-marked orientation ground-truth) along with fingerprint matching benchmarks in the FVC-onGoing framework. The success of such framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.
Resumo:
Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.