5 resultados para real world learning
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
Logistics involves planning, managing, and organizing the flows of goods from the point of origin to the point of destination in order to meet some requirements. Logistics and transportation aspects are very important and represent a relevant costs for producing and shipping companies, but also for public administration and private citizens. The optimization of resources and the improvement in the organization of operations is crucial for all branches of logistics, from the operation management to the transportation. As we will have the chance to see in this work, optimization techniques, models, and algorithms represent important methods to solve the always new and more complex problems arising in different segments of logistics. Many operation management and transportation problems are related to the optimization class of problems called Vehicle Routing Problems (VRPs). In this work, we consider several real-world deterministic and stochastic problems that are included in the wide class of the VRPs, and we solve them by means of exact and heuristic methods. We treat three classes of real-world routing and logistics problems. We deal with one of the most important tactical problems that arises in the managing of the bike sharing systems, that is the Bike sharing Rebalancing Problem (BRP). We propose models and algorithms for real-world earthwork optimization problems. We describe the 3DP process and we highlight several optimization issues in 3DP. Among those, we define the problem related to the tool path definition in the 3DP process, the 3D Routing Problem (3DRP), which is a generalization of the arc routing problem. We present an ILP model and several heuristic algorithms to solve the 3DRP.
Resumo:
In many application domains data can be naturally represented as graphs. When the application of analytical solutions for a given problem is unfeasible, machine learning techniques could be a viable way to solve the problem. Classical machine learning techniques are defined for data represented in a vectorial form. Recently some of them have been extended to deal directly with structured data. Among those techniques, kernel methods have shown promising results both from the computational complexity and the predictive performance point of view. Kernel methods allow to avoid an explicit mapping in a vectorial form relying on kernel functions, which informally are functions calculating a similarity measure between two entities. However, the definition of good kernels for graphs is a challenging problem because of the difficulty to find a good tradeoff between computational complexity and expressiveness. Another problem we face is learning on data streams, where a potentially unbounded sequence of data is generated by some sources. There are three main contributions in this thesis. The first contribution is the definition of a new family of kernels for graphs based on Directed Acyclic Graphs (DAGs). We analyzed two kernels from this family, achieving state-of-the-art results from both the computational and the classification point of view on real-world datasets. The second contribution consists in making the application of learning algorithms for streams of graphs feasible. Moreover,we defined a principled way for the memory management. The third contribution is the application of machine learning techniques for structured data to non-coding RNA function prediction. In this setting, the secondary structure is thought to carry relevant information. However, existing methods considering the secondary structure have prohibitively high computational complexity. We propose to apply kernel methods on this domain, obtaining state-of-the-art results.
Resumo:
Machine learning comprises a series of techniques for automatic extraction of meaningful information from large collections of noisy data. In many real world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among all the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree structured data two issues become relevant: kernel for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too few information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function behaving as the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernel, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required both in learning and classification phases. Such a complexity can sometimes prevents the kernel application in scenarios involving large amount of data. This thesis proposes three contributions for resolving the above issues of kernel for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions. Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. Convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted at reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Direct Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.
Resumo:
Cost, performance and availability considerations are forcing even the most conservative high-integrity embedded real-time systems industry to migrate from simple hardware processors to ones equipped with caches and other acceleration features. This migration disrupts the practices and solutions that industry had developed and consolidated over the years to perform timing analysis. Industry that are confident with the efficiency/effectiveness of their verification and validation processes for old-generation processors, do not have sufficient insight on the effects of the migration to cache-equipped processors. Caches are perceived as an additional source of complexity, which has potential for shattering the guarantees of cost- and schedule-constrained qualification of their systems. The current industrial approach to timing analysis is ill-equipped to cope with the variability incurred by caches. Conversely, the application of advanced WCET analysis techniques on real-world industrial software, developed without analysability in mind, is hardly feasible. We propose a development approach aimed at minimising the cache jitters, as well as at enabling the application of advanced WCET analysis techniques to industrial systems. Our approach builds on:(i) identification of those software constructs that may impede or complicate timing analysis in industrial-scale systems; (ii) elaboration of practical means, under the model-driven engineering (MDE) paradigm, to enforce the automated generation of software that is analyzable by construction; (iii) implementation of a layout optimisation method to remove cache jitters stemming from the software layout in memory, with the intent of facilitating incremental software development, which is of high strategic interest to industry. The integration of those constituents in a structured approach to timing analysis achieves two interesting properties: the resulting software is analysable from the earliest releases onwards - as opposed to becoming so only when the system is final - and more easily amenable to advanced timing analysis by construction, regardless of the system scale and complexity.
Resumo:
The development of High-Integrity Real-Time Systems has a high footprint in terms of human, material and schedule costs. Factoring functional, reusable logic in the application favors incremental development and contains costs. Yet, achieving incrementality in the timing behavior is a much harder problem. Complex features at all levels of the execution stack, aimed to boost average-case performance, exhibit timing behavior highly dependent on execution history, which wrecks time composability and incrementaility with it. Our goal here is to restitute time composability to the execution stack, working bottom up across it. We first characterize time composability without making assumptions on the system architecture or the software deployment to it. Later, we focus on the role played by the real-time operating system in our pursuit. Initially we consider single-core processors and, becoming less permissive on the admissible hardware features, we devise solutions that restore a convincing degree of time composability. To show what can be done for real, we developed TiCOS, an ARINC-compliant kernel, and re-designed ORK+, a kernel for Ada Ravenscar runtimes. In that work, we added support for limited-preemption to ORK+, an absolute premiere in the landscape of real-word kernels. Our implementation allows resource sharing to co-exist with limited-preemptive scheduling, which extends state of the art. We then turn our attention to multicore architectures, first considering partitioned systems, for which we achieve results close to those obtained for single-core processors. Subsequently, we shy away from the over-provision of those systems and consider less restrictive uses of homogeneous multiprocessors, where the scheduling algorithm is key to high schedulable utilization. To that end we single out RUN, a promising baseline, and extend it to SPRINT, which supports sporadic task sets, hence matches real-world industrial needs better. To corroborate our results we present findings from real-world case studies from avionic industry.