998 resultados para ART ALGORITHM
Resumo:
Consider the problem of assigning implicit-deadline sporadic tasks on a heterogeneous multiprocessor platform comprising two different types of processors—such a platform is referred to as two-type platform. We present two low degree polynomial time-complexity algorithms, SA and SA-P, each providing the following guarantee. For a given two-type platform and a task set, if there exists a task assignment such that tasks can be scheduled to meet deadlines by allowing them to migrate only between processors of the same type (intra-migrative), then (i) using SA, it is guaranteed to find such an assignment where the same restriction on task migration applies but given a platform in which processors are 1+α/2 times faster and (ii) SA-P succeeds in finding a task assignment where tasks are not allowed to migrate between processors (non-migrative) but given a platform in which processors are 1+α times faster. The parameter 0<α≤1 is a property of the task set; it is the maximum of all the task utilizations that are no greater than 1. We evaluate average-case performance of both the algorithms by generating task sets randomly and measuring how much faster processors the algorithms need (which is upper bounded by 1+α/2 for SA and 1+α for SA-P) in order to output a feasible task assignment (intra-migrative for SA and non-migrative for SA-P). In our evaluations, for the vast majority of task sets, these algorithms require significantly smaller processor speedup than indicated by their theoretical bounds. Finally, we consider a special case where no task utilization in the given task set can exceed one and for this case, we (re-)prove the performance guarantees of SA and SA-P. We show, for both of the algorithms, that changing the adversary from intra-migrative to a more powerful one, namely fully-migrative, in which tasks can migrate between processors of any type, does not deteriorate the performance guarantees. For this special case, we compare the average-case performance of SA-P and a state-of-the-art algorithm by generating task sets randomly. In our evaluations, SA-P outperforms the state-of-the-art by requiring much smaller processor speedup and by running orders of magnitude faster.
Resumo:
Purpose: We present an iterative framework for CT reconstruction from transmission ultrasound data which accurately and efficiently models the strong refraction effects that occur in our target application: Imaging the female breast. Methods: Our refractive ray tracing framework has its foundation in the fast marching method (FNMM) and it allows an accurate as well as efficient modeling of curved rays. We also describe a novel regularization scheme that yields further significant reconstruction quality improvements. A final contribution is the development of a realistic anthropomorphic digital breast phantom based on the NIH Visible Female data set. Results: Our system is able to resolve very fine details even in the presence of significant noise, and it reconstructs both sound speed and attenuation data. Excellent correspondence with a traditional, but significantly more computationally expensive wave equation solver is achieved. Conclusions: Apart from the accurate modeling of curved rays, decisive factors have also been our regularization scheme and the high-quality interpolation filter we have used. An added benefit of our framework is that it accelerates well on GPUs where we have shown that clinical 3D reconstruction speeds on the order of minutes are possible.
Resumo:
This paper presents a method for fast calculation of the egomotion done by a robot using visual features. The method is part of a complete system for automatic map building and Simultaneous Localization and Mapping (SLAM). The method uses optical flow in order to determine if the robot has done a movement. If so, some visual features which do not accomplish several criteria (like intersection, unicity, etc,) are deleted, and then the egomotion is calculated. We use a state-of-the-art algorithm (TORO) in order to rectify the map and solve the SLAM problem. The proposed method provides better efficiency that other current methods.
Resumo:
This paper presents a method for the fast calculation of a robot’s egomotion using visual features. The method is part of a complete system for automatic map building and Simultaneous Location and Mapping (SLAM). The method uses optical flow to determine whether the robot has undergone a movement. If so, some visual features that do not satisfy several criteria are deleted, and then egomotion is calculated. Thus, the proposed method improves the efficiency of the whole process because not all the data is processed. We use a state-of-the-art algorithm (TORO) to rectify the map and solve the SLAM problem. Additionally, a study of different visual detectors and descriptors has been conducted to identify which of them are more suitable for the SLAM problem. Finally, a navigation method is described using the map obtained from the SLAM solution.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
Network Virtualization is a key technology for the Future Internet, allowing the deployment of multiple independent virtual networks that use resources of the same basic infrastructure. An important challenge in the dynamic provision of virtual networks resides in the optimal allocation of physical resources (nodes and links) to requirements of virtual networks. This problem is known as Virtual Network Embedding (VNE). For the resolution of this problem, previous research has focused on designing algorithms based on the optimization of a single objective. On the contrary, in this work we present a multi-objective algorithm, called VNE-MO-ILP, for solving dynamic VNE problem, which calculates an approximation of the Pareto Front considering simultaneously resource utilization and load balancing. Experimental results show evidences that the proposed algorithm is better or at least comparable to a state-of-the-art algorithm. Two performance metrics were simultaneously evaluated: (i) Virtual Network Request Acceptance Ratio and (ii) Revenue/Cost Relation. The size of test networks used in the experiments shows that the proposed algorithm scales well in execution times, for networks of 84 nodes
Resumo:
This paper presents an improved parallel Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. Motion Vectors (MV) are generated from the first-pass LHMEA and used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We used bashtable into video processing and completed parallel implementation. The hashtable structure of LHMEA is improved compared to the original TPA and LHMEA. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. The implementation contains spatial and temporal approaches. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Resumo:
Given a set of mixed spectral (multispectral or hyperspectral) vectors, linear spectral mixture analysis, or linear unmixing, aims at estimating the number of reference substances, also called endmembers, their spectral signatures, and their abundance fractions. This paper presents a new method for unsupervised endmember extraction from hyperspectral data, termed vertex component analysis (VCA). The algorithm exploits two facts: (1) the endmembers are the vertices of a simplex and (2) the affine transformation of a simplex is also a simplex. In a series of experiments using simulated and real data, the VCA algorithm competes with state-of-the-art methods, with a computational complexity between one and two orders of magnitude lower than the best available method.
Resumo:
In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to make an efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings up to 35% without any impact in the performance efficiency of the image retrieval task. © 2014 EURASIP.
Resumo:
Recent integrated circuit technologies have opened the possibility to design parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge with many architectural options. Exploring the design space gets even more difficult if, beyond performance and area, we also consider extra metrics like performance and area efficiency, where the designer tries to design the architecture with the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to design a many-core architecture. Instead of doing the design space exploration of the many core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to make a formal analysis of the algorithms considering the main architectural aspects and to determine how each particular architectural aspect is related to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory and the memory hierarchy. To exemplify the approach we did a theoretical analysis of a dense matrix multiplication algorithm and determined an equation that relates the number of execution cycles with the architectural parameters. Based on this equation a many-core architecture has been designed. The results obtained indicate that a 100 mm(2) integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double precision floating-point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71 %. Considering a 45 nm technology, a 100 mm(2) chip attains 833 GFLOPs which corresponds to 84 % of peak performance These figures are better than those obtained by previous many-core architectures, except for the area efficiency which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-cores architectures designed specifically to achieve high performance for matrix multiplication.
Resumo:
Natural selection favors the survival and reproduction of organisms that are best adapted to their environment. Selection mechanism in evolutionary algorithms mimics this process, aiming to create environmental conditions in which artificial organisms could evolve solving the problem at hand. This paper proposes a new selection scheme for evolutionary multiobjective optimization. The similarity measure that defines the concept of the neighborhood is a key feature of the proposed selection. Contrary to commonly used approaches, usually defined on the basis of distances between either individuals or weight vectors, it is suggested to consider the similarity and neighborhood based on the angle between individuals in the objective space. The smaller the angle, the more similar individuals. This notion is exploited during the mating and environmental selections. The convergence is ensured by minimizing distances from individuals to a reference point, whereas the diversity is preserved by maximizing angles between neighboring individuals. Experimental results reveal a highly competitive performance and useful characteristics of the proposed selection. Its strong diversity preserving ability allows to produce a significantly better performance on some problems when compared with stat-of-the-art algorithms.
Resumo:
Descriptors based on Molecular Interaction Fields (MIF) are highly suitable for drug discovery, but their size (thousands of variables) often limits their application in practice. Here we describe a simple and fast computational method that extracts from a MIF a handful of highly informative points (hot spots) which summarize the most relevant information. The method was specifically developed for drug discovery, is fast, and does not require human supervision, being suitable for its application on very large series of compounds. The quality of the results has been tested by running the method on the ligand structure of a large number of ligand-receptor complexes and then comparing the position of the selected hot spots with actual atoms of the receptor. As an additional test, the hot spots obtained with the novel method were used to obtain GRIND-like molecular descriptors which were compared with the original GRIND. In both cases the results show that the novel method is highly suitable for describing ligand-receptor interactions and compares favorably with other state-of-the-art methods.
Resumo:
From a managerial point of view, the more effcient, simple, and parameter-free (ESP) an algorithm is, the more likely it will be used in practice for solving real-life problems. Following this principle, an ESP algorithm for solving the Permutation Flowshop Sequencing Problem (PFSP) is proposed in this article. Using an Iterated Local Search (ILS) framework, the so-called ILS-ESP algorithm is able to compete in performance with other well-known ILS-based approaches, which are considered among the most effcient algorithms for the PFSP. However, while other similar approaches still employ several parameters that can affect their performance if not properly chosen, our algorithm does not require any particular fine-tuning process since it uses basic "common sense" rules for the local search, perturbation, and acceptance criterion stages of the ILS metaheuristic. Our approach defines a new operator for the ILS perturbation process, a new acceptance criterion based on extremely simple and transparent rules, and a biased randomization process of the initial solution to randomly generate different alternative initial solutions of similar quality -which is attained by applying a biased randomization to a classical PFSP heuristic. This diversification of the initial solution aims at avoiding poorly designed starting points and, thus, allows the methodology to take advantage of current trends in parallel and distributed computing. A set of extensive tests, based on literature benchmarks, has been carried out in order to validate our algorithm and compare it against other approaches. These tests show that our parameter-free algorithm is able to compete with state-of-the-art metaheuristics for the PFSP. Also, the experiments show that, when using parallel computing, it is possible to improve the top ILS-based metaheuristic by just incorporating to it our biased randomization process with a high-quality pseudo-random number generator.
Resumo:
Although fetal anatomy can be adequately viewed in new multi-slice MR images, many critical limitations remain for quantitative data analysis. To this end, several research groups have recently developed advanced image processing methods, often denoted by super-resolution (SR) techniques, to reconstruct from a set of clinical low-resolution (LR) images, a high-resolution (HR) motion-free volume. It is usually modeled as an inverse problem where the regularization term plays a central role in the reconstruction quality. Literature has been quite attracted by Total Variation energies because of their ability in edge preserving but only standard explicit steepest gradient techniques have been applied for optimization. In a preliminary work, it has been shown that novel fast convex optimization techniques could be successfully applied to design an efficient Total Variation optimization algorithm for the super-resolution problem. In this work, two major contributions are presented. Firstly, we will briefly review the Bayesian and Variational dual formulations of current state-of-the-art methods dedicated to fetal MRI reconstruction. Secondly, we present an extensive quantitative evaluation of our SR algorithm previously introduced on both simulated fetal and real clinical data (with both normal and pathological subjects). Specifically, we study the robustness of regularization terms in front of residual registration errors and we also present a novel strategy for automatically select the weight of the regularization as regards the data fidelity term. Our results show that our TV implementation is highly robust in front of motion artifacts and that it offers the best trade-off between speed and accuracy for fetal MRI recovery as in comparison with state-of-the art methods.
Resumo:
Ordered gene problems are a very common classification of optimization problems. Because of their popularity countless algorithms have been developed in an attempt to find high quality solutions to the problems. It is also common to see many different types of problems reduced to ordered gene style problems as there are many popular heuristics and metaheuristics for them due to their popularity. Multiple ordered gene problems are studied, namely, the travelling salesman problem, bin packing problem, and graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types or representations. The majority of the algorithms are built around the Recentering- Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, and particularly for the two bioinformatics problems. For DNA Error Correction multiple cases were found with 100% of the codes being corrected. The algorithm variations were also able to beat all other state-of-the-art DNA Fragment Assemblers on 13 out of 16 benchmark problem instances.