959 resultados para efficient algorithms
Resumo:
We focus on large-scale and dense deeply embedded systems where, due to the large amount of information generated by all nodes, even simple aggregate computations such as the minimum value (MIN) of the sensor readings become notoriously expensive to obtain. Recent research has exploited a dominance-based medium access control(MAC) protocol, the CAN bus, for computing aggregated quantities in wired systems. For example, MIN can be computed efficiently and an interpolation function which approximates sensor data in an area can be obtained efficiently as well. Dominance-based MAC protocols have recently been proposed for wireless channels and these protocols can be expected to be used for achieving highly scalable aggregate computations in wireless systems. But no experimental demonstration is currently available in the research literature. In this paper, we demonstrate that highly scalable aggregate computations in wireless networks are possible. We do so by (i) building a new wireless hardware platform with appropriate characteristics for making dominance-based MAC protocols efficient, (ii) implementing dominance-based MAC protocols on this platform, (iii) implementing distributed algorithms for aggregate computations (MIN, MAX, Interpolation) using the new implementation of the dominance-based MAC protocol and (iv) performing experiments to prove that such highly scalable aggregate computations in wireless networks are possible.
Resumo:
This paper proposes an efficient scalable Residue Number System (RNS) architecture supporting moduli sets with an arbitrary number of channels, allowing to achieve larger dynamic range and a higher level of parallelism. The proposed architecture allows the forward and reverse RNS conversion, by reusing the arithmetic channel units. The arithmetic operations supported at the channel level include addition, subtraction, and multiplication with accumulation capability. For the reverse conversion two algorithms are considered, one based on the Chinese Remainder Theorem and the other one on Mixed-Radix-Conversion, leading to implementations optimized for delay and required circuit area. With the proposed architecture a complete and compact RNS platform is achieved. Experimental results suggest gains of 17 % in the delay in the arithmetic operations, with an area reduction of 23 % regarding the RNS state of the art. When compared with a binary system the proposed architecture allows to perform the same computation 20 times faster alongside with only 10 % of the circuit area resources.
Resumo:
A MATLAB/SIMULINK-based simulator was employed for studies concerning the control of baker’s yeast fed-batch fermentation. Four control algorithms were implemented and compared: the classical PID control, two discrete versions- modified velocity and position algorithms, and a fuzzy law. The simulation package was seen to be an efficient tool for the simulation and tests of control strategies of the nonlinear process.
Resumo:
The trajectory planning of redundant robots is an important area of research and efficient optimization algorithms are needed. This paper presents a new technique that combines the closed-loop pseudoinverse method with genetic algorithms. The results are compared with a genetic algorithm that adopts the direct kinematics. In both cases the trajectory planning is formulated as an optimization problem with constraints.
Resumo:
Empowered by virtualisation technology, cloud infrastructures enable the construction of flexi- ble and elastic computing environments, providing an opportunity for energy and resource cost optimisation while enhancing system availability and achieving high performance. A crucial re- quirement for effective consolidation is the ability to efficiently utilise system resources for high- availability computing and energy-efficiency optimisation to reduce operational costs and carbon footprints in the environment. Additionally, failures in highly networked computing systems can negatively impact system performance substantially, prohibiting the system from achieving its initial objectives. In this paper, we propose algorithms to dynamically construct and readjust vir- tual clusters to enable the execution of users’ jobs. Allied with an energy optimising mechanism to detect and mitigate energy inefficiencies, our decision-making algorithms leverage virtuali- sation tools to provide proactive fault-tolerance and energy-efficiency to virtual clusters. We conducted simulations by injecting random synthetic jobs and jobs using the latest version of the Google cloud tracelogs. The results indicate that our strategy improves the work per Joule ratio by approximately 12.9% and the working efficiency by almost 15.9% compared with other state-of-the-art algorithms.
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Informática
Resumo:
Recent integrated circuit technologies have opened the possibility to design parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge with many architectural options. Exploring the design space gets even more difficult if, beyond performance and area, we also consider extra metrics like performance and area efficiency, where the designer tries to design the architecture with the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to design a many-core architecture. Instead of doing the design space exploration of the many core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to make a formal analysis of the algorithms considering the main architectural aspects and to determine how each particular architectural aspect is related to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory and the memory hierarchy. To exemplify the approach we did a theoretical analysis of a dense matrix multiplication algorithm and determined an equation that relates the number of execution cycles with the architectural parameters. Based on this equation a many-core architecture has been designed. The results obtained indicate that a 100 mm(2) integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double precision floating-point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71 %. Considering a 45 nm technology, a 100 mm(2) chip attains 833 GFLOPs which corresponds to 84 % of peak performance These figures are better than those obtained by previous many-core architectures, except for the area efficiency which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-cores architectures designed specifically to achieve high performance for matrix multiplication.
Resumo:
The theme of this dissertation is the finite element method applied to mechanical structures. A new finite element program is developed that, besides executing different types of structural analysis, also allows the calculation of the derivatives of structural performances using the continuum method of design sensitivities analysis, with the purpose of allowing, in combination with the mathematical programming algorithms found in the commercial software MATLAB, to solve structural optimization problems. The program is called EFFECT – Efficient Finite Element Code. The object-oriented programming paradigm and specifically the C ++ programming language are used for program development. The main objective of this dissertation is to design EFFECT so that it can constitute, in this stage of development, the foundation for a program with analysis capacities similar to other open source finite element programs. In this first stage, 6 elements are implemented for linear analysis: 2-dimensional truss (Truss2D), 3-dimensional truss (Truss3D), 2-dimensional beam (Beam2D), 3-dimensional beam (Beam3D), triangular shell element (Shell3Node) and quadrilateral shell element (Shell4Node). The shell elements combine two distinct elements, one for simulating the membrane behavior and the other to simulate the plate bending behavior. The non-linear analysis capability is also developed, combining the corotational formulation with the Newton-Raphson iterative method, but at this stage is only avaiable to solve problems modeled with Beam2D elements subject to large displacements and rotations, called nonlinear geometric problems. The design sensitivity analysis capability is implemented in two elements, Truss2D and Beam2D, where are included the procedures and the analytic expressions for calculating derivatives of displacements, stress and volume performances with respect to 5 different design variables types. Finally, a set of test examples were created to validate the accuracy and consistency of the result obtained from EFFECT, by comparing them with results published in the literature or obtained with the ANSYS commercial finite element code.
Resumo:
PhD thesis in Bioengineering
Resumo:
The algorithmic approach to data modelling has developed rapidly these last years, in particular methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise in situations at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge on the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
Resumo:
Forest fires are a serious threat to humans and nature from an ecological, social and economic point of view. Predicting their behaviour by simulation still delivers unreliable results and remains a challenging task. Latest approaches try to calibrate input variables, often tainted with imprecision, using optimisation techniques like Genetic Algorithms. To converge faster towards fitter solutions, the GA is guided with knowledge obtained from historical or synthetical fires. We developed a robust and efficient knowledge storage and retrieval method. Nearest neighbour search is applied to find the fire configuration from knowledge base most similar to the current configuration. Therefore, a distance measure was elaborated and implemented in several ways. Experiments show the performance of the different implementations regarding occupied storage and retrieval time with overly satisfactory results.
Resumo:
Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership and then the user is asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing the good architecture are provided for new and/or unexperienced user.
Resumo:
This paper discusses the use of probabilistic or randomized algorithms for solving combinatorial optimization problems. Our approach employs non-uniform probability distributions to add a biased random behavior to classical heuristics so a large set of alternative good solutions can be quickly obtained in a natural way and without complex conguration processes. This procedure is especially useful in problems where properties such as non-smoothness or non-convexity lead to a highly irregular solution space, for which the traditional optimization methods, both of exact and approximate nature, may fail to reach their full potential. The results obtained are promising enough to suggest that randomizing classical heuristics is a powerful method that can be successfully applied in a variety of cases.
Resumo:
The UHPLC strategy which combines sub-2 microm porous particles and ultra-high pressure (>1000 bar) was investigated considering very high resolution criteria in both isocratic and gradient modes, with mobile phase temperatures between 30 and 90 degrees C. In isocratic mode, experimental conditions to reach the maximal efficiency were determined using the kinetic plot representation for DeltaP(max)=1000 bar. It has been first confirmed that the molecular weight of the compounds (MW) was a critical parameter which should be considered in the construction of such curves. With a MW around 1000 g mol(-1), efficiencies as high as 300,000 plates could be theoretically attained using UHPLC at 30 degrees C. By limiting the column length to 450 mm, the maximal plate count was around 100,000. In gradient mode, the longest column does not provide the maximal peak capacity for a given analysis time in UHPLC. This was attributed to the fact that peak capacity is not only related to the plate number but also to column dead time. Therefore, a compromise should be found and a 150 mm column should be preferentially selected for gradient lengths up to 60 min at 30 degrees C, while the columns coupled in series (3x 150 mm) were attractive only for t(grad)>250 min. Compared to 30 degrees C, peak capacities were increased by about 20-30% for a constant gradient length at 90 degrees C and gradient time decreased by 2-fold for an identical peak capacity.
Resumo:
An implicitly parallel method for integral-block driven restricted active space self-consistent field (RASSCF) algorithms is presented. The approach is based on a model space representation of the RAS active orbitals with an efficient expansion of the model subspaces. The applicability of the method is demonstrated with a RASSCF investigation of the first two excited states of indole