905 resultados para Parallel algorithm


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present an algorithm for the computation of reducible invariant tori of discrete dynamical systems that is suitable for tori of dimensions larger than 1. It is based on a quadratically convergent scheme that approximates, at the same time, the Fourier series of the torus, its Floquet transformation, and its Floquet matrix. The Floquet matrix describes the linearization of the dynamics around the torus and, hence, its linear stability. The algorithm presents a high degree of parallelism, and the computational effort grows linearly with the number of Fourier modes needed to represent the solution. For these reasons it is a very good option to compute quasi-periodic solutions with several basic frequencies. The paper includes some examples (flows) to show the efficiency of the method in a parallel computer. In these flows we compute invariant tori of dimensions up to 5, by taking suitable sections.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The maximum realizable power throughput of power electronic converters may be limited or constrained by technical or economical considerations. One solution to this problemis to connect several power converter units in parallel. The parallel connection can be used to increase the current carrying capacity of the overall system beyond the ratings of individual power converter units. Thus, it is possible to use several lower-power converter units, produced in large quantities, as building blocks to construct high-power converters in a modular manner. High-power converters realized by using parallel connection are needed for example in multimegawatt wind power generation systems. Parallel connection of power converter units is also required in emerging applications such as photovoltaic and fuel cell power conversion. The parallel operation of power converter units is not, however, problem free. This is because parallel-operating units are subject to overcurrent stresses, which are caused by unequal load current sharing or currents that flow between the units. Commonly, the term ’circulatingcurrent’ is used to describe both the unequal load current sharing and the currents flowing between the units. Circulating currents, again, are caused by component tolerances and asynchronous operation of the parallel units. Parallel-operating units are also subject to stresses caused by unequal thermal stress distribution. Both of these problemscan, nevertheless, be handled with a proper circulating current control. To design an effective circulating current control system, we need information about circulating current dynamics. The dynamics of the circulating currents can be investigated by developing appropriate mathematical models. In this dissertation, circulating current models aredeveloped for two different types of parallel two-level three-phase inverter configurations. Themodels, which are developed for an arbitrary number of parallel units, provide a framework for analyzing circulating current generation mechanisms and developing circulating current control systems. In addition to developing circulating current models, modulation of parallel inverters is considered. It is illustrated that depending on the parallel inverter configuration and the modulation method applied, common-mode circulating currents may be excited as a consequence of the differential-mode circulating current control. To prevent the common-mode circulating currents that are caused by the modulation, a dual modulator method is introduced. The dual modulator basically consists of two independently operating modulators, the outputs of which eventually constitute the switching commands of the inverter. The two independently operating modulators are referred to as primary and secondary modulators. In its intended usage, the same voltage vector is fed to the primary modulators of each parallel unit, and the inputs of the secondary modulators are obtained from the circulating current controllers. To ensure that voltage commands obtained from the circulating current controllers are realizable, it must be guaranteed that the inverter is not driven into saturation by the primary modulator. The inverter saturation can be prevented by limiting the inputs of the primary and secondary modulators. Because of this, also a limitation algorithm is proposed. The operation of both the proposed dual modulator and the limitation algorithm is verified experimentally.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Diplomityön tarkoituksena on optimoida asiakkaiden sähkölaskun laskeminen hajautetun laskennan avulla. Älykkäiden etäluettavien energiamittareiden tullessa jokaiseen kotitalouteen, energiayhtiöt velvoitetaan laskemaan asiakkaiden sähkölaskut tuntiperusteiseen mittaustietoon perustuen. Kasvava tiedonmäärä lisää myös tarvittavien laskutehtävien määrää. Työssä arvioidaan vaihtoehtoja hajautetun laskennan toteuttamiseksi ja luodaan tarkempi katsaus pilvilaskennan mahdollisuuksiin. Lisäksi ajettiin simulaatioita, joiden avulla arvioitiin rinnakkaislaskennan ja peräkkäislaskennan eroja. Sähkölaskujen oikeinlaskemisen tueksi kehitettiin mittauspuu-algoritmi.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To obtain the desirable accuracy of a robot, there are two techniques available. The first option would be to make the robot match the nominal mathematic model. In other words, the manufacturing and assembling tolerances of every part would be extremely tight so that all of the various parameters would match the “design” or “nominal” values as closely as possible. This method can satisfy most of the accuracy requirements, but the cost would increase dramatically as the accuracy requirement increases. Alternatively, a more cost-effective solution is to build a manipulator with relaxed manufacturing and assembling tolerances. By modifying the mathematical model in the controller, the actual errors of the robot can be compensated. This is the essence of robot calibration. Simply put, robot calibration is the process of defining an appropriate error model and then identifying the various parameter errors that make the error model match the robot as closely as possible. This work focuses on kinematic calibration of a 10 degree-of-freedom (DOF) redundant serial-parallel hybrid robot. The robot consists of a 4-DOF serial mechanism and a 6-DOF hexapod parallel manipulator. The redundant 4-DOF serial structure is used to enlarge workspace and the 6-DOF hexapod manipulator is used to provide high load capabilities and stiffness for the whole structure. The main objective of the study is to develop a suitable calibration method to improve the accuracy of the redundant serial-parallel hybrid robot. To this end, a Denavit–Hartenberg (DH) hybrid error model and a Product-of-Exponential (POE) error model are developed for error modeling of the proposed robot. Furthermore, two kinds of global optimization methods, i.e. the differential-evolution (DE) algorithm and the Markov Chain Monte Carlo (MCMC) algorithm, are employed to identify the parameter errors of the derived error model. A measurement method based on a 3-2-1 wire-based pose estimation system is proposed and implemented in a Solidworks environment to simulate the real experimental validations. Numerical simulations and Solidworks prototype-model validations are carried out on the hybrid robot to verify the effectiveness, accuracy and robustness of the calibration algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The dissertation proposes two control strategies, which include the trajectory planning and vibration suppression, for a kinematic redundant serial-parallel robot machine, with the aim of attaining the satisfactory machining performance. For a given prescribed trajectory of the robot's end-effector in the Cartesian space, a set of trajectories in the robot's joint space are generated based on the best stiffness performance of the robot along the prescribed trajectory. To construct the required system-wide analytical stiffness model for the serial-parallel robot machine, a variant of the virtual joint method (VJM) is proposed in the dissertation. The modified method is an evolution of Gosselin's lumped model that can account for the deformations of a flexible link in more directions. The effectiveness of this VJM variant is validated by comparing the computed stiffness results of a flexible link with the those of a matrix structural analysis (MSA) method. The comparison shows that the numerical results from both methods on an individual flexible beam are almost identical, which, in some sense, provides mutual validation. The most prominent advantage of the presented VJM variant compared with the MSA method is that it can be applied in a flexible structure system with complicated kinematics formed in terms of flexible serial links and joints. Moreover, by combining the VJM variant and the virtual work principle, a systemwide analytical stiffness model can be easily obtained for mechanisms with both serial kinematics and parallel kinematics. In the dissertation, a system-wide stiffness model of a kinematic redundant serial-parallel robot machine is constructed based on integration of the VJM variant and the virtual work principle. Numerical results of its stiffness performance are reported. For a kinematic redundant robot, to generate a set of feasible joints' trajectories for a prescribed trajectory of its end-effector, its system-wide stiffness performance is taken as the constraint in the joints trajectory planning in the dissertation. For a prescribed location of the end-effector, the robot permits an infinite number of inverse solutions, which consequently yields infinite kinds of stiffness performance. Therefore, a differential evolution (DE) algorithm in which the positions of redundant joints in the kinematics are taken as input variables was employed to search for the best stiffness performance of the robot. Numerical results of the generated joint trajectories are given for a kinematic redundant serial-parallel robot machine, IWR (Intersector Welding/Cutting Robot), when a particular trajectory of its end-effector has been prescribed. The numerical results show that the joint trajectories generated based on the stiffness optimization are feasible for realization in the control system since they are acceptably smooth. The results imply that the stiffness performance of the robot machine deviates smoothly with respect to the kinematic configuration in the adjacent domain of its best stiffness performance. To suppress the vibration of the robot machine due to varying cutting force during the machining process, this dissertation proposed a feedforward control strategy, which is constructed based on the derived inverse dynamics model of target system. The effectiveness of applying such a feedforward control in the vibration suppression has been validated in a parallel manipulator in the software environment. The experimental study of such a feedforward control has also been included in the dissertation. The difficulties of modelling the actual system due to the unknown components in its dynamics is noticed. As a solution, a back propagation (BP) neural network is proposed for identification of the unknown components of the dynamics model of the target system. To train such a BP neural network, a modified Levenberg-Marquardt algorithm that can utilize an experimental input-output data set of the entire dynamic system is introduced in the dissertation. Validation of the BP neural network and the modified Levenberg- Marquardt algorithm is done, respectively, by a sinusoidal output approximation, a second order system parameters estimation, and a friction model estimation of a parallel manipulator, which represent three different application aspects of this method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. A novel design for single digit decimal multiplication that reduces the critical path delay and area for an iterative multiplier is proposed in this research. The partial products are generated using single digit multipliers, and are accumulated based on a novel RPS algorithm. This design uses n single digit multipliers for an n × n multiplication. The latency for the multiplication of two n-digit Binary Coded Decimal (BCD) operands is (n + 1) cycles and a new multiplication can begin every n cycle. The accumulation of final partial products and the first iteration of partial product generation for next set of inputs are done simultaneously. This iterative decimal multiplier offers low latency and high throughput, and can be extended for decimal floating-point multiplication.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We develop an algorithm that computes the gravitational potentials and forces on N point-masses interacting in three-dimensional space. The algorithm, based on analytical techniques developed by Rokhlin and Greengard, runs in order N time. In contrast to other fast N-body methods such as tree codes, which only approximate the interaction potentials and forces, this method is exact ?? computes the potentials and forces to within any prespecified tolerance up to machine precision. We present an implementation of the algorithm for a sequential machine. We numerically verify the algorithm, and compare its speed with that of an O(N2) direct force computation. We also describe a parallel version of the algorithm that runs on the Connection Machine in order 0(logN) time. We compare experimental results with those of the sequential implementation and discuss how to minimize communication overhead on the parallel machine.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan’s functional inverse of Ackermann’s function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T₁ time on a single processor can be checked on the fly for determinacy races in O(T₁) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks. By combining SP-order with Feng and Leiserson’s serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T₁ work, and a critical-path length of T[subscript ∞]. When executed on P processors, we prove that SP-hybrid runs in O((T₁/P + PT[subscript ∞]) lg n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P = O(T₁/T[subscript ∞]). In contrast, SP-hybrid obtains linear speed-up when P = O(√T₁/T[subscript ∞]), but the work is increased by a factor of O(lg n).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In computer graphics, global illumination algorithms take into account not only the light that comes directly from the sources, but also the light interreflections. This kind of algorithms produce very realistic images, but at a high computational cost, especially when dealing with complex environments. Parallel computation has been successfully applied to such algorithms in order to make it possible to compute highly-realistic images in a reasonable time. We introduce here a speculation-based parallel solution for a global illumination algorithm in the context of radiosity, in which we have taken advantage of the hierarchical nature of such an algorithm

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Clustering is defined as the grouping of similar items in a set, and is an important process within the field of data mining. As the amount of data for various applications continues to increase, in terms of its size and dimensionality, it is necessary to have efficient clustering methods. A popular clustering algorithm is K-Means, which adopts a greedy approach to produce a set of K-clusters with associated centres of mass, and uses a squared error distortion measure to determine convergence. Methods for improving the efficiency of K-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting a more efficient data structure, notably a multi-dimensional binary search tree (KD-Tree) to store either centroids or data points. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient K-Means techniques in parallel computational environments. In this work, we provide a parallel formulation for the KD-Tree based K-Means algorithm and address its load balancing issues.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents the results of the application of a parallel Genetic Algorithm (GA) in order to design a Fuzzy Proportional Integral (FPI) controller for active queue management on Internet routers. The Active Queue Management (AQM) policies are those policies of router queue management that allow the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. Two different parallel implementations of the genetic algorithm are adopted to determine an optimal configuration of the FPI controller parameters. Finally, the results of several experiments carried out on a forty nodes cluster of workstations are presented.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper presents a design for a hardware genetic algorithm which uses a pipeline of systolic arrays. These arrays have been designed using systolic synthesis techniques which involve expressing the algorithm as a set of uniform recurrence relations. The final design divorces the fitness function evaluation from the hardware and can process chromosomes of different lengths, giving the design a generic quality. The paper demonstrates the design methodology by progressively re-writing a simple genetic algorithm, expressed in C code, into a form from which systolic structures can be deduced. This paper extends previous work by introducing a simplification to a previous systolic design for the genetic algorithm. The simplification results in the removal of 2N 2 + 4N cells and reduces the time complexity by 3N + 1 cycles.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a parallel Linear Hashtable Motion Estimation Algorithm (LHMEA). Most parallel video compression algorithms focus on Group of Picture (GOP). Based on LHMEA we proposed earlier [1][2], we developed a parallel motion estimation algorithm focus inside of frame. We divide each reference frames into equally sized regions. These regions are going to be processed in parallel to increase the encoding speed significantly. The theory and practice speed up of parallel LHMEA according to the number of PCs in the cluster are compared and discussed. Motion Vectors (MV) are generated from the first-pass LHMEA and used as predictors for second-pass Hexagonal Search (HEXBS) motion estimation, which only searches a small number of Macroblocks (MBs). We evaluated distributed parallel implementation of LHMEA of TPA for real time video compression.