929 resultados para dynamic load balancing
Resumo:
Les techniques de groupement technologique sont aujourd’hui utilisées dans de nombreux ateliers de fabrication; elles consistent à décomposer les systèmes industriels en sous-systèmes ou cellules constitués de pièces et de machines. Trouver le groupement technologique le plus efficace est formulé en recherche opérationnelle comme un problème de formation de cellules. La résolution de ce problème permet de tirer plusieurs avantages tels que la réduction des stocks et la simplification de la programmation. Plusieurs critères peuvent être définis au niveau des contraintes du problème tel que le flot intercellulaire,l’équilibrage de charges intracellulaires, les coûts de sous-traitance, les coûts de duplication des machines, etc. Le problème de formation de cellules est un problème d'optimisation NP-difficile. Par conséquent les méthodes exactes ne peuvent être utilisées pour résoudre des problèmes de grande dimension dans un délai raisonnable. Par contre des méthodes heuristiques peuvent générer des solutions de qualité inférieure, mais dans un temps d’exécution raisonnable. Dans ce mémoire, nous considérons ce problème dans un contexte bi-objectif spécifié en termes d’un facteur d’autonomie et de l’équilibre de charge entre les cellules. Nous présentons trois types de méthodes métaheuristiques pour sa résolution et nous comparons numériquement ces métaheuristiques. De plus, pour des problèmes de petite dimension qui peuvent être résolus de façon exacte avec CPLEX, nous vérifions que ces métaheuristiques génèrent des solutions optimales.
Resumo:
Many examples for emergent behaviors may be observed in self-organizing physical and biological systems which prove to be robust, stable, and adaptable. Such behaviors are often based on very simple mechanisms and rules, but artificially creating them is a challenging task which does not comply with traditional software engineering. In this article, we propose a hybrid approach by combining strategies from Genetic Programming and agent software engineering, and demonstrate that this approach effectively yields an emergent design for given problems.
Resumo:
We consider the often-studied problem of sorting, for a parallel computer. Given an input array distributed evenly over p processors, the task is to compute the sorted output array, also distributed over the p processors. Many existing algorithms take the approach of approximately load-balancing the output, leaving each processor with Θ(n/p) elements. However, in many cases, approximate load-balancing leads to inefficiencies in both the sorting itself and in further uses of the data after sorting. We provide a deterministic parallel sorting algorithm that uses parallel selection to produce any output distribution exactly, particularly one that is perfectly load-balanced. Furthermore, when using a comparison sort, this algorithm is 1-optimal in both computation and communication. We provide an empirical study that illustrates the efficiency of exact data splitting, and shows an improvement over two sample sort algorithms.
Resumo:
In this paper, a method for enhancing current QoS routing methods by means of QoS protection is presented. In an MPLS network, the segments (links) to be protected are predefined and an LSP request involves, apart from establishing a working path, creating a specific type of backup path (local, reverse or global). Different QoS parameters, such as network load balancing, resource optimization and minimization of LSP request rejection should be considered. QoS protection is defined as a function of QoS parameters, such as packet loss, restoration time, and resource optimization. A framework to add QoS protection to many of the current QoS routing algorithms is introduced. A backup decision module to select the most suitable protection method is formulated and different case studies are analyzed
Resumo:
The shallow water equations are solved using a mesh of polygons on the sphere, which adapts infrequently to the predicted future solution. Infrequent mesh adaptation reduces the cost of adaptation and load-balancing and will thus allow for more accurate mapping on adaptation. We simulate the growth of a barotropically unstable jet adapting the mesh every 12 h. Using an adaptation criterion based largely on the gradient of the vorticity leads to a mesh with around 20 per cent of the cells of a uniform mesh that gives equivalent results. This is a similar proportion to previous studies of the same test case with mesh adaptation every 1–20 min. The prediction of the mesh density involves solving the shallow water equations on a coarse mesh in advance of the locally refined mesh in order to estimate where features requiring higher resolution will grow, decay or move to. The adaptation criterion consists of two parts: that resolved on the coarse mesh, and that which is not resolved and so is passively advected on the coarse mesh. This combination leads to a balance between resolving features controlled by the large-scale dynamics and maintaining fine-scale features.
Resumo:
Clustering is defined as the grouping of similar items in a set, and is an important process within the field of data mining. As the amount of data for various applications continues to increase, in terms of its size and dimensionality, it is necessary to have efficient clustering methods. A popular clustering algorithm is K-Means, which adopts a greedy approach to produce a set of K-clusters with associated centres of mass, and uses a squared error distortion measure to determine convergence. Methods for improving the efficiency of K-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting a more efficient data structure, notably a multi-dimensional binary search tree (KD-Tree) to store either centroids or data points. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient K-Means techniques in parallel computational environments. In this work, we provide a parallel formulation for the KD-Tree based K-Means algorithm and address its load balancing issues.
Resumo:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA., Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance or the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Resumo:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Resumo:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Resumo:
Processor virtualization for process migration in distributed parallel computing systems has formed a significant component of research on load balancing. In contrast, the potential of processor virtualization for fault tolerance has been addressed minimally. The work reported in this paper is motivated towards extending concepts of processor virtualization towards ‘intelligent cores’ as a means to achieve fault tolerance in distributed parallel computing systems. Intelligent cores are an abstraction of the hardware processing cores, with the incorporation of cognitive capabilities, on which parallel tasks can be executed and migrated. When a processing core executing a task is predicted to fail the task being executed is proactively transferred onto another core. A parallel reduction algorithm incorporating concepts of intelligent cores is implemented on a computer cluster using Adaptive MPI and Charm ++. Preliminary results confirm the feasibility of the approach.
Resumo:
Global communicationrequirements andloadimbalanceof someparalleldataminingalgorithms arethe major obstacles to exploitthe computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication costin parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operationwhichhinders thescalabilityoftheapproach.Thisworkstudiesadifferentparallelformulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature ofthe centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means ofmulti-dimensional binary searchtrees. The approachcanalso be extended to accommodate an approximation error which allows a further reduction ofthe communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element
Resumo:
An equation of Monge-Ampère type has, for the first time, been solved numerically on the surface of the sphere in order to generate optimally transported (OT) meshes, equidistributed with respect to a monitor function. Optimal transport generates meshes that keep the same connectivity as the original mesh, making them suitable for r-adaptive simulations, in which the equations of motion can be solved in a moving frame of reference in order to avoid mapping the solution between old and new meshes and to avoid load balancing problems on parallel computers. The semi-implicit solution of the Monge-Ampère type equation involves a new linearisation of the Hessian term, and exponential maps are used to map from old to new meshes on the sphere. The determinant of the Hessian is evaluated as the change in volume between old and new mesh cells, rather than using numerical approximations to the gradients. OT meshes are generated to compare with centroidal Voronoi tesselations on the sphere and are found to have advantages and disadvantages; OT equidistribution is more accurate, the number of iterations to convergence is independent of the mesh size, face skewness is reduced and the connectivity does not change. However anisotropy is higher and the OT meshes are non-orthogonal. It is shown that optimal transport on the sphere leads to meshes that do not tangle. However, tangling can be introduced by numerical errors in calculating the gradient of the mesh potential. Methods for alleviating this problem are explored. Finally, OT meshes are generated using observed precipitation as a monitor function, in order to demonstrate the potential power of the technique.
Resumo:
In 2006 the Route load balancing algorithm was proposed and compared to other techniques aiming at optimizing the process allocation in grid environments. This algorithm schedules tasks of parallel applications considering computer neighborhoods (where the distance is defined by the network latency). Route presents good results for large environments, although there are cases where neighbors do not have an enough computational capacity nor communication system capable of serving the application. In those situations the Route migrates tasks until they stabilize in a grid area with enough resources. This migration may take long time what reduces the overall performance. In order to improve such stabilization time, this paper proposes RouteGA (Route with Genetic Algorithm support) which considers historical information on parallel application behavior and also the computer capacities and load to optimize the scheduling. This information is extracted by using monitors and summarized in a knowledge base used to quantify the occupation of tasks. Afterwards, such information is used to parameterize a genetic algorithm responsible for optimizing the task allocation. Results confirm that RouteGA outperforms the load balancing carried out by the original Route, which had previously outperformed others scheduling algorithms from literature.
Resumo:
This work shows the design, simulation, and analysis of two optical interconnection networks for a Dataflow parallel computer architecture. To verify the optical interconnection network performance on the Dataflow architecture, we have analyzed the load balancing among the processors during the parallel programs executions. The load balancing is a very important parameter because it is directly associated to the dataflow parallelism degree. This article proves that optical interconnection networks designed with simple optical devices can provide efficiently the dataflow requirements of a high performance communication system.