40 results for Embarrassingly Parallel


Relevance:

20.00%

Publisher:

Abstract:

The central product of the DRAMA (Dynamic Re-Allocation of Meshes for parallel Finite Element Applications) project is a library comprising a variety of tools for dynamic re-partitioning of unstructured Finite Element (FE) applications. The input to the DRAMA library is the computational mesh, and corresponding costs, partitioned into sub-domains. The core library functions then perform a parallel computation of a mesh re-allocation that will re-balance the costs based on the DRAMA cost model. We discuss the basic features of this cost model, which allows a general approach to load identification, modelling and imbalance minimisation. Results from crash simulations are presented which show the necessity for multi-phase/multi-constraint partitioning components.

Relevance:

20.00%

Publisher:

Abstract:

A comprehensive simulation of solidification/melting processes requires the simultaneous representation of free surface fluid flow, heat transfer, phase change, non-linear solid mechanics and, possibly, electromagnetics together with their interactions in what is now referred to as "multi-physics" simulation. A 3D computational procedure and software tool, PHYSICA, embedding the above multi-physics models using finite volume methods on unstructured meshes (FV-UM) has been developed. Multi-physics simulations are extremely compute intensive and a strategy to parallelise such codes has, therefore, been developed. This strategy has been applied to PHYSICA and evaluated on a range of challenging multi-physics problems drawn from actual industrial cases.

Relevance:

20.00%

Publisher:

Abstract:

We consider the load-balancing problems which arise from parallel scientific codes containing multiple computational phases, or loops over subsets of the data, which are separated by global synchronisation points. We motivate, derive and describe the implementation of an approach which we refer to as the multiphase mesh partitioning strategy to address such issues. The technique is tested on example meshes containing multiple computational phases and it is demonstrated that our method can achieve high quality partitions where a standard mesh partitioning approach fails.
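The point about synchronisation between phases can be illustrated with a toy example. The sketch below is not the multiphase partitioning strategy itself; it only shows, with hypothetical cell costs and two hand-written partitions, why balancing the total cost per processor is not enough when phases are separated by global synchronisation points.

```python
def imbalance(loads):
    """Return max load divided by mean load (1.0 means perfectly balanced)."""
    return max(loads) / (sum(loads) / len(loads))


def phase_loads(partition, cells, n_parts, phase):
    """Sum the cost of one phase over the cells assigned to each part."""
    loads = [0] * n_parts
    for cell, part in enumerate(partition):
        loads[part] += cells[cell][phase]
    return loads


# Hypothetical per-cell costs as (phase 0 cost, phase 1 cost):
# four cells are active only in phase 0, four only in phase 1.
cells = [(1, 0)] * 4 + [(0, 1)] * 4

# Partition A balances only the total cost (4 units per part).
partition_a = [0, 0, 0, 0, 1, 1, 1, 1]
# Partition B balances each phase separately.
partition_b = [0, 0, 1, 1, 0, 0, 1, 1]

for name, partition in (("A", partition_a), ("B", partition_b)):
    per_phase = [phase_loads(partition, cells, 2, p) for p in (0, 1)]
    # With a synchronisation point after each phase, the run time is the
    # sum over phases of the slowest part in that phase.
    run_time = sum(max(loads) for loads in per_phase)
    print(name, per_phase, [imbalance(l) for l in per_phase], run_time)
```

Both partitions give each part the same total cost, yet partition A leaves one part idle in every phase and takes twice as long under this simple cost model.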

Relevance:

20.00%

Publisher:

Abstract:

The problem of deriving parallel mesh partitioning algorithms for mapping unstructured meshes to parallel computers is discussed in this chapter. This in itself raises a paradox: we seek to find a high-quality partition of the mesh, but to compute it in parallel we first require a partition of the mesh. We overcome this difficulty by deriving an optimisation strategy which can find a high-quality partition even if the quality of the initial partition is very poor, and then using a crude distribution scheme for the initial partition. The basis of this strategy is to use a multilevel approach combined with local refinement algorithms. Three such refinement algorithms are outlined and some example results are presented which show that they can produce partitions of very high global quality very rapidly. The results are also compared with a similar multilevel serial partitioner and shown to be almost identical in quality. Finally, we consider the impact of the initial partition on the results and demonstrate that the final partition quality is, modulo a certain amount of noise, independent of the initial partition.

Relevance:

20.00%

Publisher:

Abstract:

In this chapter we discuss the load-balancing issues arising in parallel mesh-based computational mechanics codes for which the processor loading changes during the run. We briefly touch on geometric repartitioning ideas and then focus on different ways of using a graph to solve both the load-balancing problem and the optimisation problem, locally as well as globally. We also briefly discuss whether repartitioning is always valid. Sample illustrative results are presented and we conclude that repartitioning is an attractive option if the load changes are not too dramatic, and that there is a certain trade-off between partition quality and the volume of data that the underlying application needs to migrate.

Relevance:

20.00%

Publisher:

Abstract:

The scheduling problem of minimizing the makespan for m parallel dedicated machines under single resource constraints is considered. For different variants of the problem the complexity status is established. Heuristic algorithms employing the so-called group technology approach are presented and their worst-case behavior is examined. Finally, a polynomial-time approximation scheme is presented for the problem with a fixed number of machines.

Relevance:

20.00%

Publisher:

Abstract:

The paper considers scheduling problems for parallel dedicated machines subject to resource constraints. A fairly complete computational complexity classification is obtained, and a number of polynomial-time algorithms are designed. For the problem with a fixed number of machines in which a job uses at most one resource of unit size, a polynomial-time approximation scheme is offered.

Relevance:

20.00%

Publisher:

Abstract:

We consider a single-machine due date assignment and scheduling problem of minimizing holding costs with no tardy jobs under series-parallel and a somewhat wider class of precedence constraints, as well as the properties of series-parallel graphs.

Relevance:

20.00%

Publisher:

Abstract:

A simulation program has been developed to calculate the power-spectral density of thin avalanche photodiodes (APDs), which are used in optical networks. The program extends the time-domain analysis of the dead-space multiplication model to compute the autocorrelation function of the APD impulse response. However, the computation requires a large amount of memory space and is very time consuming. We describe our experiences in parallelizing the code using both MPI and OpenMP. Several array partitioning schemes and scheduling policies are implemented and tested. Our results show that the OpenMP code is scalable up to 64 processors on an SGI Origin 2000 machine and has small average errors.
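The abstract does not say which partitioning schemes or scheduling policies were tested. As a hedged illustration of why the choice matters, the sketch below compares two common ones, block and cyclic distribution of loop iterations, under a hypothetical triangular cost in which iteration `lag` costs `n - lag` units (a shape often seen in autocorrelation-style loops). The function names and the cost model are assumptions made for this example, not details from the paper.

```python
def block_partition(n_items, n_workers):
    """Split indices into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def cyclic_partition(n_items, n_workers):
    """Deal indices out round-robin: worker w gets w, w + n_workers, ..."""
    return [list(range(w, n_items, n_workers)) for w in range(n_workers)]


def work_per_worker(chunks, cost):
    """Total cost assigned to each worker."""
    return [sum(cost(i) for i in chunk) for chunk in chunks]


N = 8  # hypothetical number of loop iterations


def cost(lag):
    """Hypothetical triangular cost: early iterations are the most expensive."""
    return N - lag


print(work_per_worker(block_partition(N, 2), cost))   # block distribution
print(work_per_worker(cyclic_partition(N, 2), cost))  # cyclic distribution
```

Under this cost model the block distribution gives the first worker far more work than the second, while the cyclic distribution spreads the expensive iterations more evenly. In OpenMP terms these correspond roughly to `schedule(static)` and `schedule(static, 1)`.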

Relevance:

20.00%

Publisher:

Abstract:

An important factor for high-speed optical communication is the availability of ultrafast and low-noise photodetectors. Among the semiconductor photodetectors that are commonly used in today’s long-haul and metro-area fiber-optic systems, avalanche photodiodes (APDs) are often preferred over p-i-n photodiodes due to their internal gain, which significantly improves the receiver sensitivity and alleviates the need for optical pre-amplification. Unfortunately, the random nature of the very process of carrier impact ionization, which generates the gain, is inherently noisy and results in fluctuations not only in the gain but also in the time response. Recently, we developed a theory characterizing the autocorrelation function of APDs which incorporates the dead-space effect, an effect that is very significant in thin, high-performance APDs. The research extends the time-domain analysis of the dead-space multiplication model to compute the autocorrelation function of the APD impulse response. However, the computation requires a large amount of memory space and is very time consuming. In this research, we describe our experiences in parallelizing the code in MPI and OpenMP using CAPTools. Several array partitioning schemes and scheduling policies are implemented and tested. Our results show that the code is scalable up to 64 processors on an SGI Origin 2000 machine and has small average errors.

Relevance:

20.00%

Publisher:

Abstract:

A comprehensive solution of solidification/melting processes requires the simultaneous representation of free surface fluid flow, heat transfer, phase change, nonlinear solid mechanics and, possibly, electromagnetics together with their interactions, in what is now known as multiphysics simulation. Such simulations are computationally intensive and the implementation of solution strategies for multiphysics calculations must embed their effective parallelization. For some years, together with our collaborators, we have been involved in the development of numerical software tools for multiphysics modeling on parallel cluster systems. This research has involved a combination of algorithmic procedures, parallel strategies and tools, plus the design of a computational modeling software environment and its deployment in a range of real world applications. One output from this research is the three-dimensional parallel multiphysics code, PHYSICA. In this paper we report on an assessment of its parallel scalability on a range of increasingly complex models drawn from actual industrial problems, on three contemporary parallel cluster systems.

Relevance:

20.00%

Publisher:

Abstract:

Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs; for example, a problem composed of ~100,000 cells would run 9.3 times faster on a network of twelve 800 MHz PCs than on a single 800 MHz PC. It was also found that a network of eight 3.2 GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2 GHz Pentium 4 PC. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that interference between the parallel processing task and other computer users on the network was minimized.
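The two speedups quoted above can be turned into parallel efficiencies using the standard definition, efficiency = speedup / number of processors. The short sketch below does only that arithmetic; the function name is chosen for this example.

```python
def parallel_efficiency(speedup, n_processors):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / n_processors


# Figures taken from the abstract: (speedup, number of PCs).
reported = {
    "twelve 800 MHz PCs": (9.3, 12),
    "eight 3.2 GHz Pentium 4 PCs": (7.04, 8),
}

for label, (speedup, n) in reported.items():
    print(f"{label}: efficiency {parallel_efficiency(speedup, n):.1%}")
```

This gives roughly 77.5% for the twelve-PC network and 88% for the eight-PC network, which is consistent with efficiency dropping as more machines are added to a LAN.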

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we first demonstrate that the classical Purcell's vector method when combined with row pivoting yields a consistently small growth factor in comparison to the well-known Gauss elimination method, the Gauss–Jordan method and the Gauss–Huard method with partial pivoting. We then present six parallel algorithms of the Purcell method that may be used for direct solution of linear systems. The algorithms differ in ways of pivoting and load balancing. We recommend algorithms V and VI for their reliability and algorithms III and IV for good load balance if local pivoting is acceptable. Some numerical results are presented.
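For readers unfamiliar with the growth factor, the sketch below measures it for ordinary Gaussian elimination with partial pivoting, one of the baseline methods named in the abstract. It does not implement Purcell's method. The growth factor is taken here in the usual sense: the largest absolute entry seen at any elimination stage divided by the largest absolute entry of the original matrix. The test matrix is the classic worst case for partial pivoting.

```python
def growth_factor_gepp(a):
    """Growth factor of Gaussian elimination with partial (row) pivoting."""
    n = len(a)
    m = [[float(x) for x in row] for row in a]  # work on a copy
    initial_max = max(abs(x) for row in m for x in row)
    running_max = initial_max
    for k in range(n - 1):
        # Choose the row with the largest absolute entry in column k.
        pivot_row = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[pivot_row][k] == 0.0:
            continue  # nothing to eliminate in this column
        m[k], m[pivot_row] = m[pivot_row], m[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
        running_max = max(running_max, max(abs(x) for row in m for x in row))
    return running_max / initial_max


# Worst-case matrix for partial pivoting: 1 on the diagonal,
# -1 below it, and 1 in the last column.
worst_case = [
    [1, 0, 0, 1],
    [-1, 1, 0, 1],
    [-1, -1, 1, 1],
    [-1, -1, -1, 1],
]

print(growth_factor_gepp(worst_case))
```

For this 4 x 4 matrix the last column doubles at every stage, so the growth factor is 2^(n-1) = 8. A method with a consistently small growth factor avoids this kind of amplification.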

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an empirical investigation of policy-based self-management techniques for parallel applications executing in loosely-coupled environments. The dynamic and heterogeneous nature of these environments is discussed and the special considerations for parallel applications are identified. An adaptive strategy for the run-time deployment of tasks of parallel applications is presented. The strategy is based on embedding numerous policies which are informed by contextual and environmental inputs. The policies govern various aspects of behaviour, enhancing flexibility so that the goals of efficiency and performance are achieved despite high levels of environmental variability. A prototype self-managing parallel application is used as a vehicle to explore the feasibility and benefits of the strategy. In particular, several aspects of stability are investigated. The implementation and behaviour of three policies are discussed and sample results examined.