933 resultados para Parallel computing


Relevância:

30.00% 30.00%

Publicador:

Resumo:

As computational Grids are increasingly used for executing long running multi-phase parallel applications, it is important to develop efficient rescheduling frameworks that adapt application execution in response to resource and application dynamics. In this paper, three strategies or algorithms have been developed for deciding when and where to reschedule parallel applications that execute on multi-cluster Grids. The algorithms derive rescheduling plans that consist of potential points in application execution for rescheduling and schedules of resources for application execution between two consecutive rescheduling points. Using large number of simulations, it is shown that the rescheduling plans developed by the algorithms can lead to large decrease in application execution times when compared to executions without rescheduling on dynamic Grid resources. The rescheduling plans generated by the algorithms are also shown to be competitive when compared to the near-optimal plans generated by brute-force methods. Of the algorithms, genetic algorithm yielded the most efficient rescheduling plans with 9-12% smaller average execution times than the other algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present an algebraic method to study and design spatial parallel manipulators that demonstrate isotropy in the force and moment distributions.We use the force and moment transformation matrices separately,and derive conditions for their isotropy individually as well as in combination. The isotropy conditions are derived in closed-form in terms of the invariants of the quadratic forms associated with these matrices. The formulation has been applied to a class of Stewart platform manipulators. We obtain multi-parameter families of isotropic manipulator analytically. In addition to computing the isotropic configurations of an existing manipulator,we demonstrate a procedure for designing the manipulator for isotropy at a given configuration.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Morse-Smale complex is a useful topological data structure for the analysis and visualization of scalar data. This paper describes an algorithm that processes all mesh elements of the domain in parallel to compute the Morse-Smale complex of large two-dimensional data sets at interactive speeds. We employ a reformulation of the Morse-Smale complex using Forman's Discrete Morse Theory and achieve scalability by computing the discrete gradient using local accesses only. We also introduce a novel approach to merge gradient paths that ensures accurate geometry of the computed complex. We demonstrate that our algorithm performs well on both multicore environments and on massively parallel architectures such as the GPU.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information forms the basis of modern technology. To meet the ever-increasing demand for information, means have to be devised for a more efficient and better-equipped technology to intelligibly process data. Advances in photonics have made their impact on each of the four key applications in information processing, i.e., acquisition, transmission, storage and processing of information. The inherent advantages of ultrahigh bandwidth, high speed and low-loss transmission has already established fiber-optics as the backbone of communication technology. However, the optics to electronics inter-conversion at the transmitter and receiver ends severely limits both the speed and bit rate of lightwave communication systems. As the trend towards still faster and higher capacity systems continues, it has become increasingly necessary to perform more and more signal-processing operations in the optical domain itself, i.e., with all-optical components and devices that possess a high bandwidth and can perform parallel processing functions to eliminate the electronic bottleneck.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Morse-Smale complex is a topological structure that captures the behavior of the gradient of a scalar function on a manifold. This paper discusses scalable techniques to compute the Morse-Smale complex of scalar functions defined on large three-dimensional structured grids. Computing the Morse-Smale complex of three-dimensional domains is challenging as compared to two-dimensional domains because of the non-trivial structure introduced by the two types of saddle criticalities. We present a parallel shared-memory algorithm to compute the Morse-Smale complex based on Forman's discrete Morse theory. The algorithm achieves scalability via synergistic use of the CPU and the GPU. We first prove that the discrete gradient on the domain can be computed independently for each cell and hence can be implemented on the GPU. Second, we describe a two-step graph traversal algorithm to compute the 1-saddle-2-saddle connections efficiently and in parallel on the CPU. Simultaneously, the extremasaddle connections are computed using a tree traversal algorithm on the GPU.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The LURR theory is a new approach for earthquake prediction, which achieves good results in earthquake prediction within the China mainland and regions in America, Japan and Australia. However, the expansion of the prediction region leads to the refinement of its longitude and latitude, and the increase of the time period. This requires increasingly more computations, and the volume of data reaches the order of GB, which will be very difficult for a single CPU. In this paper, a new method was introduced to solve this problem. Adopting the technology of domain decomposition and parallelizing using MPI, we developed a new parallel tempo-spatial scanning program.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis describes the design and implementation of an integrated circuit and associated packaging to be used as the building block for the data routing network of a large scale shared memory multiprocessor system. A general purpose multiprocessor depends on high-bandwidth, low-latency communications between computing elements. This thesis describes the design and construction of RN1, a novel self-routing, enhanced crossbar switch as a CMOS VLSI chip. This chip provides the basic building block for a scalable pipelined routing network with byte-wide data channels. A series of RN1 chips can be cascaded with no additional internal network components to form a multistage fault-tolerant routing switch. The chip is designed to operate at clock frequencies up to 100Mhz using Hewlett-Packard's HP34 $1.2\\mu$ process. This aggressive performance goal demands that special attention be paid to optimization of the logic architecture and circuit design.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A parallel method for the dynamic partitioning of unstructured meshes is described. The method introduces a new iterative optimization technique known as relative gain optimization which both balances the workload and attempts to minimize the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The demands of the process of engineering design, particularly for structural integrity, have exploited computational modelling techniques and software tools for decades. Frequently, the shape of structural components or assemblies is determined to optimise the flow distribution or heat transfer characteristics, and to ensure that the structural performance in service is adequate. From the perspective of computational modelling these activities are typically separated into: • fluid flow and the associated heat transfer analysis (possibly with chemical reactions), based upon Computational Fluid Dynamics (CFD) technology • structural analysis again possibly with heat transfer, based upon finite element analysis (FEA) techniques.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of deriving parallel mesh partitioning algorithms for mapping unstructured meshes to parallel computers is discussed in this chapter. In itself this raises a paradox - we seek to find a high quality partition of the mesh, but to compute it in parallel we require a partition of the mesh. In fact, we overcome this difficulty by deriving an optimisation strategy which can find a high quality partition even if the quality of the initial partition is very poor and then use a crude distribution scheme for the initial partition. The basis of this strategy is to use a multilevel approach combined with local refinement algorithms. Three such refinement algorithms are outlined and some example results presented which show that they can produce very high global quality partitions, very rapidly. The results are also compared with a similar multilevel serial partitioner and shown to be almost identical in quality. Finally we consider the impact of the initial partition on the results and demonstrate that the final partition quality is, modulo a certain amount of noise, independent of the initial partition.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this Chapter we discuss the load-balancing issues arising in parallel mesh based computational mechanics codes for which the processor loading changes during the run. We briefly touch on geometric repartitioning ideas and then focus on different ways of using a graph both to solve the load-balancing problem and the optimisation problem, both locally and globally. We also briefly discuss whether repartitioning is always valid. Sample illustrative results are presented and we conclude that repartitioning is an attractive option if the load changes are not too dramatic and that there is a certain trade-off between partition quality and volume of data that the underlying application needs to migrate.