12 resultados para Distributed algorithm
em Greenwich Academic Literature Archive - UK
Resumo:
Temperature distributions involved in some metal-cutting or surface-milling processes may be obtained by solving a non-linear inverse problem. A two-level concept on parallelism is introduced to compute such temperature distribution. The primary level is based on a problem-partitioning concept driven by the nature and properties of the non-linear inverse problem. Such partitioning results to a coarse-grained parallel algorithm. A simplified 2-D metal-cutting process is used as an example to illustrate the concept. A secondary level exploitation of further parallel properties based on the concept of domain-data parallelism is explained and implemented using MPI. Some experiments were performed on a network of loosely coupled machines consist of SUN Sparc Classic workstations and a network of tightly coupled processors, namely the Origin 2000.
Resumo:
Numerical solutions of realistic 2-D and 3-D inverse problems may require a very large amount of computation. A two-level concept on parallelism is often used to solve such problems. The primary level uses the problem partitioning concept which is a decomposition based on the mathematical/physical problem. The secondary level utilizes the widely used data partitioning concept. A theoretical performance model is built based on the two-level parallelism. The observed performance results obtained from a network of general purpose Sun Sparc stations are compared with the theoretical values. Restrictions of the theoretical model are also discussed.
Resumo:
A distributed algorithm is developed to solve nonlinear Black-Scholes equations in the hedging of portfolios. The algorithm is based on an approximate inverse Laplace transform and is particularly suitable for problems that do not require detailed knowledge of each intermediate time steps.
Resumo:
Abstract not available
Resumo:
Financial modelling in the area of option pricing involves the understanding of the correlations between asset and movements of buy/sell in order to reduce risk in investment. Such activities depend on financial analysis tools being available to the trader with which he can make rapid and systematic evaluation of buy/sell contracts. In turn, analysis tools rely on fast numerical algorithms for the solution of financial mathematical models. There are many different financial activities apart from shares buy/sell activities. The main aim of this chapter is to discuss a distributed algorithm for the numerical solution of a European option. Both linear and non-linear cases are considered. The algorithm is based on the concept of the Laplace transform and its numerical inverse. The scalability of the algorithm is examined. Numerical tests are used to demonstrate the effectiveness of the algorithm for financial analysis. Time dependent functions for volatility and interest rates are also discussed. Applications of the algorithm to non-linear Black-Scholes equation where the volatility and the interest rate are functions of the option value are included. Some qualitative results of the convergence behaviour of the algorithm is examined. This chapter also examines the various computational issues of the Laplace transformation method in terms of distributed computing. The idea of using a two-level temporal mesh in order to achieve distributed computation along the temporal axis is introduced. Finally, the chapter ends with some conclusions.
Resumo:
The availability of a very accurate dependence graph for a scalar code is the basis for the automatic generation of an efficient parallel implementation. The strategy for this task which is encapsulated in a comprehensive data partitioning code generation algorithm is described. This algorithm involves the data partition, calculation of assignment ranges for partitioned arrays, addition of a comprehensive set of execution control masks, altering loop limits, addition and optimisation of communications for all data. In this context, the development and implementation of strategies to merge communications wherever possible has proved an important feature in producing efficient parallel implementations for numerical mesh based codes. The code generation strategies described here are embedded within the Computer Aided Parallelisation tools (CAPTools) software as a key part of a toolkit for automating as much as possible of the parallelisation process for mesh based numerical codes. The algorithms used enables parallelisation of real computational mechanics codes with only minor user interaction and without any prior manual customisation of the serial code to suit the parallelisation tool.
Resumo:
In fluid mechanics, it is well accepted that the Euler equation is one of the reduced forms of the Navier-Stokes equation by truncating the viscous effect. There are other truncation techniques currently being used in order to truncate the Navier-Stokes equation to a reduced form. This paper describes one such technique, suitable for adaptive domain decomposition methods for the solution of viscous flow problems. The physical domain of a viscous flow problem is partitioned into viscous and inviscid subdomains without overlapping regions, and the technique is embedded into a finite volume method. Some numerical results are provided for a flat plate and the NACA0012 aerofoil. Issues related to distributed computing are discussed.
Resumo:
A flexible elimination algorithm is presented and is applied to the solution of dense systems of linear equations. Properties of the algorithm are exploited in relation to panel element methods for potential flow and subsonic compressible flow. Further properties in relation to distributed computing are also discussed.
Resumo:
Existing election algorithms suffer limited scalability. This limit stems from the communication design which in turn stems from their fundamentally two-state behaviour. This paper presents a new election algorithm specifically designed to be highly scalable in broadcast networks whilst allowing any processing node to become coordinator with initially equal probability. To achieve this, careful attention has been paid to the communication design, and an additional state has been introduced. The design of the tri-state election algorithm has been motivated by the requirements analysis of a major research project to deliver robust scalable distributed applications, including load sharing, in hostile computing environments in which it is common for processing nodes to be rebooted frequently without notice. The new election algorithm is based in-part on a simple 'emergent' design. The science of emergence is of great relevance to developers of distributed applications because it describes how higher-level self-regulatory behaviour can arise from many participants following a small set of simple rules. The tri-state election algorithm is shown to have very low communication complexity in which the number of messages generated remains loosely-bounded regardless of scale for large systems; is highly scalable because nodes in the idle state do not transmit any messages; and because of its self-organising characteristics, is very stable.
Resumo:
Natural distributed systems are adaptive, scalable and fault-tolerant. Emergence science describes how higher-level self-regulatory behaviour arises in natural systems from many participants following simple rulesets. Emergence advocates simple communication models, autonomy and independence, enhancing robustness and self-stabilization. High-quality distributed applications such as autonomic systems must satisfy the appropriate nonfunctional requirements which include scalability, efficiency, robustness, low-latency and stability. However the traditional design of distributed applications, especially in terms of the communication strategies employed, can introduce compromises between these characteristics. This paper discusses ways in which emergence science can be applied to distributed computing, avoiding some of the compromises associated with traditionally-designed applications. To demonstrate the effectiveness of this paradigm, an emergent election algorithm is described and its performance evaluated. The design incorporates nondeterministic behaviour. The resulting algorithm has very low communication complexity, and is simultaneously very stable, scalable and robust.
Resumo:
Fractal video compression is a relatively new video compression method. Its attraction is due to the high compression ratio and the simple decompression algorithm. But its computational complexity is high and as a result parallel algorithms on high performance machines become one way out. In this study we partition the matching search, which occupies the majority of the work in a fractal video compression process, into small tasks and implement them in two distributed computing environments, one using DCOM and the other using .NET Remoting technology, based on a local area network consists of loosely coupled PCs. Experimental results show that the parallel algorithm is able to achieve a high speedup in these distributed environments.
Resumo:
A method is outlined for optimising graph partitions which arise in mapping unstructured mesh calculations to parallel computers. The method employs a relative gain iterative technique to both evenly balance the workload and minimise the number and volume of interprocessor communications. A parallel graph reduction technique is also briefly described and can be used to give a global perspective to the optimisation. The algorithms work efficiently in parallel as well as sequentially and when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. Experiments indicate that the resulting parallel code can provide high quality partitions, independent of the initial partition, within a few seconds. The algorithms can also be used for dynamic load-balancing, reusing existing partitions and in this case the procedures are much faster than static techniques, provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data.