850 resultados para parallel scalability
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Abstract not available
Resumo:
Abstract—This paper presents PORBS, a parallelised observation-based slicing tool. The tool itself is written in Java making it platform independent and leverages the build chain of the system being sliced to avoid the need to replicate complex compiler analysis. The target audience of PORBS is software engineers and researchers working with and on tools and techniques for software comprehension, debugging, re-engineering, and maintenance.
Resumo:
Finance is one of the fastest growing areas in modern applied mathematics with real world applications. The interest of this branch of applied mathematics is best described by an example involving shares. Shareholders of a company receive dividends which come from the profit made by the company. The proceeds of the company, once it is taken over or wound up, will also be distributed to shareholders. Therefore shares have a value that reflects the views of investors about the likely dividend payments and capital growth of the company. Obviously such value will be quantified by the share price on stock exchanges. Therefore financial modelling serves to understand the correlations between asset and movements of buy/sell in order to reduce risk. Such activities depend on financial analysis tools being available to the trader with which he can make rapid and systematic evaluation of buy/sell contracts. There are other financial activities and it is not an intention of this paper to discuss all of these activities. The main concern of this paper is to propose a parallel algorithm for the numerical solution of an European option. This paper is organised as follows. First, a brief introduction is given of a simple mathematical model for European options and possible numerical schemes of solving such mathematical model. Second, Laplace transform is applied to the mathematical model which leads to a set of parametric equations where solutions of different parametric equations may be found concurrently. Numerical inverse Laplace transform is done by means of an inversion algorithm developed by Stehfast. The scalability of the algorithm in a distributed environment is demonstrated. Third, a performance analysis of the present algorithm is compared with a spatial domain decomposition developed particularly for time-dependent heat equation. Finally, a number of issues are discussed and future work suggested.
Resumo:
A new parallel approach for solving a pentadiagonal linear system is presented. The parallel partition method for this system and the TW parallel partition method on a chain of P processors are introduced and discussed. The result of this algorithm is a reduced pentadiagonal linear system of order P \Gamma 2 compared with a system of order 2P \Gamma 2 for the parallel partition method. More importantly the new method involves only half the number of communications startups than the parallel partition method (and other standard parallel methods) and hence is a far more efficient parallel algorithm.
Resumo:
Abstract not available
Resumo:
In this paper, we develop a fast implementation of an hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programing on heterogeneous systems, which includes a wide variety of devices, from dense multicore systems from major manufactures such as Intel or ARM to new accelerators such as graphics processing units (GPUs), field programmable gate arrays (FPGAs), the Intel Xeon Phi and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments have been conducted using simulated data and reveal considerable acceleration factors. This kind of implementations with the same descriptive language on different architectures are very important in order to really calibrate the possibility of using heterogeneous platforms for efficient hyperspectral imaging processing in real remote sensing missions.
Resumo:
A large class of computational problems are characterised by frequent synchronisation, and computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorate in time. As the communication performance of parallel hardware steadily improves so load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load changes throughout the run. The major issues involved in a generic dynamic load balancing scheme will be investigated together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, which is a semi-automatic tool for parallelisation of mesh based FORTRAN codes.
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Abstract not available
Resumo:
Unstructured mesh based codes for the modelling of continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Such codes have the potential to provide a high performance on parallel platforms for a small investment in programming. The critical parameters for success are to minimise changes to the code to allow for maintenance while providing high parallel efficiency, scalability to large numbers of processors and portability to a wide range of platforms. The paradigm of domain decomposition with message passing has for some time been demonstrated to provide a high level of efficiency, scalability and portability across shared and distributed memory systems without the need to re-author the code into a new language. This paper addresses these issues in the parallelisation of a complex three dimensional unstructured mesh Finite Volume multiphysics code and discusses the implications of automating the parallelisation process.
Resumo:
Abstract not available
Resumo:
A parallel method for the dynamic partitioning of unstructured meshes is outlined. The method includes diffusive load-balancing techniques and an iterative optimisation technique known as relative gain optimisationwhich both balances theworkload and attempts to minimise the interprocessor communications overhead. It can also optionally include amultilevel strategy. Experiments on a series of adaptively refined meshes indicate that the algorithmprovides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.