843 results for parallel processing systems


Relevance: 90.00%

Abstract:

A simulation program has been developed to calculate the power-spectral density of thin avalanche photodiodes (APDs), which are used in optical networks. The program extends the time-domain analysis of the dead-space multiplication model to compute the autocorrelation function of the APD impulse response. However, the computation requires a large amount of memory and is very time consuming. We describe our experiences in parallelizing the code using both MPI and OpenMP. Several array partitioning schemes and scheduling policies are implemented and tested. Our results show that the OpenMP code is scalable up to 64 processors on an SGI Origin 2000 machine and has small average errors.
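The abstract gives no code; the following is a minimal sketch of the kind of OpenMP parallelization it describes, assuming the impulse response is a sampled array whose autocorrelation is computed lag by lag. The function and variable names are illustrative, not taken from the original program; the schedule clause stands in for the scheduling policies that were tested.

    /* Autocorrelation r[k] = sum_j h[j] * h[j+k] of an impulse response h[0..n-1].
     * Each lag k is independent, so the outer loop parallelizes cleanly; later
     * lags have shorter inner loops, so a dynamic schedule helps balance the
     * uneven work per iteration. */
    void autocorrelation(const double *h, long n, double *r)
    {
        #pragma omp parallel for schedule(dynamic, 16)
        for (long k = 0; k < n; k++) {
            double sum = 0.0;
            for (long j = 0; j + k < n; j++)
                sum += h[j] * h[j + k];
            r[k] = sum;
        }
    }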

Relevance: 90.00%

Abstract:

A comprehensive solution of solidification/melting processes requires the simultaneous representation of free surface fluid flow, heat transfer, phase change, nonlinear solid mechanics and, possibly, electromagnetics together with their interactions, in what is now known as multiphysics simulation. Such simulations are computationally intensive and the implementation of solution strategies for multiphysics calculations must embed their effective parallelization. For some years, together with our collaborators, we have been involved in the development of numerical software tools for multiphysics modeling on parallel cluster systems. This research has involved a combination of algorithmic procedures, parallel strategies and tools, plus the design of a computational modeling software environment and its deployment in a range of real world applications. One output from this research is the three-dimensional parallel multiphysics code, PHYSICA. In this paper we report on an assessment of its parallel scalability on a range of increasingly complex models drawn from actual industrial problems, on three contemporary parallel cluster systems.

Relevance: 90.00%

Abstract:

Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs: for example, a problem composed of ~100,000 cells would run 9.3 times faster on a network of 12 800MHz PCs than on a single 800MHz PC. It was also found that a network of eight 3.2GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2GHz Pentium computer. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that the impact of the parallel processing task on other computer users on the network was minimized.
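For context, these reported speedups correspond to parallel efficiencies of E = S/p, i.e. 9.3/12 ≈ 0.78 for the twelve 800MHz machines and 7.04/8 ≈ 0.88 for the eight 3.2GHz machines, or roughly 78% and 88% utilisation of the respective networks.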

Relevance: 90.00%

Abstract:

This paper presents an investigation into dynamic self-adjustment of task deployment and other aspects of self-management, through the embedding of multiple policies. Non-dedicated, loosely-coupled computing environments such as clusters and grids are increasingly popular platforms for parallel processing. These abundant systems are highly dynamic environments in which many sources of variability affect the run-time efficiency of tasks. The dynamism is exacerbated by the incorporation of mobile devices and wireless communication. This paper proposes an adaptive strategy for the flexible run-time deployment of tasks, so as to continuously maintain efficiency despite the environmental variability. The strategy centres on policy-based scheduling which is informed by contextual and environmental inputs such as variance in the round-trip communication time between a client and its workers and the effective processing performance of each worker. A self-management framework has been implemented for evaluation purposes. The framework integrates several policy-controlled, adaptive services with the application code, enabling the run-time behaviour to be adapted to contextual and environmental conditions. Using this framework, an exemplar self-managing parallel application is implemented and used to investigate the extent of the benefits of the strategy.
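As a rough illustration of the kind of policy-controlled worker selection described here, the sketch below scores workers by combining measured round-trip time and effective processing rate. The structure, field names, and weighting policy are assumptions for illustration, not the framework's actual interface.

    #include <stddef.h>

    /* Illustrative worker record: metrics a scheduler might sample at run time. */
    struct worker {
        double rtt_ms;        /* recent round-trip communication time to the worker */
        double rate_tasks_s;  /* effective processing rate observed for this worker */
    };

    /* Hypothetical policy: prefer workers with a high processing rate and a low
     * round-trip time.  The weights are themselves policy parameters that a
     * self-management framework could adjust as conditions change. */
    static double score(const struct worker *w, double w_rate, double w_rtt)
    {
        return w_rate * w->rate_tasks_s - w_rtt * w->rtt_ms;
    }

    /* Pick the index of the best worker under the current policy weights. */
    size_t pick_worker(const struct worker *ws, size_t n, double w_rate, double w_rtt)
    {
        size_t best = 0;
        for (size_t i = 1; i < n; i++)
            if (score(&ws[i], w_rate, w_rtt) > score(&ws[best], w_rate, w_rtt))
                best = i;
        return best;
    }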

Relevance: 90.00%

Abstract:

Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as Computational Fluid Dynamics. This is normally achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this paper, we demonstrate how typical office-based PCs attached to a local area network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. A dynamic load balancing scheme was devised to allow the effective use of the software on heterogeneous PC networks. This scheme ensured that the impact of the parallel processing task on other computer users on the network was minimized, thus allowing practical parallel processing within a conventional office environment. Copyright © 2006 John Wiley & Sons, Ltd.
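A minimal sketch of the general idea behind such a dynamic load-balancing scheme follows, assuming each PC periodically reports the cell-processing rate it is currently sustaining. The names and the proportional-allocation rule are illustrative; the paper's actual scheme is not reproduced here.

    #include <stddef.h>

    /* Redistribute a fixed number of cells among heterogeneous PCs in proportion
     * to the compute rate each one is currently achieving.  A machine busy with
     * other office work reports a lower rate and is assigned fewer cells, which
     * keeps the impact on other users small. */
    void rebalance(const double *cells_per_sec, size_t n_pcs,
                   long total_cells, long *cells_assigned)
    {
        double total_rate = 0.0;
        for (size_t i = 0; i < n_pcs; i++)
            total_rate += cells_per_sec[i];

        long assigned = 0;
        for (size_t i = 0; i < n_pcs; i++) {
            cells_assigned[i] = (long)(total_cells * cells_per_sec[i] / total_rate);
            assigned += cells_assigned[i];
        }

        /* Give any rounding remainder to the fastest machine. */
        size_t fastest = 0;
        for (size_t i = 1; i < n_pcs; i++)
            if (cells_per_sec[i] > cells_per_sec[fastest])
                fastest = i;
        cells_assigned[fastest] += total_cells - assigned;
    }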

Relevance: 90.00%

Abstract:

Computer egress simulation has the potential to be used in large scale incidents to provide live advice to incident commanders. While there are many considerations which must be taken into account when applying such models to live incidents, one of the first concerns the computational speed of simulations. No matter how important the insight provided by the simulation, numerical hindsight will not prove useful to an incident commander. Thus, for this type of application to be useful, it is essential that the simulation can be run many times faster than real time. Parallel processing is a method of reducing run times for very large computational simulations by distributing the workload amongst a number of CPUs. In this paper we examine the development of a parallel version of the buildingEXODUS software. The parallel strategy implemented is based on a systematic partitioning of the problem domain onto an arbitrary number of sub-domains. Each sub-domain is computed on a separate processor and runs its own copy of the EXODUS code. The software has been designed to work on typical office-based networked PCs but will also function on a Windows-based cluster. Two evaluation scenarios using the parallel implementation of EXODUS are described: a large open area and a 50-story high-rise building scenario. Speed-ups of up to 3.7 are achieved using up to six computers, with the high-rise building evacuation simulation achieving run times 6.4 times faster than real time.
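The following is a minimal MPI-style skeleton of the partitioning approach described here, assuming a one-dimensional split of the geometry and a per-step exchange of occupants crossing sub-domain boundaries. The routine structure and data layout are illustrative rather than those of buildingEXODUS, which targets networked Windows PCs.

    #include <mpi.h>

    /* Each rank simulates its own sub-domain with its own copy of the solver and
     * exchanges boundary-crossing counts with its neighbours every time step. */
    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        for (int step = 0; step < 1000; step++) {
            /* ... advance this rank's sub-domain by one simulation step ... */

            int send_left = 0, send_right = 0;  /* occupants leaving this sub-domain   */
            int recv_left = 0, recv_right = 0;  /* occupants arriving from neighbours  */
            MPI_Sendrecv(&send_right, 1, MPI_INT, right, 0,
                         &recv_left,  1, MPI_INT, left,  0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&send_left,  1, MPI_INT, left,  1,
                         &recv_right, 1, MPI_INT, right, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            /* ... insert the arriving occupants into this sub-domain ... */
        }

        MPI_Finalize();
        return 0;
    }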

Relevance: 90.00%

Abstract:

This paper, chosen as a best paper from the 2005 SAMOS Workshop on Computer Systems, describes for the first time the major Abhainn project for automated system-level design of embedded signal processing systems. In particular, it describes four key novelties: novel algorithm modelling techniques for DSP systems, automated implementation realisation, algorithm transformation for system optimisation, and automated inter-processor communication. These are applied to two complex systems: a radar system and a sonar system. In both cases, technology that allows non-experts to automatically create low-overhead, high-performance embedded signal processing systems is exhibited.

Relevance: 90.00%

Abstract:

The real-time implementation of an efficient signal compression technique, Vector Quantization (VQ), is of great importance to many digital signal coding applications. In this paper, we describe a new family of bit-level systolic VLSI architectures which offer an attractive solution to this problem. These architectures are based on a bit-serial, word-parallel approach, and high performance and efficiency can be achieved for VQ applications across a wide range of bandwidths. Compared with their bit-parallel counterparts, these bit-serial circuits provide better alternatives for VQ implementations in terms of performance and cost. © 1995 Kluwer Academic Publishers.
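For readers unfamiliar with VQ, the core computation these architectures implement is a nearest-codeword search; a plain software version is sketched below purely to show the operation, not the bit-serial systolic structure itself.

    #include <stddef.h>

    /* Vector quantization encodes an input vector by the index of the closest
     * codeword in a codebook (squared Euclidean distance is used here). */
    size_t vq_encode(const double *x, size_t dim,
                     const double *codebook, size_t n_codewords)
    {
        size_t best = 0;
        double best_dist = -1.0;
        for (size_t c = 0; c < n_codewords; c++) {
            double dist = 0.0;
            for (size_t d = 0; d < dim; d++) {
                double diff = x[d] - codebook[c * dim + d];
                dist += diff * diff;
            }
            if (best_dist < 0.0 || dist < best_dist) {
                best = c;
                best_dist = dist;
            }
        }
        return best;
    }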

Relevance: 90.00%

Abstract:

The application of fine grain pipelining techniques in the design of high performance Wave Digital Filters (WDFs) is described. It is shown that significant increases in the sampling rate of bit parallel circuits can be achieved using most significant bit (msb) first arithmetic. A novel VLSI architecture for implementing two-port adaptor circuits is described which embodies these ideas. The circuit in question is highly regular, uses msb first arithmetic and is implemented using simple carry-save adders. © 1992 Kluwer Academic Publishers.
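As background for the architecture, a two-port adaptor computes a pair of reflected waves from the incident waves and a single adaptor coefficient; a plain software sketch of one standard formulation follows. The hardware described in the paper evaluates this arithmetic msb-first with carry-save adders, which is not shown here.

    /* Two-port adaptor: incident waves a1, a2 and adaptor coefficient gamma,
     * where gamma = (R1 - R2) / (R1 + R2) for port resistances R1, R2.
     * Both outputs share the common term gamma * (a2 - a1), so a single
     * multiplication suffices per adaptor. */
    void two_port_adaptor(double a1, double a2, double gamma,
                          double *b1, double *b2)
    {
        double t = gamma * (a2 - a1);
        *b1 = a2 + t;
        *b2 = a1 + t;
    }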

Relevance: 90.00%

Abstract:

Several novel systolic architectures for implementing densely pipelined bit parallel IIR filter sections are presented. The fundamental problem of latency in the feedback loop is overcome by employing redundant arithmetic in combination with bit-level feedback, allowing a basic first-order section to achieve a wordlength-independent latency of only two clock cycles. This is extended to produce a building block from which higher order sections can be constructed. The architecture is then refined by combining the use of both conventional and redundant arithmetic, resulting in two new structures offering substantial hardware savings over the original design. In contrast to alternative techniques, bit-level pipelinability is achieved with no net cost in hardware. © 1989 Kluwer Academic Publishers.
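To make the latency problem concrete, a first-order IIR section contains the tight feedback recurrence sketched below in plain C, in which y[n] cannot be computed until y[n-1] is available; the architectures in the paper use redundant arithmetic and bit-level feedback so that this loop costs only two clock cycles regardless of wordlength. The coefficient names below are generic, not taken from the paper.

    #include <stddef.h>

    /* First-order IIR section: y[n] = b0*x[n] + b1*x[n-1] + a1*y[n-1].
     * The dependence of y[n] on y[n-1] is the feedback loop whose latency
     * limits how deeply the hardware can be pipelined. */
    void iir_first_order(const double *x, double *y, size_t n,
                         double b0, double b1, double a1)
    {
        double x_prev = 0.0, y_prev = 0.0;
        for (size_t i = 0; i < n; i++) {
            y[i] = b0 * x[i] + b1 * x_prev + a1 * y_prev;
            x_prev = x[i];
            y_prev = y[i];
        }
    }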

Relevance: 90.00%

Abstract:

The inherent difficulty of thread-based shared-memory programming has recently motivated research in high-level, task-parallel programming models. Recent advances in task-parallel models add implicit synchronization, where the system automatically detects and satisfies data dependencies among spawned tasks. However, dynamic dependence analysis incurs significant runtime overheads, because the runtime must track task resources and use this information to schedule tasks while avoiding conflicts and races.
We present SCOOP, a compiler that effectively integrates static and dynamic analysis in code generation. SCOOP combines context-sensitive points-to, control-flow, escape, and effect analyses to remove redundant dependence checks at runtime. Our static analysis can work in combination with existing dynamic analyses and task-parallel runtimes that use annotations to specify tasks and their memory footprints. We use our static dependence analysis to detect non-conflicting tasks and an existing dynamic analysis to handle the remaining dependencies. We evaluate the resulting hybrid dependence analysis on a set of task-parallel programs.
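The abstract does not define the annotation interface; as a toy model of footprint-based dependence detection, the sketch below treats a task's memory footprint as a set of address ranges and flags two tasks as dependent when any ranges overlap and at least one access is a write. The types and names are hypothetical; SCOOP's contribution is to prove many such checks unnecessary at compile time so they never execute.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical footprint annotation: one address range a task reads or writes. */
    struct range {
        uintptr_t start, end;   /* half-open interval [start, end) */
        bool writes;
    };

    static bool overlaps(const struct range *a, const struct range *b)
    {
        return a->start < b->end && b->start < a->end;
    }

    /* Two tasks conflict (and must be ordered) if any of their footprint ranges
     * overlap and at least one of the overlapping accesses is a write. */
    bool tasks_conflict(const struct range *fa, size_t na,
                        const struct range *fb, size_t nb)
    {
        for (size_t i = 0; i < na; i++)
            for (size_t j = 0; j < nb; j++)
                if (overlaps(&fa[i], &fb[j]) && (fa[i].writes || fb[j].writes))
                    return true;
        return false;
    }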

Relevance: 90.00%

Abstract:

The cycle of the academic year impacts on efforts to refine and improve major group design-build-test (DBT) projects, since the time to run and evaluate projects is generally a full calendar year. By definition these major projects have a high degree of complexity since they act as the vehicle for the application of a range of technical knowledge and skills. There is also often an extensive list of desired learning outcomes which extends to include professional skills and attributes such as communication and team working. It is contended that student project definition and operation, like any other designed product, requires a number of iterations to achieve optimisation. The problem, however, is that if this cycle takes four or more years then by the time a project's operational structure is fine-tuned it is quite possible that the project theme is no longer relevant. The majority of the students will also inevitably experience a sub-optimal project over the five-year development period. It would be much better if the ratio were flipped, so that in one year an optimised project definition could be achieved which had sufficient longevity to run in the same efficient manner for four further years. An increased number of parallel investigators would also enable more varied and adventurous project concepts to be examined than a single institution could undertake alone in the same time frame.
This work-in-progress paper describes a parallel processing methodology for the accelerated definition of new student DBT project concepts. This methodology has been devised and implemented by a number of CDIO partner institutions in the UK & Ireland region. An agreed project theme was operated in parallel in one academic year, with the objective of replacing a multi-year iterative cycle. Additionally, the close collaboration and peer learning derived from the interaction between the coordinating academics facilitated the development of faculty teaching skills in line with CDIO Standard 10.