13 results for hardware implementation
in Greenwich Academic Literature Archive - UK
Abstract:
Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs; for example, a problem composed of ~100,000 cells would run 9.3 times faster on a network of twelve 800 MHz PCs than on a single 800 MHz PC. It was also found that a network of eight 3.2 GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2 GHz Pentium computer. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that the impact between the parallel processing task and other computer users on the network was minimized.
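The speedup figures quoted in this abstract can be read as parallel efficiency, i.e. speedup divided by the number of PCs. The sketch below is only an illustration of that arithmetic; the function name `parallel_efficiency` is an assumption made here and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup divided by processor count (1.0 means ideal linear scaling)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speedups and PC counts as quoted in the abstract.
eff_800mhz = parallel_efficiency(9.3, 12)  # twelve 800 MHz PCs
eff_p4 = parallel_efficiency(7.04, 8)      # eight 3.2 GHz Pentium 4 PCs

print(f"12 PCs: {eff_800mhz:.1%}, 8 PCs: {eff_p4:.1%}")
```

On these figures the eight-PC network uses its processors somewhat more efficiently than the twelve-PC network, although the two measurements come from different hardware and are not directly comparable.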
Abstract:
The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier–Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations. Copyright © 2000 John Wiley & Sons, Ltd.
Abstract:
Three paradigms for distributed-memory parallel computation that free the application programmer from the details of message passing are compared for an archetypal structured scientific computation -- a nonlinear, structured-grid partial differential equation boundary value problem -- using the same algorithm on the same hardware. All of the paradigms -- parallel languages represented by the Portland Group's HPF, (semi-)automated serial-to-parallel source-to-source translation represented by CAP-Tools from the University of Greenwich, and parallel libraries represented by Argonne's PETSc -- are found to be easy to use for this problem class, and all are reasonably effective in exploiting concurrency after a short learning curve. The level of involvement required by the application programmer under any paradigm includes specification of the data partitioning, corresponding to a geometrically simple decomposition of the domain of the PDE. Programming in SPMD style for the PETSc library requires writing only the routines that discretize the PDE and its Jacobian, managing subdomain-to-processor mappings (affine global-to-local index mappings), and interfacing to library solver routines. Programming for HPF requires a complete sequential implementation of the same algorithm as a starting point, introduction of concurrency through subdomain blocking (a task similar to the index mapping), and modest experimentation with rewriting loops to elucidate to the compiler the latent concurrency. Programming with CAPTools involves feeding the same sequential implementation to the CAPTools interactive parallelization system, and guiding the source-to-source code transformation by responding to various queries about quantities knowable only at runtime. 
Results representative of "the state of the practice" for a scaled sequence of structured grid problems are given on three of the most important contemporary high-performance platforms: the IBM SP, the SGI Origin 2000, and the Cray T3E.
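The "affine global-to-local index mappings" mentioned in this abstract can be illustrated with a one-dimensional block decomposition. This is a minimal sketch under assumptions made here: the names `block_range` and `global_to_local` are hypothetical, and the code does not use or reproduce the PETSc, HPF or CAPTools interfaces.

```python
from typing import Tuple


def block_range(n_global: int, n_procs: int, rank: int) -> Tuple[int, int]:
    """Half-open range [start, stop) of global indices owned by `rank`.

    The first `n_global % n_procs` ranks receive one extra index so that
    block sizes differ by at most one.
    """
    base, extra = divmod(n_global, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def global_to_local(i_global: int, n_global: int, n_procs: int, rank: int) -> int:
    """Affine map from a global index to the local index on `rank`."""
    start, stop = block_range(n_global, n_procs, rank)
    if not start <= i_global < stop:
        raise IndexError(f"global index {i_global} is not owned by rank {rank}")
    return i_global - start  # local = global - offset


# Ten grid points shared among three processes.
ranges = [block_range(10, 3, r) for r in range(3)]
local_index = global_to_local(5, 10, 3, 1)
```

Each process only needs its own offset to translate indices, which is why the mapping is described as affine.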
Abstract:
Virtual manufacturing and design assessment increasingly involve the simulation of interacting phenomena, i.e. multi-physics, an activity which is very computationally intensive. This chapter describes an attempt to address the parallel issues associated with a multi-physics simulation approach based upon a range of compatible procedures operating on one mesh using a single database - the distinct physics solvers can operate separately or coupled on sub-domains of the whole geometric space. Moreover, the finite volume unstructured mesh solvers use different discretization schemes (and, particularly, different ‘nodal’ locations and control volumes). A two-level approach to the parallelization of this simulation software is described: the code is restructured into parallel form on the basis of the mesh partitioning alone, that is, without regard to the physics. However, at run time, the mesh is partitioned to achieve a load balance, by considering the load per node/element across the whole domain. The latter of course is determined by the problem specific physics at a particular location.
Abstract:
This paper presents a proactive approach to load sharing and describes the architecture of a scheme, Concert, based on this approach. A proactive approach is characterized by a shift of emphasis from reacting to load imbalance to avoiding its occurrence. In contrast, in a reactive load sharing scheme, activity is triggered when a processing node is either overloaded or underloaded. The main drawback of this approach is that a load imbalance is allowed to develop before costly corrective action is taken. Concert is a load sharing scheme for loosely-coupled distributed systems. Under this scheme, load and task behaviour information is collected and cached in advance of when it is needed. Concert uses Linux as a platform for development. Implemented partially in kernel space and partially in user space, it achieves transparency to users and applications whilst keeping the extent of kernel modifications to a minimum. Non-preemptive task transfers are used exclusively, motivated by lower complexity, lower overheads and faster transfers. The goal is to minimize the average response-time of tasks. Concert is compared with other schemes by considering the level of transparency it provides with respect to users, tasks and the underlying operating system.
Abstract:
A simulation program has been developed to calculate the power-spectral density of thin avalanche photodiodes, which are used in optical networks. The program extends the time-domain analysis of the dead-space multiplication model to compute the autocorrelation function of the APD impulse response. However, the computation requires a large amount of memory space and is very time consuming. We describe our experiences in parallelizing the code using both MPI and OpenMP. Several array partitioning schemes and scheduling policies are implemented and tested. Our results show that the OpenMP code is scalable up to 64 processors on an SGI Origin 2000 machine and has small average errors.
A policy-definition language and prototype implementation library for policy-based autonomic systems
Abstract:
This paper presents work towards generic policy toolkit support for autonomic computing systems in which the policies themselves can be adapted dynamically and automatically. The work is motivated by three needs: the need for longer-term policy-based adaptation where the policy itself is dynamically adapted to continually maintain or improve its effectiveness despite changing environmental conditions; the need to enable non autonomics-expert practitioners to embed self-managing behaviours with low cost and risk; and the need for adaptive policy mechanisms that are easy to deploy into legacy code. A policy definition language is presented, designed to permit powerful expression of self-managing behaviours. The language is very flexible through the use of simple yet expressive syntax and semantics, and facilitates a very diverse policy behaviour space through both hierarchical and recursive uses of language elements. A prototype library implementation of the policy support mechanisms is described. The library reads and writes policies in well-formed XML script. The implementation extends the state of the art in policy-based autonomics through innovations which include support for multiple policy versions of a given policy type, multiple configuration templates, and meta-policies to dynamically select between policy instances and templates. Most significantly, the scheme supports hot-swapping between policy instances. To illustrate the feasibility and generalised applicability of these tools, two dissimilar example deployment scenarios are examined. The first is taken from an exploratory implementation of self-managing parallel processing, and is used to demonstrate the simple and efficient use of the tools. The second example demonstrates more-advanced functionality, in the context of an envisioned multi-policy stock trading scheme which is sensitive to environmental volatility.
Abstract:
Fractal video compression is a relatively new video compression method. Its attraction is due to the high compression ratio and the simple decompression algorithm. But its computational complexity is high, and as a result parallel algorithms on high performance machines are one way out. In this study we partition the matching search, which occupies the majority of the work in a fractal video compression process, into small tasks and implement them in two distributed computing environments, one using DCOM and the other using .NET Remoting technology, based on a local area network consisting of loosely coupled PCs. Experimental results show that the parallel algorithm is able to achieve a high speedup in these distributed environments.
Abstract:
Computer egress simulation has the potential to be used in large scale incidents to provide live advice to incident commanders. While there are many considerations which must be taken into account when applying such models to live incidents, one of the first concerns the computational speed of simulations. No matter how important the insight provided by the simulation, numerical hindsight will not prove useful to an incident commander. Thus for this type of application to be useful, it is essential that the simulation can be run many times faster than real time. Parallel processing is a method of reducing run times for very large computational simulations by distributing the workload amongst a number of CPUs. In this paper we examine the development of a parallel version of the buildingEXODUS software. The parallel strategy implemented is based on a systematic partitioning of the problem domain onto an arbitrary number of sub-domains. Each sub-domain is computed on a separate processor and runs its own copy of the EXODUS code. The software has been designed to work on typical office-based networked PCs but will also function on a Windows-based cluster. Two evaluation scenarios using the parallel implementation of EXODUS are described: a large open area and a 50-storey high-rise building scenario. Speed-ups of up to 3.7 are achieved using up to six computers, with the high-rise building evacuation simulation running 6.4 times faster than real time.
Abstract:
Within the building evacuation context, wayfinding describes the process in which an individual located within an arbitrarily complex enclosure attempts to find a path which leads them to relative safety, usually the exterior of the enclosure. Within most evacuation modelling tools, wayfinding is completely ignored; agents are either assigned the shortest distance path or use a potential field to find the shortest path to the exits. In this paper a novel wayfinding technique that attempts to represent the manner in which people wayfind within structures is introduced and demonstrated through two examples. The first step is to encode the spatial information of the enclosure in terms of a graph. The second step is to apply search algorithms to the graph to find possible routes to the destination and assign a cost to the routes based on the agent's personal route preferences such as "least time" or "least distance" or a combination of criteria. The third step is the route execution and refinement. In this step, the agent moves along the chosen route, reassesses the route at regular intervals, and may decide to take an alternative path if it determines that an alternative route is more favourable, e.g. the initial path is highly congested or is blocked due to fire.
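The first two steps described in this abstract (a graph encoding of the enclosure, then a search with a preference-weighted route cost) can be sketched as follows. The abstract does not say which search algorithm the authors use; Dijkstra's algorithm is substituted here purely for illustration, and the function `best_route`, the weights `w_distance` and `w_time`, and the example graph are all hypothetical.

```python
import heapq


def best_route(graph, start, goal, w_distance=1.0, w_time=0.0):
    """Return (cost, path) minimising w_distance * distance + w_time * time.

    `graph` maps node -> {neighbour: (distance, time)}; weights are non-negative.
    """
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, (distance, time) in graph.get(node, {}).items():
            if neighbour not in settled:
                step = w_distance * distance + w_time * time
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return float("inf"), []  # goal unreachable


# Hypothetical enclosure: the corridor is shorter but congested (slow).
graph = {
    "room": {"corridor": (5.0, 4.0), "stairs": (12.0, 6.0)},
    "corridor": {"exit": (10.0, 20.0)},
    "stairs": {"exit": (8.0, 5.0)},
}

least_distance = best_route(graph, "room", "exit")
least_time = best_route(graph, "room", "exit", w_distance=0.0, w_time=1.0)
```

The third step, route refinement, could be approximated by calling `best_route` again from the agent's current node after updating the edge times, for example when a path becomes congested or blocked.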
Abstract:
An aqueous solution of sucrose was lyophilised, producing amorphous sucrose. This was then stored under different humidity at 25 °C for 1 week, allowing some samples to crystallise. FT-Raman spectroscopy and PXRD have been successfully shown to qualitatively distinguish between amorphous and crystalline samples of sucrose. The data from the two techniques is complementary.
Abstract:
Today, the key to commercial success in manufacturing is the timely development of new products that are not only functionally fit for purpose but offer high performance and quality throughout their entire lifecycle. In principle, this demands the introduction of a fully developed and optimised product from the outset. To accomplish this, manufacturing companies must leverage existing knowledge in their current technical, manufacturing and service capabilities. This is especially true in the field of tolerance selection and application, the subject area of this research. Tolerance knowledge must be readily available and deployed as an integral part of the product development process. This paper describes a methodology and framework, currently under development in a UK manufacturer, to achieve this objective.
Abstract:
Background: Personal health records were implemented with adults with learning disabilities (AWLD) to try to improve their health-care. Materials and Method: Forty GP practices were randomized to the Personal Health Profile (PHP) implementation or control group. Two hundred and one AWLD were interviewed at baseline and 163 followed up after 12 months' intervention (PHP group). AWLD and carers of AWLD were employed as research interviewers. AWLD were full research participants. Results: Annual consultation rates in the intervention and control groups at baseline were low (2.3 and 2.6 visits respectively). A slightly greater increase occurred over the year in the intervention group, 0.6 (-0.4 to 1.6) visits/year, compared with controls. AWLD in the PHP group reported more health problems at follow-up, 0.9 (0.0 to 1.8). AWLD liked their PHP (92%) but only 63% of AWLD and 55% of carers reported PHP usage. Carers had high turnover (34%). Conclusions: No significant outcomes were achieved by the intervention.