9 results for HARDWARE
in Greenwich Academic Literature Archive - UK
Abstract:
Three paradigms for distributed-memory parallel computation that free the application programmer from the details of message passing are compared for an archetypal structured scientific computation -- a nonlinear, structured-grid partial differential equation boundary value problem -- using the same algorithm on the same hardware. All of the paradigms -- parallel languages represented by the Portland Group's HPF, (semi-)automated serial-to-parallel source-to-source translation represented by CAPTools from the University of Greenwich, and parallel libraries represented by Argonne's PETSc -- are found to be easy to use for this problem class, and all are reasonably effective in exploiting concurrency after a short learning curve. The level of involvement required by the application programmer under any paradigm includes specification of the data partitioning, corresponding to a geometrically simple decomposition of the domain of the PDE. Programming in SPMD style for the PETSc library requires writing only the routines that discretize the PDE and its Jacobian, managing subdomain-to-processor mappings (affine global-to-local index mappings), and interfacing to library solver routines. Programming for HPF requires a complete sequential implementation of the same algorithm as a starting point, introduction of concurrency through subdomain blocking (a task similar to the index mapping), and modest experimentation with rewriting loops to elucidate to the compiler the latent concurrency. Programming with CAPTools involves feeding the same sequential implementation to the CAPTools interactive parallelization system, and guiding the source-to-source code transformation by responding to various queries about quantities knowable only at runtime.
Results representative of "the state of the practice" for a scaled sequence of structured grid problems are given on three of the most important contemporary high-performance platforms: the IBM SP, the SGI Origin 2000, and the Cray T3E.
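The "affine global-to-local index mappings" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional grid split into contiguous blocks, one per process; the function names (`block_range`, `global_to_local`, `local_to_global`) are hypothetical and are not part of PETSc, HPF, or CAPTools.

```python
def block_range(n_global: int, n_procs: int, rank: int) -> tuple[int, int]:
    """Return the half-open global index range [start, stop) owned by `rank`.

    Assumed layout: contiguous blocks, with any remainder points given
    to the lowest-numbered ranks (one extra point each).
    """
    base, extra = divmod(n_global, n_procs)
    start = rank * base + min(rank, extra)
    size = base + (1 if rank < extra else 0)
    return start, start + size


def global_to_local(i_global: int, n_global: int, n_procs: int, rank: int) -> int:
    """Affine map: local index = global index - block offset."""
    start, stop = block_range(n_global, n_procs, rank)
    if not start <= i_global < stop:
        raise IndexError(f"global index {i_global} is not owned by rank {rank}")
    return i_global - start


def local_to_global(i_local: int, n_global: int, n_procs: int, rank: int) -> int:
    """Inverse affine map: global index = block offset + local index."""
    start, stop = block_range(n_global, n_procs, rank)
    if not 0 <= i_local < stop - start:
        raise IndexError(f"local index {i_local} is out of range on rank {rank}")
    return start + i_local


if __name__ == "__main__":
    # 10 grid points on 3 processes: blocks [0, 4), [4, 7), [7, 10)
    for rank in range(3):
        print(rank, block_range(10, 3, rank))
```

For a structured grid in two or three dimensions, the same offset arithmetic would be applied independently along each axis.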
Abstract:
The shared-memory programming model can be an effective way to achieve parallelism on shared-memory parallel computers. Historically, however, the lack of a programming standard using directives and the limited scalability have affected its take-up. Recent advances in hardware and software technologies have resulted in improvements to both the performance of parallel programs with compiler directives and the issue of portability with the introduction of OpenMP. In this study, the Computer Aided Parallelisation Toolkit has been extended to automatically generate OpenMP-based parallel programs with nominal user assistance. We categorize the different loop types and show how efficient directives can be placed using the toolkit's in-depth interprocedural analysis. Examples are taken from the NAS parallel benchmarks and a number of real-world application codes. This demonstrates the great potential of using the toolkit to quickly parallelise serial programs as well as the good performance achievable on up to 300 processors for hybrid message passing-directive parallelisations.
Abstract:
Between model geometry creation and model analysis, the intermediate stages such as mesh generation are the most manpower-intensive phase of a mesh-based computational mechanics simulation process. The model analysis, on the other hand, is the most computing-intensive phase. Advanced computational hardware and software have significantly reduced the computing time - and more importantly the trend is downward. With the kind of models envisaged, which are larger, more complex in geometry and modelling, and multiphysics, there is no clear trend that the manpower-intensive phase will decrease significantly in time - in the present way of operation it is more likely to increase with model complexity. In this paper we address this dilemma in collaborating components for models in electronic packaging applications.
Abstract:
Many Web applications walk the thin line between the need for dynamic data and the need to meet user performance expectations. In environments where funds are not available to constantly upgrade hardware in line with user demand, alternative approaches need to be considered. This paper introduces a ‘Data farming’ model whereby dynamic data, which is ‘grown’ in operational applications, is ‘harvested’ and ‘packaged’ for various consumer markets. Like any well-managed agricultural operation, crops are harvested according to historical and perceived demand as inferred by a self-optimising process. This approach aims to make enhanced use of available resources through better utilisation of system downtime - thereby improving application performance and increasing the availability of key business data.
Abstract:
Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs; for example, a problem composed of ~100,000 cells would run 9.3 times faster on a network of twelve 800MHz PCs than on a single 800MHz PC. It was also found that a network of eight 3.2GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2GHz Pentium 4 PC. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that the impact between the parallel processing task and other computer users on the network was minimized.
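The speedup figures quoted in the abstract above can be restated as parallel efficiency, that is, speedup divided by the number of processors. The sketch below is only that arithmetic; the efficiency values are derived here and are not reported in the abstract, and the function name is an illustrative choice.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal (linear) speedup achieved: speedup / processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# (speedup, number of PCs) as quoted in the abstract
reported = {
    "12 x 800MHz PCs": (9.3, 12),
    "8 x 3.2GHz Pentium 4 PCs": (7.04, 8),
}

if __name__ == "__main__":
    for label, (speedup, n) in reported.items():
        print(f"{label}: efficiency = {parallel_efficiency(speedup, n):.1%}")
    # prints:
    # 12 x 800MHz PCs: efficiency = 77.5%
    # 8 x 3.2GHz Pentium 4 PCs: efficiency = 88.0%
```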
Abstract:
The scalability of a computer system is its response to growth. It also depends on its hardware, its operating system and the applications it is running. Most distributed systems technology today still depends on bus-based shared memory, which does not scale well, and systems based on a grid or hypercube scheme require significantly fewer connections than a full interconnection, which would exhibit a quadratic growth rate. The rapid convergence of mobile communication, digital broadcasting and network infrastructures calls for rich multimedia content that is adaptive and responsive to the needs of individuals, businesses and public organisations. This paper discusses the emergence of mobile multimedia systems and provides an overview of the issues regarding design and delivery of multimedia content to mobile devices.
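The claim above about connection counts can be checked with standard topology formulas: a full interconnection of n nodes needs n(n-1)/2 links, a d-dimensional hypercube of 2^d nodes needs d*2^(d-1) links, and a rows-by-cols grid without wraparound needs rows*(cols-1) + cols*(rows-1) links. The following is a small sketch of that comparison; the function names and the sample sizes are illustrative choices, not taken from the paper.

```python
def full_mesh_links(n: int) -> int:
    """Every node linked to every other node: n(n-1)/2, quadratic in n."""
    return n * (n - 1) // 2


def hypercube_links(d: int) -> int:
    """d-dimensional hypercube with 2**d nodes: each node has d neighbours."""
    return d * (1 << d) // 2


def grid_links(rows: int, cols: int) -> int:
    """2D grid without wraparound: horizontal plus vertical links."""
    return rows * (cols - 1) + cols * (rows - 1)


if __name__ == "__main__":
    for d, side in [(4, 4), (6, 8)]:
        n = 1 << d  # 16 and 64 nodes
        print(n, full_mesh_links(n), hypercube_links(d), grid_links(side, side))
    # prints:
    # 16 120 32 24
    # 64 2016 192 112
```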
Abstract:
Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as Computational Fluid Dynamics. This is normally achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this paper, we demonstrate how typical office-based PCs attached to a local area network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. A dynamic load balancing scheme was devised to allow the effective use of the software on heterogeneous PC networks. This scheme ensured that the impact between the parallel processing task and other computer users on the network was minimized, thus allowing practical parallel processing within a conventional office environment. Copyright © 2006 John Wiley & Sons, Ltd.
Abstract:
Embedded electronic systems in vehicles are of rapidly increasing commercial importance for the automotive industry. While current vehicular embedded systems are extremely limited and static, a more dynamic, configurable system would greatly simplify the integration work and increase the quality of vehicular systems. This brings in features like separation of concerns, customised software configuration for individual vehicles, seamless connectivity, and plug-and-play capability. Furthermore, such a system can also contribute to increased dependability and resource optimization due to its inherent ability to adjust itself dynamically to changes in software, hardware resources, and environmental conditions. This paper describes the architectural approach to achieving the goals of dynamically self-configuring automotive embedded electronic systems by the EU research project DySCAS. The architecture solution outlined in this paper captures the application and operational contexts, expected features, middleware services, functions and behaviours, as well as the basic mechanisms and technologies. The paper also covers the architecture conceptualization by presenting the rationale concerning the architecture structuring, control principles, and deployment concept. In this paper, we also present the adopted architecture V&V strategy and discuss some open issues with regard to industrial acceptance.