30 results for grid code
Abstract:
This paper describes an interactive parallelisation toolkit that can be used to generate parallel code suitable for either a distributed memory system (using message passing) or a shared memory system (using OpenMP). This study focuses on how the toolkit is used to parallelise a complex heterogeneous ocean modelling code within a few hours for use on a shared memory parallel system. The generated parallel code is essentially the serial code with OpenMP directives added to express the parallelism. The results show that substantial gains in performance can be achieved over the single thread version with very little effort.
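The transformation the abstract describes leaves the serial loop body intact and expresses the parallelism around it. A rough stand-in for that pattern, in Python rather than the OpenMP-annotated Fortran/C the toolkit actually generates (the function and data below are hypothetical, for illustration only):

```python
from concurrent.futures import ThreadPoolExecutor

def column_update(column):
    # Stand-in for the per-grid-column work of an ocean-model time step.
    return sum(x * x for x in column)

columns = [[float(i + j) for j in range(100)] for i in range(8)]

# Serial version: a plain loop over grid columns.
serial = [column_update(c) for c in columns]

# Parallel version: the loop body is unchanged; only the loop driver
# changes, much as an OpenMP "parallel do" directive wraps the serial
# statements without rewriting them.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(column_update, columns))
```

As in the paper's OpenMP case, correctness reduces to checking that the parallel loop reproduces the single-thread results.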
Abstract:
This chapter discusses the code parallelization environment, in which a number of tools addressing the main tasks, such as code parallelization, debugging, and optimization, are available. The parallelization tools include ParaWise and CAPO, which enable the near-automatic parallelization of real-world scientific application codes for shared- and distributed-memory parallel systems. The chapter discusses the use of ParaWise and CAPO to transform the original serial code into an equivalent parallel code that contains appropriate OpenMP directives. Additionally, as user involvement can introduce errors, a relative debugging tool (P2d2) is also available and can be used to perform near-automatic relative debugging of an OpenMP program that has been parallelized either using the tools or manually. For these tools to be effective in parallelizing a range of applications, a high-quality, fully interprocedural dependence analysis, together with user interaction, is vital both to the generation of efficient parallel code and to the optimization of the backtracking and speculation process used in relative debugging. Results of parallelized NASA codes are discussed and show the benefits of using the environment.
Abstract:
Despite the apparent simplicity of the OpenMP directive shared memory programming model and the sophisticated dependence analysis and code generation capabilities of the ParaWise/CAPO tools, experience shows that a level of expertise is required to produce efficient parallel code. In a real-world application, the investigation of a single loop in a generated parallel code can soon become an in-depth inspection of numerous dependencies in many routines. An additional understanding of dependencies is also needed to effectively interpret the information provided and supply the required feedback. The ParaWise Expert Assistant has been developed to automate this investigation and present questions to the user about, and in the context of, their application code. In this paper, we demonstrate that knowledge of dependence information and OpenMP is no longer essential to produce efficient parallel code with the Expert Assistant. It is hoped that this will enable a far wider audience to use the tools and, subsequently, exploit the benefits of large parallel systems.
Abstract:
Code parallelization using OpenMP for shared memory systems is relatively easier than using message passing for distributed memory systems. Despite this, it is still a challenge to use OpenMP to parallelize application codes in a way that yields effective scalable performance when executed on a shared memory parallel system. We describe an environment that assists the programmer in the various tasks of code parallelization, greatly reducing both the time frame and the level of skill required. The parallelization environment includes a number of tools that address the main tasks of parallelism detection, OpenMP source code generation, debugging, and optimization. These tools include a high-quality, fully interprocedural dependence analysis with user interaction capabilities to facilitate the generation of efficient parallel code, an automatic relative debugging tool to identify erroneous user decisions in that interaction, and performance profiling to identify bottlenecks. Finally, experiences of parallelizing some NASA application codes are presented to illustrate some of the benefits of using the evolving environment.
Abstract:
This paper presents an investigation into dynamic self-adjustment of task deployment and other aspects of self-management, through the embedding of multiple policies. Non-dedicated, loosely coupled computing environments, such as clusters and grids, are increasingly popular platforms for parallel processing. These abundant systems are highly dynamic environments in which many sources of variability affect the run-time efficiency of tasks. The dynamism is exacerbated by the incorporation of mobile devices and wireless communication. This paper proposes an adaptive strategy for the flexible run-time deployment of tasks, to continuously maintain efficiency despite the environmental variability. The strategy centres on policy-based scheduling which is informed by contextual and environmental inputs, such as variance in the round-trip communication time between a client and its workers and the effective processing performance of each worker. A self-management framework has been implemented for evaluation purposes. The framework integrates several policy-controlled, adaptive services with the application code, enabling the run-time behaviour to be adapted to contextual and environmental conditions. Using this framework, an exemplar self-managing parallel application is implemented and used to investigate the extent of the benefits of the strategy.
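Policy-based scheduling of the kind described, informed by round-trip time and effective worker performance, can be sketched as follows. The scoring formula, field names, and worker data here are hypothetical illustrations, not the paper's actual policy:

```python
# Hypothetical policy: score each worker from two contextual inputs the
# abstract names -- round-trip time (rtt, seconds) to the worker and its
# effective processing rate (rate, tasks/second) -- then deploy the task
# to the best-scoring worker.
def score(worker):
    # Higher processing rate is better; higher communication RTT is worse.
    return worker["rate"] / (1.0 + worker["rtt"])

def place_task(workers):
    return max(workers, key=score)["name"]

workers = [
    {"name": "fast_far",  "rate": 10.0, "rtt": 0.9},
    {"name": "slow_near", "rate": 4.0,  "rtt": 0.1},
    {"name": "mid",       "rate": 6.0,  "rtt": 0.4},
]
```

A run-time policy of this shape can be re-evaluated as the contextual inputs change, which is the essence of the adaptive deployment the paper investigates.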
Abstract:
The problem examined here is the fluctuating pressure distribution along the open cavity of the sun-roof at the top of a car compartment due to gusts passing over the sun-roof. The aim of this test is to investigate the capability of a typical commercial CFD package, PHOENICS, in recognising pressure fluctuations occurring in an important automotive industrial problem, and in particular to examine the accuracy of transporting pulsatory gusts travelling along the main flow through the use of finite volume methods with higher-order schemes in the numerical solution of the unsteady compressible Navier-Stokes equations. The Helmholtz equation is used to solve for the sound distribution inside the car compartment resulting from the externally induced fluctuations.
Abstract:
Unstructured grid meshes used in most commercial CFD codes inevitably adopt collocated variable solution schemes. These schemes have several shortcomings, mainly due to the interpolation of the pressure gradient, which lead to slow convergence. In this publication we show how it is possible to use a much more stable staggered mesh arrangement in an unstructured code. Several alternative groupings of variables are investigated in a search for the optimum scheme.
Abstract:
The Sahara desert is a significant source of particulate pollution not only to the Mediterranean region, but also to the Atlantic and beyond. In this paper, PM10 exceedances recorded in the UK and on the island of Crete are studied and their source investigated, using Lagrangian Particle Dispersion (LPD) methods. Forward and inverse simulations identify Saharan dust storms as the primary source of these episodes. The methodology used allows comparison between this primary source and other possible candidates, for example large forest fires or volcanic eruptions. Two LPD models are used in the simulations, namely the open source code FLEXPART and the proprietary code HYSPLIT. Driven by the same meteorological fields (the ECMWF MARS archive and the PSU/NCAR Mesoscale model, known as MM5), the codes produce similar, but not identical, predictions. This inter-model comparison enables a critical assessment of the physical modelling assumptions employed in each code, plus the influence of boundary conditions and solution grid density. The outputs, in the form of particle concentrations evolving in time, are compared against satellite images and receptor data from multiple ground-based sites. Quantitative comparisons are good, especially in predicting the time of arrival of the dust plume at a particular location.
Abstract:
The effect of a high electric current density on the interfacial reactions of micro ball grid array solder joints was studied at room temperature and at 150 °C. Four types of phenomena were reported. Along with electromigration-induced interfacial intermetallic compound (IMC) formation, dissolution at the Cu under bump metallization (UBM)/bond pad was also noticed. A detailed investigation found that the narrow and thin metallization at the component side produced Joule heating due to its higher resistance, which in turn was responsible for the rapid dissolution of the Cu UBM/bond pad near the Cu trace. During an electromigration test of a solder joint, the heat generation due to Joule heating and the heat dissipation from the package should be considered carefully. When the heat dissipation fails to compete with the Joule heating, the solder joint melts and the molten solder accelerates the interfacial reactions in the joint. The presence of a liquid phase was demonstrated from microstructural evidence of solder joints after different current stressing conditions (ranging from 0.3 to 2 A), as well as by in situ observation. Electromigration-induced liquid-state diffusion of Cu was found to be responsible for the higher growth rate of the IMC on the anode side.
Abstract:
This paper evaluates the shearing behavior of ball grid array (BGA) solder joints on Au/Ni/Cu pads of FR4 substrates after multiple reflow soldering. A new Pb-free solder, Sn–3Ag–0.5Cu–8In (SACI), has been compared with Sn–3Ag–0.5Cu (SAC) and Sn–37Pb (SP) solders, in terms of fracture surfaces, shearing forces and microstructures. Three failure modes, ball cut, a combination of solder shear and solder/pad bond separation, and pad lift, are assessed for the different solders and reflow cycles. It is found that the shearing forces of the SP and SAC solder joints tend to increase slightly with an increase in the number of reflow cycles due to diffusion-induced solid solution strengthening of the bulk solder and augmentation of the shearing area. However, the shearing forces of the SACI solder joints decrease slightly after four cycles of reflow, which is ascribed to the thermal degradation of both the solder/intermetallic compound (IMC) and IMC/Ni interfaces. The SACI solder joints yield the highest strengths, whereas the SP solder joints give the smallest values, irrespective of the number of reflow cycles. Thickening of the interfacial IMC layer and coarsening of the dispersed IMC particles within the bulk solders were also observed. Nevertheless, the variation of shearing forces and IMC thickness with different numbers of reflow cycles was not so significant, since the Ni underlayer acted as an effective diffusion barrier. In addition, the initially formed IMC layer retarded further extensive dissolution of the pad material and its interaction with the solder.
Abstract:
The ball shear test is the most common test method used to assess the reliability of bond strength for ball grid array (BGA) packages. In this work, a combined experimental and numerical study was carried out to understand BGA solder interface strength. Solder-mask-defined bond pads on the BGA substrate were used for BGA ball bonding. Different bond pad metallizations and solder alloys were used. Solid state aging at 150 °C for up to 1000 h was carried out to change the interfacial microstructure. Cross-sectional studies of the solder-to-bond pad interfaces were conducted by scanning electron microscopy (SEM) equipped with an energy dispersive X-ray (EDX) analyzer to investigate the interfacial reaction phenomena. Ball shear tests were carried out to obtain the mechanical strength of the solder joints and to correlate shear behaviour with the interfacial reaction products. An attempt was made to reproduce the experimental findings by finite element analysis (FEA). It was found that intermetallic compound (IMC) formation at the solder interface plays an important role in BGA solder bond strength. By changing the morphology and the microchemistry of the IMCs, the fracture propagation path could be changed and, hence, reliability could be improved.
Abstract:
Image inpainting refers to restoring a damaged image with missing information. The total variation (TV) inpainting model is one such method that simultaneously fills in the missing regions with available information from their surroundings and eliminates noise. The method works well with small, narrow inpainting domains. However, there remains an urgent need to develop fast iterative solvers, as the underlying problem sizes are large. In addition, one needs to tackle the imbalance of results between inpainting and denoising: when the inpainting regions are thick and large, the inpainting procedure works quite slowly, usually requires a significant number of iterations, and inevitably leads to oversmoothing outside the inpainting domain. To overcome these difficulties, we propose a solution for the TV inpainting method based on a nonlinear multigrid algorithm.
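The paper's contribution is a nonlinear multigrid solver; as a much smaller illustration of what TV inpainting computes, here is a plain gradient-descent sketch on a 1-D signal with a smoothed TV term (the signal, smoothing parameter, step size, and iteration count are all illustrative, and no multigrid acceleration is attempted):

```python
import math

# Known samples: 0 on the left, 1 on the right; indices 3-5 are the
# damaged (inpainting) region, initialised to an arbitrary 0.5.
u = [0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0]
unknown = [3, 4, 5]
eps, lr = 1e-6, 0.05  # TV smoothing and descent step (illustrative)

# Minimise sum_i sqrt((u[i+1]-u[i])^2 + eps) over the unknown samples,
# a smoothed total-variation energy; known samples stay fixed.
for _ in range(2000):
    grad = {}
    for j in unknown:
        d_left = u[j] - u[j - 1]
        d_right = u[j + 1] - u[j]
        grad[j] = (d_left / math.sqrt(d_left ** 2 + eps)
                   - d_right / math.sqrt(d_right ** 2 + eps))
    for j in unknown:
        u[j] -= lr * grad[j]
```

The descent fills the gap with a monotone ramp between the known values, while the known samples are untouched. The slow convergence of exactly this kind of pointwise iteration on large regions is what motivates the paper's multigrid approach.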
Abstract:
A zone-based systems design framework is described and utilised in the implementation of a message authentication code (MAC) algorithm based on symmetric key block ciphers. The resulting block cipher based MAC algorithm may be used to provide assurance of the authenticity and, hence, the integrity of binary data. Using software simulation to benchmark against the de facto cipher block chaining MAC (CBC-MAC) variant used in the TinySec security protocol for wireless sensor networks and the NIST cipher block chaining MAC standard, CMAC, we show that our zone-based systems design framework can lead to block cipher based MAC constructs that offer improvements in message processing efficiency, processing throughput and processing latency.
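The CBC-MAC construction used as the benchmark chains block-cipher encryptions of the message, with each ciphertext block feeding into the next. A minimal sketch of that chaining (this is not the paper's zone-based construct, and since a real AES block cipher is outside Python's standard library, a keyed SHA-256 truncation stands in for the encryption E_K):

```python
import hashlib

BLOCK = 16  # byte width of one cipher block

def toy_prf(key: bytes, block: bytes) -> bytes:
    # Stand-in for one block-cipher encryption E_K(block); a real
    # CBC-MAC (or NIST CMAC) would use AES here.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def cbc_mac(key: bytes, message: bytes) -> bytes:
    # Zero-pad to whole blocks. (Plain zero-padding with a fixed-zero IV
    # is insecure for variable-length messages in practice; CMAC exists
    # precisely to fix such weaknesses of raw CBC-MAC.)
    if len(message) % BLOCK:
        message += b"\x00" * (BLOCK - len(message) % BLOCK)
    state = b"\x00" * BLOCK  # CBC-MAC fixes the IV at zero
    for i in range(0, len(message), BLOCK):
        block = message[i:i + BLOCK]
        # XOR the running state into the block, then "encrypt".
        state = toy_prf(key, bytes(a ^ b for a, b in zip(state, block)))
    return state  # the final chaining value is the tag
```

Verification recomputes the tag with the shared key and compares; any change to the message or key yields a different tag, which is the authenticity and integrity assurance the abstract refers to.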
Abstract:
Zaha Hadid's Kartal Pendik Masterplan (2006) for a new city centre on the east bank of Istanbul proposes the redevelopment of an abandoned industrial site located in a crucial infrastructural node between Europe and Asia as a connecting system between the neighbouring areas of Kartal in the west and Pendik in the east. The project is organised on what its architects call a soft grid, a flexible and adaptable grid that allows it to articulate connections and differences of form, density and use within the same spatial structure [1]. Its final overall design constitutes only one of the many possible configurations that the project may take in response to the demands of the different areas included in the masterplan, and is produced from a script that is able to generate both built volumes and open spaces, skyscrapers as well as parks. The soft grid in fact produces a ‘becoming’ rather than a finite and definitive form: its surface space does not look like a grid, but is derived from a grid operation which is best explained by the project presentation in video animation. The grid here is a process of ‘gridding’, enacted according to ancient choreographed linear movements of measuring, defining, adjusting, reconnecting spaces through an articulated surface rather than superimposed on an ignored given like an indifferent colonising carpet.