92 resultados para Parallel Computations
Resumo:
The Danish Eulerian Model (DEM) is a powerful air pollution model, designed to calculate the concentrations of various dangerous species over a large geographical region (e.g. Europe). It takes into account the main physical and chemical processes between these species, the actual meteorological conditions, emissions, etc.. This is a huge computational task and requires significant resources of storage and CPU time. Parallel computing is essential for the efficient practical use of the model. Some efficient parallel versions of the model were created over the past several years. A suitable parallel version of DEM by using the Message Passing Interface library (AIPI) was implemented on two powerful supercomputers of the EPCC - Edinburgh, available via the HPC-Europa programme for transnational access to research infrastructures in EC: a Sun Fire E15K and an IBM HPCx cluster. Although the implementation is in principal, the same for both supercomputers, few modifications had to be done for successful porting of the code on the IBM HPCx cluster. Performance analysis and parallel optimization was done next. Results from bench marking experiments will be presented in this paper. Another set of experiments was carried out in order to investigate the sensitivity of the model to variation of some chemical rate constants in the chemical submodel. Certain modifications of the code were necessary to be done in accordance with this task. The obtained results will be used for further sensitivity analysis Studies by using Monte Carlo simulation.
Resumo:
In any data mining applications, automated text and text and image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on the latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use Monte Carlo method to sample and to build much smaller size term-by-document matrix (e.g. we build k x k matrix) from where we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k"-largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are running on a cluster of workstations under MPI and results of the experiments arising in textual retrieval of Web documents as well as comparison of the stochastic methods proposed are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
Resumo:
In models of complicated physical-chemical processes operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The recently rediscovered weighted splitting schemes have the great advantage of being parallelizable on operator level, which allows us to reduce the computational time if parallel computers are used. In this paper, the computational times needed for the weighted splitting methods are studied in comparison with the sequential (S) splitting and the Marchuk-Strang (MSt) splitting and are illustrated by numerical experiments performed by use of simplified versions of the Danish Eulerian model (DEM).
Resumo:
Large scale air pollution models are powerful tools, designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where the pollutants can be moved quickly on far distnce. Therefore the air pollution modeling must be done in a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterian and neighboring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.
Resumo:
In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi-threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full-scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread-safe Java messaging system—MPJ Express. The first application is the Gadget-2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite-domain time-difference method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.
Resumo:
As consumers demand more functionality) from their electronic devices and manufacturers supply the demand then electrical power and clock requirements tend to increase, however reassessing system architecture can fortunately lead to suitable counter reductions. To maintain low clock rates and therefore reduce electrical power, this paper presents a parallel convolutional coder for the transmit side in many wireless consumer devices. The coder accepts a parallel data input and directly computes punctured convolutional codes without the need for a separate puncturing operation while the coded bits are available at the output of the coder in a parallel fashion. Also as the computation is in parallel then the coder can be clocked at 7 times slower than the conventional shift-register based convolutional coder (using DVB 7/8 rate). The presented coder is directly relevant to the design of modern low-power consumer devices
Resumo:
The work reported in this paper is motivated by the fact that there is a need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on ‘Intelligent agents’ another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.