835 results for Parallel version


Relevance:

70.00%

Abstract:

The Danish Eulerian Model (DEM) is a powerful air pollution model designed to calculate the concentrations of various dangerous species over a large geographical region (e.g. Europe). It takes into account the main physical and chemical processes between these species, the actual meteorological conditions, emissions, etc. This is a huge computational task that requires significant storage and CPU time, so parallel computing is essential for the efficient practical use of the model. Several efficient parallel versions of the model have been created over the past few years. A suitable parallel version of DEM, using the Message Passing Interface (MPI) library, was implemented on two powerful supercomputers of EPCC, Edinburgh, available via the HPC-Europa programme for transnational access to research infrastructures in the EC: a Sun Fire E15K and an IBM HPCx cluster. Although the implementation is, in principle, the same for both supercomputers, a few modifications had to be made to port the code successfully to the IBM HPCx cluster. Performance analysis and parallel optimisation were then carried out, and results from benchmarking experiments are presented in this paper. Another set of experiments was carried out to investigate the sensitivity of the model to variations of some chemical rate constants in the chemical submodel; certain modifications of the code were necessary for this task. The results obtained will be used for further sensitivity analysis studies using Monte Carlo simulation.
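
A minimal sketch of the kind of MPI domain decomposition such a model relies on, assuming Python with mpi4py and a placeholder transport/chemistry step (the grid size, species count and physics below are invented; the actual DEM code is Fortran with the MPI library):

```python
# Minimal MPI domain-decomposition sketch (assumes mpi4py and NumPy; the grid
# size, species count and physics step below are placeholders, not DEM's).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

NX, NSPECIES = 9600, 35                 # hypothetical grid columns / chemical species
local_nx = NX // size                   # columns owned by this process (assumes divisibility)
conc = np.zeros((local_nx, NSPECIES))   # local concentration sub-field

def advect_and_react(c):
    """Placeholder for the transport/chemistry step on one subdomain."""
    return c * 0.999 + 1e-6

halo = np.empty(NSPECIES)
left, right = (rank - 1) % size, (rank + 1) % size
for _ in range(10):
    conc = advect_and_react(conc)
    # Exchange a boundary column with the neighbouring subdomains (ring topology);
    # the received halo would feed the neighbour's boundary condition (omitted here).
    comm.Sendrecv(np.ascontiguousarray(conc[0]), dest=left,
                  recvbuf=halo, source=right)

total_mass = comm.allreduce(conc.sum(), op=MPI.SUM)   # global diagnostic
if rank == 0:
    print("total mass:", total_mass)
```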

Relevance:

70.00%

Abstract:

Cutting and packing problems arise in a variety of industries, including the garment, wood and shipbuilding industries. Irregular shape packing is a special case which admits irregular items and is much more complex because of the geometry of the items. To ensure that items do not overlap and that no item protrudes from the container, the collision-free region concept is adopted: it represents all feasible translations for a new item to be inserted into a container with already placed items. To construct a feasible layout, the collision-free region for each item is determined through a sequence of Boolean operations over polygons. To improve the speed of the algorithm, a parallel version of the layout construction is proposed and applied within a simulated annealing algorithm used to solve bin packing problems. Tests were performed to determine the speed improvement of the parallel version over the serial algorithm.
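
A rough sketch of how such a layout construction could be parallelised: per-item forbidden regions are computed concurrently, then combined with Boolean operations over polygons. This assumes the shapely library and a deliberately crude forbidden_region; it is not the authors' geometry code:

```python
# Parallel construction of a collision-free region (assumes shapely and the
# standard-library multiprocessing module; forbidden_region is a toy stand-in).
from multiprocessing import Pool
from shapely.geometry import box

container = box(0, 0, 100, 100)                  # hypothetical rectangular container

def forbidden_region(placed_item):
    """Translations of the new item's reference point that would collide with
    one placed item; here crudely approximated by a dilated bounding box."""
    minx, miny, maxx, maxy = placed_item.bounds
    return box(minx - 5, miny - 5, maxx + 5, maxy + 5)

def collision_free_region(placed_items):
    with Pool() as pool:                         # per-item regions computed in parallel
        forbidden = pool.map(forbidden_region, placed_items)
    region = container
    for f in forbidden:                          # Boolean differences over polygons
        region = region.difference(f)
    return region

if __name__ == "__main__":
    placed = [box(10, 10, 30, 30), box(50, 40, 70, 80)]
    print(collision_free_region(placed).area)
```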

Relevance:

70.00%

Abstract:

The Lattice Solid Model has been used successfully as a virtual laboratory to simulate the fracturing of rocks, the dynamics of faults, earthquakes and gouge processes. However, results from those simulations show that, in order to take the next step towards more realistic experiments, it will be necessary to use models containing a significantly larger number of particles than current models. Such simulations will therefore require a greatly increased amount of computational resources. Whereas the computing power provided by single processors can be expected to increase according to Moore's law, i.e. to double every 18-24 months, parallel computers can provide significantly larger computing power today. In order to make this computing power available for the simulation of the microphysics of earthquakes, a parallel version of the Lattice Solid Model has been implemented. Benchmarks using large models with several million particles have shown that the parallel implementation of the Lattice Solid Model can achieve a high parallel efficiency of about 80% for large numbers of processors on different computer architectures.
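
For context, the parallel-efficiency figure quoted above is the standard ratio of observed speedup to processor count; a brief reminder of the definitions, with T_1 the one-processor time and T_p the time on p processors:

```latex
S_p = \frac{T_1}{T_p}, \qquad
E_p = \frac{S_p}{p} = \frac{T_1}{p\,T_p} \approx 0.8 \quad \text{(the reported value)}
```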

Relevance:

60.00%

Abstract:

CXTANNEAL is a program for analysing contaminant transport in soils. The code, written in Fortran 77, is a modified version of CXTFIT, a commonly used package for estimating solute transport parameters in soils. The improvement in the present code is that it uses simulated annealing as the optimisation technique for curve fitting. Tests with hypothetical data show that CXTANNEAL performs better than the original code in searching for optimal parameter estimates. To reduce the computational time, a parallel version of CXTANNEAL (CXTANNEAL_P) was also developed.
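
A compact sketch of simulated annealing used for curve fitting, with a toy exponential model in plain Python rather than the Fortran 77 CXTFIT solute-transport model:

```python
# Minimal simulated-annealing curve-fitting sketch (illustrative; the real
# CXTANNEAL fits the CXTFIT solute-transport model, not this toy exponential).
import math, random

def model(t, params):
    a, k = params
    return a * math.exp(-k * t)                      # hypothetical stand-in model

def sse(params, data):
    return sum((c - model(t, params)) ** 2 for t, c in data)

def anneal(data, start, t0=1.0, cooling=0.995, steps=20000):
    current, best = list(start), list(start)
    e_cur = e_best = sse(current, data)
    temp = t0
    for _ in range(steps):
        cand = [p + random.gauss(0, 0.05) for p in current]
        e_cand = sse(cand, data)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_cand < e_cur or random.random() < math.exp(-(e_cand - e_cur) / temp):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        temp *= cooling
    return best, e_best

data = [(t, 2.0 * math.exp(-0.3 * t)) for t in range(10)]    # synthetic observations
print(anneal(data, start=[1.0, 1.0]))
```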

Relevance:

60.00%

Abstract:

Dissertation submitted to obtain the degree of Master in Biomedical Engineering.

Relevance:

60.00%

Abstract:

Background: Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms, and its purpose is to reduce the execution time of large-scale sequence alignments. It can be run on distributed-memory clusters, allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results: In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed with Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications, and the approach described here is generic enough to be applied to other applications or to deploy PTC on different HPC environments. Conclusions: The PTC portal allows users to upload large numbers of sequences, to be aligned by the parallel version of T-Coffee, that cannot be aligned on a single machine due to memory and execution-time constraints. The web portal provides a user-friendly solution.

Relevance:

60.00%

Abstract:

The purpose of this research is to draw up a clear construction of an anticipatory communicative decision-making process and a successful implementation of a Bayesian application that can be used as an anticipatory communicative decision-making support system. The study is a decision-oriented and constructive research project, and it includes examples of simulated situations. As a basis for further methodological discussion about different approaches to management research, a decision-oriented approach is used here; it is based on mathematics and logic and is intended to develop problem-solving methods. The approach is theoretical and characteristic of normative management science research. The approach of this study is also constructive, and an essential part of the constructive approach is to tie the problem to its solution with theoretical knowledge. Firstly, the basic definitions and behaviours of anticipatory management and managerial communication are provided. These descriptions include discussions of the research environment and the management processes formed, which define and explain the background to the further research. Secondly, the study proceeds to managerial communication and anticipatory decision-making based on preparation, problem solving and solution search, which are also related to risk management analysis. After that, a solution for the decision-making support application is formed using four different Bayesian methods: the Bayesian network, the influence diagram, the qualitative probabilistic network and the time-critical dynamic network. The purpose of the discussion is not to compare different theories but to explain the theories that are being implemented. Finally, an application of Bayesian networks to the research problem is presented, and the usefulness of the prepared model in examining the problem is demonstrated through the reported results. The theoretical contribution includes definitions and a model of anticipatory decision-making; the main theoretical contribution of this study is a process for anticipatory decision-making that combines management with communication, problem solving and the improvement of knowledge. The practical contribution is a Bayesian Decision Support Model based on Bayesian influence diagrams. The main contributions of this research are thus two developed processes: one for anticipatory decision-making and one for producing a Bayesian network model for anticipatory decision-making. In summary, this research contributes to decision-making support by being one of the few publicly available academic descriptions of an anticipatory decision support system, by presenting a Bayesian model grounded in firm theoretical discussion, by publishing algorithms suitable for decision-making support, and by defining the idea of anticipatory decision-making for a parallel version. Finally, based on the results of the research, an analysis of anticipatory management for planned decision-making is presented, built on observation of the environment, analysis of weak signals, and alternatives for creative problem solving and communication.
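
As a tiny, concrete illustration of the Bayesian machinery involved, the hand-rolled two-node example below (hypothetical probabilities, unrelated to the thesis's actual networks) shows the kind of posterior update such a support system performs when a risk event is observed:

```python
# Tiny illustrative Bayesian calculation by enumeration over two binary nodes
# (hypothetical numbers; not the thesis's decision model).
p_signal = {True: 0.2, False: 0.8}                 # P(weak signal present)
p_risk_given_signal = {True: 0.7, False: 0.1}      # P(risk event | signal)

def posterior_signal_given_risk():
    # Bayes' rule: P(signal | risk) = P(risk | signal) P(signal) / P(risk)
    joint = {s: p_risk_given_signal[s] * p_signal[s] for s in (True, False)}
    p_risk = sum(joint.values())
    return joint[True] / p_risk

print(f"P(signal | risk) = {posterior_signal_given_risk():.3f}")
```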

Relevance:

60.00%

Abstract:

We develop an algorithm that computes the gravitational potentials and forces on N point-masses interacting in three-dimensional space. The algorithm, based on analytical techniques developed by Rokhlin and Greengard, runs in O(N) time. In contrast to other fast N-body methods such as tree codes, which only approximate the interaction potentials and forces, this method is exact: it computes the potentials and forces to within any prespecified tolerance, up to machine precision. We present an implementation of the algorithm for a sequential machine, verify it numerically, and compare its speed with that of an O(N²) direct force computation. We also describe a parallel version of the algorithm that runs on the Connection Machine in O(log N) time. We compare experimental results with those of the sequential implementation and discuss how to minimize communication overhead on the parallel machine.
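
The O(N²) direct computation used for verification and speed comparison is straightforward to state; a sketch follows (the softening term eps is an assumption added to avoid division by zero, not part of the paper):

```python
# Direct O(N^2) gravitational force computation: the brute-force baseline that
# a fast, order-N multipole method is verified against (illustrative sketch).
import numpy as np

def direct_forces(pos, mass, G=1.0, eps=1e-9):
    """pos: (N, 3) positions, mass: (N,) masses; returns (N, 3) forces."""
    n = pos.shape[0]
    forces = np.zeros_like(pos)
    for i in range(n):
        d = pos - pos[i]                          # vectors from body i to all bodies
        r2 = (d ** 2).sum(axis=1) + eps           # softened squared distances
        r2[i] = np.inf                            # exclude self-interaction
        forces[i] = G * mass[i] * ((mass[:, None] * d) / r2[:, None] ** 1.5).sum(axis=0)
    return forces

pos = np.random.rand(100, 3)
mass = np.ones(100)
print(direct_forces(pos, mass)[0])
```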

Relevance:

60.00%

Abstract:

The increasing demand for cheaper, faster and better services anytime and anywhere has made radio network optimisation much more complex than ever before. In order to optimise the serving network dynamically, Dynamic Network Optimisation (DNO) is proposed as the ultimate solution and future trend. The realisation of DNO, however, has been hindered by a significant bottleneck: optimisation speed as network complexity grows. This paper presents a multi-threaded parallel solution to accelerate complicated proprietary network optimisation algorithms under a rigid condition of numerical consistency. The ariesoACP product from Arieso Ltd serves as the platform for parallelisation. The parallel solution has been benchmarked, and the results exhibit high scalability and a run-time reduction of 11% to 42%, depending on the technology, subscriber density and blocking rate of a given network, in comparison with the original version. Further, it is essential that the parallel version produce equivalent optimisation quality, in the form of identical optimisation outputs.
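
One common way to obtain that kind of numerical consistency is to run the independent evaluations concurrently but keep the floating-point reduction in a fixed order; a generic sketch with a hypothetical evaluate_cell cost function (not Arieso's algorithms):

```python
# Parallel evaluation with a deterministic, fixed-order reduction so that the
# floating-point result matches the serial code exactly (toy cost function).
from concurrent.futures import ProcessPoolExecutor

def evaluate_cell(cell_id):
    """Stand-in for a per-cell optimisation cost evaluation."""
    return 0.001 * cell_id ** 0.5

def total_cost(cell_ids, workers=4):
    with ProcessPoolExecutor(max_workers=workers) as ex:
        partials = list(ex.map(evaluate_cell, cell_ids))   # results kept in input order
    total = 0.0
    for p in partials:           # fixed-order summation => deterministic rounding
        total += p
    return total

if __name__ == "__main__":
    print(total_cost(range(10000)))
```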

Relevance:

60.00%

Abstract:

The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as in commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers, which are themselves affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism does not scale well to large training data. This paper presents a thorough discussion of Random Prism and of a recently proposed parallel version of it called Parallel Random Prism, which is based on the MapReduce programming paradigm. The paper provides, for the first time, a novel theoretical analysis of the proposed technique and an in-depth experimental study showing that Parallel Random Prism scales well with a large number of training examples, a large number of data features and a large number of processors. The expressiveness of the decision rules that the technique produces makes it a natural choice for Big Data applications where informed decision making increases the user's trust in the system.
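
A MapReduce-flavoured toy of the ensemble idea: the "map" phase learns one rule set per bootstrapped partition in parallel worker processes, and the "reduce" phase combines predictions by majority vote. The one-rule learner below is a simplistic stand-in, not the Prism algorithm:

```python
# MapReduce-style ensemble sketch (toy one-rule learner; illustrative only).
import random
from collections import Counter
from multiprocessing import Pool

def learn_rule(partition):
    """Toy learner: a single threshold rule on the first feature."""
    xs = sorted(x[0] for x, _ in partition)
    threshold = xs[len(xs) // 2]
    above = Counter(y for x, y in partition if x[0] > threshold)
    below = Counter(y for x, y in partition if x[0] <= threshold)
    hi = above.most_common(1)[0][0] if above else below.most_common(1)[0][0]
    lo = below.most_common(1)[0][0]
    return threshold, hi, lo                     # a picklable "rule set"

def predict(rules, x):
    votes = Counter(hi if x[0] > t else lo for t, hi, lo in rules)
    return votes.most_common(1)[0][0]            # reduce phase: majority vote

if __name__ == "__main__":
    data = [((i,), "pos" if i > 5 else "neg") for i in range(10)]
    samples = [[random.choice(data) for _ in data] for _ in range(8)]   # bootstraps
    with Pool() as pool:
        rules = pool.map(learn_rule, samples)    # map phase over data partitions
    print(predict(rules, (7,)))
```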

Relevance:

60.00%

Abstract:

A bipartite graph G = (V, W, E) is convex if there exists an ordering of the vertices of W such that, for each v ∈ V, the neighbors of v are consecutive in W. We describe both a sequential and a BSP/CGM algorithm to find a maximum independent set in a convex bipartite graph. The sequential algorithm improves on the running time of the previously known algorithm, and the BSP/CGM algorithm is a parallel version of the sequential one. The complexity of the algorithms does not depend on |W|.
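
Although the paper's own algorithm is not reproduced here, the convexity property makes a greedy (Glover-style) maximum matching easy to compute, and the size of a maximum independent set then follows from König's theorem; a sketch:

```python
# For a convex bipartite graph: Glover's greedy maximum matching, from which
# |maximum independent set| = |V| + |W| - |maximum matching| by Koenig's theorem.
# This is a standard textbook approach, not the paper's algorithm.
import heapq

def max_matching(intervals, m):
    """intervals[v] = (l, r): neighbours of v are w = l..r (1-based) in W of size m."""
    by_left = sorted(intervals)
    heap, i, matched = [], 0, 0
    for w in range(1, m + 1):
        while i < len(by_left) and by_left[i][0] <= w:
            heapq.heappush(heap, by_left[i][1])   # candidate v, keyed by right endpoint
            i += 1
        while heap and heap[0] < w:               # drop vertices whose interval ended
            heapq.heappop(heap)
        if heap:
            heapq.heappop(heap)                   # match w to the v with smallest r
            matched += 1
    return matched

intervals = [(1, 2), (1, 3), (2, 2)]              # example convex neighbourhoods
m = 3
print(len(intervals) + m - max_matching(intervals, m))   # maximum independent set size
```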

Relevance:

60.00%

Abstract:

With the growth of energy consumption worldwide, conventional reservoirs, those considered easy to explore and produce, are no longer meeting global energy demand. This has led many researchers to develop projects that address these needs, and companies in the oil sector have invested in techniques that help in locating and drilling wells. One of the techniques employed in the oil exploration process is Reverse Time Migration (RTM), a seismic imaging method that produces excellent images of the subsurface. It is an algorithm based on the solution of the wave equation and is considered one of the most advanced seismic imaging techniques. The economic value of the oil reserves that require RTM to be located is very high, which makes the development of these algorithms a competitive differentiator for seismic processing companies. However, RTM requires great computational power, which still hampers its practical use. The objective of this work is to explore the implementation of this algorithm on unconventional architectures, specifically GPUs using CUDA, analysing the difficulties of developing it as well as the performance of the algorithm in its sequential and parallel versions.
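
At the core of RTM is repeated time stepping of the wave equation. Below is a toy sequential 2-D acoustic finite-difference step in NumPy (grid size, velocity and source are made-up values); a CUDA version would typically map each grid point of such a stencil to a GPU thread:

```python
# Toy 2-D acoustic wave-equation time stepping with a second-order finite-
# difference stencil, the kind of kernel RTM evaluates repeatedly (illustrative).
import numpy as np

nx, nz, nt = 200, 200, 300
dx, dt, c = 10.0, 0.001, 2000.0               # spacing (m), time step (s), velocity (m/s)
prev = np.zeros((nz, nx))
curr = np.zeros((nz, nx))
curr[nz // 2, nx // 2] = 1.0                  # point source at the centre of the grid

for _ in range(nt):
    lap = np.zeros_like(curr)
    lap[1:-1, 1:-1] = (curr[2:, 1:-1] + curr[:-2, 1:-1] +
                       curr[1:-1, 2:] + curr[1:-1, :-2] -
                       4.0 * curr[1:-1, 1:-1]) / dx ** 2
    nxt = 2.0 * curr - prev + (c * dt) ** 2 * lap     # u(t+dt) from u(t) and u(t-dt)
    prev, curr = curr, nxt

print(curr.max())
```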

Relevance:

60.00%

Abstract:

This paper analyzes the performance of a parallel implementation of Coupled Simulated Annealing (CSA) for the unconstrained optimization of continuous-variable problems. Parallel processing is an efficient form of information processing that emphasizes the exploitation of simultaneous events during the execution of software; it arises primarily from the demand for high computational performance and the difficulty of increasing the speed of a single processing core. Although multicore processors are easily found nowadays, several algorithms are not yet suitable for running on parallel architectures. The CSA algorithm is characterized by a group of Simulated Annealing (SA) optimizers working together to refine the solution, with each SA optimizer running in a single thread executed by a different processor. In the analysis of parallel performance and scalability, the following metrics were investigated: the execution time; the speedup of the algorithm with respect to an increasing number of processors; and the efficiency in the use of processing elements with respect to the increasing size of the treated problem. Furthermore, the quality of the final solution was verified. For this study, the paper proposes a parallel version of CSA and an equivalent serial version. Both algorithms were analyzed on 14 benchmark functions, and for each of these functions the CSA was evaluated using 2 to 24 optimizers. The results obtained are presented and discussed in light of these metrics, and the conclusions characterize CSA as a good parallel algorithm, both in the quality of its solutions and in its parallel scalability and efficiency.
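
A much-simplified synchronous sketch of the coupled-acceptance idea: several SA optimizers propose moves, candidate costs are evaluated in parallel, and each optimizer's acceptance probability is normalized by the Boltzmann weights of the whole group's current energies. The objective, schedules and coupling below are toy choices, not the real CSA:

```python
# Simplified Coupled Simulated Annealing sketch (toy objective; illustrative only).
import math, random
from multiprocessing import Pool

def objective(x):                        # toy 1-D Rastrigin-like cost (always >= 0)
    return x * x - 10.0 * math.cos(2.0 * math.pi * x) + 10.0

if __name__ == "__main__":
    m, sigma, t_acc = 6, 1.0, 1.0        # optimizers, proposal scale, acceptance temperature
    states = [random.uniform(-5, 5) for _ in range(m)]
    energies = [objective(x) for x in states]
    with Pool(processes=m) as pool:      # one worker per SA optimizer
        for _ in range(500):
            cands = [x + random.gauss(0.0, sigma) for x in states]
            cand_e = pool.map(objective, cands)        # parallel cost evaluations
            # Coupling term: acceptance is normalized over the whole ensemble.
            gamma = sum(math.exp(-e / t_acc) for e in energies)
            for i in range(m):
                accept_p = math.exp(-cand_e[i] / t_acc) / gamma
                if cand_e[i] < energies[i] or random.random() < accept_p:
                    states[i], energies[i] = cands[i], cand_e[i]
            sigma *= 0.995               # cool the proposal distribution
    i_best = min(range(m), key=lambda i: energies[i])
    print(states[i_best], energies[i_best])
```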

Relevance:

60.00%

Abstract:

In this article we explore the computational power of NVIDIA graphics processing units (GPUs) in cryptography using CUDA (Compute Unified Device Architecture) technology. CUDA makes general-purpose computing easy by exploiting the parallel processing capabilities present in GPUs. To this end, the NVIDIA GPU architectures and CUDA are presented, along with basic cryptography concepts. Furthermore, we compare CPU versions of the Advanced Encryption Standard (AES) and Message-Digest Algorithm 5 (MD5) cryptographic algorithms with parallel versions written in CUDA.
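
As a point of reference for the CPU side of such a comparison, the sketch below times MD5 over many independent messages with Python's hashlib; it is illustrative only, and the paper's CUDA implementations and measurements are not reproduced here. The workload is embarrassingly parallel, which is what makes it attractive for a GPU, where each message can be hashed by its own thread:

```python
# CPU baseline: MD5 over many independent messages using hashlib (illustrative).
import hashlib, os, time

messages = [os.urandom(64) for _ in range(100_000)]   # independent 64-byte inputs

start = time.perf_counter()
digests = [hashlib.md5(m).hexdigest() for m in messages]
elapsed = time.perf_counter() - start

print(f"{len(messages)} MD5 digests in {elapsed:.3f} s "
      f"({len(messages) / elapsed:.0f} hashes/s)")
```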

Relevance:

60.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)