933 results for Parallel computing


Relevance: 60.00%

Abstract:

Parallel computing offers a number of advantages for the execution of large-scale applications, and the effective use of parallel computational resources is a key aspect of high-performance computing. This work presents a methodology that provides automated execution of parallel applications based on the BSP model with heterogeneous tasks. The adopted model assumes that the computation time of each secondary task does not vary greatly between one iteration and the next. The methodology, named ASE, consists of three stages: Acquisition, Scheduling and Execution. In the Acquisition stage, the processing times of the tasks are obtained; in the Scheduling stage, the methodology seeks the task distribution that maximizes the execution speed of the parallel application while minimizing resource usage, by means of an algorithm developed in this work; finally, the Execution stage runs the parallel application with the distribution defined in the previous stage. The tools used by the methodology were implemented, and a set of tests applying it shows that the goals of the proposal were achieved.
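The Scheduling trade-off can be illustrated with a minimal Python sketch, assuming per-task times measured during Acquisition; the greedy longest-processing-time heuristic and the `tolerance` parameter are illustrative stand-ins, not the algorithm developed in the work:

```python
import heapq

def schedule(task_times, max_workers, tolerance=0.05):
    """Toy stand-in for the ASE Scheduling stage (hypothetical names).

    Returns the smallest worker count whose BSP superstep makespan is
    within `tolerance` of the best achievable, trading execution speed
    against resource usage.
    """
    def makespan(p):
        # LPT: assign each task (longest first) to the least-loaded worker.
        loads = [0.0] * p
        heapq.heapify(loads)
        for t in sorted(task_times, reverse=True):
            heapq.heappush(loads, heapq.heappop(loads) + t)
        return max(loads)

    best = makespan(max_workers)
    for p in range(1, max_workers + 1):
        if makespan(p) <= best * (1 + tolerance):
            return p  # fewest workers that stay close to the best makespan
    return max_workers

# Example: heterogeneous task times measured in the Acquisition stage.
print(schedule([5.0, 3.0, 3.0, 2.0, 1.0, 1.0], max_workers=4))  # -> 3
```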

Relevance: 60.00%

Abstract:

Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes within a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation accelerates the learning process while keeping a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of the CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. The GPU implementation has also been applied to a real application with time constraints, acceleration of 3D scene reconstruction for egomotion, in order to validate the proposal.
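The step that dominates GNG learning, and the natural candidate for a CUDA kernel mapped across GPU threads, is the per-sample search for the two nearest units. The NumPy sketch below stands in for such a kernel; it is an assumption about the hot spot, not the paper's code:

```python
import numpy as np

def two_nearest_units(units, x):
    """Find the winner and second-winner units for input sample x.

    The distance evaluation over all units is data-parallel: a CUDA
    kernel would assign one unit (or block of units) per thread; NumPy
    vectorization stands in for that here.
    """
    d2 = np.sum((units - x) ** 2, axis=1)   # squared distance to every unit
    s1, s2 = np.argpartition(d2, 1)[:2]     # indices of the two smallest
    if d2[s1] > d2[s2]:
        s1, s2 = s2, s1
    return s1, s2

units = np.random.rand(100, 3)              # 100 GNG units in a 3-D input space
winner, second = two_nearest_units(units, np.random.rand(3))
```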

Relevance: 60.00%

Abstract:

A parallel algorithm to remove impulsive noise from digital images using heterogeneous CPU/GPU computing is proposed. The parallel denoising algorithm is based on the peer group concept and uses a Euclidean metric. In order to identify the number of pixels to be allocated to the multi-core CPU and the GPUs, a performance analysis using large images is presented. A comparison of the parallel implementation on multi-core, on GPUs and on a combination of both is performed. Performance has been evaluated in terms of execution time and megapixels per second. We present several optimization strategies that are especially effective for the multi-core environment, and demonstrate significant performance improvements. The main advantage of the proposed noise removal methodology is its computational speed, which enables efficient filtering of color images in real-time applications.
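A minimal sketch of peer-group impulse detection with a Euclidean metric, assuming a 3x3 window and illustrative thresholds `d` and `m` (the paper's actual parameters and replacement step are not reproduced):

```python
import numpy as np

def is_impulse(window, d=45.0, m=3):
    """Peer-group impulse detection for the centre pixel of a 3x3 RGB window.

    The centre pixel is declared noisy when fewer than `m` neighbours lie
    within Euclidean distance `d` of it; each pixel's test is independent,
    which is what makes the filter parallelize well on CPU and GPU.
    """
    centre = window[1, 1].astype(float)
    neighbours = window.reshape(-1, 3).astype(float)
    dist = np.linalg.norm(neighbours - centre, axis=1)
    peers = np.sum(dist <= d) - 1           # exclude the centre itself
    return peers < m

win = np.random.randint(0, 256, size=(3, 3, 3))
print(is_impulse(win))
```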

Relevance: 60.00%

Abstract:

Bound and resonance states of HO2 are calculated quantum mechanically using both the Lanczos homogeneous filter diagonalization method and the real Chebyshev filter diagonalization method for nonzero total angular momentum J=6 and 10, using a parallel computing strategy. For bound states, agreement between the two methods is quite satisfactory; for resonances, the energies are in good agreement, while the widths agree only in general terms. The quantum nonzero-J specific unimolecular dissociation rates for HO2 are also calculated. (C) 2004 American Institute of Physics.
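The real Chebyshev filter diagonalization mentioned here is driven by the three-term Chebyshev recursion; a minimal sketch, assuming a Hamiltonian pre-scaled so its spectrum lies in [-1, 1] (the parallel distribution of the matrix-vector product, the actual core of the parallel strategy, is omitted):

```python
import numpy as np

def chebyshev_states(H_norm, v0, n_steps):
    """Three-term Chebyshev recursion v_{k+1} = 2 H v_k - v_{k-1}.

    H_norm must be scaled to spectrum [-1, 1].  A dense matrix is used
    here for illustration; production codes apply H matrix-free.
    """
    v_prev, v = v0, H_norm @ v0
    yield v_prev
    yield v
    for _ in range(n_steps - 2):
        v_prev, v = v, 2.0 * (H_norm @ v) - v_prev
        yield v

H = np.diag(np.linspace(-0.9, 0.9, 50))     # toy Hamiltonian, spectrum in [-1, 1]
v0 = np.random.rand(50); v0 /= np.linalg.norm(v0)
states = list(chebyshev_states(H, v0, 100))
```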

Relevance: 60.00%

Abstract:

We give a selective review of quantum mechanical methods for calculating and characterizing resonances in small molecular systems, with an emphasis on recent progress in Chebyshev and Lanczos iterative methods. Two archetypal molecular systems are discussed: isolated resonances in HCO, which exhibit regular mode and state specificity, and overlapping resonances in strongly bound HO2, which exhibit irregular and chaotic behavior. Recent progress in calculations of resonances for non-zero total angular momentum J, including parallel computing models, is also covered, and future directions in this field are discussed.
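As a companion to the Chebyshev sketch above, here is the other iterative engine the review covers, a plain Lanczos iteration; this bare version without reorthogonalization is illustrative only:

```python
import numpy as np

def lanczos(H, v0, m):
    """Plain Lanczos iteration (no reorthogonalization), for illustration.

    Builds the m x m tridiagonal matrix T whose eigenvalues approximate
    those of H; filter-diagonalization variants post-process such data.
    """
    n = len(v0)
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    v = v0 / np.linalg.norm(v0)
    v_prev = np.zeros(n)
    b = 0.0
    for j in range(m):
        w = H @ v - b * v_prev
        alpha[j] = v @ w
        w -= alpha[j] * v
        if j < m - 1:
            b = np.linalg.norm(w)
            beta[j] = b
            v_prev, v = v, w / b
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

H = np.diag(np.arange(1.0, 201.0))          # toy Hermitian operator
T = lanczos(H, np.random.rand(200), m=30)
print(np.linalg.eigvalsh(T)[-3:])           # extremal eigenvalues converge first
```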

Relevance: 60.00%

Abstract:

We explore the calculation of unimolecular bound states and resonances for deep-well species at large angular momentum using a Chebyshev filter diagonalization scheme incorporating doubling of the autocorrelation function, as presented recently by Neumaier and Mandelshtam [Phys. Rev. Lett. 86, 5031 (2001)]. The method has been employed to compute the challenging J=20 bound and resonance states for the HO2 system. The methodology was first tested for J=2 against previous calculations, and then extended to J=20 using a parallel computing strategy. The quantum J-specific unimolecular dissociation rates for HO2 -> H + O2 in the energy range from 2.114 to 2.596 eV are reported for the first time, and comparisons are made with the statistical adiabatic channel method/classical trajectory results of Troe and co-workers [J. Chem. Phys. 113, 11019 (2000); Phys. Chem. Chem. Phys. 2, 631 (2000)]. For most of the energies, the reported statistical adiabatic channel method/classical trajectory rate constants agree well with the average of the fluctuating quantum-mechanical rates. Near the dissociation threshold, the quantum rates fluctuate more severely, but their average is still in agreement with the statistical adiabatic channel method/classical trajectory results.
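The doubling trick can be sketched directly: the Chebyshev identities T_{2k} = 2*T_k^2 - T_0 and T_{2k+1} = 2*T_{k+1}*T_k - T_1 give c_{2k} = 2<v_k|v_k> - c_0 and c_{2k+1} = 2<v_{k+1}|v_k> - c_1, so 2m autocorrelation coefficients cost only about m matrix-vector products. A minimal sketch for a real symmetric Hamiltonian scaled to [-1, 1] (not the authors' code):

```python
import numpy as np

def doubled_autocorrelation(H_norm, v0, m):
    """Chebyshev autocorrelation with doubling: 2m coefficients, ~m matvecs."""
    c = np.zeros(2 * m)
    v_prev, v = v0.copy(), H_norm @ v0      # v_prev = v_0, v = v_1
    c0, c1 = v0 @ v0, v0 @ v
    for k in range(m):
        c[2 * k] = 2.0 * (v_prev @ v_prev) - c0       # c_{2k}
        c[2 * k + 1] = 2.0 * (v @ v_prev) - c1        # c_{2k+1}
        v_prev, v = v, 2.0 * (H_norm @ v) - v_prev    # advance the recursion
    return c

H = np.diag(np.linspace(-0.95, 0.95, 64))
v0 = np.random.rand(64); v0 /= np.linalg.norm(v0)
c = doubled_autocorrelation(H, v0, m=50)    # 100 coefficients from ~50 products
```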

Relevance: 60.00%

Abstract:

The paper describes the information and software content of a computer knowledge bank on medical diagnostics. The classes of its users and the tasks which they can solve are described. The information content of the bank comprises three ontologies: an ontology of observations in the field of medical diagnostics, an ontology of the knowledge base (diseases) in medical diagnostics, and an ontology of case records. It also contains, for every division of medicine, three classes of information resources corresponding to these ontologies: observation bases, knowledge bases, and databases (with data about patients). The software content consists of editors for the different kinds of information (ontologies, bases of observations, knowledge and data), and of a program which performs medical diagnostics.
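A hypothetical sketch of how the three ontologies and the diagnostics program might map onto simple data structures; every class, field and matching rule below is invented for illustration and is not the bank's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Entry in the ontology of observations (hypothetical fields)."""
    name: str
    possible_values: list[str]

@dataclass
class Disease:
    """Knowledge-base entry: a disease described by expected observations."""
    name: str
    expected: dict[str, str]   # observation name -> expected value

@dataclass
class CaseRecord:
    """Patient data conforming to the ontology of case records."""
    patient_id: str
    findings: dict[str, str]   # observation name -> observed value

def diagnose(case: CaseRecord, kb: list[Disease]) -> list[str]:
    """Naive exact matcher standing in for the bank's diagnostics program."""
    return [d.name for d in kb
            if all(case.findings.get(o) == v for o, v in d.expected.items())]
```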

Relevance: 60.00%

Abstract:

* The work is partially supported by grant No 24-7/05 of the National Academy of Sciences of Ukraine for the support of scientific research by young scientists, "Development of a Desktop Grid system and optimization of its performance".

Relevance: 60.00%

Abstract:

Since the development of large-scale power grid interconnections and power markets, research on available transfer capability (ATC) has attracted great attention. The challenge of assessing ATC accurately originates from the numerous uncertainties in the electricity generation, transmission, distribution and utilization sectors. Power system uncertainties can be mainly described as two types: randomness and fuzziness. However, the traditional transmission reliability margin (TRM) approach only considers randomness. Based on credibility theory, this paper first builds models of generators, transmission lines and loads according to their features of both randomness and fuzziness. A random fuzzy simulation is then applied, along with a novel method proposed for ATC assessment, in which both randomness and fuzziness are considered. The bootstrap method and multi-core parallel computing are introduced to enhance the processing speed. By implementing the simulation for the IEEE 30-bus system and a real-life system located in Northwest China, the viability of the models and the proposed method is verified.
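The bootstrap-plus-multi-core combination can be sketched generically as follows; the random fuzzy ATC simulation itself is not reproduced, and `bootstrap_mean` is a stand-in statistic:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def bootstrap_mean(args):
    """One bootstrap replicate: resample with replacement, return the mean."""
    sample, seed = args
    rng = np.random.default_rng(seed)
    return rng.choice(sample, size=len(sample), replace=True).mean()

def parallel_bootstrap(sample, n_boot=2_000, workers=8):
    """Bootstrap 95% interval using multiple cores, mirroring the paper's
    use of bootstrapping and multi-core computing to speed up assessment."""
    jobs = [(sample, s) for s in range(n_boot)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        estimates = np.fromiter(ex.map(bootstrap_mean, jobs, chunksize=64),
                                dtype=float, count=n_boot)
    return np.percentile(estimates, [2.5, 97.5])

if __name__ == "__main__":
    # Hypothetical ATC samples (MW) drawn from the uncertainty simulation.
    atc_samples = np.random.default_rng(0).normal(500.0, 40.0, size=2_000)
    print(parallel_bootstrap(atc_samples))
```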

Relevance: 60.00%

Abstract:

Three-dimensional (3-D) imaging is vital in computer-assisted surgical planning, including minimally invasive surgery, targeted drug delivery, and tumor resection. Selective Internal Radiation Therapy (SIRT) is a liver-directed radiation therapy for the treatment of liver cancer. Accurate calculation of anatomical liver and tumor volumes is essential for determining the tumor-to-normal-liver ratio and for calculating the dose of Y-90 microspheres that will concentrate the radiation in the tumor region as compared to nearby healthy tissue. Present manual techniques for segmentation of the liver from Computed Tomography (CT) tend to be tedious and greatly dependent on the skill of the technician/doctor performing the task.

This dissertation presents the development and implementation of a fully integrated algorithm for 3-D liver and tumor segmentation from tri-phase CT that yields highly accurate estimates of the respective volumes of the liver and tumor(s). The algorithm as designed requires minimal human intervention without compromising the accuracy of the segmentation results. Embedded within this algorithm is an effective method for extracting the blood vessels that feed the tumor(s), in order to plan the appropriate treatment effectively.

Segmentation of the liver led to an accuracy in excess of 95% in estimating liver volumes in 20 datasets in comparison to the manual gold-standard volumes. In a similar comparison, tumor segmentation exhibited an accuracy of 86% in estimating tumor volume(s). Qualitative results of the blood vessel segmentation algorithm demonstrated its effectiveness in extracting and rendering the vasculature structure of the liver. The parallel computing process, using a single workstation, showed a 78% gain. Statistical analysis carried out to determine whether the manual initialization has any impact on the accuracy showed that the results are independent of user initialization.

The dissertation thus provides a complete 3-D solution for liver cancer treatment planning, with the opportunity to extract, visualize and quantify the statistics needed for treatment. Since SIRT requires highly accurate calculation of the liver and tumor volumes, this new method provides an effective and computationally efficient process that meets these challenging clinical requirements.
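Once binary liver and tumor masks are available, the volumes and the tumor-to-normal-liver ratio reduce to voxel counting; a minimal sketch with an illustrative voxel spacing (the dissertation's actual spacing and pipeline are not reproduced):

```python
import numpy as np

def volume_ml(mask, spacing_mm=(0.78, 0.78, 2.5)):
    """Volume of a binary segmentation mask in millilitres.

    `mask` is a 3-D boolean array; the voxel spacing (mm) would come from
    the CT header and is illustrative here.  1 mL = 1000 mm^3.
    """
    voxel_mm3 = float(np.prod(spacing_mm))
    return mask.sum() * voxel_mm3 / 1000.0

def tumor_to_liver_ratio(tumor_mask, liver_mask, spacing_mm=(0.78, 0.78, 2.5)):
    """Volume-based tumor-to-normal-liver ratio for dose planning.

    Assumes `liver_mask` includes the tumor voxels, so normal liver is
    the difference of the two volumes.
    """
    t = volume_ml(tumor_mask, spacing_mm)
    l = volume_ml(liver_mask, spacing_mm)
    return t / (l - t)
```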

Relevance: 60.00%

Abstract:

With hundreds of millions of users reporting locations and embracing mobile technologies, Location Based Services (LBSs) are raising new challenges. In this dissertation, we address three emerging problems in location services, where geolocation data plays a central role. First, to handle the unprecedented growth of generated geolocation data, existing location services rely on geospatial database systems. However, their inability to leverage combined geographical and textual information in analytical queries (e.g. spatial similarity joins) remains an open problem. To address this, we introduce SpsJoin, a framework for computing spatial set-similarity joins. SpsJoin handles combined similarity queries that involve textual and spatial constraints simultaneously. LBSs use this system to tackle different types of problems, such as deduplication, geolocation enhancement and record linkage. We define the spatial set-similarity join problem in the general case and propose an algorithm for its efficient computation. Our solution utilizes parallel computing with MapReduce to handle scalability issues in large geospatial databases. Second, applications that use geolocation data are seldom concerned with ensuring the privacy of participating users. To motivate participation and address privacy concerns, we propose iSafe, a privacy-preserving algorithm for computing safety snapshots of co-located mobile devices as well as geosocial network users. iSafe combines geolocation data extracted from crime datasets and geosocial networks such as Yelp. In order to enhance iSafe's ability to compute safety recommendations even when crime information is incomplete or sparse, we need to identify relationships between Yelp venues and crime indices at their locations. To achieve this, we use SpsJoin on two datasets (Yelp venues and geolocated businesses) to find venues that have not been reviewed and to further compute the crime indices of their locations. Our results show a statistically significant dependence between location crime indices and Yelp features. Third, review-centered LBSs (e.g., Yelp) are increasingly becoming targets of malicious campaigns that aim to bias the public image of represented businesses. Although Yelp actively attempts to detect and filter fraudulent reviews, our experiments showed that Yelp is still vulnerable. Fraudulent LBS information also impacts the ability of iSafe to provide correct safety values. We take steps toward addressing this problem by proposing SpiDeR, an algorithm that takes advantage of the richness of information available in Yelp to detect abnormal review patterns. We propose a fake venue detection solution that applies SpsJoin to Yelp and U.S. housing datasets. We validate the proposed solutions using ground truth data extracted from our experiments and reviews filtered by Yelp.
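A toy single-machine analogue of the spatial set-similarity join idea, blocking candidate pairs by grid cell and then verifying a Jaccard constraint on the token sets; the real SpsJoin runs on MapReduce and its exact algorithm is not reproduced (note that plain grid blocking misses pairs straddling cell boundaries):

```python
from collections import defaultdict

def grid_cell(lat, lon, size=0.01):
    """Bucket a coordinate into a grid cell (roughly 1 km at this size)."""
    return (int(lat / size), int(lon / size))

def jaccard(a, b):
    return len(a & b) / len(a | b)

def spatial_set_similarity_join(records, text_threshold=0.5):
    """Block by grid cell (the 'map' key), then verify the textual
    set-similarity constraint within each cell (the 'reduce' step)."""
    buckets = defaultdict(list)                       # map phase
    for rid, lat, lon, tokens in records:
        buckets[grid_cell(lat, lon)].append((rid, frozenset(tokens)))
    pairs = []
    for group in buckets.values():                    # reduce phase
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                if jaccard(group[i][1], group[j][1]) >= text_threshold:
                    pairs.append((group[i][0], group[j][0]))
    return pairs

records = [("a", 25.7610, -80.1910, {"cafe", "cuban", "miami"}),
           ("b", 25.7612, -80.1913, {"cuban", "cafe"}),
           ("c", 40.7120, -74.0060, {"pizza"})]
print(spatial_set_similarity_join(records))           # -> [('a', 'b')]
```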

Relevance: 60.00%

Abstract:

The reverse time migration (RTM) algorithm has been widely used in the seismic industry to generate images of the subsurface and thus reduce the risk of oil and gas exploration. Its widespread use is due to the high quality of the images it produces, but it is also known for its high computational cost. Therefore, parallel computing techniques have been used in its implementations. In general, parallel approaches to RTM use coarse granularity, distributing the processing of a subset of seismic shots among the nodes of a distributed system. Coarse-grained parallel approaches to RTM have been shown to be very efficient, since each seismic shot can be processed independently. For this reason, RTM performance can be further improved by using a parallel approach with finer granularity for the processing assigned to each node. This work presents an efficient parallel algorithm for 3D reverse time migration with fine granularity using OpenMP. The propagation of the 3D acoustic wave makes up much of the RTM computation, and different load-balancing schemes were analyzed in order to minimize losses of parallel performance at this stage. The results served as a basis for the implementation of the other RTM phases: backpropagation and the imaging condition. The proposed algorithm was tested with synthetic data representing some of the possible underground structures. Metrics such as speedup and efficiency were used to analyze its parallel performance. The migrated sections show that the algorithm performed satisfactorily in identifying subsurface structures. As for parallel performance, the analysis clearly demonstrates the scalability of the algorithm, which achieves a speedup of 22.46 for the wave propagation and 16.95 for the full RTM, both with 24 threads.
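The hot spot that such an OpenMP parallelization targets is the acoustic-wave stencil update; a minimal NumPy sketch of one second-order time step (the paper's OpenMP decomposition and load-balancing schemes are not reproduced):

```python
import numpy as np

def wave_step(p, p_prev, coeff):
    """One explicit time step of the 3-D acoustic wave equation.

    Second-order finite differences in time and space.  This stencil
    sweep is the loop that OpenMP threads would share; NumPy slicing
    expresses it here instead.  `coeff` holds c^2 dt^2 / h^2 per point.
    """
    lap = (-6.0 * p[1:-1, 1:-1, 1:-1]
           + p[2:, 1:-1, 1:-1] + p[:-2, 1:-1, 1:-1]
           + p[1:-1, 2:, 1:-1] + p[1:-1, :-2, 1:-1]
           + p[1:-1, 1:-1, 2:] + p[1:-1, 1:-1, :-2])
    p_next = np.copy(p)                      # boundaries kept fixed (crude)
    p_next[1:-1, 1:-1, 1:-1] = (2.0 * p[1:-1, 1:-1, 1:-1]
                                - p_prev[1:-1, 1:-1, 1:-1]
                                + coeff[1:-1, 1:-1, 1:-1] * lap)
    return p_next

n = 64
p_prev = np.zeros((n, n, n)); p = np.zeros_like(p_prev)
p[n // 2, n // 2, n // 2] = 1.0              # point source injection
# Illustrative scaling: c = 1500 m/s, dt = 1 ms, h = 10 m (CFL-stable).
coeff = np.full_like(p, (1500.0 * 1e-3 / 10.0) ** 2)
p_prev, p = p, wave_step(p, p_prev, coeff)
```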

Relevance: 60.00%

Abstract:

This work's objective is the development of a methodology to represent an unknown soil through a stratified horizontal multilayer soil model, from which the engineer may carry out electrical grounding projects with high precision. The methodology uses the experimental apparent electrical resistivity curve, obtained through measurements on the ground with a 4-wire earth ground resistance tester, together with calculations over the measured resistances. This curve is then compared with the theoretical apparent electrical resistivity curve, calculated for a horizontally stratified soil whose parameters are conjectured. The parameters of this soil model, namely the number of layers and the resistivity and thickness of each layer, are optimized by the Differential Evolution method, with performance enhanced through parallel computing, until the two apparent resistivity curves are close enough that the unknown soil can be represented by the horizontal multilayer model with the optimized parameters. To assist the Differential Evolution method when it stagnates for an arbitrary number of generations, an unsticking procedure is proposed that expands the search space and tests new combinations, allowing the algorithm to find a better solution and/or leave local minima. An error-improvement procedure is further proposed to smooth the error peaks between the apparent resistivity curves, by giving other, more uniform solutions the opportunity to excel, improving the precision of the whole algorithm by minimizing the maximum error. Procedures to verify the polynomial approximation of the soil characteristic function and the theoretical apparent resistivity calculations are also proposed, by including points midway between the approximated ones in the verification. Finally, a statistical evaluation procedure is presented to enable the classification of soil samples: the stratification methodology is applied to a control group of horizontally stratified soils, and statistical inference is used to estimate the proportion of soils that, within an error margin, do not follow the horizontal multilayer model.
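A minimal sketch of classic DE/rand/1/bin, the optimizer family used here, applied to a toy stand-in for the resistivity misfit; the stagnation-escape and error-smoothing extensions proposed in the work, and its parallelization, are omitted:

```python
import numpy as np

def differential_evolution(cost, bounds, pop_size=30, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Classic DE/rand/1/bin (index i not excluded from donors, for brevity)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    dim = len(lo)
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)
    costs = np.array([cost(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)   # mutation
            cross = rng.random(dim) < CR                # binomial crossover
            cross[rng.integers(dim)] = True             # keep at least one gene
            trial = np.where(cross, mutant, pop[i])
            tc = cost(trial)
            if tc <= costs[i]:                          # greedy selection
                pop[i], costs[i] = trial, tc
    return pop[np.argmin(costs)], costs.min()

# Toy stand-in for the apparent-resistivity misfit of a two-layer soil:
# parameters are rho1, rho2 (ohm-m) and h1 (m); target values are invented.
target = np.array([100.0, 300.0, 2.0])
cost = lambda x: float(np.sum((x - target) ** 2))
bounds = np.array([[1.0, 1000.0], [1.0, 1000.0], [0.1, 10.0]])
print(differential_evolution(cost, bounds))
```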

Relevance: 60.00%

Abstract:

The present paper is a report on progress in the simulation of turbulent flames using the Cray T3D and T3E at the Edinburgh Parallel Computing Centre, using codes developed in Cambridge. Two combustion DNS codes are described, ANGUS and SENGA, which solve incompressible and fully compressible reacting flows respectively. The technical background to combustion DNS is presented, and the resource requirements are explained in terms of the physics and chemistry of the problem. Results from flame-turbulence interaction studies are presented and discussed in terms of their relevance to modelling. Recent work on the fully compressible problem is highlighted and future directions are outlined.

Relevance: 60.00%

Abstract:

As the efficiency of parallel software increases, it is becoming common to measure near-linear speedup for many applications. For a problem of size N on P processors, with the computation running in O(N/P) time, the performance restrictions due to file i/o systems and mesh decomposition running in O(N) become increasingly apparent, especially for large P. For distributed-memory parallel systems, an additional limit to scalability results from the finite memory available for i/o scatter/gather operations. Simple strategies developed to address the scalability of scatter/gather operations for unstructured-mesh applications have been extended to provide scalable mesh decomposition through the development of a parallel graph partitioning code, JOSTLE [8]. The focus of this work is directed towards the development of generic strategies that can be incorporated into the Computer Aided Parallelisation Tools (CAPTools) project.
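The O(N/P)-versus-O(N) limit can be made explicit with a short Amdahl-style estimate (an illustrative derivation, not taken from the paper), with hypothetical per-element costs t_comp and t_io:

```latex
% Illustrative speedup bound: computation scales as O(N/P),
% while i/o and mesh decomposition remain O(N).
S(P) \;=\; \frac{T(1)}{T(P)}
     \;=\; \frac{t_{\mathrm{comp}}\,N + t_{\mathrm{io}}\,N}
                {t_{\mathrm{comp}}\,N/P + t_{\mathrm{io}}\,N}
     \;\xrightarrow{\;P \to \infty\;}\; 1 + \frac{t_{\mathrm{comp}}}{t_{\mathrm{io}}}
```

So the unparallelised O(N) stages cap the achievable speedup regardless of P, which is the motivation for making scatter/gather and mesh decomposition (JOSTLE) scale in parallel as well.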