802 resultados para parallel programming paradigms
Resumo:
En el entorno actual, diversas ramas de las ciencias, tienen la necesidad de auxiliarse de la computación de altas prestaciones para la obtención de resultados a relativamente corto plazo. Ello es debido fundamentalmente, al alto volumen de información que necesita ser procesada y también al costo computacional que demandan dichos cálculos. El beneficio al realizar este procesamiento de manera distribuida y paralela, logra acortar los tiempos de espera en la obtención de los resultados y de esta forma posibilita una toma decisiones con mayor anticipación. Para soportar ello, existen fundamentalmente dos modelos de programación ampliamente extendidos: el modelo de paso de mensajes a través de librerías basadas en el estándar MPI, y el de memoria compartida con la utilización de OpenMP. Las aplicaciones híbridas son aquellas que combinan ambos modelos con el fin de aprovechar en cada caso, las potencialidades específicas del paralelismo en cada uno. Lamentablemente, la práctica ha demostrado que la utilización de esta combinación de modelos, no garantiza necesariamente una mejoría en el comportamiento de las aplicaciones. Por lo tanto, un análisis de los factores que influyen en el rendimiento de las mismas, nos beneficiaría a la hora de implementarlas pero también, sería un primer paso con el fin de llegar a predecir su comportamiento. Adicionalmente, supondría una vía para determinar que parámetros de la aplicación modificar con el fin de mejorar su rendimiento. En el trabajo actual nos proponemos definir una metodología para la identificación de factores de rendimiento en aplicaciones híbridas y en congruencia, la identificación de algunos factores que influyen en el rendimiento de las mismas.
Resumo:
COMPSs és un entorn de programació paral·lela desenvolupat per BSC-CNS. Aquest projecte busca estendre aquest entorn per tal de dotar-lo de funcionalitats inicialment no suportades. Aquest conjunt d’extensions radiquen principalment en la implementació de mecanismes que permetin incrementar la flexibilitat, robustesa i polivalència del sistema.
Resumo:
El TFC que aquí s'exposa ha volgut ser, sobretot, un punt de trobada on poder explorar, de manera pràctica, aquests nous paradigmes de programació iconcretament, les solucions que ofereix la plataforma .NET de Microsoft a través de les seves tecnologies Windows Presentation Foundation (WPF), Language Integrades Query (LINQ) i la seva vessant, LINQ to SQL.
Resumo:
S'analitzen les problemàtiques relacionades amb la presentació d'informació gràfica en temps real durant un càlcul paral·lel o col·laboratiu en un entorn distribuït, i es fa una proposta de toolkit obert que estén el llenguatge OpenGL per la seva resolució.
Resumo:
Diplomityö tarkastelee säikeistettyä ohjelmointia rinnakkaisohjelmoinnin ylemmällä hierarkiatasolla tarkastellen erityisesti hypersäikeistysteknologiaa. Työssä tarkastellaan hypersäikeistyksen hyviä ja huonoja puolia sekä sen vaikutuksia rinnakkaisalgoritmeihin. Työn tavoitteena oli ymmärtää Intel Pentium 4 prosessorin hypersäikeistyksen toteutus ja mahdollistaa sen hyödyntäminen, missä se tuo suorituskyvyllistä etua. Työssä kerättiin ja analysoitiin suorituskykytietoa ajamalla suuri joukko suorituskykytestejä eri olosuhteissa (muistin käsittely, kääntäjän asetukset, ympäristömuuttujat...). Työssä tarkasteltiin kahdentyyppisiä algoritmeja: matriisioperaatioita ja lajittelua. Näissä sovelluksissa on säännöllinen muistinkäyttökuvio, mikä on kaksiteräinen miekka. Se on etu aritmeettis-loogisissa prosessoinnissa, mutta toisaalta huonontaa muistin suorituskykyä. Syynä siihen on nykyaikaisten prosessorien erittäin hyvä raaka suorituskyky säännöllistä dataa käsiteltäessä, mutta muistiarkkitehtuuria rajoittaa välimuistien koko ja useat puskurit. Kun ongelman koko ylittää tietyn rajan, todellinen suorituskyky voi pudota murto-osaan huippusuorituskyvystä.
Resumo:
En radiothérapie, la tomodensitométrie (CT) fournit l’information anatomique du patient utile au calcul de dose durant la planification de traitement. Afin de considérer la composition hétérogène des tissus, des techniques de calcul telles que la méthode Monte Carlo sont nécessaires pour calculer la dose de manière exacte. L’importation des images CT dans un tel calcul exige que chaque voxel exprimé en unité Hounsfield (HU) soit converti en une valeur physique telle que la densité électronique (ED). Cette conversion est habituellement effectuée à l’aide d’une courbe d’étalonnage HU-ED. Une anomalie ou artefact qui apparaît dans une image CT avant l’étalonnage est susceptible d’assigner un mauvais tissu à un voxel. Ces erreurs peuvent causer une perte cruciale de fiabilité du calcul de dose. Ce travail vise à attribuer une valeur exacte aux voxels d’images CT afin d’assurer la fiabilité des calculs de dose durant la planification de traitement en radiothérapie. Pour y parvenir, une étude est réalisée sur les artefacts qui sont reproduits par simulation Monte Carlo. Pour réduire le temps de calcul, les simulations sont parallélisées et transposées sur un superordinateur. Une étude de sensibilité des nombres HU en présence d’artefacts est ensuite réalisée par une analyse statistique des histogrammes. À l’origine de nombreux artefacts, le durcissement de faisceau est étudié davantage. Une revue sur l’état de l’art en matière de correction du durcissement de faisceau est présentée suivi d’une démonstration explicite d’une correction empirique.
Resumo:
Software Defined Radio (SDR) hardware platforms use parallel architectures. Current concepts of developing applications (such as WLAN) for these platforms are complex, because developers describe an application with hardware-specifics that are relevant to parallelism such as mapping and scheduling. To reduce this complexity, we have developed a new programming approach for SDR applications, called Virtual Radio Engine (VRE). VRE defines a language for describing applications, and a tool chain that consists of a compiler kernel and other tools (such as a code generator) to generate executables. The thesis presents this concept, as well as describes the language and the compiler kernel that have been developed by the author. The language is hardware-independent, i.e., developers describe tasks and dependencies between them. The compiler kernel performs automatic parallelization, i.e., it is capable of transforming a hardware-independent program into a hardware-specific program by solving hardware-specifics, in particular mapping, scheduling and synchronizations. Thus, VRE simplifies programming tasks as developers do not solve hardware-specifics manually.
Resumo:
How can a bridge be built between autonomic computing approaches and parallel computing systems? How can autonomic computing approaches be extended towards building reliable systems? How can existing technologies be merged to provide a solution for self-managing systems? The work reported in this paper aims to answer these questions by proposing Swarm-Array Computing, a novel technique inspired from swarm robotics and built on the foundations of autonomic and parallel computing paradigms. Two approaches based on intelligent cores and intelligent agents are proposed to achieve autonomy in parallel computing systems. The feasibility of the proposed approaches is validated on a multi-agent simulator.
Resumo:
Mutation testing has been used to assess the quality of test case suites by analyzing the ability in distinguishing the artifact under testing from a set of alternative artifacts, the so-called mutants. The mutants are generated from the artifact under testing by applying a set of mutant operators, which produce artifacts with simple syntactical differences. The mutant operators are usually based on typical errors that occur during the software development and can be related to a fault model. In this paper, we propose a language-named MuDeL (MUtant DEfinition Language)-for the definition of mutant operators, aiming not only at automating the mutant generation, but also at providing precision and formality to the operator definition. The proposed language is based on concepts from transformational and logical programming paradigms, as well as from context-free grammar theory. Denotational semantics formal framework is employed to define the semantics of the MuDeL language. We also describe a system-named mudelgen-developed to support the use of this language. An executable representation of the denotational semantics of the language is used to check the correctness of the implementation of mudelgen. At the very end, a mutant generator module is produced, which can be incorporated into a specific mutant tool/environment. (C) 2008 Elsevier Ltd. All rights reserved.
Algoritmo evolutivo paralelo para o problema de atribuição de localidades a anéis em redes sonet/sdh
Resumo:
The telecommunications play a fundamental role in the contemporary society, having as one of its main roles to give people the possibility to connect them and integrate them into society in which they operate and, therewith, accelerate development through knowledge. But as new technologies are introduced on the market, increases the demand for new products and services that depend on the infrastructure offered, making the problems of planning of telecommunication networks become increasingly large and complex. Many of these problems, however, can be formulated as combinatorial optimization models, and the use of heuristic algorithms can help solve these issues in the planning phase. This paper proposes the development of a Parallel Evolutionary Algorithm to be applied to telecommunications problem known in the literature as SONET Ring Assignment Problem SRAP. This problem is the class NP-hard and arises during the physical planning of a telecommunication network and consists of determining the connections between locations (customers), satisfying a series of constrains of the lowest possible cost. Experimental results illustrate the effectiveness of the Evolutionary Algorithm parallel, over other methods, to obtain solutions that are either optimal or very close to it
Resumo:
The seismic method is of extreme importance in geophysics. Mainly associated with oil exploration, this line of research focuses most of all investment in this area. The acquisition, processing and interpretation of seismic data are the parts that instantiate a seismic study. Seismic processing in particular is focused on the imaging that represents the geological structures in subsurface. Seismic processing has evolved significantly in recent decades due to the demands of the oil industry, and also due to the technological advances of hardware that achieved higher storage and digital information processing capabilities, which enabled the development of more sophisticated processing algorithms such as the ones that use of parallel architectures. One of the most important steps in seismic processing is imaging. Migration of seismic data is one of the techniques used for imaging, with the goal of obtaining a seismic section image that represents the geological structures the most accurately and faithfully as possible. The result of migration is a 2D or 3D image which it is possible to identify faults and salt domes among other structures of interest, such as potential hydrocarbon reservoirs. However, a migration fulfilled with quality and accuracy may be a long time consuming process, due to the mathematical algorithm heuristics and the extensive amount of data inputs and outputs involved in this process, which may take days, weeks and even months of uninterrupted execution on the supercomputers, representing large computational and financial costs, that could derail the implementation of these methods. Aiming at performance improvement, this work conducted the core parallelization of a Reverse Time Migration (RTM) algorithm, using the parallel programming model Open Multi-Processing (OpenMP), due to the large computational effort required by this migration technique. Furthermore, analyzes such as speedup, efficiency were performed, and ultimately, the identification of the algorithmic scalability degree with respect to the technological advancement expected by future processors
Resumo:
This work presents the concept, design and implementation of a MP-SoC platform, named STORM (MP-SoC DirecTory-Based PlatfORM). Currently the platform is composed of the following modules: SPARC V8 processor, GPOP processor, Cache module, Memory module, Directory module and two different modles of Network-on-Chip, NoCX4 and Obese Tree. All modules were implemented using SystemC, simulated and validated, individually or in group. The modules description is presented in details. For programming the platform in C it was implemented a SPARC assembler, fully compatible with gcc s generated assembly code. For the parallel programming it was implemented a library for mutex managing, using the due assembler s support. A total of 10 simulations of increasing complexity are presented for the validation of the presented concepts. The simulations include real parallel applications, such as matrix multiplication, Mergesort, KMP, Motion Estimation and DCT 2D
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Transactional memory (TM) is a new synchronization mechanism devised to simplify parallel programming, thereby helping programmers to unleash the power of current multicore processors. Although software implementations of TM (STM) have been extensively analyzed in terms of runtime performance, little attention has been paid to an equally important constraint faced by nearly all computer systems: energy consumption. In this work we conduct a comprehensive study of energy and runtime tradeoff sin software transactional memory systems. We characterize the behavior of three state-of-the-art lock-based STM algorithms, along with three different conflict resolution schemes. As a result of this characterization, we propose a DVFS-based technique that can be integrated into the resolution policies so as to improve the energy-delay product (EDP). Experimental results show that our DVFS-enhanced policies are indeed beneficial for applications with high contention levels. Improvements of up to 59% in EDP can be observed in this scenario, with an average EDP reduction of 16% across the STAMP workloads. © 2012 IEEE.