459 resultados para T-parallelism
Resumo:
Trabalho apresentado à Universidade Fernando Pessoa como parte dos requisitos para obtenção do grau de Mestre em Medicina Dentária.
Resumo:
Projeto de Pós-Graduação/Dissertação apresentado à Universidade Fernando Pessoa como parte dos requisitos para obtenção do grau de Mestre em Ciências Farmacêuticas
Resumo:
We consider a fault model of Boolean gates, both classical and quantum, where some of the inputs may not be connected to the actual gate hardware. This model is somewhat similar to the stuck-at model which is a very popular model in testing Boolean circuits. We consider the problem of detecting such faults; the detection algorithm can query the faulty gate and its complexity is the number of such queries. This problem is related to determining the sensitivity of Boolean functions. We show how quantum parallelism can be used to detect such faults. Specifically, we show that a quantum algorithm can detect such faults more efficiently than a classical algorithm for a Parity gate and an AND gate. We give explicit constructions of quantum detector algorithms and show lower bounds for classical algorithms. We show that the model for detecting such faults is similar to algebraic decision trees and extend some known results from quantum query complexity to prove some of our results.
Resumo:
A fast and efficient segmentation algorithm based on the Boundary Contour System/Feature Contour System (BCS/FCS) of Grossberg and Mingolla [3] is presented. This implementation is based on the FFT algorithm and the parallelism of the system.
Resumo:
An abstract of this work will be presented at the Compiler, Architecture and Tools Conference (CATC), Intel Development Center, Haifa, Israel November 23, 2015.
Resumo:
The classical Purcell's vector method, for the construction of solutions to dense systems of linear equations is extended to a flexible orthogonalisation procedure. Some properties are revealed of the orthogonalisation procedure in relation to the classical Gauss-Jordan elimination with or without pivoting. Additional properties that are not shared by the classical Gauss-Jordan elimination are exploited. Further properties related to distributed computing are discussed with applications to panel element equations in subsonic compressible aerodynamics. Using an orthogonalisation procedure within panel methods enables a functional decomposition of the sequential panel methods and leads to a two-level parallelism.
Resumo:
Temperature distributions involved in some metal-cutting or surface-milling processes may be obtained by solving a non-linear inverse problem. A two-level concept on parallelism is introduced to compute such temperature distribution. The primary level is based on a problem-partitioning concept driven by the nature and properties of the non-linear inverse problem. Such partitioning results to a coarse-grained parallel algorithm. A simplified 2-D metal-cutting process is used as an example to illustrate the concept. A secondary level exploitation of further parallel properties based on the concept of domain-data parallelism is explained and implemented using MPI. Some experiments were performed on a network of loosely coupled machines consist of SUN Sparc Classic workstations and a network of tightly coupled processors, namely the Origin 2000.
Resumo:
This paper briefly describes an interactive parallelisation toolkit that can be used to generate parallel code suitable for either a distributed memory system (using message passing) or a shared memory system (using OpenMP). This study focuses on how the toolkit is used to parallelise a complex heterogeneous ocean modelling code within a few hours for use on a shared memory parallel system. The generated parallel code is essentially the serial code with OpenMP directives added to express the parallelism. The results show that substantial gains in performance can be achieved over the single thread version with very little effort.
Resumo:
Numerical solutions of realistic 2-D and 3-D inverse problems may require a very large amount of computation. A two-level concept on parallelism is often used to solve such problems. The primary level uses the problem partitioning concept which is a decomposition based on the mathematical/physical problem. The secondary level utilizes the widely used data partitioning concept. A theoretical performance model is built based on the two-level parallelism. The observed performance results obtained from a network of general purpose Sun Sparc stations are compared with the theoretical values. Restrictions of the theoretical model are also discussed.
Resumo:
This paper describes an interactive parallelisation toolkit that can be used to generate parallel code suitable for either a distributed memory system (using message passing) or a shared memory system (using OpenMP). This study focuses on how the toolkit is used to parallelise a complex heterogeneous ocean modelling code within a few hours for use on a shared memory parallel system. The generated parallel code is essentially the serial code with OpenMP directives added to express the parallelism. The results show that substantial gains in performance can be achieved over the single thread version with very little effort.
Resumo:
The shared-memory programming model can be an effective way to achieve parallelism on shared memory parallel computers. Historically however, the lack of a programming standard using directives and the limited scalability have affected its take-up. Recent advances in hardware and software technologies have resulted in improvements to both the performance of parallel programs with compiler directives and the issue of portability with the introduction of OpenMP. In this study, the Computer Aided Parallelisation Toolkit has been extended to automatically generate OpenMP-based parallel programs with nominal user assistance. We categorize the different loop types and show how efficient directives can be placed using the toolkit's in-depth interprocedural analysis. Examples are taken from the NAS parallel benchmarks and a number of real-world application codes. This demonstrates the great potential of using the toolkit to quickly parallelise serial programs as well as the good performance achievable on up to 300 processors for hybrid message passing-directive parallelisations.
Resumo:
Code parallelization using OpenMP for shared memory systems is relatively easier than using message passing for distributed memory systems. Despite this, it is still a challenge to use OpenMP to parallelize application codes in a way that yields effective scalable performance when executed on a shared memory parallel system. We describe an environment that will assist the programmer in the various tasks of code parallelization and this is achieved in a greatly reduced time frame and level of skill required. The parallelization environment includes a number of tools that address the main tasks of parallelism detection, OpenMP source code generation, debugging and optimization. These tools include a high quality, fully interprocedural dependence analysis with user interaction capabilities to facilitate the generation of efficient parallel code, an automatic relative debugging tool to identify erroneous user decisions in that interaction and also performance profiling to identify bottlenecks. Finally, experiences of parallelizing some NASA application codes are presented to illustrate some of the benefits of using the evolving environment.
Resumo:
The oceans have shown a recent rapid and accelerating rise in temperature with, given the close link between temperature and marine organisms, pronounced effects on ecosystems. Here we describe for the first time a globally synchronous pattern of pulsed short period (�1 year long) emanations of warm sea surface temperature anomalies from tropical seas towards the poles on the shelf/slope with an intensification of the warming after the 1976/1977, 1986/1987 and 1997/1998 El Nin˜os. On the eastern margins of continents the anomalies propagate towards the poles in part by largely baroclinic boundary currents, reinforced by regional atmospheric warming. The processes contributing to the less continuous warm anomalies on western margins are linked to the transfer of warmth from adjacent western boundary currents. These climate induced events show a close parallelism with the timing of ecosystem changes in shelf seas, important for fisheries and ecosystem services, and melting of sea-ice.
Resumo:
A BSP (Bulk Synchronous Parallelism) computation is characterized by the generation of asynchronous messages in packages during independent execution of a number of processes and their subsequent delivery at synchronization points. Bundling messages together represents a significant departure from the traditional ‘one communication at a time’ approach. In this paper the semantic consequences of communication packaging are explored. In particular, the BSP communication structure is identified with a general form of substitution—predicate substitution. Predicate substitution provides a means of reasoning about the synchronized delivery of asynchronous communications when the immediate programming context does not explicitly refer to the variables that are to be updated (unlike traditional operations, such as the assignment $x := e$, where the names of the updated variables can be extracted from the context). Proofs of implementations of Newton's root finding method and prefix sum are used to illustrate the practical application of the proposed approach.
Resumo:
The construction and operation of a prism/variable-gap/sample system (or variable-gap Otto coupler) for the excitation of surface electromagnetic modes is reported. This system has been used for the observation and characterization of surface plasmon polaritons on thin film structures. The initial alignment of prism and sample is performed under gravity and the subsequent gap variation is performed by means of a single actuator operating a flexure stage on which the prism is mounted. The flexure stage ensures the maintenance of good parallelism between sample and prism as the gap dimension is varied. The coupler has also served as a prototype, in terms of design principle, for the construction of a more sophisticated, variable-gap Otto coupler that can operate in vacuum at temperatures from ambient to 85 K. (C) 2000 American Institute of Physics. [S0034-6748(00)02311-X].