985 results for Parallel computation
Abstract:
The second round of oil and gas exploration requires more precise imaging methods, velocity-depth models, and geometric descriptions of complicated geological bodies. Prestack time migration in inhomogeneous media is the technical basis for velocity analysis, prestack time migration from rugged surfaces, angle gathers, and multi-domain noise suppression. Realizing this technique requires solving several critical problems, including parallel computation, velocity algorithms on non-uniform grids, and visualization; the key problem is the organic combination of migration theory and computational geometry. Starting from the technical problems of 3-D prestack time migration in inhomogeneous media and the requirements of non-uniform grids, parallel processing, and visualization, the thesis systematically studies three aspects, combining integral migration theory with computational geometry: the computational infrastructure for Green-function traveltime computation with laterally varying velocity on non-uniform grids, parallel computation of Kirchhoff integral migration, and 3-D visualization. The results provide technical support for implementing prestack time migration and a convenient computational infrastructure for wavenumber-domain simulation in inhomogeneous media. The main results are as follows: 1. The symbol of the one-way wave Lie algebra integral, its phase, and Green-function traveltime expressions were analyzed, and simple 2-D time-domain expressions of the Lie algebra integral symbol, phase, and Green-function traveltime in inhomogeneous media were derived using the exponential map of pseudo-differential operators and geometry-structure-preserving Lie group algorithms. A five-part infrastructure calculation, covering derivatives, commutators, the Lie algebra rooted tree, the exponential-map rooted tree, and traveltime coefficients, was proposed for computing the 3-D asymmetric traveltime equation containing lateral derivatives. 2. By studying this infrastructure for 3-D asymmetric traveltime based on lateral velocity derivatives and combining it with computational geometry, a method was obtained for building a velocity library and interpolating on it by triangulation, which meets the traveltime-computation requirements of parallel time migration and velocity estimation. 3. Combining the triangulated velocity library with computational geometry, a structure was built that makes it convenient to compute horizontal derivatives, commutators, and vertical integrals. A recursive algorithm for the Lie algebra integral and the exponential-map rooted tree (the Magnus expansion in mathematics) was constructed, and the asymmetric traveltime algorithm based on lateral derivatives was implemented. 4. Based on graph theory and computational geometry, a minimum-cycle method was proposed for decomposing an area into polygonal blocks, which serves as a topological representation of migration results and provides a practical way to represent and study migration interpretation results in blocks. 5. Based on the MPI library, a parallel migration algorithm over traces in arbitrary order was implemented, using the lateral-derivative asymmetric traveltime calculation and the Kirchhoff integral method. 6. Visualization of geological and seismic data was studied with OpenGL and Open Inventor on the basis of computational geometry, and a 3-D visualization system for seismic imaging data was designed.
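As a concrete illustration of result 5, the sketch below shows one way to parallelize Kirchhoff time migration over traces with MPI (via mpi4py): each rank migrates its own subset of the input traces, taken in arbitrary order, and the partial images are summed on rank 0. The straight-ray double-square-root traveltime, the constant rms velocity, and all variable names are simplifying assumptions made for this sketch; they stand in for the thesis's lateral-derivative asymmetric traveltime computation.

import numpy as np
from mpi4py import MPI

def migrate_trace(trace, sx, gx, image, xs, t0s, v_rms, dt):
    """Smear one input trace along its diffraction traveltime surface."""
    nt = trace.size
    for ix, x in enumerate(xs):
        for it0, t0 in enumerate(t0s):
            v = v_rms[it0]
            # double-square-root traveltime: source leg plus receiver leg
            t = (np.sqrt((t0 / 2.0) ** 2 + ((x - sx) / v) ** 2)
                 + np.sqrt((t0 / 2.0) ** 2 + ((x - gx) / v) ** 2))
            it = int(round(t / dt))
            if it < nt:
                image[it0, ix] += trace[it]

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# toy geometry and data, generated identically on every rank
rng = np.random.default_rng(0)
nt, dt = 250, 0.004
xs = np.linspace(0.0, 2000.0, 21)                 # image locations (m)
t0s = np.arange(nt) * dt                          # image times (s)
v_rms = np.full(nt, 2000.0)                       # rms velocity profile (m/s)
traces = [(rng.standard_normal(nt), 500.0 + 20.0 * i, 900.0 + 20.0 * i)
          for i in range(16)]                     # (samples, source x, receiver x)

image = np.zeros((nt, xs.size))
for i, (trc, sx, gx) in enumerate(traces):
    if i % size == rank:                          # traces in arbitrary order, round-robin
        migrate_trace(trc, sx, gx, image, xs, t0s, v_rms, dt)

total = np.zeros_like(image)
comm.Reduce(image, total, op=MPI.SUM, root=0)     # sum the partial images on rank 0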
Abstract:
The Message-Driven Processor is a node of a large-scale multiprocessor being developed by the Concurrent VLSI Architecture Group. It is intended to support fine-grained, message-passing parallel computation. It contains several novel architectural features, such as a low-latency network interface, extensive type-checking hardware, and on-chip memory that can be used as an associative lookup table. This document is a programmer's guide to the MDP. It describes the processor's register architecture, instruction set, and the data types supported by the processor. It also details the MDP's message-sending and exception-handling facilities.
Abstract:
Techniques, suitable for parallel implementation, for robust 2-D model-based object recognition in the presence of sensor error are studied. Models and scene data are represented as local geometric features, and robust hypothesis generation for feature matchings and transformations is considered. Bounds on the error in image feature geometry are assumed, constraining the possible matchings and transformations. Transformation sampling is introduced as a simple, robust, polynomial-time, and highly parallel method of searching the space of transformations to hypothesize feature matchings. Key to the approach is that error in image feature measurement is explicitly accounted for. A Connection Machine implementation and experiments on real images are presented.
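A minimal sketch of the transformation-sampling idea as read from this abstract (not the authors' implementation): rigid 2-D transformations are sampled on a grid, each is applied to the model features, and a transformation is scored by how many predicted features land within the assumed error bound of some scene feature. Every (rotation, translation) hypothesis is independent, which is what makes the search highly parallel; the feature values and the error bound below are made up for illustration.

import numpy as np

def score(points, scene, eps):
    """Count transformed model features that fall within eps of some scene feature."""
    return sum(np.any(np.linalg.norm(scene - p, axis=1) <= eps) for p in points)

model = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])      # local model features
scene = np.array([[5.05, 4.95], [5.0, 6.1], [3.1, 5.0]])    # noisy scene measurements
eps = 0.2                                                    # assumed error bound

best = (-1, None)
for theta in np.linspace(0.0, 2.0 * np.pi, 36, endpoint=False):  # sampled rotations
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    for tx in np.arange(0.0, 10.0, 0.5):                         # sampled translations
        for ty in np.arange(0.0, 10.0, 0.5):
            s = score(model @ R.T + np.array([tx, ty]), scene, eps)
            if s > best[0]:
                best = (s, (theta, tx, ty))

# with this sampling density the pose (90 degrees, shift (5, 5)) matches all 3 features
print(best)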
Abstract:
Rapid judgments about the properties and spatial relations of objects are the crux of visually guided interaction with the world. Vision begins, however, with essentially pointwise representations of the scene, such as arrays of pixels or small edge fragments. For adequate time performance in recognition, manipulation, navigation, and reasoning, the processes that extract meaningful entities from the pointwise representations must exploit parallelism. This report develops a framework for the fast extraction of scene entities, based on a simple, local model of parallel computation. An image chunk is a subset of an image that can act as a unit in the course of spatial analysis. A parallel preprocessing stage constructs a variety of simple chunks uniformly over the visual array. On the basis of these chunks, subsequent serial processes locate relevant scene components and assemble detailed descriptions of them rapidly. This thesis defines image chunks that facilitate the most potentially time-consuming operations of spatial analysis: boundary tracing, area coloring, and the selection of locations at which to apply detailed analysis. Fast parallel processes for computing these chunks from images, and chunk-based formulations of indexing, tracing, and coloring, are presented. These processes have been simulated and evaluated on the Lisp Machine and the Connection Machine.
Abstract:
The aim of this paper is to demonstrate the applicability and effectiveness of a computationally demanding stereo matching algorithm on different low-cost, low-complexity embedded devices, focusing on the analysis of timing and image-quality performance. Various optimizations have been implemented to allow its deployment on specific hardware architectures while decreasing memory and processing-time requirements: (1) reduction of color-channel information and resolution of the input images, (2) low-level software optimizations such as parallel computation, replacement of function calls, and loop unrolling, and (3) reduction of redundant data structures and of the internal data representation. The feasibility of a stereo-vision system on a low-cost platform is evaluated using standard datasets and images taken from infrared (IR) cameras. The accuracy of the resulting disparity maps with respect to a full-size dataset is analyzed, and suboptimal solutions are tested.
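A minimal sketch, under assumptions not taken from the paper, of the kind of kernel these optimizations target: block-matching stereo with a sum-of-absolute-differences cost on single-channel, downscaled input, as in optimization (1). The window size, disparity range, and random images are illustrative only.

import numpy as np

def sad_disparity(left, right, max_disp=16, win=3):
    """Dense disparity by SAD block matching on single-channel images."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.uint8)
    pad = win // 2
    for y in range(pad, h - pad):
        for x in range(pad + max_disp, w - pad):
            block_l = left[y-pad:y+pad+1, x-pad:x+pad+1].astype(np.int32)
            best, best_d = None, 0
            for d in range(max_disp):                  # scan candidate disparities
                block_r = right[y-pad:y+pad+1, x-d-pad:x-d+pad+1].astype(np.int32)
                cost = np.abs(block_l - block_r).sum()
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp

# reduce color-channel information and resolution before matching, as in item (1)
left_rgb  = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
right_rgb = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
left  = left_rgb.mean(axis=2)[::2, ::2]     # grayscale + half resolution
right = right_rgb.mean(axis=2)[::2, ::2]
d = sad_disparity(left, right)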
Abstract:
The (n, k)-star interconnection network was proposed in 1995 as an attractive alternative to the n-star topology in parallel computation. The (n, k)-star has significant advantages over the n-star, which itself was proposed as an attractive alternative to the popular hypercube. The major advantage of the (n, k)-star network is its scalability, which makes it more flexible than the n-star as an interconnection network. In this thesis, we focus on finding graph-theoretical properties of the (n, k)-star as well as on developing parallel algorithms that run on this network. The basic topological properties of the (n, k)-star are studied first; these are useful because they can be used to develop efficient algorithms on the network. We then study the (n, k)-star from an algorithmic point of view, investigating both fundamental and application algorithms for basic communication, prefix computation, and sorting. A literature review of the state of the art on the (n, k)-star network and some open problems in this area are also provided.
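A minimal sketch of the (n, k)-star definition the thesis builds on, assuming the standard 1995 definition: vertices are the k-permutations of {1, ..., n}, and a vertex is adjacent to the permutations obtained either by swapping its first symbol with its i-th symbol or by replacing the first symbol with a symbol not present in the permutation. The small check at the end confirms the n!/(n-k)! vertex count and the (n-1)-regularity.

from itertools import permutations

def neighbors(v, n):
    """Neighbors of vertex v (a tuple that is a k-permutation of 1..n)."""
    k = len(v)
    nbrs = []
    for i in range(1, k):                       # swap the 1st and (i+1)-th symbols
        u = list(v)
        u[0], u[i] = u[i], u[0]
        nbrs.append(tuple(u))
    for s in range(1, n + 1):                   # replace the 1st symbol with an unused one
        if s not in v:
            nbrs.append((s,) + v[1:])
    return nbrs

n, k = 4, 2
vertices = list(permutations(range(1, n + 1), k))
assert len(vertices) == 12                                   # n!/(n-k)! vertices
assert all(len(neighbors(v, n)) == n - 1 for v in vertices)  # regular of degree n-1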
Abstract:
Genetic Programming (GP) is a widely used methodology for solving various computational problems. GP's problem-solving ability is usually hindered by its long execution times. In this thesis, GP is applied to real-time computer vision; in particular, object classification and tracking using a parallel GP system are discussed. First, a study of suitable GP languages for object classification is presented. Two main GP approaches to visual pattern classification, block classifiers and pixel classifiers, were studied. Results showed that the pixel classifiers generally performed better, and a suitable language was selected for the real-time implementation accordingly. Synthetic video data were used in the experiments, whose goal was to evolve a unique classifier for each texture pattern present in the video. The experiments revealed that the system was capable of correctly tracking the textures in the video, with performance on par with real-time requirements.
Abstract:
The KCube interconnection network was first introduced in 2010 to combine the good characteristics of two well-known interconnection networks, the hypercube and the Kautz graph. KCube links multiple processors in a communication network of high density for a fixed degree. Since the KCube network is newly proposed, much study is required to establish its properties and to design algorithms on it for parallel computation problems. In this thesis we introduce a new methodology for constructing the KCube graph and, using this new approach, prove the Hamiltonicity of the general KC(m; k). We also determine its connectivity and give an optimal broadcasting scheme in which a source node holding a message communicates it to all other processors. In addition to KCube networks, we study a version of the routing problem in the traditional hypercube: whether there exists a shortest path in Q_n between the two nodes 0^n and 1^n when the network contains failed components. We first discuss this problem under a constraint on the number of faulty nodes, and subsequently introduce an algorithm that tackles the problem without restrictions on the number of faulty nodes.
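A minimal sketch of one reasonable reading of the hypercube question above (not the thesis algorithm): a breadth-first search in Q_n from 0^n to 1^n that avoids a given set of faulty nodes and reports the length of the shortest surviving path, if any. The fault set is illustrative.

from collections import deque

def shortest_path_avoiding(n, faulty):
    """Length of the shortest fault-free path from 0^n to 1^n in Q_n, or None."""
    src, dst = 0, (1 << n) - 1                # 0...0 and 1...1 as bit masks
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for i in range(n):
            v = u ^ (1 << i)                  # flipping one bit = one hypercube edge
            if v not in dist and v not in faulty:
                dist[v] = dist[u] + 1
                q.append(v)
    return None                               # the faults disconnect 0^n from 1^n

n = 4
# returns 4 here: a path of the minimum possible length n survives these two faults
print(shortest_path_avoiding(n, faulty={0b0001, 0b0010}))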
Abstract:
The modelling work was carried out with EGSnrc, a software package developed by the National Research Council Canada.
Abstract:
Multiplication in the Galois field with 2^m elements (i.e., GF(2^m)) is a very important operation for applications in error-correcting codes and cryptography. In this thesis, we are interested in parallel realizations of multipliers in GF(2^m) when the field is generated by an irreducible trinomial. Our starting point is the Montgomery multiplier, which efficiently computes A(x)B(x)x^(-u) for given A(x), B(x) in GF(2^m) and a judiciously chosen u. We then study the divide-and-conquer algorithm PCHS, which partitions the multiplicands of a product in GF(2^m) when m is odd. We apply it to partition A(x) and B(x) in the Montgomery multiplication A(x)B(x)x^(-u) over GF(2^m) even when m is even. Based on this new approach, we construct a multiplier for GF(2^m) generated by irreducible trinomials. A new trick for reusing intermediate results allows us to eliminate several redundant XOR gates. The time complexity (i.e., delay) and space complexity (i.e., number of logic gates) of the new multiplier are then analyzed: 1. The new multiplier requires about 25% fewer logic gates than the Montgomery and Mastrovito multipliers when GF(2^m) is generated by an irreducible trinomial and m is sufficiently large. Its gate count is almost identical to that of the Karatsuba multiplier proposed by Elia. 2. The delay of the new multiplier exceeds that of the best multipliers by at most two XOR gate delays. 3. We determine the delay and gate count of the new multiplier for the two Galois fields recommended by the National Institute of Standards and Technology (NIST). We show that our multiplier uses 15% fewer logic gates than the Montgomery and Mastrovito multipliers at the cost of at most one additional XOR gate delay. Moreover, our multiplier has one XOR gate delay less than Elia's multiplier at the cost of an increase of less than 1% in the total number of logic gates.
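A minimal sketch of the core arithmetic the thesis optimizes in hardware: computing A(x)·B(x)·x^(-u) in GF(2^m) generated by an irreducible trinomial, here with a bit-serial Montgomery-style reduction in software. Polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i); the field GF(2^7) with f(x) = x^7 + x + 1 and the choice u = m are illustrative assumptions, and nothing here reflects the parallel multiplier's gate-level structure.

def gf2_mul(a, b):
    """Carry-less (polynomial) multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf2_mod(a, f):
    """Remainder of a(x) modulo f(x) over GF(2)."""
    df = f.bit_length() - 1
    while a.bit_length() - 1 >= df:
        a ^= f << (a.bit_length() - 1 - df)
    return a

def montgomery_mul(a, b, f, u):
    """Compute a(x)*b(x)*x^(-u) mod f(x); requires f(0) = 1 so that x is invertible."""
    t = gf2_mul(a, b)
    for _ in range(u):
        if t & 1:                 # make the constant term zero by adding f
            t ^= f
        t >>= 1                   # exact division by x
    return gf2_mod(t, f)

# example in GF(2^7) with the irreducible trinomial x^7 + x + 1, and u = m = 7
m, f = 7, (1 << 7) | (1 << 1) | 1
a, b = 0b1011001, 0b0110101
res = montgomery_mul(a, b, f, m)
# consistency check: res * x^m mod f equals the plain product a*b mod f
assert gf2_mod(gf2_mul(res, 1 << m), f) == gf2_mod(gf2_mul(a, b), f)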
Abstract:
"compositions" is a new R package for the analysis of compositional and positive data. It contains four classes corresponding to the four different types of compositional and positive geometry (including the Aitchison geometry). It provides means for computation, plotting, and high-level multivariate statistical analysis in all four geometries. These geometries are treated in a fully analogous way, based on the principle of working in coordinates and the object-oriented programming paradigm of R. In this way, the functions called automatically select the most appropriate type of analysis as a function of the geometry. The graphical capabilities include ternary diagrams and tetrahedrons, various compositional plots (boxplots, barplots, pie charts), and extensive graphical tools for principal components. Afterwards, portion and proportion lines, straight lines, and ellipses in all geometries can be added to plots. The package is accompanied by a hands-on introduction, documentation for every function, demos of the graphical capabilities, and plenty of usage examples. It allows direct and parallel computation in all four vector spaces and provides the beginner with a copy-and-paste style of data analysis, while letting advanced users keep the functionality and customizability they demand of R, as well as all necessary tools to add their own analysis routines. A complete example is included in the appendix.
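For context, these are the textbook operations of the Aitchison geometry on the simplex (closure, perturbation, powering, and the clr coordinates used when "working in coordinates"); they are recalled here as standard definitions, not quoted from the package documentation.

\[
\mathcal{C}(\mathbf{x}) = \left(\frac{x_1}{\sum_{i=1}^{D} x_i}, \dots, \frac{x_D}{\sum_{i=1}^{D} x_i}\right),
\qquad
\mathbf{x} \oplus \mathbf{y} = \mathcal{C}(x_1 y_1, \dots, x_D y_D),
\qquad
\alpha \odot \mathbf{x} = \mathcal{C}(x_1^{\alpha}, \dots, x_D^{\alpha}),
\]
\[
\operatorname{clr}(\mathbf{x}) = \left(\ln\frac{x_1}{g(\mathbf{x})}, \dots, \ln\frac{x_D}{g(\mathbf{x})}\right),
\qquad
g(\mathbf{x}) = \Bigl(\prod_{i=1}^{D} x_i\Bigr)^{1/D}.
\]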
Abstract:
In computer graphics, global illumination algorithms take into account not only the light that comes directly from the sources, but also the light interreflections. Such algorithms produce very realistic images, but at a high computational cost, especially when dealing with complex environments. Parallel computation has been successfully applied to such algorithms to make it possible to compute highly realistic images in a reasonable time. We introduce here a speculation-based parallel solution for a global illumination algorithm in the context of radiosity, in which we take advantage of the hierarchical nature of the algorithm.
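A minimal sketch of the underlying radiosity system, assuming the standard formulation B = E + diag(rho) F B solved by fixed-point (Jacobi-style) gathering iterations; the hierarchical subdivision and the speculative parallel scheduling that the paper actually contributes are not reproduced here, and the patch data are made up.

import numpy as np

def gather_radiosity(E, rho, F, iters=50):
    """Iteratively solve B = E + rho * (F @ B) for the patch radiosities B."""
    B = E.copy()
    for _ in range(iters):
        B = E + rho * (F @ B)
    return B

n = 4
E   = np.array([1.0, 0.0, 0.0, 0.0])          # only patch 0 emits light
rho = np.array([0.5, 0.7, 0.6, 0.4])          # patch reflectivities
F   = np.full((n, n), 1.0 / (n - 1))          # toy form factors (rows sum to 1)
np.fill_diagonal(F, 0.0)                      # a flat patch does not see itself
B = gather_radiosity(E, rho, F)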
Abstract:
In this work, the FDTD method in general coordinates (LN-FDTD) was implemented for the analysis of grounding structures whose geometries may or may not coincide with the Cartesian coordinate system. The method solves Maxwell's equations in the time domain, providing data on the transient and steady-state responses of various grounding structures. A new formulation of the UPML truncation technique in general coordinates, for conductive media, was developed and implemented to make the analysis of these problems feasible (LN-UPML). A new methodology based on two artificial neural networks is presented for detecting defects in grounding grids. The general-coordinate FDTD software was tested and validated on several cases. A graphical user interface, called LANE SAGS, was developed to simplify the use of the software and automate data processing.
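For orientation, a minimal 1-D Cartesian Yee/FDTD update sketch of the kind of time-domain Maxwell solver the thesis generalizes; the general-coordinate (LN-FDTD) metric terms, the conductive media, and the LN-UPML truncation are not reproduced, and the grid and source parameters are illustrative.

import numpy as np

c0, eps0, mu0 = 3e8, 8.854e-12, 4e-7 * np.pi
nz, nsteps = 200, 400
dz = 1e-3
dt = dz / (2 * c0)                      # satisfies the 1-D Courant stability condition

Ex = np.zeros(nz)
Hy = np.zeros(nz)

for n in range(nsteps):
    # update the magnetic field (staggered half a cell ahead of Ex)
    Hy[:-1] += dt / (mu0 * dz) * (Ex[1:] - Ex[:-1])
    # update the electric field
    Ex[1:] += dt / (eps0 * dz) * (Hy[1:] - Hy[:-1])
    # soft Gaussian source injected in the middle of the grid
    Ex[nz // 2] += np.exp(-((n - 60) / 20.0) ** 2)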
Abstract:
We developed numerical modelling of synthetic Marine Controlled Source Electromagnetic (MCSEM) data, used in hydrocarbon exploration, for simple three-dimensional models using parallel computing. The models consist of two stratified layers, the sea and the sediments hosting a thin three-dimensional reservoir, overlain by a half-space representing the air. In this work we present a three-dimensional finite-element approach to the MCSEM method, using the primary/secondary decomposition of the coupled magnetic and electric potentials. In a post-processing step, the electromagnetic fields are computed from the scattered potentials by numerical differentiation. We exploit the data parallelism of a 3-D MCSEM multi-transmitter survey, in which each transmitter position requires the same computation on different data. For this we use the Message Passing Interface (MPI) library and a server-client model, in which the server processor sends the input data to the client processors, which compute the modelling. The input data consist of the finite-element mesh parameters, the transmitters, and the geoelectric model of the reservoir, whose prismatic geometry represents hydrocarbon reservoir lenses in deep water. We observe that when the horizontal width and length of the reservoir have the same order of magnitude, the in-line responses are very similar and the three-dimensional effect is therefore not detected. In turn, when the difference between the width and length of the reservoir is significant, the 3-D effect is easily detected by in-line measurements along the larger horizontal dimension of the reservoir; for measurements along the smaller dimension the effect is not detectable, since in that case the 3-D model approaches a two-dimensional one. The data parallelism is fast to implement and to run: the execution time of the multi-transmitter modelling in the parallel environment is equivalent to that of single-transmitter modelling on a sequential machine, plus the latency of data transmission between the cluster nodes, which justifies the use of this methodology for modelling and interpreting MCSEM data. Owing to the limited memory (2 Gbytes) of each processor of the cluster at the geophysics department of UFPA, only very simple models were run.
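A minimal sketch of the server/client MPI layout described above (via mpi4py): rank 0 hands out one transmitter position at a time and collects the responses, so each client runs the same computation on different data. solve_one_transmitter is a placeholder for the finite-element forward modelling, and the transmitter spacing is made up.

import numpy as np
from mpi4py import MPI

def solve_one_transmitter(tx_x):
    """Placeholder forward model: returns a fake in-line response curve."""
    return np.exp(-np.linspace(0, 5, 50)) * (1.0 + 1e-4 * tx_x)

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
tx_positions = np.arange(0.0, 10000.0, 500.0)        # transmitter positions (m)

if rank == 0:                                        # server: hand out jobs, gather results
    status, results = MPI.Status(), {}
    next_job, done = 0, 0
    for w in range(1, size):                         # seed every client with one job
        if next_job < len(tx_positions):
            comm.send(next_job, dest=w)
            next_job += 1
        else:
            comm.send(None, dest=w)                  # nothing to do for this client
            done += 1
    while done < size - 1:
        job, resp = comm.recv(source=MPI.ANY_SOURCE, status=status)
        results[job] = resp
        if next_job < len(tx_positions):             # keep the same client busy
            comm.send(next_job, dest=status.Get_source())
            next_job += 1
        else:
            comm.send(None, dest=status.Get_source())
            done += 1
else:                                                # client: compute until told to stop
    while True:
        job = comm.recv(source=0)
        if job is None:
            break
        comm.send((job, solve_one_transmitter(tx_positions[job])), dest=0)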
Abstract:
Modern GPUs are well suited to computationally intensive tasks and massively parallel computation. Sparse matrix-vector multiplication and the triangular solver are among the most important and most heavily used kernels in scientific computing, and several challenges in developing high-performance implementations of these two modules are investigated. The main interest is to solve linear systems derived from elliptic equations discretized with triangular elements; the resulting linear system has a symmetric positive definite matrix. The sparse matrix is stored in the compressed sparse row (CSR) format. A CUDA algorithm is proposed to perform the matrix-vector multiplication directly on the CSR format. A dependency-tree algorithm is used to determine which unknowns the triangular solver can compute in parallel, and, to increase the number of parallel threads, a graph-coloring algorithm is implemented to reorder the mesh numbering in a pre-processing phase. The proposed method is compared with available parallel and serial libraries. The results show that the proposed method reduces the computational cost of the matrix-vector multiplication, and the pre-processing associated with the triangular solver needs to be executed only once. The conjugate gradient method was implemented and showed a similar convergence rate for all compared methods, with the proposed method achieving significantly smaller execution times.
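A minimal CPU-side sketch of sparse matrix-vector multiplication performed directly on the CSR arrays, which is the first kernel discussed above; a common CUDA mapping assigns one thread or warp per row (indicated only as a comment here), and the small symmetric positive definite matrix is illustrative.

import numpy as np

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x with A stored in compressed sparse row (CSR) format."""
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for row in range(n):                    # on the GPU: typically one thread or warp per row
        start, end = row_ptr[row], row_ptr[row + 1]
        y[row] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# symmetric positive definite 3x3 example:
# [[4, 1, 0],
#  [1, 3, 1],
#  [0, 1, 2]]
values  = np.array([4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0])
col_idx = np.array([0, 1, 0, 1, 2, 1, 2])
row_ptr = np.array([0, 2, 5, 7])
x = np.array([1.0, 2.0, 3.0])
assert np.allclose(csr_matvec(values, col_idx, row_ptr, x), [6.0, 10.0, 8.0])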