157 results for parallelization
Abstract:
The seismic method is of extreme importance in geophysics. Mainly associated with oil exploration, this line of research attracts most of the investment in the area. A seismic study comprises the acquisition, processing and interpretation of seismic data. Seismic processing in particular focuses on imaging the geological structures of the subsurface. It has evolved significantly in recent decades, driven by the demands of the oil industry and by hardware advances that brought greater storage and digital processing capacity, which enabled the development of more sophisticated processing algorithms such as those that exploit parallel architectures. One of the most important steps in seismic processing is imaging. Migration of seismic data is one of the techniques used for imaging, with the goal of obtaining a seismic section image that represents the geological structures as accurately and faithfully as possible. The result of migration is a 2D or 3D image in which it is possible to identify faults and salt domes, among other structures of interest such as potential hydrocarbon reservoirs. However, a high-quality, accurate migration can be very time consuming because of the heuristics of the mathematical algorithm and the large volume of data input and output involved. It may take days, weeks or even months of uninterrupted execution on supercomputers, representing large computational and financial costs that could make these methods impractical. Aiming at performance improvement, and given the large computational effort required by this migration technique, this work parallelized the core of a Reverse Time Migration (RTM) algorithm using the Open Multi-Processing (OpenMP) parallel programming model. Furthermore, analyses of speedup and efficiency were performed and, finally, the degree of algorithmic scalability was identified with respect to the technological advances expected in future processors.
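The speedup and efficiency analyses mentioned in this abstract are usually defined as S(p) = T(1) / T(p) and E(p) = S(p) / p, where T(p) is the run time on p threads. The sketch below only illustrates those two standard definitions; the function names and the timing values are hypothetical and are not taken from the work itself.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T(1) / T(p): how many times faster the parallel run is."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_threads):
    """Parallel efficiency E = S / p: fraction of ideal linear speedup achieved."""
    return speedup(t_serial, t_parallel) / n_threads


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by thread count.
    # These numbers are made up for illustration only.
    timings = {1: 120.0, 2: 63.0, 4: 34.0, 8: 20.0}
    t1 = timings[1]
    for p, tp in sorted(timings.items()):
        print(p, round(speedup(t1, tp), 2), round(efficiency(t1, tp, p), 2))
```

An efficiency that falls as the thread count grows is the usual sign that serial sections or memory bandwidth limit scalability.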
Abstract:
The increasing demand for high performance wireless communication systems has exposed the inefficiency of the current model of fixed allocation of the radio spectrum. In this context, cognitive radio appears as a more efficient alternative, providing opportunistic spectrum access with the maximum possible bandwidth. To meet these requirements, the transmitter must identify opportunities for transmission and the receiver must recognize the parameters defined for the communication signal. Techniques based on cyclostationary analysis can be applied to both spectrum sensing and modulation classification, even in low signal-to-noise ratio (SNR) environments. However, despite its robustness, one of the main disadvantages of cyclostationarity is the high computational cost of calculating its functions. This work proposes efficient architectures for obtaining cyclostationary features to be employed in both spectrum sensing and automatic modulation classification (AMC). In the context of spectrum sensing, a parallelized algorithm for extracting cyclostationary features of communication signals is presented. The performance of this parallelized feature extractor is evaluated using speedup and parallel efficiency metrics. The spectrum sensing architecture is analyzed for several configurations of false alarm probability, SNR level and observation time, for BPSK and QPSK modulations. In the context of AMC, the reduced alpha-profile is proposed as a cyclostationary signature calculated for a reduced set of cyclic frequencies. This signature is validated by a modulation classification architecture based on pattern matching. The AMC architecture is investigated in terms of correct classification rates for AM, BPSK, QPSK, MSK and FSK modulations, considering several scenarios of observation length and SNR level. The numerical performance results obtained in this work show the efficiency of the proposed architectures.
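To make the computational cost of cyclostationary features concrete, the sketch below estimates the cyclic autocorrelation of a sampled signal at one cyclic frequency `alpha` and lag `tau`, using one common (asymmetric) estimator. It is not the feature extractor proposed in the work: the function names and the test signal are assumptions made for illustration. Each (`alpha`, `tau`) pair is computed independently of the others, which is the property a parallel implementation could exploit.

```python
import cmath
import math


def cyclic_autocorrelation(x, alpha, tau):
    """Estimate R_x^alpha(tau) = (1/N) * sum_n x[n+tau] * conj(x[n]) * exp(-j*2*pi*alpha*n).

    x     : sequence of real or complex samples
    alpha : cyclic frequency, normalized to the sampling rate
    tau   : non-negative integer lag
    """
    n_terms = len(x) - tau
    acc = 0j
    for n in range(n_terms):
        acc += x[n + tau] * complex(x[n]).conjugate() * cmath.exp(-2j * math.pi * alpha * n)
    return acc / n_terms


def alpha_profile(x, alphas, tau=0):
    """Magnitude of the estimate over a set of cyclic frequencies.

    Every entry is independent of the others, so the loop could be split across workers.
    """
    return [abs(cyclic_autocorrelation(x, a, tau)) for a in alphas]


if __name__ == "__main__":
    # Hypothetical test signal: a real sinusoid at normalized frequency 0.125,
    # which shows a cyclic feature at alpha = 2 * 0.125 = 0.25.
    signal = [math.cos(2 * math.pi * 0.125 * n) for n in range(64)]
    print(alpha_profile(signal, [0.0625, 0.25]))
```

The cost grows with the number of samples times the number of cyclic frequencies evaluated, which is why restricting the set of cyclic frequencies reduces the work.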
Abstract:
A parallel technique for a distributed memory machine, based on domain decomposition, is described for solving the Navier-Stokes equations in Cartesian and cylindrical coordinates in two dimensions with free surfaces. It is based on the code by Tome and McKee (J. Comp. Phys. 110 (1994) 171-186) and Tome (Ph.D. Thesis, University of Strathclyde, Glasgow, 1993), which in turn is based on the SMAC method by Amsden and Harlow (Report LA-4370, Los Alamos Scientific Laboratory, 1971). The method solves the Navier-Stokes equations in three steps: the momentum equations, the Poisson equation and particle movement. These equations are discretized by explicit and 5-point finite differences. The parallelization is performed by splitting the computational domain into vertical panels and assigning each panel to a processor. All the computation can then be performed using nearest neighbour communication. Test runs comparing the performance of the parallel code with the serial code are presented, together with a discussion of load balancing. PVM is used for communication between processes. (C) 1999 Elsevier B.V. All rights reserved.
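The panel decomposition described in this abstract can be sketched as follows: the grid columns are divided into contiguous vertical panels, one per processor, and each processor only needs to exchange boundary data with the panels to its left and right. This is a minimal illustration under assumed names (`split_into_panels`, `neighbours`); it does not reproduce the authors' code or its PVM calls, and the even split is an assumption rather than the load-balancing scheme discussed in the paper.

```python
def split_into_panels(n_columns, n_procs):
    """Return one (start, stop) column range per processor, as evenly sized as possible."""
    base, extra = divmod(n_columns, n_procs)
    panels = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` panels take one additional column each.
        width = base + (1 if rank < extra else 0)
        panels.append((start, start + width))
        start += width
    return panels


def neighbours(rank, n_procs):
    """Ranks owning the panels to the left and right; None at the domain edges."""
    left = rank - 1 if rank > 0 else None
    right = rank + 1 if rank < n_procs - 1 else None
    return left, right


if __name__ == "__main__":
    for rank, panel in enumerate(split_into_panels(10, 3)):
        print(rank, panel, neighbours(rank, 3))
```

With a 5-point stencil, each panel needs only the single column adjacent to each of its vertical boundaries, which is why nearest neighbour communication is sufficient.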
Abstract:
This work presents an algorithm for the security control of electric power systems using control actions such as generation reallocation, determined by sensitivity analysis (linearized model) and optimization by neural networks. The model is developed taking into account the dynamic aspects of the network. The preventive control methodology is developed by means of a sensitivity analysis of the security margin with respect to the mechanical power of the system's synchronous machines. The power reallocated to each machine is determined using neural networks. The neural network used in this work is of the Hopfield type. These networks are dedicated electric circuits which simulate the constraint set and the objective function of an optimization problem. Their advantage is that solutions are obtained faster than with conventional optimization algorithms, owing to the high convergence rate of the process and the ease with which the method can be parallelized. The objectives are therefore to formulate and investigate implementations of these networks on digital computers for determining the generation reallocation. To illustrate the proposed methodology, an application to a multi-machine system is presented.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
In this work we evaluate a class of wavefield continuation operators based on one-way equations, with direct application to seismic migration. The method for representing one-way wave equations developed in this work is valid for arbitrary angular aperture and is based on the concept of the stiffness of a half-space, on the Dirichlet-to-Neumann map, and on its discretization by finite elements. Constructing the continuation operators requires introducing auxiliary variables, whose number grows with the largest angular aperture desired for the operator. We implemented the method in the space-frequency domain, which allows its immediate parallelization. Based on numerical experiments that evaluate the dispersion relation and the impulse response of the operator, we propose prescriptions for specifying the number of auxiliary variables and the continuation step for the migration operator. Applying the algorithm to the SEG-EAGE salt dome model data demonstrates its ability to migrate steeply dipping reflectors in media with strong lateral velocity variation.
Abstract:
Continuous improvement of industrial processes is one way for companies to become more competitive in the market. A widespread approach is the lean production methodology, which works by eliminating waste. One of the tools of these systems is the quick die changeover method, also known as SMED, which is applied in this study. The study aims to develop proposals for reducing the setup time of two machines in two machining lines, while also observing the ergonomics and safety conditions of these operations. Reducing setup time is justified, among other reasons, by the resulting increase in machine productivity. In the application to the connecting rod machining line, there were two types of changeover. The proposed setup time reductions reached 47% for one of them and 55% for the other. It is important to underline that reaching this result required no large investments. In the application to the block machining line, an ergonomic improvement had been introduced: a pulley block was installed, which increased the tool change time. To improve the safety of the changeover without losing productivity, the method was applied to reduce this time. Two proposals were developed: the first would reduce the time by 19% and requires few company resources. The second involves parallelizing the changeover, giving a reduction of 48%. However, this proposal requires an additional worker during the changeover, which is not always possible.
Abstract:
São Paulo State Research Foundation (FAPESP)
Abstract:
Identifying opportunities for software parallelism is a task that takes a lot of human time, but once some code patterns for parallelism are identified, software can accomplish this task quickly. Automating the process therefore brings many benefits, such as saving time and reducing programmer errors [1]. This work aims to develop a software environment that identifies opportunities for parallelism in source code written in C and generates a program with the same behavior but a higher degree of parallelism, targeting a graphics processor compatible with the CUDA architecture.
Abstract:
In this article, we introduce two new variants of the Assembly Line Worker Assignment and Balancing Problem (ALWABP) that allow parallelization of and collaboration between heterogeneous workers. These new approaches add a level of complexity to the Line Design and Assignment process, but also bring greater flexibility, which may be particularly useful in practical situations where the aim is to progressively integrate slow or limited workers into conventional assembly lines. We present linear models and heuristic procedures for these two new problems. Computational results show the efficiency of the proposed approaches and the efficacy of the studied layouts in different situations. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
We report cross sections for elastic electron scattering by gas phase glycine (neutral form), obtained with the Schwinger multichannel method. The present results are the first obtained with a new implementation that combines parallelization with OpenMP directives and pseudopotentials. The position of the well known π* shape resonance ranged from 2.3 eV to 2.8 eV depending on the polarization model and conformer. For the most stable isomer, the present result (2.4 eV) is in fair agreement with electron transmission spectroscopy assignments (1.93 ± 0.05 eV) and available calculations. Our results also point to a shape resonance around 9.5 eV in the A' symmetry that would be weakly coupled to vibrations of the hydroxyl group. Since electron attachment to a broad and lower lying σ* orbital located on the OH bond has been suggested as the underlying mechanism leading to dissociative electron attachment at low energies, we searched for a shape resonance around 4 eV. Though we obtained cross sections with the target molecule at the equilibrium geometry and with stretched OH bond lengths, least-squares fits to the calculated eigenphase sums did not reveal signatures of this anion state (though, in principle, it could be hidden in the large background). The low energy (~1 eV) integral cross section scales strongly as the bond length is stretched, and this could indicate a virtual state pole, since dipole supported bound states are not expected at the geometries addressed here. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3687345]
Abstract:
Modern GPUs are well suited for computationally intensive tasks and massively parallel computation. Sparse matrix multiplication and the linear triangular solver are among the most important and heavily used kernels in scientific computation, and several challenges in developing a high performance kernel with these two modules are investigated. The main interest is to solve linear systems derived from elliptic equations discretized with triangular elements. The resulting linear system has a symmetric positive definite matrix. The sparse matrix is stored in the compressed sparse row (CSR) format. A CUDA algorithm is proposed to execute the matrix-vector multiplication directly on the CSR format. A dependence tree algorithm is used to determine which variables the linear triangular solver can compute in parallel. To increase the number of parallel threads, a graph coloring algorithm is implemented to reorder the mesh numbering in a pre-processing phase. The proposed method is compared with available parallel and serial libraries. The results show that the proposed method reduces the computational cost of the matrix-vector multiplication. The pre-processing associated with the triangular solver needs to be executed just once in the proposed method. The conjugate gradient method was implemented and showed a similar convergence rate for all the compared methods. The proposed method showed a significantly smaller execution time.
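As an illustration of why the CSR format lends itself to parallel matrix-vector multiplication, the serial sketch below computes y = A x directly from the three CSR arrays. It is written in Python for readability and is not the CUDA algorithm proposed in the work; the array names (`indptr`, `indices`, `data`) follow a common convention and the small matrix is a made-up example. Each row of the result depends only on that row's slice of the arrays, so rows could be assigned to separate GPU threads, which is an assumption about one possible mapping.

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in compressed sparse row (CSR) format.

    indptr  : row pointers; row i owns entries indptr[i] .. indptr[i+1]-1
    indices : column index of each stored entry
    data    : value of each stored entry
    x       : dense input vector
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):  # rows are independent of each other
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


if __name__ == "__main__":
    # Hypothetical 3x3 symmetric positive definite matrix:
    # [[4, 1, 0],
    #  [1, 3, 1],
    #  [0, 1, 2]]
    indptr = [0, 2, 5, 7]
    indices = [0, 1, 0, 1, 2, 1, 2]
    data = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
    print(csr_matvec(indptr, indices, data, [1.0, 2.0, 3.0]))
```

The triangular solve is harder to parallelize because each unknown may depend on earlier ones, which is the motivation for the dependence tree and graph coloring steps mentioned in the abstract.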