8 results for PETSc
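Abstract:
Large-scale scientific computing is widely used in meteorology, oceanography, chemistry, biomedicine, electronic engineering, and other fields. Developing scientific computing software is a key part of scientific computing. It is therefore valuable to develop a reliable computational tool and integrate it with a large computational toolkit to carry out large-scale computations for complex real-world problems. PETSc (Portable, Extensible Toolkit for Scientific Computation) is an internationally popular scientific computing toolkit that can be used to solve partial differential equations and related high-performance computing problems. This work analyzes the main functions, structure, and features of PETSc and examines its core components, including vectors, matrices, the linear solver KSP, and the nonlinear solver SNES. Automatic differentiation is an important method for computing derivatives of functions and can be applied in practical computations for optimization problems. PETSc provides interfaces to automatic differentiation packages such as ADIC and ADIFOR. This work analyzes the basic principles of the tangent-linear mode and the adjoint mode for computing first derivatives by automatic differentiation, and surveys existing automatic differentiation software, in particular the development of ADIC and its interface to PETSc. DTC (Differentiation Transforming System in C) is an automatic differentiation tool for the C language that generates tangent-linear code. The tangent-linear code generated by DTC can be used, among other things, to compute Jacobian matrix-vector products. This work describes in detail the design and key techniques of the DTC system, including compilation techniques and input/output (IO) dependence analysis. To handle PETSc's complex data structures, an interface between DTC and PETSc was developed, the two were integrated, and the combination was applied to solving the two-dimensional global barotropic atmospheric shallow-water equations. Finally, test results for the DTC system are given.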
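The tangent-linear (forward) mode mentioned in the abstract above can be illustrated with a minimal sketch. This is not DTC or ADIC, which transform C source code. It is an operator-overloading toy in Python, and the names `Dual`, `jvp`, and the test function `f` are assumptions introduced only for illustration. Each value carries its directional derivative, so one evaluation of `f` yields a Jacobian-vector product without forming the Jacobian.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its directional derivative (tangent)."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    # Chain rule: d/dt sin(u) = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)


def jvp(f, x, v):
    """Return (f(x), J(x) v) from a single tangent-linear sweep."""
    outputs = f([Dual(xi, vi) for xi, vi in zip(x, v)])
    return [o.val for o in outputs], [o.tan for o in outputs]


def f(x):
    # Hypothetical test function R^2 -> R^2, chosen only for illustration.
    return [x[0] * x[1], sin(x[0]) + 3.0 * x[1]]


# Seeding v = (1, 0) extracts the first column of the Jacobian at (1, 2).
value, product = jvp(f, [1.0, 2.0], [1.0, 0.0])
```

For this `f` the Jacobian is [[x1, x0], [cos(x0), 3]], so `product` holds its first column evaluated at (1, 2). A Jacobian-vector product of this kind is what a matrix-free Krylov solver needs at each iteration.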
Abstract:
This work studies standards and guidelines for parallel programming on distributed systems, using the MPI standard and the PETSc toolkit, and analyzes their performance on selected mathematical operations involving matrices. The concepts are used to develop applications that solve problems involving Principal Component Analysis (PCA), which are executed on a Beowulf cluster. The results are compared with those of an analogous sequential application to determine whether the parallel application achieves any performance gain.
Abstract:
This paper describes a three-dimensional MHD solver. The solver simulates reacting flows with nonequilibrium among the translational-rotational, vibrational, and electron translational modes. The conservation equations are discretized with implicit time marching and the second-order modified Steger-Warming scheme, and the resulting linear system is solved iteratively with a Newton-Krylov-Schwarz method implemented with the PETSc package. Convergence tests show good scalability and convergence about twice as fast as the DPLR method. Five test runs then simulate experiments performed at the NASA Ames MHD channel, and the calculated pressures, temperatures, electrical conductivity, back EMF, load factors, and flow accelerations agree with the experimental data. The computation shows that the electrical conductivity distribution is not uniform in the powered section of the MHD channel, and that Joule heating must be included in order to calculate the correct conductivity and MHD acceleration.
Abstract:
Geotechnical engineering is one of the major areas of civil engineering. It studies the interaction of man-made structures or natural phenomena with the geological environment, which in most cases consists of partially saturated soils. Accordingly, the performance of works such as stabilization, dam containment, retaining walls, foundations, and roads depends on a correct prediction of water flow within the soil. However, because the regions studied for water-flow prediction commonly cover areas on the order of square kilometers, solving the mathematical models requires very large computational meshes, which imposes serious limitations in memory requirements and processing time. To circumvent these limitations, efficient numerical methods must be employed, and iterative methods for large sparse nonlinear and linear systems must therefore be used in this type of application. Given the relevance of the topic, this research approximated a solution to the Richards partial differential equation by the finite volume method in two dimensions, employing the Picard and Newton methods with greater computational efficiency. To this end, Krylov-subspace iterative techniques for solving linear systems, together with preconditioning matrices, were used through the numerical library Portable, Extensible Toolkit for Scientific Computation (PETSc). The results indicate that when the Richards equation is solved with the Picard-Krylov method, regardless of the soil evaluation model, the best combination for solving the linear systems is the stabilized biconjugate gradient method with the SOR preconditioner. 
On the other hand, when the van Genuchten equations are used, the conjugate gradient method combined with the SOR preconditioner should be chosen. When the Newton-Krylov method is adopted, the stabilized biconjugate gradient method is the most efficient for solving the linear system of the Newton step, and block Jacobi should be preferred as the preconditioner. Finally, there is evidence that the Picard-Krylov method may be more advantageous than the Newton-Krylov method for solving the Richards partial differential equation.
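The difference between the Picard and Newton linearizations compared in the abstract above can be sketched on a scalar model problem. This is not the Richards equation and not the thesis code. The coefficient function `conductivity`, the right-hand side `b = 0.5`, and all function names are assumptions chosen only so the example fits in a few lines. Both methods solve k(u) u = b. Picard freezes k at the previous iterate, while Newton uses the exact derivative of the residual.

```python
def conductivity(u):
    # Hypothetical solution-dependent coefficient, standing in for K(h).
    return 1.0 + u * u


def picard(b, u=0.0, tol=1e-12, max_iter=200):
    """Fixed-point iteration: freeze k at the old iterate, solve the linear problem."""
    for it in range(1, max_iter + 1):
        u_new = b / conductivity(u)
        if abs(u_new - u) < tol:
            return u_new, it
        u = u_new
    raise RuntimeError("Picard did not converge")


def newton(b, u=0.0, tol=1e-12, max_iter=50):
    """Newton iteration on F(u) = k(u) * u - b using the exact derivative."""
    for it in range(1, max_iter + 1):
        residual = conductivity(u) * u - b
        derivative = 1.0 + 3.0 * u * u   # d/du (u + u**3)
        u_new = u - residual / derivative
        if abs(u_new - u) < tol:
            return u_new, it
        u = u_new
    raise RuntimeError("Newton did not converge")


u_picard, n_picard = picard(0.5)
u_newton, n_newton = newton(0.5)
```

On this toy problem Newton needs fewer iterations than Picard, but each Newton step requires a derivative. In a PDE setting that means assembling or applying a Jacobian, so a lower iteration count does not by itself imply a lower total cost, which is consistent with the abstract's finding that Picard-Krylov can be the more advantageous choice.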
Abstract:
The study of water flow and scalar transport in hydroelectric reservoirs is important for determining water quality during the initial filling phases and over the service life of the reservoir. In this context, a parallel 2D finite element code was implemented to solve the incompressible Navier-Stokes equations coupled with scalar transport, using the message-passing programming model, in order to run simulations on a computer cluster. The spatial discretization is based on the MINI element, which satisfies the Babuska-Brezzi (BB) conditions and therefore allows a stable mixed formulation. All distributed data structures needed in the different phases of the code, such as preprocessing, solution, and postprocessing, were implemented using the PETSc library. The resulting linear systems were solved using the discrete projection method with block LU factorization. To increase parallel performance in solving the linear systems, static condensation was used to solve for the intermediate velocity at the vertices and at the centroid of the MINI element separately. The performance of the static condensation method was compared with that of solving the full system. The tests showed that static condensation performs better for large problems, at the cost of higher memory usage. The performance of other parts of the code is also presented.
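Static condensation, as named in the abstract above, eliminates the element-interior unknowns before the global solve and recovers them afterwards. The sketch below shows the algebra on a tiny system using exact rational arithmetic. The 3x3 matrix, the right-hand side, and the split into two "vertex" unknowns and one "bubble" unknown are assumptions made for illustration, not data from the work.

```python
from fractions import Fraction

# Hypothetical 3x3 system: unknowns 0 and 1 play the role of vertex values,
# unknown 2 plays the role of the element-interior (bubble) value.
K = [[Fraction(4), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(1), Fraction(1), Fraction(2)]]
f = [Fraction(1), Fraction(2), Fraction(3)]

# Partition K = [[A, B], [C, D]] with D the 1x1 bubble block.
D = K[2][2]
B = [K[0][2], K[1][2]]   # vertex rows, bubble column
C = [K[2][0], K[2][1]]   # bubble row, vertex columns

# Condensed (Schur complement) system: S = A - B D^{-1} C, g = f_v - B D^{-1} f_b
S = [[K[i][j] - B[i] * C[j] / D for j in range(2)] for i in range(2)]
g = [f[i] - B[i] * f[2] / D for i in range(2)]

# Solve the 2x2 condensed system with Cramer's rule.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
x0 = (g[0] * S[1][1] - S[0][1] * g[1]) / det
x1 = (S[0][0] * g[1] - S[1][0] * g[0]) / det

# Recover the bubble unknown by back-substitution.
y = (f[2] - C[0] * x0 - C[1] * x1) / D

solution = [x0, x1, y]
residual = [sum(K[i][j] * solution[j] for j in range(3)) - f[i] for i in range(3)]
```

Because a bubble function is supported on a single element, the block D is local to that element, so in a finite element code the elimination can be done element by element with no communication. That locality is the usual motivation for applying the technique in parallel.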
Abstract:
Numerical modeling is essential for understanding groundwater flow and solving hydrogeological problems. Groundwater studies now require very large numbers of model cells and high accuracy, which exceed the capabilities of a single-CPU computer. With the development of high-performance parallel computing, applying parallel methods to the numerical modeling of groundwater flow has become necessary, and doing so improves the ability to resolve a range of hydrogeological and environmental problems. This study discusses parallel computing on two main types of modern parallel architecture: shared memory parallel systems and distributed shared memory parallel systems. OpenMP and MPI (PETSc) are both used to parallelize MODFLOW, the most widely used groundwater simulator. Two parallel solvers, P-PCG and P-MODFLOW, were developed for MODFLOW. The parallelized MODFLOW was used to simulate regional groundwater flow in Beishan, Gansu Province, a potential geological disposal area for high-level radioactive waste in China. 1. The OpenMP programming paradigm was used to parallelize the PCG (preconditioned conjugate-gradient) solver, one of the main solvers for MODFLOW. The parallel PCG solver, P-PCG, was verified on an 8-processor computer. The numerical experiments considered both the impact of compilers and different model domain sizes. The largest test model has 1000 columns, 1000 rows, and 1000 layers. Based on the timing results, runs using the P-PCG solver are typically about 1.40 to 5.31 times faster than runs using the serial solver. In addition, the simulation results are identical to those of the original PCG solver, because most of the serial code was left unchanged. 
It is worth noting that this parallelization approach reduces software maintenance costs, because only a single PCG solver source needs to be maintained in the MODFLOW source tree. 2. P-MODFLOW, a domain decomposition-based model implemented in a parallel computing environment, was developed to allow efficient simulation of regional-scale groundwater flow. The basic approach partitions a large model domain into any number of subdomains, and parallel processors solve the model equations within each subdomain. Using domain decomposition to run MODFLOW on distributed shared memory parallel systems extends its applicability to the most widely used class of systems, so that a large-scale simulation can take full advantage of hundreds or even thousands of parallel processors. P-MODFLOW shows good parallel performance, with a maximum speedup of 18.32 on 14 processors. Superlinear speedups were achieved in the parallel tests, indicating the efficiency and scalability of the code. Parallel program design, load balancing, and full use of PETSc were all considered in order to achieve a highly efficient parallel program. 3. Characterizing the regional groundwater flow system is very important for the geological disposal of high-level radioactive waste. The Beishan area, in northwestern Gansu Province, China, has been selected as a potential repository site. The area covers about 80,000 km² and has complicated hydrogeological conditions, which greatly increase the computational effort of regional groundwater flow models. To reduce computing time, a parallel computing scheme was applied to the regional groundwater flow modeling. Models with over 10 million cells were used to simulate how faults and different recharge conditions affect the regional groundwater flow pattern. 
The results of this study provide regional groundwater flow information for characterizing the potential high-level radioactive waste disposal site.
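For readers unfamiliar with the PCG solver discussed in item 1 of the abstract above, the following is a minimal serial sketch of a preconditioned conjugate gradient iteration. It is a textbook formulation with a Jacobi (diagonal) preconditioner, not MODFLOW's PCG package or the P-PCG solver. The function names and the 1D test matrix are assumptions chosen for illustration.

```python
def pcg(matvec, b, diag, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned conjugate gradient for an SPD system A x = b."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                               # r = b - A x with x = 0
    z = [r[i] / diag[i] for i in range(n)]    # apply M^{-1} = diag(A)^{-1}
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = sum(v * v for v in b) ** 0.5
    for it in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        # Stop when the relative residual norm is small enough.
        if sum(v * v for v in r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    raise RuntimeError("PCG did not converge")


def laplacian_1d(v):
    """Apply the tridiagonal matrix with 2 on the diagonal and -1 off it."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


n = 50
b = [1.0] * n
x, iterations = pcg(laplacian_1d, b, diag=[2.0] * n)
```

Every step in the loop is a matrix-vector product, a dot product, or a vector update over independent entries. Loops of this kind are the natural targets for loop-level directives in a shared memory setting, which fits the abstract's remark that most of the serial code could be left unchanged.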
Abstract:
Three paradigms for distributed-memory parallel computation that free the application programmer from the details of message passing are compared for an archetypal structured scientific computation -- a nonlinear, structured-grid partial differential equation boundary value problem -- using the same algorithm on the same hardware. All of the paradigms -- parallel languages represented by the Portland Group's HPF, (semi-)automated serial-to-parallel source-to-source translation represented by CAPTools from the University of Greenwich, and parallel libraries represented by Argonne's PETSc -- are found to be easy to use for this problem class, and all are reasonably effective in exploiting concurrency after a short learning curve. The level of involvement required of the application programmer under any paradigm includes specification of the data partitioning, corresponding to a geometrically simple decomposition of the domain of the PDE. Programming in SPMD style for the PETSc library requires writing only the routines that discretize the PDE and its Jacobian, managing subdomain-to-processor mappings (affine global-to-local index mappings), and interfacing to library solver routines. Programming for HPF requires a complete sequential implementation of the same algorithm as a starting point, introduction of concurrency through subdomain blocking (a task similar to the index mapping), and modest experimentation with rewriting loops to expose the latent concurrency to the compiler. Programming with CAPTools involves feeding the same sequential implementation to the CAPTools interactive parallelization system, and guiding the source-to-source code transformation by responding to various queries about quantities knowable only at runtime. 
Results representative of "the state of the practice" for a scaled sequence of structured grid problems are given on three of the most important contemporary high-performance platforms: the IBM SP, the SGI Origin 2000, and the Cray T3E.
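The "affine global-to-local index mappings" mentioned in the abstract above can be sketched for the simplest case, a contiguous 1D block partition. The functions `ownership_range` and `global_to_local` are hypothetical names introduced for this illustration. They are not PETSc routines, and the partitioning rule (spread the remainder over the lowest ranks) is an assumption.

```python
def ownership_range(n_global, n_procs, rank):
    """Half-open [start, end) range of global indices owned by `rank`."""
    base, extra = divmod(n_global, n_procs)
    # The first `extra` ranks each take one additional index.
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


def global_to_local(i_global, n_global, n_procs):
    """Map a global index to (owning rank, local index)."""
    for rank in range(n_procs):
        start, end = ownership_range(n_global, n_procs, rank)
        if start <= i_global < end:
            return rank, i_global - start   # affine map: local = global - start
    raise IndexError(i_global)


# Ten grid points split over three processes.
ranges = [ownership_range(10, 3, r) for r in range(3)]
owner = global_to_local(7, 10, 3)
```

With ten points on three processes the ranges come out as 4, 3, and 3 points, and global index 7 is the first local entry on the last rank. On a structured grid the same offset arithmetic applies per coordinate direction, which is why the mapping stays affine.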
Abstract:
This paper compares three alternative numerical algorithms applied to a nonlinear metal cutting problem. One algorithm is based on an explicit method and the other two are implicit. Domain decomposition (DD) is used to break the original domain into subdomains, each containing a properly connected, well-formulated and continuous subproblem. The serial version of the explicit algorithm is implemented in FORTRAN and its parallel version uses MPI (Message Passing Interface) calls. One implicit algorithm is implemented by coupling the state-of-the-art PETSc (Portable, Extensible Toolkit for Scientific Computation) software with in-house software in order to solve the subproblems. The second implicit algorithm is implemented completely within PETSc. PETSc uses MPI as the underlying communication library. Finally, a 2D example is used to test the algorithms and various comparisons are made.