43 resultados para Supercomputers


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Immersed boundary simulations have been under development for physiological flows, allowing for elegant handling of fluid-structure interaction modelling with large deformations due to retained domain-specific meshing. We couple a structural system in Lagrangian representation that is formulated in a weak form with a Navier-Stokes system discretized through a finite differences scheme. We build upon a proven highly scalable imcompressible flow solver that we extend to handle FSI. We aim at applying our method to investigating the hemodynamics of Aortic Valves. The code is going to be extended to conform to the new hybrid-node supercomputers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Esta tesis constituye un gran avance en el conocimiento del estudio y análisis de inestabilidades hidrodinámicas desde un punto de vista físico y teórico, como consecuencia de haber desarrollado innovadoras técnicas para la resolución computacional eficiente y precisa de la parte principal del espectro correspondiente a los problemas de autovalores (EVP) multidimensionales que gobiernan la inestabilidad de flujos con dos o tres direcciones espaciales inhomogéneas, denominados problemas de estabilidad global lineal. En el contexto del trabajo de desarrollo de herramientas computacionales presentado en la tesis, la discretización mediante métodos de diferencias finitas estables de alto orden de los EVP bidimensionales y tridimensionales que se derivan de las ecuaciones de Navier-Stokes linealizadas sobre flujos con dos o tres direcciones espaciales inhomogéneas, ha permitido una aceleración de cuatro órdenes de magnitud en su resolución. Esta mejora de eficiencia numérica se ha conseguido gracias al hecho de que usando estos esquemas de diferencias finitas, técnicas eficientes de resolución de problemas lineales son utilizables, explotando el alto nivel de dispersión o alto número de elementos nulos en las matrices involucradas en los problemas tratados. Como más notable consecuencia cabe destacar que la resolución de EVPs multidimensionales de inestabilidad global, que hasta la fecha necesitaban de superordenadores, se ha podido realizar en ordenadores de sobremesa. Además de la solución de problemas de estabilidad global lineal, el mencionado desarrollo numérico facilitó la extensión de las ecuaciones de estabilidad parabolizadas (PSE) lineales y no lineales para analizar la inestabilidad de flujos que dependen fuertemente en dos direcciones espaciales y suavemente en la tercera con las ecuaciones de estabilidad parabolizadas tridimensionales (PSE-3D). Precisamente la capacidad de extensión del novedoso algoritmo PSE-3D para el estudio de interacciones no lineales de los modos de estabilidad, desarrollado íntegramente en esta tesis, permite la predicción de transición en flujos complejos de gran interés industrial y por lo tanto extiende el concepto clásico de PSE, el cuál ha sido empleado exitosamente durante las pasadas tres décadas en el mismo contexto para problemas de capa límite bidimensional. Típicos ejemplos de flujos incompresibles se han analizado en este trabajo sin la necesidad de recurrir a restrictivas presuposiciones usadas en el pasado. Se han estudiado problemas vorticales como es el caso de un vórtice aislado o sistemas de vórtices simulando la estela de alas, en los que la homogeneidad axial no se impone y así se puede considerar la difusión viscosa del flujo. Además, se ha estudiado el chorro giratorio turbulento, cuya inestabilidad se utiliza para mejorar las características de funcionamiento de combustores. En la tesis se abarcan adicionalmente problemas de flujos compresibles. Se presenta el estudio de inestabilidad de flujos de borde de ataque a diferentes velocidades de vuelo. También se analiza la estela formada por un elemento rugoso aislado en capa límite supersónica e hipersónica, mostrando excelentes comparaciones con resultados obtenidos mediante simulación numérica directa. Finalmente, nuevas inestabilidades se han identificado en el flujo hipersónico a Mach 7 alrededor de un cono elíptico que modela el vehículo de pruebas en vuelo HIFiRE-5. Los resultados comparan favorablemente con experimentos en vuelo, lo que subraya aún más el potencial de las metodologías de análisis de estabilidad desarrolladas en esta tesis. ABSTRACT The present thesis constitutes a step forward in advancing the frontiers of knowledge of fluid flow instability from a physical point of view, as a consequence of having been successful in developing groundbreaking methodologies for the efficient and accurate computation of the leading part of the spectrum pertinent to multi-dimensional eigenvalue problems (EVP) governing instability of flows with two or three inhomogeneous spatial directions. In the context of the numerical work presented in this thesis, the discretization of the spatial operator resulting from linearization of the Navier-Stokes equations around flows with two or three inhomogeneous spatial directions by variable-high-order stable finite-difference methods has permitted a speedup of four orders of magnitude in the solution of the corresponding two- and three-dimensional EVPs. This improvement of numerical performance has been achieved thanks to the high-sparsity level offered by the high-order finite-difference schemes employed for the discretization of the operators. This permitted use of efficient sparse linear algebra techniques without sacrificing accuracy and, consequently, solutions being obtained on typical workstations, as opposed to the previously employed supercomputers. Besides solution of the two- and three-dimensional EVPs of global linear instability, this development paved the way for the extension of the (linear and nonlinear) Parabolized Stability Equations (PSE) to analyze instability of flows which depend in a strongly-coupled inhomogeneous manner on two spatial directions and weakly on the third. Precisely the extensibility of the novel PSE-3D algorithm developed in the framework of the present thesis to study nonlinear flow instability permits transition prediction in flows of industrial interest, thus extending the classic PSE concept which has been successfully employed in the same context to boundary-layer type of flows over the last three decades. Typical examples of incompressible flows, the instability of which was analyzed in the present thesis without the need to resort to the restrictive assumptions used in the past, range from isolated vortices, and systems thereof, in which axial homogeneity is relaxed to consider viscous diffusion, as well as turbulent swirling jets, the instability of which is exploited in order to improve flame-holding properties of combustors. The instability of compressible subsonic and supersonic leading edge flows has been solved, and the wake of an isolated roughness element in a supersonic and hypersonic boundary-layer has also been analyzed with respect to its instability: excellent agreement with direct numerical simulation results has been obtained in all cases. Finally, instability analysis of Mach number 7 ow around an elliptic cone modeling the HIFiRE-5 flight test vehicle has unraveled flow instabilities near the minor-axis centerline, results comparing favorably with flight test predictions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper focuses on the parallelization of an ocean model applying current multicore processor-based cluster architectures to an irregular computational mesh. The aim is to maximize the efficiency of the computational resources used. To make the best use of the resources offered by these architectures, this parallelization has been addressed at all the hardware levels of modern supercomputers: firstly, exploiting the internal parallelism of the CPU through vectorization; secondly, taking advantage of the multiple cores of each node using OpenMP; and finally, using the cluster nodes to distribute the computational mesh, using MPI for communication within the nodes. The speedup obtained with each parallelization technique as well as the combined overall speedup have been measured for the western Mediterranean Sea for different cluster configurations, achieving a speedup factor of 73.3 using 256 processors. The results also show the efficiency achieved in the different cluster nodes and the advantages obtained by combining OpenMP and MPI versus using only OpenMP or MPI. Finally, the scalability of the model has been analysed by examining computation and communication times as well as the communication and synchronization overhead due to parallelization.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Protein folding occurs on a time scale ranging from milliseconds to minutes for a majority of proteins. Computer simulation of protein folding, from a random configuration to the native structure, is nontrivial owing to the large disparity between the simulation and folding time scales. As an effort to overcome this limitation, simple models with idealized protein subdomains, e.g., the diffusion–collision model of Karplus and Weaver, have gained some popularity. We present here new results for the folding of a four-helix bundle within the framework of the diffusion–collision model. Even with such simplifying assumptions, a direct application of standard Brownian dynamics methods would consume 10,000 processor-years on current supercomputers. We circumvent this difficulty by invoking a special Brownian dynamics simulation. The method features the calculation of the mean passage time of an event from the flux overpopulation method and the sampling of events that lead to productive collisions even if their probability is extremely small (because of large free-energy barriers that separate them from the higher probability events). Using these developments, we demonstrate that a coarse-grained model of the four-helix bundle can be simulated in several days on current supercomputers. Furthermore, such simulations yield folding times that are in the range of time scales observed in experiments.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, parallel Relaxed and Extrapolated algorithms based on the Power method for accelerating the PageRank computation are presented. Different parallel implementations of the Power method and the proposed variants are analyzed using different data distribution strategies. The reported experiments show the behavior and effectiveness of the designed algorithms for realistic test data using either OpenMP, MPI or an hybrid OpenMP/MPI approach to exploit the benefits of shared memory inside the nodes of current SMP supercomputers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Mode of access: Internet.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Climate change has a great impact on the build and the work of natural ecosystems. Disappearance of some population or growth of the number in some species can be already caused by little change in temperature. A Theoretical Ecosystem Growth Model was investigated in order to examine the effects of various climate patterns on the ecological equilibrium. The answers of the ecosystems which are given to the climate change could be described by means of global climate modelling and dynamic vegetation models. The examination of the operation of the ecosystems is only possible in huge centres on supercomputers because of the number and the complexity of the calculation. The number of the calculation could be decreased to the level of a PC by considering the temperature and the reproduction during the modelling of a theoretical ecosystem and several important theoretical questions could be answered.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Climate change has serious effects on the setting up and the operation of natural ecosystems. Small increase in temperature could cause rise in the amount of some species or potential disappearance of others. During our researches, the dispersion of the species and biomass production of a theoretical ecosystem were examined on the effect of the temperature–climate change. The answers of the ecosystems which are given to the climate change could be described by means of global climate modelling and dynamic vegetation models. The examination of the operation of the ecosystems is only possible in huge centres on supercomputers because of the number and the complexity of the calculation. The number of the calculation could be decreased to the level of a PC by considering the temperature and the reproduction during modelling a theoretical ecosystem, and several important theoretical questions could be answered.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Energy-efficient computing remains a critical challenge across the wide range of future data-processing engines — from ultra-low-power embedded systems to servers, mainframes, and supercomputers. In addition, the advent of cloud and mobile computing as well as the explosion of IoT technologies have created new research challenges in the already complex, multidimensional space of modern and future computer systems. These new research challenges led to the establishment of the IEEE Rebooting Computing Initiative, which specifically addresses novel low-power solutions and technologies as one of the main areas of concern.With this in mind, we thought it timely to survey the state of the art of energy-efficient computing.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The availability of CFD software that can easily be used and produce high efficiency on a wide range of parallel computers is extremely limited. The investment and expertise required to parallelise a code can be enormous. In addition, the cost of supercomputers forces high utilisation to justify their purchase, requiring a wide range of software. To break this impasse, tools are urgently required to assist in the parallelisation process that dramatically reduce the parallelisation time but do not degrade the performance of the resulting parallel software. In this paper we discuss enhancements to the Computer Aided Parallelisation Tools (CAPTools) to assist in the parallelisation of complex unstructured mesh-based computational mechanics codes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Due to the growth of design size and complexity, design verification is an important aspect of the Logic Circuit development process. The purpose of verification is to validate that the design meets the system requirements and specification. This is done by either functional or formal verification. The most popular approach to functional verification is the use of simulation based techniques. Using models to replicate the behaviour of an actual system is called simulation. In this thesis, a software/data structure architecture without explicit locks is proposed to accelerate logic gate circuit simulation. We call thus system ZSIM. The ZSIM software architecture simulator targets low cost SIMD multi-core machines. Its performance is evaluated on the Intel Xeon Phi and 2 other machines (Intel Xeon and AMD Opteron). The aim of these experiments is to: • Verify that the data structure used allows SIMD acceleration, particularly on machines with gather instructions ( section 5.3.1). • Verify that, on sufficiently large circuits, substantial gains could be made from multicore parallelism ( section 5.3.2 ). • Show that a simulator using this approach out-performs an existing commercial simulator on a standard workstation ( section 5.3.3 ). • Show that the performance on a cheap Xeon Phi card is competitive with results reported elsewhere on much more expensive super-computers ( section 5.3.5 ). To evaluate the ZSIM, two types of test circuits were used: 1. Circuits from the IWLS benchmark suit [1] which allow direct comparison with other published studies of parallel simulators.2. Circuits generated by a parametrised circuit synthesizer. The synthesizer used an algorithm that has been shown to generate circuits that are statistically representative of real logic circuits. The synthesizer allowed testing of a range of very large circuits, larger than the ones for which it was possible to obtain open source files. The experimental results show that with SIMD acceleration and multicore, ZSIM gained a peak parallelisation factor of 300 on Intel Xeon Phi and 11 on Intel Xeon. With only SIMD enabled, ZSIM achieved a maximum parallelistion gain of 10 on Intel Xeon Phi and 4 on Intel Xeon. Furthermore, it was shown that this software architecture simulator running on a SIMD machine is much faster than, and can handle much bigger circuits than a widely used commercial simulator (Xilinx) running on a workstation. The performance achieved by ZSIM was also compared with similar pre-existing work on logic simulation targeting GPUs and supercomputers. It was shown that ZSIM simulator running on a Xeon Phi machine gives comparable simulation performance to the IBM Blue Gene supercomputer at very much lower cost. The experimental results have shown that the Xeon Phi is competitive with simulation on GPUs and allows the handling of much larger circuits than have been reported for GPU simulation. When targeting Xeon Phi architecture, the automatic cache management of the Xeon Phi, handles and manages the on-chip local store without any explicit mention of the local store being made in the architecture of the simulator itself. However, targeting GPUs, explicit cache management in program increases the complexity of the software architecture. Furthermore, one of the strongest points of the ZSIM simulator is its portability. Note that the same code was tested on both AMD and Xeon Phi machines. The same architecture that efficiently performs on Xeon Phi, was ported into a 64 core NUMA AMD Opteron. To conclude, the two main achievements are restated as following: The primary achievement of this work was proving that the ZSIM architecture was faster than previously published logic simulators on low cost platforms. The secondary achievement was the development of a synthetic testing suite that went beyond the scale range that was previously publicly available, based on prior work that showed the synthesis technique is valid.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

After a decade evolving in the High Performance Computing arena, GPU-equipped supercomputers have con- quered the top500 and green500 lists, providing us unprecedented levels of computational power and memory bandwidth. This year, major vendors have introduced new accelerators based on 3D memory, like Xeon Phi Knights Landing by Intel and Pascal architecture by Nvidia. This paper reviews hardware features of those new HPC accelerators and unveils potential performance for scientific applications, with an emphasis on Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM) used by commercial products according to roadmaps already announced.