922 resultados para Parallel numerical algorithms
Resumo:
Finite element techniques for solving the problem of fluid-structure interaction of an elastic solid material in a laminar incompressible viscous flow are described. The mathematical problem consists of the Navier-Stokes equations in the Arbitrary Lagrangian-Eulerian formulation coupled with a non-linear structure model, considering the problem as one continuum. The coupling between the structure and the fluid is enforced inside a monolithic framework which computes simultaneously for the fluid and the structure unknowns within a unique solver. We used the well-known Crouzeix-Raviart finite element pair for discretization in space and the method of lines for discretization in time. A stability result using the Backward-Euler time-stepping scheme for both fluid and solid part and the finite element method for the space discretization has been proved. The resulting linear system has been solved by multilevel domain decomposition techniques. Our strategy is to solve several local subproblems over subdomain patches using the Schur-complement or GMRES smoother within a multigrid iterative solver. For validation and evaluation of the accuracy of the proposed methodology, we present corresponding results for a set of two FSI benchmark configurations which describe the self-induced elastic deformation of a beam attached to a cylinder in a laminar channel flow, allowing stationary as well as periodically oscillating deformations, and for a benchmark proposed by COMSOL multiphysics where a narrow vertical structure attached to the bottom wall of a channel bends under the force due to both viscous drag and pressure. Then, as an example of fluid-structure interaction in biomedical problems, we considered the academic numerical test which consists in simulating the pressure wave propagation through a straight compliant vessel. All the tests show the applicability and the numerical efficiency of our approach to both two-dimensional and three-dimensional problems.
Resumo:
Shape memory materials (SMMs) represent an important class of smart materials that have the ability to return from a deformed state to their original shape. Thanks to such a property, SMMs are utilized in a wide range of innovative applications. The increasing number of applications and the consequent involvement of industrial players in the field have motivated researchers to formulate constitutive models able to catch the complex behavior of these materials and to develop robust computational tools for design purposes. Such a research field is still under progress, especially in the prediction of shape memory polymer (SMP) behavior and of important effects characterizing shape memory alloy (SMA) applications. Moreover, the frequent use of shape memory and metallic materials in biomedical devices, particularly in cardiovascular stents, implanted in the human body and experiencing millions of in-vivo cycles by the blood pressure, clearly indicates the need for a deeper understanding of fatigue/fracture failure in microsize components. The development of reliable stent designs against fatigue is still an open subject in scientific literature. Motivated by the described framework, the thesis focuses on several research issues involving the advanced constitutive, numerical and fatigue modeling of elastoplastic and shape memory materials. Starting from the constitutive modeling, the thesis proposes to develop refined phenomenological models for reliable SMA and SMP behavior descriptions. Then, concerning the numerical modeling, the thesis proposes to implement the models into numerical software by developing implicit/explicit time-integration algorithms, to guarantee robust computational tools for practical purposes. The described modeling activities are completed by experimental investigations on SMA actuator springs and polyethylene polymers. Finally, regarding the fatigue modeling, the thesis proposes the introduction of a general computational approach for the fatigue-life assessment of a classical stent design, in order to exploit computer-based simulations to prevent failures and modify design, without testing numerous devices.
Resumo:
This dissertation studies the geometric static problem of under-constrained cable-driven parallel robots (CDPRs) supported by n cables, with n ≤ 6. The task consists of determining the overall robot configuration when a set of n variables is assigned. When variables relating to the platform posture are assigned, an inverse geometric static problem (IGP) must be solved; whereas, when cable lengths are given, a direct geometric static problem (DGP) must be considered. Both problems are challenging, as the robot continues to preserve some degrees of freedom even after n variables are assigned, with the final configuration determined by the applied forces. Hence, kinematics and statics are coupled and must be resolved simultaneously. In this dissertation, a general methodology is presented for modelling the aforementioned scenario with a set of algebraic equations. An elimination procedure is provided, aimed at solving the governing equations analytically and obtaining a least-degree univariate polynomial in the corresponding ideal for any value of n. Although an analytical procedure based on elimination is important from a mathematical point of view, providing an upper bound on the number of solutions in the complex field, it is not practical to compute these solutions as it would be very time-consuming. Thus, for the efficient computation of the solution set, a numerical procedure based on homotopy continuation is implemented. A continuation algorithm is also applied to find a set of robot parameters with the maximum number of real assembly modes for a given DGP. Finally, the end-effector pose depends on the applied load and may change due to external disturbances. An investigation into equilibrium stability is therefore performed.
Resumo:
Theories and numerical modeling are fundamental tools for understanding, optimizing and designing present and future laser-plasma accelerators (LPAs). Laser evolution and plasma wave excitation in a LPA driven by a weakly relativistically intense, short-pulse laser propagating in a preformed parabolic plasma channel, is studied analytically in 3D including the effects of pulse steepening and energy depletion. At higher laser intensities, the process of electron self-injection in the nonlinear bubble wake regime is studied by means of fully self-consistent Particle-in-Cell simulations. Considering a non-evolving laser driver propagating with a prescribed velocity, the geometrical properties of the non-evolving bubble wake are studied. For a range of parameters of interest for laser plasma acceleration, The dependence of the threshold for self-injection in the non-evolving wake on laser intensity and wake velocity is characterized. Due to the nonlinear and complex nature of the Physics involved, computationally challenging numerical simulations are required to model laser-plasma accelerators operating at relativistic laser intensities. The numerical and computational optimizations, that combined in the codes INF&RNO and INF&RNO/quasi-static give the possibility to accurately model multi-GeV laser wakefield acceleration stages with present supercomputing architectures, are discussed. The PIC code jasmine, capable of efficiently running laser-plasma simulations on Graphics Processing Units (GPUs) clusters, is presented. GPUs deliver exceptional performance to PIC codes, but the core algorithms had to be redesigned for satisfying the constraints imposed by the intrinsic parallelism of the architecture. The simulation campaigns, run with the code jasmine for modeling the recent LPA experiments with the INFN-FLAME and CNR-ILIL laser systems, are also presented.
Resumo:
In vielen Bereichen der industriellen Fertigung, wie zum Beispiel in der Automobilindustrie, wer- den digitale Versuchsmodelle (sog. digital mock-ups) eingesetzt, um die Entwicklung komplexer Maschinen m ̈oglichst gut durch Computersysteme unterstu ̈tzen zu k ̈onnen. Hierbei spielen Be- wegungsplanungsalgorithmen eine wichtige Rolle, um zu gew ̈ahrleisten, dass diese digitalen Pro- totypen auch kollisionsfrei zusammengesetzt werden k ̈onnen. In den letzten Jahrzehnten haben sich hier sampling-basierte Verfahren besonders bew ̈ahrt. Diese erzeugen eine große Anzahl von zuf ̈alligen Lagen fu ̈r das ein-/auszubauende Objekt und verwenden einen Kollisionserken- nungsmechanismus, um die einzelnen Lagen auf Gu ̈ltigkeit zu u ̈berpru ̈fen. Daher spielt die Kollisionserkennung eine wesentliche Rolle beim Design effizienter Bewegungsplanungsalgorith- men. Eine Schwierigkeit fu ̈r diese Klasse von Planern stellen sogenannte “narrow passages” dar, schmale Passagen also, die immer dort auftreten, wo die Bewegungsfreiheit der zu planenden Objekte stark eingeschr ̈ankt ist. An solchen Stellen kann es schwierig sein, eine ausreichende Anzahl von kollisionsfreien Samples zu finden. Es ist dann m ̈oglicherweise n ̈otig, ausgeklu ̈geltere Techniken einzusetzen, um eine gute Performance der Algorithmen zu erreichen.rnDie vorliegende Arbeit gliedert sich in zwei Teile: Im ersten Teil untersuchen wir parallele Kollisionserkennungsalgorithmen. Da wir auf eine Anwendung bei sampling-basierten Bewe- gungsplanern abzielen, w ̈ahlen wir hier eine Problemstellung, bei der wir stets die selben zwei Objekte, aber in einer großen Anzahl von unterschiedlichen Lagen auf Kollision testen. Wir im- plementieren und vergleichen verschiedene Verfahren, die auf Hu ̈llk ̈operhierarchien (BVHs) und hierarchische Grids als Beschleunigungsstrukturen zuru ̈ckgreifen. Alle beschriebenen Verfahren wurden auf mehreren CPU-Kernen parallelisiert. Daru ̈ber hinaus vergleichen wir verschiedene CUDA Kernels zur Durchfu ̈hrung BVH-basierter Kollisionstests auf der GPU. Neben einer un- terschiedlichen Verteilung der Arbeit auf die parallelen GPU Threads untersuchen wir hier die Auswirkung verschiedener Speicherzugriffsmuster auf die Performance der resultierenden Algo- rithmen. Weiter stellen wir eine Reihe von approximativen Kollisionstests vor, die auf den beschriebenen Verfahren basieren. Wenn eine geringere Genauigkeit der Tests tolerierbar ist, kann so eine weitere Verbesserung der Performance erzielt werden.rnIm zweiten Teil der Arbeit beschreiben wir einen von uns entworfenen parallelen, sampling- basierten Bewegungsplaner zur Behandlung hochkomplexer Probleme mit mehreren “narrow passages”. Das Verfahren arbeitet in zwei Phasen. Die grundlegende Idee ist hierbei, in der er- sten Planungsphase konzeptionell kleinere Fehler zuzulassen, um die Planungseffizienz zu erh ̈ohen und den resultierenden Pfad dann in einer zweiten Phase zu reparieren. Der hierzu in Phase I eingesetzte Planer basiert auf sogenannten Expansive Space Trees. Zus ̈atzlich haben wir den Planer mit einer Freidru ̈ckoperation ausgestattet, die es erlaubt, kleinere Kollisionen aufzul ̈osen und so die Effizienz in Bereichen mit eingeschr ̈ankter Bewegungsfreiheit zu erh ̈ohen. Optional erlaubt unsere Implementierung den Einsatz von approximativen Kollisionstests. Dies setzt die Genauigkeit der ersten Planungsphase weiter herab, fu ̈hrt aber auch zu einer weiteren Perfor- mancesteigerung. Die aus Phase I resultierenden Bewegungspfade sind dann unter Umst ̈anden nicht komplett kollisionsfrei. Um diese Pfade zu reparieren, haben wir einen neuartigen Pla- nungsalgorithmus entworfen, der lokal beschr ̈ankt auf eine kleine Umgebung um den bestehenden Pfad einen neuen, kollisionsfreien Bewegungspfad plant.rnWir haben den beschriebenen Algorithmus mit einer Klasse von neuen, schwierigen Metall- Puzzlen getestet, die zum Teil mehrere “narrow passages” aufweisen. Unseres Wissens nach ist eine Sammlung vergleichbar komplexer Benchmarks nicht ̈offentlich zug ̈anglich und wir fan- den auch keine Beschreibung von vergleichbar komplexen Benchmarks in der Motion-Planning Literatur.
Resumo:
In the past two decades the work of a growing portion of researchers in robotics focused on a particular group of machines, belonging to the family of parallel manipulators: the cable robots. Although these robots share several theoretical elements with the better known parallel robots, they still present completely (or partly) unsolved issues. In particular, the study of their kinematic, already a difficult subject for conventional parallel manipulators, is further complicated by the non-linear nature of cables, which can exert only efforts of pure traction. The work presented in this thesis therefore focuses on the study of the kinematics of these robots and on the development of numerical techniques able to address some of the problems related to it. Most of the work is focused on the development of an interval-analysis based procedure for the solution of the direct geometric problem of a generic cable manipulator. This technique, as well as allowing for a rapid solution of the problem, also guarantees the results obtained against rounding and elimination errors and can take into account any uncertainties in the model of the problem. The developed code has been tested with the help of a small manipulator whose realization is described in this dissertation together with the auxiliary work done during its design and simulation phases.
Resumo:
The self-regeneration capacity of articular cartilage is limited, due to its avascular and aneural nature. Loaded explants and cell cultures demonstrated that chondrocyte metabolism can be regulated via physiologic loading. However, the explicit ranges of mechanical stimuli that correspond to favourable metabolic response associated with extracellular matrix (ECM) synthesis are elusive. Unsystematic protocols lacking this knowledge produce inconsistent results. This study aims to determine the intrinsic ranges of physical stimuli that increase ECM synthesis and simultaneously inhibit nitric oxide (NO) production in chondrocyte-agarose constructs, by numerically re-evaluating the experiments performed by Tsuang et al. (2008). Twelve loading patterns were simulated with poro-elastic finite element models in ABAQUS. Pressure on solid matrix, von Mises stress, maximum principle stress and pore pressure were selected as intrinsic mechanical stimuli. Their development rates and magnitudes at the steady state of cyclic loading were calculated with MATLAB at the construct level. Concurrent increase in glycosaminoglycan and collagen was observed at 2300 Pa pressure and 40 Pa/s pressure rate. Between 0-1500 Pa and 0-40 Pa/s, NO production was consistently positive with respect to controls, whereas ECM synthesis was negative in the same range. A linear correlation was found between pressure rate and NO production (R = 0.77). Stress states identified in this study are generic and could be used to develop predictive algorithms for matrix production in agarose-chondrocyte constructs of arbitrary shape, size and agarose concentration. They could also be helpful to increase the efficacy of loading protocols for avascular tissue engineering. Copyright (c) 2010 John Wiley \& Sons, Ltd.
Resumo:
The performance of the parallel vector implementation of the one- and two-dimensional orthogonal transforms is evaluated. The orthogonal transforms are computed using actual or modified fast Fourier transform (FFT) kernels. The factors considered in comparing the speed-up of these vectorized digital signal processing algorithms are discussed and it is shown that the traditional way of comparing th execution speed of digital signal processing algorithms by the ratios of the number of multiplications and additions is no longer effective for vector implementation; the structure of the algorithm must also be considered as a factor when comparing the execution speed of vectorized digital signal processing algorithms. Simulation results on the Cray X/MP with the following orthogonal transforms are presented: discrete Fourier transform (DFT), discrete cosine transform (DCT), discrete sine transform (DST), discrete Hartley transform (DHT), discrete Walsh transform (DWHT), and discrete Hadamard transform (DHDT). A comparison between the DHT and the fast Hartley transform is also included.(34 refs)
Resumo:
Linear programs, or LPs, are often used in optimization problems, such as improving manufacturing efficiency of maximizing the yield from limited resources. The most common method for solving LPs is the Simplex Method, which will yield a solution, if one exists, but over the real numbers. From a purely numerical standpoint, it will be an optimal solution, but quite often we desire an optimal integer solution. A linear program in which the variables are also constrained to be integers is called an integer linear program or ILP. It is the focus of this report to present a parallel algorithm for solving ILPs. We discuss a serial algorithm using a breadth-first branch-and-bound search to check the feasible solution space, and then extend it into a parallel algorithm using a client-server model. In the parallel mode, the search may not be truly breadth-first, depending on the solution time for each node in the solution tree. Our search takes advantage of pruning, often resulting in super-linear improvements in solution time. Finally, we present results from sample ILPs, describe a few modifications to enhance the algorithm and improve solution time, and offer suggestions for future work.
Resumo:
The numerical solution of the incompressible Navier-Stokes Equations offers an effective alternative to the experimental analysis of Fluid-Structure interaction i.e. dynamical coupling between a fluid and a solid which otherwise is very complex, time consuming and very expensive. To have a method which can accurately model these types of mechanical systems by numerical solutions becomes a great option, since these advantages are even more obvious when considering huge structures like bridges, high rise buildings, or even wind turbine blades with diameters as large as 200 meters. The modeling of such processes, however, involves complex multiphysics problems along with complex geometries. This thesis focuses on a novel vorticity-velocity formulation called the KLE to solve the incompressible Navier-stokes equations for such FSI problems. This scheme allows for the implementation of robust adaptive ODE time integration schemes and thus allows us to tackle the various multiphysics problems as separate modules. The current algorithm for KLE employs a structured or unstructured mesh for spatial discretization and it allows the use of a self-adaptive or fixed time step ODE solver while dealing with unsteady problems. This research deals with the analysis of the effects of the Courant-Friedrichs-Lewy (CFL) condition for KLE when applied to unsteady Stoke’s problem. The objective is to conduct a numerical analysis for stability and, hence, for convergence. Our results confirmthat the time step ∆t is constrained by the CFL-like condition ∆t ≤ const. hα, where h denotes the variable that represents spatial discretization.
Resumo:
Stepwise uncertainty reduction (SUR) strategies aim at constructing a sequence of points for evaluating a function f in such a way that the residual uncertainty about a quantity of interest progressively decreases to zero. Using such strategies in the framework of Gaussian process modeling has been shown to be efficient for estimating the volume of excursion of f above a fixed threshold. However, SUR strategies remain cumbersome to use in practice because of their high computational complexity, and the fact that they deliver a single point at each iteration. In this article we introduce several multipoint sampling criteria, allowing the selection of batches of points at which f can be evaluated in parallel. Such criteria are of particular interest when f is costly to evaluate and several CPUs are simultaneously available. We also manage to drastically reduce the computational cost of these strategies through the use of closed form formulas. We illustrate their performances in various numerical experiments, including a nuclear safety test case. Basic notions about kriging, auxiliary problems, complexity calculations, R code, and data are available online as supplementary materials.
Resumo:
We investigate parallel algorithms for the solution of the Navier–Stokes equations in space-time. For periodic solutions, the discretized problem can be written as a large non-linear system of equations. This system of equations is solved by a Newton iteration. The Newton correction is computed using a preconditioned GMRES solver. The parallel performance of the algorithm is illustrated.
Resumo:
This paper presents a parallel surrogate-based global optimization method for computationally expensive objective functions that is more effective for larger numbers of processors. To reach this goal, we integrated concepts from multi-objective optimization and tabu search into, single objective, surrogate optimization. Our proposed derivative-free algorithm, called SOP, uses non-dominated sorting of points for which the expensive function has been previously evaluated. The two objectives are the expensive function value of the point and the minimum distance of the point to previously evaluated points. Based on the results of non-dominated sorting, P points from the sorted fronts are selected as centers from which many candidate points are generated by random perturbations. Based on surrogate approximation, the best candidate point is subsequently selected for expensive evaluation for each of the P centers, with simultaneous computation on P processors. Centers that previously did not generate good solutions are tabu with a given tenure. We show almost sure convergence of this algorithm under some conditions. The performance of SOP is compared with two RBF based methods. The test results show that SOP is an efficient method that can reduce time required to find a good near optimal solution. In a number of cases the efficiency of SOP is so good that SOP with 8 processors found an accurate answer in less wall-clock time than the other algorithms did with 32 processors.
Resumo:
Magnetic iron minerals are widespread and indicative sediment constituents in estuarine, coastal and shelf systems. We combine environmental magnetic, sedimentological and numerical methods to identify magnetite-enriched placer-like zones in a complex coastal system and delineate their formation mechanisms. Magnetic susceptibility and remanence measurements on 245 surficial sediment samples collected in and around Tauranga Harbour, the largest barrier-enclosed tidal estuary of New Zealand, reveal several discrete enrichment zones controlled by local hydrodynamic conditions. Active magnetite enrichment takes place in tidal channels, which feed into two coast-parallel nearshore magnetite-enriched belts centered at water depths of 6-10 m and 10-20 m. A close correlation between magnetite content and magnetic grain size was found, where higher susceptibility values are associated within coarser magnetic crystal sizes. Two key mechanisms for magnetite enrichment are identified. First, tide-induced residual currents primarily enable magnetite enrichment within the estuarine channel network. A coast-parallel, fine sand magnetite enrichment belt in water depths of less than 10 m along the barrier island has a strong decrease in magnetite content away from the southern tidal inlet and is apparently related to active coast-parallel transport combined with mobilizing surf zone processes. A second, less pronounced, but more uniform magnetite enrichment belt at 10-20 m water depth is composed of non-mobile, medium-coarse-grained relict sands, which have been reworked during post-glacial sea level transgression. We demonstrate the potential of magnetic methods to reveal and differentiate coastal magnetite enrichment patterns and investigate their formative mechanisms.
Resumo:
This paper describes new improvements for BB-MaxClique (San Segundo et al. in Comput Oper Resour 38(2):571–581, 2011 ), a leading maximum clique algorithm which uses bit strings to efficiently compute basic operations during search by bit masking. Improvements include a recently described recoloring strategy in Tomita et al. (Proceedings of the 4th International Workshop on Algorithms and Computation. Lecture Notes in Computer Science, vol 5942. Springer, Berlin, pp 191–203, 2010 ), which is now integrated in the bit string framework, as well as different optimization strategies for fast bit scanning. Reported results over DIMACS and random graphs show that the new variants improve over previous BB-MaxClique for a vast majority of cases. It is also established that recoloring is mainly useful for graphs with high densities.