193 results for parallelization
Abstract:
Continuous improvement of industrial processes is one way to make companies more competitive in the market. A widespread approach to this end is the lean production methodology, which focuses on eliminating waste. One of its tools is the quick die change method, also known as SMED (single-minute exchange of die), which is applied in this study. The study aims to develop proposals for reducing the setup time of two machines in two machining lines, while also addressing the ergonomics and safety conditions of these operations. Reducing setup time is justified, among other reasons, by the resulting increase in machine productivity. In the application to the connecting-rod machining line, two types of exchange were considered. The proposed setup-time reductions reached 47% for one of them and 55% for the other. It is important to underline that no large investments were needed to reach this result. In the application to the block machining line, an ergonomic improvement was developed: a pulley block was installed which, while improving safety, increased the tool-change time. To improve the safety of the exchange without losing productivity, the method was then applied to reduce this time. Two proposals were developed: the first would reduce that time by 19% and does not require many company resources; the second involves parallelizing the exchange, yielding a 48% reduction. However, this proposal requires one additional worker at the time of the exchange, which is not always possible.
Abstract:
Sao Paulo State Research Foundation-FAPESP
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Identifying opportunities for software parallelism is a task that takes a lot of human time, but once code patterns amenable to parallelism have been identified, software can accomplish this task quickly. Automating the process thus brings many benefits, such as saving time and reducing programmer-induced errors [1]. This work aims at developing a software environment that identifies opportunities for parallelism in source code written in the C language and generates a program with the same behavior but a higher degree of parallelism, compatible with graphics processors based on the CUDA architecture.
Abstract:
In this article, we introduce two new variants of the Assembly Line Worker Assignment and Balancing Problem (ALWABP) that allow parallelization of, and collaboration between, heterogeneous workers. These new approaches introduce an additional level of complexity into the line design and assignment process, but also higher flexibility, which may be particularly useful in practical situations where the aim is to progressively integrate slow or limited workers into conventional assembly lines. We present linear models and heuristic procedures for these two new problems. Computational results show the efficiency of the proposed approaches and the efficacy of the studied layouts in different situations. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
We report cross sections for elastic electron scattering by gas-phase glycine (neutral form), obtained with the Schwinger multichannel method. The present results are the first obtained with a new implementation that combines parallelization with OpenMP directives and pseudopotentials. The position of the well-known pi* shape resonance ranged from 2.3 eV to 2.8 eV depending on the polarization model and conformer. For the most stable isomer, the present result (2.4 eV) is in fair agreement with electron transmission spectroscopy assignments (1.93 +/- 0.05 eV) and available calculations. Our results also point to a shape resonance around 9.5 eV in the A' symmetry that would be weakly coupled to vibrations of the hydroxyl group. Since electron attachment to a broad and lower-lying sigma* orbital located on the OH bond has been suggested as the underlying mechanism leading to dissociative electron attachment at low energies, we searched for a shape resonance around 4 eV. Though we obtained cross sections with the target molecule at the equilibrium geometry and with stretched OH bond lengths, least-squares fits to the calculated eigenphase sums did not reveal signatures of this anion state (though, in principle, it could be hidden in the large background). The low-energy (around 1 eV) integral cross section increases strongly as the bond length is stretched, and this could indicate a virtual-state pole, since dipole-supported bound states are not expected at the geometries addressed here. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3687345]
Abstract:
Modern GPUs are well suited for intensive computational tasks and massively parallel computation. Sparse matrix multiplication and the linear triangular solver are among the most important and heavily used kernels in scientific computation, and several challenges in developing a high-performance kernel with these two modules are investigated. The main interest is to solve linear systems derived from elliptic equations discretized with triangular elements. The resulting linear system has a symmetric positive definite matrix. The sparse matrix is stored in the compressed sparse row (CSR) format. A CUDA algorithm is proposed to execute the matrix-vector multiplication directly on the CSR format. A dependence-tree algorithm is used to determine which variables the linear triangular solver can compute in parallel. To increase the number of parallel threads, a graph-coloring algorithm is implemented to reorder the mesh numbering in a pre-processing phase. The proposed method is compared with available parallel and serial libraries. The results show that the proposed method improves the computational cost of the matrix-vector multiplication. The pre-processing associated with the triangular solver needs to be executed just once in the proposed method. The conjugate gradient method was implemented and showed a similar convergence rate for all the compared methods. The proposed method showed significantly smaller execution times.
Abstract:
Water distribution network optimization is a challenging problem due to the dimension and complexity of these systems. Since the second half of the twentieth century this field has been investigated by many authors. Recently, to overcome the discrete nature of the variables and the non-linearity of the equations, research has focused on the development of heuristic algorithms. These algorithms do not require continuity or linearity of the problem functions because they are linked to an external hydraulic simulator that solves the equations of mass continuity and energy conservation of the network. In this work, NSGA-II (Non-dominated Sorting Genetic Algorithm II) has been used. This is a heuristic multi-objective genetic algorithm based on the analogy of evolution in nature. Starting from an initial random set of solutions, called the population, it evolves them towards a front of solutions that minimize, separately and simultaneously, all the objectives. This can be very useful in practical problems where multiple and conflicting goals are common. Usually, one of the main drawbacks of these algorithms is their computational cost: being a stochastic search, many solutions must be analyzed before good ones are found. The results of this thesis on the classical optimal design problem show that it is possible to improve results by modifying the mathematical definition of the objective functions and the survival criterion, inserting good solutions created by a cellular automaton, and using rules created by a classifier algorithm (C4.5). This part has been tested using the version of NSGA-II supplied by the Centre for Water Systems (University of Exeter, UK) in the MATLAB® environment. Even if orienting the search can constrain the algorithm, with the risk of not finding the optimal set of solutions, it can greatly improve the results.
Subsequently, thanks to CINECA's help, a version of NSGA-II has been implemented in the C language and parallelized: results on the global parallelization show the speed-up, while results on the island parallelization show that communication among islands can improve the optimization. Finally, some tests on the optimization of pump scheduling have been carried out. In this case, good results are found for a small network, while the solutions of a big problem are affected by the lack of constraints on the number of pump switches. Possible future research concerns the insertion of further constraints and guiding the evolution. In the end, the optimization of water distribution systems is still far from a definitive solution, but improvements in this field can be very useful in reducing the cost of solutions to practical problems, where the high number of variables makes their management very difficult from a human point of view.
Abstract:
This thesis presents new methods to simulate systems with hydrodynamic and electrostatic interactions. Part 1 is devoted to computer simulations of Brownian particles with hydrodynamic interactions. The main influence of the solvent on the dynamics of Brownian particles is that it mediates hydrodynamic interactions. In the method, this is simulated by numerical solution of the Navier-Stokes equation on a lattice. To this end, the Lattice-Boltzmann method is used, namely its D3Q19 version. This model is capable of simulating compressible flow, which gives us the advantage of treating dense systems, in particular away from thermal equilibrium. The Lattice-Boltzmann equation is coupled to the particles via a friction force. In addition to this force, acting on point particles, we construct another coupling force, which comes from the pressure tensor. The coupling is purely local, i.e. the algorithm scales linearly with the total number of particles. In order to be able to map the physical properties of the Lattice-Boltzmann fluid onto a Molecular Dynamics (MD) fluid, the case of an almost incompressible flow is considered. The Fluctuation-Dissipation theorem for the hybrid coupling is analyzed, and a geometric interpretation of the friction coefficient in terms of a Stokes radius is given. Part 2 is devoted to the simulation of charged particles. We present a novel method for obtaining Coulomb interactions as the potential of mean force between charges which are dynamically coupled to a local electromagnetic field. This algorithm scales linearly, too. We focus on the Molecular Dynamics version of the method and show that it is intimately related to the Car-Parrinello approach, while being equivalent to solving Maxwell's equations with a freely adjustable speed of light. The Lagrangian formulation of the coupled particles-fields system is derived. The quasi-Hamiltonian dynamics of the system is studied in great detail.
For implementation on the computer, the equations of motion are discretized with respect to both space and time. The discretization of the electromagnetic fields on a lattice, as well as the interpolation of the particle charges onto the lattice, is given. The algorithm is as local as possible: only nearest-neighbor sites of the lattice interact with a charged particle. Unphysical self-energies arise as a result of the lattice interpolation of charges, and are corrected by a subtraction scheme based on the exact lattice Green's function. The method allows easy parallelization using standard domain decomposition. Some benchmarking results of the algorithm are presented and discussed.
Abstract:
Being basic ingredients of numerous daily-life products of significant industrial importance, as well as basic building blocks of biomaterials, charged hydrogels continue to pose a series of unanswered challenges for scientists even after decades of practical applications and intensive research efforts. Despite a rather simple internal structure, it is mainly the unique combination of short- and long-range forces which renders scientific investigation of their characteristic properties quite difficult. Hence, early on, computer simulations were used to link analytical theory and empirical experiments, bridging the gap between the simplifying assumptions of the models and the complexity of real-world measurements. Due to the immense numerical effort, even for high-performance supercomputers, system sizes and time scales were rather restricted until recently; only now has it become possible to also simulate a network of charged macromolecules. This is the topic of the present thesis, which investigates one of the fundamental and at the same time highly fascinating phenomena of polymer research: the swelling behaviour of polyelectrolyte networks. For this, an extensible simulation package for research on soft-matter systems, ESPResSo for short, was created, which puts a particular emphasis on mesoscopic bead-spring models of complex systems. Highly efficient algorithms and a consistent parallelization reduced the computation time necessary for solving the equations of motion, even in the case of long-range electrostatics and large numbers of particles, allowing even expensive calculations and applications to be tackled. Nevertheless, the program has a modular and simple structure, enabling a continuous process of adding new potentials, interactions, degrees of freedom, ensembles, and integrators, while staying easily accessible for newcomers thanks to a Tcl-script steering level controlling the C-implemented simulation core.
Numerous analysis routines provide means to investigate system properties and observables on the fly. Even though analytical theories agreed on the modeling of networks in past years, our numerical MD simulations show that, even for simple model systems, fundamental theoretical assumptions no longer apply except in a small parameter regime, prohibiting correct predictions of observables. Applying a "microscopic" analysis of the isolated contributions of individual system components, one of the particular strengths of computer simulations, it was then possible to describe the behaviour of charged polymer networks at swelling equilibrium in good solvent and close to the Theta-point by introducing appropriate model modifications. This became possible by enhancing known simple scaling arguments with components deemed crucial in our detailed study, through which a generalized model could be constructed. Herewith, agreement of the final system volume of swollen polyelectrolyte gels with the results of computer simulations could be shown successfully over the entire investigated range of parameters, for different network sizes, charge fractions, and interaction strengths. In addition, the "cell under tension" was presented as a self-regulating approach for predicting the amount of swelling based only on the system parameters used. Without the need for measured observables as input, minimizing the free energy alone already allows the equilibrium behaviour to be determined. In poor solvent the shape of the network chains changes considerably, as their hydrophobicity now counteracts the repulsion of like-charged monomers and drives the polyelectrolytes to collapse. Depending on the chosen parameters, a fragile balance emerges, giving rise to fascinating geometrical structures such as the so-called pearl necklaces.
This behaviour, known from single-chain polyelectrolytes under similar environmental conditions and also theoretically predicted, could be detected for the first time for networks as well. An analysis of the total structure factors confirmed the first evidence for the existence of such structures found in experimental results.
Abstract:
Flood disasters are a major cause of fatalities and economic losses, and several studies indicate that global flood risk is currently increasing. In order to reduce and mitigate the impact of river flood disasters, the current trend is to integrate existing structural defences with non-structural measures. This calls for a wider application of advanced hydraulic models for flood hazard and risk mapping, engineering design, and flood forecasting systems. Within this framework, two different hydraulic models for large-scale analysis of flood events have been developed. The two models, named CA2D and IFD-GGA, adopt an integrated approach based on the diffusive shallow-water equations and a simplified finite-volume scheme. The models are also designed for massive code parallelization, which is of key importance in reducing run times in large-scale and high-detail applications. The two models were first applied to several numerical cases to test the reliability and accuracy of the different model versions. Then the most effective versions were applied to different real flood events and flood scenarios. The IFD-GGA model showed serious problems that prevented further applications. On the contrary, the CA2D model proved to be fast and robust, and able to reproduce 1D and 2D flow processes in terms of water depth and velocity. In most applications the accuracy of the model results was good and adequate for large-scale analysis. Where complex flow processes occurred, local errors were observed, due to the model approximations. However, they did not compromise the correct representation of the overall flow processes. In conclusion, the CA2D model can be a valuable tool for the simulation of a wide range of flood event types, including lowland and flash flood events.
Abstract:
The improvements in devices brought about by nanotechnology have put forward new classes of sensors, called bio-nanosensors, which are very promising for the detection of biochemical molecules in a large variety of applications. Their use in lab-on-a-chip devices could give rise to new opportunities in many fields, from health care and bio-warfare to environmental monitoring and high-throughput screening for the pharmaceutical industry. Bio-nanosensors have great advantages in terms of cost, performance, and parallelization. Indeed, they require very low quantities of reagents and improve the overall signal-to-noise ratio, due to the increased binding-signal variation per unit area and the reduction of stray capacitances. Additionally, they give rise to new challenges, such as the need to design high-performance, low-noise integrated electronic interfaces. This thesis is related to the design of high-performance advanced CMOS interfaces for electrochemical bio-nanosensors. The main focus of the thesis is: 1) a critical analysis of noise in sensing interfaces, 2) devising new techniques for noise reduction in discrete-time approaches, 3) developing new architectures for low-noise, low-power sensing interfaces. The manuscript reports a multi-project activity focusing on low-noise design and presents two developed integrated circuits (ICs) as examples of advanced CMOS interfaces for bio-nanosensors. The first project concerns a low-noise current-sensing interface for DC and transient measurements of electrophysiological signals. The focus of this research activity is on the noise optimization of the electronic interface. A new noise-reduction technique has been developed so as to realize an integrated CMOS interface with performance comparable with state-of-the-art instrumentation. The second project aims to realize a stand-alone, high-accuracy electrochemical impedance spectroscopy interface.
The system is tailored for conductivity-temperature-depth sensors in environmental applications, as well as for bio-nanosensors. It is based on a band-pass delta-sigma technique and combines low-noise performance with low-power requirements.
Abstract:
Coupled-cluster theory provides one of the most successful concepts in electronic-structure theory. This work covers the parallelization of coupled-cluster energies, gradients, and second derivatives and its application to selected large-scale chemical problems, besides more practical aspects such as the publication and support of the quantum-chemistry package ACES II MAB and the design and development of a computational environment optimized for coupled-cluster calculations. The main objective of this thesis was to extend the range of applicability of coupled-cluster models to larger molecular systems and their properties, and therefore to bring large-scale coupled-cluster calculations into the day-to-day routine of computational chemistry. A straightforward strategy for the parallelization of CCSD and CCSD(T) energies, gradients, and second derivatives has been outlined and implemented for closed-shell and open-shell references. Starting from the highly efficient serial implementation of the ACES II MAB computer code, an adaptation for affordable workstation clusters has been obtained by parallelizing the most time-consuming steps of the algorithms. Benchmark calculations for systems with up to 1300 basis functions and the presented applications show that the resulting algorithm for energies, gradients, and second derivatives at the CCSD and CCSD(T) levels of theory exhibits good scaling with the number of processors and substantially extends the range of applicability. Within the framework of the 'High accuracy Extrapolated Ab initio Thermochemistry' (HEAT) protocols, the effects of increased basis-set size and higher excitations in the coupled-cluster expansion were investigated. The HEAT scheme was generalized to molecules containing second-row atoms in the case of vinyl chloride. This allowed the different experimentally reported values to be discriminated.
In the case of the benzene molecule it was shown that even for molecules of this size chemical accuracy can be achieved. Near-quantitative agreement with experiment (about 2 ppm deviation) for the prediction of fluorine-19 nuclear magnetic shielding constants can be achieved by employing the CCSD(T) model together with large basis sets at accurate equilibrium geometries if vibrational averaging and temperature corrections via second-order vibrational perturbation theory are considered. Applying a very similar level of theory for the calculation of the carbon-13 NMR chemical shifts of benzene resulted in quantitative agreement with experimental gas-phase data. The NMR chemical shift study for the bridgehead 1-adamantyl cation at the CCSD(T) level resolved earlier discrepancies of lower-level theoretical treatment. The equilibrium structure of diacetylene has been determined based on the combination of experimental rotational constants of thirteen isotopic species and zero-point vibrational corrections calculated at various quantum-chemical levels. These empirical equilibrium structures agree to within 0.1 pm irrespective of the theoretical level employed. High-level quantum-chemical calculations on the hyperfine structure parameters of the cyanopolyynes were found to be in excellent agreement with experiment. Finally, the theoretically most accurate determination of the molecular equilibrium structure of ferrocene to date is presented.
Abstract:
Coupled-cluster theory in its single-reference formulation represents one of the most successful approaches in quantum chemistry for the description of atoms and molecules. To extend the applicability of single-reference coupled-cluster theory to systems with degenerate or near-degenerate electronic configurations, multireference coupled-cluster methods have been suggested. One of the most promising formulations of multireference coupled-cluster theory is the state-specific variant suggested by Mukherjee and co-workers (Mk-MRCC). Unlike other multireference coupled-cluster approaches, Mk-MRCC is a size-extensive theory, and results obtained so far indicate that it has the potential to develop into a standard tool for high-accuracy quantum-chemical treatments. This work deals with developments to overcome the limitations in the applicability of the Mk-MRCC method. To this end, an efficient Mk-MRCC algorithm has been implemented in the CFOUR program package to perform energy calculations within the singles and doubles (Mk-MRCCSD) and singles, doubles, and triples (Mk-MRCCSDT) approximations. This implementation exploits the special structure of the Mk-MRCC working equations, which allows existing efficient single-reference coupled-cluster codes to be adapted. The algorithm has the correct computational scaling of d*N^6 for Mk-MRCCSD and d*N^8 for Mk-MRCCSDT, where N denotes the system size and d the number of reference determinants. For the determination of molecular properties such as the equilibrium geometry, the theory of analytic first derivatives of the energy for the Mk-MRCC method has been developed using a Lagrange formalism. The Mk-MRCC gradients within the CCSD and CCSDT approximations have been implemented, and their applicability has been demonstrated for various compounds such as 2,6-pyridyne, the 2,6-pyridyne cation, m-benzyne, ozone, and cyclobutadiene.
The development of analytic gradients for Mk-MRCC offers the possibility of routinely locating minima and transition states on the potential energy surface. It can be considered a key step towards the routine investigation of multireference systems and the calculation of their properties. As the full inclusion of triple excitations in Mk-MRCC energy calculations is computationally demanding, a parallel implementation is presented in order to circumvent limitations due to the required execution time. The proposed scheme is based on the adaptation of a highly efficient serial Mk-MRCCSDT code by parallelizing the time-determining steps. A first application to 2,6-pyridyne is presented to demonstrate the efficiency of the current implementation.
Abstract:
Time series are ubiquitous. The acquisition and processing of continuously measured data occurs in all areas of the natural sciences, medicine, and finance. The enormous growth of recorded data volumes, whether from automated monitoring systems or integrated sensors, demands exceptionally fast algorithms in both theory and practice. This thesis therefore deals with the efficient computation of subsequence alignments. Complex algorithms such as anomaly detection, motif queries, or the unsupervised extraction of prototypical building blocks of time series make extensive use of these alignments, which motivates the need for fast implementations. The thesis is organized into three approaches that address this challenge: four alignment algorithms and their parallelization on CUDA-capable hardware, an algorithm for the segmentation of data streams, and a unified treatment of Lie-group-valued time series.

The first contribution is a complete CUDA port of the UCR suite, the world-leading implementation of subsequence alignment. It includes a new computation scheme for determining local alignment scores under the z-normalized Euclidean distance, usable on any parallel hardware with support for fast Fourier transforms. Furthermore, we give a SIMT-compatible implementation of the UCR suite's lower-bound cascade for the efficient computation of local alignment scores under dynamic time warping (DTW). Both CUDA implementations enable computation one to two orders of magnitude faster than established methods.

Second, we investigate two linear-time approximations for the elastic alignment of subsequences. On the one hand, we treat a SIMT-compatible relaxation scheme for greedy DTW and its efficient CUDA parallelization. On the other hand, we introduce a new local distance measure, the Gliding Elastic Match (GEM), which can be computed with the same asymptotic time complexity as greedy DTW but offers a full relaxation of the penalty matrix. Further improvements include invariance against trends on the measurement axis and against uniform scaling on the time axis. In addition, an extension of GEM to multi-shape segmentation is discussed and evaluated on motion data. Both CUDA parallelizations achieve runtime improvements of up to two orders of magnitude.

In the literature, the treatment of time series is usually restricted to real-valued measurements. The third contribution is a unified method for handling Lie-group-valued time series. Building on this, distance measures on the rotation group SO(3) and on the Euclidean group SE(3) are treated, and memory-efficient representations and group-compatible extensions of elastic measures are discussed.