931 results for Desktop parallel computing
Abstract:
Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III class of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.
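The theoretical speedup of four follows from a 128-bit SSE register holding four 32-bit floats, so one instruction can update four matrix entries at once. The sketch below is a pure-Python illustration of that access pattern, not the authors' implementation and not real SSE code: it performs Doolittle LU decomposition without pivoting, and the row update walks the row in slices of four. The names `LANES`, `axpy_chunked`, and `lu_decompose` are hypothetical.

```python
LANES = 4  # a 128-bit SSE register holds four 32-bit floats


def axpy_chunked(row, pivot_row, factor, start):
    """Compute row[j] -= factor * pivot_row[j] for j >= start, four entries per step."""
    n = len(row)
    for j in range(start, n, LANES):
        end = min(j + LANES, n)
        # On SSE hardware, one packed multiply and one packed subtract
        # would cover this whole slice; here it is emulated element by element.
        row[j:end] = [r - factor * p for r, p in zip(row[j:end], pivot_row[j:end])]


def lu_decompose(a):
    """Doolittle LU without pivoting (assumes non-zero pivots); returns (L, U)."""
    n = len(a)
    u = [[float(x) for x in r] for r in a]  # work on a copy of the input
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            u[i][k] = 0.0  # eliminated entry
            axpy_chunked(u[i], u[k], factor, k + 1)
    return l, u
```

The last slice of a row may hold fewer than four entries; real SIMD code has to handle that remainder separately or pad the row, which is one reason measured speedups usually fall short of four.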
Abstract:
Streaming SIMD Extensions (SSE) is a special feature available in the Intel Pentium III and Pentium 4 classes of microprocessors. As its name implies, SSE enables the execution of SIMD (Single Instruction Multiple Data) operations on 32-bit floating-point data; the performance of floating-point algorithms can therefore be improved. In electrified railway system simulation, the computation involves solving a huge set of simultaneous linear equations that represent the electrical characteristics of the railway network at a particular time step. A fast solution of these equations is desirable in order to simulate the system in real time. In this paper, we present how SSE is applied to railway network simulation.
Abstract:
Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcriptional factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenging problems in this field is reducing the huge computing time of stochastic simulations. Based on the system of the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work gives a parallel implementation using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the random numbers generated in parallel computing, which is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes.
Abstract:
The emergence of pseudo-marginal algorithms has led to improved computational efficiency for dealing with complex Bayesian models with latent variables. Here an unbiased estimator of the likelihood replaces the true likelihood in order to produce a Bayesian algorithm that remains on the marginal space of the model parameter (with latent variables integrated out), with a target distribution that is still the correct posterior distribution. Very efficient proposal distributions can be developed on the marginal space relative to the joint space of model parameter and latent variables. Thus pseudo-marginal algorithms tend to have substantially better mixing properties. However, for pseudo-marginal approaches to perform well, the likelihood has to be estimated rather precisely. This can be difficult to achieve in complex applications. In this paper we propose to take advantage of multiple central processing units (CPUs), which are readily available on most standard desktop computers. Here the likelihood is estimated independently on the multiple CPUs, with the ultimate estimate of the likelihood being the average of the estimates obtained from the multiple CPUs. The estimate remains unbiased, but the variability is reduced. We compare and contrast two different technologies that allow the implementation of this idea, both of which require a negligible amount of extra programming effort. The superior performance of this idea over the standard approach is demonstrated on simulated data from a stochastic volatility model.
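The averaging step can be illustrated with a toy latent-variable model, which is an assumption made here and not the stochastic volatility model of the paper: x ~ N(theta, 1) and y | x ~ N(x, 1), so the exact marginal likelihood is N(y; theta, 2). Each worker draws its own particles from a separately seeded generator and returns an unbiased Monte Carlo estimate; the mean of K such estimates is still unbiased, with variance reduced by a factor of K. A thread pool stands in for separate CPUs; in standard CPython a process pool would be needed for an actual speedup. All function names are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def estimate_likelihood(y, theta, n_particles, seed):
    """Unbiased Monte Carlo estimate of p(y | theta) for the toy model."""
    rng = random.Random(seed)  # one private generator per worker
    total = 0.0
    for _ in range(n_particles):
        x = rng.gauss(theta, 1.0)        # draw the latent variable
        total += normal_pdf(y, x, 1.0)   # weight by p(y | x)
    return total / n_particles


def averaged_estimate(y, theta, n_particles, seeds):
    """Average independent estimates, one per seed (one per worker)."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        parts = list(pool.map(
            lambda s: estimate_likelihood(y, theta, n_particles, s), seeds))
    return sum(parts) / len(parts)
```

Distinct integer seeds are used here only for simplicity; a production sampler would want a generator designed to give provably non-overlapping streams per worker.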
Abstract:
These lecture notes describe the use and implementation of a framework in which mathematical as well as engineering optimisation problems can be analysed. The foundations of the framework and algorithms described, Hierarchical Asynchronous Parallel Evolutionary Algorithms (HAPEAs), lie upon traditional evolution strategies and incorporate the concepts of multi-objective optimisation, hierarchical topology, asynchronous evaluation of candidate solutions, parallel computing and game strategies. In a step-by-step approach, the numerical implementation of EAs and HAPEAs for solving multi-criteria optimisation problems is presented, providing the reader with the knowledge to reproduce this hands-on training in his or her academic or industrial environment.
Abstract:
Background: Recent advances in immunology have highlighted the importance of local properties in the overall progression of HIV infection. In particular, the gastrointestinal (GI) tract is seen as a key area during early infection, and the massive cell depletion associated with it may influence subsequent disease progression. This motivated the development of a large-scale agent-based model. Results: Lymph nodes are explicitly implemented, and considerations on parallel computing permit large simulations and the inclusion of local features. The results obtained show that including the GI tract in the model leads to accelerated disease progression, during both the early stages and the long-term evolution, compared to a theoretical, uniform model. Conclusions: These results confirm the potential of treatment policies currently under investigation, which focus on this region. They also highlight the potential of this modelling framework, incorporating both agent-based and network-based components, in the context of complex systems where scaling up alone does not result in models providing additional insights.
Abstract:
Biomedical systems involve a large number of entities and intricate interactions between them. Their direct analysis is therefore difficult, and it is often necessary to rely on computational models. These models require significant resources and parallel computing solutions. Such approaches are particularly well suited, given the parallel aspects inherent in biomedical systems. Model hybridisation also permits the integration and simultaneous study of multiple aspects and scales of these systems, thus providing an efficient platform for multidisciplinary research.
Abstract:
Understanding the dynamics of disease spread is essential in contexts such as estimating load on medical services, as well as risk assessment and intervention policies against large-scale epidemic outbreaks. However, most of the information is available only after the outbreak itself, and preemptive assessment is far from trivial. Here, we report on an agent-based model developed to investigate such epidemic events in a stylised urban environment. For most diseases, infection of a new individual may occur from casual contact in crowds as well as from repeated interactions with social partners such as work colleagues or family members. Our model therefore accounts for these two phenomena. Given the scale of the system, efficient parallel computing is required. In this presentation, we focus on aspects related to parallelisation for large network generation and massively multi-agent simulations.
Abstract:
A new parallel algorithm for transforming an arithmetic infix expression into a parse tree is presented. The technique is based on a result due to Fischer (1980) which enables the construction of the parse tree by appropriately scanning the vector of precedence values associated with the elements of the expression. The algorithm presented here is suitable for execution on a shared memory model of an SIMD machine with no read/write conflicts permitted. It uses O(n) processors and has a time complexity of O(log2n) where n is the expression length. Parallel algorithms for generating code for an SIMD machine are also presented.
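The idea of building a parse tree from a vector of precedence values can be shown with a small sequential sketch. It is not the parallel algorithm of the paper, and the exact precedence rule is an assumption: each operator gets its base precedence plus a large bonus per level of parenthesis nesting, and the root of any sub-expression is its rightmost operator of minimum precedence, which gives left associativity. Operands are assumed to be single characters and the input is assumed to be well formed.

```python
BASE = {"+": 1, "-": 1, "*": 2, "/": 2}
DEPTH_STEP = 10  # larger than any base precedence, so nesting always dominates


def precedence_vector(tokens):
    """Drop parentheses; return (symbols, prec), with prec None for operands."""
    symbols, prec, depth = [], [], 0
    for t in tokens:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
        elif t in BASE:
            symbols.append(t)
            prec.append(BASE[t] + DEPTH_STEP * depth)
        else:
            symbols.append(t)
            prec.append(None)
    return symbols, prec


def build_tree(symbols, prec, lo=0, hi=None):
    """Return an operand string or a tuple (operator, left_subtree, right_subtree)."""
    if hi is None:
        hi = len(symbols)
    if hi - lo == 1:
        return symbols[lo]
    root = None
    for i in range(lo, hi):
        # "<=" keeps the rightmost minimum, so equal operators associate left.
        if prec[i] is not None and (root is None or prec[i] <= prec[root]):
            root = i
    return (symbols[root],
            build_tree(symbols, prec, lo, root),
            build_tree(symbols, prec, root + 1, hi))


def parse(expr):
    """Parse an infix string with single-character operands into a nested tuple."""
    tokens = [c for c in expr if not c.isspace()]
    return build_tree(*precedence_vector(tokens))
```

For example, `parse("(a+b)*c")` returns `("*", ("+", "a", "b"), "c")`. The minimum search over a range is the step a parallel version would distribute across processors.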
Abstract:
Knowledge of protein-ligand interactions is essential to understand several biological processes and important for applications ranging from understanding protein function to drug discovery and protein engineering. Here, we describe an algorithm for the comparison of three-dimensional ligand-binding sites in protein structures. A previously described algorithm, PocketMatch (version 1.0), is optimised, expanded, and MPI-enabled for parallel execution. PocketMatch (version 2.0) rapidly quantifies binding-site similarity based on structural descriptors such as residue nature and interatomic distances. Atomic-scale alignments may also be obtained from the amino acid residue pairings generated. It allows an end-user to compute database-wide, all-to-all comparisons in a matter of hours. The use of our algorithm on a sample dataset, a performance analysis, and annotated source code are also included.
Abstract:
The aim of this technical report is to present detailed explanations that help readers understand and use Message Passing Interface (MPI) parallel programming for solving several mixed integer optimization problems. We have developed a C++ experimental code that uses the IBM ILOG CPLEX optimizer within the COmputational INfrastructure for Operations Research (COIN-OR) and MPI parallel computing for solving the optimization models under UNIX-like systems. The computational experience illustrates how we can solve 44 optimization problems which are asymmetric with respect to the number of integer and continuous variables and the number of constraints. We also report a comparison of the speedup and efficiency of several strategies implemented for some available number of threads.
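Speedup and efficiency are assumed here to follow the usual definitions, speedup = T_serial / T_parallel and efficiency = speedup / number of workers; the report itself may define them differently. A minimal sketch with hypothetical function names:

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial wall-clock time to parallel wall-clock time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Speedup per worker; 1.0 means perfect linear scaling."""
    return speedup(t_serial, t_parallel) / workers
```

With made-up timings of 120 s serial and 40 s on four workers, the speedup is 3.0 and the efficiency is 0.75.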
Abstract:
Hartree-Fock (HF) calculations have had remarkable success in describing large nuclei at high spin, temperature and deformation. To allow the full range of possible deformations, the Skyrme HF equations can be discretized on a three-dimensional mesh. However, such calculations are currently limited by the computational resources provided by traditional supercomputers. To take advantage of recent developments in massively parallel computing technology, we have implemented the LLNL Skyrme-force static and rotational HF codes on Intel's DELTA and GAMMA systems at Caltech.
We decomposed the HF code by assigning a portion of the mesh to each node, with nearest neighbor meshes assigned to nodes connected by communication channels. This kind of decomposition is well-suited for the DELTA and the GAMMA architecture because the only non-local operations are wave function orthogonalization and the boundary conditions of the Poisson equation for the Coulomb field.
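The decomposition described above can be sketched as a block partition of a three-dimensional mesh over a logical grid of nodes. This is an illustration under assumed conventions, not the LLNL code: the node grid shape, the non-periodic boundaries, and the names `split_axis` and `decompose` are all hypothetical.

```python
from itertools import product


def split_axis(n_points, n_parts):
    """Return n_parts contiguous (start, stop) ranges covering range(n_points)."""
    base, extra = divmod(n_points, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        stop = start + base + (1 if p < extra else 0)  # spread the remainder
        ranges.append((start, stop))
        start = stop
    return ranges


def decompose(mesh_shape, node_grid):
    """Map each node coordinate to its mesh block and its face neighbours."""
    axis_ranges = [split_axis(n, p) for n, p in zip(mesh_shape, node_grid)]
    layout = {}
    for coord in product(*(range(p) for p in node_grid)):
        block = tuple(axis_ranges[d][c] for d, c in enumerate(coord))
        neighbours = []
        for d in range(len(coord)):
            for step in (-1, 1):
                c = coord[d] + step
                if 0 <= c < node_grid[d]:  # no wrap-around at the mesh edge
                    neighbours.append(coord[:d] + (c,) + coord[d + 1:])
        layout[coord] = {"block": block, "neighbours": neighbours}
    return layout
```

Each node only needs boundary data from the nodes in its `neighbours` list for local stencil operations; global steps such as orthogonalization would still require communication across all nodes.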
Our first application of the HF code on parallel computers has been the study of identical superdeformed (SD) rotational bands in the Hg region. In the last ten years, many SD rotational bands have been found experimentally. One very surprising feature found in these SD rotational bands is that many pairs of bands in nuclei that differ by one or two mass units have nearly identical deexcitation gamma-ray energies. Our calculations of the five rotational bands in ^(192)Hg and ^(194)Pb show that the filling of specific orbitals can lead to bands with deexcitation gamma-ray energies differing by at most 2 keV in nuclei differing by two mass units and over a range of angular momenta comparable to that observed experimentally. Our calculations of SD rotational bands in the Dy region also show that twinning can be achieved by filling or emptying some specific orbitals.
The interpretation of future precise experiments on atomic parity nonconservation (PNC) in terms of parameters of the Standard Model could be hampered by uncertainties in the atomic and nuclear structure. As a further application of the massively parallel HF calculations, we calculated the proton and neutron densities of the Cesium isotopes from A = 125 to A = 139. Based on our good agreement with experimental charge radii, binding energies, and ground state spins, we conclude that the uncertainties in the ratios of weak charges are less than 10^(-3), comfortably smaller than the anticipated experimental error.
Abstract:
We describe the key role played by partial evaluation in the Supercomputer Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputer Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at M.I.T., and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable.