46 results for PDE-based parallel preconditioner
Abstract:
A parallel processor architecture based on a communicating sequential processor chip, the transputer, is described. The architecture is easily linearly extensible to enable separate functions to be included in the controller. To demonstrate the power of the resulting controller some experimental results are presented comparing PID and full inverse dynamics on the first three joints of a Puma 560 robot. Also examined are some of the sample rate issues raised by the asynchronous updating of inertial parameters, and the need for full inverse dynamics at every sample interval is questioned.
Abstract:
Both the (5,3) counter and (2,2,3) counter multiplication techniques are investigated for their speed of operation and for the viability of the architectures when implemented in a fast bipolar ECL technology. The implementations of the counters in series-gated ECL and threshold logic are contrasted for speed, noise immunity and complexity, and are critically compared with the fastest practical design of a full-adder. A novel circuit technique is presented that uses negative weighted inputs to overcome the need for high fan-in input weights in threshold circuits. The authors conclude that a (2,2,3) counter based array multiplier implemented in series-gated ECL should enable a significant increase in speed over conventional full adder based array multipliers.
Abstract:
The authors compare various array multiplier architectures based on (p,q) counter circuits. The tradeoff in multiplier design is always between adding complexity and increasing speed. It is shown that by using a (2,2,3) counter cell it is possible to gain a significant increase in speed over a conventional full-adder, carry-save array based approach. The increase in complexity should be easily accommodated using modern emitter-coupled-logic processes.
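As a rough behavioural illustration of the counter cells named above, the sketch below models a (5,3) counter and a (2,2,3) counter in Python. It assumes the usual generalized-counter reading, in which a (5,3) counter sums five equal-weight bits into a 3-bit result and a (2,2,3) counter sums two bits of weight 2 and two bits of weight 1 into a 3-bit result. The function names are hypothetical, and nothing here reflects the ECL circuit design discussed in the abstracts.

```python
from typing import Sequence, Tuple


def _to_3_bits(total: int) -> Tuple[int, int, int]:
    """Encode a value in 0..7 as (bit2, bit1, bit0), most significant first."""
    return ((total >> 2) & 1, (total >> 1) & 1, total & 1)


def _check_bits(bits: Sequence[int], expected_len: int) -> None:
    """Reject inputs that are not exactly expected_len binary digits."""
    if len(bits) != expected_len or any(b not in (0, 1) for b in bits):
        raise ValueError(f"expected {expected_len} bits, got {bits!r}")


def counter_5_3(bits: Sequence[int]) -> Tuple[int, int, int]:
    """(5,3) counter: count the ones among five equal-weight input bits."""
    _check_bits(bits, 5)
    return _to_3_bits(sum(bits))


def counter_2_2_3(weight2_bits: Sequence[int],
                  weight1_bits: Sequence[int]) -> Tuple[int, int, int]:
    """(2,2,3) counter under the assumed reading: two weight-2 bits plus
    two weight-1 bits. The largest possible sum is 6, so 3 output bits suffice."""
    _check_bits(weight2_bits, 2)
    _check_bits(weight1_bits, 2)
    return _to_3_bits(2 * sum(weight2_bits) + sum(weight1_bits))
```

For comparison, a full adder is the (3,2) case: three equal-weight bits in, two bits out. The larger cells reduce more partial-product bits per stage, which is the complexity-for-speed trade-off the abstract describes.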
Abstract:
The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.
Abstract:
Java is becoming an increasingly popular language for developing distributed and parallel scientific and engineering applications. Jini is a Java-based infrastructure developed by Sun that can allegedly provide all the services necessary to support distributed applications. It is the aim of this paper to explore and investigate the services and properties that Jini actually provides and match these against the needs of high performance distributed and parallel applications written in Java. The motivation for this work is the need to develop a distributed infrastructure to support an MPI-like interface to Java known as MPJ. In the first part of the paper we discuss the needs of MPJ, the parallel environment that we wish to support. In particular we look at aspects such as reliability and ease of use. We then move on to sketch out the Jini architecture and review the components and services that Jini provides. In the third part of the paper we critically explore a Jini infrastructure that could be used to support MPJ. Here we are particularly concerned with Jini's ability to support reliably a cocoon of MPJ processes executing in a heterogeneous environment. In the final part of the paper we summarise our findings and report on future work being undertaken on Jini and MPJ.
Abstract:
There exists a well-developed body of theory based on quasi-geostrophic (QG) dynamics that is central to our present understanding of large-scale atmospheric and oceanic dynamics. An important question is the extent to which this body of theory may generalize to more accurate dynamical models. As a first step in this process, we here generalize a set of theoretical results, concerning the evolution of disturbances to prescribed basic states, to semi-geostrophic (SG) dynamics. SG dynamics, like QG dynamics, is a Hamiltonian balanced model whose evolution is described by the material conservation of potential vorticity, together with an invertibility principle relating the potential vorticity to the advecting fields. SG dynamics has features that make it a good prototype for balanced models that are more accurate than QG dynamics. In the first part of this two-part study, we derive a pseudomomentum invariant for the SG equations, and use it to obtain: (i) linear and nonlinear generalized Charney–Stern theorems for disturbances to parallel flows; (ii) a finite-amplitude local conservation law for the invariant, obeying the group-velocity property in the WKB limit; and (iii) a wave-mean-flow interaction theorem consisting of generalized Eliassen–Palm flux diagnostics, an elliptic equation for the stream-function tendency, and a non-acceleration theorem. All these results are analogous to their QG forms. The pseudomomentum invariant – a conserved second-order disturbance quantity that is associated with zonal symmetry – is constructed using a variational principle in a similar manner to the QG calculations. Such an approach is possible when the equations of motion under the geostrophic momentum approximation are transformed to isentropic and geostrophic coordinates, in which the ageostrophic advection terms are no longer explicit. Symmetry-related wave-activity invariants such as the pseudomomentum then arise naturally from the Hamiltonian structure of the SG equations. 
We avoid use of the so-called ‘massless layer’ approach to the modelling of isentropic gradients at the lower boundary, preferring instead to incorporate explicitly those boundary contributions into the wave-activity and stability results. This makes the analogy with QG dynamics most transparent. This paper treats the f-plane Boussinesq form of SG dynamics, and its recent extension to β-plane, compressible flow by Magnusdottir & Schubert. In the limit of small Rossby number, the results reduce to their respective QG forms. Novel features particular to SG dynamics include apparently unnoticed lateral boundary stability criteria in (i), and the necessity of including additional zonal-mean eddy correlation terms besides the zonal-mean potential vorticity fluxes in the wave-mean-flow balance in (iii). In the companion paper, wave-activity conservation laws and stability theorems based on the SG form of the pseudoenergy are presented.
Abstract:
We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. The modified algorithm runs more than 50 times faster on the CELL’s Synergistic Processing Elements than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60% of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
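The scheduling pattern described above, in which individual air columns are handed to a pool of worker threads instead of being assigned by domain decomposition, can be sketched as follows. This is a minimal illustration only: `compute_column` is a hypothetical stand-in for the radiation calculation, the data are made up, and the original work was written in C with SIMD packing of four columns, which is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compute_column(column: List[float]) -> float:
    """Hypothetical stand-in for the radiation calculation on one air column."""
    return sum(level * level for level in column)


def compute_all_columns(columns: List[List[float]], workers: int = 4) -> List[float]:
    """Schedule each column as an independent task on a thread pool.

    The executor keeps an internal task queue; idle workers pull the next
    column, so uneven column costs are balanced automatically.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input columns.
        return list(pool.map(compute_column, columns))


columns = [[1.0, 2.0], [3.0], [0.5, 0.5, 0.5]]
fluxes = compute_all_columns(columns)
```

In standard CPython the global interpreter lock prevents a real speed-up for pure-Python arithmetic, so this shows the task-queue structure rather than the performance gains reported in the abstract.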
Abstract:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
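For context, the conventional parallel formulation of k-means combines per-process partial sums in a global reduction at every iteration, and that is the step the abstract's protocol aims to replace. The sketch below shows only this conventional baseline for 2-D points, with partitions standing in for processes. The dynamic group communication protocol itself is not described in the abstract and is not reproduced. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: (point[0] - centroids[k][0]) ** 2
        + (point[1] - centroids[k][1]) ** 2,
    )


def local_sums(points: Sequence[Point], centroids: Sequence[Point]) -> List[List[float]]:
    """Per-process step: coordinate sums and point counts for each cluster."""
    sums = [[0.0, 0.0, 0] for _ in centroids]
    for p in points:
        k = nearest(p, centroids)
        sums[k][0] += p[0]
        sums[k][1] += p[1]
        sums[k][2] += 1
    return sums


def kmeans_step(partitions: Sequence[Sequence[Point]],
                centroids: Sequence[Point]) -> List[Point]:
    """One iteration: add up the partial sums of every partition (this loop
    plays the role of the global reduction) and recompute the centroids."""
    total = [[0.0, 0.0, 0] for _ in centroids]
    for part in partitions:
        for k, (sx, sy, n) in enumerate(local_sums(part, centroids)):
            total[k][0] += sx
            total[k][1] += sy
            total[k][2] += n
    # A cluster that received no points keeps its previous centroid.
    return [
        (sx / n, sy / n) if n else tuple(centroids[k])
        for k, (sx, sy, n) in enumerate(total)
    ]


partitions = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
new_centroids = kmeans_step(partitions, [(1.0, 1.0), (9.0, 9.0)])
```

Each partition contributes only its per-cluster sums and counts, so the data exchanged per iteration depend on the number of clusters, not on the number of points.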
Abstract:
The dynamic viscoelasticity of electrorheological (ER) fluids based on microcrystalline cellulose/castor oil suspensions was experimentally investigated in squeeze flow. The dependence of the storage modulus G' and loss modulus G" parallel to the external electric field on field strength and strain amplitude is presented. The experiments show that, when the external electric field is higher than the critical field, the viscoelasticity of the ER fluids changes from linear to nonlinear, and the ER fluids pass from a solid-like state to a fluid state as the strain amplitude grows. The influences of strain amplitude and oscillatory frequency on the nonlinearity of the viscoelasticity were also studied.
Abstract:
Many studies of wind conditions in passages between slab-type buildings have been conducted in the past. However, because of the complexity of buildings and aerodynamics, wind conditions under different building structures and configurations remain unclear, and existing studies still cannot provide guidance for urban planning and design. The aim of this paper is to provide more insight into the mechanism of wind conditions in passages. A simplified passage model with non-parallel buildings is developed on the basis of the wind tunnel experiments conducted by Blocken et al. (2008). Numerical simulation based on CFD is employed for a detailed investigation of the wind environment in passages between two long narrow buildings with different directions, and the model is validated by comparing numerical results with the corresponding wind tunnel measurements.
Abstract:
We study a series of transient entries into the low-latitude boundary layer (LLBL) of all four Cluster spacecraft during an outbound pass through the mid-afternoon magnetopause ([X(GSM), Y(GSM), Z(GSM)] ≈ [2, 7, 9] R(E)). The events take place during an interval of northward IMF, as seen in the data from the ACE satellite and lagged by a propagation delay of 75 min that is well-defined by two separate studies: (1) the magnetospheric variations prior to the northward turning (Lockwood et al., 2001, this issue) and (2) the field clock angle seen by Cluster after it had emerged into the magnetosheath (Opgenoorth et al., 2001, this issue). With an additional lag of 16.5 min, the transient LLBL events correlate well with swings of the IMF clock angle (in GSM) to near 90°. Most of this additional lag is explained by ground-based observations, which reveal signatures of transient reconnection in the pre-noon sector that then take 10-15 min to propagate eastward to 15 MLT, where they are observed by Cluster. The eastward phase speed of these signatures agrees very well with the motion deduced by the cross-correlation of the signatures seen on the four Cluster spacecraft. The evidence that these events are reconnection pulses includes: transient erosion of the noon 630 nm (cusp/cleft) aurora to lower latitudes; transient and travelling enhancements of the flow into the polar cap, imaged by the AMIE technique; and poleward-moving events moving into the polar cap, seen by the EISCAT Svalbard Radar (ESR). A pass of the DMSP-F15 satellite reveals that the open field lines near noon have been opened for some time: the more recently opened field lines were found closer to dusk where the flow transient and the poleward-moving event intersected the satellite pass.
The events at Cluster have ion and electron characteristics predicted and observed by Lockwood and Hapgood (1998) for a Flux Transfer Event (FTE), with allowance for magnetospheric ion reflection at Alfvénic disturbances in the magnetopause reconnection layer. Like FTEs, the events are about 1 R(E) in their direction of motion and show a rise in the magnetic field strength, but unlike FTEs, in general, they show no pressure excess in their core and hence, no characteristic bipolar signature in the boundary-normal component. However, most of the events were observed when the magnetic field was southward, i.e. on the edge of the interior magnetic cusp, or when the field was parallel to the magnetic equatorial plane. Only when the satellite begins to emerge from the exterior boundary (when the field was northward), do the events start to show a pressure excess in their core and the consequent bipolar signature. We identify the events as the first observations of FTEs at middle altitudes.
Abstract:
A coordinated ground-based observational campaign using the IMAGE magnetometer network, EISCAT radars and optical instruments on Svalbard has made possible detailed studies of a travelling convection vortices (TCV) event on 6 January 1992. Combining the data from these facilities allows us to draw a very detailed picture of the features and dynamics of this TCV event. On the way from the noon to the dawn meridian, the vortices went through a remarkable development. The propagation velocity in the ionosphere increased from 2.5 to 7.4 km s⁻¹, and the orientation of the major axes of the vortices rotated from being almost parallel to the magnetic meridian near noon to essentially perpendicular at dawn. By combining electric fields obtained by EISCAT and ionospheric currents deduced from magnetic field recordings, conductivities associated with the vortices could be estimated. Contrary to expectations, we found higher conductivities below the downward field-aligned current (FAC) filament than below the upward-directed one. Unexpected results also emerged from the optical observations. For most of the time there was no discrete aurora at 557.7 nm associated with the TCVs. Only once did a discrete form appear at the foot of the upward FAC. This aurora subsequently expanded eastward and westward, leaving its centre at the same longitude while the TCV continued to travel westward. We also try to identify the source regions of TCVs in the magnetosphere and discuss possible generation mechanisms.
Abstract:
We present an analysis of a cusp ion step, observed by the Defense Meteorological Satellite Program (DMSP) F10 spacecraft, between two poleward moving events of enhanced ionospheric electron temperature, observed by the European Incoherent Scatter (EISCAT) radar. From the ions detected by the satellite, the variation of the reconnection rate is computed for assumed distances along the open-closed field line separatrix from the satellite to the X line, d_o. Comparison with the onset times of the associated ionospheric events allows this distance to be estimated, but with an uncertainty due to the determination of the low-energy cutoff of the ion velocity distribution function, f(v). Nevertheless, the reconnection site is shown to be on the dayside magnetopause, consistent with the reconnection model of the cusp during southward interplanetary magnetic field (IMF). Analysis of the time series of distribution function at constant energies, f(t_s), shows that the best estimate of the distance d_o is 14.5±2 R_E. This is consistent with various magnetopause observations of the signatures of reconnection for southward IMF. The ion precipitation is used to reconstruct the field-parallel part of the Cowley D ion distribution function injected into the open low-latitude boundary layer in the vicinity of the X line. From this reconstruction, the field-aligned component of the magnetosheath flow is found to be only −55±65 km s⁻¹ near the X line, which means either that the reconnection X line is near the stagnation region at the nose of the magnetosphere, or that it is closely aligned with the magnetosheath flow streamline which is orthogonal to the magnetosheath field, or both. In addition, the sheath Alfvén speed at the X line is found to be 220±45 km s⁻¹, and the speed with which newly opened field lines are ejected from the X line is 165±30 km s⁻¹.
We show that the inferred magnetic field, plasma density, and temperature of the sheath near the X line are consistent with a near-subsolar reconnection site and confirm that the magnetosheath field makes a large angle (>58°) with the X line.
Abstract:
In this paper, we summarise this recent progress to underline the features specific to this nonlinear elliptic case, and we give a new classification of boundary conditions on the semistrip that satisfy a necessary condition for yielding a boundary value problem that can be effectively linearised. This classification is based on formulating the equation in terms of an alternative Lax pair.
Abstract:
A parallel formulation for the simulation of a branch prediction algorithm is presented. This parallel formulation identifies independent tasks in the algorithm which can be executed concurrently. The parallel implementation is based on the multithreading model and two parallel programming platforms: pthreads and Cilk++. Improvement in execution performance by up to 7 times is observed for a generic 2-bit predictor in a 12-core multiprocessor system.
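To make the simulated object concrete, the sketch below models a generic 2-bit predictor as a table of saturating counters indexed by the low bits of the branch address. It is a sequential illustration only: the table size, the initial counter state, the trace format and the function name are assumptions, and the paper's pthreads and Cilk++ parallelisation is not reproduced.

```python
from typing import Iterable, Tuple


def simulate_two_bit_predictor(trace: Iterable[Tuple[int, bool]],
                               table_bits: int = 4) -> float:
    """Return the prediction accuracy of a table of 2-bit saturating counters.

    trace yields (branch_address, taken) pairs. Counter states 0 and 1
    predict "not taken"; states 2 and 3 predict "taken".
    """
    size = 1 << table_bits
    counters = [1] * size  # assumed initial state: weakly not taken
    correct = 0
    total = 0
    for address, taken in trace:
        index = address & (size - 1)  # low address bits select the counter
        predicted_taken = counters[index] >= 2
        correct += predicted_taken == taken
        total += 1
        # Saturating update: move towards 3 on taken, towards 0 on not taken.
        if taken:
            counters[index] = min(3, counters[index] + 1)
        else:
            counters[index] = max(0, counters[index] - 1)
    return correct / total if total else 0.0


# A loop branch that is taken nine times and then falls through, repeated ten times.
loop_trace = [(0x40, i % 10 != 9) for i in range(100)]
accuracy = simulate_two_bit_predictor(loop_trace)
```

On this synthetic loop trace the predictor mispredicts the first iteration and each loop exit, giving 89 correct predictions out of 100. Simulations of different traces or predictor configurations do not share state, which is the kind of independence a parallel formulation can exploit.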