864 results for parallel scalability
Flow And Heat-Transfer Over An Upstream Moving Wall With A Magnetic-Field And A Parallel Free Stream
Abstract:
The flow and heat transfer over an upstream-moving non-isothermal wall with a parallel free stream have been considered. A magnetic field has been applied in the free stream parallel to the wall, and the effect of the induced magnetic field has been included in the analysis. The boundary layer equations governing the steady incompressible flow of an electrically conducting fluid have been solved numerically using a shooting method. This problem is interesting because a solution exists only when the ratio of the wall velocity to the free-stream velocity does not exceed a certain critical value, and this critical value depends on the magnetic field and the magnetic Prandtl number. Also, dual solutions exist for a certain range of wall velocity.
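As a hedged illustration of the shooting method mentioned above, the sketch below applies it to the classical Blasius flat-plate problem, f''' + 0.5 f f'' = 0 with f(0) = f'(0) = 0 and f'(∞) = 1. This is a simplified stand-in, not the magnetohydrodynamic moving-wall system of the abstract; the function names, the truncation of infinity at eta = 10, and the bisection bracket are all assumptions made for the example.

```python
def blasius_rhs(state):
    """Right-hand side of f''' + 0.5*f*f'' = 0 written as a first-order system."""
    f, fp, fpp = state
    return (fp, fpp, -0.5 * f * fpp)


def far_field_velocity(fpp0, eta_max=10.0, steps=1000):
    """Integrate from the wall with the guess f''(0) = fpp0 (classical RK4)
    and return f'(eta_max), which should approach 1 for the right guess."""
    h = eta_max / steps
    y = (0.0, 0.0, fpp0)  # f(0) = 0, f'(0) = 0, guessed f''(0)
    for _ in range(steps):
        k1 = blasius_rhs(y)
        k2 = blasius_rhs(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k1)))
        k3 = blasius_rhs(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k2)))
        k4 = blasius_rhs(tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(
            yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
        )
    return y[1]


def shoot(lo=0.1, hi=1.0, tol=1e-10):
    """Bisection on the unknown wall shear f''(0) until f'(eta_max) = 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if far_field_velocity(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"f''(0) = {shoot():.5f}")
```

A single bisection bracket converges to one solution only; where dual solutions exist, as reported in the abstract, each branch would need its own starting bracket.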
Abstract:
Massively parallel SIMD computing is applied to obtain an order-of-magnitude improvement in the execution speed of an important algorithm in VLSI design automation. The physical design of a VLSI circuit involves logic module placement as a subtask. The paper is concerned with accelerating the well-known Min-cut placement technique for logic cell placement. The inherent parallelism of the Min-cut algorithm is identified, and it is shown that a parallel machine based on the efficient execution of the placement procedure.
Abstract:
X-ray crystallographic studies on 3′–5′ oligomers have provided a great deal of information on the stereochemistry and conformational flexibility of nucleic acids and polynucleotides. In contrast, there is very little information available on 2′–5′ polynucleotides. We have now obtained the crystal structure of Cytidylyl-2′,5′-Adenosine (C2′p5′A) at atomic resolution to establish the conformational differences between these two classes of polymers. The dinucleoside phosphate crystallises in the monoclinic space group C2, with a = 33.912(4) Å, b = 16.824(4) Å, c = 12.898(2) Å and β = 112.35(1)°, with two molecules in the asymmetric unit. Spectacularly, the two independent C2′p5′A molecules in the asymmetric unit form right-handed miniature parallel-stranded double helices with their respective crystallographic two-fold (b axis) symmetry mates. Remarkably, the two mini duplexes are almost indistinguishable. The cytosines and adenines form self-pairs with three and two hydrogen bonds, respectively. The conformation of the C and A residues about the glycosyl bond is anti, the same as in the 3′–5′ analog, but contrasts with the anti and syn geometry of the C and A residues in A2′p5′C. The furanose ring conformation is C3′endo, C2′endo mixed puckering, as in the C3′p5′A-proflavine complex. A comparison of the backbone torsion angles with other 2′–5′ dinucleoside structures reveals that the major deviations occur in the torsion angles about the C3′–C2′ and C4′–C3′ bonds. A right-handed 2′–5′ parallel-stranded double helix having eight base pairs per turn and a 45° turn angle between them has been constructed using this dinucleoside phosphate as the repeat unit. A discussion of the 2′–5′ parallel-stranded double helix and its relevance to biological systems is presented.
Abstract:
Single tract guanine residues can associate to form stable parallel quadruplex structures in the presence of certain cations. Nanosecond-scale molecular dynamics simulations have been performed on a fully solvated fibre model of parallel d(G7) quadruplex structures with Na+ or K+ ions coordinated in the cavity formed by the O6 atoms of the guanine bases. The AMBER 4.1 force field and the Particle Mesh Ewald technique for electrostatic interactions have been used in all simulations. These quadruplex structures are stable during the simulation, with the middle four base tetrads showing root mean square deviation values between 0.5 and 0.8 Å from the initial structure as well as from the high-resolution crystal structure. Even in the absence of any coordinated ion in the initial structure, the G-quadruplex structure remains intact throughout the simulation. During the 1.1 ns MD simulation, one Na+ counter ion from the solvent as well as several water molecules enter the central cavity to occupy the empty coordination sites within the parallel quadruplex and help stabilize the structure. The hydrogen bonding pattern depends on the nature of the coordinated ion, with the G-tetrad undergoing local structural variation to accommodate cations of different sizes. In the absence of any coordinated ion, due to strong mutual repulsion, the O6 atoms within a G-tetrad are forced farther apart from each other, which leads to a considerably different hydrogen bonding scheme within the G-tetrads and a very favourable interaction energy between the guanine bases constituting a G-tetrad. However, a coordinated ion between G-tetrads provides extra stacking energy for the G-tetrads and makes the quadruplex structure more rigid. Na+ ions within the quadruplex cavity are more mobile than coordinated K+ ions. A number of hydrogen-bonded water molecules are observed within the grooves of all quadruplex structures.
Abstract:
A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is proposed. It involves column-row permutations and is well suited to mapping onto the linear array topology of SIMD architectures. The efficiency of the algorithm is compared with that of other existing algorithms. The interconnectivity and the memory requirement of the linear array are discussed, and the complexity of its layout area is derived. The parallel version of the algorithm mapped onto the linear array is then introduced and explained with the help of an example. The optimality of the parallel algorithm is proved by deriving the time complexities of the algorithm on a single processor and on the linear array.
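To make the idea of bandwidth reduction by symmetric row-column permutation concrete, here is a minimal sequential sketch. It uses the well-known Cuthill-McKee breadth-first ordering as a stand-in; it is not the algorithm proposed in the abstract, and the adjacency-dictionary representation and function names are assumptions made for the example.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest |position(i) - position(j)| over all nonzeros (i, j) when rows
    and columns are both permuted into the given order."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def cuthill_mckee(adj):
    """Breadth-first ordering that visits low-degree vertices first.

    `adj` maps each row index to the list of column indices holding
    off-diagonal nonzeros of a symmetric matrix."""
    by_degree = lambda v: (len(adj[v]), v)
    visited, order = set(), []
    for start in sorted(adj, key=by_degree):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in sorted(adj[u], key=by_degree):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    return order


if __name__ == "__main__":
    # A path graph with a poor labelling: 0-3, 3-1, 1-4, 4-2.
    adj = {0: [3], 1: [3, 4], 2: [4], 3: [0, 1], 4: [1, 2]}
    natural = sorted(adj)
    reordered = cuthill_mckee(adj)
    print(bandwidth(adj, natural), "->", bandwidth(adj, reordered))
```

Applying the same permutation to rows and columns keeps the matrix symmetric, which is why a single vertex ordering is enough to describe it.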
Abstract:
Molecular dynamics (MD) studies have been carried out on the Hoogsteen hydrogen-bonded parallel and the reverse Hoogsteen hydrogen-bonded antiparallel C.G*G triplexes. Earlier molecular mechanics studies had shown that the parallel structure was energetically more favourable than the antiparallel structure. To characterize the structural stability of the two triplexes, and to investigate whether the antiparallel structure can transit to an energetically more favourable structure through local fluctuations during the MD simulation, the two structures were subjected to 200 ps of constant-temperature vacuum MD simulations at 300 K. Initially no constraints were applied to the structures. For the antiparallel triplex, the structure showed a large root mean square deviation from the starting structure within the first 12 ps, and the N4-H41-O6 hydrogen bond in the WC duplex was distorted by a high propeller twist and a moderate increase in the opening angle of the base pairs. Starting from an initial value of 30 degrees, the helical twist of the average structure from this simulation had a value of 36 degrees, while the parallel structure stabilized at a twist of 33 degrees. In spite of the hydrogen bond distortions in the antiparallel triplex, it was energetically comparable to the parallel triplex. To examine the structural characteristics of an undistorted structure, another MD simulation was performed on the antiparallel triplex with all the hydrogen bonds constrained. This structure stabilized at an average twist of 33 degrees. Although the energy of the molecule improved over that of the initial structure in the course of the dynamics, it did not become comparable to that of the parallel structure. Energy minimization studies performed in the presence of explicit water and counterions also showed the two structures to be equally favourable energetically. Together these results indicate that the parallel C.G*G triplex with Hoogsteen hydrogen bonds also represents a stereochemically and energetically favourable structure for this class of triplexes.
Abstract:
In this paper we develop a multithreaded VLSI processor linear array architecture to render complex environments based on the radiosity approach. The processing elements are identical and multithreaded. They work in Single Program Multiple Data (SPMD) mode. A new algorithm to do the radiosity computations based on the progressive refinement approach[2] is proposed. Simulation results indicate that the architecture is latency tolerant and scalable. It is shown that a linear array of 128 uni-threaded processing elements sustains a throughput close to 0.4 million patches/sec.
Abstract:
In this paper, a wireless control strategy for parallel operation of three-phase four-wire inverters is proposed. A generalized situation is considered in which the inverters are of unequal power ratings and the loads are nonlinear and unbalanced. The proposed control algorithm exploits the potential of a sinusoidal-domain proportional+multiresonant controller (in the inner voltage regulation loop) to make the system suitable for nonlinear and unbalanced loads, with a simple and generalized structure for the virtual output-impedance loop. Decentralized operation is achieved by using three-phase P/Q droop characteristics. The overall control algorithm helps to limit the harmonic content and the degree of unbalance in the output-voltage waveform and to achieve excellent power-sharing accuracy in spite of mismatch in the inverter output impedances. Moreover, a synchronized turn-on with a subsequent changeover to the droop mode is applied to each new incoming unit in order to limit the circulating current completely. Simulation and experimental results from 1 kVA and 0.5 kVA paralleled units validate the effectiveness of the scheme.
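The steady-state effect of P droop on power sharing can be sketched with a few lines of arithmetic. The example assumes the textbook linear droop law f = f0 - m_i * P_i with each slope m_i chosen inversely proportional to the unit rating; the abstract does not give its droop coefficients, so the 0.5 Hz full-load drop, the function name, and the 900 W load are illustrative assumptions only.

```python
def share_active_power(total_load_w, ratings_va, max_freq_drop_hz=0.5):
    """Steady-state active-power split among droop-controlled units.

    Each unit follows f = f0 - m_i * P_i, where m_i = max_freq_drop_hz / rating_i
    (an assumed design rule). All units settle at the same frequency, so the
    common deviation is df = P_total / sum(1 / m_i) and P_i = df / m_i.
    """
    slopes = [max_freq_drop_hz / rating for rating in ratings_va]
    freq_drop = total_load_w / sum(1.0 / m for m in slopes)
    shares = [freq_drop / m for m in slopes]
    return shares, freq_drop


if __name__ == "__main__":
    # Two units rated 1 kVA and 0.5 kVA feeding a hypothetical 900 W load.
    shares, freq_drop = share_active_power(900.0, [1000.0, 500.0])
    print(shares, freq_drop)
```

With slopes set this way the units share the load in proportion to their ratings without any communication link, which is the sense in which such schemes are called wireless.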
Abstract:
In this paper, we introduce an analytical technique based on queueing networks and Petri nets for making a performance analysis of dataflow computations when executed on the Manchester machine. This technique is also applicable for the analysis of parallel computations on multiprocessors. We characterize the parallelism in dataflow computations through a four-parameter characterization, namely, the minimum parallelism, the maximum parallelism, the average parallelism and the variance in parallelism. We observe through detailed investigation of our analytical models that the average parallelism is a good characterization of the dataflow computations only as long as the variance in parallelism is small. However, significant difference in performance measures will result when the variance in parallelism is comparable to or higher than the average parallelism.
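The four-parameter characterization described above is easy to compute from a parallelism profile. The sketch below assumes the profile is given as a list holding the number of simultaneously executable operations at each time step; that representation and the function name are assumptions for illustration, not the authors' model.

```python
from statistics import fmean, pvariance


def characterize_parallelism(profile):
    """Return the minimum, maximum, average and variance of a parallelism
    profile (one entry per time step)."""
    if not profile:
        raise ValueError("profile must contain at least one time step")
    return {
        "min": min(profile),
        "max": max(profile),
        "average": fmean(profile),
        "variance": pvariance(profile),  # population variance over the run
    }


if __name__ == "__main__":
    bursty = characterize_parallelism([1, 4, 4, 1])
    steady = characterize_parallelism([2, 3, 2, 3])
    print(bursty)
    print(steady)
```

Both profiles have the same average parallelism of 2.5, but the first has nine times the variance of the second, which is the kind of difference the abstract says the average alone fails to capture.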
Abstract:
We use parallel weighted finite-state transducers to implement a part-of-speech tagger, which obtains state-of-the-art accuracy when used to tag the Europarl corpora for Finnish, Swedish and English. Our system consists of a weighted lexicon and a guesser combined with a bigram model factored into two weighted transducers. We use both lemmas and tag sequences in the bigram model, which guarantees reliable bigram estimates.
Abstract:
Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz., the instruction decoder and the register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides, on average, a 14.5% and 17.3% benefit in decoder power consumption over a purely hardware-based scheme for 2-clustered and 4-clustered VLIW machines, respectively. In the case of register files, we propose two new scheduling algorithms that exploit a limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with a performance improvement of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for 2-clustered (4-clustered) VLIW machines. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Adsorption of n-alkane mixtures in the zeolite LTA-5A under liquid-phase conditions has been studied using grand canonical Monte Carlo (GCMC) simulations combined with parallel tempering. Normal GCMC techniques fail for some of these systems because linear molecules prefer to coil within a single cage in the zeolite. The narrow zeolite windows severely restrict interactions of the molecules, making it difficult to simulate the cooperative rearrangements necessary to explore configuration space. For these reasons, normal GCMC simulation results show poor reproducibility in some cases. These problems were overcome with parallel tempering techniques. Even with parallel tempering, these are very challenging systems for molecular simulation. Similar problems may arise for other zeolites, such as CHA, AFX, ERI, KFI, and RHO, that have cages connected by narrow windows. The simulations capture the complex selectivity behavior observed in experiments, such as selectivity inversion and azeotrope formation.
Abstract:
Floquet analysis is widely used for small-order systems (say, order M < 100) to find trim results of control inputs and periodic responses, and stability results of damping levels and frequencies. Presently, however, it is practical neither for design applications nor for comprehensive analysis models that lead to large systems (M > 100); the run time on a sequential computer is simply prohibitive. Accordingly, a massively parallel Floquet analysis is developed with emphasis on large systems, and it is implemented on two SIMD (single-instruction, multiple-data) computers with 4096 and 8192 processors. The focus of this development is a parallel shooting method with damped Newton iteration to generate trim results; the Floquet transition matrix (FTM) comes out as a byproduct. The eigenvalues and eigenvectors of the FTM are computed by a parallel QR method, and thereby stability results are generated. For illustration, flap and flap-lag stability of isolated rotors are treated by the parallel analysis and by a corresponding sequential analysis with the conventional shooting and QR methods; linear quasisteady airfoil aerodynamics and a finite-state three-dimensional wake model are used. Computational reliability is quantified by the condition numbers of the Jacobian matrices in Newton iteration, the condition numbers of the eigenvalues, and the residual errors of the eigenpairs; the reliability figures are comparable in the parallel and sequential analyses. Compared to the sequential analysis, the parallel analysis reduces the run time of large systems dramatically, and the reduction increases with increasing system order; this finding offers considerable promise for design and comprehensive-analysis applications.
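How stability results follow from the Floquet transition matrix can be shown on a toy problem. The sketch below is sequential and uses a damped Mathieu-type equation, x'' + c x' + (a + b cos t) x = 0 with period 2π, as an assumed stand-in for the rotor models of the abstract; the trim (shooting with Newton iteration) step is omitted, and all function names and parameter values are hypothetical.

```python
import cmath
import math

PERIOD = 2.0 * math.pi


def rhs(t, y, a, b, c):
    """First-order form of x'' + c*x' + (a + b*cos t)*x = 0."""
    return (y[1], -(a + b * math.cos(t)) * y[0] - c * y[1])


def propagate(y0, a, b, c, steps=2000):
    """Integrate one period with classical RK4 and return the final state."""
    h = PERIOD / steps
    t, y = 0.0, y0
    for _ in range(steps):
        k1 = rhs(t, y, a, b, c)
        k2 = rhs(t + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]), a, b, c)
        k3 = rhs(t + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]), a, b, c)
        k4 = rhs(t + h, (y[0] + h * k3[0], y[1] + h * k3[1]), a, b, c)
        y = (
            y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )
        t += h
    return y


def floquet_transition_matrix(a, b, c):
    """Columns are the states after one period from unit initial conditions."""
    col0 = propagate((1.0, 0.0), a, b, c)
    col1 = propagate((0.0, 1.0), a, b, c)
    return ((col0[0], col1[0]), (col0[1], col1[1]))


def floquet_multipliers(ftm):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = ftm[0][0] + ftm[1][1]
    det = ftm[0][0] * ftm[1][1] - ftm[0][1] * ftm[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + root) / 2.0, (trace - root) / 2.0)


def is_stable(a, b, c):
    """Asymptotically stable when every multiplier lies inside the unit circle."""
    multipliers = floquet_multipliers(floquet_transition_matrix(a, b, c))
    return max(abs(m) for m in multipliers) < 1.0


if __name__ == "__main__":
    print(is_stable(1.0, 0.0, 0.1), is_stable(-1.0, 0.0, 0.0))
```

For a system of order M the transition matrix needs M such integrations, one per unit initial condition, and these are independent of each other; that independence is one natural source of parallelism for large M.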