135 results for Parallel execution


Relevance: 20.00%

Abstract:

Computational docking of ligands to protein structures is a key step in structure-based drug design. Currently, the time required for each docking run is high, which limits the use of docking in a high-throughput manner and warrants parallelization of docking algorithms. AutoDock, a widely used tool, has been chosen for parallelization. Near-linear increases in speed were observed with 96 processors: in one example, the time required for docking ligands to HIV-protease fell from 81 min on a single IBM Power-5 processor (1.65 GHz) to about 1 min on an IBM cluster with 96 such processors. This implementation would make it feasible to perform virtual ligand screening using AutoDock.
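The timing figures quoted above can be turned into a speedup and a parallel efficiency. The short sketch below does that arithmetic; the function name is hypothetical, and the 1 min value is the abstract's rounded "about 1 min", so the results are approximate.

```python
def speedup_and_efficiency(serial_minutes, parallel_minutes, processors):
    """Return (speedup, efficiency) for a run on `processors` processors."""
    speedup = serial_minutes / parallel_minutes
    efficiency = speedup / processors  # 1.0 would be perfectly linear scaling
    return speedup, efficiency


# Figures from the abstract: 81 min on one processor, about 1 min on 96.
speedup, efficiency = speedup_and_efficiency(81.0, 1.0, 96)
print(f"speedup ~ {speedup:.0f}x, efficiency ~ {efficiency:.0%}")
```

With these rounded inputs the speedup is roughly 81x on 96 processors, an efficiency of about 84%, which is consistent with the "near-linear" description.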

Relevance: 20.00%

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general-purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem - both scheduling and assignment of filters to processors - as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single-threaded CPU.
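The filter-and-channel structure described above can be illustrated with a minimal sketch. This is plain Python, not StreamIt syntax, and it does not implement the paper's ILP scheduling or GPU mapping; the `Filter` class, its `pop` rate and the two example filters are all hypothetical names chosen for illustration.

```python
from collections import deque


class Filter:
    """A stateless stream filter that consumes `pop` items per firing."""

    def __init__(self, name, pop, work):
        self.name = name
        self.pop = pop    # items read from the input channel per firing
        self.work = work  # function: list of `pop` items -> list of outputs

    def fire_all(self, channel_in):
        """Fire as often as the input channel allows; return the output channel."""
        channel_out = deque()
        while len(channel_in) >= self.pop:
            window = [channel_in.popleft() for _ in range(self.pop)]
            channel_out.extend(self.work(window))
        return channel_out


def run_pipeline(filters, source):
    """Connect filters in sequence, each reading the previous one's channel."""
    channel = deque(source)
    for f in filters:
        channel = f.fire_all(channel)
    return list(channel)


pipeline = [
    Filter("scale", pop=1, work=lambda w: [2 * w[0]]),
    Filter("pair_sum", pop=2, work=lambda w: [w[0] + w[1]]),
]
result = run_pipeline(pipeline, range(6))
print(result)  # [2, 10, 18]
```

Because each firing of a stateless filter depends only on its own input window, separate firings could run concurrently (data parallelism), and different filters could run on different processors (task and pipeline parallelism). The sketch runs everything sequentially and only shows the program structure.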

Relevance: 20.00%

Abstract:

The structures of two crystal forms of Boc-Trp-Ile-Ala-Aib-Ile-Val-Aib-Leu-Aib-Pro-OMe have been determined. The triclinic form (P1, Z = 1) from DMSO/H2O crystallizes as a dihydrate (Karle, Sukumar & Balaram (1986) Proc. Natl. Acad. Sci. USA 83, 9284-9288). The monoclinic form (P2(1), Z = 2) crystallized from dioxane is anhydrous. The conformation of the peptide is essentially the same in both crystal systems, but small changes in conformational angles are associated with a shift of the helix from a predominantly α-type to a predominantly 3(10)-type. The r.m.s. deviation of 33 atoms in the backbone and Cβ positions of residues 2-8 is only 0.29 Å between molecules in the two polymorphs. In both space groups, the helical molecules pack in a parallel fashion, rather than antiparallel. The only intermolecular hydrogen bonding is head-to-tail between helices. There are no lateral hydrogen bonds. In the P2(1) cell, a = 9.422(2) Å, b = 36.392(11) Å, c = 10.548(2) Å, β = 111.31(2)° and V = 3369.3 Å³ for 2 molecules of C60H97N11O13 per cell.

Relevance: 20.00%

Abstract:

REDEFINE is a reconfigurable SoC architecture that provides a unique platform for high-performance and low-power computing by exploiting the synergistic interaction between a coarse-grain dynamic dataflow model of computation (to expose abundant parallelism in applications) and runtime composition of efficient compute structures (on the reconfigurable computation resources). We propose and study the throttling of execution in REDEFINE to maximize the architecture efficiency. A feature-specific, fast hybrid (mixed-level) simulation framework for early-design-phase study is developed and implemented to make the huge design space exploration practical. Our performance modeling covers the selection of important performance criteria and the ranking of the explored throttling schemes, and we investigate the effectiveness of the design space exploration using statistical hypothesis testing. We find throttling schemes which simultaneously give an appreciable (24.8%) overall performance gain in the architecture and a 37% resource usage gain in the throttling unit.

Relevance: 20.00%

Abstract:

An apolar helical decapeptide with different end groups, Boc- or Ac-, crystallizes in a completely parallel fashion for the Boc-analog and in an antiparallel fashion for the Ac-analog. In both crystals, the packing motif consists of rows of parallel molecules. In the Boc-crystals, adjacent rows assemble with the helix axes pointed in the same direction. In the Ac-crystals, adjacent rows assemble with the helix axes pointed in opposite directions. The conformations of the molecules in both crystals are quite similar, predominantly α-helical, except for the tryptophanyl side chain, where χ1 ≅ 60° in the Boc-analog and ≅ 180° in the Ac-analog. As a result, there is one lateral hydrogen bond between helices, N(1ε)···O(7), in the Ac-analog. The structures do not provide a ready rationalization of packing preference in terms of side-chain interactions and do not support a major role for helix dipole interactions in determining helix orientation in crystals. The crystal parameters are as follows. Boc-analog: C60H97N11O13·C3H7OH, space group P1 with a = 10.250(3) Å, b = 12.451(4) Å, c = 15.077(6) Å, α = 96.55(3)°, β = 92.31(3)°, γ = 106.37(3)°, Z = 1, R = 5.5% for 5581 data (|F| > 3.0σ(F)), resolution 0.89 Å. Ac-analog: C57H91N11O12, space group P2(1) with a = 9.965(1) Å, b = 19.707(3) Å, c = 16.648(3) Å, β = 94.08(1)°, Z = 2, R = 7.2% for 2530 data (|F| > 3.0σ(F)), resolution 1.00 Å.

Relevance: 20.00%

Abstract:

In this paper, we consider the design and bit-error performance analysis of linear parallel interference cancellers (LPIC) for multicarrier (MC) direct-sequence code division multiple access (DS-CDMA) systems. We propose an LPIC scheme where we estimate and cancel the multiple access interference (MAI) based on the soft decision outputs on individual subcarriers, and the interference-cancelled outputs on different subcarriers are combined to form the final decision statistic. We scale the MAI estimate on individual subcarriers by a weight before cancellation. In order to choose these weights optimally, we derive exact closed-form expressions for the bit-error rate (BER) at the output of different stages of the LPIC, which we minimize to obtain the optimum weights for the different stages. In addition, using an alternate approach involving the characteristic function of the decision variable, we derive BER expressions for the weighted LPIC scheme, matched filter (MF) detector, decorrelating detector, and minimum mean square error (MMSE) detector for the considered multicarrier DS-CDMA system. We show that the proposed BER-optimized weighted LPIC scheme performs better than the MF detector and the conventional LPIC scheme (where the weights are taken to be unity), and close to the decorrelating and MMSE detectors.

Relevance: 20.00%

Abstract:

A new mathematical model for the solution of the problem of free convection heat transfer between vertical parallel flat isothermal plates under isothermal boundary conditions has been presented. The set of boundary layer equations used in the model is transformed to nonlinear coupled differential equations by similarity type variables as obtained by Ostrach for vertical flat plates in an infinite fluid medium. By utilising a parameter ηw* to represent the outer boundary, the governing differential equations are solved numerically for parametric values of Pr = 0.733, 2 and 3, and ηw* = 0.1, 0.5, 1, 2, 3, 4, ... and 8.0. The velocity and temperature profiles are presented. Results indicate that ηw* can effectively classify the system into (1) thin layers where conduction predominates, (2) intermediate layers and (3) thick layers whose results can be predicted by the solutions for vertical flat plates in an infinite fluid medium. Heat transfer correlations are presented for the 3 categories. Several experimental and analytical results available in the literature agree with the present correlations.

Relevance: 20.00%

Abstract:

This paper presents a study of kinematic and force singularities in parallel manipulators and closed-loop mechanisms and their relationship to accessibility and controllability of such manipulators and closed-loop mechanisms. Parallel manipulators and closed-loop mechanisms are classified according to their degrees of freedom, the number of output Cartesian variables used to describe their motion and the number of actuated joint inputs. The singularities in the workspace are obtained by considering the force transformation matrix which maps the forces and torques in joint space to output forces and torques in Cartesian space. The regions in the workspace which violate the small time local controllability (STLC) and small time local accessibility (STLA) conditions are obtained by deriving the equations of motion in terms of Cartesian variables and by using techniques from Lie algebra. We show that for fully actuated manipulators, when the number of actuated joint inputs is equal to the number of output Cartesian variables and the force transformation matrix loses rank, the parallel manipulator does not meet the STLC requirement. For the case where the number of joint inputs is less than the number of output Cartesian variables, if the constraint forces and torques (represented by the Lagrange multipliers) become infinite, the force transformation matrix loses rank. Finally, we show that the singular and non-STLC regions in the workspace of a parallel manipulator and closed-loop mechanism can be reduced by adding redundant joint actuators and links. The results are illustrated with the help of numerical examples where we plot the singular and non-STLC/non-STLA regions of parallel manipulators and closed-loop mechanisms belonging to the above mentioned classes. (C) 2000 Elsevier Science Ltd. All rights reserved.

Relevance: 20.00%

Abstract:

The present work concerns the static scheduling of jobs to parallel identical batch processors with incompatible job families for minimizing the total weighted tardiness. This scheduling problem is applicable in burn-in operations and wafer fabrication in semiconductor manufacturing. We decompose the problem into two stages: batch formation and batch scheduling, as in the literature. An Ant Colony Optimization (ACO) based algorithm, called ATC-BACO, is developed, in which ACO is used to solve the batch scheduling problems. Our computational experimentation shows that the proposed ATC-BACO algorithm performs better than the best available traditional dispatching rule, the ATC-BATC rule.
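To make the objective concrete, the sketch below computes total weighted tardiness and applies a dispatching rule of the kind the abstract compares against. It assumes that "ATC" refers to the standard Apparent Tardiness Cost index, which the abstract does not spell out, and it works on single jobs on one machine. It is not the batch-level ATC-BATC rule or the ATC-BACO algorithm; the job data, the look-ahead parameter `k` and all function names are hypothetical.

```python
import math


def atc_index(job, t, k, p_avg):
    """Apparent Tardiness Cost priority of a job at time t (assumed standard form)."""
    _, weight, proc, due = job
    slack = max(due - proc - t, 0.0)
    return (weight / proc) * math.exp(-slack / (k * p_avg))


def atc_schedule(jobs, k=2.0):
    """Sequence jobs on one machine by repeatedly picking the highest ATC index.

    Returns (order of job names, total weighted tardiness).
    """
    remaining = list(jobs)
    t, order, total = 0.0, [], 0.0
    while remaining:
        # Average processing time of the jobs still waiting.
        p_avg = sum(j[2] for j in remaining) / len(remaining)
        best = max(remaining, key=lambda j: atc_index(j, t, k, p_avg))
        remaining.remove(best)
        t += best[2]                            # job completes at time t
        total += best[1] * max(t - best[3], 0.0)  # weight * tardiness
        order.append(best[0])
    return order, total


# Each job is (name, weight, processing time, due date); values are illustrative.
jobs = [("A", 1.0, 2.0, 10.0), ("B", 3.0, 2.0, 3.0), ("C", 2.0, 4.0, 4.0)]
order, total_weighted_tardiness = atc_schedule(jobs)
print(order, total_weighted_tardiness)
```

In this toy instance the rule schedules the urgent, heavily weighted job B first, and only job C finishes late.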

Relevance: 20.00%

Abstract:

Ramakrishnan A, Chokhandre S, Murthy A. Voluntary control of multisaccade gaze shifts during movement preparation and execution. J Neurophysiol 103: 2400-2416, 2010. First published February 17, 2010; doi: 10.1152/jn.00843.2009. Although the nature of gaze control regulating single saccades is relatively well documented, how such control is implemented to regulate multisaccade gaze shifts is not known. We used highly eccentric targets to elicit multisaccade gaze shifts and tested the ability of subjects to control the saccade sequence by presenting a second target on random trials. Their response allowed us to test the nature of control at many levels: before, during, and between saccades. Although the saccade sequence could be inhibited before it began, we observed clear signs of truncation of the first saccade, which confirmed that it could be inhibited in midflight as well. Using a race model that explains the control of single saccades, we estimated that it took about 100 ms to inhibit a planned saccade but took about 150 ms to inhibit a saccade during its execution. Although the time taken to inhibit was different, the high subject-wise correlation suggests a unitary inhibitory control acting at different levels in the oculomotor system. We also frequently observed responses that consisted of hypometric initial saccades, followed by secondary saccades to the initial target. Given the estimates of the inhibitory process provided by the model, which also took into account the variances of the processes, the secondary saccades (average latency ~215 ms) should have been inhibited. Failure to inhibit the secondary saccade suggests that the intersaccadic interval in a multisaccade response is a ballistic stage. Collectively, these data indicate that the oculomotor system can control a response until a very late stage in its execution. However, if the response consists of multiple movements, then the preparation of the second movement becomes refractory to new visual input, either because it is part of a preprogrammed sequence or as a consequence of being a corrective response to a motor error.

Relevance: 20.00%

Abstract:

Single tract guanine residues can associate to form stable parallel quadruplex structures in the presence of certain cations. Nanosecond scale molecular dynamics simulations have been performed on a fully solvated fibre model of parallel d(G(7)) quadruplex structures with Na+ or K+ ions coordinated in the cavity formed by the O6 atoms of the guanine bases. The AMBER 4.1 force field and Particle Mesh Ewald technique for electrostatic interactions have been used in all simulations. These quadruplex structures are stable during the simulation, with the middle four base tetrads showing root mean square deviation values between 0.5 and 0.8 Å from the initial structure as well as the high resolution crystal structure. Even in the absence of any coordinated ion in the initial structure, the G-quadruplex structure remains intact throughout the simulation. During the 1.1 ns MD simulation, one Na+ counter ion from the solvent as well as several water molecules enter the central cavity to occupy the empty coordination sites within the parallel quadruplex and help stabilize the structure. The hydrogen bonding pattern depends on the nature of the coordinated ion, with the G-tetrad undergoing local structural variation to accommodate cations of different sizes. In the absence of any coordinated ion, due to strong mutual repulsion, O6 atoms within a G-tetrad are forced farther apart from each other, which leads to a considerably different hydrogen bonding scheme within the G-tetrads and very favourable interaction energy between the guanine bases constituting a G-tetrad. However, a coordinated ion between G-tetrads provides extra stacking energy for the G-tetrads and makes the quadruplex structure more rigid. Na+ ions, within the quadruplex cavity, are more mobile than coordinated K+ ions. A number of hydrogen bonded water molecules are observed within the grooves of all quadruplex structures.

Relevance: 20.00%

Abstract:

The physical design of a VLSI circuit involves circuit partitioning as a subtask. Typically, it is necessary to partition a large electrical circuit into several smaller circuits such that the total cross-wiring is minimized. This problem is a variant of the more general graph partitioning problem, for which no polynomial-time algorithm is known to obtain an optimal partition. The heuristic procedure proposed by Kernighan and Lin [1, 2] requires O(n² log₂ n) time to obtain a near-optimal two-way partition of a circuit with n modules. In the VLSI context, due to the large problem size involved, this computational requirement is unacceptably high. This paper is concerned with the hardware acceleration of the Kernighan-Lin procedure on an SIMD architecture. The proposed parallel partitioning algorithm requires O(n) processors, and has a time complexity of O(n log₂ n). In the proposed scheme, the reduced array architecture is employed with due consideration towards cost effectiveness and VLSI realizability of the architecture. The authors are not aware of any earlier attempts to parallelize a circuit partitioning algorithm in general or the Kernighan-Lin algorithm in particular. The use of the reduced array architecture is novel and opens up the possibilities of using this computing structure for several other applications in electronic design automation.
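The core step of the Kernighan-Lin procedure is choosing which pair of modules to swap across the cut. The sketch below shows that step sequentially on a four-module toy circuit, using the usual gain formula g(a, b) = D_a + D_b - 2c(a, b), where D_v is the external minus internal connection cost of module v. It is not the SIMD algorithm from the paper; the graph, edge weights and function names are illustrative assumptions.

```python
def cut_cost(weights, side):
    """Total weight of edges whose endpoints lie in different blocks."""
    return sum(w for (u, v), w in weights.items() if side[u] != side[v])


def d_value(v, weights, side):
    """External minus internal connection cost of module v."""
    d = 0
    for (a, b), w in weights.items():
        if v in (a, b):
            other = b if a == v else a
            d += w if side[other] != side[v] else -w
    return d


def best_swap(weights, side):
    """Return (gain, a, b) for the most profitable single swap across the cut."""
    def c(a, b):
        return weights.get((a, b), weights.get((b, a), 0))

    block_a = [v for v in side if side[v] == 0]
    block_b = [v for v in side if side[v] == 1]
    return max(
        (d_value(a, weights, side) + d_value(b, weights, side) - 2 * c(a, b), a, b)
        for a in block_a
        for b in block_b
    )


# Toy circuit: edge weights model the number of wires between modules.
weights = {(0, 2): 3, (1, 3): 3, (0, 1): 1, (2, 3): 1}
side = {0: 0, 1: 0, 2: 1, 3: 1}  # initial two-way partition

before = cut_cost(weights, side)
gain, a, b = best_swap(weights, side)
side[a], side[b] = side[b], side[a]
after = cut_cost(weights, side)
print(before, gain, after)  # 6 4 2
```

Evaluating every candidate pair is the quadratic part of each step. The gain of each pair can be computed independently of the others, which is what makes this step a natural target for parallel hardware.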

Relevance: 20.00%

Abstract:

One of the key problems in the design of any incompletely connected multiprocessor system is to appropriately assign the set of tasks in a program to the Processing Elements (PEs) in the system. The task assignment problem has proven difficult both in theory and in practice. This paper presents a simple and efficient heuristic algorithm for assigning program tasks with precedence and communication constraints to the PEs in a Message-based Multiple-bus Multiprocessor System, M3, so that the total execution time for the program is minimized. The algorithm uses a cost function: “Minimum Distance and Parallel Transfer” to minimize the completion time. The effectiveness of the algorithm has been demonstrated by comparing the results with (i) the lower bound on the execution time of a program (task) graph and (ii) a random assignment.