3 resultados para parallel algorithm

em Massachusetts Institute of Technology


Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper addresses the problem of efficiently computing the motor torques required to drive a lower-pair kinematic chain (e.g., a typical manipulator arm in free motion, or a mechanical leg in the swing phase) given the desired trajectory; i.e., the Inverse Dynamics problem. It investigates the high degree of parallelism inherent in the computations, and presents two "mathematically exact" formulations especially suited to high-speed, highly parallel implementations using special-purpose hardware or VLSI devices. In principle, the formulations should permit the calculations to run at a speed bounded only by I/O. The first presented is a parallel version of the recent linear Newton-Euler recursive algorithm. The time cost is also linear in the number of joints, but the real-time coefficients are reduced by almost two orders of magnitude. The second formulation reports a new parallel algorithm which shows that it is possible to improve upon the linear time dependency. The real time required to perform the calculations increases only as the [log2] of the number of joints. Either formulation is susceptible to a systolic pipelined architecture in which complete sets of joint torques emerge at successive intervals of four floating-point operations. Hardware requirements necessary to support the algorithm are considered and found not to be excessive, and a VLSI implementation architecture is suggested. We indicate possible applications to incorporating dynamical considerations into trajectory planning, e.g. it may be possible to build an on-line trajectory optimizer.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We develop an algorithm that computes the gravitational potentials and forces on N point-masses interacting in three-dimensional space. The algorithm, based on analytical techniques developed by Rokhlin and Greengard, runs in order N time. In contrast to other fast N-body methods such as tree codes, which only approximate the interaction potentials and forces, this method is exact ?? computes the potentials and forces to within any prespecified tolerance up to machine precision. We present an implementation of the algorithm for a sequential machine. We numerically verify the algorithm, and compare its speed with that of an O(N2) direct force computation. We also describe a parallel version of the algorithm that runs on the Connection Machine in order 0(logN) time. We compare experimental results with those of the sequential implementation and discuss how to minimize communication overhead on the parallel machine.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan’s functional inverse of Ackermann’s function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T₁ time on a single processor can be checked on the fly for determinacy races in O(T₁) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks. By combining SP-order with Feng and Leiserson’s serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T₁ work, and a critical-path length of T[subscript ∞]. When executed on P processors, we prove that SP-hybrid runs in O((T₁/P + PT[subscript ∞]) lg n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P = O(T₁/T[subscript ∞]). In contrast, SP-hybrid obtains linear speed-up when P = O(√T₁/T[subscript ∞]), but the work is increased by a factor of O(lg n).