190 resultados para PARALLEL COMPUTING
Resumo:
A global recursive bisection algorithm is described for computing the complex zeros of a polynomial. It has complexityO(n 3 p) wheren is the degree of the polynomial andp the bit precision requirement. Ifn processors are available, it can be realized in parallel with complexityO(n 2 p); also it can be implemented using exact arithmetic. A combined Wilf-Hansen algorithm is suggested for reduction in complexity.
Resumo:
An axis-parallel k-dimensional box is a Cartesian product R-1 x R-2 x...x R-k where R-i (for 1 <= i <= k) is a closed interval of the form [a(i), b(i)] on the real line. For a graph G, its boxicity box(G) is the minimum dimension k, such that G is representable as the intersection graph of (axis-parallel) boxes in k-dimensional space. The concept of boxicity finds applications in various areas such as ecology, operations research etc. A number of NP-hard problems are either polynomial time solvable or have much better approximation ratio on low boxicity graphs. For example, the max-clique problem is polynomial time solvable on bounded boxicity graphs and the maximum independent set problem for boxicity d graphs, given a box representation, has a left perpendicular1 + 1/c log n right perpendicular(d-1) approximation ratio for any constant c >= 1 when d >= 2. In most cases, the first step usually is computing a low dimensional box representation of the given graph. Deciding whether the boxicity of a graph is at most 2 itself is NP-hard. We give an efficient randomized algorithm to construct a box representation of any graph G on n vertices in left perpendicular(Delta + 2) ln nright perpendicular dimensions, where Delta is the maximum degree of G. This algorithm implies that box(G) <= left perpendicular(Delta + 2) ln nright perpendicular for any graph G. Our bound is tight up to a factor of ln n. We also show that our randomized algorithm can be derandomized to get a polynomial time deterministic algorithm. Though our general upper bound is in terms of maximum degree Delta, we show that for almost all graphs on n vertices, their boxicity is O(d(av) ln n) where d(av) is the average degree.
Resumo:
The move towards IT outsourcing is the first step towards an environment where compute infrastructure is treated as a service. In utility computing this IT service has to honor Service Level Agreements (SLA) in order to meet the desired Quality of Service (QoS) guarantees. Such an environment requires reliable services in order to maximize the utilization of the resources and to decrease the Total Cost of Ownership (TCO). Such reliability cannot come at the cost of resource duplication, since it increases the TCO of the data center and hence the cost per compute unit. We, in this paper, look into aspects of projecting impact of hardware failures on the SLAs and techniques required to take proactive recovery steps in case of a predicted failure. By maintaining health vectors of all hardware and system resources, we predict the failure probability of resources based on observed hardware errors/failure events, at runtime. This inturn influences an availability aware middleware to take proactive action (even before the application is affected in case the system and the application have low recoverability). The proposed framework has been prototyped on a system running HP-UX. Our offline analysis of the prediction system on hardware error logs indicate no more than 10% false positives. This work to the best of our knowledge is the first of its kind to perform an end-to-end analysis of the impact of a hardware fault on application SLAs, in a live system.
Resumo:
The physical design of a VLSI circuit involves circuit partitioning as a subtask. Typically, it is necessary to partition a large electrical circuit into several smaller circuits such that the total cross-wiring is minimized. This problem is a variant of the more general graph partitioning problem, and it is known that there does not exist a polynomial time algorithm to obtain an optimal partition. The heuristic procedure proposed by Kernighan and Lin1,2 requires O(n2 log2n) time to obtain a near-optimal two-way partition of a circuit with n modules. In the VLSI context, due to the large problem size involved, this computational requirement is unacceptably high. This paper is concerned with the hardware acceleration of the Kernighan-Lin procedure on an SIMD architecture. The proposed parallel partitioning algorithm requires O(n) processors, and has a time complexity of O(n log2n). In the proposed scheme, the reduced array architecture is employed with due considerations towards cost effectiveness and VLSI realizability of the architecture.The authors are not aware of any earlier attempts to parallelize a circuit partitioning algorithm in general or the Kernighan-Lin algorithm in particular. The use of the reduced array architecture is novel and opens up the possibilities of using this computing structure for several other applications in electronic design automation.
Resumo:
In this paper we give a generalized predictor-corrector algorithm for solving ordinary differential equations with specified initial values. The method uses multiple correction steps which can be carried out in parallel with a prediction step. The proposed method gives a larger stability interval compared to the existing parallel predictor-corrector methods. A method has been suggested to implement the algorithm in multiple processor systems with efficient utilization of all the processors.
Resumo:
Massively parallel SIMD computing is applied to obtain an order of magnitude improvement in the executional speed of an important algorithm in VLSI design automation. The physical design of a VLSI circuit involves logic module placement as a subtask. The paper is concerned with accelerating the well known Min-cut placement technique for logic cell placement. The inherent parallelism of the Min-cut algorithm is identified, and it is shown that a parallel machine based on the efficient execution of the placement procedure.
Resumo:
This paper discusses the parallel implementation of the solution of a set of linear equations using the Alternative Quadrant Interlocking Factorisation Methods (AQIF), on a star topology. Both the AQIF and LU decomposition methods are mapped onto star topology on an IBM SP2 system, with MPI as the internode communicator. Performance parameters such as speedup, efficiency have been obtained through experimental and theoretical means. The studies demonstrate (i) a mismatch of 15% between the theoretical and experimental results, (ii) scalability of the AQIF algorithm, and (iii) faster executing AQIF algorithm.
Resumo:
As computational Grids are increasingly used for executing long running multi-phase parallel applications, it is important to develop efficient rescheduling frameworks that adapt application execution in response to resource and application dynamics. In this paper, three strategies or algorithms have been developed for deciding when and where to reschedule parallel applications that execute on multi-cluster Grids. The algorithms derive rescheduling plans that consist of potential points in application execution for rescheduling and schedules of resources for application execution between two consecutive rescheduling points. Using large number of simulations, it is shown that the rescheduling plans developed by the algorithms can lead to large decrease in application execution times when compared to executions without rescheduling on dynamic Grid resources. The rescheduling plans generated by the algorithms are also shown to be competitive when compared to the near-optimal plans generated by brute-force methods. Of the algorithms, genetic algorithm yielded the most efficient rescheduling plans with 9-12% smaller average execution times than the other algorithms.
Resumo:
In this paper, we present an algebraic method to study and design spatial parallel manipulators that demonstrate isotropy in the force and moment distributions.We use the force and moment transformation matrices separately,and derive conditions for their isotropy individually as well as in combination. The isotropy conditions are derived in closed-form in terms of the invariants of the quadratic forms associated with these matrices. The formulation has been applied to a class of Stewart platform manipulators. We obtain multi-parameter families of isotropic manipulator analytically. In addition to computing the isotropic configurations of an existing manipulator,we demonstrate a procedure for designing the manipulator for isotropy at a given configuration.
Resumo:
The Morse-Smale complex is a useful topological data structure for the analysis and visualization of scalar data. This paper describes an algorithm that processes all mesh elements of the domain in parallel to compute the Morse-Smale complex of large two-dimensional data sets at interactive speeds. We employ a reformulation of the Morse-Smale complex using Forman's Discrete Morse Theory and achieve scalability by computing the discrete gradient using local accesses only. We also introduce a novel approach to merge gradient paths that ensures accurate geometry of the computed complex. We demonstrate that our algorithm performs well on both multicore environments and on massively parallel architectures such as the GPU.
Resumo:
Information forms the basis of modern technology. To meet the ever-increasing demand for information, means have to be devised for a more efficient and better-equipped technology to intelligibly process data. Advances in photonics have made their impact on each of the four key applications in information processing, i.e., acquisition, transmission, storage and processing of information. The inherent advantages of ultrahigh bandwidth, high speed and low-loss transmission has already established fiber-optics as the backbone of communication technology. However, the optics to electronics inter-conversion at the transmitter and receiver ends severely limits both the speed and bit rate of lightwave communication systems. As the trend towards still faster and higher capacity systems continues, it has become increasingly necessary to perform more and more signal-processing operations in the optical domain itself, i.e., with all-optical components and devices that possess a high bandwidth and can perform parallel processing functions to eliminate the electronic bottleneck.
Resumo:
The Morse-Smale complex is a topological structure that captures the behavior of the gradient of a scalar function on a manifold. This paper discusses scalable techniques to compute the Morse-Smale complex of scalar functions defined on large three-dimensional structured grids. Computing the Morse-Smale complex of three-dimensional domains is challenging as compared to two-dimensional domains because of the non-trivial structure introduced by the two types of saddle criticalities. We present a parallel shared-memory algorithm to compute the Morse-Smale complex based on Forman's discrete Morse theory. The algorithm achieves scalability via synergistic use of the CPU and the GPU. We first prove that the discrete gradient on the domain can be computed independently for each cell and hence can be implemented on the GPU. Second, we describe a two-step graph traversal algorithm to compute the 1-saddle-2-saddle connections efficiently and in parallel on the CPU. Simultaneously, the extremasaddle connections are computed using a tree traversal algorithm on the GPU.
Resumo:
In this paper, we present a decentralized dynamic load scheduling/balancing algorithm called ELISA (Estimated Load Information Scheduling Algorithm) for general purpose distributed computing systems. ELISA uses estimated state information based upon periodic exchange of exact state information between neighbouring nodes to perform load scheduling. The primary objective of the algorithm is to cut down on the communication and load transfer overheads by minimizing the frequency of status exchange and by restricting the load transfer and status exchange within the buddy set of a processor. It is shown that the resulting algorithm performs almost as well as a perfect information algorithm and is superior to other load balancing schemes based on the random sharing and Ni-Hwang algorithms. A sensitivity analysis to study the effect of various design parameters on the effectiveness of load balancing is also carried out. Finally, the algorithm's performance is tested on large dimensional hypercubes in the presence of time-varying load arrival process and is shown to perform well in comparison to other algorithms. This makes ELISA a viable and implementable load balancing algorithm for use in general purpose distributed computing systems.
Resumo:
The unsteady incompressible viscous fluid flow between two parallel infinite disks which are located at a distance h(t*) at time t* has been studied. The upper disk moves towards the lower disk with velocity h'(t*). The lower disk is porous and rotates with angular velocity Omega(t*). A magnetic field B(t*) is applied perpendicular to the two disks. It has been found that the governing Navier-Stokes equations reduce to a set of ordinary differential equations if h(t*), a(t*) and B(t*) vary with time t* in a particular manner, i.e. h(t*) = H(1 - alpha t*)(1/2), Omega(t*) = Omega(0)(1 - alpha t*)(-1), B(t*) = B-0(1 - alpha t*)(-1/2). These ordinary differential equations have been solved numerically using a shooting method. For small Reynolds numbers, analytical solutions have been obtained using a regular perturbation technique. The effects of squeeze Reynolds numbers, Hartmann number and rotation of the disk on the flow pattern, normal force or load and torque have been studied in detail
Diffraction Of Elastic Waves By Two Parallel Rigid Strips Embedded In An Infinite Orthotropic Medium
Resumo:
The elastodynamic response of a pair of parallel rigid strips embedded in an infinite orthotropic medium due to elastic waves incident normally on the strips has been investigated. The mixed boundary value problem has been solved by the Integral Equation method. The normal stress and the vertical displacement have been derived in closed form. Numerical values of stress intensity factors at inner and outer edges of the strips and vertical displacement at points in the plane of the strips for several orthotropic materials have been calculated and plotted graphically to show the effect of material orthotropy.