9 results for Multiprocessor
in CentAUR: Central Archive University of Reading - UK
Abstract:
Comparison-based diagnosis is an effective approach to system-level fault diagnosis. Under the Maeng-Malek comparison model (MM* model), Sengupta and Dahbura proposed an $O(N^5)$ diagnosis algorithm for general diagnosable systems with N nodes. Thanks to its lower diameter and better graph embedding capability compared with a hypercube of the same size, the crossed cube has been a promising candidate for interconnection networks. In this paper, we propose a fault diagnosis algorithm tailored for crossed-cube-connected multicomputer systems under the MM* model. By introducing appropriate data structures, this algorithm runs in $O(N \log_2^2 N)$ time, which is linear in the size of the input. As a result, this algorithm is significantly superior to the Sengupta-Dahbura algorithm when applied to crossed cube systems. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
Inferring population admixture from genetic data and quantifying it is a difficult but crucial task in evolutionary and conservation biology. Unfortunately, state-of-the-art probabilistic approaches are computationally demanding. Effectively exploiting the computational power of modern multiprocessor systems can thus have a positive impact on Monte Carlo-based simulation of admixture modeling. A novel parallel approach is briefly described, and promising results from its message passing interface (MPI)-based C implementation are reported.
Abstract:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Abstract:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Abstract:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Abstract:
An n-dimensional Möbius cube, 0MQ(n) or 1MQ(n), is a variation of the n-dimensional cube Q(n) which possesses many attractive properties such as significantly smaller communication delay and stronger graph-embedding capabilities. In some practical situations, the fault tolerance of a distributed memory multiprocessor system can be measured more precisely by the connectivity of the underlying graph under forbidden fault set models. This article addresses the connectivity of 0MQ(n)/1MQ(n) under two typical forbidden fault set models. We first prove that the connectivity of 0MQ(n)/1MQ(n) is 2n - 2 when the fault set does not contain the neighborhood of any vertex as a subset. We then prove that the connectivity of 0MQ(n)/1MQ(n) is 3n - 5 provided that the neighborhood of any vertex as well as that of any edge cannot fail simultaneously. These results demonstrate that 0MQ(n)/1MQ(n) has the same connectivity as Q(n) under either of the previous assumptions.
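For readers unfamiliar with the topology, the following is a minimal sketch of how a Möbius cube can be constructed, assuming the commonly cited definition in which vertex x is joined along dimension i to the vertex obtained by flipping bit i alone when bit i+1 of x is 0, and by flipping bits i down to 0 when bit i+1 is 1; for the top dimension the undefined bit is taken to be 0 for 0MQ(n) and 1 for 1MQ(n). The function names and the `variant` parameter are illustrative and do not come from the abstract, and the sketch only builds the graph; it does not reproduce the connectivity proofs.

```python
def mobius_neighbors(x, n, variant):
    """Neighbors of vertex x (an n-bit integer) in variant-MQ(n), variant in {0, 1}.

    Assumed definition: along dimension i, flip bit i only if bit i+1 of x is 0,
    otherwise flip bits i..0. For i = n-1 the missing bit x_n is taken to be `variant`.
    """
    result = []
    for i in range(n):
        upper = variant if i == n - 1 else (x >> (i + 1)) & 1
        if upper == 0:
            result.append(x ^ (1 << i))              # flip bit i only
        else:
            result.append(x ^ ((1 << (i + 1)) - 1))  # flip bits i down to 0
    return result


def build_mobius_cube(n, variant):
    """Adjacency lists of variant-MQ(n) over vertices 0 .. 2**n - 1."""
    return {x: mobius_neighbors(x, n, variant) for x in range(1 << n)}


if __name__ == "__main__":
    graph = build_mobius_cube(4, 0)
    # Like Q(n), the graph built here is n-regular: every vertex has n distinct neighbors.
    print(all(len(set(adj)) == 4 for adj in graph.values()))
```

Because neither rule changes bit i+1, applying the same dimension twice returns to the starting vertex, so the adjacency built this way is symmetric.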
Abstract:
We consider the linear equality-constrained least squares problem (LSE) of minimizing $\|c - Gx\|_2$, subject to the constraint $Ex = p$. A preconditioned conjugate gradient method is applied to the Kuhn–Tucker equations associated with the LSE problem. We show that our method is well suited for structural optimization problems in reliability analysis and optimal design. Numerical tests are performed on an Alliant FX/8 multiprocessor and a Cray X-MP using some practical structural analysis data.
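As a reminder of what the Kuhn–Tucker equations look like here, one standard form is sketched below. It assumes the objective is written as $\tfrac{1}{2}\|c - Gx\|_2^2$ and introduces a Lagrange multiplier vector $\lambda$ for the constraint $Ex = p$; the symbol $\lambda$ and this particular arrangement are assumptions, and the paper may work with an equivalent augmented system instead.

$$
\begin{bmatrix} G^{T}G & E^{T} \\ E & 0 \end{bmatrix}
\begin{bmatrix} x \\ \lambda \end{bmatrix}
=
\begin{bmatrix} G^{T}c \\ p \end{bmatrix}
$$

The first block row is the stationarity condition $G^{T}Gx + E^{T}\lambda = G^{T}c$, and the second is the constraint itself. The system is symmetric but indefinite, which is one reason preconditioning matters when an iterative method is applied to it.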
Abstract:
Hybrid multiprocessor architectures which combine re-configurable computing and multiprocessors on a chip are being proposed to transcend the performance of standard multi-core parallel systems. Both fine-grained and coarse-grained parallel algorithm implementations are feasible in such hybrid frameworks. A compositional strategy for designing fine-grained multi-phase regular processor arrays to target hybrid architectures is presented in this paper. The method is based on deriving component designs using classical regular array techniques and composing the components into a unified global design. Effective designs with phase-changes and data routing at run-time are characteristics of these designs. In order to describe the data transfer between phases, the concept of communication domain is introduced so that the producer–consumer relationship arising from multi-phase computation can be treated in a unified way as a data routing phase. This technique is applied to derive new designs of multi-phase regular arrays with different dataflow between phases of computation.
Abstract:
A parallel formulation for the simulation of a branch prediction algorithm is presented. This parallel formulation identifies independent tasks in the algorithm which can be executed concurrently. The parallel implementation is based on the multithreading model and two parallel programming platforms: pthreads and Cilk++. Improvement in execution performance by up to 7 times is observed for a generic 2-bit predictor in a 12-core multiprocessor system.
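To make the "generic 2-bit predictor" concrete, here is a minimal sequential sketch of a table of 2-bit saturating counters driven by a branch trace. The table size, the initial counter value, the indexing by low address bits, and all names are assumptions chosen for illustration; the abstract does not specify them, and the paper's parallel decomposition with pthreads and Cilk++ is not reproduced here.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters (0..3); values 2 and 3 predict 'taken'.

    Table size and initial state are illustrative assumptions.
    """

    def __init__(self, table_bits=10, initial=1):
        self.mask = (1 << table_bits) - 1
        self.table = [initial] * (1 << table_bits)

    def predict(self, pc):
        # Index the table with the low bits of the branch address.
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0


def simulate(trace, predictor=None):
    """Return the number of correct predictions over (pc, taken) pairs."""
    predictor = predictor or TwoBitPredictor()
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct


if __name__ == "__main__":
    # A loop branch taken nine times, then not taken once on exit.
    trace = [(0x40, True)] * 9 + [(0x40, False)]
    print(simulate(trace), "of", len(trace), "predicted correctly")
```

With these assumed settings the predictor mispredicts only the first iteration and the loop exit, which is the behavior that motivates using two bits of state rather than one.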