3 results for importance performance analysis

in Digital Commons - Michigan Tech


Relevance:

40.00%

Publisher:

Abstract:

An important problem in computational biology is finding the longest common subsequence (LCS) of two nucleotide sequences. This paper examines the correctness and performance of a recently proposed parallel LCS algorithm that uses successor tables and pruning rules to construct a list of sets from which an LCS can be easily reconstructed. Counterexamples are given for two of the pruning rules proposed with the original algorithm. Because of these errors, the performance measurements originally reported cannot be validated. The work presented here shows that speedup can be reliably achieved by an implementation in Unified Parallel C that runs on an InfiniBand cluster. This performance is partly facilitated by exploiting the software cache of the MuPC runtime system. In addition, this implementation achieved speedup without bulk memory copy operations and without the programming complexity associated with message passing.
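For readers unfamiliar with the LCS problem itself, the following is a minimal sequential sketch using the textbook dynamic-programming table. It is not the parallel successor-table algorithm examined in the paper, and the function name `lcs` is a hypothetical one chosen for illustration.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b.

    Textbook O(len(a) * len(b)) dynamic programming; NOT the parallel
    successor-table algorithm discussed in the abstract.
    """
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    print(lcs("AGGTAB", "GXTXAYB"))
```

The quadratic table is what makes long nucleotide sequences expensive to compare sequentially, which is the usual motivation for parallel approaches such as the one the paper evaluates.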

Relevance:

40.00%

Publisher:

Abstract:

This thesis develops high-performance real-time signal processing modules for direction of arrival (DOA) estimation in localization systems. It proposes highly parallel algorithms for subspace decomposition and polynomial rooting, which are traditionally implemented with sequential algorithms. The proposed algorithms address the emerging need for real-time localization across a wide range of applications. As the antenna array size increases, the complexity of the signal processing algorithms increases, making it increasingly difficult to satisfy real-time constraints. This thesis addresses real-time implementation by proposing parallel algorithms that maintain a considerable improvement over traditional algorithms, especially for systems with a larger number of antenna array elements. Singular value decomposition (SVD) and polynomial rooting are two computationally complex steps that act as the bottleneck to achieving real-time performance. The proposed algorithms are suitable for implementation on field-programmable gate arrays (FPGAs), single instruction multiple data (SIMD) hardware, or application-specific integrated circuits (ASICs), which offer a large number of processing elements that can be exploited for parallel processing. The designs proposed in this thesis are modular, easily expandable, and easy to implement. First, this thesis proposes a fast-converging SVD algorithm. The proposed method reduces the number of iterations needed to converge to the correct singular values, thus achieving closer to real-time performance. A general algorithm and a modular system design are provided, making it easy for designers to replicate and extend the design to larger matrix sizes. Moreover, the method is highly parallel, which can be exploited on the hardware platforms mentioned earlier. A fixed-point implementation of the proposed SVD algorithm is presented.
The FPGA design is pipelined to the maximum extent to increase the maximum achievable frequency of operation. The system was developed with the objective of achieving high throughput. Various modern cores available in FPGAs were used to maximize performance, and these modules are described in detail. Finally, a parallel polynomial rooting technique based on Newton’s method, applicable exclusively to root-MUSIC polynomials, is proposed. Unique characteristics of the complex dynamics of root-MUSIC polynomials were exploited to derive this method. The technique exhibits parallelism and converges to the desired root within a fixed number of iterations, making it suitable for rooting large-degree polynomials. We believe this is the first time that the complex dynamics of root-MUSIC polynomials have been analyzed to propose an algorithm. In all, the thesis addresses two major bottlenecks in a direction of arrival estimation system by providing simple, high-throughput, parallel algorithms.
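As background for the polynomial rooting step, the sketch below shows the generic Newton iteration for a complex polynomial with a fixed iteration count. It is an illustration under assumptions, not the thesis's root-MUSIC-specific technique: the helper names `eval_poly_and_derivative` and `newton_root`, the starting points, and the iteration count are all hypothetical. Each starting point is iterated independently, which is the property that makes this kind of method amenable to parallel hardware.

```python
def eval_poly_and_derivative(coeffs, z):
    """Evaluate a polynomial and its derivative at z with Horner's rule.

    coeffs lists the coefficients from the highest degree down.
    """
    p = 0j
    dp = 0j
    for c in coeffs:
        dp = dp * z + p  # derivative update uses the previous p
        p = p * z + c
    return p, dp


def newton_root(coeffs, z0, iterations=50):
    """Run a fixed number of generic Newton steps from the start point z0.

    Illustrative only: the iteration count is an assumption, and this is
    not the root-MUSIC-specific method described in the abstract.
    """
    z = complex(z0)
    for _ in range(iterations):
        p, dp = eval_poly_and_derivative(coeffs, z)
        if dp == 0:
            break  # Newton step undefined at a critical point
        z = z - p / dp
    return z


if __name__ == "__main__":
    # z^2 + 1 has roots +i and -i; a start in the upper half-plane finds +i.
    print(newton_root([1, 0, 1], 0.5 + 0.5j))
```

In general, which root a given starting point converges to depends on the polynomial's Newton basins, which is the kind of complex-dynamics question the thesis reports analyzing for root-MUSIC polynomials.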

Relevance:

40.00%

Publisher:

Abstract:

The Pierre Auger Cosmic Ray Observatory North site employs a large array of surface detector stations (tanks) to detect the secondary particle showers generated by ultra-high energy cosmic rays. Because ultra-high energy cosmic rays are rare, tank communications must be highly reliable so that no valuable data is lost. The Auger North site employs a peer-to-peer paradigm, the Wireless Architecture for Hard Real-Time Embedded Networks (WAHREN), designed specifically for highly reliable message delivery over fixed networks under hard real-time deadlines. The WAHREN design included two retransmission protocols, Micro- and Macro-retransmission. To fully understand how each retransmission protocol increased the reliability of communications, this analysis evaluated the system without either retransmission protocol (Case-0), with Micro- and Macro-retransmission individually (Micro and Macro), and with Micro- and Macro-retransmission combined. This thesis used a multimodal modeling methodology to prove that a performance and reliability analysis of WAHREN was possible, and it provides the results of that analysis. A multimodal approach was necessary because these processes were driven by different mathematical models. The results of this analysis can be used as a framework for making design decisions for the Auger North communication system.