979 resultados para 291600 Computer Hardware
Resumo:
A BSP (Bulk Synchronous Parallelism) computation is characterized by the generation of asynchronous messages in packages during independent execution of a number of processes and their subsequent delivery at synchronization points. Bundling messages together represents a significant departure from the traditional ‘one communication at a time’ approach. In this paper the semantic consequences of communication packaging are explored. In particular, the BSP communication structure is identified with a general form of substitution—predicate substitution. Predicate substitution provides a means of reasoning about the synchronized delivery of asynchronous communications when the immediate programming context does not explicitly refer to the variables that are to be updated (unlike traditional operations, such as the assignment $x := e$, where the names of the updated variables can be extracted from the context). Proofs of implementations of Newton's root finding method and prefix sum are used to illustrate the practical application of the proposed approach.
Resumo:
A novel application-specific instruction set processor (ASIP) for use in the construction of modern signal processing systems is presented. This is a flexible device that can be used in the construction of array processor systems for the real-time implementation of functions such as singular-value decomposition (SVD) and QR decomposition (QRD), as well as other important matrix computations. It uses a coordinate rotation digital computer (CORDIC) module to perform arithmetic operations and several approaches are adopted to achieve high performance including pipelining of the micro-rotations, the use of parallel instructions and a dual-bus architecture. In addition, a novel method for scale factor correction is presented which only needs to be applied once at the end of the computation. This also reduces computation time and enhances performance. Methods are described which allow this processor to be used in reduced dimension (i.e., folded) array processor structures that allow tradeoffs between hardware and performance. The net result is a flexible matrix computational processing element (PE) whose functionality can be changed under program control for use in a wider range of scenarios than previous work. Details are presented of the results of a design study, which considers the application of this decomposition PE architecture in a combined SVD/QRD system and demonstrates that a combination of high performance and efficient silicon implementation are achievable. © 2005 IEEE.
Resumo:
This paper, chosen as a best paper from the 2005 SAMOS Workshop on Computer Systems: describes the for the first time the major Abhainn project for automated system level design of embedded signal processing systems. In particular, this describes four key novelties: novel algorithm modelling techniques for DSP systems, automated implementation realisation, algorithm transformation for system optimisation and automated inter-processor communication. This is applied to two complex systems: a radar and sonar system. In both cases technology which allows non-experts to automatically create low-overhead, high performance embedded signal processing systems is exhibited.
Resumo:
The authors are concerned with the development of computer systems that are capable of using information from faces and voices to recognise people's emotions in real-life situations. The paper addresses the nature of the challenges that lie ahead, and provides an assessment of the progress that has been made in the areas of signal processing and analysis techniques (with regard to speech and face), and the psychological and linguistic analyses of emotion. Ongoing developmental work by the authors in each of these areas is described.
Resumo:
A hardware performance analysis of the SHACAL-2 encryption algorithm is presented in this paper. SHACAL-2 was one of four symmetric key algorithms chosen in the New European Schemes for Signatures, Integrity and Encryption (NESSIE) initiative in 2003. The paper describes a fully pipelined encryption SHACAL-2 architecture implemented on a Xilinx Field Programmable Gate Array (FPGA) device that achieves a throughput of over 25 Gbps. This is the fastest private key encryption algorithm architecture currently available. The SHACAL-2 decryption algorithm is also defined in the paper as it was not provided in the NESSIE submission.
Resumo:
This paper proposes a novel hybrid forward algorithm (HFA) for the construction of radial basis function (RBF) neural networks with tunable nodes. The main objective is to efficiently and effectively produce a parsimonious RBF neural network that generalizes well. In this study, it is achieved through simultaneous network structure determination and parameter optimization on the continuous parameter space. This is a mixed integer hard problem and the proposed HFA tackles this problem using an integrated analytic framework, leading to significantly improved network performance and reduced memory usage for the network construction. The computational complexity analysis confirms the efficiency of the proposed algorithm, and the simulation results demonstrate its effectiveness