23 resultados para SoC
em QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Resumo:
A generic architecture for implementing a QR array processor in silicon is presented. This improves on previous research by considerably simplifying the derivation of timing schedules for a QR system implemented as a folded linear array, where account has to be taken of processor cell latency and timing at the detailed circuit level. The architecture and scheduling derived have been used to create a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition. This is demonstrated through the design of a single-chip architecture for implementing an adaptive beamformer for radar applications. Published as IEEE Trans Circuits and Systems Part II, Analog and Digital Signal Processing, April 2003 NOT Express Briefs. Parts 1 and II of Journal reorganised since then into Regular Papers and Express briefs
Resumo:
A novel application-specific instruction set processor (ASIP) for use in the construction of modern signal processing systems is presented. This is a flexible device that can be used in the construction of array processor systems for the real-time implementation of functions such as singular-value decomposition (SVD) and QR decomposition (QRD), as well as other important matrix computations. It uses a coordinate rotation digital computer (CORDIC) module to perform arithmetic operations and several approaches are adopted to achieve high performance including pipelining of the micro-rotations, the use of parallel instructions and a dual-bus architecture. In addition, a novel method for scale factor correction is presented which only needs to be applied once at the end of the computation. This also reduces computation time and enhances performance. Methods are described which allow this processor to be used in reduced dimension (i.e., folded) array processor structures that allow tradeoffs between hardware and performance. The net result is a flexible matrix computational processing element (PE) whose functionality can be changed under program control for use in a wider range of scenarios than previous work. Details are presented of the results of a design study, which considers the application of this decomposition PE architecture in a combined SVD/QRD system and demonstrates that a combination of high performance and efficient silicon implementation are achievable. © 2005 IEEE.
Resumo:
Hardware synthesis from dataflow graphs of signal processing systems is a growing research area as focus shifts to high level design methodologies. For data intensive systems, dataflow based synthesis can lead to an inefficient usage of memory due to the restrictive nature of synchronous dataflow and its inability to easily model data reuse. This paper explores how dataflow graph changes can be used to drive both the on-chip and off-chip memory organisation and how these memory architectures can be mapped to a hardware implementation. By exploiting the data reuse inherent to many image processing algorithms and by creating memory hierarchies, off-chip memory bandwidth can be reduced by a factor of a thousand from the original dataflow graph level specification of a motion estimation algorithm, with a minimal increase in memory size. This analysis is verified using results gathered from implementation of the motion estimation algorithm on a Xilinx Virtex-4 FPGA, where the delay between the memories and processing elements drops from 14.2 ns down to 1.878 ns through the refinement of the memory architecture. Care must be taken when modeling these algorithms however, as inefficiencies in these models can be easily translated into overuse of hardware resources.
Resumo:
A scheduling method for implementing a generic linear QR array processor architecture is presented. This improves on previous work. It also considerably simplifies the derivation of schedules for a folded linear system, where detailed account has to be taken of processor cell latency. The architecture and scheduling derived provide the basis of a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition.