80 resultados para Array Signal Processing
Resumo:
The most promising way to maintain reliable data transfer across the rapidly fluctuating channels used by next generation multiple-input multiple-output communications schemes is to exploit run-time variable modulation and antenna configurations. This demands that the baseband signal processing architectures employed in the communications terminals must provide low cost and high performance with runtime reconfigurability. We present a softcore-processor based solution to this issue, and show for the first time, that such programmable architectures can enable real-time data operation for cutting-edge standards
such as 802.11n; furthermore, by exploiting deep processing pipelines and interleaved task execution, the cost and performance of these architectures is shown to be on a par with traditional dedicated circuit based solutions. We believe this to be the first such programmable architecture to achieve this, and the combination of implementation efficiency and programmability makes this implementation style the most promising approach for hosting such dynamic architectures.
Resumo:
A bit level systolic array for computing the convolution operation is described. The circuit in question is highly regular and ideally suited to VLSI chip design. It is also optimized in the sense that all the cells contribute to the computation on each clock cycle. This makes the array almost four times more efficient than one which was previously described.
Resumo:
Two major UK systolic array projects are described. The first concerns the development of a wavefront array processor for adaptive beamforming; the second concerns the design of bit-level systolic arrays for high-performance signal processing.
Resumo:
A bit level systolic array system is proposed for the Winograd Fourier transform algorithm. The design uses bit-serial arithmetic and, in common with other systolic arrays, features nearest-neighbor interconnections, regularity and high throughput. The short interconnections in this method contrast favorably with the long interconnections between butterflies required in the FFT. The structure is well suited to VLSI implementations. It is demonstrated how long transforms can be implemented with components designed to perform a short length transform. These components build into longer transforms preserving the regularity and structure of the short length transform design.
Resumo:
A scheduling method for implementing a generic linear QR array processor architecture is presented. This improves on previous work. It also considerably simplifies the derivation of schedules for a folded linear system, where detailed account has to be taken of processor cell latency. The architecture and scheduling derived provide the basis of a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition.
Resumo:
The paper presents IPPro which is a high performance, scalable soft-core processor targeted for image processing applications. It has been based on the Xilinx DSP48E1 architecture using the ZYNQ Field Programmable Gate Array and is a scalar 16-bit RISC processor that operates at 526MHz, giving 526MIPS of performance. Each IPPro core uses 1 DSP48, 1 Block RAM and 330 Kintex-7 slice-registers, thus making the processor as compact as possible whilst maintaining flexibility and programmability. A key aspect of the approach is in reducing the application design time and implementation effort by using multiple IPPro processors in a SIMD mode. For different applications, this allows us to exploit different levels of parallelism and mapping for the specified processing architecture with the supported instruction set. In this context, a Traffic Sign Recognition (TSR) algorithm has been prototyped on a Zedboard with the colour and morphology operations accelerated using multiple IPPros. Simulation and experimental results demonstrate that the processing platform is able to achieve a speedup of 15 to 33 times for colour filtering and morphology operations respectively, with a reduced design effort and time.
Resumo:
The Field Programmable Gate Array (FPGA) implementation of the commonly used Histogram of Oriented Gradients (HOG) algorithm is explored. The HOG algorithm is employed to extract features for object detection. A key focus has been to explore the use of a new FPGA-based processor which has been targeted at image processing. The paper gives details of the mapping and scheduling factors that influence the performance and the stages that were undertaken to allow the algorithm to be deployed on FPGA hardware, whilst taking into account the specific IPPro architecture features. We show that multi-core IPPro performance can exceed that of against state-of-the-art FPGA designs by up to 3.2 times with reduced design and implementation effort and increased flexibility all on a low cost, Zynq programmable system.
Resumo:
The increasing design complexity associated with modern Field Programmable Gate Array (FPGA) has prompted the emergence of 'soft'-programmable processors which attempt to replace at least part of the custom circuit design problem with a problem of programming parallel processors. Despite substantial advances in this technology, its performance and resource efficiency for computationally complex operations remains in doubt. In this paper we present the first recorded implementation of a softcore Fast-Fourier Transform (FFT) on Xilinx Virtex FPGA technology. By employing a streaming processing architecture, we show how it is possible to achieve architectures which offer 1.1 GSamples/s throughput and up to 19 times speed-up against the Xilinx Radix-2 FFT dedicated circuit with comparable cost.
Resumo:
By modification of the classical retrodirective arrays (RDAs) architecture a directional modulation (DM) transmitter can be realized without the need for synthesis. Importantly, through analytical analysis and exemplar simulations, it is proved that, besides the conventional DM application scenario, i.e., secure transmission to one legitimate receiver located along one spatial direction in free space, the proposed synthesis-free DM transmitter should also perform well for systems where there are more than one legitimate receivers positioned along different directions in free space, and where one or more legitimate receivers exist in a multipath environment. None of these have ever been achieved before using synthesis-free DM arrangements.
Resumo:
With security and surveillance, there is an increasing need to process image data efficiently and effectively either at source or in a large data network. Whilst a Field-Programmable Gate Array (FPGA) has been seen as a key technology for enabling this, the design process has been viewed as problematic in terms of the time and effort needed for implementation and verification. The work here proposes a different approach of using optimized FPGA-based soft-core processors which allows the user to exploit the task and data level parallelism to achieve the quality of dedicated FPGA implementations whilst reducing design time. The paper also reports some preliminary
progress on the design flow to program the structure. An implementation for a Histogram of Gradients algorithm is also reported which shows that a performance of 328 fps can be achieved with this design approach, whilst avoiding the long design time, verification and debugging steps associated with conventional FPGA implementations.
Resumo:
The technical challenges in the design and programming of signal processors for multimedia communication are discussed. The development of terminal equipment to meet such demand presents a significant technical challenge, considering that it is highly desirable that the equipment be cost effective, power efficient, versatile, and extensible for future upgrades. The main challenges in the design and programming of signal processors for multimedia communication are, general-purpose signal processor design, application-specific signal processor design, operating systems and programming support and application programming. The size of FFT is programmable so that it can be used for various OFDM-based communication systems, such as digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T) and digital video broadcasting-handheld (DVB-H). The clustered architecture design and distributed ping-pong register files in the PAC DSP raise new challenges of code generation.
Resumo:
This paper presents a matrix inversion architecture based on the novel Modified Squared Givens Rotations (MSGR) algorithm, which extends the original SGR method to complex valued data, and also corrects erroneous results in the original SGR method when zeros occur on the diagonal of the matrix either initially or during processing. The MSGR algorithm also avoids complex dividers in the matrix inversion, thus minimising the complexity of potential real-time implementations. A systolic array architecture is implemented and FPGA synthesis results indicate a high-throughput low-latency complex matrix inversion solution. © 2008 IEEE.
Resumo:
This paper presents the design of a novel single chip adaptive beamformer capable of performing 50 Gflops, (Giga-floating-point operations/second). The core processor is a QR array implemented on a fully efficient linear systolic architecture, derived using a mapping that allows individual processors for boundary and internal cell operations. In addition, the paper highlights a number of rapid design techniques that have been used to realise this system. These include an architecture synthesis tool for quickly developing the circuit architecture and the utilisation of a library of parameterisable silicon intellectual property (IP) cores, to rapidly develop detailed silicon designs.