52 results for Processing Element Array

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

100.00%

Abstract:

A novel application-specific instruction set processor (ASIP) for use in the construction of modern signal processing systems is presented. This is a flexible device that can be used in the construction of array processor systems for the real-time implementation of functions such as singular-value decomposition (SVD) and QR decomposition (QRD), as well as other important matrix computations. It uses a coordinate rotation digital computer (CORDIC) module to perform arithmetic operations, and several approaches are adopted to achieve high performance, including pipelining of the micro-rotations, the use of parallel instructions and a dual-bus architecture. In addition, a novel method for scale factor correction is presented which only needs to be applied once at the end of the computation. This also reduces computation time and enhances performance. Methods are described which allow this processor to be used in reduced dimension (i.e., folded) array processor structures that allow tradeoffs between hardware and performance. The net result is a flexible matrix computational processing element (PE) whose functionality can be changed under program control for use in a wider range of scenarios than previous work. Details are presented of the results of a design study, which considers the application of this decomposition PE architecture in a combined SVD/QRD system and demonstrates that a combination of high performance and efficient silicon implementation is achievable. © 2005 IEEE.
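As a rough illustration of the CORDIC idea mentioned in this abstract, the following is a minimal textbook sketch of rotation-mode CORDIC in Python. It is not the processor's actual datapath or the authors' scale factor correction scheme: the function name `cordic_rotate`, the floating-point arithmetic and the iteration count are all assumptions made for illustration. It only shows that the accumulated gain of the micro-rotations can be divided out once, after the final iteration, rather than after every step.

```python
import math

def cordic_rotate(x, y, angle, iterations=24):
    """Rotate the vector (x, y) by `angle` radians using CORDIC micro-rotations.

    Hypothetical helper for illustration; valid for |angle| up to about 1.74 rad,
    which is the convergence range of the basic algorithm.
    """
    z = angle      # residual angle still to be rotated
    gain = 1.0     # accumulated CORDIC gain of all micro-rotations
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotation direction for this step
        t = 2.0 ** -i                    # tan of the micro-rotation angle (a shift in hardware)
        x, y = x - d * y * t, y + d * x * t
        z -= d * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)
    # Scale factor correction applied a single time, after all micro-rotations.
    return x / gain, y / gain
```

For example, `cordic_rotate(1.0, 0.0, math.pi / 6)` approximates `(cos 30°, sin 30°)`. In fixed-point hardware the gain would normally be a precomputed constant; it is accumulated in the loop here only to keep the sketch self-explanatory.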

Relevance:

90.00%

Abstract:

A method for producing a retrodirective (self-tracking) antenna, which can also be operated as a phased (selectively pointed) array through the addition of a simple switching circuit and DC bias offset adjustment, is presented. Phase adjustment to individual antenna elements is shown to be readily carried out by a simple frequency pushing technique, applied to a PLL circuit, thus replacing the requirement for additional phase shifters. Practical results when applied to a ten-element array operating at 2.4 GHz are shown for both modes of operation.

Relevance:

90.00%

Abstract:

A new strategy for remote reconfiguration of an antenna array far-field radiation pattern is described. The scheme uses a pilot tone co-transmitted with a carrier signal from a location distant from that of a receive antenna array whose far-field pattern is to be reconfigured. By mixing the co-transmitted signals locally at each antenna element in the array, an IF signal is formed which defines an equivalent array spacing that can be made variable by tuning the frequency of the pilot tone with respect to the RF carrier. This makes the antenna array factor, and hence the far-field spatial characteristic, reconfigurable on receive. For a 10 x 1 microstrip patch element array we show that the receive pattern can be made to vary from 35 to 10 degrees half-power beamwidth as the difference frequency between the pilot and the carrier at 2.45 GHz varies between 10 MHz and 500 MHz.

Relevance:

90.00%

Abstract:

The overall aim of the work presented in this paper has been to develop Montgomery modular multiplication architectures suitable for implementation on modern reconfigurable hardware. Accordingly, novel high-radix systolic array Montgomery multiplier designs are presented, as we believe that the inherent regular structure and absence of global interconnect associated with these make them well-suited for implementation on modern FPGAs. Unlike previous approaches, each processing element (PE) comprises both an adder and a multiplier. The inclusion of a multiplier in the PE means that the need to pre-compute or store any multiples of the operands is avoided. This also allows very high-radix implementations to be realised, further reducing the number of clock cycles per modular multiplication, while still maintaining a competitive critical delay. For demonstrative purposes, 512-bit and 1024-bit FPGA implementations using radices of 2^8 and 2^16 are presented. The subsequent throughput rates are the fastest reported to date.
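To make the role of the radix concrete, here is a minimal software sketch of word-serial (high-radix) Montgomery multiplication in Python. It is the standard textbook algorithm, not the systolic array design described in the abstract; the function name `montgomery_multiply` and its parameters are assumptions for illustration. Each loop iteration consumes one radix-2^`word_bits` digit of `a`, so a larger radix means fewer iterations per multiplication, which is the software analogue of fewer clock cycles.

```python
def montgomery_multiply(a, b, n, word_bits=16, num_words=None):
    """Return a * b * R^-1 mod n, where R = 2 ** (word_bits * num_words).

    Hypothetical helper for illustration. Requires an odd modulus n and b < n,
    with a < R. Uses pow(n, -1, radix), which needs Python 3.8 or later.
    """
    if n % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    radix = 1 << word_bits
    if num_words is None:
        num_words = -(-n.bit_length() // word_bits)   # ceiling division
    n_prime = (-pow(n, -1, radix)) % radix            # -n^-1 mod radix
    acc = 0
    for i in range(num_words):
        a_i = (a >> (i * word_bits)) & (radix - 1)    # next digit of a
        acc += a_i * b                                # multiply-accumulate step
        q = ((acc & (radix - 1)) * n_prime) & (radix - 1)
        acc = (acc + q * n) >> word_bits              # low digit is zero, so the shift is exact
    if acc >= n:                                      # single final subtraction
        acc -= n
    return acc
```

To multiply ordinary residues, operands are first mapped to Montgomery form (`x * R mod n`), multiplied with this routine, and mapped back by a final call with `1` as the second operand.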

Relevance:

80.00%

Abstract:

With the advent of new video standards such as MPEG-4 part-10 and H.264/H.26L, demands for advanced video coding, particularly in the area of variable block size video motion estimation (VBSME), are increasing. In this paper, we propose a new one-dimensional (1-D) very large-scale integration architecture for full-search VBSME (FSVBSME). The VBS sum of absolute differences (SAD) computation is performed by re-using the results of smaller sub-block computations. These are distributed and combined by incorporating a shuffling mechanism within each processing element. Whereas a conventional 1-D architecture can process only one motion vector (MV), this new architecture can process up to 41 MV sub-blocks (within a macroblock) in the same number of clock cycles.
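The reuse of smaller sub-block results can be sketched in a few lines of Python. This is only an illustration of the arithmetic, under the assumption of a 16x16 macroblock split into the seven H.264 partition shapes; it does not model the 1-D array or the shuffling mechanism inside each processing element, and the names `SHAPES`, `sad_4x4_grid` and `vbs_sads` are hypothetical. The sixteen 4x4 SADs are computed once, and every larger partition's SAD is obtained by adding them, giving 41 SADs in total.

```python
# Partition shapes as (rows, cols) counted in 4x4 blocks; names are width x height.
SHAPES = {
    "4x4": (1, 1), "8x4": (1, 2), "4x8": (2, 1), "8x8": (2, 2),
    "16x8": (2, 4), "8x16": (4, 2), "16x16": (4, 4),
}

def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block of a 16x16 macroblock."""
    grid = [[0] * 4 for _ in range(4)]
    for r in range(16):
        for c in range(16):
            grid[r // 4][c // 4] += abs(cur[r][c] - ref[r][c])
    return grid

def vbs_sads(cur, ref):
    """Return SADs for every partition shape, built only from the 4x4 SADs."""
    grid = sad_4x4_grid(cur, ref)
    result = {}
    for name, (rows, cols) in SHAPES.items():
        result[name] = [
            sum(grid[r + i][c + j] for i in range(rows) for j in range(cols))
            for r in range(0, 4, rows)
            for c in range(0, 4, cols)
        ]
    return result
```

Calling `vbs_sads(cur, ref)` for one candidate position returns 16 + 8 + 8 + 4 + 2 + 2 + 1 = 41 values, while the pixel-level absolute differences are evaluated only once.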

Relevance:

80.00%

Abstract:

An application specific programmable processor (ASIP) suitable for the real-time implementation of matrix computations such as Singular Value and QR Decomposition is presented. The processor incorporates facilities for the issue of parallel instructions and a dual-bus architecture that are designed to achieve high performance. Internally, it uses a CORDIC module to perform arithmetic operations, with pipelining of the internal recursive loop exploited to multiplex the two independent micro-rotations onto a single piece of hardware. The net result is a flexible processing element whose functionality can be changed under program control, which combines high performance with efficient silicon implementation. This is illustrated through the results of a detailed silicon design study and the applications of the techniques to a combined SVD/QRD system.

Relevance:

80.00%

Abstract:

An SVD processor system is presented in which each processing element is implemented using a simple CORDIC unit. The internal recursive loop within the CORDIC module is exploited, with pipelining being used to multiplex the two independent micro-rotations onto a single CORDIC processor. This leads to a high performance and efficient hardware architecture. In addition, a novel method for scale factor correction is presented which only need be applied once at the end of the computation. This also reduces the computation time. The net result is an SVD architecture based on a conventional CORDIC approach, which combines high performance with high silicon area efficiency.

Relevance:

80.00%

Abstract:

A means of encoding and decoding data using wireless orbital angular momentum (OAM) modes is proposed and analysed. Source data symbols are used to select an OAM mode, which is generated using an 8-element circular array. A 2-element array is used to detect the mode by estimating the phase gradient of the received signal, and hence identifying the transmitted data symbol. The results are presented in terms of mode estimation error.
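The phase-gradient detection step can be illustrated with an idealised model in Python. The sketch assumes that an OAM mode of order `mode` has field phase `mode * phi` at azimuth angle `phi`, that the mode is excited by progressively phased elements on a circle, and that the two receive elements are separated by a known azimuthal angle `delta_phi`; noise, propagation and the real antenna geometry are ignored, and both function names are hypothetical.

```python
import cmath
import math

def oam_element_phases(mode, num_elements=8):
    """Complex excitations for a circular array generating the given OAM mode."""
    return [cmath.exp(1j * mode * 2 * math.pi * n / num_elements)
            for n in range(num_elements)]

def estimate_mode(sample_a, sample_b, delta_phi):
    """Estimate the OAM mode from two samples taken delta_phi radians apart in azimuth.

    The phase difference between the samples is mode * delta_phi in this
    idealised model, so dividing by delta_phi recovers the mode order.
    Unambiguous only while |mode * delta_phi| < pi.
    """
    phase_diff = cmath.phase(sample_b * sample_a.conjugate())
    return round(phase_diff / delta_phi)
```

With `delta_phi = math.pi / 8`, two ideal samples `cmath.exp(1j * mode * phi)` taken at `phi` and `phi + delta_phi` yield the transmitted mode for orders from -3 to 3, which a receiver could then map back to a data symbol.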

Relevance:

80.00%

Abstract:

It is shown that the direction-of-arrival (DoA) information carried by an incident electromagnetic (EM) wave can be encoded into the evanescent near field of an electrically small resonance antenna array with a spatial rate higher than that of the incident field oscillation rate in free space. Phase conjugation of the received signal leads to the retrodirection of the near field in the antenna array environment, which in turn generates a retrodirected far-field beam toward the original DoA. This EM phenomenon enables electrically small retrodirective antenna arrays with superdirective, angular super-resolution, auto-pointing properties for an arbitrary DoA. A theoretical explanation of the phenomenon based on first-principles observations is given, and full-wave simulations demonstrate a realizability route for the proposed retrodirective terminal that is comprised of resonance dipole antenna elements. Specifically, it is shown that a three-element disk-loaded retrodirective dipole array with 0.15λ spacings can achieve a 3.4-dBi maximal gain, 3-dBi front-to-back ratio, and 13% return loss fractional bandwidth (at the 10-dB level). Then, it is demonstrated that the radiation gain of a three-element array can be improved to approximately 6 dBi at the expense of the return loss fractional bandwidth reduction (2%).

Relevance:

40.00%

Abstract:

Dynamic power consumption is very dependent on interconnect, so clever mapping of digital signal processing algorithms to parallelised realisations with data locality is vital. This is a particular problem for fast algorithm implementations where, typically, designers will have sacrificed circuit structure for efficiency in software implementation. This study outlines an approach for reducing the dynamic power consumption of a class of fast algorithms by minimising the index space separation; this allows the generation of field programmable gate array (FPGA) implementations with reduced power consumption. It is shown how a 50% reduction in relative index space separation results in measured power gains of 36% and 37% over a Cooley-Tukey fast Fourier transform (FFT)-based solution, for actual power measurements on a Xilinx Virtex-II FPGA implementation and circuit measurements for a Xilinx Virtex-5 implementation, respectively. The authors show the generality of the approach by applying it to a number of other fast algorithms, namely the discrete cosine, discrete Hartley and Walsh-Hadamard transforms.

Relevance:

40.00%

Abstract:

Due to its efficiency and simplicity, the finite-difference time-domain method is becoming a popular choice for solving wideband, transient problems in various fields of acoustics. So far, the issue of extracting a binaural response from finite difference simulations has only been discussed in the context of embedding a listener geometry in the grid. In this paper, we propose and study a method for binaural response rendering based on a spatial decomposition of the sound field. The finite difference grid is locally sampled using a volumetric array of receivers, from which a plane wave density function is computed and integrated with free-field head related transfer functions, in the spherical harmonics domain. The volumetric array is studied in terms of numerical robustness and spatial aliasing. Analytic formulas that predict the performance of the array are developed, facilitating spatial resolution analysis and numerical binaural response analysis for a number of finite difference schemes. Particular emphasis is placed on the effects of numerical dispersion on array processing and on the resulting binaural responses. Our method is compared to a binaural simulation based on the image method. Results indicate good spatial and temporal agreement between the two methods.

Relevance:

40.00%

Abstract:

Field programmable gate array devices boast abundant resources with which custom accelerator components for signal, image and data processing may be realised; however, realising high performance, low cost accelerators currently demands manual register transfer level design. Software-programmable 'soft' processors have been proposed as a way to reduce this design burden, but they are unable to support performance and cost comparable to custom circuits. This paper proposes a new soft processing approach for FPGA which promises to overcome this barrier. A high performance, fine-grained streaming processor, known as a Streaming Accelerator Element, is proposed which realises accelerators as large scale custom multicore networks. By adopting a streaming execution approach with advanced program control and memory addressing capabilities, typical program inefficiencies can be almost completely eliminated to enable performance and cost which are unprecedented amongst software-programmable solutions. When used to realise accelerators for fast Fourier transform, motion estimation, matrix multiplication and Sobel edge detection, it is shown how the proposed architecture enables real-time performance, with performance and cost comparable with hand-crafted custom circuit accelerators and up to two orders of magnitude beyond existing soft processors.