583 resultados para 290900 Electrical and Electronic Engineering
Resumo:
Methods are presented for developing synthesizable FFT cores. These are based on a modular approach in which parameterized commutator and processor blocks are cascaded to implement the computations required in many important FFT signal flow graphs. In addition, it is shown how the use of a digital serial data organization can be used to produce systems that offer 100% processor utilization along with reductions in storage requirements. The approach has been used to create generators for the automated synthesis of FFT cores that are portable across a broad range of silicon technologies. Resulting chip designs are competitive with ones created using manual methods but with significant reductions in design times.
Resumo:
In this paper, a novel configurable content addressable memory (CCAM) cell is proposed, to increase the flexibility of embedded CAMs for SoC employment. It can be easily configured as a Binary CAM (BiCAM) or Ternary CAM (TCAM) without significant penalty of power consumption or searching speed. A 64x128 CCAM array has been built and verified through simulation. ©2007 IEEE.
Resumo:
DDR-SDRAM based data lookup techniques are evolving into a core technology for packet lookup applications for data network, benefitting from the features of high density, high bandwidth and low price of DDR memory products in the market. Our proposed DDR-SDRAM based lookup circuit is capable of achieving IP header lookup for network line-rates of up to 10Gbps, providing a solution on high-performance and economic packet header inspections. ©2008 IEEE.
Resumo:
The design of a System-on-a-Chip (SoC) demonstrator for a baseline JPEG encoder core is presented. This combines a highly optimized Discrete Cosine Transform (DCT) and quantization unit with an entropy coder which has been realized using off-the-shelf synthesizable IP cores (Run-length coder, Huffman coder and data packer). When synthesized in a 0.35 µm CMOS process, the core can operate at speeds up to 100 MHz and contains 50 k gates plus 11.5 kbits of RAM. This is approximately 20% less than similar JPEG encoder designs reported in literature. When targeted at FPGA the core can operate up to 30 MHz and is capable of compressing 9-bit full-frame color input data at NTSC or PAL rates.
Resumo:
A generator for the automated design of Discrete Cosine Transform (DCT) cores is presented. This can be used to rapidly create silicon circuits from a high level specification. These compare very favourably with existing designs. The DCT cores produced are scaleable in terms of point size as well as input/output and coefficient wordlengths. This provides a high degree of flexibility. An example, 8-point 1D DCT design, produced occupies less than 0.92 mm when implemented in a 0.35µ double level metal CMOS technology. This can be clocked at a rate of 100MHz.
Resumo:
A novel hardware architecture for elliptic curve cryptography (ECC) over GF(p) is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires much fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications. © 2006 IEEE.
Resumo:
A scheduling method for implementing a generic linear QR array processor architecture is presented. This improves on previous work. It also considerably simplifies the derivation of schedules for a folded linear system, where detailed account has to be taken of processor cell latency. The architecture and scheduling derived provide the basis of a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition.
Resumo:
This paper presents single-chip FPGA Rijndael algorithm implementations of the Advanced Encryption Standard (AES) algorithm, Rijndael. In particular, the designs utilise look-up tables to implement the entire Rijndael Round function. A comparison is provided between these designs and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations and since there is a need to perform encryption on data in real time, speed is very important. In particular, Field Programmable Gate Arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture, which can be exploited to accommodate typical encryption transformations. In this paper, a Look-Up Table (LUT) methodology is introduced where complex and slow operations are replaced by simple LUTs. A LUT-based fully pipelined Rijndael implementation is described which has a pre-placement performance of 12 Gbits/sec, which is a factor 1.2 times faster than an alternative design in which look-up tables are utilised to implement only one of the Round function transformations, and 6 times faster than other previous single-chip implementations. Iterative Rijndael implementations based on the Look-Up-Table design approach are also discussed and prove faster than typical iterative implementations.
Resumo:
In this paper, a new reconfigurable multi-standard architecture is introduced for integer-pixel motion estimation and a standard-cell based chip design study is presented. This has been designed to cover most of the common block-based video compression standards, including MPEG-2, MPEG-4, H.263, H.264, AVS and WMV-9. The architecture exhibits simpler control, high throughput and relative low hardware cost and highly competitive when compared with excising designs for specific video standards. It can also, through the use of control signals, be dynamically reconfigured at run-time to accommodate different system constraint such as the trade-off in power dissipation and video-quality. The computational rates achieved make the circuit suitable for high end video processing applications. Silicon design studies indicate that circuits based on this approach incur only a relatively small penalty in terms of power dissipation and silicon area when compared with implementations for specific standards.
Resumo:
A methodology has been developed which allows a non-specialist to rapidly design silicon wavelet transform cores for a variety of specifications. The cores include both forward and inverse orthonormal wavelet transforms. This methodology is based on efficient, modular and scaleable architectures utilising time-interleaved coefficients for the wavelet transform filters. The cores are parameterized in terms of wavelet type and data and coefficient word lengths. The designs have been captured in VHDL and are hence portable across a range of silicon foundries as well as FPGA and PLD implementations.