955 resultados para VLSI CAD
Resumo:
A bit-level systolic array system is proposed for the Winograd Fourier transform algorithm. The design uses bit-serial arithmetic and, in common with other systolic arrays, features nearest neighbor interconnections, regularity, and high throughput. The short interconnections in this method contrast favorably with the long interconnections between butterflies required in the FFT. The structure is well suited to VLSI implementations. It is demonstrated how long transforms can be implemented with components designed to perform short-length transforms. These components build into longer transforms, preserving the regularity and structure of the short-length transform design.
Resumo:
A bit-level systolic array system for performing a binary tree Vector Quantization codebook search is described. This consists of a linear chain of regular VLSI building blocks and exhibits data rates suitable for a wide range of real-time applications. A technique is described which reduces the computation required at each node in the binary tree to that of a single inner product operation. This method applies to all the common distortion measures (including the Euclidean distance, the Weighted Euclidean distance and the Itakura-Saito distortion measure) and significantly reduces the hardware required to implement the tree search system. © 1990 Kluwer Academic Publishers.
Resumo:
A systematic design methodology is described for the rapid derivation of VLSI architectures for implementing high performance recursive digital filters, particularly ones based on most significant digit (msd) first arithmetic. The method has been derived by undertaking theoretical investigations of msd first multiply-accumulate algorithms and by deriving important relationships governing the dependencies between circuit latency, levels of pipe-lining and the range and number representations of filter operands. The techniques described are general and can be applied to both bit parallel and bit serial circuits, including those based on on-line arithmetic. The method is illustrated by applying it to the design of a number of highly pipelined bit parallel IIR and wave digital filter circuits. It is shown that established architectures, which were previously designed using heuristic techniques, can be derived directly from the equations described.
Resumo:
The fabrication and performance of the first bit-level systolic correlator array is described. The application of systolic array concepts at the bit level provides a simple and extremely powerful method for implementing high-performance digital processing functions. The resulting structure is highly regular, facilitating yield enhancement through fault-tolerant redundancy techniques and therefore ideally suited to implementation as a VLSI chip. The CMOS/SOS chip operates at 35 MHz, is fully cascadable and exhibits 64-stage correlation for 1-bit reference and 4-bit data. 7 refs.
Resumo:
The inclusion of the Discrete Wavelet Transform in the JPEG-2000 standard has added impetus to the research of hardware architectures for the two-dimensional wavelet transform. In this paper, a VLSI architecture for performing the symmetrically extended two-dimensional transform is presented. This architecture conforms to the JPEG-2000 standard and is capable of near-optimal performance when dealing with the image boundaries. The architecture also achieves efficient processor utilization. Implementation results based on a Xilinx Virtex-2 FPGA device are included.
Resumo:
This paper presents single-chip FPGA Rijndael algorithm implementations of the Advanced Encryption Standard (AES) algorithm, Rijndael. In particular, the designs utilise look-up tables to implement the entire Rijndael Round function. A comparison is provided between these designs and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations and since there is a need to perform encryption on data in real time, speed is very important. In particular, Field Programmable Gate Arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture, which can be exploited to accommodate typical encryption transformations. In this paper, a Look-Up Table (LUT) methodology is introduced where complex and slow operations are replaced by simple LUTs. A LUT-based fully pipelined Rijndael implementation is described which has a pre-placement performance of 12 Gbits/sec, which is a factor 1.2 times faster than an alternative design in which look-up tables are utilised to implement only one of the Round function transformations, and 6 times faster than other previous single-chip implementations. Iterative Rijndael implementations based on the Look-Up-Table design approach are also discussed and prove faster than typical iterative implementations.
Resumo:
Details are presented of the IRIS synthesis system for high-performance digital signal processing. This tool allows non-specialists to automatically derive VLSI circuit architectures from high-level, algorithmic representations, and provides a quick route to silicon implementation. The applicability of the system is demonstrated using the design example of a one-dimensional Discrete Cosine Transform circuit.
Resumo:
Real time digital signal processing requires the development of high performance arithmetic algorithms suitable for VLSI design. In this paper, a new online, circular coordinate system CORDIC algorithm is described, which has a constant scale factor. This algorithm was developed using a new Angular Representation (AR) model A radix 2 version of the CORDIC algorithm is presented, along with an architecture suitable for VLSI implementation.
Resumo:
The implementation of a multi-bit convolver chip based on a systolic array is described. The convolver is fabricated on a 7mm multiplied by 8mm CMOS chip and operates on 8-bit serial data and coefficient words. It has a length of 17 stages, and this is cascadable. The circuit can be clocked at more than 20 MHz giving a data throughput rate of greater than 1 Mword/s. Details of important implementation decisions and a summary of chip characteristics are given together with the advantages which the systolic approach has afforded to the design process.
Resumo:
Details of a new low power FFT processor for use in digital television applications are presented. This has been fabricated using a 0.6 µm CMOS technology and can perform a 64 point complex forward or inverse FFT on real-rime video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W. Its performance, in terms of computational rate per area per watt, is significantly higher than previously reported devices, leading to a cost-effective silicon solution for high quality video processing applications. This is the result of using a novel VLSI architecture which has been derived from a first principles factorisation of the DFT matrix and tailored to a direct silicon implementation.
Resumo:
Optimized circuits for implementing high-performance bit-parallel IIR filters are presented. Circuits constructed mainly from simple carry save adders and based on most-significant-bit (MSB) first arithmetic are described. Two methods resulting in systems which are 100% efficient in that they are capable of sampling data every cycle are presented. In the first approach the basic circuit is modified so that the level of pipelining used is compatible with the small, but fixed, latency associated with the computation in question. This is achieved through insertion of pipeline delays (half latches) on every second row of cells. This produces an area-efficient solution in which the throughput rate is determined by a critical path of 76 gate delays. A second approach combines the MSB first arithmetic methods with the scattered look-ahead methods. Important design issues are addressed, including wordlength truncation, overflow detection, and saturation.
Resumo:
Dual-rail encoding, return-to-spacer protocol, and hazard-free logic can be used to resist power analysis attacks by making energy consumed per clock cycle independent of processed data. Standard dual-rail logic uses a protocol with a single spacer, e.g., all-zeros, which gives rise to energy balancing problems. We address these problems by incorporating two spacers; the spacers alternate between adjacent clock cycles. This guarantees that all gates switch in every clock cycle regardless of the transmitted data values. To generate these dual-rail circuits, an automated tool has been developed. It is capable of converting synchronous netlists into dual-rail circuits and it is interfaced to industry CAD tools. Dual-rail and single-rail benchmarks based upon the advanced encryption standard (AES) have been simulated and compared in order to evaluate the method and the tool.
Resumo:
Objective Oxidative stress is implicated in the pathogenesis of many human diseases including atherosclerosis. Human glutathione peroxidase 1 (hgpx1) participates in limiting cellular damage caused by oxidation. A characteristic polyalanine sequence polymorphism in exon 1 of hgpx1 produces three alleles with five, six or seven alanine (ALA) repeats in this sequence. The objective of this study was to determine whether hgpx1 genotype is associated with an altered risk of coronary artery disease (CAD).
Methods The frequency of the ALA6 allele was determined in 207 men with angiographic evidence of significant CAD compared to a control group (n = 146), by analysing the lengths of polymerase chain reaction fragments containing the ALA repeat polymorphism. Additional information was collected on severity of CAD, presence or absence of a prior acute myocardial infarction (AMI), smoking status, body mass index (BMI) and other clinical data.
Results There was a significant association between individuals with at least one ALA6 allele and an increased risk of CAD after adjustment for age, BMI and smoking status (odds ratio, 2.07, 95% confidence interval, 1.08-3.99, P = 0.029). However, there was no association between hgpx1 genotype and a previous history of AMI or hgpx1 genotype and severity of CAD.
Conclusion We conclude that individuals possessing one or two ALA6 alleles appear to be at a modest increased risk of CAD. This observation merits further investigation in other patient populations.