943 resultados para Arithmetic.
Resumo:
An area-efficient high-throughput architecture based on distributed arithmetic is proposed for 3D discrete wavelet transform (DWT). The 3D DWT processor was designed in VHDL and mapped to a Xilinx Virtex-E FPGA. The processor runs up to 85 MHz, which can process the five-level DWT analysis of a 128 x 128 x 128 fMRI volume image in 20 ms.
Resumo:
In this paper a novel scalable public-key processor architecture is presented that supports modular exponentiation and Elliptic Curve Cryptography over both prime GF(p) and binary GF(2) extension fields. This is achieved by a high performance instruction set that provides a comprehensive range of integer and polynomial basis field arithmetic. The instruction set and associated hardware are generic in nature and do not specifically support any cryptographic algorithms or protocols. Firmware within the device is used to efficiently implement complex and data intensive arithmetic. A firmware library has been developed in order to demonstrate support for numerous exponentiation and ECC approaches, such as different coordinate systems and integer recoding methods. The processor has been developed as a high-performance asymmetric cryptography platform in the form of a scalable Verilog RTL core. Various features of the processor may be scaled, such as the pipeline width and local memory subsystem, in order to suit area, speed and power requirements. The processor is evaluated and compares favourably with previous work in terms of performance while offering an unparalleled degree of flexibility. © 2006 IEEE.
Resumo:
The arithmetical performance of typically achieving 5- to 7-year-olds (N = 29) was measured at four 6-month intervals. The same seven tasks were used at each time point: exact calculation, story problems, approximate arithmetic, place value, calculation principles, forced retrieval, and written problems. Although group analysis showed mostly linear growth over the 18-month period, analysis of individual differences revealed a much more complex picture. Some children exhibited marked variation in performance across the seven tasks, including evidence of difficulty in some cases. Individual growth patterns also showed differences in developmental trajectories between children on each task and within children across tasks. The findings support the idea of the componential nature of arithmetical ability and underscore the need for further longitudinal research on typically achieving children and of careful consideration of individual differences. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
The choice of radix is crucial for multi-valued logic synthesis. Practical examples, however, reveal that it is not always possible to find the optimal radix when taking into consideration actual physical parameters of multi-valued operations. In other words, each radix has its advantages and disadvantages. Our proposal is to synthesise logic in different radices, so it may benefit from their combination. The theory presented in this paper is based on Reed-Muller expansions over Galois field arithmetic. The work aims to firstly estimate the potential of the new approach and to secondly analyse its impact on circuit parameters down to the level of physical gates. The presented theory has been applied to real-life examples focusing on cryptographic circuits where Galois Fields find frequent application. The benchmark results show the approach creates a new dimension for the trade-off between circuit parameters and provides information on how the implemented functions are related to different radices.
Resumo:
This implementation of a two-dimensional discrete cosine transform demonstrates the development of a suitable architectural style for a specific technology-in this case, the Xilinx XC6200 FPGA series. The design exploits distributed arithmetic, parallelism, and pipelining to achieve a high-performance custom-computing implementation.
Resumo:
The competition between Photoinduced electron transfer (PET) and other de-excitation pathways such as fluorescence and phosphorescence can be controlled within designed molecular structures. Depending on the particular design, the resulting optical output is thus a function of various inputs such as ion concentration and excitation light dose. Once digitized into binary code, these input-output patterns can be interpreted according to Boolean logic. The single-input logic types of YES and NOT cover simple sensors and the double- (or higher-) input logic types represent other gates such as AND and OR. The logic-based arithmetic processors such as half-adders and half-subtractors are also featured. Naturally, a principal application of the more complex gates is in multi-sensing contexts.
Resumo:
The initial part of this paper reviews the early challenges (c 1980) in achieving real-time silicon implementations of DSP computations. In particular, it discusses research on application specific architectures, including bit level systolic circuits that led to important advances in achieving the DSP performance levels then required. These were many orders of magnitude greater than those achievable using programmable (including early DSP) processors, and were demonstrated through the design of commercial digital correlator and digital filter chips. As is discussed, an important challenge was the application of these concepts to recursive computations as occur, for example, in Infinite Impulse Response (IIR) filters. An important breakthrough was to show how fine grained pipelining can be used if arithmetic is performed most significant bit (msb) first. This can be achieved using redundant number systems, including carry-save arithmetic. This research and its practical benefits were again demonstrated through a number of novel IIR filter chip designs which at the time, exhibited performance much greater than previous solutions. The architectural insights gained coupled with the regular nature of many DSP and video processing computations also provided the foundation for new methods for the rapid design and synthesis of complex DSP System-on-Chip (SoC), Intellectual Property (IP) cores. This included the creation of a wide portfolio of commercial SoC video compression cores (MPEG2, MPEG4, H.264) for very high performance applications ranging from cell phones to High Definition TV (HDTV). The work provided the foundation for systematic methodologies, tools and design flows including high-level design optimizations based on "algorithmic engineering" and also led to the creation of the Abhainn tool environment for the design of complex heterogeneous DSP platforms comprising processors and multiple FPGAs. The paper concludes with a discussion of the problems faced by designers in developing complex DSP systems using current SoC technology. © 2007 Springer Science+Business Media, LLC.
Resumo:
A novel high performance bit parallel architecture to perform square root and division is proposed. Relevant VLSI design issues have been addressed. By employing redundant arithmetic and a semisystolic schedule, the throughput has been made independent of the size of the array.
Resumo:
A bit level systolic array system is proposed for the Winograd Fourier transform algorithm. The design uses bit-serial arithmetic and, in common with other systolic arrays, features nearest-neighbor interconnections, regularity and high throughput. The short interconnections in this method contrast favorably with the long interconnections between butterflies required in the FFT. The structure is well suited to VLSI implementations. It is demonstrated how long transforms can be implemented with components designed to perform a short length transform. These components build into longer transforms preserving the regularity and structure of the short length transform design.
Resumo:
The application of fine grain pipelining techniques in the design of high performance Wave Digital Filters (WDFs) is described. It is shown that significant increases in the sampling rate of bit parallel circuits can be achieved using most significant bit (msb) first arithmetic. A novel VLSI architecture for implementing two-port adaptor circuits is described which embodies these ideas. The circuit in question is highly regular, uses msb first arithmetic and is implemented using simple carry-save adders. © 1992 Kluwer Academic Publishers.
Resumo:
A high-performance VLSI architecture to perform multiply-accumulate, division and square root operations is proposed. The circuit is highly regular, requires only minimal control and can be pipelined right down to the bit level. The system can also be reconfigured on every cycle to perform any one of these operations. The gate count per row has been estimated at (27n+70) gate equivalents where n is the divisor wordlength. The throughput rate, which equals the clock speed, is the same for each operation and is independent of the wordlength. This is achieved through the combination of pipelining and redundant arithmetic. With a 1.0 µm CMOS technology and extensive pipelining, throughput rates in excess of 70 million operations per second are expected.
Resumo:
Several novel systolic architectures for implementing densely pipelined bit parallel IIR filter sections are presented. The fundamental problem of latency in the feedback loop is overcome by employing redundant arithmetic in combination with bit-level feedback, allowing a basic first-order section to achieve a wordlength-independent latency of only two clock cycles. This is extended to produce a building block from which higher order sections can be constructed. The architecture is then refined by combining the use of both conventional and redundant arithmetic, resulting in two new structures offering substantial hardware savings over the original design. In contrast to alternative techniques, bit-level pipelinability is achieved with no net cost in hardware. © 1989 Kluwer Academic Publishers.
Resumo:
A novel bit-level systolic array architecture for implementing bit-parallel IIR filter sections is presented. The authors have shown previously how the fundamental obstacle of pipeline latency in recursive structures can be overcome by the use of redundant arithmetic in combination with bit-level feedback. These ideas are extended by optimizing the degree of redundancy used in different parts of the circuit and combining redundant circuit techniques with those of conventional arithmetic. The resultant architecture offers significant improvements in hardware complexity and throughput rate.
Resumo:
A bit-level systolic array system is proposed for the Winograd Fourier transform algorithm. The design uses bit-serial arithmetic and, in common with other systolic arrays, features nearest neighbor interconnections, regularity, and high throughput. The short interconnections in this method contrast favorably with the long interconnections between butterflies required in the FFT. The structure is well suited to VLSI implementations. It is demonstrated how long transforms can be implemented with components designed to perform short-length transforms. These components build into longer transforms, preserving the regularity and structure of the short-length transform design.
Resumo:
A systematic design methodology is described for the rapid derivation of VLSI architectures for implementing high performance recursive digital filters, particularly ones based on most significant digit (msd) first arithmetic. The method has been derived by undertaking theoretical investigations of msd first multiply-accumulate algorithms and by deriving important relationships governing the dependencies between circuit latency, levels of pipe-lining and the range and number representations of filter operands. The techniques described are general and can be applied to both bit parallel and bit serial circuits, including those based on on-line arithmetic. The method is illustrated by applying it to the design of a number of highly pipelined bit parallel IIR and wave digital filter circuits. It is shown that established architectures, which were previously designed using heuristic techniques, can be derived directly from the equations described.