949 resultados para Carry Save Adders


Relevância:

100.00% 100.00%

Publicador:

Resumo:

"To be presented at the First Annual IEEE Computer Conference, Chicago, Illinois, September 6-8, 1967."

Relevância:

100.00% 100.00%

Publicador:

Resumo:

"March 7, 1960"

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The application of fine grain pipelining techniques in the design of high performance Wave Digital Filters (WDFs) is described. It is shown that significant increases in the sampling rate of bit parallel circuits can be achieved using most significant bit (msb) first arithmetic. A novel VLSI architecture for implementing two-port adaptor circuits is described which embodies these ideas. The circuit in question is highly regular, uses msb first arithmetic and is implemented using simple carry-save adders. © 1992 Kluwer Academic Publishers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Optimized circuits for implementing high-performance bit-parallel IIR filters are presented. Circuits constructed mainly from simple carry save adders and based on most-significant-bit (MSB) first arithmetic are described. Two methods resulting in systems which are 100% efficient in that they are capable of sampling data every cycle are presented. In the first approach the basic circuit is modified so that the level of pipelining used is compatible with the small, but fixed, latency associated with the computation in question. This is achieved through insertion of pipeline delays (half latches) on every second row of cells. This produces an area-efficient solution in which the throughput rate is determined by a critical path of 76 gate delays. A second approach combines the MSB first arithmetic methods with the scattered look-ahead methods. Important design issues are addressed, including wordlength truncation, overflow detection, and saturation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. This paper presents a novel double digit decimal multiplication (DDDM) technique that performs 2 digit multiplications simultaneously in one clock cycle. This design offers low latency and high throughput. When multiplying two n-digit operands to produce a 2n-digit product, the design has a latency of (n / 2) 1 cycles. The paper presents area and delay comparisons for 7-digit, 16-digit, 34-digit double digit decimal multipliers on different families of Xilinx, Altera, Actel and Quick Logic FPGAs. The multipliers presented can be extended to support decimal floating-point multiplication for IEEE P754 standard

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. This paper presents a novel double digit decimal multiplication (DDDM) technique that offers low latency and high throughput. This design performs two digit multiplications simultaneously in one clock cycle. Double digit fixed point decimal multipliers for 7digit, 16 digit and 34 digit are simulated using Leonardo Spectrum from Mentor Graphics Corporation using ASIC Library. The paper also presents area and delay comparisons for these fixed point multipliers on Xilinx, Altera, Actel and Quick logic FPGAs. This multiplier design can be extended to support decimal floating point multiplication for IEEE 754- 2008 standard.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Wave pipelining is a design technique for increasing the throughput of a digital circuit or system without introducing pipelining registers between adjacent combinational logic blocks in the circuit/system. However, this requires balancing of the delays along all the paths from the input to the output which comes the way of its implementation. Static CMOS is inherently susceptible to delay variation with input data, and hence, receives a low priority for wave pipelined digital design. On the other hand, ECL and CML, which are amenable to wave pipelining, lack the compactness and low power attributes of CMOS. In this paper we attempt to exploit wave pipelining in CMOS technology. We use a single generic building block in Normal Process Complementary Pass Transistor Logic (NPCPL), modeled after CPL, to achieve equal delay along all the propagation paths in the logic structure. An 8×8 b multiplier is designed using this logic in a 0.8 ?m technology. The carry-save multiplier architecture is modified suitably to support wave pipelining, viz., the logic depth of all the paths are made identical. The 1 mm×0.6 mm multiplier core supports a throughput of 400 MHz and dissipates a total power of 0.6 W. We develop simple enhancements to the NPCPL building blocks that allow the multiplier to sustain throughputs in excess of 600 MHz. The methodology can be extended to introduce wave pipelining in other circuits as well

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The use of delayed coefficient adaptation in the least mean square (LMS) algorithm has enabled the design of pipelined architectures for real-time transversal adaptive filtering. However, the convergence speed of this delayed LMS (DLMS) algorithm, when compared with that of the standard LMS algorithm, is degraded and worsens with increase in the adaptation delay. Existing pipelined DLMS architectures have large adaptation delay and hence degraded convergence speed. We in this paper, first present a pipelined DLMS architecture with minimal adaptation delay for any given sampling rate. The architecture is synthesized by using a number of function preserving transformations on the signal flow graph representation of the DLMS algorithm. With the use of carry-save arithmetic, the pipelined architecture can support high sampling rates, limited only by the delay of a full adder and a 2-to-1 multiplexer. In the second part of this paper, we extend the synthesis methodology described in the first part, to synthesize pipelined DLMS architectures whose power dissipation meets a specified budget. This low-power architecture exploits the parallelism in the DLMS algorithm to meet the required computational throughput. The architecture exhibits a novel tradeoff between algorithmic performance (convergence speed) and power dissipation. (C) 1999 Elsevier Science B.V. All rights resented.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The initial part of this paper reviews the early challenges (c 1980) in achieving real-time silicon implementations of DSP computations. In particular, it discusses research on application specific architectures, including bit level systolic circuits that led to important advances in achieving the DSP performance levels then required. These were many orders of magnitude greater than those achievable using programmable (including early DSP) processors, and were demonstrated through the design of commercial digital correlator and digital filter chips. As is discussed, an important challenge was the application of these concepts to recursive computations as occur, for example, in Infinite Impulse Response (IIR) filters. An important breakthrough was to show how fine grained pipelining can be used if arithmetic is performed most significant bit (msb) first. This can be achieved using redundant number systems, including carry-save arithmetic. This research and its practical benefits were again demonstrated through a number of novel IIR filter chip designs which at the time, exhibited performance much greater than previous solutions. The architectural insights gained coupled with the regular nature of many DSP and video processing computations also provided the foundation for new methods for the rapid design and synthesis of complex DSP System-on-Chip (SoC), Intellectual Property (IP) cores. This included the creation of a wide portfolio of commercial SoC video compression cores (MPEG2, MPEG4, H.264) for very high performance applications ranging from cell phones to High Definition TV (HDTV). The work provided the foundation for systematic methodologies, tools and design flows including high-level design optimizations based on "algorithmic engineering" and also led to the creation of the Abhainn tool environment for the design of complex heterogeneous DSP platforms comprising processors and multiple FPGAs. The paper concludes with a discussion of the problems faced by designers in developing complex DSP systems using current SoC technology. © 2007 Springer Science+Business Media, LLC.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Whilst conventional bit level pipelining introduces an m cycle delay, it does allow m separate computations to be processed at throughput rates comparable to that using word level systolic arrays. We concentrate on exploiting this delay and describe a systematic method for the design of high performance multiplexed IIR filters. Two multiply and accumulate structures are identified based on shift-and-add and carry-save data organisations which can be used as building blocks in the design of IIR filters. By replacing the word level multiply and accumulate units in word level systolic structures with their equivalent bit level circuits and introducing latches to ensure correct timing, numerous architectures can be designed that process multiplexed data directly without any additional circuit overhead.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we propose a novel finite impulse response (FIR) filter design methodology that reduces the number of operations with a motivation to reduce power consumption and enhance performance. The novelty of our approach lies in the generation of filter coefficients such that they conform to a given low-power architecture, while meeting the given filter specifications. The proposed algorithm is formulated as a mixed integer linear programming problem that minimizes chebychev error and synthesizes coefficients which consist of pre-specified alphabets. The new modified coefficients can be used for low-power VLSI implementation of vector scaling operations such as FIR filtering using computation sharing multiplier (CSHM). Simulations in 0.25um technology show that CSHM FIR filter architecture can result in 55% power and 34% speed improvement compared to carry save multiplier (CSAM) based filters.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The authors present a VLSI circuit for implementing wave digital filter (WDF) two-port adaptors. Considerable speedups over conventional designs have been obtained using fine grained pipelining. This has been achieved through the use of most significant bit (MSB) first carry-save arithmetic, which allows systems to be designed in which latency L is small and independent of either coefficient or input data wordlength. L is determined by the online delay associated with the computation required at each node in the circuit (in this case a multiply/add plus two separate additions). This in turn means that pipelining can be used to considerably enhance the sampling rate of a recursive digital filter. The level of pipelining which will offer enhancement is determined by L and is fine-grained rather than bit level. In the case of the circuit considered, L = 3. For this reason pipeline delays (half latches) have been introduced between every two rows of cells to produce a system with a once every cycle sample rate.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The authors compare various array multiplier architectures based on (p,q) counter circuits. The tradeoff in multiplier design is always between adding complexity and increasing speed. It is shown that by using a (2,2,3) counter cell it is possible to gain a significant increase in speed over a conventional full-adder, carry-save array based approach. The increase in complexity should be easily accommodated using modern emitter-coupled-logic processes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

On cover: C00-1018-1152.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

As a post-CMOS technology, the incipient Quantum-dot Cellular Automata technology has various advantages. A key aspect which makes it highly desirable is low power dissipation. One method that is used to analyse power dissipation in QCA circuits is bit erasure analysis. This method has been applied to analyse previously proposed QCA binary adders. However, a number of improved QCA adders have been proposed more recently that have only been evaluated in terms of area and speed. As the three key performance metrics for QCA circuits are speed, area and power, in this paper, a bit erasure analysis of these adders will be presented to determine their power dissipation. The adders to be analysed are the Carry Flow Adder (CFA), Brent-Kung Adder (B-K), Ladner-Fischer Adder (L-F) and a more recently developed area-delay efficient adder. This research will allow for a more comprehensive comparison between the different QCA adder proposals. To the best of the authors' knowledge, this is the first time power dissipation analysis has been carried out on these adders.