847 resultados para cadenas agroindustriales
Resumo:
Design for low power in FPGA is rather limited since technology factors affecting power are either fixed or limited for FPGA families. This paper investigates opportunities for power savings of a pipelined 2D IDCT design at the architecture and logic level. We report power consumption savings of over 25% achieved in FPGA circuits obtained from clock gating implementation of optimizations made at the algorithmic level(1).
Resumo:
Pullpipelining, a pipeline technique where data is pulled from successor stages from predecessor stages is proposed Control circuits using a synchronous, a semi-synchronous and an asynchronous approach are given. Simulation examples for a DLX generic RISC datapath show that common control pipeline circuit overhead is avoided using the proposal. Applications to linear systolic arrays in cases when computation is finished at early stages in the array are foreseen. This would allow run-time data-driven digital frequency modulation of synchronous pipelined designs. This has applications to implement algorithms exhibiting average-case processing time using a synchronous approach.
Resumo:
This paper presents the evaluation in power consumption of gated clocks pipelined circuits with different register configurations in Virtex-based FPGA devices. Power impact of a gated clock circuitry aimed at reducing flip-flops output rate at the bit level is studied. Power performance is also given for pipeline stages based on the implementation of a double edge-triggered flip-flop. Using a pipelined Cordic Core circuit as an example, this study did not find evidence in power benefits either when gated clock at the bit-level or double-edge triggered flip-flops used when synthesized with FPGA logic resources.
Resumo:
This paper formally derives a new path-based neural branch prediction algorithm (FPP) into blocks of size two for a lower hardware solution while maintaining similar input-output characteristic to the algorithm. The blocked solution, here referred to as B2P algorithm, is obtained using graph theory and retiming methods. Verification approaches were exercised to show that prediction performances obtained from the FPP and B2P algorithms differ within one mis-prediction per thousand instructions using a known framework for branch prediction evaluation. For a chosen FPGA device, circuits generated from the B2P algorithm showed average area savings of over 25% against circuits for the FPP algorithm with similar time performances thus making the proposed blocked predictor superior from a practical viewpoint.
Resumo:
An unaltered rearrangement of the original computation of a neural based predictor at the algorithmic level is introduced as a new organization. Its FPGA implementation generates circuits that are 1.7 faster than a direct implementation of the original algorithm. This faster clock rate allows to implement predictors with longer history lengths using the nearly the same hardware budget.
Resumo:
This paper develops cycle-level FPGA circuits of an organization for a fast path-based neural branch predictor Our results suggest that practical sizes of prediction tables are limited to around 32 KB to 64 KB in current FPGA technology due mainly to FPGA area of logic resources to maintain the tables. However the predictor scales well in terms of prediction speed. Table sizes alone should not be used as the only metric for hardware budget when comparing neural-based predictor to predictors of totally different organizations. This paper also gives early evidence to shift the attention on to the recovery from mis-prediction latency rather than on prediction latency as the most critical factor impacting accuracy of predictions for this class of branch predictors.
Resumo:
A Fractal Quantizer is proposed that replaces the expensive division operation for the computation of scalar quantization by more modest and available multiplication, addition and shift operations. Although the proposed method is iterative in nature, simulations prove a virtually undetectable distortion to the naked eve for JPEG compressed images using a single iteration. The method requires a change to the usual tables used in JPEG algorithins but of similar size. For practical purposes, performing quantization is reduced to a multiplication plus addition operation easily programmed in either low-end embedded processors and suitable for efficient and very high speed implementation in ASIC or FPGA hardware. FPGA hardware implementation shows up to x15 area-time savingscompared to standars solutions for devices with dedicated multipliers. The method can be also immediately extended to perform adaptive quantization(1).
Resumo:
The SystemVerilog implementation of the Open Verification Methodology (OVM) is exercised on an 8b/10b RTL open core design in the hope of being a simple yet complete exercise to expose the key features of OVM. Emphasis is put onto the actual usage of the verification components rather than a complete verification flow aiming at being of help to readers unfamiliar with OVM seeking to apply the methodology to their own designs. A link that takes you to the complete code is given to reinforce this aim. We found the methodology easy to use but intimidating at first glance specially for someone with little experience in object oriented programming. However it is clear to see the flexibility, portability and reusability of verification code once you manage to give some first steps.
Resumo:
This paper discusses the design, implementation and synthesis of an FFT module that has been specifically optimized for use in the OFDM based Multiband UWB system, although the work is generally applicable to many other OFDM based receiver systems. Previous work has detailed the requirements for the receiver FFT module within the Multiband UWB ODFM based system and this paper draws on those requirements coupled with modern digital architecture principles and low power design criteria to converge on our optimized solution. The FFT design obtained in this paper is also applicable for implementation of the transmitter IFFT module therefore only needing one FFT module for half-duplex operation. The results from this paper enable the baseband designers of the 200Mbit/sec variant of Multiband UWB systems (and indeed other OFDM based receivers) using System-on-Chip (SoC), FPGA and ASIC technology to create cost effective and low power solutions biased toward the competitive consumer electronics market.
Resumo:
The relative fast processing speed requirements in Wireless Personal Area Network (WPAN) consumer based products are often in conflict with their low power and cost requirements. In order to solve this conflict the efficiency and cost effectiveness of these products and the underlying functional modules become paramount. This paper presents a low-cost, simple, yet high performance solution for the receiver Channel Estimator and Equalizer for the Mutiband OFDM (MB-OFDM) system, particularly directed to the WiMedia Consortium Physical Later (ECMA-368) consumer implementation for Wireless-USB and Fast Bluetooth. In this paper, the receiver fixed point performance is measured and the results indicate excellent performance compared to the current literature(1).