984 resultados para voltage scaling
Resumo:
Multiple Clock Domain processors provide an attractive solution to the increasingly challenging problems of clock distribution and power dissipation. They allow their chips to be partitioned into different clock domains, and each domain’s frequency (voltage) to be independently configured. This flexibility adds new dimensions to the Dynamic Voltage and Frequency Scaling problem, while providing better scope for saving energy and meeting performance demands. In this paper, we propose a compiler directed approach for MCD-DVFS. We build a formal petri net based program performance model, parameterized by settings of microarchitectural components and resource configurations, and integrate it with our compiler passes for frequency selection.Our model estimates the performance impact of a frequency setting, unlike the existing best techniques which rely on weaker indicators of domain performance such as queue occupancies(used by online methods) and slack manifestation for a particular frequency setting (software based methods).We evaluate our method with subsets of SPECFP2000,Mediabench and Mibench benchmarks. Our mean energy savings is 60.39% (versus 33.91% of the best software technique)in a memory constrained system for cache miss dominated benchmarks, and we meet the performance demands.Our ED2 improves by 22.11% (versus 18.34%) for other benchmarks. For a CPU with restricted frequency settings, our energy consumption is within 4.69% of the optimal.
Resumo:
Energy consumption has become a major constraint in providing increased functionality for devices with small form factors. Dynamic voltage and frequency scaling has been identified as an effective approach for reducing the energy consumption of embedded systems. Earlier works on dynamic voltage scaling focused mainly on performing voltage scaling when the CPU is waiting for memory subsystem or concentrated chiefly on loop nests and/or subroutine calls having sufficient number of dynamic instructions. This paper concentrates on coarser program regions and for the first time uses program phase behavior for performing dynamic voltage scaling. Program phases are annotated at compile time with mode switch instructions. Further, we relate the Dynamic Voltage Scaling Problem to the Multiple Choice Knapsack Problem, and use well known heuristics to solve it efficiently. Also, we develop a simple integer linear program formulation for this problem. Experimental evaluation on a set of media applications reveal that our heuristic method obtains a 38% reduction in energy consumption on an average, with a performance degradation of 1% and upto 45% reduction in energy with a performance degradation of 5%. Further, the energy consumed by the heuristic solution is within 1% of the optimal solution obtained from the ILP approach.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
With extensive use of dynamic voltage scaling (DVS) there is increasing need for voltage scalable models. Similarly, leakage being very sensitive to temperature motivates the need for a temperature scalable model as well. We characterize standard cell libraries for statistical leakage analysis based on models for transistor stacks. Modeling stacks has the advantage of using a single model across many gates there by reducing the number of models that need to be characterized. Our experiments on 15 different gates show that we needed only 23 models to predict the leakage across 126 input vector combinations. We investigate the use of neural networks for the combined PVT model, for the stacks, which can capture the effect of inter die, intra gate variations, supply voltage(0.6-1.2 V) and temperature (0 - 100degC) on leakage. Results show that neural network based stack models can predict the PDF of leakage current across supply voltage and temperature accurately with the average error in mean being less than 2% and that in standard deviation being less than 5% across a range of voltage, temperature.
Resumo:
With the emergence of voltage scaling as one of the most powerful power reduction techniques, it has been important to support voltage scalable statistical static timing analysis (SSTA) in deep submicrometer process nodes. In this paper, we propose a single delay model of logic gate using neural network which comprehensively captures process, voltage, and temperature variation along with input slew and output load. The number of simulation programs with integrated circuit emphasis (SPICE) required to create this model over a large voltage and temperature range is found to be modest and 4x less than that required for a conventional table-based approach with comparable accuracy. We show how the model can be used to derive sensitivities required for linear SSTA for an arbitrary voltage and temperature. Our experimentation on ISCAS 85 benchmarks across a voltage range of 0.9-1.1V shows that the average error in mean delay is less than 1.08% and average error in standard deviation is less than 2.85%. The errors in predicting the 99% and 1% probability point are 1.31% and 1%, respectively, with respect to SPICE. The two potential applications of voltage-aware SSTA have been presented, i.e., one for improving the accuracy of timing analysis by considering instance-specific voltage drops in power grids and the other for determining optimum supply voltage for target yield for dynamic voltage scaling applications.
Resumo:
In this paper, we present a novel discrete cosine transform (DCT) architecture that allows aggressive voltage scaling for low-power dissipation, even under process parameter variations with minimal overhead as opposed to existing techniques. Under a scaled supply voltage and/or variations in process parameters, any possible delay errors appear only from the long paths that are designed to be less contributive to output quality. The proposed architecture allows a graceful degradation in the peak SNR (PSNR) under aggressive voltage scaling as well as extreme process variations. Results show that even under large process variations (±3σ around mean threshold voltage) and aggressive supply voltage scaling (at 0.88 V, while the nominal voltage is 1.2 V for a 90-nm technology), there is a gradual degradation of image quality with considerable power savings (71% at PSNR of 23.4 dB) for the proposed architecture, when compared to existing implementations in a 90-nm process technology. © 2006 IEEE.
Resumo:
In this paper, we explore various arithmetic units for possible use in high-speed, high-yield ALUs operated at scaled supply voltage with adaptive clock stretching. We demonstrate that careful logic optimization of the existing arithmetic units (to create hybrid units) indeed make them further amenable to supply voltage scaling. Such hybrid units result from mixing right amount of fast arithmetic into the slower ones. Simulations on different hybrid adder and multipliers in BPTM 70 nm technology show 18%-50% improvements in power compared to standard adders with only 2%-8% increase in die-area at iso-yield. These optimized datapath units can be used to construct voltage scalable robust ALUs that can operate at high clock frequency with minimal performance degradation due to occasional clock stretching. © 2009 IEEE.
Resumo:
In this paper, we propose a system level design approach considering voltage over-scaling (VOS) that achieves error resiliency using unequal error protection of different computation elements, while incurring minor quality degradation. Depending on user specifications and severity of process variations/channel noise, the degree of VOS in each block of the system is adaptively tuned to ensure minimum system power while providing "just-the-right" amount of quality and robustness. This is achieved, by taking into consideration block level interactions and ensuring that under any change of operating conditions, only the "less-crucial" computations, that contribute less to block/system output quality, are affected. The proposed approach applies unequal error protection to various blocks of a system-logic and memory-and spans multiple layers of design hierarchy-algorithm, architecture and circuit. The design methodology when applied to a multimedia subsystem shows large power benefits ( up to 69% improvement in power consumption) at reasonable image quality while tolerating errors introduced due to VOS, process variations, and channel noise.
Resumo:
Static timing analysis provides the basis for setting the clock period of a microprocessor core, based on its worst-case critical path. However, depending on the design, this critical path is not always excited and therefore dynamic timing margins exist that can theoretically be exploited for the benefit of better speed or lower power consumption (through voltage scaling). This paper introduces predictive instruction-based dynamic clock adjustment as a technique to trim dynamic timing margins in pipelined microprocessors. To this end, we exploit the different timing requirements for individual instructions during the dynamically varying program execution flow without the need for complex circuit-level measures to detect and correct timing violations. We provide a design flow to extract the dynamic timing information for the design using post-layout dynamic timing analysis and we integrate the results into a custom cycle-accurate simulator. This simulator allows annotation of individual instructions with their impact on timing (in each pipeline stage) and rapidly derives the overall code execution time for complex benchmarks. The design methodology is illustrated at the microarchitecture level, demonstrating the performance and power gains possible on a 6-stage OpenRISC in-order general purpose processor core in a 28nm CMOS technology. We show that employing instruction-dependent dynamic clock adjustment leads on average to an increase in operating speed by 38% or to a reduction in power consumption by 24%, compared to traditional synchronous clocking, which at all times has to respect the worst-case timing identified through static timing analysis.
Resumo:
In high performance digital systems as well as in RF systems, voltage scaling and modulation techniques have been adopted to achieve a more efficient processing of the energy. The implementation of such techniques relies on a power supply that is capable of rapidly adjusting the system supply voltage. In this paper, a pulsewidth modulation multiphase topology with magnetic coupling is proposed for its use in voltage modulation techniques. Since the magnetic coupling in this topology is done with transformers instead of coupled inductors, the energy storage is reduced and very fast voltage changes are achieved. Advantages and drawbacks of this topology have been previously presented in the literature and in this paper, the design criteria for implementing a power supply for the envelope elimination and restoration technique in an RF system are presented along with an implementation of the power supply.
Resumo:
To consider the energy-aware scheduling problem in computer-controlled systems is necessary to improve the control performance, to use the limited computing resource sufficiently, and to reduce the energy consumption to extend the lifetime of the whole system. In this paper, the scheduling problem of multiple control tasks is discussed based on an adjustable voltage processor. A feedback fuzzy-DVS (dynamic voltage scaling) scheduling architecture is presented by applying technologies of the feedback control and the fuzzy DVS. The simulation results show that, by using the actual utilization as the feedback information to adjust the supply voltage of processor dynamically, the high CPU utilization can be implemented under the precondition of guaranteeing the control performance, whilst the low energy consumption can be achieved as well. The proposed method can be applied to the design in computer-controlled systems based on an adjustable voltage processor.
Resumo:
The impact of source/drain engineering on the performance of a six-transistor (6-T) static random access memory (SRAM) cell, based on 22 nm double-gate (DG) SOI MOSFETs, has been analyzed using mixed-mode simulation, for three different circuit topologies for low voltage operation. The trade-offs associated with the various conflicting requirements relating to read/write/standby operations have been evaluated comprehensively in terms of eight performance metrics, namely retention noise margin, static noise margin, static voltage/current noise margin, write-ability current, write trip voltage/current and leakage current. Optimal design parameters with gate-underlap architecture have been identified to enhance the overall SRAM performance, and the influence of parasitic source/drain resistance and supply voltage scaling has been investigated. A gate-underlap device designed with a spacer-to-straggle (s/sigma) ratio in the range 2-3 yields improved SRAM performance metrics, regardless of circuit topology. An optimal two word-line double-gate SOI 6-T SRAM cell design exhibits a high SNM similar to 162 mV, I-wr similar to 35 mu A and low I-leak similar to 70 pA at V-DD = 0.6 V, while maintaining SNM similar to 30% V-DD over the supply voltage (V-DD) range of 0.4-0.9 V.