43 resultados para Savings
Resumo:
A natural velocity field method for shape optimization of reinforced concrete (RC) flexural members has been demonstrated. The possibility of shape optimization by modifying the shape of an initially rectangular section, in addition to variation of breadth and depth along the length, has been explored. Necessary shape changes have been computed using the sequential quadratic programming (SQP) technique. Genetic algorithm (Goldberg and Samtani 1986) has been used to optimize the diameter and number of main reinforcement bars. A limit-state design approach has been adopted for the nonprismatic RC sections. Such relevant issues as formulation of optimization problem, finite-element modeling, and solution procedure have been described. Three design examples-a simply supported beam, a cantilever beam, and a two-span continuous beam, all under uniformly distributed loads-have been optimized. The results show a significant savings (40-56%) in material and cost and also result in aesthetically pleasing structures. This procedure will lead to considerable cost saving, particularly in cases of mass-produced precast members and a heavy cast-in-place member such as a bridge girder.
Resumo:
An in-situ power monitoring technique for Dynamic Voltage and Threshold scaling (DVTS) systems is proposed which measures total power consumed by load circuit using sleep transistor acting as power sensor. Design details of power monitor are examined using simulation framework in UMC 90nm CMOS process. Experimental results of test chip fabricated in AMS 0.35µm CMOS process are presented. The test chip has variable activity between 0.05 and 0.5 and has PMOS VTH control through nWell contact. Maximum resolution obtained from power monitor is 0.25mV. Overhead of power monitor in terms of its power consumption is 0.244 mW (2.2% of total power of load circuit). Lastly, power monitor is used to demonstrate closed loop DVTS system. DVTS algorithm shows 46.3% power savings using in-situ power monitor.
Resumo:
Several replacement policies for web caches have been proposed and studied extensively in the literature. Different replacement policies perform better in terms of (i) the number of objects found in the cache (cache hit), (ii) the network traffic avoided by fetching the referenced object from the cache, or (iii) the savings in response time. In this paper, we propose a simple and efficient replacement policy (hereafter known as SE) which improves all three performance measures. Trace-driven simulations were done to evaluate the performance of SE. We compare SE with two widely used and efficient replacement policies, namely Least Recently Used (LRU) and Least Unified Value (LUV) algorithms. Our results show that SE performs at least as well as, if not better than, both these replacement policies. Unlike various other replacement policies proposed in literature, our SE policy does not require parameter tuning or a-priori trace analysis and has an efficient and simple implementation that can be incorporated in any existing proxy server or web server with ease.
Resumo:
Multiple Clock Domain processors provide an attractive solution to the increasingly challenging problems of clock distribution and power dissipation. They allow their chips to be partitioned into different clock domains, and each domain’s frequency (voltage) to be independently configured. This flexibility adds new dimensions to the Dynamic Voltage and Frequency Scaling problem, while providing better scope for saving energy and meeting performance demands. In this paper, we propose a compiler directed approach for MCD-DVFS. We build a formal petri net based program performance model, parameterized by settings of microarchitectural components and resource configurations, and integrate it with our compiler passes for frequency selection.Our model estimates the performance impact of a frequency setting, unlike the existing best techniques which rely on weaker indicators of domain performance such as queue occupancies(used by online methods) and slack manifestation for a particular frequency setting (software based methods).We evaluate our method with subsets of SPECFP2000,Mediabench and Mibench benchmarks. Our mean energy savings is 60.39% (versus 33.91% of the best software technique)in a memory constrained system for cache miss dominated benchmarks, and we meet the performance demands.Our ED2 improves by 22.11% (versus 18.34%) for other benchmarks. For a CPU with restricted frequency settings, our energy consumption is within 4.69% of the optimal.
Resumo:
This paper is concerned with the dynamic analysis of flexible,non-linear multi-body beam systems. The focus is on problems where the strains within each elastic body (beam) remain small. Based on geometrically non-linear elasticity theory, the non-linear 3-D beam problem splits into either a linear or non-linear 2-D analysis of the beam cross-section and a non-linear 1-D analysis along the beam reference line. The splitting of the three-dimensional beam problem into two- and one-dimensional parts, called dimensional reduction,results in a tremendous savings of computational effort relative to the cost of three-dimensional finite element analysis,the only alternative for realistic beams. The analysis of beam-like structures made of laminated composite materials requires a much more complicated methodology. Hence, the analysis procedure based on Variational Asymptotic Method (VAM), a tool to carry out the dimensional reduction, is used here.The analysis methodology can be viewed as a 3-step procedure. First, the sectional properties of beams made of composite materials are determined either based on an asymptotic procedure that involves a 2-D finite element nonlinear analysis of the beam cross-section to capture trapeze effect or using strip-like beam analysis, starting from Classical Laminated Shell Theory (CLST). Second, the dynamic response of non-linear, flexible multi-body beam systems is simulated within the framework of energy-preserving and energy-decaying time integration schemes that provide unconditional stability for non-linear beam systems. Finally,local 3-D responses in the beams are recovered, based on the 1-D responses predicted in the second step. Numerical examples are presented and results from this analysis are compared with those available in the literature.
Resumo:
Fault-tolerance is due to the semiconductor technology development important, not only for safety-critical systems but also for general-purpose (non-safety critical) systems. However, instead of guaranteeing that deadlines always are met, it is for general-purpose systems important to minimize the average execution time (AET) while ensuring fault-tolerance. For a given job and a soft (transient) error probability, we define mathematical formulas for AET that includes bus communication overhead for both voting (active replication) and rollback-recovery with checkpointing (RRC). And, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize AET including bus communication overhead when: (1) selecting the number of checkpoints when using RRC, (2) finding the number of processors and job-to-processor assignment when using voting, and (3) defining fault-tolerance scheme (voting or RRC) per job and defining its usage for each job. Experiments demonstrate significant savings in AET.
Resumo:
Wireless networks transmit information from a source to a destination via multiple hops in order to save energy and, thus, increase the lifetime of battery-operated nodes. The energy savings can be especially significant in cooperative transmission schemes, where several nodes cooperate during one hop to forward the information to the next node along a route to the destination. Finding the best multi-hop transmission policy in such a network which determines nodes that are involved in each hop, is a very important problem, but also a very difficult one especially when the physical wireless channel behavior is to be accounted for and exploited. We model the above optimization problem for randomly fading channels as a decentralized control problem – the channel observations available at each node define the information structure, while the control policy is defined by the power and phase of the signal transmitted by each node.In particular, we consider the problem of computing an energy-optimal cooperative transmission scheme in a wireless network for two different channel fading models: (i) slow fading channels, where the channel gains of the links remain the same for a large number of transmissions, and (ii) fast fading channels,where the channel gains of the links change quickly from one transmission to another. For slow fading, we consider a factored class of policies (corresponding to local cooperation between nodes), and show that the computation of an optimal policy in this class is equivalent to a shortest path computation on an induced graph, whose edge costs can be computed in a decentralized manner using only locally available channel state information(CSI). For fast fading, both CSI acquisition and data transmission consume energy. Hence, we need to jointly optimize over both these; we cast this optimization problem as a large stochastic optimization problem. We then jointly optimize over a set of CSI functions of the local channel states, and a corresponding factored class of control policies corresponding to local cooperation between nodes with a local outage constraint. The resulting optimal scheme in this class can again be computed efficiently in a decentralized manner. We demonstrate significant energy savings for both slow and fast fading channels through numerical simulations of randomly distributed networks.
Resumo:
Large instruction windows and issue queues are key to exploiting greater instruction level parallelism in out-of-order superscalar processors. However, the cycle time and energy consumption of conventional large monolithic issue queues are high. Previous efforts to reduce cycle time segment the issue queue and pipeline wakeup. Unfortunately, this results in significant IPC loss. Other proposals which address energy efficiency issues by avoiding only the unnecessary tag-comparisons do not reduce broadcasts. These schemes also increase the issue latency.To address both these issues comprehensively, we propose the Scalable Lowpower Issue Queue (SLIQ). SLIQ augments a pipelined issue queue with direct indexing to mitigate the problem of delayed wakeups while reducing the cycle time. Also, the SLIQ design naturally leads to significant energy savings by reducing both the number of tag broadcasts and comparisons required.A 2 segment SLIQ incurs an average IPC loss of 0.2% over the entire SPEC CPU2000 suite, while achieving a 25.2% reduction in issue latency when compared to a monolithic 128-entry issue queue for an 8-wide superscalar processor. An 8 segment SLIQ improves scalability by reducing the issue latency by 38.3% while incurring an IPC loss of only 2.3%. Further, the 8 segment SLIQ significantly reduces the energy consumption and energy-delay product by 48.3% and 67.4% respectively on average.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
In this paper, we analyze the throughput and energy efficiency performance of user datagram protocol (UDP) using linear, binary exponential, and geometric backoff algorithms at the link layer (LL) on point-to-point wireless fading links. Using a first-order Markov chain representation of the packet success/failure process on fading channels, we derive analytical expressions for throughput and energy efficiency of UDP/LL with and without LL backoff. The analytical results are verified through simulations. We also evaluate the mean delay and delay variation of voice packets and energy efficiency performance over a wireless link that uses UDP for transport of voice packets and the proposed backoff algorithms at the LL. We show that the proposed LL backoff algorithms achieve energy efficiency improvement of the order of 2-3 dB compared to LL with no backoff, without compromising much on the throughput and delay performance at the UDP layer. Such energy savings through protocol means will improve the battery life in wireless mobile terminals.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
We propose a novel technique for reducing the power consumed by the on-chip cache in SNUCA chip multicore platform. This is achieved by what we call a "remap table", which maps accesses to the cache banks that are as close as possible to the cores, on which the processes are scheduled. With this technique, instead of using all the available cache, we use a portion of the cache and allocate lesser cache to the application. We formulate the problem as an energy-delay (ED) minimization problem and solve it offline using a scalable genetic algorithm approach. Our experiments show up to 40% of savings in the memory sub-system power consumption and 47% savings in energy-delay product (ED).
Resumo:
We propose a novel technique for reducing the power consumed by the on-chip cache in SNUCA chip multicore platform. This is achieved by what we call a "remap table", which maps accesses to the cache banks that are as close as possible to the cores, on which the processes are scheduled. With this technique, instead of using all the available cache, we use a portion of the cache and allocate lesser cache to the application. We formulate the problem as an energy-delay (ED) minimization problem and solve it offline using a scalable genetic algorithm approach. Our experiments show up to 40% of savings in the memory sub-system power consumption and 47% savings in energy-delay product (ED).
Resumo:
Two models for AF relaying, namely, fixed gain and fixed power relaying, have been extensively studied in the literature given their ability to harness spatial diversity. In fixed gain relaying, the relay gain is fixed but its transmit power varies as a function of the source-relay channel gain. In fixed power relaying, the relay transmit power is fixed, but its gain varies. We revisit and generalize the fundamental two-hop AF relaying model. We present an optimal scheme in which an average power constrained AF relay adapts its gain and transmit power to minimize the symbol error probability (SEP) at the destination. Also derived are insightful and practically amenable closed-form bounds for the optimal relay gain. We then analyze the SEP of MPSK, derive tight bounds for it, and characterize the diversity order for Rayleigh fading. Also derived is an SEP approximation that is accurate to within 0.1 dB. Extensive results show that the scheme yields significant energy savings of 2.0-7.7 dB at the source and relay. Optimal relay placement for the proposed scheme is also characterized, and is different from fixed gain or power relaying. Generalizations to MQAM and other fading distributions are also discussed.
Resumo:
Dynamic Voltage and Frequency Scaling (DVFS) is a very effective tool for designing trade-offs between energy and performance. In this paper, we use a formal Petri net based program performance model that directly captures both the application and system properties, to find energy efficient DVFS settings for CMP systems, that satisfy a given performance constraint, for SPMD multithreaded programs. Experimental evaluation shows that we achieve significant energy savings, while meeting the performance constraints.