87 resultados para Energy optimization


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Traditionally, an instruction decoder is designed as a monolithic structure that inhibit the leakage energy optimization. In this paper, we consider a split instruction decoder that enable the leakage energy optimization. We also propose a compiler scheduling algorithm that exploits instruction slack to increase the simultaneous active and idle duration in instruction decoder. The proposed compiler-assisted scheme obtains a further 14.5% reduction of energy consumption of instruction decoder over a hardware-only scheme for a VLIW architecture. The benefits are 17.3% and 18.7% in the context of a 2-clustered and a 4-clustered VLIW architecture respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Frequent accesses to the register file make it one of the major sources of energy consumption in ILP architectures. The large number of functional units connected to a large unified register file in VLIW architectures make power dissipation in the register file even worse because of the need for a large number of ports. High power dissipation in a relatively smaller area occupied by a register file leads to a high power density in the register file and makes it one of the prime hot-spots. This makes it highly susceptible to the possibility of a catastrophic heatstroke. This in turn impacts the performance and cost because of the need for periodic cool down and sophisticated packaging and cooling techniques respectively. Clustered VLIW architectures partition the register file among clusters of functional units and reduce the number of ports required thereby reducing the power dissipation. However, we observe that the aggregate accesses to register files in clustered VLIW architectures (and associated energy consumption) become very high compared to the centralized VLIW architectures and this can be attributed to a large number of explicit inter-cluster communications. Snooping based clustered VLIW architectures provide very limited but very fast way of inter-cluster communication by allowing some of the functional units to directly read some of the operands from the register file of some of the other clusters. In this paper, we propose instruction scheduling algorithms that exploit the limited snooping capability to reduce the register file energy consumption on an average by 12% and 18% and improve the overall performance by 5% and 11% for a 2-clustered and a 4-clustered machine respectively, over an earlier state-of-the-art clustered scheduling algorithm when evaluated in the context of snooping based clustered VLIW architectures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Miniaturization of devices and the ensuing decrease in the threshold voltage has led to a substantial increase in the leakage component of the total processor energy consumption. Relatively simpler issue logic and the presence of a large number of function units in the VLIW and the clustered VLIW architectures attribute a large fraction of this leakage energy consumption in the functional units. However, functional units are not fully utilized in the VLIW architectures because of the inherent variations in the ILP of the programs. This underutilization is even more pronounced in the context of clustered VLIW architectures because of the contentions for the limited number of slow intercluster communication channels which lead to many short idle cycles.In the past, some architectural schemes have been proposed to obtain leakage energy bene .ts by aggressively exploiting the idleness of functional units. However, presence of many short idle cycles cause frequent transitions from the active mode to the sleep mode and vice-versa and adversely a ffects the energy benefits of a purely hardware based scheme. In this paper, we propose and evaluate a compiler instruction scheduling algorithm that assist such a hardware based scheme in the context of VLIW and clustered VLIW architectures. The proposed scheme exploits the scheduling slacks of instructions to orchestrate the functional unit mapping with the objective of reducing the number of transitions in functional units thereby keeping them off for a longer duration. The proposed compiler-assisted scheme obtains a further 12% reduction of energy consumption of functional units with negligible performance degradation over a hardware-only scheme for a VLIW architecture. The benefits are 15% and 17% in the context of a 2-clustered and a 4-clustered VLIW architecture respectively. Our test bed uses the Trimaran compiler infrastructure.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving the clock speed, reducing the energy consumption of the logic, and making the design simpler, it introduces extra overheads by way of inter-cluster communication. This communication happens over long global wires having high load capacitance which leads to delay in execution and significantly high energy consumption. Inter-cluster communication also introduces many short idle cycles, thereby significantly increasing the overall leakage energy consumption in the functional units. The trend towards miniaturization of devices (and associated reduction in threshold voltage) makes energy consumption in interconnects and functional units even worse, and limits the usability of clustered architectures in smaller technologies. However, technological advancements now permit the design of interconnects and functional units with varying performance and power modes. In this paper, we propose scheduling algorithms that aggregate the scheduling slack of instructions and communication slack of data values to exploit the low-power modes of functional units and interconnects. Finally, we present a synergistic combination of these algorithms that simultaneously saves energy in functional units and interconnects to improves the usability of clustered architectures by achieving better overall energy-performance trade-offs. Even with conservative estimates of the contribution of the functional units and interconnects to the overall processor energy consumption, the proposed combined scheme obtains on average 8% and 10% improvement in overall energy-delay product with 3.5% and 2% performance degradation for a 2-clustered and a 4-clustered machine, respectively. We present a detailed experimental evaluation of the proposed schemes. Our test bed uses the Trimaran compiler infrastructure. (C) 2012 Elsevier Inc. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The optimization of a photovoltaic pumping system based on an induction motor driven pump that is powered by a solar array is presented in this paper. The motor-pump subsystem is analyzed from the point of view of optimizing the power requirement of the induction motor, which has led to an optimum u-f relationship useful in controlling the motor. The complete pumping system is implemented using a dc-dc converter, a three-phase inverter, and an induction motor-pump set. The dc-dc converter is used as a power conditioner and its duty cycle is controlled so as to match the load to the array. A microprocessor-based controller is used to carry out the load-matching.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The ion energy distribution of inductively coupled plasma ion source for focused ion beam application is measured using a four grid retarding field energy analyzer. Without using any Faraday shield, ion energy spread is found to be 50 eV or more. Moreover, the ion energy distribution is found to have double peaks showing that the power coupling to the plasma is not purely inductive, but a strong parasitic capacitive coupling is also present. By optimizing the various source parameters and Faraday shield, ion energy distribution having a single peak, well separated from zero energy and with ion energy spread of 4 eV is achieved. A novel plasma chamber, with proper Faraday shield is designed to ignite the plasma at low RF powers which otherwise would require 300-400 W of RF power. Optimization of various parameters of the ion source to achieve ions with very low energy spread and the experimental results are presented in this article. (C) 2010 Elsevier Ltd. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes a hybrid solar cooking system where the solar energy is transported to the kitchen. The thermal energy source is used to supplement the Liquefied Petroleum Gas (LPG) that is in common use in kitchens. Solar energy is transferred to the kitchen by means of a circulating fluid. Energy collected from sun is maximized by changing the flow rate dynamically. This paper proposes a concept of maximum power point tracking (MPPT) for the solar thermal collector. The diameter of the pipe is selected to optimize the overall energy transfer. Design and sizing of different components of the system are explained. Concept of MPPT is validated with simulation and experimental results. (C) 2010 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Wireless adhoc networks transmit information from a source to a destination via multiple hops in order to save energy and, thus, increase the lifetime of battery-operated nodes. The energy savings can be especially significant in cooperative transmission schemes, where several nodes cooperate during one hop to forward the information to the next node along a route to the destination. Finding the best multi-hop transmission policy in such a network which determines nodes that are involved in each hop, is a very important problem, but also a very difficult one especially when the physical wireless channel behavior is to be accounted for and exploited. We model the above optimization problem for randomly fading channels as a decentralized control problem - the channel observations available at each node define the information structure, while the control policy is defined by the power and phase of the signal transmitted by each node. In particular, we consider the problem of computing an energy-optimal cooperative transmission scheme in a wireless network for two different channel fading models: (i) slow fading channels, where the channel gains of the links remain the same for a large number of transmissions, and (ii) fast fading channels, where the channel gains of the links change quickly from one transmission to another. For slow fading, we consider a factored class of policies (corresponding to local cooperation between nodes), and show that the computation of an optimal policy in this class is equivalent to a shortest path computation on an induced graph, whose edge costs can be computed in a decentralized manner using only locally available channel state information (CSI). For fast fading, both CSI acquisition and data transmission consume energy. Hence, we need to jointly optimize over both these; we cast this optimization problem as a large stochastic optimization problem. We then jointly optimize over a set of CSI functions of the local channel states, and a c- - orresponding factored class of control poli.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, non-linear programming techniques are applied to the problem of controlling the vibration pattern of a stretched string. First, the problem of finding the magnitudes of two control forces applied at two points l1 and l2 on the string to reduce the energy of vibration over the interval (l1, l2) relative to the energy outside the interval (l1, l2) is considered. For this problem the relative merits of various methods of non-linear programming are compared. The more complicated problem of finding the positions and magnitudes of two control forces to obtain the desired energy pattern is then solved by using the slack unconstrained minimization technique with the Fletcher-Powell search. In the discussion of the results it is shown that the position of the control force is very important in controlling the energy pattern of the string.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of the real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use edge-weighted connectivity graph for ranking the residue sites with reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from protein data bank (PDB), and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to Basic Local Alignment Search Tool (BLAST), we obtained some sequences from non-redundant protein sequence database that are similar to ours with an E-value of the order of 10(-7). In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space, once it is able to perform the function.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a new computationally efficient method for large-scale polypeptide folding using coarse-grained elastic networks and gradient-based continuous optimization techniques. The folding is governed by minimization of energy based on Miyazawa–Jernigan contact potentials. Using this method we are able to substantially reduce the computation time on ordinary desktop computers for simulation of polypeptide folding starting from a fully unfolded state. We compare our results with available native state structures from Protein Data Bank (PDB) for a few de-novo proteins and two natural proteins, Ubiquitin and Lysozyme. Based on our simulations we are able to draw the energy landscape for a small de-novo protein, Chignolin. We also use two well known protein structure prediction software, MODELLER and GROMACS to compare our results. In the end, we show how a modification of normal elastic network model can lead to higher accuracy and lower time required for simulation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The notion of optimization is inherent in protein design. A long linear chain of twenty types of amino acid residues are known to fold to a 3-D conformation that minimizes the combined inter-residue energy interactions. There are two distinct protein design problems, viz. predicting the folded structure from a given sequence of amino acid monomers (folding problem) and determining a sequence for a given folded structure (inverse folding problem). These two problems have much similarity to engineering structural analysis and structural optimization problems respectively. In the folding problem, a protein chain with a given sequence folds to a conformation, called a native state, which has a unique global minimum energy value when compared to all other unfolded conformations. This involves a search in the conformation space. This is somewhat akin to the principle of minimum potential energy that determines the deformed static equilibrium configuration of an elastic structure of given topology, shape, and size that is subjected to certain boundary conditions. In the inverse-folding problem, one has to design a sequence with some objectives (having a specific feature of the folded structure, docking with another protein, etc.) and constraints (sequence being fixed in some portion, a particular composition of amino acid types, etc.) while obtaining a sequence that would fold to the desired conformation satisfying the criteria of folding. This requires a search in the sequence space. This is similar to structural optimization in the design-variable space wherein a certain feature of structural response is optimized subject to some constraints while satisfying the governing static or dynamic equilibrium equations. Based on this similarity, in this work we apply the topology optimization methods to protein design, discuss modeling issues and present some initial results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Determining the sequence of amino acid residues in a heteropolymer chain of a protein with a given conformation is a discrete combinatorial problem that is not generally amenable for gradient-based continuous optimization algorithms. In this paper we present a new approach to this problem using continuous models. In this modeling, continuous "state functions" are proposed to designate the type of each residue in the chain. Such a continuous model helps define a continuous sequence space in which a chosen criterion is optimized to find the most appropriate sequence. Searching a continuous sequence space using a deterministic optimization algorithm makes it possible to find the optimal sequences with much less computation than many other approaches. The computational efficiency of this method is further improved by combining it with a graph spectral method, which explicitly takes into account the topology of the desired conformation and also helps make the combined method more robust. The continuous modeling used here appears to have additional advantages in mimicking the folding pathways and in creating the energy landscapes that help find sequences with high stability and kinetic accessibility. To illustrate the new approach, a widely used simplifying assumption is made by considering only two types of residues: hydrophobic (H) and polar (P). Self-avoiding compact lattice models are used to validate the method with known results in the literature and data that can be practically obtained by exhaustive enumeration on a desktop computer. We also present examples of sequence design for the HP models of some real proteins, which are solved in less than five minutes on a single-processor desktop computer Some open issues and future extensions are noted.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz., instruction decoder and register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides 14.5% and 17.3% benefit in the decoder power consumption on an average over a purely hardware based scheme in the context of 2-clustered and 4-clustered VLIW machines. In the case of register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on an average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with performance improvement of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for 2-clustered (4-clustered) VLIW machine. (C) 2010 Elsevier B.V. All rights reserved.