60 results for dynamic performance appraisal
Abstract:
The increasing variability in device leakage has made the design of keepers for wide OR structures a challenging task. The conventional feedback keepers (CONV) can no longer improve the performance of wide dynamic gates for future technologies. In this paper, we propose an adaptive keeper technique called rate sensing keeper (RSK) that enables faster switching and tracks the variation across different process corners. It can switch up to 1.9x faster (for 20 legs) than CONV and can scale up to 32 legs, as against 20 legs for CONV, in a 130-nm 1.2-V process. The delay tracking is within 8% across the different process corners. We demonstrate the circuit operation of RSK using a 32 x 8 register file implemented in an industrial 130-nm 1.2-V CMOS process. The performance of individual dynamic logic gates is also evaluated on chip for various keeper techniques. We show that the RSK technique gives superior performance compared to alternatives such as the conditional keeper (CKP) and the current mirror-based keeper (LCR).
Abstract:
A new design technique for an SVC-based power system damping controller has been proposed. The controller attempts to place all plant poles within a specified region on the s-plane to guarantee the desired closed loop performance. The use of Horowitz's quantitative feedback theory (QFT) permits the design of a 'fixed gain controller' that maintains its performance in spite of large variations in the plant parameters during its normal course of operation. The required controller parameters are arrived at by solving an optimization problem that incorporates the control specifications. The performance of this robust controller has been evaluated on a single machine infinite bus system equipped with a mid point SVC, and the results are shown to be consistent with the expected performance of the stabilizer. (C) 1998 Elsevier Science S.A. All rights reserved.
Abstract:
Just-in-Time (JIT) compilers for Java can be augmented by making use of runtime profile information to produce better quality code and hence achieve higher performance. In a JIT compilation environment, the profile information obtained can be readily exploited in the same run to aid recompilation and optimization of frequently executed (hot) methods. This paper discusses a low-overhead path profiling scheme for dynamically profiling JIT-produced native code. The profile information is used in recompilation during a subsequent invocation of the hot method. During recompilation, tree regions along the hot paths are enlarged and instruction scheduling at the superblock level is performed. We have used the open source LaTTe JIT compiler framework for our implementation. Our results on a SPARC platform for SPEC JVM98 benchmarks indicate that (i) there is a significant reduction in the number of tree regions along the hot paths, and (ii) profile-aided recompilation in LaTTe achieves performance comparable to that of adaptive LaTTe in spite of retranslation and profiling overheads.
Abstract:
In this paper we propose a new method of data handling for web servers. We call this method Network Aware Buffering and Caching (NABC for short). NABC reduces data copies in a web server's data sending path by doing three things: (1) laying out the data in main memory so that protocol processing can be done without data copies; (2) keeping a unified cache of data in the kernel and ensuring safe access to it by the various processes and the kernel; and (3) passing only the necessary metadata between processes, so that the bulk data handling time spent during IPC is reduced. We realize NABC by implementing a set of system calls and a user library. The end product of the implementation is a set of APIs specifically designed for use by web servers. We port an in-house web server called SWEET to the NABC APIs and evaluate performance using a range of workloads, both simulated and real. The results show a gain of 12% to 21% in throughput for static file serving and a 1.6 to 4 times gain in throughput for lightweight dynamic content serving for a server using the NABC APIs over one using the UNIX APIs.
Abstract:
In this paper a new parallel algorithm for nonlinear transient dynamic analysis of large structures is presented. An unconditionally stable Newmark-beta method (constant average acceleration technique) is employed for time integration. The proposed parallel algorithm has been devised within the broad framework of domain decomposition techniques. However, unlike most existing parallel algorithms devised for structural dynamic applications, which are derived using nonoverlapped domains, the proposed algorithm uses overlapped domains. The parallel overlapped domain decomposition algorithm is formulated by splitting the mass, damping and stiffness matrices arising from the finite element discretisation of a given structure. A predictor-corrector scheme is formulated for iteratively improving the solution in each step. A computer program based on the proposed algorithm has been developed and implemented with the message passing interface as the software development environment. The PARAM-10000 MIMD parallel computer has been used to evaluate its performance. Numerical experiments have been conducted to validate the proposed parallel algorithm and to evaluate its performance. Comparisons have been made with the conventional nonoverlapped domain decomposition algorithms. Numerical studies indicate that the proposed algorithm is superior in performance to the conventional domain decomposition algorithms. (C) 2003 Elsevier Ltd. All rights reserved.
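As an illustration of the time integrator named in this abstract, the following is a minimal sketch of the Newmark-beta method with constant average acceleration (beta = 1/4, gamma = 1/2). It assumes a linear single-degree-of-freedom system m*u'' + c*u' + k*u = f(t), which is a simplification: the paper itself treats nonlinear, multi-degree-of-freedom structures with parallel domain decomposition, none of which is reproduced here. The function name `newmark_sdof` and all parameter names are hypothetical.

```python
def newmark_sdof(m, c, k, force, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Integrate m*u'' + c*u' + k*u = force(t) with the Newmark-beta method.

    Illustrative single-DOF, linear sketch (an assumption, not the paper's
    parallel nonlinear solver). beta=1/4, gamma=1/2 gives the constant
    average acceleration variant. Returns the list of displacements.
    """
    # Initial acceleration from the equation of motion at t = 0.
    a = (force(0.0) - c * v0 - k * u0) / m
    u, v = u0, v0
    # Effective stiffness is constant for a linear system.
    k_eff = k + gamma / (beta * dt) * c + m / (beta * dt ** 2)
    history = [u]
    for i in range(1, n_steps + 1):
        t = i * dt
        # Effective load built from the known state at the previous step.
        rhs = (force(t)
               + m * (u / (beta * dt ** 2) + v / (beta * dt)
                      + (1.0 / (2.0 * beta) - 1.0) * a)
               + c * (gamma / (beta * dt) * u + (gamma / beta - 1.0) * v
                      + dt * (gamma / (2.0 * beta) - 1.0) * a))
        u_new = rhs / k_eff
        # Recover acceleration and velocity from the Newmark update formulas.
        a_new = ((u_new - u) / (beta * dt ** 2) - v / (beta * dt)
                 - (1.0 / (2.0 * beta) - 1.0) * a)
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, v, a = u_new, v_new, a_new
        history.append(u)
    return history


# Free vibration of an undamped unit oscillator released from u = 1.
displacements = newmark_sdof(1.0, 0.0, 1.0, lambda t: 0.0, 1.0, 0.0, 0.001, 1000)
```

For an undamped linear oscillator this variant should keep the amplitude bounded even with a coarse time step, which is what "unconditionally stable" refers to; accuracy still depends on the step size.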
Abstract:
In this paper, we investigate the effect of vacuum sealing the backside cavity of a Capacitive Micromachined Ultrasonic Transducer (CMUT). The presence or absence of air inside the cavity has a marked effect upon the system parameters, such as the natural frequency, damping, and the pull-in voltage. The presence of vacuum inside the cavity of the device causes a reduction in the effective gap height which leads to a reduction in the pull-in voltage. We carry out ANSYS simulations to quantify this reduction. The presence of vacuum inside the cavity of the device causes stress stiffening of the membrane, which changes the natural frequency of the device. A prestressed modal analysis is carried out to determine the change in natural frequency due to stress stiffening. The equivalent circuit method is used to evaluate the performance of the device in the receiver mode. The lumped parameters of the device are obtained and an equivalent circuit model of the device is constructed to determine the open circuit receiving sensitivity of the device. The effect of air in the cavity is included by incorporating an equivalent compliance and an equivalent resistance in the equivalent circuit.
Abstract:
A dynamic model of the COREX melter gasifier is developed to study the transient behavior of the furnace. The effect of pulse disturbance and step disturbance on the process performance has been studied. This study shows that the effect of pulse disturbance decays asymptotically. The step change brings the system to a new steady state after a delay of about 5 hours. The dynamic behavior of the melter gasifier with respect to a shutdown/blow-on condition and the effect of tapping are also studied. The results show that the time response of the melter gasifier is much less than that of a blast furnace.
Abstract:
Earlier studies have exploited statistical multiplexing of flows in the core of the Internet to reduce the buffer requirement in routers. Reducing the memory requirement of routers is important as it enables an improvement in performance and, at the same time, a decrease in cost. In this paper, we observe that the links in the core of the Internet are typically over-provisioned and that this can be exploited to reduce the buffering requirement in routers. The small on-chip memory of a network processor (NP) can be effectively used to buffer packets during most regimes of traffic. We propose a dynamic buffering strategy which buffers packets in the receive and transmit buffers of an NP when the memory requirement is low. When the buffer requirement increases due to bursts in the traffic, memory is allocated to packets in the off-chip DRAM. This scheme effectively mitigates the DRAM access bottleneck, as only a part of the traffic is stored in the DRAM. We build a Petri net model and evaluate the proposed scheme with traffic resembling that of the Internet core. At 77% link utilization, the dynamic buffering scheme has a drop rate of just 0.65%, whereas traditional DRAM buffering has a 4.64% packet drop rate. Even with a high link utilization of 90%, which rarely happens in the core, our dynamic buffering results in a packet drop rate of only 2.17%, while supporting a throughput of 7.39 Gbps. We study the proposed scheme under different conditions to understand the provisioning of processing threads and to determine the queue length at which packets must be buffered in the DRAM. We show that the proposed dynamic buffering strategy drastically reduces the buffering requirement while still maintaining low packet drop rates.
Abstract:
In this paper we are concerned with finding the maximum throughput that a mobile ad hoc network can support. Even when nodes are stationary, the problem of determining the capacity region has long been known to be NP-hard. Mobility introduces an additional dimension of complexity because nodes now also have to decide when they should initiate route discovery. Since route discovery involves communication and computation overhead, it should not be invoked very often. On the other hand, mobility implies that routes are bound to become stale resulting in sub-optimal performance if routes are not updated. We attempt to gain some understanding of these effects by considering a simple one-dimensional network model. The simplicity of our model allows us to use stochastic dynamic programming (SDP) to find the maximum possible network throughput with ideal routing and medium access control (MAC) scheduling. Using the optimal value as a benchmark, we also propose and evaluate the performance of a simple threshold-based heuristic. Unlike the optimal policy which requires considerable state information, the heuristic is very simple to implement and is not overly sensitive to the threshold value used. We find empirical conditions for our heuristic to be near-optimal as well as network scenarios when our simple heuristic does not perform very well. We provide extensive numerical and simulation results for different parameter settings of our model.
Abstract:
Energy consumption has become a major constraint in providing increased functionality for devices with small form factors. Dynamic voltage and frequency scaling has been identified as an effective approach for reducing the energy consumption of embedded systems. Earlier works on dynamic voltage scaling focused mainly on performing voltage scaling when the CPU is waiting for the memory subsystem, or concentrated chiefly on loop nests and/or subroutine calls having a sufficient number of dynamic instructions. This paper concentrates on coarser program regions and for the first time uses program phase behavior for performing dynamic voltage scaling. Program phases are annotated at compile time with mode switch instructions. Further, we relate the Dynamic Voltage Scaling Problem to the Multiple Choice Knapsack Problem, and use well known heuristics to solve it efficiently. Also, we develop a simple integer linear program (ILP) formulation for this problem. Experimental evaluation on a set of media applications reveals that our heuristic method obtains a 38% reduction in energy consumption on average with a performance degradation of 1%, and up to 45% reduction in energy with a performance degradation of 5%. Further, the energy consumed by the heuristic solution is within 1% of the optimal solution obtained from the ILP approach.
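The mapping to the Multiple Choice Knapsack Problem mentioned in this abstract can be pictured as follows: each program phase must pick exactly one voltage/frequency mode, each mode has an execution time and an energy cost, and total time must stay within a budget (for example, the baseline time plus the allowed performance degradation). The sketch below is one plausible greedy heuristic for that formulation, offered only as an illustration; the abstract does not say which heuristics the authors used. The function name, the data layout and the numbers in the usage example are all hypothetical.

```python
def greedy_mode_assignment(phases, time_budget):
    """Pick one (time, energy) mode per phase, greedily minimizing energy.

    phases: list of phases, each a list of (time, energy) options.
    Hypothetical heuristic (not taken from the paper): start from the
    fastest mode of every phase, then repeatedly apply the single mode
    change with the best energy saved per unit of extra time that still
    fits in the time budget. Returns the chosen option index per phase.
    """
    choice = [min(range(len(opts)), key=lambda j: opts[j][0]) for opts in phases]
    total_time = sum(opts[j][0] for opts, j in zip(phases, choice))
    if total_time > time_budget:
        raise ValueError("even the fastest modes exceed the time budget")
    while True:
        best = None  # (ratio, phase index, option index, extra time)
        for i, opts in enumerate(phases):
            t_cur, e_cur = opts[choice[i]]
            for j, (t, e) in enumerate(opts):
                saved, extra = e_cur - e, t - t_cur
                if saved <= 0 or total_time + extra > time_budget:
                    continue  # no energy gain, or breaks the budget
                ratio = saved / extra if extra > 0 else float("inf")
                if best is None or ratio > best[0]:
                    best = (ratio, i, j, extra)
        if best is None:
            return choice  # no improving, feasible change remains
        _, i, j, extra = best
        choice[i] = j
        total_time += extra


# Two phases with made-up (time, energy) options per mode.
example_phases = [
    [(1.0, 10.0), (2.0, 6.0), (4.0, 5.0)],
    [(1.0, 8.0), (3.0, 2.0)],
]
selected = greedy_mode_assignment(example_phases, time_budget=5.0)
```

A greedy pass like this is not guaranteed to be optimal in general, which is why an exact formulation such as the ILP mentioned in the abstract is useful as a reference point.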
Abstract:
Based on dynamic inversion, a relatively straightforward approach is presented in this paper for nonlinear flight control design of high performance aircraft, which does not require the normal and lateral acceleration commands to be first transferred to body rates before computing the required control inputs. This leads to substantial improvement of the tracking response. Promising results are obtained from six degree-of-freedom simulation studies of F-16 aircraft, which are found to be superior to those of an existing approach (which is also based on dynamic inversion). The new approach has two potential benefits, namely reduced oscillatory response (including elimination of non-minimum phase behavior) and reduced control magnitude. Next, a model-following neuron-adaptive design is added to the nominal design in order to assure robust performance in the presence of parameter inaccuracies in the model. Note that in this approach the model update takes place adaptively online and hence it is philosophically similar to indirect adaptive control. However, unlike a typical indirect adaptive control approach, there is no need to update the individual parameters explicitly. Instead, the inaccuracy in the system output dynamics is captured directly and then used in modifying the control. This leads to faster adaptation, which helps in stabilizing the unstable plant more quickly. The robustness study from a large number of simulations shows that the adaptive design has a good amount of robustness with respect to the expected parameter inaccuracies in the model.
Abstract:
We study the problem of optimal bandwidth allocation in communication networks. We consider a queueing model with two queues to which traffic from different competing flows arrives. The queue length at the buffers is observed every T instants of time, on the basis of which a decision on the amount of bandwidth to be allocated to each buffer for the next T instants is made. We consider a class of closed-loop feedback policies for the system and use a two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithm to find an optimal policy within the prescribed class. We study the performance of the proposed algorithm in a numerical setting. Our algorithm is found to exhibit good performance.
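To make the SPSA idea in this abstract concrete, the sketch below shows the basic simultaneous perturbation gradient estimate: all parameters are perturbed at once by random +/-1 amounts, and two cost evaluations yield an estimate of the whole gradient. This is a plain one-timescale version with constant gains applied to a deterministic toy cost, all of which are assumptions made for illustration; the paper uses a two-timescale variant with a simulated queueing cost, which is not reproduced here. The names `spsa_minimize` and `toy_cost` are hypothetical.

```python
import random


def spsa_minimize(cost, theta, n_iter=200, a=0.1, c=0.1, seed=0):
    """Minimize cost(theta) with a basic SPSA iteration.

    Illustrative assumptions: constant step size `a` and perturbation
    size `c` (practical SPSA usually decays both), and a fixed seed so
    the run is repeatable. Returns the final parameter list.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(n_iter):
        # One random +/-1 perturbation per parameter, applied simultaneously.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        diff = cost(plus) - cost(minus)
        # Two cost evaluations give an estimate of every gradient component.
        theta = [t - a * diff / (2.0 * c * d) for t, d in zip(theta, delta)]
    return theta


def toy_cost(theta):
    """Hypothetical stand-in for a simulated cost; minimum at (3, -2)."""
    return (theta[0] - 3.0) ** 2 + (theta[1] + 2.0) ** 2


tuned = spsa_minimize(toy_cost, [0.0, 0.0])
```

The appeal of this estimator is that it needs only two cost evaluations per iteration regardless of the number of parameters, which matters when each evaluation is a simulation run.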
Abstract:
Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, this increased fetch bandwidth cannot be exploited unless pipeline stages further downstream correspondingly improve. In particular, register renaming a large number of instructions per cycle is difficult. A large instruction window, needed to receive multiple basic blocks per cycle, will slow down dependence resolution and instruction issue. This paper addresses these and related issues by proposing (i) partitioning of the instruction window into multiple blocks, each holding a dynamic code sequence; (ii) logical partitioning of the register file into a global file and several local files, the latter holding registers local to a dynamic code sequence; (iii) the dynamic recording and reuse of register renaming information for registers local to a dynamic code sequence. Performance studies show these mechanisms improve performance over traditional superscalar processors by factors ranging from 1.5 to a little over 3 for the SPEC Integer programs. Next, it is observed that several of the loops in the benchmarks display vector-like behavior during execution, even if the static loop bodies are likely complex for compile-time vectorization. A dynamic loop vectorization mechanism that builds on top of the above mechanisms is briefly outlined. The mechanism vectorizes up to 60% of the dynamic instructions for some programs, albeit the average number of iterations per loop is quite small.
Abstract:
A generalized power tracking algorithm that minimizes the power consumption of digital circuits by dynamic control of the supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense the total power consumed by the load circuit across process, voltage, and temperature corners. Design details and performance of the power monitor and tracking algorithm are examined with a simulation framework developed using the UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for an activity of 0.02 and 22.4% for an activity of 0.04. Experimental results from a test chip fabricated in the AMS 350 nm process show power savings of 46.3% and 65% for a load circuit operating in the super threshold and near sub-threshold regions, respectively. The measured resolution of the power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoffs for the power monitor are also discussed in this paper.