31 resultados para Sustainability performance framework
em Indian Institute of Science - Bangalore - Índia
Resumo:
We describe a System-C based framework we are developing, to explore the impact of various architectural and microarchitectural level parameters of the on-chip interconnection network elements on its power and performance. The framework enables one to choose from a variety of architectural options like topology, routing policy, etc., as well as allows experimentation with various microarchitectural options for the individual links like length, wire width, pitch, pipelining, supply voltage and frequency. The framework also supports a flexible traffic generation and communication model. We provide preliminary results of using this framework to study the power, latency and throughput of a 4x4 multi-core processing array using mesh, torus and folded torus, for two different communication patterns of dense and sparse linear algebra. The traffic consists of both Request-Response messages (mimicing cache accesses)and One-Way messages. We find that the average latency can be reduced by increasing the pipeline depth, as it enables higher link frequencies. We also find that there exists an optimum degree of pipelining which minimizes energy-delay product.
Resumo:
Elasticity in cloud systems provides the flexibility to acquire and relinquish computing resources on demand. However, in current virtualized systems resource allocation is mostly static. Resources are allocated during VM instantiation and any change in workload leading to significant increase or decrease in resources is handled by VM migration. Hence, cloud users tend to characterize their workloads at a coarse grained level which potentially leads to under-utilized VM resources or under performing application. A more flexible and adaptive resource allocation mechanism would benefit variable workloads, such as those characterized by web servers. In this paper, we present an elastic resources framework for IaaS cloud layer that addresses this need. The framework provisions for application workload forecasting engine, that predicts at run-time the expected demand, which is input to the resource manager to modulate resource allocation based on the predicted demand. Based on the prediction errors, resources can be over-allocated or under-allocated as compared to the actual demand made by the application. Over-allocation leads to unused resources and under allocation could cause under performance. To strike a good trade-off between over-allocation and under-performance we derive an excess cost model. In this model excess resources allocated are captured as over-allocation cost and under-allocation is captured as a penalty cost for violating application service level agreement (SLA). Confidence interval for predicted workload is used to minimize this excess cost with minimal effect on SLA violations. An example case-study for an academic institute web server workload is presented. Using the confidence interval to minimize excess cost, we achieve significant reduction in resource allocation requirement while restricting application SLA violations to below 2-3%.
Resumo:
We develop a communication theoretic framework for modeling 2-D magnetic recording channels. Using the model, we define the signal-to-noise ratio (SNR) for the channel considering several physical parameters, such as the channel bit density, code rate, bit aspect ratio, and noise parameters. We analyze the problem of optimizing the bit aspect ratio for maximizing SNR. The read channel architecture comprises a novel 2-D joint self-iterating equalizer and detection system with noise prediction capability. We evaluate the system performance based on our channel model through simulations. The coded performance with the 2-D equalizer detector indicates similar to 5.5 dB of SNR gain over uncoded data.
Resumo:
This article describes a new performance-based approach for evaluating the return period of seismic soil liquefaction based on standard penetration test (SPT) and cone penetration test (CPT) data. The conventional liquefaction evaluation methods consider a single acceleration level and magnitude and these approaches fail to take into account the uncertainty in earthquake loading. The seismic hazard analysis based on the probabilistic method clearly shows that a particular acceleration value is being contributed by different magnitudes with varying probability. In the new method presented in this article, the entire range of ground shaking and the entire range of earthquake magnitude are considered and the liquefaction return period is evaluated based on the SPT and CPT data. This article explains the performance-based methodology for the liquefaction analysis – starting from probabilistic seismic hazard analysis (PSHA) for the evaluation of seismic hazard and the performance-based method to evaluate the liquefaction return period. A case study has been done for Bangalore, India, based on SPT data and converted CPT values. The comparison of results obtained from both the methods have been presented. In an area of 220 km2 in Bangalore city, the site class was assessed based on large number of borehole data and 58 Multi-channel analysis of surface wave survey. Using the site class and peak acceleration at rock depth from PSHA, the peak ground acceleration at the ground surface was estimated using probabilistic approach. The liquefaction analysis was done based on 450 borehole data obtained in the study area. The results of CPT match well with the results obtained from similar analysis with SPT data.
Resumo:
The random early detection (RED) technique has seen a lot of research over the years. However, the functional relationship between RED performance and its parameters viz,, queue weight (omega(q)), marking probability (max(p)), minimum threshold (min(th)) and maximum threshold (max(th)) is not analytically availa ble. In this paper, we formulate a probabilistic constrained optimization problem by assuming a nonlinear relationship between the RED average queue length and its parameters. This problem involves all the RED parameters as the variables of the optimization problem. We use the barrier and the penalty function approaches for its Solution. However (as above), the exact functional relationship between the barrier and penalty objective functions and the optimization variable is not known, but noisy samples of these are available for different parameter values. Thus, for obtaining the gradient and Hessian of the objective, we use certain recently developed simultaneous perturbation stochastic approximation (SPSA) based estimates of these. We propose two four-timescale stochastic approximation algorithms based oil certain modified second-order SPSA updates for finding the optimum RED parameters. We present the results of detailed simulation experiments conducted over different network topologies and network/traffic conditions/settings, comparing the performance of Our algorithms with variants of RED and a few other well known adaptive queue management (AQM) techniques discussed in the literature.
Resumo:
This paper presents a power, latency and throughput trade-off study on NoCs by varying microarchitectural (e.g. pipelining) and circuit level (e.g. frequency and voltage) parameters. We change pipelining depth, operating frequency and supply voltage for 3 example NoCs - 16 node 2D Torus, Tree network and Reduced 2D Torus. We use an in-house NoC exploration framework capable of topology generation and comparison using parameterized models of Routers and links developed in SystemC. The framework utilizes interconnect power and delay models from a low-level modelling tool called Intacte[1]1. We find that increased pipelining can actually reduce latency. We also find that there exists an optimal degree of pipelining which is the most energy efficient in terms of minimizing energy-delay product.
Resumo:
Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. With the requirement to process packets at line rates, high-performance routers need to forward millions of packets every second with each packet needing up to seven memory accesses. Earlier work shows that a single cache for the nodes of a trie can reduce the number of external memory accesses. It is observed that the locality characteristics of the level-one nodes of a trie are significantly different from those of lower level nodes. Hence, we propose a heterogeneously segmented cache architecture (HSCA) which uses separate caches for level-one and lower level nodes, each with carefully chosen sizes. Besides reducing misses, segmenting the cache allows us to focus on optimizing the more frequently accessed level-one node segment. We find that due to the nonuniform distribution of nodes among cache sets, the level-one nodes cache is susceptible t high conflict misses. We reduce conflict misses by introducing a novel two-level mapping-based cache placement framework. We also propose an elegant way to fit the modified placement function into the cache organization with minimal increase in access time. Further, we propose an attribute preserving trace generation methodology which emulates real traces and can generate traces with varying locality. Performanc results reveal that our HSCA scheme results in a 32 percent speedup in average memory access time over a unified nodes cache. Also, HSC outperforms IHARC, a cache for lookup results, with as high as a 10-fold speedup in average memory access time. Two-level mappin further enhances the performance of the base HSCA by up to 13 percent leading to an overall improvement of up to 40 percent over the unified scheme.
Resumo:
Glaucoma is the second leading cause of blindness worldwide. Often, the optic nerve head (ONH) glaucomatous damage and ONH changes occur prior to visual field loss and are observable in vivo. Thus, digital image analysis is a promising choice for detecting the onset and/or progression of glaucoma. In this paper, we present a new framework for detecting glaucomatous changes in the ONH of an eye using the method of proper orthogonal decomposition (POD). A baseline topograph subspace was constructed for each eye to describe the structure of the ONH of the eye at a reference/baseline condition using POD. Any glaucomatous changes in the ONH of the eye present during a follow-up exam were estimated by comparing the follow-up ONH topography with its baseline topograph subspace representation. Image correspondence measures of L-1-norm and L-2-norm, correlation, and image Euclidean distance (IMED) were used to quantify the ONH changes. An ONH topographic library built from the Louisiana State University Experimental Glaucoma study was used to evaluate the performance of the proposed method. The area under the receiver operating characteristic curves (AUCs) was used to compare the diagnostic performance of the POD-induced parameters with the parameters of the topographic change analysis (TCA) method. The IMED and L-2-norm parameters in the POD framework provided the highest AUC of 0.94 at 10 degrees. field of imaging and 0.91 at 15 degrees. field of imaging compared to the TCA parameters with an AUC of 0.86 and 0.88, respectively. The proposed POD framework captures the instrument measurement variability and inherent structure variability and shows promise for improving our ability to detect glaucomatous change over time in glaucoma management.
Resumo:
The peaking of most oil reserves and impending climate change are critically driving the adoption of solar photovoltaic's (PV) as a sustainable renewable and eco-friendly alternative. Ongoing material research has yet to find a breakthrough in significantly raising the conversion efficiency of commercial PV modules. The installation of PV systems for optimum yield is primarily dictated by its geographic location (latitude and available solar insolation) and installation design (tilt, orientation and altitude) to maximize solar exposure. However, once these parameters have been addressed appropriately, there are other depending factors that arise in determining the system performance (efficiency and output). Dust is the lesser acknowledged factor that significantly influences the performance of the PV installations. This paper provides an appraisal on the current status of research in studying the impact of dust on PV system performance and identifies challenges to further pertinent research. A framework to understand the various factors that govern the settling/assimilation of dust and likely mitigation measures have been discussed in this paper. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
Our ability to infer the protein quaternary structure automatically from atom and lattice information is inadequate, especially for weak complexes, and heteromeric quaternary structures. Several approaches exist, but they have limited performance. Here, we present a new scheme to infer protein quaternary structure from lattice and protein information, with all-around coverage for strong, weak and very weak affinity homomeric and heteromeric complexes. The scheme combines naive Bayes classifier and point group symmetry under Boolean framework to detect quaternary structures in crystal lattice. It consistently produces >= 90% coverage across diverse benchmarking data sets, including a notably superior 95% coverage for recognition heteromeric complexes, compared with 53% on the same data set by current state-of-the-art method. The detailed study of a limited number of prediction-failed cases offers interesting insights into the intriguing nature of protein contacts in lattice. The findings have implications for accurate inference of quaternary states of proteins, especially weak affinity complexes.
Resumo:
The statistical performance analysis of ESPRIT, root-MUSIC, minimum-norm methods for direction estimation, due to finite data perturbations, using the modified spatially smoothed covariance matrix, is developed. Expressions for the mean-squared error in the direction estimates are derived based on a common framework. Based on the analysis, the use of the modified smoothed covariance matrix improves the performance of the methods when the sources are fully correlated. Also, the performance is better even when the number of subarrays is large unlike in the case of the conventionally smoothed covariance matrix. However, the performance for uncorrelated sources deteriorates due to an artificial correlation introduced by the modified smoothing. The theoretical expressions are validated using extensive simulations. (C) 1999 Elsevier Science B.V. All rights reserved.
Resumo:
We address the problem of allocating a single divisible good to a number of agents. The agents have concave valuation functions parameterized by a scalar type. The agents report only the type. The goal is to find allocatively efficient, strategy proof, nearly budget balanced mechanisms within the Groves class. Near budget balance is attained by returning as much of the received payments as rebates to agents. Two performance criteria are of interest: the maximum ratio of budget surplus to efficient surplus, and the expected budget surplus, within the class of linear rebate functions. The goal is to minimize them. Assuming that the valuation functions are known, we show that both problems reduce to convex optimization problems, where the convex constraint sets are characterized by a continuum of half-plane constraints parameterized by the vector of reported types. We then propose a randomized relaxation of these problems by sampling constraints. The relaxed problem is a linear programming problem (LP). We then identify the number of samples needed for ``near-feasibility'' of the relaxed constraint set. Under some conditions on the valuation function, we show that value of the approximate LP is close to the optimal value. Simulation results show significant improvements of our proposed method over the Vickrey-Clarke-Groves (VCG) mechanism without rebates. In the special case of indivisible goods, the mechanisms in this paper fall back to those proposed by Moulin, by Guo and Conitzer, and by Gujar and Narahari, without any need for randomization. Extension of the proposed mechanisms to situations when the valuation functions are not known to the central planner are also discussed. Note to Practitioners-Our results will be useful in all resource allocation problems that involve gathering of information privately held by strategic users, where the utilities are any concave function of the allocations, and where the resource planner is not interested in maximizing revenue, but in efficient sharing of the resource. Such situations arise quite often in fair sharing of internet resources, fair sharing of funds across departments within the same parent organization, auctioning of public goods, etc. We study methods to achieve near budget balance by first collecting payments according to the celebrated VCG mechanism, and then returning as much of the collected money as rebates. Our focus on linear rebate functions allows for easy implementation. The resulting convex optimization problem is solved via relaxation to a randomized linear programming problem, for which several efficient solvers exist. This relaxation is enabled by constraint sampling. Keeping practitioners in mind, we identify the number of samples that assures a desired level of ``near-feasibility'' with the desired confidence level. Our methodology will occasionally require subsidy from outside the system. We however demonstrate via simulation that, if the mechanism is repeated several times over independent instances, then past surplus can support the subsidy requirements. We also extend our results to situations where the strategic users' utility functions are not known to the allocating entity, a common situation in the context of internet users and other problems.
Resumo:
Remanufacturing activities in India are still in nascent stages. However, the substantial growth of Indian economy, coupled with serious issues of population and environmental burden demands a radical shift in market strategies and legislations. The scattered and inefficient product recovery methods prevalent in India are unable to cope with increasing environmental and economic burden on the society - remanufacturing seems to be a promising strategy to explore for these. Our study investigated from a user's context the opportunity of establishing remanufacturing as a formal activity, answering the fundamental questions of whether remanufactured products would be accepted by Indian consumers and how these will fit into the Indian market. The study of the Indian mobile phone market eco-system showed how mobile phones currently move through the value chain, and the importance of the grey and used phone markets in this movement. A prescriptive model has been proposed which utilizes the usage patterns of different consumer groups to create a self-sustainable demand-supply system, potentially complementing frameworks such as the Automotive Remanufacturing Decision-Making Framework (RDMF). (C) 2011 Elsevier Ltd. All rights reserved.
Resumo:
Today's feature-rich multimedia products require embedded system solution with complex System-on-Chip (SoC) to meet market expectations of high performance at a low cost and lower energy consumption. The memory architecture of the embedded system strongly influences critical system design objectives like area, power and performance. Hence the embedded system designer performs a complete memory architecture exploration to custom design a memory architecture for a given set of applications. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforces short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of exhaustive-search based memory exploration at the outer level and a two step based integrated data layout for SPRAM-Cache based architectures at the inner level. We present a two step integrated approach for data layout for SPRAM-Cache based hybrid architectures with the first step as data-partitioning that partitions data between SPRAM and Cache, and the second step is the cache conscious data layout. We formulate the cache-conscious data layout as a graph partitioning problem and show that our approach gives up to 34% improvement over an existing approach and also optimizes the off-chip memory address space. We experimented our approach with 3 embedded multimedia applications and our approach explores several hundred memory configurations for each application, yielding several optimal design points in a few hours of computation on a standard desktop.
Resumo:
The memory subsystem is a major contributor to the performance, power, and area of complex SoCs used in feature rich multimedia products. Hence, memory architecture of the embedded DSP is complex and usually custom designed with multiple banks of single-ported or dual ported on-chip scratch pad memory and multiple banks of off-chip memory. Building software for such large complex memories with many of the software components as individually optimized software IPs is a big challenge. In order to obtain good performance and a reduction in memory stalls, the data buffers of the application need to be placed carefully in different types of memory. In this paper we present a unified framework (MODLEX) that combines different data layout optimizations to address the complex DSP memory architectures. Our method models the data layout problem as multi-objective genetic algorithm (GA) with performance and power being the objectives and presents a set of solution points which is attractive from a platform design viewpoint. While most of the work in the literature assumes that performance and power are non-conflicting objectives, our work demonstrates that there is significant trade-off (up to 70%) that is possible between power and performance.