12 results for performance gains
at Indian Institute of Science - Bangalore - India
Abstract:
Data prefetchers identify and exploit regularity in the history (training) stream to predict future references and prefetch them into the cache. The training information used is typically the primary misses seen at a particular cache level, which is a filtered version of the accesses seen by the cache. In this work, we demonstrate that extending the training information to include secondary misses and hits along with primary misses helps improve the performance of prefetchers. In addition to empirical evaluation, we use the information-theoretic metric of entropy to quantify the regularity present in extended histories. Entropy measurements indicate that extended histories are more regular than the default primary-miss-only training stream, and they corroborate our empirical findings. With extended histories, further benefits can be achieved by also triggering prefetches on secondary misses. In this paper, we explore the design space of extended prefetch histories and alternative prefetch trigger points for delta correlation prefetchers. We observe that different prefetch schemes benefit to different extents from extended histories and alternative trigger points, and that the best-performing design point varies on a per-benchmark basis. To meet these requirements, we propose a simple adaptive scheme that identifies the best-performing design point for a benchmark-prefetcher combination at runtime. On SPEC2000 benchmarks, using all the L2 accesses as history for the prefetcher improves performance, in terms of both IPC and misses reduced, over techniques that use only primary misses as history. The adaptive scheme improves the performance of the CZone prefetcher over Baseline by 4.6% on average. These performance gains are accompanied by a moderate reduction in memory traffic requirements.
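As a rough illustration of how entropy can quantify the regularity of a training stream, the sketch below computes the empirical (zeroth-order) Shannon entropy of the address deltas in two streams. The abstract does not say which entropy estimator the authors use, so this estimator is an assumption, and the address traces and function names are hypothetical, chosen only to make the effect visible.

```python
import math
from collections import Counter


def delta_stream(addresses):
    """Differences between consecutive addresses in a history stream."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def shannon_entropy(symbols):
    """Empirical Shannon entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical cache-line addresses, invented for illustration only.
all_accesses = [100, 101, 102, 103, 104, 105, 106, 107, 108]
# Primary misses are a filtered subset of the accesses above.
primary_misses = [100, 102, 103, 106, 108]

h_all = shannon_entropy(delta_stream(all_accesses))     # every delta is 1
h_miss = shannon_entropy(delta_stream(primary_misses))  # deltas 2, 1, 3, 2
```

In this toy trace the full access stream has a single repeated delta, so its entropy is zero, while the filtered miss stream has irregular deltas and a higher entropy. Real traces will not be this clean.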
Abstract:
With the increasing adoption of wireless technology, it is reasonable to expect an increase in the demand for supporting both real-time multimedia and high-rate reliable data services. Next-generation wireless systems employ an Orthogonal Frequency Division Multiplexing (OFDM) physical layer owing to the high data rate transmissions that are possible without an increase in bandwidth. Towards improving the performance of these systems, we look at the design of resource allocation algorithms at the medium-access layer and their impact on higher layers. While TCP-based elastic traffic needs reliable transport, UDP-based real-time applications have stringent delay and rate requirements. The MAC algorithms, while catering to the heterogeneous service needs of these higher layers, trade off between maximizing the system capacity and providing fairness among users. The novelty of this work is the proposal of various channel-aware resource allocation algorithms at the MAC layer, which can result in significant performance gains in an OFDM-based wireless system.
Abstract:
This paper compares and analyzes the performance of distributed cophasing techniques for uplink transmission over wireless sensor networks. We focus on a time-division duplexing approach, and exploit the channel reciprocity to reduce the channel feedback requirement. We consider periodic broadcast of known pilot symbols by the fusion center (FC), and maximum likelihood estimation of the channel by the sensor nodes for the subsequent uplink cophasing transmission. We assume carrier and phase synchronization across the participating nodes for analytical tractability. We study binary signaling over frequency-flat fading channels, and quantify the system performance such as the expected gains in the received signal-to-noise ratio (SNR) and the average probability of error at the FC, as a function of the number of sensor nodes and the pilot overhead. Our results show that a modest amount of accumulated pilot SNR is sufficient to realize a large fraction of the maximum possible beamforming gain. We also investigate the performance gains obtained by censoring transmission at the sensors based on the estimated channel state, and the benefits obtained by using maximum ratio transmission (MRT) and truncated channel inversion (TCI) at the sensors in addition to cophasing transmission. Simulation results corroborate the theoretical expressions and show the relative performance benefits offered by the various schemes.
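The basic idea of cophasing can be sketched in a few lines: each sensor rotates its transmission by the negative of its estimated channel phase, so the contributions add coherently at the FC. The sketch below is a noise-free toy with hypothetical channel values and a hypothetical helper name; it does not model the pilot-based estimation error, censoring, MRT or TCI analyzed in the paper.

```python
import cmath


def cophased_amplitude(channels, estimates):
    """Received amplitude when each node pre-rotates by its estimated phase.

    channels  -- true complex channel gains from the nodes to the FC
    estimates -- each node's estimate of its own channel (same order)
    """
    total = sum(
        h * cmath.exp(-1j * cmath.phase(h_hat))
        for h, h_hat in zip(channels, estimates)
    )
    return abs(total)


# Hypothetical channel gains, invented for illustration only.
channels = [1 + 1j, -2j, -0.5 + 0.5j]

# Perfect channel knowledge: every contribution aligns in phase.
coherent = cophased_amplitude(channels, channels)

# No phase compensation: contributions add with their raw phases.
uncompensated = abs(sum(channels))
```

With perfect estimates the received amplitude equals the sum of the channel magnitudes, which is the upper limit that imperfect pilot-based estimates can only approach.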
Abstract:
The Community Climate System Model (CCSM) is a Multiple Program Multiple Data (MPMD) parallel global climate model comprising atmosphere, ocean, land, ice and coupler components. The simulations have a time-step of the order of tens of minutes and are typically performed for periods of the order of centuries. These climate simulations are highly computationally intensive and can take several days to weeks to complete on most of today’s multi-processor systems. Executing CCSM on grids could potentially lead to a significant reduction in simulation times due to the increase in the number of processors. However, in order to obtain performance gains on grids, several challenges have to be met. In this work, we describe our load balancing efforts in CCSM to make it suitable for grid enabling. We also identify the various challenges in executing CCSM on grids. Since CCSM is an MPI application, we also describe our current work on building an MPI implementation for grids to grid-enable CCSM.
Abstract:
We establish zero-crossing rate (ZCR) relations between the input and the subbands of a maximally decimated M-channel power complementary analysis filterbank when the input is a stationary Gaussian process. The ZCR at lag l is defined as the number of sign changes between the samples of a sequence and its l-sample shifted version, normalized by the sequence length. We derive the relationship between the ZCR of the Gaussian process at lags that are integer multiples of M and the subband ZCRs. Based on this result, we propose a robust iterative autocorrelation estimator for a signal consisting of a sum of sinusoids of fixed amplitudes and uniformly distributed random phases. Simulation results show that the proposed estimator performs better than the sample autocorrelation over the SNR range of -6 to 15 dB. Validation on a segment of a trumpet signal showed similar performance gains.
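The lag-based ZCR definition above can be written directly as a short function. This is a minimal sketch of the definition only, not of the proposed autocorrelation estimator. Treating a zero-valued sample as non-negative is an assumption made here, and the function name and test signal are hypothetical.

```python
def zcr_at_lag(x, lag):
    """Sign changes between x[n] and x[n - lag], normalized by len(x).

    A sample equal to zero is treated as non-negative (an assumption).
    """
    if lag < 1 or lag >= len(x):
        raise ValueError("lag must satisfy 1 <= lag < len(x)")
    changes = sum(
        1 for n in range(lag, len(x)) if (x[n] >= 0) != (x[n - lag] >= 0)
    )
    return changes / len(x)


# Hypothetical test signal: a strictly alternating sequence.
x = [1, -1, 1, -1, 1, -1, 1, -1]

zcr_1 = zcr_at_lag(x, 1)  # every adjacent pair changes sign
zcr_2 = zcr_at_lag(x, 2)  # samples two apart always share a sign
```

For the alternating sequence, the lag-1 ZCR is 7/8 (seven sign changes over eight samples) and the lag-2 ZCR is zero.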
Abstract:
In this paper, we shed light on the cross-layer interactions between the PHY, link and routing layers in networks with MIMO links operating in the diversity mode. Many previous studies assume an overly simplistic PHY layer model that does not sufficiently capture these interactions. We show that the use of simplistic models can in fact lead to misleading conclusions with regard to the higher layer performance with MIMO diversity. Towards understanding the impact of various PHY layer features on MIMO diversity, we begin with a simple but widely used model and progressively incorporate these features to create new models. We examine the goodness of these models by comparing the performance simulated with each of them against measurements on an indoor 802.11n testbed. Our work reveals several interesting cross-layer dependencies that affect the gains due to MIMO diversity. In particular, we observe that relative to SISO links: (a) PHY layer gains due to MIMO diversity do not always carry over to the higher layers, (b) the use of other PHY layer features such as FEC codes significantly influences the gains due to MIMO diversity, and (c) the choice of the routing metric can impact the gains possible with MIMO.
Abstract:
Frequency-domain scheduling and rate adaptation enable next generation wireless cellular systems such as Long Term Evolution (LTE) to achieve significantly higher downlink throughput. LTE assigns subcarriers in chunks, called physical resource blocks (PRBs), to users to reduce control signaling overhead. To reduce the enormous feedback overhead, the channel quality indicator (CQI) report that is used to feed back channel state information is averaged over a subband, which, in turn, is a group of multiple PRBs. In this paper, we develop closed-form expressions for the throughput achieved by the subband-level CQI feedback mechanism of LTE. We show that the coarse frequency resolution of the CQI incurs a significant loss in throughput and limits the multi-user gains achievable by the system. We then show that the performance can be improved by means of an offset mechanism that effectively makes the users more conservative in reporting their CQI.
Abstract:
The importance of air bearing design is growing in engineering. As the trend toward precision and ultra-precision manufacture gains pace and the drive toward higher-quality, more reliable products continues, the advantages to be gained from applying aerostatic bearings to machine tools, instrumentation and test rigs are becoming more apparent. The inlet restrictor design is significant because it affects the static and dynamic performance of the air bearing. For instance, pocketed orifice bearings give higher load capacity than inherently compensated orifice bearings; however, inherently compensated orifices, also known as laminar flow restrictors, are known to give highly stable air bearing systems (less prone to pneumatic hammer) compared with pocketed orifice air bearing systems. Nevertheless, they are not commonly used because of the difficulties encountered in manufacturing and assembling the orifice designs. This paper aims to analyse the static and dynamic characteristics of an inherently compensated orifice based flat pad air bearing system. The steady-state characteristics are studied using the Reynolds equation and the mass conservation equation for incompressible flow, while the dynamic characteristics are studied in a similar manner using the same equations for compressible flow. Steady-state experiments were also performed for a single-orifice air bearing, and the results are compared with those obtained from the theoretical studies. A technique to ease the assembly of orifices with the air bearing plate is also discussed, so as to make the manufacture of inherently compensated bearings more commercially viable. (c) 2012 Elsevier Inc. All rights reserved.
Abstract:
In a cooperative system with an amplify-and-forward relay, the cascaded channel training protocol enables the destination to estimate the source-destination channel gain and the product of the source-relay (SR) and relay-destination (RD) channel gains using only two pilot transmissions from the source. Notably, the destination does not require a separate estimate of the SR channel. We develop a new expression for the symbol error probability (SEP) of AF relaying when imperfect channel state information (CSI) is acquired using the above training protocol. A tight SEP upper bound is also derived; it shows that full diversity is achieved, albeit at a high signal-to-noise ratio (SNR). Our analysis uses fewer simplifying assumptions, and leads to expressions that are accurate even at low SNRs and are different from those in the literature. For instance, it does not approximate the estimate of the product of SR and RD channel gains by the product of the estimates of the SR and RD channel gains. We show that cascaded channel estimation often outperforms a channel estimation protocol that incurs a greater training overhead by forwarding a quantized estimate of the SR channel gain to the destination. The extent of pilot power boosting, if allowed, that is required to improve performance is also quantified.
Abstract:
Although extremely long low-density parity-check (LDPC) codes are well known to perform exceptionally well in error correction applications, short-length codes are preferable in practice. However, short-length LDPC codes suffer from performance degradation owing to graph-based impairments such as short cycles, trapping sets and stopping sets in the bipartite graph of the LDPC matrix. In particular, performance degradation at moderate to high Eb/N0 is caused by oscillations in the bit node a posteriori probabilities induced by short cycles and trapping sets in the bipartite graph. In this study, a computationally efficient algorithm is proposed to improve the performance of short-length LDPC codes at moderate to high Eb/N0. This algorithm makes use of the information generated by the belief propagation (BP) algorithm in previous iterations before a decoding failure occurs. Using this information, a reliability-based estimation is performed on each bit node to supplement the BP algorithm. The proposed algorithm gives an appreciable coding gain compared with BP decoding for LDPC codes of code rate 1/2 or lower. The coding gains are modest to significant for regular LDPC codes optimised for bipartite graph conditioning, whereas they are huge for unoptimised codes. Hence, this algorithm is useful for relaxing some stringent constraints on the graphical structure of the LDPC code and for developing hardware-friendly designs.
Abstract:
Given the significant gains that relay-based cooperation promises, the practical problems of acquisition of channel state information (CSI) and the characterization and optimization of performance with imperfect CSI are receiving increasing attention. We develop novel and accurate expressions for the symbol error probability (SEP) for fixed-gain amplify-and-forward relaying when the destination acquires CSI using the time-efficient cascaded channel estimation (CCE) protocol. The CCE protocol saves time by making the destination directly estimate the product of the source-relay and relay-destination channel gains. For a single relay system, we first develop a novel SEP expression and a tight SEP upper bound. We then similarly analyze an opportunistic multi-relay system, in which both selection and coherent demodulation use imperfect estimates. A distinctive aspect of our approach is the use of as few simplifying approximations as possible, which yields new results that are accurate at signal-to-noise ratios as low as 1 dB for single and multi-relay systems. Using insights gleaned from an asymptotic analysis, we also present a simple, closed-form, nearly optimal solution for allocation of energy between pilot and data symbols at the source and relay(s).
Abstract:
Contemporary cellular standards, such as Long Term Evolution (LTE) and LTE-Advanced, employ orthogonal frequency-division multiplexing (OFDM) and use frequency-domain scheduling and rate adaptation. In conjunction with feedback reduction schemes, high downlink spectral efficiencies are achieved while limiting the uplink feedback overhead. One such important scheme that has been adopted by these standards is best-m feedback, in which every user feeds back its m largest subchannel (SC) power gains and their corresponding indices. We analyze the single cell average throughput of an OFDM system with uniformly correlated SC gains that employs best-m feedback and discrete rate adaptation. Our model incorporates three schedulers that cover a wide range of the throughput versus fairness tradeoff and feedback delay. We show that, for small m, correlation significantly reduces average throughput with best-m feedback. This result is pertinent as even in typical dispersive channels, correlation is high. We observe that the schedulers exhibit varied sensitivities to correlation and feedback delay. The analysis also leads to insightful expressions for the average throughput in the asymptotic regime of a large number of users.
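The best-m feedback rule described above (each user reports its m largest subchannel power gains together with their indices) can be sketched as follows. This covers only the per-user selection step, not the schedulers, rate adaptation or correlation model analyzed in the paper. The function name and the gain values are hypothetical.

```python
def best_m_feedback(sc_gains, m):
    """Return (index, gain) pairs for the m largest subchannel power gains.

    Ties are resolved in favour of the lower subchannel index, since
    sorted() is stable.
    """
    if not 0 <= m <= len(sc_gains):
        raise ValueError("m must lie between 0 and the number of subchannels")
    ranked = sorted(range(len(sc_gains)), key=lambda i: sc_gains[i], reverse=True)
    return [(i, sc_gains[i]) for i in ranked[:m]]


# Hypothetical subchannel power gains for one user, invented for illustration.
gains = [0.3, 1.7, 0.9, 2.4, 0.1, 1.2]

report = best_m_feedback(gains, 2)
```

Here the user would report subchannels 3 and 1, with gains 2.4 and 1.7. The scheduler sees nothing about the remaining four subchannels, which is the source of the feedback saving.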