12 results for Houston, Texas, USA

in Indian Institute of Science - Bangalore - India


Relevance: 100.00%

Abstract:

Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 suite that we used in our work. Current GPUs therefore allow concurrent execution of kernels to improve utilization. In this work, we study concurrent execution of GPU kernels using multiprogram workloads on current NVIDIA Fermi GPUs. On two-program workloads from the Parboil2 benchmark suite, we find that concurrent execution is often no better than serialized execution. We identify the lack of control over resource allocation to kernels as a major serialization bottleneck. We propose transformations that convert CUDA kernels into elastic kernels, which permit fine-grained control over their resource usage. We then propose several elastic-kernel-aware concurrency policies that offer significantly better performance and concurrency than the current CUDA policy. We evaluate our proposals on real hardware using multiprogrammed workloads constructed from benchmarks in the Parboil2 suite. On average, our proposals increase system throughput (STP) by 1.21x and improve the average normalized turnaround time (ANTT) by 3.73x for two-program workloads when compared to the current CUDA concurrency implementation.
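
The abstract reports its gains as STP and ANTT without defining them. The sketch below assumes the definitions commonly used for multiprogram workloads: STP sums each program's normalized progress (higher is better), and ANTT averages each program's slowdown relative to running alone (lower is better). The runtimes are hypothetical placeholders, not measurements from the paper.

```python
def stp(alone, shared):
    """System throughput: sum of per-program normalized progress (higher is better)."""
    return sum(a / s for a, s in zip(alone, shared))


def antt(alone, shared):
    """Average normalized turnaround time: mean per-program slowdown (lower is better)."""
    return sum(s / a for a, s in zip(alone, shared)) / len(alone)


# Hypothetical turnaround times in milliseconds for a two-program workload.
alone = [10.0, 20.0]        # each kernel running by itself
serialized = [10.0, 30.0]   # second kernel waits for the first to finish
concurrent = [12.0, 24.0]   # both kernels share the GPU and each slows down

for name, shared in [("serialized", serialized), ("concurrent", concurrent)]:
    print(name, round(stp(alone, shared), 3), round(antt(alone, shared), 3))
```

In this made-up example the two schedules have the same STP but different ANTT, which is why the two metrics are reported together.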

Relevance: 80.00%

Abstract:

In this paper, we construct low decoding complexity STBCs by using the Pauli matrices as linear dispersion matrices. In this case, the Hurwitz-Radon orthogonality condition is shown to be easily checked by transferring the problem to the $\mathbb{F}_4$ domain. The problem of constructing low decoding complexity STBCs is shown to be equivalent to finding certain codes over $\mathbb{F}_4$. It is shown that almost all known low complexity STBCs can be obtained by this approach. New codes are given that have the least known decoding complexity in particular ranges of rate.
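
A minimal sketch of why a finite-field check can replace matrix arithmetic here. It assumes Hermitian tensor products of Pauli matrices, for which the Hurwitz-Radon condition $A^H B + B^H A = 0$ reduces to anticommutation, and it labels each Pauli matrix with a pair of bits (one common way to represent the four elements of $\mathbb{F}_4$). The labelling and the helper names are illustrative assumptions; the paper's exact mapping may differ.

```python
from itertools import product

PAULI = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
# (x, z) bit pair per Pauli matrix; an assumed labelling of the four F_4 elements.
BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matrix(label):
    """Tensor product of Pauli matrices named by a string such as 'XZ'."""
    m = [[1]]
    for ch in label:
        m = kron(m, PAULI[ch])
    return m


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def hr_orthogonal(a, b):
    """True if A^H B + B^H A is the zero matrix."""
    p, q = matmul(dagger(a), b), matmul(dagger(b), a)
    n = len(a)
    return all(abs(p[i][j] + q[i][j]) < 1e-12 for i in range(n) for j in range(n))


def symplectic(l1, l2):
    """Symplectic product of two labels over F_2; 1 means the matrices anticommute."""
    return sum(
        BITS[c1][0] * BITS[c2][1] + BITS[c1][1] * BITS[c2][0] for c1, c2 in zip(l1, l2)
    ) % 2


# Compare the matrix check with the finite-field check on all two-factor products.
labels = ["".join(p) for p in product("IXYZ", repeat=2)]
agree = all(
    hr_orthogonal(matrix(a), matrix(b)) == (symplectic(a, b) == 1)
    for a in labels
    for b in labels
)
print(agree)
```

The finite-field test needs only a few bit operations per pair, whereas the direct test multiplies matrices whose size grows exponentially with the number of tensor factors.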

Relevance: 80.00%

Abstract:

Regenerating codes are a class of codes for distributed storage networks that provide reliability and availability of data, and also perform efficient node repair. Another important aspect of a distributed storage network is its security. In this paper, we consider a threat model where an eavesdropper may gain access to the data stored in a subset of the storage nodes, and possibly also, to the data downloaded during repair of some nodes. We provide explicit constructions of regenerating codes that achieve information-theoretic secrecy capacity in this setting.

Relevance: 80.00%

Abstract:

Network lifetime maximization is becoming an important design goal in wireless sensor networks. Energy harvesting has recently become a preferred choice for achieving this goal as it provides near perpetual operation. We study such a sensor node with an energy harvesting source and compare various architectures by which the harvested energy is used. We find its Shannon capacity when it is transmitting its observations over a fading AWGN channel with perfect/no channel state information provided at the transmitter. We obtain an achievable rate when there are inefficiencies in energy storage and the capacity when energy is spent in activities other than transmission.

Relevance: 80.00%

Abstract:

Opportunistic selection is a practically appealing technique that is used in multi-node wireless systems to maximize throughput, implement proportional fairness, etc. However, selection is challenging since the information about a node's channel gains is often available only locally at each node and not centrally. We propose a novel multiple access-based distributed selection scheme that generalizes the best features of the timer scheme, which requires minimal feedback but does not always guarantee successful selection, and the fast splitting scheme, which requires more feedback but guarantees successful selection. The proposed scheme's design explicitly accounts for feedback time overheads, unlike the conventional splitting scheme, and guarantees selection of the user with the highest metric, unlike the timer scheme. We analyze and minimize the average time, including feedback, that the scheme requires to select a user. With feedback overheads, the proposed scheme is scalable and considerably faster than several schemes proposed in the literature. Furthermore, the gains increase as the feedback overhead increases.
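
For context, the baseline timer scheme mentioned above can be sketched as follows. This is not the proposed scheme. It assumes metrics normalized to [0, 1], a linear metric-to-timer mapping, and a fixed vulnerability window within which two expiring timers collide; all three are illustrative assumptions, as are the function and parameter names.

```python
def timer_select(metrics, t_max=1.0, vulnerability=0.01):
    """Return the index of the selected node, or None if selection fails.

    Each node sets a timer that decreases with its local metric and transmits
    when the timer expires. Selection fails when the two earliest timers expire
    within `vulnerability` of each other, because the transmissions collide.
    """
    if not metrics:
        return None
    # Assumed linear mapping: a higher metric gives an earlier expiry.
    timers = sorted((t_max * (1.0 - m), i) for i, m in enumerate(metrics))
    if len(timers) > 1 and timers[1][0] - timers[0][0] < vulnerability:
        return None  # collision: the best node cannot be resolved
    return timers[0][1]


print(timer_select([0.2, 0.9, 0.5]))   # well-separated metrics
print(timer_select([0.9, 0.895]))      # near-equal metrics collide
```

The second call shows the failure mode the abstract refers to: no feedback is needed, but nodes with nearly equal metrics cannot be told apart.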

Relevance: 80.00%

Abstract:

This paper analyzes the error exponents in Bayesian decentralized spectrum sensing, i.e., the detection of occupancy of the primary spectrum by a cognitive radio, with probability of error as the performance metric. At the individual sensors, the error exponents of a Central Limit Theorem (CLT) based detection scheme are analyzed. At the fusion center, a K-out-of-N rule is employed to arrive at the overall decision. It is shown that, in the presence of fading, for a fixed number of sensors, the error exponents with respect to the number of observations at both the individual sensors and the fusion center are zero. This motivates the development of the error exponent with a certain probability as a novel metric that can be used to compare different detection schemes in the presence of fading. The metric is useful, for example, in answering the question of whether to sense for a pilot tone in a narrow band (and suffer Rayleigh fading) or to sense the entire wide-band signal (and suffer log-normal shadowing), in terms of the error exponent performance. The error exponents with a certain probability at both the individual sensors and the fusion center are derived under both Rayleigh fading and log-normal shadowing. Numerical results are used to illustrate and provide a visual feel for the theoretical expressions obtained.
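
The K-out-of-N fusion rule can be illustrated with a short calculation. The sketch assumes independent sensors with identical false-alarm and missed-detection probabilities and a known prior, which is a simplification: it ignores the fading effects that the paper analyzes. Function and parameter names are hypothetical.

```python
from math import comb


def k_out_of_n(p, k, n):
    """Probability that at least k of n independent sensors report 'occupied',
    when each sensor does so with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def fusion_error(p_fa, p_md, k, n, prior_occupied=0.5):
    """Bayesian error probability at the fusion center under a K-out-of-N rule."""
    false_alarm = k_out_of_n(p_fa, k, n)           # spectrum idle, declared occupied
    missed = 1.0 - k_out_of_n(1.0 - p_md, k, n)    # spectrum occupied, declared idle
    return (1 - prior_occupied) * false_alarm + prior_occupied * missed


# Majority rule with three sensors, each wrong 10% of the time.
print(fusion_error(p_fa=0.1, p_md=0.1, k=2, n=3))
```

Setting k = 1 gives the OR rule and k = n the AND rule, so the same function covers the usual special cases.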

Relevance: 80.00%

Abstract:

We develop a Markov model for a TCP CUBIC connection. Next, we use it to obtain approximate expressions for throughput when there may be queuing in the network. Finally, we provide the throughputs that different TCP CUBIC and TCP NewReno connections obtain when sharing a channel, where the connections may have different round-trip delays and packet loss probabilities.
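
As background for the model, the deterministic window growth that TCP CUBIC follows between losses can be sketched as below. This is only the standard cubic window function, not the paper's Markov model. The constants are the defaults given in RFC 8312 and are an assumption here, since the abstract does not state which values the analysis uses.

```python
C = 0.4      # cubic scaling constant (RFC 8312 default; assumed here)
BETA = 0.7   # multiplicative decrease factor (RFC 8312 default; assumed here)


def cubic_k(w_max):
    """Time in seconds for the window to return to w_max after a loss."""
    return (w_max * (1 - BETA) / C) ** (1.0 / 3.0)


def cubic_window(t, w_max):
    """Congestion window (in segments) t seconds after the last loss event."""
    return C * (t - cubic_k(w_max)) ** 3 + w_max


w_max = 100.0  # hypothetical window size at the last loss
for t in (0.0, 2.0, cubic_k(w_max), 6.0):
    print(round(t, 2), round(cubic_window(t, w_max), 2))
```

The window starts at BETA * w_max right after a loss, flattens as it approaches w_max, and then grows again. Because the growth depends on elapsed time rather than on round trips, connections with different round-trip delays behave differently from NewReno connections.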

Relevance: 80.00%

Abstract:

In this paper, we determine packet scheduling policies for efficient power management in Energy Harvesting Sensors (EHS) which have to transmit packets of high and low priorities over a fading channel. We assume that incoming packets are stored in a buffer and the quality of service for a particular type of message is determined by the expected waiting time of packets of that type of message. The sensors are constrained to work with the energy that they garner from the environment. We derive transmit policies which minimize the sum of expected waiting times of the two types of messages, weighted by penalties. First, we show that for schemes with a constant rate of transmission, under a decoupling approximation, a form of truncated channel inversion is optimal. Using this result, we derive optimal solutions that minimize the weighted sum of the waiting times in the different queues.
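
A minimal sketch of truncated channel inversion, the power-control form the abstract identifies as optimal for constant-rate schemes. It assumes unit noise power, so that received SNR equals transmit power times channel gain. The cutoff, target SNR, and gain samples are hypothetical; in the paper these would follow from the energy-harvesting constraint and the queueing objective, which are not modelled here.

```python
def truncated_inversion_power(gain, target_snr, cutoff):
    """Transmit power under truncated channel inversion.

    Below the cutoff gain the sensor stays silent and saves energy; otherwise it
    inverts the channel so the received SNR equals target_snr.
    """
    if gain < cutoff:
        return 0.0
    return target_snr / gain


# Hypothetical channel power gains observed in successive slots.
gains = [0.05, 0.5, 2.0, 0.08, 1.0]
powers = [truncated_inversion_power(g, target_snr=1.0, cutoff=0.1) for g in gains]
average_power = sum(powers) / len(powers)
print(powers, average_power)
```

Truncation keeps the average power finite: without the cutoff, deep fades would demand very large transmit power, which a sensor limited to harvested energy cannot sustain.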

Relevance: 80.00%

Abstract:

Motivated by the recent Coherent Space-Time Shift Keying (CSTSK) philosophy, we construct new dispersion matrices for rotationally invariant PSK signaling sets. Given a specific PSK signal constellation, the dispersion matrices of the existing CSTSK scheme were chosen by maximizing the mutual information over randomly generated sets of dispersion matrices. In this contribution we propose a general method for constructing a set of structured dispersion matrices for arbitrary PSK signaling sets using Field Extension (FE) codes and then study the attainable Symbol Error Rate (SER) performance of some example constructions. We demonstrate that the proposed dispersion scheme is capable of outperforming the existing dispersion arrangement at medium to high SNRs.

Relevance: 80.00%

Abstract:

In a cooperative system with an amplify-and-forward relay, the cascaded channel training protocol enables the destination to estimate the source-destination channel gain and the product of the source-relay (SR) and relay-destination (RD) channel gains using only two pilot transmissions from the source. Notably, the destination does not require a separate estimate of the SR channel. We develop a new expression for the symbol error probability (SEP) of AF relaying when imperfect channel state information (CSI) is acquired using the above training protocol. A tight SEP upper bound is also derived; it shows that full diversity is achieved, albeit at a high signal-to-noise ratio (SNR). Our analysis uses fewer simplifying assumptions, and leads to expressions that are accurate even at low SNRs and are different from those in the literature. For instance, it does not approximate the estimate of the product of SR and RD channel gains by the product of the estimates of the SR and RD channel gains. We show that cascaded channel estimation often outperforms a channel estimation protocol that incurs a greater training overhead by forwarding a quantized estimate of the SR channel gain to the destination. The extent of pilot power boosting, if allowed, that is required to improve performance is also quantified.