351 results for Hardware Transactional Memory


Relevance:

20.00%

Publisher:

Abstract:

Massive multiple-input multiple-output (MIMO) systems are cellular networks where the base stations (BSs) are equipped with unconventionally many antennas, deployed on colocated or distributed arrays. Huge spatial degrees-of-freedom are achieved by coherent processing over these massive arrays, which provide strong signal gains, resilience to imperfect channel knowledge, and low interference. This comes at the price of more infrastructure; the hardware cost and circuit power consumption scale linearly/affinely with the number of BS antennas N. Hence, the key to cost-efficient deployment of large arrays is low-cost antenna branches with low circuit power, in contrast to today’s conventional expensive and power-hungry BS antenna branches. Such low-cost transceivers are prone to hardware imperfections, but it has been conjectured that the huge degrees-of-freedom would bring robustness to such imperfections. We prove this claim for a generalized uplink system with multiplicative phase drifts, additive distortion noise, and noise amplification. Specifically, we derive closed-form expressions for the user rates and a scaling law that shows how fast the hardware imperfections can increase with N while maintaining high rates. The connection between this scaling law and the power consumption of different transceiver circuits is rigorously exemplified. This reveals that one can make the circuit power increase as √N, instead of linearly, by careful circuit-aware system design.

Relevance:

20.00%

Publisher:

Abstract:

Radio-frequency (RF) impairments in the transceiver hardware of communication systems (e.g., phase noise (PN), high power amplifier (HPA) nonlinearities, or in-phase/quadrature-phase (I/Q) imbalance) can severely degrade the performance of traditional multiple-input multiple-output (MIMO) systems. Although calibration algorithms can partially compensate for these impairments, the remaining distortion still has substantial impact. Despite this, most prior works have not analyzed this type of distortion. In this paper, we investigate the impact of residual transceiver hardware impairments on the MIMO system performance. In particular, we consider a transceiver impairment model, which has been experimentally validated, and derive analytical ergodic capacity expressions for both exact and high signal-to-noise ratios (SNRs). We demonstrate that the capacity saturates in the high-SNR regime, thereby creating a finite capacity ceiling. We also present a linear approximation for the ergodic capacity in the low-SNR regime, and show that impairments have only a second-order impact on the capacity. Furthermore, we analyze the effect of transceiver impairments on large-scale MIMO systems; interestingly, we prove that if one increases the number of antennas at one side only, the capacity behaves similarly to the finite-dimensional case. On the contrary, if the number of antennas on both sides increases with a fixed ratio, the capacity ceiling vanishes; thus, impairments cause only a bounded offset in the capacity compared to the ideal transceiver hardware case.
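
The saturation effect described in this abstract can be illustrated with a deliberately simplified model. The sketch below is not the paper's ergodic MIMO expression: it assumes a single-antenna link with unit channel gain and a hypothetical aggregate impairment level `kappa`, so that the distortion power is `kappa**2` times the signal power. Under that assumption the effective SINR is `snr / (kappa**2 * snr + 1)`, which tends to `1 / kappa**2` as the SNR grows.

```python
import math


def rate_with_impairments(snr, kappa):
    """Rate in bit/s/Hz for an assumed scalar channel with unit gain.

    Assumption (not from the source): distortion noise has power
    kappa**2 * snr, so it grows together with the signal power.
    """
    sinr = snr / (kappa ** 2 * snr + 1.0)
    return math.log2(1.0 + sinr)


def rate_ceiling(kappa):
    """High-SNR limit of rate_with_impairments for kappa > 0."""
    return math.log2(1.0 + 1.0 / kappa ** 2)


if __name__ == "__main__":
    kappa = 0.1  # hypothetical impairment level
    for snr_db in (0, 10, 20, 40, 80):
        snr = 10.0 ** (snr_db / 10.0)
        print(snr_db, round(rate_with_impairments(snr, kappa), 4))
    print("ceiling", round(rate_ceiling(kappa), 4))
```

In this toy model the rate is close to the ideal `log2(1 + snr)` at low SNR, where the distortion term is negligible, and flattens towards `rate_ceiling(kappa)` at high SNR.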

Relevance:

20.00%

Publisher:

Abstract:

Massive multiple-input multiple-output (MIMO) systems are cellular networks where the base stations (BSs) are equipped with unconventionally many antennas. Such large antenna arrays offer huge spatial degrees-of-freedom for transmission optimization; in particular, great signal gains, resilience to imperfect channel knowledge, and small inter-user interference are all achievable without extensive inter-cell coordination. The key to cost-efficient deployment of large arrays is the use of hardware-constrained base stations with low-cost antenna elements, as compared to today's expensive and power-hungry BSs. Low-cost transceivers are prone to hardware imperfections, but it has been conjectured that the excessive degrees-of-freedom of massive MIMO would bring robustness to such imperfections. We herein prove this claim for an uplink channel with multiplicative phase-drift, additive distortion noise, and noise amplification. Specifically, we derive a closed-form scaling law that shows how fast the imperfections increase with the number of antennas.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we propose a system-level design approach considering voltage over-scaling (VOS) that achieves error resiliency using unequal error protection of different computation elements, while incurring minor quality degradation. Depending on user specifications and severity of process variations/channel noise, the degree of VOS in each block of the system is adaptively tuned to ensure minimum system power while providing "just-the-right" amount of quality and robustness. This is achieved by taking into consideration block-level interactions and ensuring that under any change of operating conditions, only the "less-crucial" computations, which contribute less to block/system output quality, are affected. The proposed approach applies unequal error protection to various blocks of a system (logic and memory) and spans multiple layers of the design hierarchy: algorithm, architecture, and circuit. The design methodology, when applied to a multimedia subsystem, shows large power benefits (up to 69% improvement in power consumption) at reasonable image quality while tolerating errors introduced due to VOS, process variations, and channel noise.

Relevance:

20.00%

Publisher:

Abstract:

We present DRASync, a region-based allocator that implements a global address space abstraction for MPI programs with pointer-based data structures. The main features of DRASync are: (a) it amortizes communication among nodes to allow efficient parallel allocation in a global address space; (b) it takes advantage of bulk deallocation and good locality with pointer-based data structures; (c) it supports ownership semantics of regions by nodes akin to reader–writer locks, which makes for a high-level, intuitive synchronization tool in MPI programs, without sacrificing message-passing performance. We evaluate DRASync against a state-of-the-art distributed allocator and find that it produces comparable performance while offering a higher-level abstraction to programmers.

Relevance:

20.00%

Publisher:

Abstract:

The overall aim of the work presented in this paper has been to develop Montgomery modular multiplication architectures suitable for implementation on modern reconfigurable hardware. Accordingly, novel high-radix systolic array Montgomery multiplier designs are presented, as we believe that the inherent regular structure and absence of global interconnect associated with these make them well-suited for implementation on modern FPGAs. Unlike previous approaches, each processing element (PE) comprises both an adder and a multiplier. The inclusion of a multiplier in the PE means that the need to pre-compute or store any multiples of the operands is avoided. This also allows very high-radix implementations to be realised, further reducing the number of clock cycles per modular multiplication, while still maintaining a competitive critical delay. For demonstrative purposes, 512-bit and 1024-bit FPGA implementations using radices of 2^8 and 2^16 are presented. The subsequent throughput rates are the fastest reported to date.
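
As a software illustration of why a higher radix reduces the iteration count, the following is a minimal sketch of textbook digit-serial Montgomery multiplication. It is not the systolic-array design from the paper; the function names and the parameters `w` (bits per digit) and `s` (number of digits) are chosen here for illustration. The loop runs `s` times, so doubling `w` halves the number of iterations for the same operand size. It assumes an odd modulus `n`, operands below `n`, `n < 2**(w*s)`, and Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_mul(a, b, n, w, s):
    """Return a * b * R^-1 mod n, where R = 2**(w*s).

    Scans `a` in s digits of w bits each (radix 2**w).
    Assumes n is odd, 0 <= a, b < n, and n < R.
    """
    base = 1 << w
    mask = base - 1
    # n_prime satisfies n * n_prime ≡ -1 (mod 2**w)
    n_prime = (-pow(n, -1, base)) % base
    acc = 0
    for i in range(s):
        a_i = (a >> (w * i)) & mask        # i-th radix-2**w digit of a
        acc += a_i * b
        q = ((acc & mask) * n_prime) & mask  # makes acc + q*n divisible by 2**w
        acc = (acc + q * n) >> w
    # acc < 2n here, so one conditional subtraction suffices
    return acc - n if acc >= n else acc


def modmul(a, b, n, w, s):
    """Return a * b mod n using Montgomery form internally."""
    r2 = pow(2, 2 * w * s, n)              # R^2 mod n
    a_bar = montgomery_mul(a, r2, n, w, s)  # a * R mod n
    b_bar = montgomery_mul(b, r2, n, w, s)  # b * R mod n
    p_bar = montgomery_mul(a_bar, b_bar, n, w, s)
    return montgomery_mul(p_bar, 1, n, w, s)  # convert back from Montgomery form
```

For a 128-bit operand size, for example, `w=8` needs 16 loop iterations while `w=16` needs 8; both produce the same result because `R` is the same in each case.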

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we investigate the end-to-end performance of dual-hop proactive decode-and-forward relaying networks with Nth best relay selection in the presence of two practical deleterious effects: i) hardware impairment and ii) co-channel interference (CCI). In particular, we derive new exact and asymptotic closed-form expressions for the outage probability and average channel capacity of Nth best partial and opportunistic relay selection schemes over Rayleigh fading channels. Insightful discussions are provided. It is shown that, when the system cannot select the best relay for cooperation, the partial relay selection scheme outperforms the opportunistic method under the impact of the same CCI. In addition, without CCI but under the effect of hardware impairment, it is shown that both selection strategies have the same asymptotic channel capacity. Monte Carlo simulations are presented to corroborate our analysis.

Relevance:

20.00%

Publisher:

Abstract:

We study the competing effects of simultaneous Markovian and non-Markovian decoherence mechanisms acting on a single spin. We show the existence of a threshold in the relative strength of such mechanisms above which the spin dynamics becomes fully Markovian, as revealed by the use of several non-Markovianity measures. We identify a measure-dependent nested structure of such thresholds, hinting at a causality relationship among the various non-Markovianity witnesses used in our analysis. Our considerations are then used to argue the unavoidably non-Markovian evolution of a single-electron quantum dot exposed to both intrinsic and Markovian technical noise, the latter of arbitrary strength. 

Relevance:

20.00%

Publisher:

Abstract:

The Field Programmable Gate Array (FPGA) implementation of the commonly used Histogram of Oriented Gradients (HOG) algorithm is explored. The HOG algorithm is employed to extract features for object detection. A key focus has been to explore the use of a new FPGA-based processor which has been targeted at image processing. The paper gives details of the mapping and scheduling factors that influence the performance and the stages that were undertaken to allow the algorithm to be deployed on FPGA hardware, whilst taking into account the specific IPPro architecture features. We show that multi-core IPPro performance can exceed that of state-of-the-art FPGA designs by up to 3.2 times with reduced design and implementation effort and increased flexibility, all on a low-cost Zynq programmable system.
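
For readers unfamiliar with HOG, the core per-cell computation can be sketched in a few lines. This is a generic, simplified illustration and says nothing about how the paper maps the algorithm onto IPPro cores: it assumes central-difference gradients clamped at the cell border, unsigned orientations in [0, 180) degrees, and nearest-bin voting weighted by gradient magnitude (no bin interpolation or block normalisation). The function name `cell_histogram` is hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram for one cell given as a 2D list of intensities."""
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / bins
    hist = [0.0] * bins
    for y in range(height):
        for x in range(width):
            # Central differences, clamped at the cell border.
            gx = cell[y][min(x + 1, width - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, height - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation: fold opposite directions together.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / bin_width) % bins] += magnitude
    return hist


if __name__ == "__main__":
    # Intensity increases left to right, so every gradient points along x.
    ramp = [[x for x in range(8)] for _ in range(8)]
    print(cell_histogram(ramp))
```

Each pixel's work is independent of the others until the final accumulation, which is the property that makes the algorithm a natural candidate for parallel hardware.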

Relevance:

20.00%

Publisher:

Abstract:

This paper investigates the mechanism of nanoscale fatigue of functionally graded TiN/TiNi films using nano-impact and multiple-loading-cycle nanoindentation tests. The functionally graded films were deposited on a silicon substrate, in which TiNi films maintain shape memory and pseudo-elastic behavior, while a modified TiN surface layer provides tribological and anti-corrosion properties. Nanomechanical tests were performed to comprehend the localized film performance and failure modes of the functionally graded film using a NanoTest™ equipped with Berkovich and conical indenters at loads between 100 μN and 500 mN. The loading mechanism and load history are critical to define film failure modes (i.e. backward depth deviation) including the shape memory effect of the functionally graded layer. The results are sensitive to the applied load, loading type (e.g. semi-static, dynamic) and probe geometry. Based on indentation force-depth profiles, depth-time data and post-test surface observations of films, it is concluded that the shape of the nanoindenter is critical in inducing the localized indentation stress and film failure, including shape recovery at the lower load range. Elastic-plastic finite element (FE) simulation during nanoindentation loading indicated that the location of subsurface maximum stress near the interface influences the backward depth deviation type of film failure. A standalone molecular dynamics simulation was performed with the help of a long-range potential energy function to simulate the tensile test of TiN nanowire with two different aspect ratios to investigate the theory of its failure mechanism.