Biblioteca Digital

2 resultados para High Performance

em CaltechTHESIS

Electrical and optical interconnects for high-performance computing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Technology scaling has enabled drastic growth in the computational and storage capacity of integrated circuits (ICs). This constant growth drives an increasing demand for high-bandwidth communication between and within ICs. In this dissertation we focus on low-power solutions that address this demand. We divide communication links into three subcategories depending on the communication distance. Each category has a different set of challenges and requirements and is affected by CMOS technology scaling in a different manner. We start with short-range chip-to-chip links for board-level communication. Next we will discuss board-to-board links, which demand a longer communication range. Finally on-chip links with communication ranges of a few millimeters are discussed.

Electrical signaling is a natural choice for chip-to-chip communication due to efficient integration and low cost. IO data rates have increased to the point where electrical signaling is now limited by the channel bandwidth. In order to achieve multi-Gb/s data rates, complex designs that equalize the channel are necessary. In addition, a high level of parallelism is central to sustaining bandwidth growth. Decision feedback equalization (DFE) is one of the most commonly employed techniques to overcome the limited bandwidth problem of the electrical channels. A linear and low-power summer is the central block of a DFE. Conventional approaches employ current-mode techniques to implement the summer, which require high power consumption. In order to achieve low-power operation we propose performing the summation in the charge domain. This approach enables a low-power and compact realization of the DFE as well as crosstalk cancellation. A prototype receiver was fabricated in 45nm SOI CMOS to validate the functionality of the proposed technique and was tested over channels with different levels of loss and coupling. Measurement results show that the receiver can equalize channels with maximum 21dB loss while consuming about 7.5mW from a 1.2V supply. We also introduce a compact, low-power transmitter employing passive equalization. The efficacy of the proposed technique is demonstrated through implementation of a prototype in 65nm CMOS. The design achieves up to 20Gb/s data rate while consuming less than 10mW.

An alternative to electrical signaling is to employ optical signaling for chip-to-chip interconnections, which offers low channel loss and cross-talk while providing high communication bandwidth. In this work we demonstrate the possibility of building compact and low-power optical receivers. A novel RC front-end is proposed that combines dynamic offset modulation and double-sampling techniques to eliminate the need for a short time constant at the input of the receiver. Unlike conventional designs, this receiver does not require a high-gain stage that runs at the data rate, making it suitable for low-power implementations. In addition, it allows time-division multiplexing to support very high data rates. A prototype was implemented in 65nm CMOS and achieved up to 24Gb/s with less than 0.4pJ/b power efficiency per channel. As the proposed design mainly employs digital blocks, it benefits greatly from technology scaling in terms of power and area saving.

As the technology scales, the number of transistors on the chip grows. This necessitates a corresponding increase in the bandwidth of the on-chip wires. In this dissertation, we take a close look at wire scaling and investigate its effect on wire performance metrics. We explore a novel on-chip communication link based on a double-sampling architecture and dynamic offset modulation technique that enables low power consumption and high data rates while achieving high bandwidth density in 28nm CMOS technology. The functionality of the link is demonstrated using different length minimum-pitch on-chip wires. Measurement results show that the link achieves up to 20Gb/s of data rate (12.5Gb/s/$\mu$m) with better than 136fJ/b of power efficiency.

Veja mais

A fully-nonlocal energy-based formulation and high-performance realization of the quasicontinuum method

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The quasicontinuum (QC) method was introduced to coarse-grain crystalline atomic ensembles in order to bridge the scales from individual atoms to the micro- and mesoscales. Though many QC formulations have been proposed with varying characteristics and capabilities, a crucial cornerstone of all QC techniques is the concept of summation rules, which attempt to efficiently approximate the total Hamiltonian of a crystalline atomic ensemble by a weighted sum over a small subset of atoms. In this work we propose a novel, fully-nonlocal, energy-based formulation of the QC method with support for legacy and new summation rules through a general energy-sampling scheme. Our formulation does not conceptually differentiate between atomistic and coarse-grained regions and thus allows for seamless bridging without domain-coupling interfaces. Within this structure, we introduce a new class of summation rules which leverage the affine kinematics of this QC formulation to most accurately integrate thermodynamic quantities of interest. By comparing this new class of summation rules to commonly-employed rules through analysis of energy and spurious force errors, we find that the new rules produce no residual or spurious force artifacts in the large-element limit under arbitrary affine deformation, while allowing us to seamlessly bridge to full atomistics. We verify that the new summation rules exhibit significantly smaller force artifacts and energy approximation errors than all comparable previous summation rules through a comprehensive suite of examples with spatially non-uniform QC discretizations in two and three dimensions. Due to the unique structure of these summation rules, we also use the new formulation to study scenarios with large regions of free surface, a class of problems previously out of reach of the QC method. Lastly, we present the key components of a high-performance, distributed-memory realization of the new method, including a novel algorithm for supporting unparalleled levels of deformation. Overall, this new formulation and implementation allows us to efficiently perform simulations containing an unprecedented number of degrees of freedom with low approximation error.

Veja mais

2 resultados para High Performance

em CaltechTHESIS

Filtro por publicador

Electrical and optical interconnects for high-performance computing

A fully-nonlocal energy-based formulation and high-performance realization of the quasicontinuum method