933 resultados para PARALLEL COMPUTING


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Technology scaling has enabled drastic growth in the computational and storage capacity of integrated circuits (ICs). This constant growth drives an increasing demand for high-bandwidth communication between and within ICs. In this dissertation we focus on low-power solutions that address this demand. We divide communication links into three subcategories depending on the communication distance. Each category has a different set of challenges and requirements and is affected by CMOS technology scaling in a different manner. We start with short-range chip-to-chip links for board-level communication. Next we will discuss board-to-board links, which demand a longer communication range. Finally on-chip links with communication ranges of a few millimeters are discussed.

Electrical signaling is a natural choice for chip-to-chip communication due to efficient integration and low cost. IO data rates have increased to the point where electrical signaling is now limited by the channel bandwidth. In order to achieve multi-Gb/s data rates, complex designs that equalize the channel are necessary. In addition, a high level of parallelism is central to sustaining bandwidth growth. Decision feedback equalization (DFE) is one of the most commonly employed techniques to overcome the limited bandwidth problem of the electrical channels. A linear and low-power summer is the central block of a DFE. Conventional approaches employ current-mode techniques to implement the summer, which require high power consumption. In order to achieve low-power operation we propose performing the summation in the charge domain. This approach enables a low-power and compact realization of the DFE as well as crosstalk cancellation. A prototype receiver was fabricated in 45nm SOI CMOS to validate the functionality of the proposed technique and was tested over channels with different levels of loss and coupling. Measurement results show that the receiver can equalize channels with maximum 21dB loss while consuming about 7.5mW from a 1.2V supply. We also introduce a compact, low-power transmitter employing passive equalization. The efficacy of the proposed technique is demonstrated through implementation of a prototype in 65nm CMOS. The design achieves up to 20Gb/s data rate while consuming less than 10mW.

An alternative to electrical signaling is to employ optical signaling for chip-to-chip interconnections, which offers low channel loss and cross-talk while providing high communication bandwidth. In this work we demonstrate the possibility of building compact and low-power optical receivers. A novel RC front-end is proposed that combines dynamic offset modulation and double-sampling techniques to eliminate the need for a short time constant at the input of the receiver. Unlike conventional designs, this receiver does not require a high-gain stage that runs at the data rate, making it suitable for low-power implementations. In addition, it allows time-division multiplexing to support very high data rates. A prototype was implemented in 65nm CMOS and achieved up to 24Gb/s with less than 0.4pJ/b power efficiency per channel. As the proposed design mainly employs digital blocks, it benefits greatly from technology scaling in terms of power and area saving.

As the technology scales, the number of transistors on the chip grows. This necessitates a corresponding increase in the bandwidth of the on-chip wires. In this dissertation, we take a close look at wire scaling and investigate its effect on wire performance metrics. We explore a novel on-chip communication link based on a double-sampling architecture and dynamic offset modulation technique that enables low power consumption and high data rates while achieving high bandwidth density in 28nm CMOS technology. The functionality of the link is demonstrated using different length minimum-pitch on-chip wires. Measurement results show that the link achieves up to 20Gb/s of data rate (12.5Gb/s/$\mu$m) with better than 136fJ/b of power efficiency.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis describes a compositional framework for developing situation awareness applications: applications that provide ongoing information about a user's changing environment. The thesis describes how the framework is used to develop a situation awareness application for earthquakes. The applications are implemented as Cloud computing services connected to sensors and actuators. The architecture and design of the Cloud services are described and measurements of performance metrics are provided. The thesis includes results of experiments on earthquake monitoring conducted over a year. The applications developed by the framework are (1) the CSN --- the Community Seismic Network --- which uses relatively low-cost sensors deployed by members of the community, and (2) SAF --- the Situation Awareness Framework --- which integrates data from multiple sources, including the CSN, CISN --- the California Integrated Seismic Network, a network consisting of high-quality seismometers deployed carefully by professionals in the CISN organization and spread across Southern California --- and prototypes of multi-sensor platforms that include carbon monoxide, methane, dust and radiation sensors.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The 0.2% experimental accuracy of the 1968 Beers and Hughes measurement of the annihilation lifetime of ortho-positronium motivates the attempt to compute the first order quantum electrodynamic corrections to this lifetime. The theoretical problems arising in this computation are here studied in detail up to the point of preparing the necessary computer programs and using them to carry out some of the less demanding steps -- but the computation has not yet been completed. Analytic evaluation of the contributing Feynman diagrams is superior to numerical evaluation, and for this process can be carried out with the aid of the Reduce algebra manipulation computer program.

The relation of the positronium decay rate to the electronpositron annihilation-in-flight amplitude is derived in detail, and it is shown that at threshold annihilation-in-flight, Coulomb divergences appear while infrared divergences vanish. The threshold Coulomb divergences in the amplitude cancel against like divergences in the modulating continuum wave function.

Using the lowest order diagrams of electron-positron annihilation into three photons as a test case, various pitfalls of computer algebraic manipulation are discussed along with ways of avoiding them. The computer manipulation of artificial polynomial expressions is preferable to the direct treatment of rational expressions, even though redundant variables may have to be introduced.

Special properties of the contributing Feynman diagrams are discussed, including the need to restore gauge invariance to the sum of the virtual photon-photon scattering box diagrams by means of a finite subtraction.

A systematic approach to the Feynman-Brown method of Decomposition of single loop diagram integrals with spin-related tensor numerators is developed in detail. This approach allows the Feynman-Brown method to be straightforwardly programmed in the Reduce algebra manipulation language.

The fundamental integrals needed in the wake of the application of the Feynman-Brown decomposition are exhibited and the methods which were used to evaluate them -- primarily dis persion techniques are briefly discussed.

Finally, it is pointed out that while the techniques discussed have permitted the computation of a fair number of the simpler integrals and diagrams contributing to the first order correction of the ortho-positronium annihilation rate, further progress with the more complicated diagrams and with the evaluation of traces is heavily contingent on obtaining access to adequate computer time and core capacity.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

I. Trimesic acid (1, 3, 5-benzenetricarboxylic acid) crystallizes with a monoclinic unit cell of dimensions a = 26.52 A, b = 16.42 A, c = 26.55 A, and β = 91.53° with 48 molecules /unit cell. Extinctions indicated a space group of Cc or C2/c; a satisfactory structure was obtained in the latter with 6 molecules/asymmetric unit - C54O36H36 with a formula weight of 1261 g. Of approximately 12,000 independent reflections within the CuKα sphere, intensities of 11,563 were recorded visually from equi-inclination Weissenberg photographs.

The structure was solved by packing considerations aided by molecular transforms and two- and three-dimensional Patterson functions. Hydrogen positions were found on difference maps. A total of 978 parameters were refined by least squares; these included hydrogen parameters and anisotropic temperature factors for the C and O atoms. The final R factor was 0.0675; the final "goodness of fit" was 1.49. All calculations were carried out on the Caltech IBM 7040-7094 computer using the CRYRM Crystallographic Computing System.

The six independent molecules fall into two groups of three nearly parallel molecules. All molecules are connected by carboxylto- carboxyl hydrogen bond pairs to form a continuous array of sixmolecule rings with a chicken-wire appearance. These arrays bend to assume two orientations, forming pleated sheets. Arrays in different orientations interpenetrate - three molecules in one orientation passing through the holes of three parallel arrays in the alternate orientation - to produce a completely interlocking network. One third of the carboxyl hydrogen atoms were found to be disordered.

II. Optical transforms as related to x-ray diffraction patterns are discussed with reference to the theory of Fraunhofer diffraction.

The use of a systems approach in crystallographic computing is discussed with special emphasis on the way in which this has been done at the California Institute of Technology.

An efficient manner of calculating Fourier and Patterson maps on a digital computer is presented. Expressions for the calculation of to-scale maps for standard sections and for general-plane sections are developed; space-group-specific expressions in a form suitable for computers are given for all space groups except the hexagonal ones.

Expressions for the calculation of settings for an Eulerian-cradle diffractometer are developed for both the general triclinic case and the orthogonal case.

Photographic materials on pp. 4, 6, 10, and 20 are essential and will not reproduce clearly on Xerox copies. Photographic copies should be ordered.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis presents a new class of solvers for the subsonic compressible Navier-Stokes equations in general two- and three-dimensional spatial domains. The proposed methodology incorporates: 1) A novel linear-cost implicit solver based on use of higher-order backward differentiation formulae (BDF) and the alternating direction implicit approach (ADI); 2) A fast explicit solver; 3) Dispersionless spectral spatial discretizations; and 4) A domain decomposition strategy that negotiates the interactions between the implicit and explicit domains. In particular, the implicit methodology is quasi-unconditionally stable (it does not suffer from CFL constraints for adequately resolved flows), and it can deliver orders of time accuracy between two and six in the presence of general boundary conditions. In fact this thesis presents, for the first time in the literature, high-order time-convergence curves for Navier-Stokes solvers based on the ADI strategy---previous ADI solvers for the Navier-Stokes equations have not demonstrated orders of temporal accuracy higher than one. An extended discussion is presented in this thesis which places on a solid theoretical basis the observed quasi-unconditional stability of the methods of orders two through six. The performance of the proposed solvers is favorable. For example, a two-dimensional rough-surface configuration including boundary layer effects at Reynolds number equal to one million and Mach number 0.85 (with a well-resolved boundary layer, run up to a sufficiently long time that single vortices travel the entire spatial extent of the domain, and with spatial mesh sizes near the wall of the order of one hundred-thousandth the length of the domain) was successfully tackled in a relatively short (approximately thirty-hour) single-core run; for such discretizations an explicit solver would require truly prohibitive computing times. As demonstrated via a variety of numerical experiments in two- and three-dimensions, further, the proposed multi-domain parallel implicit-explicit implementations exhibit high-order convergence in space and time, useful stability properties, limited dispersion, and high parallel efficiency.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A direct twos-complement parallel array multiplication algorithm is introduced and modified for digital optical numerical computation. The modified version overcomes the problems encountered in the conventional optical twos-complement algorithm. In the array, all the summands are generated in parallel, and the relevant summands having the same weights are added simultaneously without carries, resulting in the product expressed in a mixed twos-complement system. In a two-stage array, complex multiplication is possible with using four real subarrays. Furthermore, with a three-stage array architecture, complex matrix operation is straightforwardly accomplished. In the experiment, parallel two-stage array complex multiplication with liquid-crystal panels is demonstrated.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

On the basis of signed-digit negabinary representation, parallel two-step addition and one-step subtraction can be performed for arbitrary-length negabinary operands.; The arithmetic is realized by signed logic operations and optically implemented by spatial encoding and decoding techniques. The proposed algorithm and optical system are simple, reliable, and practicable, and they have the property of parallel processing of two-dimensional data. This leads to an efficient design for the optical arithmetic and logic unit. (C) 1997 Optical Society of America.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A compact two-step modified-signed-digit arithmetic-logic array processor is proposed. When the reference digits are programmed, both addition and subtraction can be performed by the same binary logic operations regardless of the sign of the input digits. The optical implementation and experimental demonstration with an electron-trapping device are shown. Each digit is encoded by a single pixel, and no polarization is included. Any combinational logic can be easily performed without optoelectronic and electro-optic conversions of the intermediate results. The system is compact, general purpose, simple to align, and has a high signal-to-noise ratio. (C) 1999 Optical Society of America.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Based on birefringence, a building-block stacking technique is suggested in this paper. A solid-state optical morphological processor module is thus developed, which is an integration of a beam array generator submodule, an optical connector submodule, and a Pockels readout optical modulator. It is shown that the technique is compact in construction, simple for fabrication, and insensitive to the environment.