218 results for 291605 Processor Architectures


Relevance:

10.00%

Publisher:

Abstract:

Quantum dot lattices (QDLs) offer a route to tailoring the optical, magnetic, and electronic properties of a user-defined artificial solid. We use a dual-gated device structure to controllably tune the potential landscape in a GaAs/AlGaAs two-dimensional electron gas, thereby enabling the formation of a periodic QDL. The current-voltage characteristics, I(V), follow a power law, as expected for a QDL. In addition, a systematic study of the scaling behavior of I(V) allows us to probe the effects of background disorder on transport through the QDL. Our results are particularly important for semiconductor-based QDL architectures that aim to probe collective phenomena.
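The power-law scaling above can be made concrete with a short sketch: given I(V) data of the form I ∝ (V − V_T)^ζ, the exponent ζ is recovered from a linear fit in log-log space. The threshold V_T, the exponent, and the data below are illustrative assumptions, not values from the paper.

```python
# Sketch: extracting a power-law exponent zeta from I(V) data, as for
# transport through a quantum dot lattice where I ~ (V - V_T)^zeta.
# Synthetic data; V_T and zeta_true are assumptions for illustration.
import math

V_T = 0.10          # assumed threshold voltage (V)
zeta_true = 2.3     # assumed scaling exponent
voltages = [0.2, 0.3, 0.5, 0.8, 1.2, 2.0]
currents = [(v - V_T) ** zeta_true for v in voltages]  # arbitrary units

# Least-squares slope of log I versus log(V - V_T) gives the exponent.
xs = [math.log(v - V_T) for v in voltages]
ys = [math.log(i) for i in currents]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
zeta_fit = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
print(round(zeta_fit, 3))  # → 2.3
```

With noisy experimental data the same fit would be done over the scaling region only, after estimating V_T.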

Relevance:

10.00%

Publisher:

Abstract:

The unending quest for performance improvement, coupled with advances in integrated circuit technology, has led to the development of new architectural paradigms. The speculative multithreaded (SpMT) philosophy relies on aggressive speculative execution for improved performance. However, aggressive speculative execution is a mixed blessing: it improves performance when successful, but adversely affects energy consumption (and performance) through useless computation in the event of mis-speculation. Dynamic instruction criticality information can be usefully applied to control and guide such aggressive speculative execution. In this paper, we present a model of micro-execution for the SpMT architecture that we have developed to determine dynamic instruction criticality. We have also developed two novel techniques utilizing the criticality information, namely delaying non-critical loads and criticality-based thread prediction, to reduce useless computation and energy consumption. Experimental results showing the break-up of critical instructions and the effectiveness of the proposed techniques in reducing energy consumption are presented in the context of a multiscalar processor that implements the SpMT architecture. Our experiments show a 17.7% and 11.6% reduction in dynamic energy for the criticality-based thread prediction and criticality-based delayed load schemes respectively, while the improvement in dynamic energy-delay product is 13.9% and 5.5%, respectively. (c) 2012 Published by Elsevier B.V.

Relevance:

10.00%

Publisher:

Abstract:

Computational grids with multiple batch systems (batch grids) can be powerful infrastructures for executing long-running multi-component parallel applications. In this paper, we evaluate the potential throughput improvements for such applications when their components are executed on multiple batch systems of a batch grid. We compare these multi-system executions with executions of the components on a single batch system, without increasing the number of processors used. We perform our analysis with a prominent long-running multi-component climate-modeling application, the Community Climate System Model (CCSM). We have built a robust simulator that models the characteristics of both the multi-component application and the batch systems. By conducting a large number of simulations with different workload characteristics and queuing policies, processor allocations to components, distributions of the components across the batch systems, and inter-cluster bandwidths, we show that multiple batch executions lead to a 55% average increase in throughput over single batch executions for long-running CCSM. We also conducted real experiments with a practical middleware infrastructure and showed that multi-site executions lead to effective utilization of batch systems for CCSM and give higher simulation throughput than single-site executions. Copyright (c) 2011 John Wiley & Sons, Ltd.
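As a toy illustration of why spreading components across batch systems can help, the sketch below greedily assigns components to whichever system frees up earliest. It abstracts each batch system as running one component at a time, with queue-availability times and runtimes that are purely illustrative; the paper's simulator models far richer workload and queuing behavior.

```python
# Toy model: components of a multi-component job may start as soon as
# *some* batch system has capacity, instead of all waiting in one queue.
# All times (hours) are invented for illustration.
import heapq

def makespan(component_runtimes, queue_free_times):
    """Greedy list scheduling: each component goes to the earliest-free system."""
    heap = list(queue_free_times)
    heapq.heapify(heap)
    finish = 0.0
    for run in sorted(component_runtimes, reverse=True):
        start = heapq.heappop(heap)   # earliest-available system
        end = start + run
        finish = max(finish, end)
        heapq.heappush(heap, end)
    return finish

components = [6.0, 5.0, 4.0, 3.0]                # per-component runtimes
single_site = makespan(components, [2.0])        # one busy batch system
multi_site = makespan(components, [2.0, 1.0])    # two systems, one freer
print(single_site, multi_site)  # → 20.0 11.0
```

Even this crude model shows the qualitative effect the paper quantifies: multi-site execution shortens the time to complete all components.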

Relevance:

10.00%

Publisher:

Abstract:

Isolated magnetic nanowires have been studied extensively, and the magnetization reversal mechanism in these systems is well understood. However, when such nanowires are joined together in different architectures, they behave differently and can exhibit novel properties. Using this approach, one can engineer the network architecture to obtain artificial anisotropy. Here, we report six-fold anisotropy obtained by joining magnetic nanowires into a hexagonal network. For this study, we also benchmark the widely used micromagnetic packages OOMMF, Nmag, and LLG-simulator. Further, we propose a local hysteresis method based on post-processing the spatial magnetization information. With this approach we obtain the hysteresis of individual nanowires to understand the six-fold anisotropy and the reversal mechanism within the hexagonal networks.

Relevance:

10.00%

Publisher:

Abstract:

This work describes the formation of hydrogels from sodium cholate solution in the presence of a variety of metal ions (Ca2+, Cu2+, Co2+, Zn2+, Cd2+, Hg2+ and Ag+). Morphological studies of the xerogels by electron microscopy reveal the presence of helical nanofibres. The rigid helical framework of the calcium cholate hydrogel was utilised to synthesise hybrid materials (AuNPs and AgNPs). Doping transition metal salts into the calcium cholate hydrogel opens the possibility of synthesising metal sulphide nano-architectures while keeping the hydrogel network intact. These novel gel-nanoparticle hybrid materials show encouraging application potential.

Relevance:

10.00%

Publisher:

Abstract:

The introduction of processor-based instruments in power systems is resulting in rapid growth of the measured data volume. The present practice in most utilities is to store only some of the important data in a retrievable fashion for a limited period; subsequently, even this data is either deleted or stored on backup devices. The investigations presented here explore the application of lossless data compression techniques for archiving all the operational data, so that they can be put to more effective use. Four arithmetic coding methods, suitably modified for handling power system steady-state operational data, are proposed here. The performance of the proposed methods is evaluated using actual data pertaining to the Southern Regional Grid of India. (C) 2012 Elsevier Ltd. All rights reserved.
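As a minimal illustration of the coder family named above, the sketch below implements a textbook arithmetic coder using exact rational arithmetic and round-trips a short sequence of quantized steady-state readings. The symbol model and data are invented, and a production coder would use integer renormalization rather than `Fraction` arithmetic; the paper's four power-system-specific modifications are not reproduced here.

```python
# Textbook arithmetic coding with exact rationals: the whole message is
# mapped to one number inside a nested subinterval of [0, 1).
from fractions import Fraction

def build_model(symbols):
    """Cumulative probability ranges from empirical symbol frequencies."""
    freq = {}
    for s in symbols:
        freq[s] = freq.get(s, 0) + 1
    total = len(symbols)
    ranges, low = {}, Fraction(0)
    for s in sorted(freq):
        p = Fraction(freq[s], total)
        ranges[s] = (low, low + p)
        low += p
    return ranges

def encode(symbols, ranges):
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        lo, hi = ranges[s]
        low, width = low + width * lo, width * (hi - lo)
    return low + width / 2  # any value inside the final interval works

def decode(code, ranges, n):
    out = []
    for _ in range(n):
        for s, (lo, hi) in ranges.items():
            if lo <= code < hi:
                out.append(s)
                code = (code - lo) / (hi - lo)  # rescale and continue
                break
    return out

# Hypothetical quantized steady-state readings (e.g. bus voltage * 100).
data = [100, 100, 101, 100, 99, 100, 101, 100]
model = build_model(data)
code = encode(data, model)
decoded = decode(code, model, len(data))
assert decoded == data  # lossless round trip
```

Steady-state operational data is highly repetitive, which is exactly the regime where such entropy coders compress well.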

Relevance:

10.00%

Publisher:

Abstract:

Six new copper metal complexes with formulas [Cu(H2O)(2,2'-bpy)(H2L)]2·H4L·4H2O (1), [{Cu(H2O)(2,2'-bpy)(H3L)}2(H2L)]·2H2O (2), [Cu(H2O)(1,10-phen)(H2L)]2·6H2O (3), [Cu(2,2'-bpy)(H2L)]n·nH2O (4), [Cu(1,10-phen)(H2L)]n·3nH2O (5), and [{Cu(2,2'-bpy)(MoO3)}2(L)]n·2nH2O (6) have been synthesized starting from p-xylylenediphosphonic acid (H4L) and 2,2'-bipyridine (2,2'-bpy) or 1,10-phenanthroline (1,10-phen) as secondary linkers, and characterized by single-crystal X-ray diffraction analysis, IR spectroscopy, and thermogravimetric (TG) analysis. All the complexes were synthesized by hydrothermal methods. A dinuclear motif (Cu-dimer) bridged by phosphonic acid represents a new class of simple building unit (SBU) in the construction of coordination architectures in metal phosphonate chemistry. The initial pH of the reaction mixture, set by the secondary linker, plays an important role in the formation of the molecular phosphonates 1, 2, and 3. Temperature-dependent hydrothermal synthesis of compounds 1, 2, and 3 reveals the mechanism of their self-assembly based on the solubility of the phosphonic acid H4L. The two-dimensional coordination polymers 4, 5, and 6, which are formed on increasing the pH of the reaction mixture, comprise Cu-dimers as nodes and organic (H2L) and inorganic (Mo4O12) ligands as linkers. The void spaces created by the (4,4)-connected nets in compounds 4 and 5 are occupied by lattice water molecules; thus compounds 4 and 5 have the potential to accommodate guest species/molecules. Variable-temperature magnetic studies of compounds 3, 4, 5, and 6 reveal antiferromagnetic interactions between the two Cu(II) ions in the eight-membered ring observed in their crystal structures. A density functional theory (DFT) calculation correlates the conformation of the Cu-dimer ring with the magnitude of the exchange parameter based on the torsion angle of the conformation.

Relevance:

10.00%

Publisher:

Abstract:

We report low-dimensional fabrication of the technologically important giant-dielectric material CaCu3Ti4O12 (CCTO) using a soft electron-beam lithographic technique. A sol-gel precursor solution of CCTO was prepared using inorganic metal nitrates and Ti-isopropoxide. Employing the prepared precursor solution and an e-beam lithographically fabricated resist mask, CCTO dots with ~200 nm characteristic dimension were fabricated on a platinized Si(111) substrate. Phase formation, chemical purity and crystalline nature of the fabricated low-dimensional structures were investigated with X-ray diffraction (XRD), energy dispersive X-ray spectroscopy (EDS) and selected area electron diffraction (SAED), respectively. Morphological investigations were carried out with the help of scanning electron microscopy (SEM) and transmission electron microscopy (TEM). This kind of solution-based fabrication of patterned low-dimensional high-dielectric architectures may prove significant for cost-effective technological applications. (C) 2012 Elsevier B.V. All rights reserved.

Relevance:

10.00%

Publisher:

Abstract:

Artificial Neural Networks (ANNs) have been found to be a robust tool for modelling many non-linear hydrological processes. The present study evaluates the performance of ANNs in simulating and predicting ground water levels in the uplands of a tropical coastal riparian wetland. The study compares two network architectures, the Feed Forward Neural Network (FFNN) and the Recurrent Neural Network (RNN), each trained under five algorithms, namely the Levenberg-Marquardt algorithm, the Resilient Backpropagation algorithm, the BFGS Quasi-Newton algorithm, the Scaled Conjugate Gradient algorithm, and the Fletcher-Reeves Conjugate Gradient algorithm, by simulating the water levels in a well in the study area. The analysis considers two cases: one with four inputs to the networks and the other with eight inputs. The two networks and five algorithms in both cases are compared to determine the best-performing combination that can simulate and predict the process satisfactorily. An ad hoc (trial-and-error) method is followed to optimize the network structure in all cases. On the whole, the results show that the ANNs simulated and predicted the water levels in the well with fair accuracy. This is evident from the low values of Normalized Root Mean Square Error and Relative Root Mean Square Error and the high values of Nash-Sutcliffe Efficiency Index and Correlation Coefficient, which are taken as the performance measures used to calibrate the networks. On comparing the predicted ground water levels with those at the observation well, the FFNN trained with the Fletcher-Reeves Conjugate Gradient algorithm using four inputs outperformed all other combinations.
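Two of the performance measures named above can be stated concretely. The sketch below computes NRMSE (here normalized by the observed range, one common convention) and the Nash-Sutcliffe Efficiency for hypothetical observed versus simulated water levels; the numbers are invented for illustration, not taken from the study.

```python
# Goodness-of-fit measures for a hydrological simulation: NRMSE and the
# Nash-Sutcliffe Efficiency (NSE). Data below are illustrative.
import math

observed = [2.1, 2.4, 2.8, 3.0, 2.7, 2.5, 2.2]   # water levels (m), assumed
simulated = [2.0, 2.5, 2.7, 3.1, 2.6, 2.4, 2.3]  # model output, assumed

n = len(observed)
mean_obs = sum(observed) / n
sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))

rmse = math.sqrt(sq_err / n)
nrmse = rmse / (max(observed) - min(observed))   # normalized by observed range
nse = 1 - sq_err / sum((o - mean_obs) ** 2 for o in observed)

print(f"RMSE={rmse:.3f}  NRMSE={nrmse:.3f}  NSE={nse:.3f}")
```

NSE = 1 indicates a perfect match, NSE = 0 means the model is no better than predicting the observed mean; low NRMSE and high NSE together are the pattern the study reports for its best combination.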

Relevance:

10.00%

Publisher:

Abstract:

Knowledge of a program's worst case execution time (WCET) is essential in validating real-time systems and helps in effective scheduling. One popular approach used in industry is to measure the execution time of program components on the target architecture and combine the measurements using static analysis of the program. Measurements need to be taken in the least intrusive way in order to avoid affecting the accuracy of the estimated WCET. Several programs exhibit phase behavior, wherein the program's dynamic execution is composed of distinct phases, each exhibiting homogeneous behavior with respect to cycles per instruction (CPI), data cache misses, etc. In this paper, we show that phase behavior has important implications for timing analysis. We make use of the homogeneity of a phase to reduce instrumentation overhead while ensuring that the accuracy of the WCET estimate is not largely affected. We propose a model for estimating WCET using the static worst-case instruction counts of individual phases and a function of the measured average CPI. We describe a WCET analyzer built on this model which targets two different architectures. The WCET analyzer is observed to give safe estimates for most benchmarks considered in this paper, and the tightness of the WCET estimates improves for most benchmarks compared to Chronos, a well-known static WCET analyzer.
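The phase-based model described above can be sketched in a few lines: each phase contributes its static worst-case instruction count times a CPI bound derived from that phase's measured average CPI. The phase data and the 10% inflation of the measured CPI below are illustrative assumptions, not the paper's actual function of average CPI.

```python
# Phase-based WCET sketch: WCET ≈ Σ_i WIC_i * g(avg_CPI_i), summed over
# phases, where g inflates the measured average CPI toward a worst case.
# Phase values and the margin are invented for illustration.
phases = [
    # (static worst-case instruction count, measured average CPI)
    (50_000, 1.2),   # e.g. initialization phase
    (400_000, 0.9),  # compute-bound phase
    (80_000, 1.5),   # memory-bound phase
]
MARGIN = 1.10  # assumed inflation of average CPI toward a per-phase bound

wcet_cycles = sum(wic * cpi * MARGIN for wic, cpi in phases)
print(int(wcet_cycles))  # estimated WCET in cycles
```

Because each phase is homogeneous in CPI, a few samples per phase suffice to estimate its average CPI, which is what keeps the instrumentation overhead low.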

Relevance:

10.00%

Publisher:

Abstract:

Most existing WCET estimation methods directly estimate the execution time, ET, in cycles. We propose to study ET as a product of two factors, ET = IC * CPI, where IC is the instruction count and CPI is the cycles per instruction. Directly estimating ET may lead to a highly pessimistic estimate, since implicitly these methods may be combining the worst case IC with the worst case CPI. We hypothesize that there exists a functional relationship between CPI and IC such that CPI = f(IC). This is ascertained by computing the covariance matrix and studying scatter plots of CPI versus IC. The IC and CPI values are obtained by running benchmarks with a large number of inputs using the cycle-accurate architectural simulator SimpleScalar on two different architectures. It is shown that the benchmarks can be grouped into different classes based on the CPI versus IC relationship. For some benchmarks, such as FFT and FIR, both IC and CPI are almost constant irrespective of the input. Other benchmarks exhibit a direct or an inverse relationship between CPI and IC; in such cases, one can predict CPI for a given IC as CPI = f(IC). We derive the theoretical worst case IC for a program, denoted SWIC, using integer linear programming (ILP) and estimate WCET as SWIC * f(SWIC). However, if CPI decreases sharply with IC, then the measured maximum cycle count is observed to be a better estimate. For certain other benchmarks, the CPI versus IC relationship is either random or CPI remains constant with varying IC; in such cases, WCET is estimated as the product of SWIC and the measured maximum CPI. The proposed method is observed to give tighter WCET estimates than Chronos, a static WCET analyzer, for most benchmarks on the two architectures considered in this paper.
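The SWIC * f(SWIC) estimate can be illustrated for the inverse-relationship class: fit a line CPI = f(IC) from measured (IC, CPI) pairs, then evaluate it at the worst-case instruction count. The measurements and the SWIC value below are invented; the paper derives SWIC via ILP, which is not reproduced here.

```python
# ET = IC * CPI decomposition: fit CPI = f(IC) by ordinary least squares,
# then estimate WCET as SWIC * f(SWIC). All numbers are illustrative.
ics = [1000, 2000, 3000, 4000]    # instruction counts over many inputs
cpis = [1.50, 1.40, 1.30, 1.20]   # CPI falls as IC grows (inverse class)

# Least-squares line CPI = a * IC + b.
n = len(ics)
mx, my = sum(ics) / n, sum(cpis) / n
a = (sum((x - mx) * (y - my) for x, y in zip(ics, cpis))
     / sum((x - mx) ** 2 for x in ics))
b = my - a * mx

SWIC = 5000                        # assumed static worst-case IC (from ILP)
wcet = SWIC * (a * SWIC + b)       # estimated worst-case cycles
print(round(wcet))  # → 5500
```

Note how this is tighter than the naive bound max(IC) * max(CPI) = 4000-plus instructions at CPI 1.5, precisely because worst-case IC and worst-case CPI do not occur together.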

Relevance:

10.00%

Publisher:

Abstract:

Data prefetchers identify and exploit any regularity present in the history/training stream to predict future references and prefetch them into the cache. The training information used is typically the primary misses seen at a particular cache level, which is a filtered version of the accesses seen by the cache. In this work we demonstrate that extending the training information to include secondary misses and hits along with primary misses helps improve the performance of prefetchers. In addition to empirical evaluation, we use the information-theoretic metric of entropy to quantify the regularity present in extended histories. Entropy measurements indicate that extended histories are more regular than the default primary-miss-only training stream, and they also help corroborate our empirical findings. With extended histories, further benefits can be achieved by triggering prefetches during secondary misses as well. In this paper we explore the design space of extended prefetch histories and alternative prefetch trigger points for delta-correlation prefetchers. We observe that different prefetch schemes benefit to different extents from extended histories and alternative trigger points, and that the best-performing design point varies on a per-benchmark basis. To meet these requirements, we propose a simple adaptive scheme that identifies the best-performing design point for a benchmark-prefetcher combination at runtime. On the SPEC2000 benchmarks, using all L2 accesses as the prefetcher's history improves performance, in terms of both IPC and misses reduced, over techniques that use only primary misses as history. The adaptive scheme improves the performance of the CZone prefetcher over the baseline by 4.6% on average. These performance gains are accompanied by a moderate reduction in memory traffic requirements.
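The entropy argument can be sketched directly: compute the Shannon entropy of the delta (stride) stream for an extended history versus a primary-miss-only history. The address streams below are synthetic, with cache filtering modeled simply as dropping some accesses, which is enough to show how filtering destroys stride regularity.

```python
# Shannon entropy of successive address deltas: lower entropy means a more
# regular (easier to prefetch) stream. Streams are synthetic illustrations.
import math
from collections import Counter

def delta_entropy(addresses):
    """Entropy (bits/symbol) of the stream of successive address deltas."""
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    counts = Counter(deltas)
    total = len(deltas)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Extended history: a regular stride-64 stream (all accesses seen at L2).
extended = list(range(0, 64 * 16, 64))
# Primary-miss-only history: the same stream after cache filtering drops
# some accesses, leaving irregular deltas.
primary = [extended[i] for i in (0, 1, 3, 4, 7, 8, 9, 12, 15)]

print(delta_entropy(extended), delta_entropy(primary))
```

The extended stream has a single repeating delta (entropy 0 bits), while the filtered stream mixes several deltas, exactly the kind of measurement the paper uses to show extended histories are more regular.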

Relevance:

10.00%

Publisher:

Abstract:

In this paper, based on the temporal and spatial locality characteristics of memory accesses in multicores, we propose a reorganization of the existing single large row buffer in a DRAM bank into multiple smaller row buffers. The proposed configuration helps improve row hit rates and also brings down the energy required for row activations. The major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves performance by 35.8%, 14.5% and 21.6% in quad-, eight- and sixteen-core workloads respectively, along with a 42%, 28% and 31% reduction in DRAM energy. Additionally, we introduce a Need-Based Allocation scheme for buffer management that yields additional performance improvement.
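The row-hit-rate benefit can be illustrated with a toy model that abstracts the proposal as a small set of concurrently open rows with LRU replacement; the actual design splits one row's worth of buffering into sub-row segments inside the DRAM bank, and the trace below (two cores ping-ponging between rows of the same bank) is invented for illustration.

```python
# Toy row-buffer model: compare row-hit rates for one open row versus
# multiple concurrently open (smaller) rows in the same bank.
def hit_rate(row_trace, n_buffers):
    """LRU-managed set of open rows; returns fraction of row-buffer hits."""
    open_rows, hits = [], 0
    for row in row_trace:
        if row in open_rows:
            hits += 1
            open_rows.remove(row)   # will be re-appended as most recent
        elif len(open_rows) == n_buffers:
            open_rows.pop(0)        # evict least recently used row
        open_rows.append(row)       # most recently used at the tail
    return hits / len(row_trace)

# Two cores interleaving accesses to different rows of one bank: a single
# row buffer thrashes, while two buffers keep both rows open.
trace = [7, 42, 7, 42, 7, 42, 7, 42]
print(hit_rate(trace, 1), hit_rate(trace, 2))  # → 0.0 0.75
```

Every avoided row-buffer miss also avoids a row activation, which is where the energy savings in the abstract come from.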

Relevance:

10.00%

Publisher:

Abstract:

Network lifetime maximization is becoming an important design goal in wireless sensor networks, and energy harvesting has recently become a preferred choice for achieving this goal as it provides near-perpetual operation. We study such a sensor node with an energy harvesting source and compare various architectures by which the harvested energy is used. We find its Shannon capacity when it is transmitting its observations over a fading AWGN channel with perfect/no channel state information provided at the transmitter. We obtain an achievable rate when there are inefficiencies in energy storage, and the capacity when energy is also spent on activities other than transmission.
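The effect of storage inefficiency can be illustrated on a simplified non-fading AWGN channel with a mean-power constraint: a fraction of the harvested energy lost in storage directly shrinks the usable transmit power. The power, noise, and efficiency values below are illustrative assumptions; the paper's fading model with/without channel state information is richer than this sketch.

```python
# AWGN capacity sketch for an energy-harvesting transmitter: storage loss
# (efficiency eta < 1) reduces the usable mean power. Values are illustrative.
import math

def awgn_capacity(power, noise):
    """Shannon capacity 0.5 * log2(1 + SNR) in bits per channel use."""
    return 0.5 * math.log2(1 + power / noise)

P, N0 = 10.0, 1.0   # assumed mean harvested power and noise power
eta = 0.8           # assumed storage efficiency (20% of stored energy lost)

c_ideal = awgn_capacity(P, N0)         # lossless energy buffer
c_lossy = awgn_capacity(eta * P, N0)   # achievable rate with storage loss

print(f"{c_ideal:.3f} vs {c_lossy:.3f} bits/channel use")
```

Energy spent on sensing and processing rather than transmission reduces the usable power in the same way, which is the other architectural comparison the abstract mentions.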

Relevance:

10.00%

Publisher:

Abstract:

Decoherence as an obstacle to quantum computation can be viewed as a struggle between two forces [1]: the computation, which uses the exponential dimension of Hilbert space, and decoherence, which destroys this entanglement by collapse. In this model of decohered quantum computation, a sequential quantum computer loses the battle because, at each time step, only a local operation is carried out while g*(t) gates collapse. With quantum circuits computing in a parallel way the situation is different: g(t) gates can be applied at each time step while gates collapse because of decoherence, and since g(t) ≈ g*(t) the competition is even [1]. Our paper improves on this model by slowing down g*(t): we encode the circuit in parallel computing architectures and run it in the Single Instruction Multiple Data (SIMD) paradigm. We have proposed a parallel ion-trap architecture for single-bit rotation of a qubit.