972 results for cache consistency


Relevance:

20.00% 20.00%

Publisher:

Abstract:

A ternary thermodynamic function has been developed based on statistico-thermodynamic considerations, with a particular emphasis on the higher-order terms indicating the effects of truncation at the various stages of the treatment. Although the truncation of a series involved in the equation introduces inconsistency, the latter may be removed by imposing various thermodynamic boundary conditions. These conditions are discussed in the paper. The present equation with higher-order terms shows that the α function of a component reduces to a quadratic function of composition at constant compositional paths involving the other two components in the system. The form of the function has been found to be representative of various experimental observations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

CMPs enable simultaneous execution of multiple applications on the same platforms that share cache resources. Diversity in the cache access patterns of these simultaneously executing applications can potentially trigger inter-application interference, leading to cache pollution. Whereas a large cache can ameliorate this problem, the issue of larger power consumption with increasing cache size, amplified at sub-100nm technologies, makes this solution prohibitive. In this paper, to address the issues relating to power-aware performance of caches, we propose a caching structure that addresses the following: 1. Definition of application-specific cache partitions as an aggregation of caching units (molecules). The parameters of each molecule, namely size, associativity and line size, are chosen so that the power consumed by it and its access time are optimal for the given technology. 2. Application-specific resizing of cache partitions with variable and adaptive associativity per cache line, way size and variable line size. 3. A replacement policy that is transparent to the partition in terms of size, heterogeneity in associativity and line size. Through simulation studies we establish the superiority of the molecular cache (a cache built as an aggregation of molecules), which offers a 29% power advantage over an equivalently performing traditional cache.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

After Gödel's incompleteness theorems and the collapse of Hilbert's programme, Gerhard Gentzen continued the quest for consistency proofs of Peano arithmetic. He considered a finitistic or constructive proof still possible and necessary for the foundations of mathematics. For a proof to be meaningful, the principles relied on should be considered more reliable than the doubtful elements of the theory concerned. He worked out a total of four proofs between 1934 and 1939. This thesis examines the consistency proofs for arithmetic by Gentzen from different angles. The consistency of Heyting arithmetic is shown both in a sequent calculus notation and in natural deduction. The former proof includes a cut elimination theorem for the calculus and a syntactical study of the purely arithmetical part of the system. The latter consistency proof in standard natural deduction has been an open problem since the publication of Gentzen's proofs. The solution to this problem for an intuitionistic calculus is based on a normalization proof by Howard. The proof is performed in the manner of Gentzen, by giving a reduction procedure for derivations of falsity. In contrast to Gentzen's proof, the procedure contains a vector assignment. The reduction reduces the first component of the vector, and this component can be interpreted as an ordinal less than epsilon_0, thus ordering the derivations by complexity and proving termination of the process.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The apparent contradiction between the exact nature of the interaction parameter formalism as presented by Lupis and Elliott and the inconsistencies discussed recently by Pelton and Bale arises from the truncation of the Maclaurin series in the latter treatment. The truncation removes the exactness of the expression for the logarithm of the activity coefficient of a solute in a multi-component system. The integrals are therefore path dependent. Formulae for integration along paths of constant X_i or X_i/X_j are presented. The expression for ln γ_solvent given by Pelton and Bale is valid only in the limit that the mole fraction of solvent tends to one. The truncation also destroys the general relations between interaction parameters derived by Lupis and Elliott. For each specific choice of parameters, special relationships are obtained between interaction parameters.
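
For orientation, the truncated series discussed above can be written in its usual first-order form. The abstract does not give the equation, so the notation here is an assumption: component 1 is taken as the solvent, X_j is the mole fraction of solute j, and ε_i^j is the first-order interaction parameter.

```latex
\ln \gamma_i \;=\; \ln \gamma_i^{0} \;+\; \sum_{j=2}^{n} \varepsilon_i^{j}\, X_j ,
\qquad
\varepsilon_i^{j} \;=\; \left( \frac{\partial \ln \gamma_i}{\partial X_j} \right)_{X_1 \to 1}
```

Dropping the second- and higher-order terms of the Maclaurin expansion is the truncation the abstract refers to; the resulting expression is exact only in the limit of infinite dilution.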

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Three different types of consistency, viz., semiweak, weak, and strong, of a read-only transaction in a schedule s of a set T of transactions are defined, and these are compared with the existing notions of consistency of a read-only transaction in a schedule. We present a technique that enables a user to control the consistency of a read-only transaction in heterogeneous locking protocols. Since the weak consistency of a read-only transaction improves concurrency in heterogeneous locking protocols, users can help to improve concurrency in such protocols by supplying the consistency requirements of read-only transactions. A heterogeneous locking protocol P' derived from a locking protocol P that uses exclusive mode locks only and ensures serializability need not be deadlock-free. We present a sufficient condition that ensures the deadlock-freeness of P' when P is deadlock-free and all the read-only transactions in P' are two-phase.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This article examines the changes in interparticle forces brought about on prolonged contact (1 year period) of a bentonite clay with artificial seawater. The study is undertaken with the purpose of identifying the physico-chemical factors that impart a nonswelling character to smectite clays deposited in marine environments. Results show that equilibration of the bentonite clay with artificial seawater (total pore salinity approximately 42 g L⁻¹) for a 1 year period does not lead to any mineralogical changes in the clay specimens; however, their exchangeable cation positions become prominently dominated by magnesium ions. The consistency limits of the seawater-equilibrated bentonite were determined on stepwise leaching to lower salinities. The predominance of diffuse double-layer repulsion forces in the pore salt concentration range of 42 g L⁻¹ to 1.1 g L⁻¹ caused an increase in the liquid limits of the seawater-equilibrated bentonite specimens as the salinity was reduced over this range. The attraction forces, however, prevail over the repulsion forces at salt concentrations <1.1 g L⁻¹ and cause a decrease in the liquid limit of the clay specimens with reduction in pore salinity, which is typical of nonswelling clays. The attraction forces cause aggregation of the clay unit layers into domains that break down on sodium saturation of the clay specimens. It is inferred that the physico-chemical factors responsible for the nonswelling character of the seawater-equilibrated bentonite specimens at pore salt concentrations below 1.1 g L⁻¹ are inadequate to explain the nonswelling character of smectite-rich Ariake marine clays. The lower consistency limits of the Ariake marine clays in comparison to the nonswelling, seawater-equilibrated bentonite specimens are attributed to a relative deficiency of interparticle forces in the Ariake marine clay.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Software transactional memory (STM) is a promising programming paradigm for shared-memory multithreaded programs as an alternative to traditional lock-based synchronization. However, adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler-driven Lock-Data Colocation (LDC) scheme, targeted at reducing the cache overheads of STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this work, we propose a new organization for the last-level shared cache of a multicore system. Our design is based on the observation that the Next-Use distance, measured in terms of intervening misses between the eviction of a line and its next use, for lines brought in by a given delinquent PC falls within a predictable range of values. We exploit this correlation to improve the performance of shared caches in multi-core architectures by proposing the NUcache organization.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we present a cache coherence protocol for multistage interconnection network (MIN)-based multiprocessors with two distinct private caches: private-blocks caches (PCache) containing blocks private to a process and shared-blocks caches (SCache) containing data accessible by all processes. The architecture is extended by a coherence control bus connecting all shared-block cache controllers. Timing problems due to variable transit delays through the MIN are dealt with by introducing Transient states in the proposed cache coherence protocol. The impact of the coherence protocol on system performance is evaluated through a performance study of three phases. Assuming homogeneity of all nodes, a single-node queuing model (phase 3) is developed to analyze system performance. This model is solved for processor and coherence bus utilizations using the mean value analysis (MVA) technique with shared-blocks steady state probabilities (phase 1) and communication delays (phase 2) as input parameters. The performance of our system is compared to that of a system with an equivalent-sized unified cache and with a multiprocessor implementing a directory-based coherence protocol. System performance measures are verified through simulation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The logarithm of activity coefficients of the components of the ternary system is derived based on the Maclaurin infinite series, which is expressed in terms of the integral property of the system and subjected to appropriate boundary conditions. The derivation of the functions involves extensive summation of various infinite series pertaining to the first-order interaction coefficients that have been shown completely to remove any truncational error. Since the conventional equations involving interaction coefficients are internally inconsistent, a consistent form of the partial functions is developed in the article using the technique just described. The thermodynamic consistency of the functions based on the Maxwell and the Gibbs-Duhem relations has been established. The derived values of the logarithmic activity coefficients of the components have been found to be in agreement with the thermodynamic data of the Fe-Cr-Ni system at 1873 K and have been found to be independent of the compositional paths.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Several replacement policies for web caches have been proposed and studied extensively in the literature. Different replacement policies perform better in terms of (i) the number of objects found in the cache (cache hit), (ii) the network traffic avoided by fetching the referenced object from the cache, or (iii) the savings in response time. In this paper, we propose a simple and efficient replacement policy (hereafter known as SE) which improves all three performance measures. Trace-driven simulations were done to evaluate the performance of SE. We compare SE with two widely used and efficient replacement policies, namely the Least Recently Used (LRU) and Least Unified Value (LUV) algorithms. Our results show that SE performs at least as well as, if not better than, both these replacement policies. Unlike various other replacement policies proposed in the literature, our SE policy does not require parameter tuning or a priori trace analysis and has an efficient and simple implementation that can be incorporated in any existing proxy server or web server with ease.
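
The first two performance measures named above (object hits and traffic avoided) can be illustrated with a small trace-driven simulation. The sketch below models only the LRU baseline, since the abstract does not describe how SE works; the function name `simulate_lru`, the request format, and the sample trace are all assumptions made for illustration.

```python
from collections import OrderedDict

def simulate_lru(requests, capacity_bytes):
    """Replay (url, size_bytes) requests through a size-aware LRU cache.

    Returns (hit_ratio, byte_hit_ratio). Assumes a non-empty trace.
    """
    cache = OrderedDict()  # url -> object size, oldest entry first
    used = 0
    hits = hit_bytes = total = total_bytes = 0
    for url, size in requests:
        total += 1
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # object can never fit; do not cache it
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU object
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / total, hit_bytes / total_bytes

# Hypothetical trace: object names and sizes are made up.
trace = [("a", 100), ("b", 300), ("a", 100), ("b", 300), ("c", 400), ("a", 100)]
hit_ratio, byte_hit_ratio = simulate_lru(trace, capacity_bytes=500)
print(f"hit ratio = {hit_ratio:.3f}, byte hit ratio = {byte_hit_ratio:.3f}")
```

The two ratios can diverge because objects differ in size, which is one reason a policy that does well on one measure may not do well on the other.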

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Today's feature-rich multimedia products require an embedded system solution with a complex System-on-Chip (SoC) to meet market expectations of high performance at a low cost and lower energy consumption. The memory architecture of the embedded system strongly influences critical system design objectives like area, power and performance. Hence the embedded system designer performs a complete memory architecture exploration to custom-design a memory architecture for a given set of applications. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforce a short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of exhaustive-search-based memory exploration at the outer level and a two-step integrated data layout for SPRAM-Cache based architectures at the inner level. We present a two-step integrated approach to data layout for SPRAM-Cache based hybrid architectures: the first step is data partitioning, which partitions data between SPRAM and cache, and the second step is the cache-conscious data layout. We formulate the cache-conscious data layout as a graph partitioning problem and show that our approach gives up to 34% improvement over an existing approach and also optimizes the off-chip memory address space. We evaluated our approach on 3 embedded multimedia applications; it explores several hundred memory configurations for each application, yielding several optimal design points in a few hours of computation on a standard desktop.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The inherent temporal locality in memory accesses is filtered out by the L1 cache. As a consequence, an L2 cache with LRU replacement incurs significantly more misses than the optimal replacement policy (OPT). We propose to narrow this gap through a novel replacement strategy that mimics the replacement decisions of OPT. The L2 cache is logically divided into two components, a Shepherd Cache (SC) with a simple FIFO replacement and a Main Cache (MC) with an emulation of optimal replacement. The SC plays the dual role of caching lines and guiding the replacement decisions in MC. Our proposed organization can cover 40% of the gap between OPT and LRU for a 2MB cache, resulting in 7% overall speedup. Comparison with the dynamic insertion policy, a victim buffer, a V-Way cache and an LRU-based fully associative cache demonstrates that our scheme performs better than all these strategies.
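
The gap between LRU and OPT mentioned above can be seen on a toy trace. The sketch below is not the Shepherd Cache organization, whose details the abstract does not give; it only counts misses for LRU and for Belady's OPT (evict the line whose next use lies farthest in the future) in a small fully associative cache. The function names and the trace are assumptions for illustration.

```python
from collections import OrderedDict

def lru_misses(trace, capacity):
    """Count misses for a fully associative cache with LRU replacement."""
    cache = OrderedDict()
    misses = 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[addr] = True
    return misses

def opt_misses(trace, capacity):
    """Count misses for Belady's OPT, which needs the whole future trace."""
    # next_use[i] is the index of the next reference to trace[i], or infinity.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i
    cache = {}  # addr -> index of its next use
    misses = 0
    for i, addr in enumerate(trace):
        if addr not in cache:
            misses += 1
            if len(cache) >= capacity:
                victim = max(cache, key=cache.get)  # farthest next use
                del cache[victim]
        cache[addr] = next_use[i]
    return misses

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_misses(trace, 3), opt_misses(trace, 3))  # prints "10 7"
```

OPT is not implementable in hardware because it requires future knowledge, which is why practical schemes can only approximate its decisions.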

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. The efficiency of a cache for this application critically depends on the placement function to reduce conflict misses. Traditional placement functions use a one-level mapping that naively partitions trie nodes into cache sets. However, as a significant percentage of trie nodes are not useful, these schemes suffer from a non-uniform distribution of useful nodes to sets. This in turn results in increased conflict misses. Newer organizations such as variable associativity caches achieve flexibility in placement at the expense of increased hit latency. This makes them unsuitable for L1 caches. We propose a novel two-level mapping framework that retains the hit latency of one-level mapping yet incurs fewer conflict misses. This is achieved by introducing a second-level mapping which reorganizes the nodes in the naive initial partitions into refined partitions with a near-uniform distribution of nodes. Further, as this remapping is accomplished by simply adapting the index bits to a given routing table, the hit latency is not affected. We propose three new schemes which result in up to 16% reduction in the number of misses and 13% speedup in memory access time. In comparison, an XOR-based placement scheme, known to perform extremely well for general-purpose architectures, can obtain up to 2% speedup in memory access time.
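
To make the notion of a placement function concrete, the sketch below contrasts a conventional one-level mapping (low-order index bits) with a generic XOR-based mapping on a toy 8-set cache. It does not reproduce the proposed two-level mapping, which the abstract does not specify; the cache geometry, function names, and the strided address pattern are assumptions chosen to show how a non-uniform distribution across sets arises.

```python
from collections import Counter

NUM_SETS = 8    # assumed toy cache with 8 sets
INDEX_BITS = 3  # log2(NUM_SETS)

def modulo_index(block_addr):
    """One-level placement: the low-order index bits select the set."""
    return block_addr & (NUM_SETS - 1)

def xor_index(block_addr):
    """XOR-based placement: fold the next-higher bits into the index."""
    low = block_addr & (NUM_SETS - 1)
    high = (block_addr >> INDEX_BITS) & (NUM_SETS - 1)
    return low ^ high

def set_occupancy(addrs, index_fn):
    """Count how many of the given block addresses land in each set."""
    return Counter(index_fn(a) for a in addrs)

# Hypothetical access pattern: block addresses that are all multiples of 8.
addrs = [8 * k for k in range(8)]
print(set_occupancy(addrs, modulo_index))  # all 8 blocks collide in set 0
print(set_occupancy(addrs, xor_index))     # one block per set
```

With the modulo mapping every block competes for the same set, so conflict misses occur even though the cache as a whole has room; changing which address bits form the index spreads the same blocks evenly without adding lookup steps.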