983 results for improving standards


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Solving the generalized eigenproblem, K phi = lambda M phi, by the classical inverse iteration method converges slowly for some eigenproblems. In this paper, a modified inverse iteration algorithm is presented to improve the convergence rate. At every iteration, an optimal linear combination of the latest and the preceding iteration vectors is used as the input vector for the next iteration. The effectiveness of the proposed algorithm is demonstrated for three typical eigenproblems, i.e. eigenproblems with distinct, close and repeated eigenvalues. The algorithm yields savings in computational time of 29%, 96% and 23%, respectively, for these problems. The algorithm is also simple and easy to implement, which makes it even more attractive.
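
For orientation, the baseline that this abstract improves on can be sketched as follows. This is a minimal pure-Python sketch of classical inverse iteration only, not the modified algorithm (the abstract does not give the formula for the optimal linear combination). The function names, the starting vector of ones, the Rayleigh-quotient estimate and the stopping rule are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrices are left untouched.
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def inverse_iteration(K, M, tol=1e-12, max_iter=500):
    """Classical inverse iteration for K phi = lambda M phi.

    Returns an estimate of the smallest eigenvalue and an M-normalized
    eigenvector. The stopping rule (relative change in the Rayleigh
    quotient) is an illustrative assumption.
    """
    x = [1.0] * len(K)          # assumed starting vector
    lam = 0.0
    for _ in range(max_iter):
        y = solve(K, matvec(M, x))              # K y = M x
        My = matvec(M, y)
        lam_new = dot(y, matvec(K, y)) / dot(y, My)   # Rayleigh quotient
        norm = dot(y, My) ** 0.5
        x = [v / norm for v in y]               # M-normalize
        if abs(lam_new - lam) <= tol * abs(lam_new):
            return lam_new, x
        lam = lam_new
    return lam, x


K = [[2.0, -1.0], [-1.0, 2.0]]
M = [[2.0, 0.0], [0.0, 1.0]]
lam, phi = inverse_iteration(K, M)
print(lam)  # close to (3 - sqrt(3)) / 2
```

The convergence rate of this baseline is governed by the ratio of the two smallest eigenvalues, which is why close eigenvalues make it slow.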

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new design technique for an SVC-based power system damping controller has been proposed. The controller attempts to place all plant poles within a specified region on the s-plane to guarantee the desired closed loop performance. The use of Horowitz's quantitative feedback theory (QFT) permits the design of a 'fixed gain controller' that maintains its performance in spite of large variations in the plant parameters during its normal course of operation. The required controller parameters are arrived at by solving an optimization problem that incorporates the control specifications. The performance of this robust controller has been evaluated on a single machine infinite bus system equipped with a mid point SVC, and the results are shown to be consistent with the expected performance of the stabilizer. (C) 1998 Elsevier Science S.A. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we propose a new method of data handling for web servers. We call this method Network Aware Buffering and Caching (NABC for short). NABC reduces data copies in a web server's data sending path by doing three things: (1) laying out the data in main memory so that protocol processing can be done without data copies; (2) keeping a unified cache of data in the kernel and ensuring safe access to it by various processes and the kernel; and (3) passing only the necessary metadata between processes so that the bulk data handling time spent during IPC is reduced. We realize NABC by implementing a set of system calls and a user library. The end product of the implementation is a set of APIs specifically designed for use by web servers. We port an in-house web server called SWEET to the NABC APIs and evaluate performance using a range of workloads, both simulated and real. The results show a gain of 12% to 21% in throughput for static file serving and a gain of 1.6 to 4 times in throughput for lightweight dynamic content serving for a server using the NABC APIs over one using the UNIX APIs.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Digest caches have been proposed as an effective method to speed up packet classification in network processors. In this paper, we show that the presence of a large number of small flows and a few large flows in the Internet has an adverse impact on the performance of these digest caches. In the Internet, a few large flows transfer a majority of the packets whereas the contribution of several small flows to the total number of packets transferred is small. In such a scenario, the LRU cache replacement policy, which gives maximum priority to the most recently accessed digest, tends to evict digests belonging to the few large flows. We propose a new cache management algorithm called Saturating Priority (SP) which aims at improving the performance of digest caches in network processors by exploiting the disparity between the number of flows and the number of packets transferred. Our experimental results demonstrate that SP performs better than the widely used LRU cache replacement policy in size constrained caches. Further, we characterize the misses experienced by flow identifiers in digest caches.
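
The LRU behaviour described in this abstract can be reproduced with a small sketch. It models only plain LRU, not the SP algorithm, whose details the abstract does not give. The class and function names, the use of flow identifiers as cache keys in place of real digests, and the synthetic traffic pattern are all assumptions for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and return False."""
        if key in self.entries:
            self.entries.move_to_end(key)   # mark as most recently used
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[key] = True
        return False


def large_flow_misses(capacity, burst, rounds):
    """Count misses of one large flow under a synthetic trace.

    In each round the large flow sends one packet, then `burst` new
    single-packet flows each send one packet (hypothetical workload).
    """
    cache = LRUCache(capacity)
    misses = 0
    small_id = 0
    for _ in range(rounds):
        if not cache.access("large"):
            misses += 1
        for _ in range(burst):
            cache.access(("small", small_id))
            small_id += 1
    return misses


# With 4 entries, a burst of 4 one-packet flows pushes the large flow out
# every round; a burst of 3 leaves it resident after the first miss.
print(large_flow_misses(capacity=4, burst=4, rounds=10))
print(large_flow_misses(capacity=4, burst=3, rounds=10))
```

In this toy trace the large flow misses on every packet once the burst of small flows reaches the cache size, even though it carries far more packets than any small flow.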

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The approach taken in this paper in order to modify the scattering features of electrons and phonons and improve the figure of merit (ZT) of thermoelectric PbTe is to alter the microstructure at constant chemistry. A lamellar pattern of PbTe/GeTe at the nano- and microscale was produced in Pb(0.36)Ge(0.64)Te alloy by the diffusional decomposition of a supersaturated solid solution. The mechanism of nanostructuration is most likely a discontinuous spinodal decomposition. A simple model relating the interface velocity to the observed lamellar spacing is proposed. The effects of nanostructuration in Pb(0.36)Ge(0.64)Te alloy on the electrical and thermal conductivity, thermopower and ZT were investigated. It was shown that nanostructuration through the formation of a lamellar pattern of PbTe/GeTe is unlikely to provide a significant improvement due to the occurrence of discontinuous coarsening. However, the present study allows an analysis of possible strategies to improve thermoelectric materials via optimal design of the microstructure and optimized heat treatment. (C) 2011 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, this increased fetch bandwidth cannot be exploited unless pipeline stages further downstream correspondingly improve. In particular, register renaming a large number of instructions per cycle is difficult. A large instruction window, needed to receive multiple basic blocks per cycle, will slow down dependence resolution and instruction issue. This paper addresses these and related issues by proposing (i) partitioning of the instruction window into multiple blocks, each holding a dynamic code sequence; (ii) logical partitioning of the register file into a global file and several local files, the latter holding registers local to a dynamic code sequence; (iii) the dynamic recording and reuse of register renaming information for registers local to a dynamic code sequence. Performance studies show these mechanisms improve performance over traditional superscalar processors by factors ranging from 1.5 to a little over 3 for the SPEC Integer programs. Next, it is observed that several of the loops in the benchmarks display vector-like behavior during execution, even if the static loop bodies are likely complex for compile-time vectorization. A dynamic loop vectorization mechanism that builds on top of the above mechanisms is briefly outlined. The mechanism vectorizes up to 60% of the dynamic instructions for some programs, albeit the average number of iterations per loop is quite small.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Software transactional memory (STM) has been proposed as a promising programming paradigm for shared memory multi-threaded programs as an alternative to conventional lock based synchronization primitives. Typical STM implementations employ a conflict detection scheme that works with uniform access granularity, tracking shared data accesses either at word/cache line level or at object level. It is well known that a single fixed access tracking granularity cannot meet the conflicting goals of reducing false conflicts without impacting concurrency adversely. A fine-grained granularity, while improving concurrency, can have an adverse impact on performance due to lock aliasing, lock validation overheads, and additional cache pressure. On the other hand, a coarse-grained granularity can impact performance due to reduced concurrency. Thus, in general, a fixed or uniform granularity access tracking (UGAT) scheme is application-unaware and rarely matches the access patterns of individual applications or parts of an application, leading to sub-optimal performance for different parts of the application(s). In order to mitigate the disadvantages associated with the UGAT scheme, we propose a Variable Granularity Access Tracking (VGAT) scheme in this paper. We propose a compiler based approach wherein the compiler uses inter-procedural whole program static analysis to select the access tracking granularity for different shared data structures of the application based on the application's data access pattern. We describe our prototype VGAT scheme, using TL2 as our STM implementation. Our experimental results reveal that the VGAT-STM scheme can improve the application performance of STAMP benchmarks by 1.87% to 21.2%.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Epitaxial-Bain-Path and Uniaxial-Bain-Path studies reveal that a B2-CuZr nanowire with Zr atoms on the surface is energetically more stable than a B2-CuZr nanowire with Cu atoms on the surface. Nanowires of cross-sectional dimensions in the range of ~20-50 are considered. Such stability is also correlated with the initial state of stress in the nanowires. It is also demonstrated here that the more stable structure, i.e., the B2-CuZr nanowire with Zr atoms at the surface, shows improved yield strength compared to the B2-CuZr nanowire with Cu atoms at the surface site, over a range of temperatures under both tensile and compressive loading. A nearly 18% increase in the average yield strength under tensile loading and a nearly 26% increase in the average yield strength under compressive loading are observed for nanowires with various cross-sectional dimensions and temperatures. It is also observed that the B2-CuZr nanowire with Cu atoms at the surface site shows a decrease in failure/plastic strain with an increase in temperature. On the other hand, B2-CuZr nanowires with Zr at the surface site show an improvement in failure/plastic strain, especially at higher temperatures, compared to the B2-CuZr nanowires that have Cu atoms at the surface site. Finally, a possible design methodology for an energetically stable nano-structure with improved thermo-mechanical properties via manipulating the surface atom configuration is proposed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The generalization performance of the SVM classifier depends mainly on the VC dimension and the dimensionality of the data. By reducing the VC dimension of the SVM classifier, its generalization performance is expected to increase. In the present paper, we argue that the VC dimension of the SVM classifier can be reduced by applying bootstrapping and dimensionality reduction techniques. Experimental results showed that bootstrapping the original data and bootstrapping the projected (dimensionally reduced) data improved the performance of the SVM classifier.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data prefetchers identify and make use of any regularity present in the history/training stream to predict future references and prefetch them into the cache. The training information used is typically the primary misses seen at a particular cache level, which is a filtered version of the accesses seen by the cache. In this work we demonstrate that extending the training information to include secondary misses and hits along with primary misses helps improve the performance of prefetchers. In addition to empirical evaluation, we use the information theoretic metric, entropy, to quantify the regularity present in extended histories. Entropy measurements indicate that extended histories are more regular than the default training stream of primary misses only. Entropy measurements also help corroborate our empirical findings. With extended histories, further benefits can be achieved by also triggering prefetches during secondary misses. In this paper we explore the design space of extended prefetch histories and alternative prefetch trigger points for delta correlation prefetchers. We observe that different prefetch schemes benefit to different extents from extended histories and alternative trigger points. Also, the best performing design point varies on a per-benchmark basis. To meet these requirements, we propose a simple adaptive scheme that identifies the best performing design point for a benchmark-prefetcher combination at runtime. In SPEC2000 benchmarks, using all the L2 accesses as history for the prefetcher improves performance in terms of both IPC and misses reduced over techniques that use only primary misses as history. The adaptive scheme improves the performance of the CZone prefetcher over Baseline by 4.6% on average. These performance gains are accompanied by a moderate reduction in the memory traffic requirements.
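
One way to picture the entropy measurement mentioned in this abstract is the sketch below. It computes the Shannon entropy of the distribution of address deltas in a training stream. The abstract does not state how entropy is defined over the history, so this first-order definition, the function name `delta_entropy` and the example streams are assumptions.

```python
import math
from collections import Counter


def delta_entropy(addresses):
    """Shannon entropy (in bits) of the deltas between successive addresses.

    A lower value means a more regular, and hence more predictable, stream.
    """
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    if not deltas:
        return 0.0
    counts = Counter(deltas)
    total = len(deltas)
    # Sum of p * log2(1/p) over the observed delta values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A constant-stride stream has a single delta value, so its entropy is zero.
print(delta_entropy([0, 64, 128, 192, 256]))
# A stream alternating between two deltas carries one bit per delta.
print(delta_entropy([0, 1, 3, 4, 6]))
```

Under this definition, comparing the value for a primary-miss-only stream with the value for a stream that also contains hits and secondary misses gives a rough regularity comparison of the kind the abstract describes.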

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, based on the temporal and spatial locality characteristics of memory accesses in multicores, we propose a re-organization of the existing single large row buffer in a DRAM bank into multiple smaller row-buffers. The proposed configuration helps improve the row hit rates and also brings down the energy required for row-activations. The major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves performance by 35.8%, 14.5% and 21.6% in quad, eight and sixteen core workloads along with a 42%, 28% and 31% reduction in DRAM energy. Additionally, we introduce a Need Based Allocation scheme for buffer management that shows additional performance improvement.
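
The effect on row hit rate can be illustrated with a toy model. This is a sketch under stated assumptions, not the paper's design: each buffer is assumed to hold one contiguous segment of a row, buffers are assumed to be replaced in LRU order, and the function name, parameters and two-core trace are hypothetical.

```python
from collections import OrderedDict


def row_hit_rate(accesses, num_buffers, columns_per_buffer):
    """Fraction of (row, column) accesses that hit an open row buffer.

    The bank has `num_buffers` buffers, each caching `columns_per_buffer`
    consecutive columns of one row. LRU replacement is an assumption.
    """
    buffers = OrderedDict()
    hits = 0
    for row, column in accesses:
        key = (row, column // columns_per_buffer)
        if key in buffers:
            hits += 1
            buffers.move_to_end(key)          # most recently used
        else:
            if len(buffers) >= num_buffers:
                buffers.popitem(last=False)   # close least recently used
            buffers[key] = True               # row activation
    return hits / len(accesses)


# Two cores interleave sequential accesses to different rows of one bank.
trace = []
for column in range(8):
    trace.append((0, column))   # core A walks row 0
    trace.append((1, column))   # core B walks row 1

print(row_hit_rate(trace, num_buffers=1, columns_per_buffer=1024))
print(row_hit_rate(trace, num_buffers=4, columns_per_buffer=256))
```

With one large buffer the two cores keep closing each other's row, so every access needs an activation; with the same total capacity split into four buffers, both rows stay open after their first access.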