66 results for hit-miss transform

at Indian Institute of Science - Bangalore - India


Relevance:

80.00%

Publisher:

Abstract:

Simulation is an important means of evaluating new microarchitectures. With the advent of multi-core (CMP) platforms, simulators are becoming larger and more complex. However, with the availability of CMPs with larger caches and higher operating frequencies, the wall-clock time required to simulate an application has become comparatively shorter. Reducing this simulation time further is a great challenge, especially for multi-threaded workloads, because of the indeterminacy introduced by simultaneously executing threads. In this paper, we propose a technique for speeding up multi-core simulation. The models of the processor core and cache are replaced with functional models to achieve speedup. A timed Petri net model is used to estimate the execution time of the processor, and the memory access latencies are estimated using hit/miss information obtained from the functional model of the cache. This model can be used to predict the performance of data-parallel applications or multiprogramming workloads on CMP platforms with various cache hierarchies and a shared bus interconnect. The error in the estimated execution time of an application is within 6%. The average speedup over the cycle-accurate simulator ranges from 2x to 4x.
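As an illustration of how hit/miss information from a functional cache model can feed a latency estimate, here is a minimal Python sketch. It is not the paper's simulator: the direct-mapped organization, the function names `simulate_cache` and `estimate_memory_cycles`, and the latency values are all assumptions chosen for the example, and the timed Petri net part is not modelled.

```python
def simulate_cache(addresses, num_lines=4, block_size=16):
    """Functional direct-mapped cache: one True (hit) or False (miss) per access.

    Only tags are tracked; no data and no timing (assumed organization).
    """
    tags = [None] * num_lines
    outcomes = []
    for addr in addresses:
        block = addr // block_size      # block number of this address
        index = block % num_lines       # cache line the block maps to
        tag = block // num_lines        # identifies which block occupies the line
        hit = tags[index] == tag
        if not hit:
            tags[index] = tag           # fill the line on a miss
        outcomes.append(hit)
    return outcomes


def estimate_memory_cycles(outcomes, hit_latency=2, miss_penalty=100):
    """Sum assumed per-access latencies from the hit/miss stream."""
    return sum(hit_latency if hit else hit_latency + miss_penalty
               for hit in outcomes)


trace = [0, 4, 64, 0, 8]                # hypothetical byte addresses
outcomes = simulate_cache(trace)
total_cycles = estimate_memory_cycles(outcomes)
print(outcomes, total_cycles)
```

Addresses 0 and 64 map to the same line here, so the second access to address 0 misses again; the estimate only needs that hit/miss stream, not a cycle-accurate cache.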

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we present Bi-Modal Cache, a flexible stacked DRAM cache organization which simultaneously achieves several objectives: (i) improved cache hit ratio, (ii) moving the tag storage overhead to DRAM, (iii) lower cache hit latency than tags-in-SRAM, and (iv) reduction in off-chip bandwidth wastage. The Bi-Modal Cache addresses the miss rate versus off-chip bandwidth dilemma by organizing the data in a bi-modal fashion: blocks with high spatial locality are organized as large blocks and those with little spatial locality as small blocks. By adaptively selecting the right granularity of storage for individual blocks at run-time, the proposed DRAM cache organization is able to make judicious use of the available DRAM cache capacity as well as reduce the off-chip memory bandwidth consumption. The Bi-Modal Cache improves cache hit latency despite moving the metadata to DRAM by means of a small SRAM-based Way Locator. Further, by leveraging the tremendous internal bandwidth and capacity that stacked DRAM organizations provide, the Bi-Modal Cache enables efficient concurrent accesses to tags and data to reduce hit time. Through detailed simulations, we demonstrate that the Bi-Modal Cache achieves an overall performance improvement (in terms of Average Normalized Turnaround Time (ANTT)) of 10.8%, 13.8% and 14.0% in 4-core, 8-core and 16-core workloads respectively.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we generalize the existing rate-one space frequency (SF) and space-time frequency (STF) code constructions. The objective of this exercise is to provide a systematic design of full-diversity STF codes with high coding gain. Under this generalization, STF codes are formulated as linear transformations of data. Conditions on these linear transforms are then derived so that the resulting STF codes achieve full diversity and high coding gain with a moderate decoding complexity. Many of these conditions involve channel parameters like delay profile (DP) and temporal correlation. When these quantities are not available at the transmitter, design of codes that exploit full diversity on channels with arbitrary DP and temporal correlation is considered. Complete characterization of a class of such robust codes is provided and their bit error rate (BER) performance is evaluated. On the other hand, when channel DP and temporal correlation are available at the transmitter, linear transforms are optimized to maximize the coding gain of full-diversity STF codes. BER performance of such optimized codes is shown to be better than that of existing codes.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents the architecture and the VHDL design of an integer 2-D DCT used in H.264/AVC. The 2-D DCT computation is performed by exploiting its orthogonality and separability properties. The symmetry of the forward and inverse transforms is used in this implementation. To reduce the computation overhead of the addition, subtraction and multiplication operations, we analyze the suitability of the carry-free, position-independent residue number system (RNS) for the implementation of the 2-D DCT. The implementation has been carried out in VHDL for an Altera FPGA. We used the negative number representation in RNS, bit-width analysis of the transforms and the dedicated registers present in the logic element of the FPGA to optimize the area. The complexity and efficiency analysis shows that the proposed architecture could provide higher throughput.
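The two ideas named in this abstract, separability and carry-free residue arithmetic, can be sketched in a few lines of Python. This is an illustrative software model only, not the paper's VHDL design: the matrix `CF` is the standard unscaled 4x4 forward core transform of H.264, while the moduli (29, 31, 32), the centered mapping of negative values, the helper names and the input block are assumptions made for the example.

```python
from math import prod

# Unscaled 4x4 forward core transform of H.264 (scaling is left to quantization).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

MODULI = (29, 31, 32)   # assumed pairwise-coprime moduli; dynamic range M = 28768


def matmul(a, b, mod=None):
    """Plain integer matrix product, optionally reduced modulo `mod`."""
    out = [[sum(a[i][k] * b[k][j] for k in range(len(b)))
            for j in range(len(b[0]))] for i in range(len(a))]
    if mod is not None:
        out = [[v % mod for v in row] for row in out]
    return out


def transform_2d(x, mod=None):
    """Separable 2-D transform: 1-D pass over columns, then over rows."""
    cf_t = [list(col) for col in zip(*CF)]
    return matmul(matmul(CF, x, mod), cf_t, mod)


def crt_signed(residues, moduli=MODULI):
    """Chinese remainder reconstruction; upper half of the range is negative."""
    m_total = prod(moduli)
    value = sum(r * (m_total // m) * pow(m_total // m, -1, m)
                for r, m in zip(residues, moduli)) % m_total
    return value - m_total if value > m_total // 2 else value


X = [[5, 11, 8, 10],    # hypothetical 4x4 input block
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

direct = transform_2d(X)
# Each residue channel works independently; no carries cross between channels.
channels = [transform_2d([[v % m for v in row] for row in X], m) for m in MODULI]
via_rns = [[crt_signed([ch[i][j] for ch in channels]) for j in range(4)]
           for i in range(4)]
print(via_rns == direct)
```

The moduli must give a dynamic range larger than twice the largest output magnitude, otherwise the signed reconstruction wraps around; that sizing is the kind of bit-width question the abstract mentions.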

Relevance:

20.00%

Publisher:

Abstract:

The breakdown of the usual method of Fourier transforms in the problem of an external line crack in a thin infinite elastic plate is discovered and the correct solution of this problem is derived using the concept of a generalised Fourier transform of a type discussed first by Golecki [1] in connection with Flamant's problem.

Relevance:

20.00%

Publisher:

Abstract:

Quantization formats of four digital holographic codes (Lohmann, Lee, Burckhardt and Hsueh-Sawchuk) are evaluated. A quantitative assessment is made from errors in both the Fourier transform and image domains. In general, small errors in the Fourier amplitude or phase alone do not guarantee high image fidelity. From quantization considerations, the Lee hologram is shown to be the best choice for randomly phase coded objects. When phase coding is not feasible, the Lohmann hologram is preferable as it is easier to plot.

Relevance:

20.00%

Publisher:

Abstract:

Using an analysis-by-synthesis (AbS) approach, we develop a soft-decision-based switched vector quantization (VQ) method for high-quality and low-complexity coding of wideband speech line spectral frequency (LSF) parameters. For each switching region, a low-complexity transform domain split VQ (TrSVQ) is designed. The overall rate-distortion (R/D) performance optimality of the new switched quantizer is addressed in the Gaussian mixture model (GMM) based parametric framework. In the AbS approach, the reduction of quantization complexity is achieved through the use of nearest neighbor (NN) TrSVQs and by splitting the transform domain vector into a higher number of subvectors. Compared to the current LSF quantization methods, the new method is shown to provide a competitive or better trade-off between R/D performance and complexity.
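To make the split-VQ idea concrete, the following Python sketch quantizes a vector by cutting it into subvectors and running an independent nearest-neighbor search in a small codebook for each. It covers only that one step: the decorrelating transform, the switching regions and the GMM framework from the abstract are not modelled, and the codebooks, split sizes and function names are hypothetical.

```python
def nearest(codebook, sub):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - s) ** 2 for c, s in zip(codebook[i], sub)))


def split_vq_encode(vec, codebooks, split_sizes):
    """Quantize each subvector independently; returns one index per split."""
    indices, start = [], 0
    for codebook, size in zip(codebooks, split_sizes):
        indices.append(nearest(codebook, vec[start:start + size]))
        start += size
    return indices


def split_vq_decode(indices, codebooks):
    """Concatenate the selected codewords back into a full vector."""
    out = []
    for idx, codebook in zip(indices, codebooks):
        out.extend(codebook[idx])
    return out


# Hypothetical toy codebooks: a 2-D split followed by a 1-D split.
codebooks = [[(0.0, 0.0), (1.0, 1.0)],
             [(0.0,), (2.0,), (4.0,)]]
split_sizes = (2, 1)

indices = split_vq_encode([0.9, 0.8, 2.9], codebooks, split_sizes)
decoded = split_vq_decode(indices, codebooks)
print(indices, decoded)
```

Searching several small codebooks costs far fewer distance computations than one search over a joint codebook of the same total bit budget, which is why a larger number of splits lowers complexity, at some cost in rate-distortion performance.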

Relevance:

20.00%

Publisher:

Abstract:

The Fourier transforms of the collagen molecular structure have been calculated taking into consideration various side chain atoms, as well as the presence of bound water molecules. There is no significant change in the calculated intensity distribution on including the side chain atoms of non-imino-acid residues. Taking into account the presence of about two bound water molecules per tripeptide unit, the agreement with the observed x-ray pattern is slightly improved. Fourier transforms have also been calculated for the detailed molecular geometries proposed from other laboratories. It is found that there are no major differences between them, as compared to our structure, either in the positions of peak intensity or in the intensity distribution. Hence it is not possible to judge the relative merits of the various molecular geometries for the collagen triple helix from a comparison of the calculated transforms with the meagre data available from its x-ray fibre pattern. It is also concluded that the collagen molecular structure should be regarded as a somewhat flexible chain structure, capable of adapting itself to the requirements of the different side groups which occur in each local region.

Relevance:

20.00%

Publisher:

Abstract:

We investigate the use of a two-stage transform vector quantizer (TSTVQ) for coding of line spectral frequency (LSF) parameters in wideband speech coding. The first-stage quantizer of TSTVQ provides better matching of the source distribution, and the second-stage quantizer provides additional coding gain by using an individual cluster-specific decorrelating transform and variance normalization. Further coding gain is shown to be achieved by exploiting the slow time-varying nature of speech spectra and thus using the inter-frame cluster continuity (ICC) property in the first stage of the TSTVQ method. The proposed method saves 3-4 bits and reduces the computational complexity by 58-66% compared to the traditional split vector quantizer (SVQ), but at the expense of 1.5-2.5 times the memory.

Relevance:

20.00%

Publisher:

Abstract:

A number of papers have appeared on the application of operational methods, and in particular the Laplace transform, to problems concerning non-linear systems of one kind or another. These efforts, however, have met with only partial success in solving a class of non-linear problems, as each approach has some limitations and drawbacks. In this study the approach of Baycura has been extended to certain third-order non-linear systems subjected to non-periodic excitations, as this approximate method combines the advantages of engineering accuracy with ease of application to such problems. Under non-periodic excitations the method provides a procedure for quickly estimating the maximum response amplitude, which is important from the point of view of a designer. Limitations of such a procedure are brought out and the method is illustrated by an example taken from a physical situation.

Relevance:

20.00%

Publisher:

Abstract:

Histones H1a and H1t are two major linker histone variants present at the pachytene interval of mammalian spermatogenesis. The DNA- and chromatin-condensing properties of these two variants isolated from rat testes were studied and compared with those from rat liver. For this purpose, the histone H1 subtypes were purified from the respective tissues using both acid and salt extraction procedures. Circular dichroism studies revealed that acid exposure during isolation affects the alpha-helical structure of both the globular domain (in the presence of 1 M NaCl) and the C-terminal lambda-tail (in the presence of 60% trifluoroethanol). The condensation of rat oligonucleosomal DNA, as measured by circular dichroism spectroscopy, by the salt-extracted histone H1 was at least 10 times more efficient than condensation by the acid-extracted histone H1. A site size of 16-20 base pairs was calculated for the salt-extracted histone H1. Among the different histone H1 subtypes, somatic histone H1bdec had the highest DNA-condensing property, followed by histone H1a and histone H1t. All the salt-extracted histones condensed rat oligonucleosomal DNA more efficiently than linear pBR-322 DNA. Histones H1bdec and H1a condensed histone H1-depleted chromatin, prepared from rat liver nuclei, with relatively equal efficiency. On the other hand, there was no condensation of histone H1-depleted chromatin with the testes-specific histone H1t. A comparison of the amino acid sequences of histone H1d (rat) and histone H1t (rat) revealed several interesting differences in the occurrence of DNA-binding motifs at the C-terminus. A striking observation is the presence of a direct repeat of an octapeptide motif K(A)T(S)PKKA(S)K(T)K(A) in histone H1d that is absent in histone H1t.

Relevance:

20.00%

Publisher:

Abstract:

We present a signal processing approach using discrete wavelet transform (DWT) for the generation of complex synthetic aperture radar (SAR) images at an arbitrary number of dyadic scales of resolution. The method is computationally efficient and is free from significant system-imposed limitations present in traditional subaperture-based multiresolution image formation. Problems due to aliasing associated with biorthogonal decomposition of the complex signals are addressed. The lifting scheme of DWT is adapted to handle complex signal approximations and employed to further enhance the computational efficiency. Multiresolution SAR images formed by the proposed method are presented.
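As a rough illustration of a lifting-based DWT applied to complex samples over several dyadic scales, here is a short Python sketch. It uses the Haar wavelet purely as an assumption, because its predict and update steps fit in one line each; the abstract refers to biorthogonal decompositions and their aliasing issues, which this toy does not reproduce. The function names and the sample values are hypothetical.

```python
def haar_lift_forward(signal):
    """One lifting level: split, predict (detail), update (approximation)."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]           # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]    # update: pairwise mean
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order and interleave the samples."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def multiresolution(signal, levels):
    """Coarse approximation after `levels` dyadic steps, plus all details."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_lift_forward(approx)
        details.append(detail)
    return approx, details


# Hypothetical complex samples (length must be divisible by 2**levels).
samples = [1 + 2j, 3 - 1j, 0 + 4j, 2 + 2j, -1 + 0j, 1 + 1j, 2 - 2j, 0 + 0j]
coarse, details = multiresolution(samples, levels=2)

# Rebuild the original by inverting the levels from coarsest to finest.
rebuilt = coarse
for detail in reversed(details):
    rebuilt = haar_lift_inverse(rebuilt, detail)
print(coarse, rebuilt == samples)
```

Because each lifting step is undone by subtracting exactly what was added, the same code works unchanged on complex values, and each level halves the number of samples in the approximation.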

Relevance:

20.00%

Publisher:

Abstract:

We derive expressions for the convolution-multiplication properties of the discrete cosine transform II (DCT II), starting from equivalent discrete Fourier transform (DFT) representations. Using these expressions, a method for implementing linear filtering through block convolution in the DCT II domain is presented. For the case of a nonsymmetric impulse response, an additional discrete sine transform II (DST II) is required for implementing the filter in the DCT II domain, whereas for a symmetric impulse response, the additional transform is not required. Comparison with a recently proposed circular convolution technique in the DCT II domain shows that the proposed method is computationally more efficient.

Relevance:

20.00%

Publisher:

Abstract:

The images of Hermite and Laguerre-Sobolev spaces under the Hermite and special Hermite semigroups (respectively) are characterized. These are used to characterize the image of the Schwartz class of rapidly decreasing functions f on R^n and C^n under these semigroups. The image of the space of tempered distributions is also considered and a Paley-Wiener theorem for the windowed (short-time) Fourier transform is proved.