997 results for hierarchical memory
Abstract:
Today's feature-rich multimedia products require embedded system solutions with complex Systems-on-Chip (SoCs) to meet market expectations of high performance at low cost and low energy consumption. The memory architecture of the embedded system strongly influences these parameters. Hence the embedded system designer performs a complete memory architecture exploration. This is a multi-objective optimization problem and can be tackled as a two-level optimization problem. The outer level explores various memory architectures, while the inner level explores the placement of data sections (the data layout problem) to minimize memory stalls. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforce a short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of a multi-objective genetic algorithm (memory architecture exploration) and an efficient heuristic data placement algorithm. At the outer level, the memory architecture exploration is done by picking memory modules directly from an ASIC memory library. This helps in performing the memory architecture exploration in an integrated framework, where memory allocation, memory exploration and data layout work in a tightly coupled way to yield optimal design points with respect to area, power and performance. We evaluated our approach on 3 embedded applications; it explores several thousand memory architectures for each application, yielding a few hundred optimal design points in a few hours of computation time on a standard desktop.
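The "optimal design points" in this abstract are points that no other candidate beats on every objective at once. The sketch below shows only that non-dominated filtering step, not the authors' genetic algorithm or data placement heuristic. The tuple layout (area, power, stall cycles) and the function names are illustrative assumptions, and all three objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Hypothetical design point: (area, power, stall_cycles), all to be minimized.
DesignPoint = Tuple[float, float, float]


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(1.0, 5.0, 3.0), (2.0, 2.0, 2.0), (3.0, 3.0, 3.0)]
front = pareto_front(candidates)  # (3.0, 3.0, 3.0) is dominated by (2.0, 2.0, 2.0)
```

The quadratic filter is adequate for a few thousand candidate architectures; larger populations would usually call for a sorted or incremental variant.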
Abstract:
We propose the design and implementation of a hardware architecture for a spatial-prediction-based image compression scheme, which consists of a prediction phase and a quantization phase. In the prediction phase, the hierarchical tree structure obtained from the test image is used to predict every central pixel of an image from its four neighboring pixels. The prediction scheme generates an error image, to which the wavelet/sub-band coding algorithm can be applied to obtain efficient compression. The software model is tested for its performance in terms of entropy and standard deviation. Memory and silicon area constraints play a vital role in the realization of the hardware for hand-held devices. The hardware architecture constructed for the proposed scheme exploits parallelism in both instructions and data. The processor consists of pipelined functional units to obtain maximum throughput and a higher speed of operation. The hardware model is analyzed for performance in terms of throughput, speed and power. The results of the hardware model indicate that the proposed architecture is suitable for power-constrained implementations with higher data rates.
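As a rough illustration of how a four-neighbor predictor produces an error image, the sketch below predicts each interior pixel as the integer mean of its north, south, west and east neighbors and stores the residual. The averaging rule and the function name are assumptions: the abstract does not state the actual predictor, and the hierarchical tree ordering is not modeled here.

```python
from typing import List


def prediction_error(image: List[List[int]]) -> List[List[int]]:
    """Residual of a four-neighbor mean predictor (assumed rule, interior pixels only)."""
    height, width = len(image), len(image[0])
    error = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            # Integer mean of the north, south, west and east neighbors.
            predicted = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1]) // 4
            error[r][c] = image[r][c] - predicted
    # Border pixels are skipped in this sketch and stay at 0.
    return error


sample = [[10, 10, 10],
          [10, 14, 10],
          [10, 10, 10]]
residual = prediction_error(sample)  # only the center pixel has a nonzero residual
```

On smooth regions the residuals cluster near zero, which is what lowers the entropy seen by the subsequent coding stage.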
Abstract:
We report the one-pot hydrothermal synthesis of nearly monodisperse, water-soluble cadmium telluride (CdTe) quantum dots (QDs) capped with 3-mercaptopropionic acid, using an air-stable Te source. Their optical and electrical characteristics were also studied. It was shown that the hydrothermal synthesis could be tuned to produce nanostructures of uniform, nanometer-scale size. The emission of the CdTe QDs thus synthesized could be tuned over the range of 500-700 nm by varying the duration of synthesis. The full width at half maximum (FWHM) of the emission peaks is relatively narrow (40-90 nm), which indicates a nearly uniform distribution of QD size. The structural and optical properties of the QDs were characterized by transmission electron microscopy (TEM), photoluminescence (PL) and ultraviolet-visible (UV-Vis) spectroscopy. The photoluminescence quenching of the CdTe QDs in the presence of L-cysteine and DNA confirms their biocompatibility and their utility for biosensing applications. The room-temperature current-voltage characteristics of a QD film on an ITO-coated glass substrate show electrically induced switching between states of high and low conductivity. The phenomenon is explained on the basis of charge confinement in the quantum dots. © 2011 Elsevier B.V. All rights reserved.
Abstract:
The properties of widely used Ni-Ti-based shape memory alloys (SMAs) are highly sensitive to the underlying microstructure. Hence, controlling the evolution of microstructure during high-temperature deformation becomes important. In this article, the "processing maps" approach is utilized to identify the combination of temperature and strain rate for thermomechanical processing of a Ni42Ti50Cu8 SMA. Uniaxial compression experiments were conducted in the temperature range of 800-1050 °C and at strain rates between 10⁻³ and 10² s⁻¹. Two-dimensional power dissipation efficiency and instability maps were generated, and the deformation mechanisms that operate in different temperature and strain rate regimes were identified with the aid of the maps and complementary microstructural analysis of the deformed specimens. Results show that the safe window for industrial processing of this alloy is 800-850 °C at 0.1 s⁻¹, which leads to grain refinement and strain-free grains. Regions of instability were also identified; these result in a strained microstructure, which in turn can affect the performance of the SMA.
Abstract:
We propose robust and scalable processes for the fabrication of floating gate devices using ordered arrays of 7 nm gold nanoparticles as charge storage nodes. The proposed strategy can be readily adapted for fabricating next-generation (sub-20 nm node) non-volatile memory devices.
Abstract:
Past studies of memory interference in multiprocessor systems have generally assumed that the references of each processor are uniformly distributed among the memory modules. In this paper we develop a model with local referencing, which reflects more closely the behavior of real-life programs. This model is analyzed using Markov chain techniques, and expressions are derived for the multiprocessor performance. New expressions are also obtained for the performance in the traditional uniform reference model and are compared with other expressions available in the literature. Results of a simulation study are given to show the accuracy of the expressions for both models.
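To make the difference between uniform and local referencing concrete, the sketch below computes the expected number of busy memory modules per cycle under a simple memoryless approximation in which blocked requests are discarded rather than resubmitted. This is a much cruder assumption than the Markov chain analysis the abstract describes, and the function names, the parameter `alpha` (the probability that a processor references its own favorite module) and the one-favorite-module-per-processor layout are all illustrative assumptions.

```python
def bandwidth_uniform(n: int, m: int) -> float:
    """Expected busy modules when n processors each pick one of m modules uniformly."""
    return m * (1.0 - (1.0 - 1.0 / m) ** n)


def bandwidth_local(n: int, alpha: float) -> float:
    """Expected busy modules with n processors and n modules (n >= 2).

    Assumed model: processor i requests module i with probability alpha and
    each of the other n - 1 modules with probability (1 - alpha) / (n - 1).
    """
    # Module j stays idle only if its own processor goes elsewhere and
    # none of the other n - 1 processors happens to pick it.
    p_other = (1.0 - alpha) / (n - 1)
    p_idle = (1.0 - alpha) * (1.0 - p_other) ** (n - 1)
    return n * (1.0 - p_idle)


uniform_4 = bandwidth_uniform(4, 4)
local_4 = bandwidth_local(4, 0.8)  # stronger locality means fewer conflicts
```

Setting `alpha = 1 / n` recovers the uniform case, and `alpha = 1` gives no interference at all, which brackets the range in which locality helps.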
Abstract:
Software transactional memory (STM) has been proposed as a promising programming paradigm for shared-memory multi-threaded programs, as an alternative to conventional lock-based synchronization primitives. Typical STM implementations employ a conflict detection scheme that works with a uniform access granularity, tracking shared data accesses either at the word/cache-line level or at the object level. It is well known that a single fixed access tracking granularity cannot meet the conflicting goals of reducing false conflicts without impacting concurrency adversely. A fine-grained granularity, while improving concurrency, can have an adverse impact on performance due to lock aliasing, lock validation overheads, and additional cache pressure. On the other hand, a coarse-grained granularity can hurt performance due to reduced concurrency. Thus, in general, a fixed or uniform granularity access tracking (UGAT) scheme is application-unaware and rarely matches the access patterns of individual applications or parts of an application, leading to sub-optimal performance for different parts of the application(s). To mitigate the disadvantages associated with the UGAT scheme, we propose a Variable Granularity Access Tracking (VGAT) scheme in this paper. We propose a compiler-based approach wherein the compiler uses inter-procedural whole-program static analysis to select the access tracking granularity for different shared data structures of the application, based on the application's data access pattern. We describe our prototype VGAT scheme, using TL2 as our STM implementation. Our experimental results reveal that the VGAT-STM scheme can improve the performance of the STAMP benchmarks by 1.87% to 21.2%.
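The granularity trade-off can be seen in how an address is mapped to an entry in a lock table. The sketch below is a hypothetical illustration, not TL2's code or the authors' compiler pass: `lock_index`, the table size of 1024 and the per-structure granularity table are all assumed names and values.

```python
TABLE_SIZE = 1024  # assumed number of entries in the shared lock table


def lock_index(address: int, granularity_bytes: int, table_size: int = TABLE_SIZE) -> int:
    """Map an address to a lock-table slot at the given tracking granularity."""
    return (address // granularity_bytes) % table_size


# Hypothetical per-structure choices that a compiler analysis might emit.
granularity_for = {
    "hot_counter_array": 8,   # word level: avoid false conflicts between neighbors
    "read_mostly_tree": 64,   # cache-line level: fewer locks to validate
}

a, b = 0x1000, 0x1008  # two adjacent 8-byte words in the same 64-byte line

same_lock_coarse = lock_index(a, 64) == lock_index(b, 64)  # share one lock: false conflict possible
same_lock_fine = lock_index(a, 8) == lock_index(b, 8)      # distinct locks: no false conflict
```

With a uniform scheme every structure would have to use the same divisor; a variable scheme lets each structure pick the one that suits its access pattern.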
Abstract:
The effect of deposition temperature on the evolution of residual stress with temperature in Ti-rich NiTi films deposited on silicon substrates was studied. Ti-rich NiTi films were deposited on 3-inch Si (100) substrates by DC magnetron sputtering at three deposition temperatures (300, 350 and 400 °C), with subsequent annealing in vacuum at their respective deposition temperatures for 4 h. The initial value of residual stress was found to be the highest for the film deposited and annealed at 400 °C and the lowest for the film deposited and annealed at 300 °C. All three films were found to be amorphous in the as-deposited and annealed conditions. The nature of the stress response with temperature on heating in the first cycle (room temperature to 450 °C) was similar for all three films, although the spike in tensile stress, which occurs at ~330 °C, was significantly higher in the film deposited and annealed at 300 °C. All the films were also found to undergo partial crystallisation on heating up to 450 °C, and this resulted in a decrease in the stress values around 55-60 °C in the cooling cycle. The stress response with temperature in the second thermal cycle (room temperature to 450 °C and back), which is reflective of the intrinsic film behaviour, was found to be similar in all cases, and the elastic modulus determined from the stress response was also more or less identical. The three deposition temperatures were also not found to have a significant effect on the transformation characteristics of these films, such as transformation start and finish temperatures, recovery stress and hysteresis.
Abstract:
We address the problem of recognition and retrieval of relatively weak industrial signals, such as Partial Discharges (PD), buried in excessive noise. The major bottleneck is the recognition and suppression of stochastic pulsive interference (PI), which has time-frequency characteristics similar to those of PD pulses. Therefore, conventional frequency-based DSP techniques are not useful in retrieving PD pulses. We employ statistical signal modeling based on a combination of a long-memory process and probabilistic principal component analysis (PPCA). A parametric analysis of the signal is carried out to extract the features of the desired pulses. We incorporate a wavelet-based bootstrap method for obtaining the noise training vectors from the observed data. The procedure adopted in this work is completely different from the research work reported in the literature, which is generally based on the desired signal frequency and the noise frequency.