134 results for Hardware Transactional Memory
Abstract:
Ensuring reliable operation over an extended period of time is one of the biggest challenges facing present-day electronic systems. The increased vulnerability of components to atmospheric particle strikes poses a major threat to attaining the reliability required for various mission-critical applications. Various soft error mitigation methodologies exist to address this reliability challenge. A general solution to this problem is to arrive at a soft error mitigation methodology with an acceptable implementation overhead and error tolerance level. This implementation overhead can then be reduced by taking advantage of various derating effects such as logical derating, electrical derating and timing window derating, and/or by making use of application redundancy, e.g. redundancy in the firmware/software executing on the robust hardware so designed. In this paper, we analyze the impact of various derating factors and show how they can be profitably employed to reduce the hardware overhead needed to implement a given level of soft error robustness. This analysis is performed on a set of benchmark circuits using the delayed capture methodology. Experimental results show up to 23% reduction in the hardware overhead when considering individual and combined derating factors.
Abstract:
Prediction of the Sun's magnetic activity is important because of its effect on the space environment and climate. However, recent efforts to predict the amplitude of the solar cycle have resulted in diverging forecasts with no consensus. Yeates et al. have shown that the dynamical memory of the solar dynamo mechanism governs predictability, and this memory is different for advection- and diffusion-dominated solar convection zones. By utilizing stochastically forced, kinematic dynamo simulations, we demonstrate that the inclusion of downward turbulent pumping of magnetic flux reduces the memory of both advection- and diffusion-dominated solar dynamos to only one cycle; stronger pumping degrades this memory further. Thus, our results reconcile the diverging dynamo-model-based forecasts for the amplitude of solar cycle 24. We conclude that reliable predictions for the maximum of solar activity can be made only at the preceding minimum, allowing about five years of advance planning for space weather. For more accurate predictions, sequential data assimilation would be necessary in forecasting models to account for the Sun's short memory.
Abstract:
We present external memory data structures for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: given a set of weighted points in R^d, compute the aggregate of the weights of the points that lie inside a d-dimensional orthogonal query rectangle. The aggregates we consider in this paper include COUNT, SUM, and MAX. First, we develop a structure for answering two-dimensional range-COUNT queries that uses O(N/B) disk blocks and answers a query in O(log_B N) I/Os, where N is the number of input points and B is the disk block size. The structure can be extended to obtain a near-linear-size structure for answering range-SUM queries using O(log_B N) I/Os, and a linear-size structure for answering range-MAX queries in O(log_B^2 N) I/Os. Our structures can be made dynamic and extended to higher dimensions. (C) 2012 Elsevier B.V. All rights reserved.
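To make the problem statement above concrete, here is a minimal in-memory sketch of a range-aggregate query. It is a brute-force linear scan, not the external memory structure of the abstract, and it makes no attempt to match the stated I/O bounds. The names `in_rectangle` and `range_aggregate` and the sample points are assumptions introduced purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A weighted point: (coordinates in R^d, weight). Hypothetical representation.
Point = Tuple[Sequence[float], float]


def in_rectangle(coords: Sequence[float],
                 lo: Sequence[float],
                 hi: Sequence[float]) -> bool:
    """True if coords lies inside the closed orthogonal rectangle [lo, hi]."""
    return all(l <= c <= h for c, l, h in zip(coords, lo, hi))


def range_aggregate(points: List[Point],
                    lo: Sequence[float],
                    hi: Sequence[float],
                    aggregate: Callable[[List[float]], object]) -> object:
    """Apply `aggregate` to the weights of all points inside [lo, hi].

    Linear scan over every point: this only defines the query semantics.
    """
    weights = [w for coords, w in points if in_rectangle(coords, lo, hi)]
    return aggregate(weights)


# Four weighted points in R^2 and one query rectangle (illustrative data).
points = [((1.0, 1.0), 5.0), ((2.0, 3.0), 2.0),
          ((4.0, 4.0), 7.0), ((0.5, 3.5), 1.0)]
lo, hi = (0.0, 0.0), (3.0, 4.0)

count_result = range_aggregate(points, lo, hi, len)   # COUNT
sum_result = range_aggregate(points, lo, hi, sum)     # SUM
max_result = range_aggregate(points, lo, hi,
                             lambda ws: max(ws, default=None))  # MAX
```

The point (4.0, 4.0) falls outside the query rectangle, so only three weights contribute to each aggregate.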
Abstract:
The implementation of semiconductor circuits and systems in nanotechnology makes it possible to achieve higher speed, lower voltage levels and smaller area. The unintended and undesirable result of this scaling is that it makes integrated circuits susceptible to soft errors, normally caused by alpha particle or neutron hits. Radiation strikes that result in bit upsets, referred to as single event upsets (SEUs), are of increasing concern for reliable circuit operation in the field. Storage elements are the worst hit by this phenomenon. As we scale down further, there is greater interest in the reliability of circuits and systems, apart from the performance, power and area aspects. In this paper we propose an improved 12T SEU-tolerant SRAM cell design. The proposed SRAM cell is economical in terms of area overhead and is easier to fabricate than earlier designs. Simulation results show that the proposed cell is highly robust, as it does not flip even for a transient pulse with 62 times the Q_crit of a standard 6T SRAM cell.
Abstract:
In large flexible software systems, bloat occurs in many forms, causing excess resource utilization and resource bottlenecks. This results in lost throughput and wasted joules. However, mitigating bloat is not easy; efforts are best applied where savings would be substantial. To aid this, we develop an analytical model establishing the relation between resource bottlenecks, bloat, performance and power. Analysis with the model puts into perspective results from the first experimental study of the power-performance implications of bloat. In the experiments we find that while bloat reduction can provide as much as 40% energy savings, the degree of impact depends on hardware and software characteristics. We confirm predictions from our model with selected results from our experimental study. Our findings show that a software-only view is inadequate when assessing the effects of bloat. The impact of bloat on physical resource usage and power should be understood from a full-system perspective in order to properly deploy bloat reduction solutions and reap their power-performance benefits.
Abstract:
Video decoders used in emerging applications need to be flexible enough to handle a large variety of video formats and to deliver scalable performance across wide variations in workload. In this paper we propose a unified software and hardware architecture for video decoding that achieves scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process-network-oriented compilation flow achieves realization-agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi-hardware and full-hardware implementations of a video decoder. An application quality-of-service-aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows performance scaling from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to another without any frame loss.
Abstract:
SrRuO3 is widely known to be an itinerant ferromagnet with a T_C of approximately 160 K. It is well known that glassy materials exhibit time-dependent phenomena such as the memory effect due to their generic slow dynamics. However, for the first time, we have observed a memory effect in SrRu(1-x)O3 (0.01
Abstract:
Maintaining metadata consistency is a critical issue in designing a filesystem. Although satisfactory solutions are available for filesystems residing on magnetic disks, these solutions may not give adequate performance for filesystems residing on flash devices. Prabhakaran et al. have designed a metadata consistency mechanism specifically for flash chips, called Transactional Flash [1]. It uses a cyclic commit mechanism to provide transactional abstractions. Although it is a significant improvement over the usual journaling techniques, this mechanism has certain drawbacks, such as a complex protocol and the need to read the whole flash during recovery, which slows down the recovery process. In this paper we propose the addition of a thin journaling layer on top of Transactional Flash to simplify the protocol and speed up the recovery process. The simplified protocol, named Quick Recovery Cyclic Commit (QRCC), uses a journal stored on NOR flash for recovery. Our evaluations on an actual raw flash card show that journal writes add negligible penalty compared to the original Transactional Flash's write performance, while the journal facilitates quick recovery in case of failures.
Abstract:
Microstructural changes in a Ni-rich NiTi shape memory alloy during thermal and thermo-mechanical cycling have been investigated using Electron Back Scattered Diffraction. A strong dependence of the misorientation development on the orientation of the prior austenite grain has been observed during thermal cycling and thermo-mechanical cycling. This effect is more pronounced at the grain boundaries than in the grain interior. At a larger applied strain, the volume fraction of the stabilized martensite phase increases with the number of cycles. Deformation within the martensite leads to stabilization of the martensitic phase even at temperatures slightly above the austenite finish temperature. Modulus variation with respect to temperature has been explained on the basis of martensitic transformation.
Abstract:
The goal of this study was to examine the effect of maternal iron deficiency on the developing hippocampus in order to define a developmental window for this effect, and to see whether iron deficiency causes changes in glucocorticoid levels. The study was carried out using pre-natal, post-natal, and pre + post-natal iron deficiency paradigms. Iron deficient pregnant dams and their pups displayed elevated corticosterone which, in turn, differentially affected glucocorticoid receptor (GR) expression in the CA1 and the dentate gyrus. Brain Derived Neurotrophic Factor (BDNF) was reduced in the hippocampi of pups following elevated corticosterone levels. Reduced neurogenesis at P7 was seen in pups born to iron deficient mothers, and these pups had reduced numbers of hippocampal pyramidal and granule cells as adults. Hippocampal subdivision volumes also were altered. The structural and molecular defects in the pups were correlated with radial arm maze performance; reference memory function was especially affected. Pups from dams that were iron deficient throughout pregnancy and lactation displayed the complete spectrum of defects, while pups from dams that were iron deficient only during pregnancy or during lactation displayed subsets of defects. These findings show that maternal iron deficiency is associated with altered levels of corticosterone and GR expression, and with spatial memory deficits in their pups. (C) 2013 Elsevier Inc. All rights reserved.
Abstract:
Instrumented microindentation (IM) was conducted from 40 to 150 degrees C on two Ni-Ti shape memory alloys (SMAs), one austenitic and the other martensitic at room temperature. Results show that the depth and work recovery ratios, eta_d and eta_w respectively, are complementary to each other. While eta_d decreases gradually with temperature for the austenite, it drops markedly for the martensite in the martensite-to-austenite transformation regime. These results affirm the utility of IM for characterizing SMAs.
Abstract:
The nanoindentation technique can be employed in shape memory alloys (SMAs) to discern the transformation temperatures as well as to characterize their mechanical behavior. In this paper, we use it with simultaneous measurements of the mechanical and the electrical contact resistances (ECR) at room temperature to probe two SMAs: austenite (RTA) and martensite (RTM). Two different types of indenter tips, Berkovich and spherical, are employed to examine the SMAs' indentation responses as a function of the representative strain, epsilon_R. In Berkovich indentation, because of the sharp nature of the tip, and in consequence the high levels of strain imposed, discerning the two SMAs on the basis of the indentation response alone is difficult. In the case of the spherical tip, epsilon_R is systematically varied and its effect on the depth recovery ratio, eta_d, is examined. Results indicate that RTA has higher eta_d than RTM, but the difference decreases with increasing epsilon_R such that eta_d values for both the alloys would be similar in the fully plastic regime. The experimental trends in eta_d vs. epsilon_R for both the alloys could be described well with an equation of the type eta_d proportional to (epsilon_R)^(-1), which is developed on the basis of a phenomenological model. This fit, in turn, directs us to the maximum epsilon_R below which plasticity underneath the indenter would not mask the differences in the two SMAs. It was demonstrated that the ECR measurements complement the mechanical measurements in demarcating the reverse transformation from martensite to austenite during unloading of RTA, wherein a marked increase in the voltage was noted. A correlation between recovery due to reverse transformation during unloading and increase in voltage (and hence the electrical resistance) was found. (C) 2013 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Abstract:
This paper deals with the evolution of microstructure and texture during hot rolling of the hafnium-containing NiTi-based shape memory alloy Ni49.4Ti38.6Hf12. The formation of the R-phase has been associated with the precipitation of the (Ti,Hf)2Ni phase. The crystallographic texture of the parent phase B2 as well as the product phases R and B19' have been determined. It has been found that the variant selection during the B2 -> R phase transformation is quite strong compared to the case of the B2 -> B19' transformation. During deformation, the texture of the austenite phase evolves with strong Goss and Bs components. After transformation to the martensitic structure, it gives rise to a [011] || RD fiber. Microstructure and texture studies reveal the occurrence of partial dynamic recrystallization during hot rolling. Large strain heterogeneities that occur surrounding (Ti,Hf)2Ni precipitates are relieved through extended dynamic recovery instead of particle stimulated nucleation.
Abstract:
We demonstrate the possibility of accelerated identification of potential compositions for high-temperature shape memory alloys (SMAs) through a combinatorial material synthesis and analysis approach, wherein we employ the combination of diffusion couple and indentation techniques. The former was utilized to generate smooth and compositionally graded inter-diffusion zones (IDZs) in the Ni-Ti-Pd ternary alloy system of varying IDZ thickness, depending on the annealing time at high temperature. The IDZs thus produced were then impressed with an indenter with a spherical tip so as to inscribe a predetermined indentation strain. Subsequent annealing of the indented samples at various elevated temperatures, T_a, ranging between 150 and 550 degrees C allows for partial to full relaxation of the imposed strain due to the shape memory effect. If T_a is above the austenite finish temperature, A_f, the relaxation will be complete. By measuring the depth recovery, which serves as a proxy for the shape recovery characteristic of the SMA, a three-dimensional map in the recovery-temperature-composition space is constructed. A comparison of the published A_f data for different compositions with the T_a data shows good agreement when the depth recovery is between 70% and 80%, indicating that the methodology proposed in this paper can be utilized for the identification of promising compositions. Advantages and further possibilities of this methodology are discussed.
Abstract:
Exploiting the performance potential of GPUs requires managing the data transfers to and from them efficiently, which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all other existing automatic memory management proposals. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only and compiler-only proposals, it also transfers 2.2x to 13.3x less data on average.
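The idea of initiating a transfer only when an access would otherwise see stale data can be illustrated with a toy model. The sketch below is not the mechanism of the abstract: it has no compiler analysis and no real GPU, and simply tracks at run time which side holds a valid copy of a buffer. The class `CoherentBuffer`, its methods, and the `"cpu"`/`"gpu"` labels are all hypothetical names chosen for this example.

```python
class CoherentBuffer:
    """Toy model of a buffer mirrored on a 'cpu' side and a 'gpu' side.

    A copy is made only when the accessed side does not hold valid data,
    so repeated accesses on the same side cause no redundant transfers.
    """

    def __init__(self, data):
        self.copies = {"cpu": list(data), "gpu": None}
        self.valid = {"cpu"}   # sides currently holding up-to-date data
        self.transfers = 0     # number of simulated CPU<->GPU copies

    def _ensure_valid(self, side):
        # Already up to date on this side: skip the transfer entirely.
        if side in self.valid:
            return
        source = "cpu" if side == "gpu" else "gpu"
        self.copies[side] = list(self.copies[source])
        self.transfers += 1
        self.valid.add(side)

    def read(self, side):
        """Return a snapshot of the buffer as seen from `side`."""
        self._ensure_valid(side)
        return list(self.copies[side])

    def write(self, side, index, value):
        """Write on `side`; the other side's copy becomes stale."""
        self._ensure_valid(side)
        self.copies[side][index] = value
        self.valid = {side}


buf = CoherentBuffer([1, 2, 3])
buf.read("gpu")           # first GPU access: one CPU -> GPU transfer
buf.read("gpu")           # GPU copy still valid: no transfer
buf.write("gpu", 0, 10)   # CPU copy is now stale
buf.read("gpu")           # still no transfer
result = buf.read("cpu")  # stale CPU copy: one GPU -> CPU transfer
transfers = buf.transfers
```

In this trace only two copies are made for five accesses; a scheme that copied on every access would have made five.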