158 results for Software Transactional Memory (STM)


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In large flexible software systems, bloat occurs in many forms, causing excess resource utilization and resource bottlenecks. This results in lost throughput and wasted joules. However, mitigating bloat is not easy; efforts are best applied where savings would be substantial. To aid this, we develop an analytical model establishing the relation between resource bottlenecks, bloat, performance and power. Analyses with the model place into perspective the results from the first experimental study of the power-performance implications of bloat. In the experiments we find that while bloat reduction can provide as much as 40% energy savings, the degree of impact depends on hardware and software characteristics. We confirm predictions from our model with selected results from our experimental study. Our findings show that a software-only view is inadequate when assessing the effects of bloat. The impact of bloat on physical resource usage and power should be understood from a full-system perspective in order to properly deploy bloat reduction solutions and reap their power-performance benefits.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Memory models for shared-memory concurrent programming languages typically guarantee sequential consistency (SC) semantics for datarace-free (DRF) programs, while providing very weak or no guarantees for non-DRF programs. In effect, programmers are expected to write only DRF programs, which are then executed with SC semantics. With this in mind, we propose a novel scalable solution for dataflow analysis of concurrent programs, which is proved to be sound for DRF programs with SC semantics. We use the synchronization structure of the program to propagate dataflow information among threads without having to consider all interleavings explicitly. Given a dataflow analysis that is sound for sequential programs and meets certain criteria, our technique automatically converts it to an analysis for concurrent programs.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Pervasive use of pointers in large-scale real-world applications continues to make points-to analysis an important optimization-enabler. Rapid growth of software systems demands a scalable pointer analysis algorithm. A typical inclusion-based points-to analysis iteratively evaluates constraints and computes a points-to solution until a fixpoint. In each iteration, (i) points-to information is propagated across directed edges in a constraint graph G and (ii) more edges are added by processing the points-to constraints. We observe that prioritizing the order in which the information is processed within each of the above two steps can lead to efficient execution of the points-to analysis. While earlier work in the literature focuses only on the propagation order, we argue that the other dimension, that is, prioritizing the constraint processing, can lead to even higher improvements on how fast the fixpoint of the points-to algorithm is reached. This becomes especially important as we prove that finding an optimal sequence for processing the points-to constraints is NP-Complete. The prioritization scheme proposed in this paper is general enough to be applied to any of the existing points-to analyses. Using the prioritization framework developed in this paper, we implement prioritized versions of Andersen's analysis, Deep Propagation, Hardekopf and Lin's Lazy Cycle Detection and Bloom Filter based points-to analysis. In each case, we report significant improvements in the analysis times (33%, 47%, 44%, 20% respectively) as well as the memory requirements for a large suite of programs, including SPEC 2000 benchmarks and five large open source programs.
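The two steps named in this abstract, (i) propagating points-to sets along constraint-graph edges and (ii) adding edges while processing constraints, can be illustrated with a minimal sketch of a generic inclusion-based (Andersen-style) analysis. This is not the paper's prioritized algorithm: the function name `andersen`, the tuple encoding of constraints, and the plain FIFO worklist are all assumptions made for illustration. A prioritization scheme would, roughly, replace the FIFO order with a chosen ordering.

```python
from collections import defaultdict, deque

def andersen(address_of, copies, loads, stores):
    """Generic inclusion-based points-to analysis (illustrative sketch).

    address_of: (p, a) pairs for  p = &a
    copies:     (dst, src) pairs for  dst = src
    loads:      (dst, q) pairs for  dst = *q
    stores:     (p, src) pairs for  *p = src
    """
    pts = defaultdict(set)   # variable -> set of pointees
    succ = defaultdict(set)  # edge src -> dst means pts(src) is a subset of pts(dst)

    for p, a in address_of:
        pts[p].add(a)
    for dst, src in copies:
        succ[src].add(dst)

    # Hypothetical choice: plain FIFO order, no prioritization.
    worklist = deque(list(pts) + list(succ))

    def add_edge(src, dst):
        # A new edge only matters if src already has something to propagate.
        if dst not in succ[src]:
            succ[src].add(dst)
            if pts[src]:
                worklist.append(src)

    while worklist:
        n = worklist.popleft()
        # Step (ii): process load/store constraints that dereference n.
        for dst, q in loads:
            if q == n:
                for a in list(pts[n]):
                    add_edge(a, dst)
        for p, src in stores:
            if p == n:
                for a in list(pts[n]):
                    add_edge(src, a)
        # Step (i): propagate n's points-to set along outgoing edges.
        for dst in list(succ[n]):
            if not pts[n] <= pts[dst]:
                pts[dst] |= pts[n]
                worklist.append(dst)

    return {v: s for v, s in pts.items() if s}
```

For the program `p = &a; q = &b; *p = q; r = *p`, calling `andersen([("p", "a"), ("q", "b")], [], [("r", "p")], [("p", "q")])` yields a solution in which both `a` and `r` point to `b`. The loop stops when the worklist is empty, which is the fixpoint the abstract refers to.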

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Video decoders used in emerging applications need to be flexible to handle a large variety of video formats and deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process-network-oriented compilation flow achieves realization-agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi-hardware and full-hardware implementations of a video decoder. An application quality-of-service-aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows a scaling in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to the other without any frame loss.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

SrRuO3 is widely known to be an itinerant ferromagnet with a T_C of approximately 160 K. It is well known that glassy materials exhibit time-dependent phenomena such as the memory effect due to their generic slow dynamics. However, for the first time, we have observed the memory effect in SrRu(1-x)O3 (0.01memory etc. arise in disordered glassy systems. Thus the observation of the memory effect in an itinerant ferromagnetic system like SrRuO3 is quite unexpected. The emergence of such an unusual magnetic response is strongly believed to be connected with a cryptic interaction that arises at low temperature. Our neutron diffraction study has been able to trace the cause of this hidden magnetic interaction, which is responsible for bringing glassiness into a ferromagnet.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Maintaining metadata consistency is a critical issue in designing a filesystem. Although satisfactory solutions are available for filesystems residing on magnetic disks, these solutions may not give adequate performance for filesystems residing on flash devices. Prabhakaran et al. have designed a metadata consistency mechanism specifically for flash chips, called Transactional Flash [1]. It uses a cyclic commit mechanism to provide transactional abstractions. Although it is a significant improvement over the usual journaling techniques, this mechanism has certain drawbacks, such as a complex protocol and the need to read the whole flash during recovery, which slows down the recovery process. In this paper we propose the addition of a thin journaling layer on top of Transactional Flash to simplify the protocol and speed up the recovery process. The simplified protocol, named Quick Recovery Cyclic Commit (QRCC), uses a journal stored on NOR flash for recovery. Our evaluations on an actual raw flash card show that journal writes add negligible penalty compared to the original Transactional Flash's write performance, while quick recovery is facilitated by the journal in case of failures.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Microstructural changes of Ni-rich NiTi shape memory alloy during thermal and thermo-mechanical cycling have been investigated using Electron Back Scattered Diffraction. A strong dependence of the misorientation development on the orientation of the prior austenite grain has been observed during thermal cycling and thermo-mechanical cycling. This effect is more pronounced at the grain boundaries than in the grain interior. At a larger applied strain, the volume fraction of the stabilized martensite phase increases with an increasing number of cycles. Deformation within the martensite leads to stabilization of the martensitic phase even at temperatures slightly above the austenite finish temperature. Modulus variation with respect to temperature has been explained on the basis of martensitic transformation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The goal of this study was to examine the effect of maternal iron deficiency on the developing hippocampus in order to define a developmental window for this effect, and to see whether iron deficiency causes changes in glucocorticoid levels. The study was carried out using pre-natal, post-natal, and pre + post-natal iron deficiency paradigms. Iron deficient pregnant dams and their pups displayed elevated corticosterone which, in turn, differentially affected glucocorticoid receptor (GR) expression in the CA1 and the dentate gyrus. Brain Derived Neurotrophic Factor (BDNF) was reduced in the hippocampi of pups following elevated corticosterone levels. Reduced neurogenesis at P7 was seen in pups born to iron deficient mothers, and these pups had reduced numbers of hippocampal pyramidal and granule cells as adults. Hippocampal subdivision volumes also were altered. The structural and molecular defects in the pups were correlated with radial arm maze performance; reference memory function was especially affected. Pups from dams that were iron deficient throughout pregnancy and lactation displayed the complete spectrum of defects, while pups from dams that were iron deficient only during pregnancy or during lactation displayed subsets of defects. These findings show that maternal iron deficiency is associated with altered levels of corticosterone and GR expression, and with spatial memory deficits in their pups. (C) 2013 Elsevier Inc. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Instrumented microindentation (IM) experiments on two Ni-Ti shape memory alloys (SMAs), one austenitic and the other martensitic at room temperature, were conducted from 40 to 150 °C. Results show that the depth and work recovery ratios, η_d and η_w respectively, are complementary to each other. While η_d decreases gradually with temperature for the austenite, it drops markedly for the martensite in the martensite-to-austenite transformation regime. These results affirm the utility of IM for characterizing SMAs.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The nanoindentation technique can be employed in shape memory alloys (SMAs) to discern the transformation temperatures as well as to characterize their mechanical behavior. In this paper, we use it with simultaneous measurements of the mechanical and the electrical contact resistances (ECR) at room temperature to probe two SMAs: austenite (RTA) and martensite (RTM). Two different types of indenter tips - Berkovich and spherical - are employed to examine the SMAs' indentation responses as a function of the representative strain, ε_R. In Berkovich indentation, because of the sharp nature of the tip, and in consequence the high levels of strain imposed, discerning the two SMAs on the basis of the indentation response alone is difficult. In the case of the spherical tip, ε_R is systematically varied and its effect on the depth recovery ratio, η_d, is examined. Results indicate that RTA has higher η_d than RTM, but the difference decreases with increasing ε_R such that η_d values for both the alloys would be similar in the fully plastic regime. The experimental trends in η_d vs. ε_R for both the alloys could be described well with an equation of the type η_d ∝ (ε_R)^(-1), which is developed on the basis of a phenomenological model. This fit, in turn, directs us to the maximum ε_R, below which plasticity underneath the indenter would not mask the differences in the two SMAs. It was demonstrated that the ECR measurements complement the mechanical measurements in demarcating the reverse transformation from martensite to austenite during unloading of RTA, wherein a marked increase in the voltage was noted. A correlation between recovery due to reverse transformation during unloading and increase in voltage (and hence the electrical resistance) was found. (C) 2013 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper deals with the evolution of microstructure and texture during hot rolling of the hafnium-containing NiTi-based shape memory alloy Ni49.4Ti38.6Hf12. The formation of the R-phase has been associated with the precipitation of the (Ti,Hf)2Ni phase. The crystallographic texture of the parent phase B2 as well as the product phases R and B19' have been determined. It has been found that the variant selection during the B2 -> R phase transformation is quite strong compared to the case of the B2 -> B19' transformation. During deformation, the texture of the austenite phase evolves with strong Goss and Bs components. After transformation to the martensitic structure, it gives rise to a [011] parallel to RD fiber. Microstructure and texture studies reveal the occurrence of partial dynamic recrystallization during hot rolling. Large strain heterogeneities that occur surrounding (Ti,Hf)2Ni precipitates are relieved through extended dynamic recovery instead of particle stimulated nucleation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We demonstrate the possibility of accelerated identification of potential compositions for high-temperature shape memory alloys (SMAs) through a combinatorial material synthesis and analysis approach, wherein we employ the combination of diffusion couple and indentation techniques. The former was utilized to generate smooth and compositionally graded inter-diffusion zones (IDZs) in the Ni-Ti-Pd ternary alloy system of varying IDZ thickness, depending on the annealing time at high temperature. The IDZs thus produced were then impressed with an indenter with a spherical tip so as to inscribe a predetermined indentation strain. Subsequent annealing of the indented samples at various elevated temperatures, T_a, ranging between 150 and 550 °C allows for partial to full relaxation of the strain imposed due to the shape memory effect. If T_a is above the austenite finish temperature, A_f, the relaxation will be complete. By measuring the depth recovery, which serves as a proxy for the shape recovery characteristic of the SMA, a three-dimensional map in the recovery-temperature-composition space is constructed. A comparison of the published A_f data for different compositions with the T_a data shows good agreement when the depth recovery is between 70% and 80%, indicating that the methodology proposed in this paper can be utilized for the identification of promising compositions. Advantages and further possibilities of this methodology are discussed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Ni49.4Ti38.6Hf12 shape memory alloy has been characterized for structure, microstructure and transformation temperatures. The microstructure of the as-cast sample consists of B19' and R-phases, and a (Ti,Hf)2Ni precipitate phase along the grain boundaries in the form of dendrites. The microstructure of the solution treated sample contains only the B19' martensite phase, whereas a second heat treatment after solutionizing results in reappearance of the R-phase and the (Ti,Hf)2Ni grain boundary precipitate phase in the microstructure. A detailed microstructural examination shows the presence of precipitates having both coherent and incoherent interfaces with the matrix, the type of interface being dictated by the crystallographic orientation of the matrix phase. The present study shows that the (Ti,Hf)2Ni precipitates having a coherent interface with the matrix drive the formation of the R-phase in the microstructure. (C) 2013 Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Combining the electronic properties of graphene [1,2] and molybdenum disulphide (MoS2) [3-6] in hybrid heterostructures offers the possibility to create devices with various functionalities. Electronic logic and memory devices have already been constructed from graphene-MoS2 hybrids [7,8], but they do not make use of the photosensitivity of MoS2, which arises from its optical-range bandgap [9]. Here, we demonstrate that graphene-on-MoS2 binary heterostructures display remarkable dual optoelectronic functionality, including highly sensitive photodetection and gate-tunable persistent photoconductivity. The responsivity of the hybrids was found to be nearly 1 × 10^10 A W^-1 at 130 K and 5 × 10^8 A W^-1 at room temperature, making them the most sensitive graphene-based photodetectors. When subjected to time-dependent photoillumination, the hybrids could also function as a rewritable optoelectronic switch or memory, where the persistent state shows almost no relaxation or decay within experimental timescales, indicating near-perfect charge retention. These effects can be quantitatively explained by gate-tunable charge exchange between the graphene and MoS2 layers, and may lead to new graphene-based optoelectronic devices that are naturally scalable for large-area applications at room temperature.