57 results for Improvement programs
Abstract:
Although it is well known that extremely long low-density parity-check (LDPC) codes perform exceptionally well in error-correction applications, short-length codes are preferable in practice. However, short-length LDPC codes suffer from performance degradation owing to graph-based impairments such as short cycles, trapping sets and stopping sets in the bipartite graph of the LDPC matrix. In particular, performance degradation at moderate to high Eb/N0 is caused by oscillations in bit-node a posteriori probabilities induced by short cycles and trapping sets in bipartite graphs. In this study, a computationally efficient algorithm is proposed to improve the performance of short-length LDPC codes at moderate to high Eb/N0. This algorithm makes use of the information generated by the belief propagation (BP) algorithm in the iterations preceding a decoding failure. Using this information, a reliability-based estimation is performed on each bit node to supplement the BP algorithm. The proposed algorithm gives an appreciable coding gain compared with BP decoding for LDPC codes with a code rate of 1/2 or less. The coding gains are modest to significant for regular LDPC codes optimised for bipartite-graph conditioning, and substantial for unoptimised codes. Hence, this algorithm is useful for relaxing some stringent constraints on the graphical structure of the LDPC code and for developing hardware-friendly designs.
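To make the idea of supplementing BP with a reliability-based decision concrete, the minimal Python sketch below assumes the decoder stores the per-iteration a posteriori log-likelihood ratios (LLRs) of each bit node; the averaging rule applied to oscillating bits is an illustrative stand-in, not the paper's exact estimator.

```python
import numpy as np

def reliability_estimate(llr_history, window=5):
    """Illustrative reliability-based bit estimate from the last few BP iterations.

    llr_history: (iterations, n) array of a posteriori LLRs per bit node,
    recorded before the decoding failure. Bits whose LLR sign oscillated over
    the final `window` iterations are re-decided from the time-averaged LLR
    (used here as a simple reliability measure); all other bits keep the
    last-iteration BP hard decision.
    """
    recent = llr_history[-window:]
    last = llr_history[-1]
    oscillating = np.any(np.sign(recent) != np.sign(recent[-1]), axis=0)
    avg_llr = recent.mean(axis=0)
    decisions = (last < 0).astype(int)                      # standard BP hard decision
    decisions[oscillating] = (avg_llr[oscillating] < 0).astype(int)
    return decisions
```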
Abstract:
Estimating program worst-case execution time (WCET) accurately and efficiently is a challenging task. Several programs exhibit phase behavior wherein cycles per instruction (CPI) varies in phases during execution. Recent work has suggested the use of phases in such programs to estimate WCET with minimal instrumentation. However, the suggested model uses a function of mean CPI that carries no probabilistic guarantees. We propose to use Chebyshev's inequality, which can be applied to any arbitrary distribution of CPI samples, to probabilistically bound the CPI of a phase. Applying Chebyshev's inequality to phases that exhibit high CPI variation leads to pessimistic upper bounds. We propose a mechanism that refines such phases into sub-phases based on program counter (PC) signatures collected using profiling, and that also allows the user to control the variance of CPI within a sub-phase. We describe a WCET analyzer built on these lines and evaluate it with standard WCET and embedded benchmark suites on two different architectures for three chosen probabilities, p = {0.9, 0.95, 0.99}. For p = 0.99, refinement based on PC signatures alone reduces the average pessimism of the WCET estimate by 36% (77%) on Arch1 (Arch2). Compared to Chronos, an open-source static WCET analyzer, the average improvement in estimates obtained by refinement is 5% (125%) on Arch1 (Arch2). On limiting the variance of CPI within a sub-phase to {50%, 10%, 5%, 1%} of its original value, the average accuracy of the WCET estimate improves further to {9%, 11%, 12%, 13%}, respectively, on Arch1. On Arch2, the average accuracy of WCET improves to 159% when the CPI variance is limited to 50% of its original value, and the improvement is marginal beyond that point.
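As a concrete illustration of the distribution-free bound, the sketch below applies the two-sided Chebyshev inequality P(|X − μ| ≥ kσ) ≤ 1/k² to the CPI samples of a phase and composes a per-phase execution-time contribution; the composition rule and the names used here are assumptions for illustration, not the analyzer's actual interface.

```python
import math
import statistics

def chebyshev_cpi_bound(cpi_samples, p):
    """Probabilistic upper bound on a phase's CPI via Chebyshev's inequality.

    P(|X - mu| >= k*sigma) <= 1/k^2, so choosing k = 1/sqrt(1 - p) makes the
    bound mu + k*sigma hold with probability at least p, for any distribution.
    """
    mu = statistics.mean(cpi_samples)
    sigma = statistics.pstdev(cpi_samples)
    k = 1.0 / math.sqrt(1.0 - p)
    return mu + k * sigma

def phase_time_bound(cpi_samples, instruction_count, clock_period_ns, p=0.99):
    """Hypothetical per-phase contribution to the WCET estimate:
    bounded CPI x instructions executed in the phase x clock period."""
    return chebyshev_cpi_bound(cpi_samples, p) * instruction_count * clock_period_ns
```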
Abstract:
The ability to perform strong updates is the main contributor to the precision of flow-sensitive pointer analysis algorithms. Traditional flow-sensitive pointer analyses cannot strongly update pointers residing in the heap. This is a severe restriction for Java programs. In this paper, we propose a new flow-sensitive pointer analysis algorithm for Java that can perform strong updates on heap-based pointers effectively. Instead of points-to graphs, we represent our points-to information as maps from access paths to sets of abstract objects. We have implemented our analysis and run it on several large Java benchmarks. The results show considerable improvement in precision over the points-to graph based flow-insensitive and flow-sensitive analyses, with reasonable running time.
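A minimal sketch of the underlying data structure, under the assumption that points-to facts are kept as a map from access paths to sets of abstract objects; the is_unique_location flag stands in for whatever must-alias/singleton check licenses a strong update (the names here are hypothetical, not the paper's API).

```python
from typing import Dict, FrozenSet, Tuple

# Points-to state: access path (e.g., ('x',) or ('x', 'f')) -> abstract objects.
AccessPath = Tuple[str, ...]
PointsTo = Dict[AccessPath, FrozenSet[str]]

def assign(state: PointsTo, path: AccessPath, targets: FrozenSet[str],
           is_unique_location: bool) -> PointsTo:
    """Apply an assignment through `path`.

    If the path is known to denote a single concrete (heap) location, the old
    targets are discarded (strong update); otherwise the new targets are merged
    with the old ones (weak update).
    """
    new_state = dict(state)
    if is_unique_location:
        new_state[path] = targets                                   # strong update
    else:
        new_state[path] = state.get(path, frozenset()) | targets    # weak update
    return new_state
```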
Abstract:
Large software systems are developed by composing multiple programs. If the programs manipulate and exchange complex data, such as network packets or files, it is essential to establish that they follow compatible data formats. Most of the complexity of data formats is associated with the headers. In this paper, we address compatibility of programs operating over headers of network packets, files, images, etc. As format specifications are rarely available, we infer the format that a program associates with headers as a set of guarded layouts. In terms of these formats, we define and check compatibility of (a) producer-consumer programs and (b) different versions of producer (or consumer) programs. A compatible producer-consumer pair is free of type mismatches and logical incompatibilities such as the consumer rejecting valid outputs generated by the producer. A backward compatible producer (resp. consumer) is guaranteed to be compatible with consumers (resp. producers) that were compatible with its older version. With our prototype tool, we identified 5 known bugs and 1 potential bug in (a) sender-receiver modules of Linux network drivers of 3 vendors and (b) different versions of a TIFF image library.
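The sketch below shows one plausible encoding of a "guarded layout" and a per-header compatibility check; it is a simplified illustration under assumed names (GuardedLayout, compatible), whereas the paper reasons symbolically over all headers rather than one concrete value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class GuardedLayout:
    """Hypothetical guarded layout: a predicate over already-parsed header
    fields (the guard) plus the field layout that applies when it holds."""
    guard: Callable[[Dict[str, int]], bool]
    fields: Dict[str, int]   # field name -> width in bits

def compatible(producer: List[GuardedLayout], consumer: List[GuardedLayout],
               header_values: Dict[str, int]) -> bool:
    """Check one concrete header: every producer layout whose guard holds must
    be matched by some consumer layout with the same field widths."""
    for p in producer:
        if p.guard(header_values):
            if not any(c.guard(header_values) and c.fields == p.fields
                       for c in consumer):
                return False   # consumer would reject a valid producer output
    return True
```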
Abstract:
Multicore architectures place twin demands of energy efficiency and higher performance on DRAM. A variety of schemes have been proposed to address either the latency or the energy consumption of DRAMs. These schemes typically require non-trivial hardware changes and end up improving latency at the cost of energy or vice versa. One specific DRAM performance problem in multicores is that interleaved accesses from different cores can degrade row-buffer locality. In this paper, based on the temporal and spatial locality characteristics of memory accesses, we propose a reorganization of the existing single large row-buffer in a DRAM bank into multiple sub-row buffers (MSRB). This reorganization not only improves row hit rates, and hence average memory latency, but also brings down the energy consumed by the DRAM. The first major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves weighted speedup by 35.8%, 14.5% and 21.6% in quad-, eight- and sixteen-core workloads, along with 42%, 28% and 31% reductions in DRAM energy. The proposed MSRB organization enables opportunities for the management of multiple row-buffers at the memory controller level. As the memory controller is aware of the behaviour of individual cores, it allows us to implement coordinated buffer allocation schemes for different cores that take into account program behaviour. We demonstrate two such schemes, namely Fairness Oriented Allocation and Performance Oriented Allocation, which show the flexibility that memory controllers can now exploit in our MSRB organization to improve overall performance and/or fairness. Further, the MSRB organization enables additional opportunities for DRAM intra-bank parallelism and selective early precharging of the LRU row-buffer to further improve memory access latencies. These two optimizations together provide an additional 5.9% performance improvement.
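As a rough illustration of what managing multiple sub-row buffers per bank could look like, the following sketch models a bank with a small set of sub-row buffers under plain LRU replacement; the coordinated fairness- and performance-oriented allocation schemes described above are not modeled, and all names and sizes are assumptions.

```python
from collections import OrderedDict

class SubRowBuffers:
    """Toy model of a DRAM bank with multiple sub-row buffers (MSRB),
    managed with an LRU policy (a simplification of coordinated allocation)."""

    def __init__(self, num_buffers=4):
        self.buffers = OrderedDict()      # row id -> True, ordered by recency
        self.num_buffers = num_buffers

    def access(self, row):
        if row in self.buffers:                    # row-buffer hit: fast access
            self.buffers.move_to_end(row)
            return "hit"
        if len(self.buffers) >= self.num_buffers:
            self.buffers.popitem(last=False)       # evict (precharge) the LRU sub-buffer
        self.buffers[row] = True                   # activate row into a free sub-buffer
        return "miss"
```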
Abstract:
Simulations using Ansys Fluent 6.3.26 have been performed to investigate the adsorption characteristics of a single silica gel particle exposed to saturated humid air streams at Re = 108 and 216 and a temperature of 300 K. The adsorption of the particle has been modeled as a source term in the species and energy equations using a Linear Driving Force (LDF) equation. The interdependence of the thermal and the water vapor concentration fields has been analysed. This work is intended to aid in understanding the adsorption effects in silica gel beds and in their efficient design. (C) 2013 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
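For reference, the Linear Driving Force model used as the source term takes the form dq/dt = k_LDF (q* − q); a minimal explicit-Euler step is sketched below, assuming q* is the equilibrium uptake at the local humidity (the study itself evaluates this inside the CFD solver, not in standalone code).

```python
def ldf_uptake_step(q, q_eq, k_ldf, dt):
    """One explicit Euler step of the Linear Driving Force model,
    dq/dt = k_LDF * (q_eq - q), where q is the current uptake and q_eq the
    equilibrium uptake; illustrative only, parameter names are assumptions."""
    return q + k_ldf * (q_eq - q) * dt
```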
Abstract:
Lead-tin-telluride is a well-known thermoelectric material in the temperature range 350-750 K. Here, this alloy doped with manganese (Pb0.96-yMn0.04SnyTe) was prepared for different amounts of tin. X-ray diffraction showed a decrease of the lattice constant with increasing tin content, which indicated solid-solution formation. Microstructural analysis showed a wide distribution of grain sizes, from <1 μm to 10 mm, and the presence of a SnTe-rich phase. All the transport properties were measured in the range of 300-720 K. The Seebeck coefficient showed that all the samples were p-type, indicating holes as the dominant carriers in the measurement range. Its magnitude increased systematically on reduction of the Sn content, possibly due to a decreasing hole concentration. Electrical conductivity showed the degenerate nature of the samples. Large values of the electrical conductivity could have resulted firstly from a large hole concentration due to a high Sn content and, secondly, from increased mobility by sp-d orbital interaction between the Pb1-ySnyTe sublattice and the Mn2+ ions. High thermal conductivity was observed due to the higher electronic contribution, which decreased systematically with decreasing Sn content. The highest zT = 0.82 at 720 K was obtained for the alloy with the lowest Sn content (y = 0.56) due to the optimum doping level.
Abstract:
We propose a new approach for producing precise constrained slices of programs in a language such as C. We build on a previous term-rewriting-based approach to this problem, which primarily targets loop-free fragments and is fully precise in that setting. We incorporate abstract interpretation into term rewriting, using a given arbitrary abstract lattice, resulting in a novel technique for slicing loops whose precision is linked to the power of the given abstract lattice. We address pointers in a first-class manner, including when they are used within loops to traverse and update recursive data structures. Finally, we illustrate the comparative precision of our slices over those of previous approaches using representative examples.
Abstract:
Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallel models in turn enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic that selects the most profitable parallelization idiom for every dependence type and communication pattern. This heuristic enables the extraction of interband parallelism (cross-barrier parallelism) in a number of numerical computations that range from linear algebra to structured grids and image processing. The proposed static analysis and code generation alleviate the burden of a full-blown dependence resolver to track the readiness of tasks at runtime. We evaluate our approach and algorithms in the PPCG compiler, targeting OpenStream, a representative dataflow task-parallel language with explicit intertask dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.
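To illustrate the shape of a dynamically constructed graph of dependent tasks (not the paper's OpenStream code generation), here is a small Python sketch of a blocked wavefront computation in which each task waits on its north and west predecessors, so the parallelism available along anti-diagonals is exposed at runtime; the names and granularity are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def wavefront(n, work):
    """Run an n x n grid of tasks where task (i, j) depends on (i-1, j) and
    (i, j-1); tasks are submitted eagerly and block on their predecessors'
    futures, mimicking explicit intertask dependences."""
    futures = {}
    with ThreadPoolExecutor() as pool:
        def task(i, j):
            if i > 0:
                futures[(i - 1, j)].result()   # wait on north predecessor
            if j > 0:
                futures[(i, j - 1)].result()   # wait on west predecessor
            return work(i, j)
        for i in range(n):
            for j in range(n):                 # row-major submission order
                futures[(i, j)] = pool.submit(task, i, j)
        return {key: f.result() for key, f in futures.items()}

# Example usage: results = wavefront(4, lambda i, j: i + j)
```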
Abstract:
Enhancement of the superconducting transition temperature (Tc) of Fe1+xSe, the parent superconductor of the 'Fe-11' family, by Cr substitution for excess Fe has motivated us to investigate the effect of Cr substitution at the Fe site in the optimal superconductor Fe1+xSe0.5Te0.5. Here, we report the structural, magnetic, electrical transport, thermal transport and heat capacity properties of the Cr-substituted compounds. X-ray diffraction measurements confirm the substitution of Cr atoms into the host lattice. Magnetic and electrical transport measurements are used to explore the superconducting properties, where Cr-substituted compounds show an improvement in the superconducting diamagnetic fraction with the same Tc as the undoped one. Heat capacity measurements confirm the bulk superconducting properties of the compounds. Thermopower measurements characterize the type of charge carriers in the normal state. (C) 2015 Elsevier Ltd. All rights reserved.
Abstract:
An attempt has been made to bring out the influence of fabric changes and the formation of new cementitious compounds on the strength and volume-change behavior of a soil treated with various lime contents and cured for different periods. The effects of fabric changes due to treatment with various lime contents (0, 2, 4 and 6%) and curing periods (0, 7, 14 and 28 days) have been evaluated by one-dimensional consolidation tests, in terms of void-ratio changes and compressibility. The strength of soil treated with different lime contents and curing periods up to 28 days, and with the optimum lime content of 6% for up to one year, has been determined by unconfined compression tests. Comparison of the effects of lime on the strength and volume-change behavior of the soil brings out that the formation of a flocculated fabric and cation exchange significantly reduce the compressibility of the soil but only marginally increase the strength. Cementation of soil particles and the filling of the voids of the flocculated fabric with cementitious compounds marginally reduce the compressibility but significantly increase the strength. Thus, the mechanism of volume-change behavior of soil treated with a lower lime content at short curing periods is distinctly different from that of soil treated with the optimum lime content at longer curing periods. This is consistent with the increase in permeability caused by the addition of lime from 2 to 4% and the decrease following the addition of 6% lime. Changes in microstructure, mineralogy, chemical composition and alkalinity consistent with the mechanical behavior have been determined by scanning electron microscopy, X-ray diffraction and thermal analyses, energy-dispersive X-ray spectrometry and pH measurements, respectively. (C) 2015 Published by Elsevier B.V.
Abstract:
Clock synchronization in a wireless sensor network (WSN) is essential as it provides a consistent and coherent time frame for all the nodes across the network. Typically, clock synchronization is achieved by message passing using a contention-based scheme for medium access, like carrier sense multiple access (CSMA). The nodes try to synchronize with each other by sending synchronization request messages. If many nodes try to send messages simultaneously, contention-based schemes cannot efficiently avoid collisions; the resulting message losses, in turn, affect the convergence of the synchronization algorithms. However, the number of collisions can be reduced with a frame-based approach like time division multiple access (TDMA) for message passing. In this paper, we propose a design that utilizes a TDMA-based medium access control (MAC) protocol to improve the performance of clock synchronization protocols. The basic idea is to use TDMA-based transmissions when the degree of synchronization improves among the sensor nodes during the execution of the clock synchronization algorithm. The design significantly reduces the collisions among the synchronization protocol messages. We have simulated the proposed protocol in the Castalia network simulator. The simulation results show that the proposed protocol significantly reduces the time required for synchronization and also improves the accuracy of the synchronization algorithm.
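A toy sketch of the switching idea: keep using contention-based CSMA while clocks are far apart, and move synchronization traffic into collision-free TDMA slots once the estimated synchronization error falls below a threshold; the threshold, slot length and slot-assignment rule here are illustrative assumptions, not the protocol's actual parameters.

```python
def choose_mac(sync_error_us, node_id, num_nodes, slot_len_ms=10,
               tdma_threshold_us=500.0):
    """Hypothetical MAC selection for synchronization messages.

    Returns ("CSMA", None) while the estimated synchronization error is too
    large for slotting, and ("TDMA", (slot_start_ms, slot_end_ms, frame_len_ms))
    once nodes are synchronized well enough to respect a TDMA schedule.
    """
    if sync_error_us > tdma_threshold_us:
        return ("CSMA", None)                     # contend; clocks too far apart
    frame_len = num_nodes * slot_len_ms
    slot_start = node_id * slot_len_ms            # each node owns one slot per frame
    return ("TDMA", (slot_start, slot_start + slot_len_ms, frame_len))
```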