53 resultados para Computer programs.


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Memory models for shared-memory concurrent programming languages typically guarantee sequential consistency (SC) semantics for datarace-free (DRF) programs, while providing very weak or no guarantees for non-DRF programs. In effect programmers are expected to write only DRF programs, which are then executed with SC semantics. With this in mind, we propose a novel scalable solution for dataflow analysis of concurrent programs, which is proved to be sound for DRF programs with SC semantics. We use the synchronization structure of the program to propagate dataflow information among threads without requiring to consider all interleavings explicitly. Given a dataflow analysis that is sound for sequential programs and meets certain criteria, our technique automatically converts it to an analysis for concurrent programs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With proliferation of chip multicores (CMPs) on desktops and embedded platforms, multi-threaded programs have become ubiquitous. Existence of multiple threads may cause resource contention, such as, in on-chip shared cache and interconnects, depending upon how they access resources. Hence, we propose a tool - Thread Contention Predictor (TCP) to help quantify the number of threads sharing data and their sharing pattern. We demonstrate its use to predict a more profitable shared, last level on-chip cache (LLC) access policy on CMPs. Our cache configuration predictor is 2.2 times faster compared to the cycle-accurate simulations. We also demonstrate its use for identifying hot data structures in a program which may cause performance degradation due to false data sharing. We fix layout of such data structures and show up-to 10% and 18% improvement in execution time and energy-delay product (EDP), respectively.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Large software systems are developed by composing multiple programs. If the programs manip-ulate and exchange complex data, such as network packets or files, it is essential to establish that they follow compatible data formats. Most of the complexity of data formats is associated with the headers. In this paper, we address compatibility of programs operating over headers of network packets, files, images, etc. As format specifications are rarely available, we infer the format associated with headers by a program as a set of guarded layouts. In terms of these formats, we define and check compatibility of (a) producer-consumer programs and (b) different versions of producer (or consumer) programs. A compatible producer-consumer pair is free of type mismatches and logical incompatibilities such as the consumer rejecting valid outputs gen-erated by the producer. A backward compatible producer (resp. consumer) is guaranteed to be compatible with consumers (resp. producers) that were compatible with its older version. With our prototype tool, we identified 5 known bugs and 1 potential bug in (a) sender-receiver modules of Linux network drivers of 3 vendors and (b) different versions of a TIFF image library.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a new approach for producing precise constrained slices of programs in a language such as C. We build upon a previous approach for this problem, which is based on term-rewriting, which primarily targets loop-free fragments and is fully precise in this setting. We incorporate abstract interpretation into term-rewriting, using a given arbitrary abstract lattice, resulting in a novel technique for slicing loops whose precision is linked to the power of the given abstract lattice. We address pointers in a first-class manner, including when they are used within loops to traverse and update recursive data structures. Finally, we illustrate the comparative precision of our slices over those of previous approaches using representative examples.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallelmodels in turn enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic that selects the most profitable parallelization idiom for every dependence type and communication pattern. This heuristic enables the extraction of interband parallelism (cross-barrier parallelism) in a number of numerical computations that range from linear algebra to structured grids and image processing. The proposed static analysis and code generation alleviates the burden of a full-blown dependence resolver to track the readiness of tasks at runtime. We evaluate our approach and algorithms in the PPCG compiler, targeting OpenStream, a representative dataflow task-parallel language with explicit intertask dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Eklundh's (1972) algorithm to transpose a large matrix stored on an external device such as a disc has been programmed and tested. A simple description of computer implementation is given in this note.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The benefits that accrue from the use of design database include (i) reduced costs of preparing data for application programs and of producing the final specification, and (ii) possibility of later usage of data stored in the database for other applications related to Computer Aided Engineering (CAE). An INTEractive Relational GRAphics Database (INTERGRAD) based on relational models has been developed to create, store, retrieve and update the data related to two dimensional drawings. INTERGRAD provides two languages, Picture Definition Language (PDL) and Picture Manipulation Language (PML). The software package has been implemented on a PDP 11/35 system under the RSX-11M version 3.1 operating system and uses the graphics facility consisting of a VT-11 graphics terminal, the DECgraphic 11 software and an input device, a lightpen.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Polytypes have been simulated, treating them as analogues of a one-dimensional spin-half Ising chain with competing short-range and infinite-range interactions. Short-range interactions are treated as random variables to approximate conditions of growth from melt as well as from vapour. Besides ordered polytypes up to 12R, short stretches of long-period polytypes (up to 33R) have been observed. Such long-period sequences could be of significance in the context of Frank's theory of polytypism. The form of short-range interactions employed in the study has been justified by carrying out model potential calculations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Functional Programming (FP) systems are modified and extended to form Nondeterministic Functional Programming (NFP) systems in which nondeterministic programs can be specified and both deterministic and nondeterministic programs can be verified essentially within the system. It is shown that the algebra of NFP programs has simpler laws in comparison with the algebra of FP programs. "Regular" forms are introduced to put forward a disciplined way of reasoning about programs. Finally, an alternative definition of "linear" forms is proposed for reasoning about recursively defined programs. This definition, when used to test the linearity of forms, results in simpler verification conditions than those generated by the original definition of linear forms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A hot billet in contact with relatively cold dies undergoes rapid cooling in the forging operation. This may give rise to unfilled cavities, poor surface finish and stalling of the press. A knowledge of billet-die temperatures as a function of time is therefore essential for process design. A computer code using finite difference method is written to estimate such temperature histories and validated by comparing the predicted cooling of an integral die-billet configuration with that obtained experimentally.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Using the link-link incidence matrix to represent a simple-jointed kinematic chain algebraic procedures have been developed to determine its structural characteristics such as the type of freedom of the chain, the number of distinct mechanisms and driving mechanisms that can be derived from the chain. A computer program incorporating these graph theory based procedures has been applied successfully for the structural analysis of several typical chains.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is shown that a leaky aquifer model can be used for well field analysis in hard rock areas, treating the upper weathered and clayey layers as a composite unconfined aquitard overlying a deeper fractured aquifer. Two long-duration pump test studies are reported in granitic and schist regions in the Vedavati river basin. The validity of simplifications in the analytical solution is verified by finite difference computations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An estimate of the irrigation potential over and above the existing utilization was made based on the ground water potential in the Vedavati river basin. The estimate is based on assumed crops and cropping patterns as per existing practice in the various taluks of the basin. Irrigation potential was estimated talukwise based on the available ground water potential identified from the simulation study. It is estimated that 84,100 hectares of additional land can be brought under irrigation from ground water in the entire basin.