15 results for Data flow

at Indian Institute of Science - Bangalore - India


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Data-flow analysis is an integral part of any aggressive optimizing compiler. We propose a framework for improving the precision of data-flow analysis in the presence of complex control-flow. We initially perform data-flow analysis to determine those control-flow merges which cause the loss in data-flow analysis precision. The control-flow graph of the program is then restructured such that performing data-flow analysis on the resulting restructured graph gives more precise results. The proposed framework is both simple, involving the familiar notion of product automata, and also general, since it is applicable to any forward data-flow analysis. Apart from proving that our restructuring process is correct, we also show that restructuring is effective in that it necessarily leads to more optimization opportunities. Furthermore, the framework handles the trade-off between the increase in data-flow precision and the code size increase inherent in the restructuring. We show that determining an optimal restructuring is NP-hard, and propose and evaluate a greedy strategy. The framework has been implemented in the Scale research compiler, and instantiated for the specific problem of Constant Propagation. On the SPECINT 2000 benchmark suite we observe an average speedup of 4% in the running times over the Wegman-Zadeck conditional constant propagation algorithm and 2% over a purely path profile guided approach.
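The precision loss at control-flow merges mentioned in this abstract can be illustrated with a small sketch. This is not the paper's framework or the Scale compiler's code; it is a minimal constant-propagation lattice, and all names (`TOP`, `BOTTOM`, `meet`, `meet_envs`, `add`) are hypothetical. It compares evaluating `z = x + y` after a merge with evaluating it on each path separately, which is what duplicating the statement into both branches would allow.

```python
# Illustrative sketch only: a tiny constant-propagation lattice.
TOP = "TOP"        # no information yet (hypothetical name)
BOTTOM = "BOTTOM"  # value is not a known constant (hypothetical name)


def meet(a, b):
    """Combine two lattice values that arrive at a control-flow merge."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM


def meet_envs(env1, env2):
    """Pointwise meet of two variable -> lattice value maps."""
    names = env1.keys() | env2.keys()
    return {v: meet(env1.get(v, TOP), env2.get(v, TOP)) for v in names}


def add(a, b):
    """Abstract addition over the lattice."""
    if BOTTOM in (a, b):
        return BOTTOM
    if TOP in (a, b):
        return TOP
    return a + b


# Program under analysis:
#   if c: x = 1; y = 2
#   else: x = 2; y = 1
#   z = x + y
then_env = {"x": 1, "y": 2}
else_env = {"x": 2, "y": 1}

# Analysis on the original graph: merge first, then evaluate z.
merged = meet_envs(then_env, else_env)
merged_z = add(merged["x"], merged["y"])

# Analysis as if `z = x + y` were duplicated into each branch:
# evaluate z per path, then merge the results.
per_path_z = meet(
    add(then_env["x"], then_env["y"]),
    add(else_env["x"], else_env["y"]),
)

print(merged_z, per_path_z)
```

In this sketch the merged analysis cannot prove `z` constant, while the per-path analysis finds the same constant on both paths. The price is one duplicated statement, which is the kind of code-size trade-off the abstract refers to.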

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Abstract is not available.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many novel computer architectures, such as array processors and multiprocessors, which achieve high performance through the use of concurrency, exploit variations of the von Neumann model of computation. The effective utilization of these machines makes special demands on programmers and their programming languages, such as the structuring of data into vectors or the partitioning of programs into concurrent processes. In comparison, the data flow model of computation demands only that the principle of structured programming be followed. A data flow program, often represented as a data flow graph, is a program that expresses a computation by indicating the data dependencies among operators. A data flow computer is a machine designed to take advantage of concurrency in data flow graphs by executing data independent operations in parallel. In this paper, we discuss the design of a high level language (DFL: Data Flow Language) suitable for data flow computers. Some sample procedures in DFL are presented. Implementation aspects are not discussed in detail, since no new problems were encountered. The language DFL embodies the concepts of functional programming, but in appearance closely resembles Pascal. The language is a better vehicle than the data flow graph for expressing a parallel algorithm. The compiler has been implemented on a DEC 1090 system in Pascal.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Data flow computers are high-speed machines in which an instruction is executed as soon as all its operands are available. This paper describes the EXtended MANchester (EXMAN) data flow computer, which incorporates three major extensions to the basic Manchester machine. As extensions we provide a multiple matching units scheme, an efficient implementation of the array data structure, and a facility to concurrently execute reentrant routines. A simulator for the EXMAN computer has been coded in the discrete event simulation language, SIMULA 67, on the DEC 1090 system. Performance analysis studies have been conducted on the simulated EXMAN computer to study the effectiveness of the proposed extensions. The performance experiments have been carried out using three sample problems: matrix multiplication, Bresenham's line drawing algorithm, and the polygon scan-conversion algorithm.
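The firing rule stated in the first sentence of this abstract (an instruction executes as soon as all its operands are available) can be sketched as a toy interpreter. This is only an illustration under simplifying assumptions: it is sequential, it does not model the EXMAN or Manchester machine, and the function and node names (`run_dataflow`, `sum`, `diff`, `prod`) are hypothetical.

```python
import operator
from collections import deque


def run_dataflow(nodes, initial_tokens):
    """Toy data-driven evaluation of a data flow graph.

    nodes: name -> (function, list of input names)
    initial_tokens: name -> value for the graph inputs
    Returns all computed values and the order in which nodes fired.
    """
    values = dict(initial_tokens)

    # Map each value name to the nodes that consume it.
    consumers = {}
    for name, (_, inputs) in nodes.items():
        for inp in inputs:
            consumers.setdefault(inp, []).append(name)

    def is_ready(name):
        return all(inp in values for inp in nodes[name][1])

    ready = deque(name for name in nodes if is_ready(name))
    fired = []
    while ready:
        name = ready.popleft()
        if name in values:  # already fired
            continue
        fn, inputs = nodes[name]
        values[name] = fn(*(values[inp] for inp in inputs))
        fired.append(name)
        # A new result token may enable downstream nodes.
        for consumer in consumers.get(name, []):
            if consumer not in values and is_ready(consumer):
                ready.append(consumer)
    return values, fired


# Graph for (a + b) * (a - b); "sum" and "diff" are data independent,
# so a real data flow machine could execute them in parallel.
graph = {
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["a", "b"]),
    "prod": (operator.mul, ["sum", "diff"]),
}
values, fired = run_dataflow(graph, {"a": 5, "b": 3})
print(values["prod"], fired)
```

Execution order here is driven entirely by operand availability rather than by a program counter: `prod` can fire only after both `sum` and `diff` have produced their results.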

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a detailed description of the hardware design and implementation of PROMIDS: a PROtotype Multi-rIng Data flow System for functional programming languages. The hardware constraints and the design trade-offs are discussed. The design of the functional units is described in detail. Finally, we report our experience with PROMIDS.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper discusses implementation details of efficient schemes for lenient execution and concurrent execution of re-entrant routines in a data flow model. The proposed schemes require no extra hardware support and make effective use of existing hardware resources, such as the Matching Unit and Memory Network Interface, to achieve these goals.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Static analysis (aka offline analysis) of a model of an IP network is useful for understanding, debugging, and verifying packet flow properties of the network. Data-flow analysis is a method that has typically been applied to static analysis of programs. We propose a new, data-flow based approach for static analysis of packet flows in networks. We also investigate an application of our analysis to the problem of inferring a high-level policy from the network, which has been addressed in the past only for a single router.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

A scheme for integration of stand-alone INS and GPS sensors is presented, with data interchange over an external bus. This ensures modularity and sensor interchangeability. Use of a medium-coupled scheme reduces data flow and computation, facilitating use in surface vehicles. Results show that the hybrid navigation system is capable of delivering high positioning accuracy.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In this paper we develop compilation techniques for the realization of applications described in a High Level Language (HLL) onto a Runtime Reconfigurable Architecture. The compiler determines Hyper Operations (HyperOps) that are subgraphs of a data flow graph (of an application) and comprise elementary operations that have a strong producer-consumer relationship. These HyperOps are hosted on computation structures that are provisioned on demand at runtime. We also report compiler optimizations that collectively reduce the overheads of data-driven computations in runtime reconfigurable architectures. On average, HyperOps offer a 44% reduction in total execution time and an 18% reduction in management overheads as compared to using basic blocks as coarse grained operations. We show that HyperOps formed using our compiler are suitable to support data flow software pipelining.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In this paper, we look at the problem of scheduling expression trees with reusable registers on delayed load architectures. Reusable registers come into the picture when the compiler has a data-flow analyzer which is able to estimate the extent of use of the registers. Earlier work considered the same problem without allowing for register variables. Subsequently, Venugopal considered non-reusable registers in the tree. We further extend these efforts to consider a much more general form of the tree. We describe an approximate algorithm for the problem. We formally prove that the code schedule produced by this algorithm will, in the worst case, generate one interlock and use just one more register than that used by the optimal schedule. Spilling is minimized. The approximate algorithm is simple and has linear complexity.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

MATLAB is an array language that was initially popular for rapid prototyping but is now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Many shallow landslides are triggered by heavy rainfall on hill slopes, resulting in enormous casualties and huge economic losses in mountainous regions. Hill slope failure usually occurs as soil resistance deteriorates in the presence of the acting stress developed due to a number of reasons such as increased soil moisture content, change in land use causing slope instability, etc. Landslides triggered by rainfall can possibly be foreseen in real time by jointly using rainfall intensity-duration and information related to land surface susceptibility. Terrain analysis applications using spatial data such as aspect, slope, flow direction, compound topographic index, etc., along with information derived from remotely sensed data such as land cover / land use maps, permit us to quantify and characterise the physical processes governing the landslide occurrence phenomenon. In this work, the probable landslide prone areas are predicted using two different algorithms, GARP (Genetic Algorithm for Rule-set Prediction) and Support Vector Machine (SVM), in a free and open source software package, openModeller. Several environmental layers such as aspect, digital elevation data, flow accumulation, flow direction, slope, land cover, compound topographic index, and precipitation data were used in modelling. A comparison of the simulated outputs, validated by overlaying the actual landslide occurrence points, showed 92% accuracy with GARP and 96% accuracy with SVM in predicting landslide prone areas considering precipitation in the wettest month, whereas 91% and 94% accuracy were obtained from GARP and SVM, respectively, considering precipitation in the wettest quarter of the year.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

MATLAB is an array language that was initially popular for rapid prototyping but is now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Transaction processing is a key constituent of the IT workload of commercial enterprises (e.g., banks, insurance companies). Even today, in many large enterprises, transaction processing is done by legacy "batch" applications, which run offline and process accumulated transactions. Developers acknowledge the presence of multiple loosely coupled pieces of functionality within individual applications. Identifying such pieces of functionality (which we call "services") is desirable for the maintenance and evolution of these legacy applications. This is a hard problem, which enterprises grapple with, and one without satisfactory automated solutions. In this paper, we propose a novel static-analysis-based solution to the problem of identifying services within transaction-processing programs. We provide a formal characterization of services in terms of control-flow and data-flow properties, which is well-suited to the idioms commonly exhibited by business applications. Our technique combines program slicing with the detection of conditional code regions to identify services in accordance with our characterization. A preliminary evaluation, based on a manual analysis of three real business programs, indicates that our approach can be effective in identifying useful services from batch applications.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Effective air flow distribution through perforated tiles is required to efficiently cool servers in a raised floor data center. We present detailed computational fluid dynamics (CFD) modeling of air flow through a perforated tile and its entrance to the adjacent server rack. The realistic geometrical details of the perforated tile, as well as of the rack, are included in the model. Generally, models for air flow through perforated tiles specify a step pressure loss across the tile surface based on the tile porosity (the porous jump model). An improvement to this adds a momentum source specification above the tile to simulate the acceleration of the air flow through the pores (the body force model). In both of these models, geometrical details of the tile such as pore locations and shapes are not included. Including more detail increases the grid size as well as the computational time. However, the grid refinement can be controlled to achieve a balance between accuracy and computational time. We compared the results from CFD using geometrical resolution with the porous jump and body force model solutions, as well as with the flow field measured in particle image velocimetry (PIV) experiments. We observe that including the tile's geometrical details gives better results than omitting them and specifying physical models across and above the tile surface. A modification to the body force model is also suggested, and improved results were achieved.