20 resultados para Control-flow Analysis

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data-flow analysis is an integral part of any aggressive optimizing compiler. We propose a framework for improving the precision of data-flow analysis in the presence of complex control-flow. W initially perform data-flow analysis to determine those control-flow merges which cause the loss in data-flow analysis precision. The control-flow graph of the program is then restructured such that performing data-flow analysis on the resulting restructured graph gives more precise results. The proposed framework is both simple, involving the familiar notion of product automata, and also general, since it is applicable to any forward data-flow analysis. Apart from proving that our restructuring process is correct, we also show that restructuring is effective in that it necessarily leads to more optimization opportunities. Furthermore, the framework handles the trade-off between the increase in data-flow precision and the code size increase inherent in the restructuring. We show that determining an optimal restructuring is NP-hard, and propose and evaluate a greedy strategy. The framework has been implemented in the Scale research compiler, and instantiated for the specific problem of Constant Propagation. On the SPECINT 2000 benchmark suite we observe an average speedup of 4% in the running times over Wegman-Zadeck conditional constant propagation algorithm and 2% over a purely path profile guided approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new approach to the power flow analysis in steady state for multiterminal DC-AC systems. A flexible and practical choice of per unit system is used to formulate the DC network and converter equations. A converter is represented by Norton's equivalent of a current source in parallel with the commutation resistance. Unlike in previous literature, the DC network equations are used to derive the controller equations for the DC system using a subset of specifications. The specifications considered are current or power at all terminals except the slack terminal where the DC voltage is specified. The control equations are solved by Newton's method, using the current injections at the converter terminals as state variables. Further, a systematic approach to the handling of constraints is proposed by identifying the priorities in rescheduling of the specified variables. The methodology is illustrated by example of a 5 terminal DC system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Static analysis (aka offline analysis) of a model of an IP network is useful for understanding, debugging, and verifying packet flow properties of the network. Data-flow analysis is a method that has typically been applied to static analysis of programs. We propose a new, data-flow based approach for static analysis of packet flows in networks. We also investigate an application of our analysis to the problem of inferring a high-level policy from the network, which has been addressed in the past only for a single router.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Branch divergence is a very commonly occurring performance problem in GPGPU in which the execution of diverging branches is serialized to execute only one control flow path at a time. Existing hardware mechanism to reconverge threads using a stack causes duplicate execution of code for unstructured control flow graphs. Also the stack mechanism cannot effectively utilize the available parallelism among diverging branches. Further, the amount of nested divergence allowed is also limited by depth of the branch divergence stack. In this paper we propose a simple and elegant transformation to handle all of the above mentioned problems. The transformation converts an unstructured CFG to a structured CFG without duplicating user code. It incurs only a linear increase in the number of basic blocks and also the number of instructions. Our solution linearizes the CFG using a predicate variable. This mechanism reconverges the divergent threads as early as possible. It also reduces the depth of the reconvergence stack. The available parallelism in nested branches can be effectively extracted by scheduling the basic blocks to reduce the effect of stalls due to memory accesses. It can also increase execution efficiency of nested loops with different trip counts for different threads. We implemented the proposed transformation at PTX level using the Ocelot compiler infrastructure. We evaluated the technique using various benchmarks to show that it can be effective in handling the performance problem due to divergence in unstructured CFGs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Computations have been carried out for simulating supersonic flow through a set of converging-diverging nozzles with their expanding jets forming a laser cavity and flow patterns through diffusers, past the cavity. A thorough numerical investigation with 3-D RANS code is carried out to capture the flow distribution which comprises of shock patterns and multiple supersonic jet interactions. The analysis of pressure recovery characteristics during the flow through the diffusers is an important parameter of the simulation and is critical for the performance of the laser device. The results of the computation have shown a close agreement with the experimentally measured parameters as well as other established results indicating that the flow analysis done is found to be satisfactory.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper describes an approach for the analysis and design of 765kV/400kV EHV transmission system which is a typical expansion in Indian power grid system, based on the analysis of steady state and transient over voltages. The approach for transmission system design is iterative in nature. The first step involves exhaustive power flow analysis, based on constraints such as right of way, power to be transmitted, power transfer capabilities of lines, existing interconnecting transformer capabilities etc. Acceptable bus voltage profiles and satisfactory equipment loadings during all foreseeable operating conditions for normal and contingency operation are the guiding criteria. Critical operating strategies are also evolved in this initial design phase. With the steady state over voltages obtained, comprehensive dynamic and transient studies are to be carried out including switching over voltages studies. This paper presents steady state and switching transient studies for alternative two typical configurations of 765kV/400 kV systems and the results are compared. Transient studies are carried out to obtain the peak values of 765 kV transmission systems and are compared with the alternative configurations of existing 400 kV systems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Compiler optimizations need precise and scalable analyses to discover program properties. We propose a partially flow-sensitive framework that tries to draw on the scalability of flow-insensitive algorithms while providing more precision at some specific program points. Provided with a set of critical nodes — basic blocks at which more precise information is desired — our partially flow-sensitive algorithm computes a reduced control-flow graph by collapsing some sets of non-critical nodes. The algorithm is more scalable than a fully flow-sensitive one as, assuming that the number of critical nodes is small, the reduced flow-graph is much smaller than the original flow-graph. At the same time, a much more precise information is obtained at certain program points than would had been obtained from a flow-insensitive algorithm.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Transaction processing is a key constituent of the IT workload of commercial enterprises (e.g., banks, insurance companies). Even today, in many large enterprises, transaction processing is done by legacy "batch" applications, which run offline and process accumulated transactions. Developers acknowledge the presence of multiple loosely coupled pieces of functionality within individual applications. Identifying such pieces of functionality (which we call "services") is desirable for the maintenance and evolution of these legacy applications. This is a hard problem, which enterprises grapple with, and one without satisfactory automated solutions. In this paper, we propose a novel static-analysis-based solution to the problem of identifying services within transaction-processing programs. We provide a formal characterization of services in terms of control-flow and data-flow properties, which is well-suited to the idioms commonly exhibited by business applications. Our technique combines program slicing with the detection of conditional code regions to identify services in accordance with our characterization. A preliminary evaluation, based on a manual analysis of three real business programs, indicates that our approach can be effective in identifying useful services from batch applications.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Fast Decoupled Load Flow (FDLF) is a very popular and widely used power flow analysis method because of its simplicity and efficiency. Even though the basic FDLF algorithm is well investigated, the same is not true in the case of additional schemes/modifications required to obtain adjusted load flow solutions using the FDLF method. Handling generator Q limits is one such important feature needed in any practical load flow method. This paper presents a comprehensive investigation of two classes of schemes intended to handle this aspect i.e. the bus type switching scheme and the sensitivity scheme. We propose two new sensitivity based schemes and assess their performance in comparison with the existing schemes. In addition, a new scheme to avoid the possibility of anomalous solutions encountered while using the conventional schemes is also proposed and evaluated. Results from extensive simulation studies are provided to highlight the strengths and weaknesses of these existing and proposed schemes, especially from the point of view of reliability.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

On the basis of a more realistic tetrakaidecahedral structure of foam bubbles, a network model of static foam drainage has been developed. The model considers the foam to be made up of films and Plateau borders. The films drain into the adjacent Plateau borders, which in turn form a network through which the liquid moves from the foam to the liquid pool. From the structure, a unit flow cell was found, which constitutes the foam when stacked together both horizontally and vertically. Symmetry in the unit flow cell indicates that the flow analysis of a part of it can be employed to obtain the drainage for the whole foam. Material balance equations have been written for each segment of this subsection, ensuring connectivity, and solved with the appropriate boundary and initial conditions. The calculated rates of drainage, when compared with the available experimental results, indicate that the model predicts the experimental results well.