59 results for control flow
at Indian Institute of Science - Bangalore - India
Abstract:
Branch divergence is a commonly occurring performance problem on GPGPUs, in which the execution of diverging branches is serialized so that only one control-flow path executes at a time. The existing hardware mechanism, which reconverges threads using a stack, causes duplicate execution of code for unstructured control-flow graphs. The stack mechanism also cannot effectively exploit the available parallelism among diverging branches, and the amount of nested divergence allowed is limited by the depth of the branch divergence stack. In this paper we propose a simple and elegant transformation that handles all of the above problems. The transformation converts an unstructured CFG to a structured CFG without duplicating user code, incurring only a linear increase in the number of basic blocks and in the number of instructions. Our solution linearizes the CFG using a predicate variable. This mechanism reconverges the divergent threads as early as possible and also reduces the depth of the reconvergence stack. The available parallelism in nested branches can be effectively extracted by scheduling the basic blocks to reduce the effect of stalls due to memory accesses, and the execution efficiency of nested loops with different trip counts for different threads can also be increased. We implemented the proposed transformation at the PTX level using the Ocelot compiler infrastructure and evaluated it on various benchmarks, showing that it is effective in handling the performance problems caused by divergence in unstructured CFGs.
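A minimal sketch of the core idea in Python, with hypothetical block names (the paper's actual transformation operates on PTX via Ocelot): the blocks of an unstructured CFG are laid out in one linear order, and a predicate variable names the block each thread should execute next, so divergent threads reconverge at every block boundary rather than on a deep divergence stack.

    # Sketch: linearized CFG driven by a predicate variable.
    # Original unstructured flow: A -> (B or C), B -> D, C -> (D or exit).
    def run_linearized(x):
        pred = "A"                           # predicate selects the one live block
        trace = []
        for block in ["A", "B", "C", "D"]:   # single linear block layout
            if pred != block:
                continue                     # masked-off block: fall through, no work
            trace.append(block)
            if block == "A":
                pred = "B" if x > 0 else "C"
            elif block == "B":
                pred = "D"
            elif block == "C":
                pred = "D" if x % 2 == 0 else "EXIT"
            elif block == "D":
                pred = "EXIT"
        return trace

    print(run_linearized(3))    # ['A', 'B', 'D']
    print(run_linearized(-3))   # ['A', 'C']

On a GPU, all threads of a warp would step through the same chain together with the predicate acting as the execution mask; no code is duplicated, and the chain grows only linearly in the number of blocks.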
Abstract:
Data-flow analysis is an integral part of any aggressive optimizing compiler. We propose a framework for improving the precision of data-flow analysis in the presence of complex control flow. We initially perform data-flow analysis to determine those control-flow merges which cause the loss in data-flow analysis precision. The control-flow graph of the program is then restructured such that performing data-flow analysis on the resulting restructured graph gives more precise results. The proposed framework is both simple, involving the familiar notion of product automata, and general, since it is applicable to any forward data-flow analysis. Apart from proving that our restructuring process is correct, we also show that restructuring is effective in that it necessarily leads to more optimization opportunities. Furthermore, the framework handles the trade-off between the increase in data-flow precision and the code size increase inherent in the restructuring. We show that determining an optimal restructuring is NP-hard, and propose and evaluate a greedy strategy. The framework has been implemented in the Scale research compiler, and instantiated for the specific problem of constant propagation. On the SPECINT 2000 benchmark suite we observe an average speedup of 4% in running times over the Wegman-Zadeck conditional constant propagation algorithm and 2% over a purely path-profile-guided approach.
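A toy sketch of the precision problem being attacked, assuming the standard constant-propagation lattice (this is not the Scale implementation itself): the join at a control-flow merge destroys a fact that holds separately on each incoming path, and duplicating the merge node, one copy per incoming fact, recovers it at the cost of code growth.

    # Constant propagation joins facts at a merge: x = 1 on one path and
    # x = 2 on the other join to "not a constant", so "y = x + 1" cannot fold.
    BOTTOM = "not-a-constant"

    def join(a, b):
        return a if a == b else BOTTOM

    facts_at_merge = [1, 2]            # x along the two incoming paths
    print(join(*facts_at_merge))       # not-a-constant

    # After restructuring, the merge (and the code below it) is duplicated,
    # one copy per incoming fact; each copy sees a single constant and folds.
    for x in facts_at_merge:
        print("y folds to", x + 1)     # 2, then 3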
Abstract:
Compiler optimizations need precise and scalable analyses to discover program properties. We propose a partially flow-sensitive framework that tries to draw on the scalability of flow-insensitive algorithms while providing more precision at specific program points. Provided with a set of critical nodes, i.e., basic blocks at which more precise information is desired, our partially flow-sensitive algorithm computes a reduced control-flow graph by collapsing sets of non-critical nodes. The algorithm is more scalable than a fully flow-sensitive one because, assuming the number of critical nodes is small, the reduced flow graph is much smaller than the original flow graph. At the same time, much more precise information is obtained at the critical program points than would have been obtained from a flow-insensitive algorithm.
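The reduction step can be sketched as below; the node names and the single-pass grouping rule (sufficient for this chain-shaped example) are illustrative simplifications, not the paper's algorithm verbatim.

    # Collapse non-critical nodes into summary regions; a flow-sensitive
    # pass then runs only over the much smaller reduced graph.
    def reduce_cfg(edges, critical):
        region = {}                       # node -> representative region
        for u, v in edges:
            region.setdefault(u, u)
            region.setdefault(v, v)
            if u not in critical and v not in critical:
                region[v] = region[u]     # merge v into u's region
        reduced = {(region[u], region[v]) for u, v in edges}
        return {e for e in reduced if e[0] != e[1]}   # drop collapsed self-loops

    edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
    print(reduce_cfg(edges, critical={"C"}))
    # {('A', 'C'), ('C', 'D')}: A,B collapse together, as do D,E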
Abstract:
Better operational control of water networks can help reduce leakage, maintain pressure, and control flow. Proportional-integral-derivative (PID) controllers, with proper fine-tuning, can help water utility operators reach targets faster without creating undue transients. The authors compared three tuning methods, in different test situations, involving flow and level control to different reservoirs. Although target values were reached with all three tuning methods, their performance varied significantly. The lowest performer among the three was the method most widely used in the industry: standard tuning by the Ziegler-Nichols method. Offline tuning by genetic algorithms achieved better results, and the best control was achieved by a fuzzy logic based online tuning approach, the FZPID controller. The FZPID controller had fewer overshoots and took significantly less time to tune the gains for each problem. This new tuning approach for PID controllers can be applied to a variety of problems and can increase the performance of water networks of any size and structure.
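For reference, a minimal sketch of the baseline the study compares against: a discrete PID step tuned by the classic closed-loop Ziegler-Nichols rules (Kp = 0.6 Ku, Ti = Tu/2, Td = Tu/8). The ultimate gain Ku and ultimate period Tu below are illustrative, not values from the paper; the GA and FZPID tuners replace this fixed rule.

    def ziegler_nichols(Ku, Tu):
        Kp = 0.6 * Ku
        Ki = Kp / (Tu / 2.0)   # from integral time Ti = Tu/2
        Kd = Kp * (Tu / 8.0)   # from derivative time Td = Tu/8
        return Kp, Ki, Kd

    def pid_step(setpoint, measured, state, gains, dt):
        Kp, Ki, Kd = gains
        error = setpoint - measured
        state["integral"] += error * dt
        derivative = (error - state["prev_error"]) / dt
        state["prev_error"] = error
        return Kp * error + Ki * state["integral"] + Kd * derivative

    gains = ziegler_nichols(Ku=4.0, Tu=10.0)   # illustrative plant values
    state = {"integral": 0.0, "prev_error": 0.0}
    print(pid_step(setpoint=1.0, measured=0.2, state=state, gains=gains, dt=0.1))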
Abstract:
Donor-doped n-BaTiO3 polycrystalline ceramics show a strong negative temperature coefficient of resistivity below the orthorhombic-rhombohedral phase transition point, rising from 10^2-10^3 Ω cm at 190 K to 10^10-10^13 Ω cm at ≲50 K, with a thermal coefficient of resistance α = 20-23% K⁻¹. Stable thermal sensors for low-temperature applications are realized therefrom. The negative-temperature-coefficient region can be modified by substituting isovalent ions in the lattice. Highly nonlinear current-voltage (I-V) curves are observed at low temperatures, with a voltage maximum followed by negative differential resistance. The I-V curves are sensitive to dissipation, so cryogenic sensors can be fabricated for liquid level control, flow-rate monitoring, radiation detection, or in-rush voltage limitation.
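The quoted thermal coefficient is the usual logarithmic temperature coefficient of resistance, α = (1/R) dR/dT ≈ Δ(ln R)/ΔT. A quick check with illustrative endpoints, chosen only to match the abstract's orders of magnitude (not measured data):

    import math

    def tcr(R1, T1, R2, T2):
        # Average logarithmic TCR between two (temperature, resistance) points.
        return (math.log(R2) - math.log(R1)) / (T2 - T1)

    # Resistivity rising from ~1e3 to ~1e12 Ohm*cm as T falls from 190 K to 90 K:
    alpha = tcr(1e3, 190.0, 1e12, 90.0)
    print(round(alpha * 100, 1), "% per K")   # about -20.7, i.e. |alpha| ~ 21% K^-1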
Abstract:
MATLAB is an array language, initially popular for rapid prototyping, but now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism, but they also have control-flow-dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control-flow-dominated regions to the CPU and the data-parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data-parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data-parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or the GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. To ensure the required data movement for dependences across basic blocks, we propose a data-flow analysis and edge-splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to the CPU and GPU, scheduling, and insertion of the required data transfers. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data-parallel benchmarks over native execution of MATLAB.
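As a loose illustration only (MEGHA actually formulates kernel composition as a constrained graph clustering problem, which this greedy run-based sketch does not capture): adjacent data-parallel statements are fused into kernels destined for the GPU, while scalar, control-flow-dominated statements stay on the CPU.

    stmts = [("s1", "data_parallel"), ("s2", "data_parallel"),
             ("s3", "scalar"), ("s4", "data_parallel"), ("s5", "data_parallel")]

    kernels, cpu, current = [], [], []
    for name, kind in stmts:
        if kind == "data_parallel":
            current.append(name)         # grow the current GPU kernel
        else:
            if current:
                kernels.append(current)  # a scalar statement closes the kernel
                current = []
            cpu.append(name)
    if current:
        kernels.append(current)

    print("GPU kernels:", kernels)       # [['s1', 's2'], ['s4', 's5']]
    print("CPU stmts:", cpu)             # ['s3']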
Abstract:
Transaction processing is a key constituent of the IT workload of commercial enterprises (e.g., banks, insurance companies). Even today, in many large enterprises, transaction processing is done by legacy "batch" applications, which run offline and process accumulated transactions. Developers acknowledge the presence of multiple loosely coupled pieces of functionality within individual applications. Identifying such pieces of functionality (which we call "services") is desirable for the maintenance and evolution of these legacy applications. This is a hard problem, which enterprises grapple with, and one without satisfactory automated solutions. In this paper, we propose a novel static-analysis-based solution to the problem of identifying services within transaction-processing programs. We provide a formal characterization of services in terms of control-flow and data-flow properties, which is well-suited to the idioms commonly exhibited by business applications. Our technique combines program slicing with the detection of conditional code regions to identify services in accordance with our characterization. A preliminary evaluation, based on a manual analysis of three real business programs, indicates that our approach can be effective in identifying useful services from batch applications.
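A toy rendering of the characterization, with hypothetical statement records (the paper combines program slicing with conditional-region detection; this sketch shows only the grouping-by-guard step): each maximal conditionally guarded region, together with the data it computes, is a candidate service.

    # Statements tagged with the condition guarding them (None = unguarded).
    program = [
        {"id": 1, "guard": None},
        {"id": 2, "guard": "type == 'A'"},
        {"id": 3, "guard": "type == 'A'"},
        {"id": 4, "guard": "type == 'B'"},
    ]

    services = {}
    for stmt in program:
        services.setdefault(stmt["guard"], []).append(stmt["id"])

    print(services)
    # {None: [1], "type == 'A'": [2, 3], "type == 'B'": [4]}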
Abstract:
We propose several stochastic approximation implementations for related algorithms in flow control of communication networks. First, a discrete-time implementation of Kelly's primal flow-control algorithm is proposed. Convergence with probability 1 is shown, even in the presence of communication delays and stochastic effects in link congestion indications. This follows from an analysis of the flow-control algorithm within the asynchronous stochastic approximation (ASA) framework. Two relevant enhancements are then pursued: (a) an implementation of the primal algorithm using second-order information, and (b) an implementation where edge routers rectify misbehaving flows. Next, discrete-time implementations of Kelly's dual algorithm and primal-dual algorithm are proposed. Simulation results (a) verifying the proposed algorithms and (b) comparing their stability properties are presented.
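A minimal discrete-time sketch of the primal update for one link shared by two flows, with an illustrative congestion-price function and step size (the paper analyzes the delayed, stochastic version within the ASA framework): each flow adjusts its rate x_r by k*(w_r - x_r*p), where p is the congestion price fed back by the network.

    def price(load, capacity=10.0):
        # Illustrative congestion price: grows with load, steeply past capacity.
        return load / capacity + max(0.0, load - capacity)

    w = [2.0, 3.0]       # willingness-to-pay of the two flows
    x = [1.0, 1.0]       # initial sending rates
    k = 0.05             # step size

    for _ in range(2000):
        p = price(x[0] + x[1])
        x = [xr + k * (wr - xr * p) for xr, wr in zip(x, w)]

    print([round(xr, 2) for xr in x])   # ~[2.83, 4.24]: rates proportional to w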
Abstract:
FACTS controllers are emerging as viable and economic solutions to the problems of large interconnected networks that can endanger system security. These devices are characterized by their fast response, absence of inertia, and minimal maintenance requirements. Thyristor-controlled equipment such as the Thyristor Controlled Series Capacitor (TCSC), Static Var Compensator (SVC), and Thyristor Controlled Phase angle Regulator (TCPR), which involve passive elements, results in devices of large size, with substantial cost and significant installation labour. An all-solid-state device using GTOs reduces equipment size and improves performance. The Unified Power Flow Controller (UPFC) is a versatile controller which can be used to control the active and reactive power in the line independently. The UPFC concept makes it possible to handle practically all power-flow control and transmission-line compensation problems using solid-state controllers, which provide functional flexibility generally not attainable with conventional thyristor-controlled systems. In this paper, we present the development of a control scheme for the series injected voltage of the UPFC to damp power oscillations and improve transient stability in a power system.
Abstract:
Experimental study and optimization of plasma actuators for flow control in the subsonic regime. PRADEEP MOISE, JOSEPH MATHEW, KARTIK VENKATRAMAN, JOY THOMAS, Indian Institute of Science, FLOW CONTROL TEAM. The induced jet produced by a dielectric barrier discharge (DBD) setup is capable of preventing flow separation on airfoils at high angles of attack. The effect of various parameters on the velocity of this induced jet was studied experimentally. The glow discharge was created at atmospheric conditions using a high-voltage RF power supply. Flow visualization, photographic studies of the plasma, and hot-wire measurements on the induced jet were performed. The parametric investigation of the characteristics of the plasma shows that the width of the plasma in the uniform glow discharge regime was an indication of the induced velocity. It was observed that the spanwise and streamwise overlap of the two electrodes, the dielectric thickness, and the voltage and frequency of the applied voltage are the major parameters that govern the velocity and extent of the plasma. The effect of the optimized configuration on the performance characteristics of an airfoil was studied experimentally.
Abstract:
This paper is concerned with the optimal flow control of an ATM switching element in a broadband integrated services digital network. We model the switching element as a stochastic fluid-flow system with a finite buffer, a constant-output-rate server, and a Gaussian process to characterize the input, which is a heterogeneous set of traffic sources. The fluid level should be maintained between two levels, namely b1 and b2, with b1
Abstract:
Control of flow in duct networks has a myriad of applications ranging from heating, ventilation, and air-conditioning to blood flow networks. The system considered here provides vent velocity inputs to a novel 3-D wind display device called the TreadPort Active Wind Tunnel. An error-based robust decentralized sliding-mode control method with nominal feedforward terms is developed for individual ducts while considering cross coupling between ducts and model uncertainty as external disturbances in the output. This approach is important due to limited measurements, geometric complexities, and turbulent flow conditions. Methods for resolving challenges such as turbulence, electrical noise, valve actuator design, and sensor placement are presented. The efficacy of the controller and the importance of feedforward terms are demonstrated with simulations based upon an experimentally validated lumped parameter model and experiments on the physical system. Results show significant improvement over traditional control methods and validate prior assertions regarding the importance of decentralized control in practice.
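A minimal sketch of the controller structure described, assuming a toy first-order duct model and illustrative gains (not the identified TPAWT model): a nominal feedforward term plus a sliding-mode term, with a boundary layer in place of the discontinuous sign function to soften chattering.

    def smc(v_ref, v, feedforward, K=2.0, phi=0.1):
        s = v - v_ref                        # sliding surface: velocity error
        sat = max(-1.0, min(1.0, s / phi))   # saturated switching term
        return feedforward - K * sat

    v, v_ref, dt = 0.0, 5.0, 0.01
    for _ in range(500):
        u = smc(v_ref, v, feedforward=0.5 * v_ref)  # nominal term cancels a*v_ref
        v += dt * (-0.5 * v + u)             # toy duct: dv/dt = -a*v + u
    print(round(v, 2))                       # converges to v_ref = 5.0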
Abstract:
The hot deformation behavior of hot isostatically pressed (HIPd) P/M IN-100 superalloy has been studied in the temperature range 1000-1200 degrees C and strain rate range 0.0003-10 s⁻¹ using hot compression testing. A processing map has been developed on the basis of these data, using the principles of dynamic materials modelling. The map exhibits three domains: one at 1050 degrees C and 0.01 s⁻¹, with a peak efficiency of power dissipation of approximately 32%; the second at 1150 degrees C and 10 s⁻¹, with a peak efficiency of approximately 36%; and the third at 1200 degrees C and 0.1 s⁻¹, with a similar efficiency. On the basis of optical and electron microscopic observations, the first domain was interpreted as dynamic recovery of the gamma phase, the second as dynamic recrystallization (DRX) of gamma in the presence of softer gamma', and the third as DRX of the gamma phase only. The gamma' phase is stable up to 1150 degrees C, gets deformed below this temperature, and the chunky gamma' accumulates dislocations, which at larger strains cause cracking of this phase. At temperatures lower than 1080 degrees C and strain rates higher than 0.1 s⁻¹, the material exhibits flow instability, manifested in the form of adiabatic shear bands. The material may be subjected to mechanical processing without cracking or instabilities at 1200 degrees C and 0.1 s⁻¹, which are the conditions for DRX of the gamma phase.
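The "efficiency of power dissipation" in such processing maps comes from the dynamic materials model, η = 2m/(m+1), where m = ∂(ln σ)/∂(ln ε̇) is the strain rate sensitivity of the flow stress. A sketch with illustrative flow-stress values (not the paper's IN-100 data):

    import math

    def dmm_efficiency(sigma1, rate1, sigma2, rate2):
        # Strain rate sensitivity m from two (strain rate, flow stress) points,
        # then the dynamic-materials-model efficiency eta = 2m / (m + 1).
        m = (math.log(sigma2) - math.log(sigma1)) / (math.log(rate2) - math.log(rate1))
        return 2.0 * m / (m + 1.0)

    # Flow stress rising from 100 to 180 MPa over two decades of strain rate:
    eta = dmm_efficiency(100.0, 0.001, 180.0, 0.1)
    print(round(eta * 100, 1), "% efficiency")   # about 22.6%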
Abstract:
Biomethanation of herbaceous biomass feedstock has the potential to provide a clean energy source for cooking and other activities in areas where such biomass availability predominates. A biomethanation concept is described that involves fermentation of biomass residues in three steps, occurring in three zones of the fermentor. This approach, while attempting to take advantage of multistage reactors, simplifies reactor operation and obviates the need for a high degree of process control or complex reactor design. Typical herbaceous biomass decomposes with a rapid VFA flux initially (with a tendency to float), followed by slower decomposition showing a balanced process of VFA generation and its utilization by the methanogens that slowly colonize the biomass. The tendency to float at the initial stages is suppressed by allowing the previous day's feed to hold the fresh feed below the digester liquid, which permits VFA to disperse into the digester liquid without causing process inhibition. This approach has been used to build and operate simple biomass digesters that provide cooking gas in rural areas from weeds and agro-residues. With appropriate modifications, the same concept has been used for digesting municipal solid waste in small towns where large fermentors are not viable. With further modifications, this concept has been used for solid-liquid feed fermentors. Methanogen-colonized leaf biomass has been used as a biofilm support to treat coffee-processing wastewater as well as crop litter, alternating over the year. During summer the system functions as a biomass-based biogas plant operating in the three-zone mode, while in winter biomass feeding is suspended and high-strength coffee-processing wastewater is let into the fermentor, achieving over 90% BOD reduction. Early field experience with these fermentors is presented.