229 resultados para Program Optimization
Resumo:
We describe a real-time system that supports design of optimal flight paths over terrains. These paths either maximize view coverage or minimize vehicle exposure to ground. A volume-rendered display of multi-viewpoint visibility and a haptic interface assists the user in selecting, assessing, and refining the computed flight path. We design a three-dimensional scalar field representing the visibility of a point above the terrain, describe an efficient algorithm to compute the visibility field, and develop visual and haptic schemes to interact with the visibility field. Given the origin and destination, the desired flight path is computed using an efficient simulation of an articulated rope under the influence of the visibility gradient. The simulation framework also accepts user input, via the haptic interface, thereby allowing manual refinement of the flight path.
Resumo:
The Ball-Larus path-profiling algorithm is an efficient technique to collect acyclic path frequencies of a program. However, longer paths -those extending across loop iterations - describe the runtime behaviour of programs better. We generalize the Ball-Larus profiling algorithm for profiling k-iteration paths - paths that can span up to to k iterations of a loop. We show that it is possible to number suchk-iteration paths perfectly, thus allowing for an efficient profiling algorithm for such longer paths. We also describe a scheme for mixed-mode profiling: profiling different parts of a procedure with different path lengths. Experimental results show that k-iteration profiling is realistic.
Resumo:
The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem - both scheduling and assignment of filters to processors - as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in GPU, to exploit data parallelism, and multiprocessors, to exploit task and pipelin parallelism. Further it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single threaded CPU.
Resumo:
This paper proposes the use of empirical modeling techniques for building microarchitecture sensitive models for compiler optimizations. The models we build relate program performance to settings of compiler optimization flags, associated heuristics and key microarchitectural parameters. Unlike traditional analytical modeling methods, this relationship is learned entirely from data obtained by measuring performance at a small number of carefully selected compiler/microarchitecture configurations. We evaluate three different learning techniques in this context viz. linear regression, adaptive regression splines and radial basis function networks. We use the generated models to a) predict program performance at arbitrary compiler/microarchitecture configurations, b) quantify the significance of complex interactions between optimizations and the microarchitecture, and c) efficiently search for'optimal' settings of optimization flags and heuristics for any given microarchitectural configuration. Our evaluation using benchmarks from the SPEC CPU2000 suits suggests that accurate models (< 5% average error in prediction) can be generated using a reasonable number of simulations. We also find that using compiler settings prescribed by a model-based search can improve program performance by as much as 19% (with an average of 9.5%) over highly optimized binaries.
Resumo:
Assembly consisting of cast and wrought aluminum alloys has wide spread application in defense and aero space industries. For the efficacious use of the transition joints, the weld should have adequate strength and formability. In the present investigation, A356 and 6061 aluminum alloys were friction stir welded under tool rotational speed of 1000-1400 rpm and traversing speed of 80-240 mm/min, keeping other parameters same. The variable process window is responsible for the change in total heat input and cooling rate during welding. Structural characterization of the bonded assemblies exhibits recovery-recrystallization in the stirring zone and breaking of coarse eutectic network of Al-Si. Dispersion of fine Si rich particles, refinement of 6061 grain size, low residual stress level and high defect density within weld nugget contribute towards the improvement in bond strength. Lower will be the tool rotational and traversing speed, more dominant will be the above phenomena. Therefore, the joint fabricated using lowest tool traversing and rotational speed, exhibits substantial improvement in bond strength (similar to 98% of that of 6061 alloy), which is also maximum with respect to others. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
Modern database systems incorporate a query optimizer to identify the most efficient "query execution plan" for executing the declarative SQL queries submitted by users. A dynamic-programming-based approach is used to exhaustively enumerate the combinatorially large search space of plan alternatives and, using a cost model, to identify the optimal choice. While dynamic programming (DP) works very well for moderately complex queries with up to around a dozen base relations, it usually fails to scale beyond this stage due to its inherent exponential space and time complexity. Therefore, DP becomes practically infeasible for complex queries with a large number of base relations, such as those found in current decision-support and enterprise management applications. To address the above problem, a variety of approaches have been proposed in the literature. Some completely jettison the DP approach and resort to alternative techniques such as randomized algorithms, whereas others have retained DP by using heuristics to prune the search space to computationally manageable levels. In the latter class, a well-known strategy is "iterative dynamic programming" (IDP) wherein DP is employed bottom-up until it hits its feasibility limit, and then iteratively restarted with a significantly reduced subset of the execution plans currently under consideration. The experimental evaluation of IDP indicated that by appropriate choice of algorithmic parameters, it was possible to almost always obtain "good" (within a factor of twice of the optimal) plans, and in the few remaining cases, mostly "acceptable" (within an order of magnitude of the optimal) plans, and rarely, a "bad" plan. While IDP is certainly an innovative and powerful approach, we have found that there are a variety of common query frameworks wherein it can fail to consistently produce good plans, let alone the optimal choice. This is especially so when star or clique components are present, increasing the complexity of th- e join graphs. Worse, this shortcoming is exacerbated when the number of relations participating in the query is scaled upwards.
Resumo:
Fuel cells are emerging as alternate green power producers for both large power production and for use in automobiles. Hydrogen is seen as the best option as a fuel; however, hydrogen fuel cells require recirculation of unspent hydrogen. A supersonic ejector is an apt device for recirculation in the operating regimes of a hydrogen fuel cell. Optimal ejectors have to be designed to achieve best performances. The use of the vector evaluated particle swarm optimization technique to optimize supersonic ejectors with a focus on its application for hydrogen recirculation in fuel cells is presented here. Two parameters, compression ratio and efficiency, have been identified as the objective functions to be optimized. Their relation to operating and design parameters of ejector is obtained by control volume based analysis using a constant area mixing approximation. The independent parameters considered are the area ratio and the exit Mach number of the nozzle. The optimization is carried out at a particularentrainment ratio and results in a set of nondominated solutions, the Pareto front. A set of such curves can be used for choosing the optimal design parameters of the ejector.
Resumo:
This work addresses the optimum design of a composite box-beam structure subject to strength constraints. Such box-beams are used as the main load carrying members of helicopter rotor blades. A computationally efficient analytical model for box-beam is used. Optimal ply orientation angles are sought which maximize the failure margins with respect to the applied loading. The Tsai-Wu-Hahn failure criterion is used to calculate the reserve factor for each wall and ply and the minimum reserve factor is maximized. Ply angles are used as design variables and various cases of initial starting design and loadings are investigated. Both gradient-based and particle swarm optimization (PSO) methods are used. It is found that the optimization approach leads to the design of a box-beam with greatly improved reserve factors which can be useful for helicopter rotor structures. While the PSO yields globally best designs, the gradient-based method can also be used with appropriate starting designs to obtain useful designs efficiently. (C) 2006 Elsevier Ltd. All rights reserved.
Resumo:
Many optimal control problems are characterized by their multiple performance measures that are often noncommensurable and competing with each other. The presence of multiple objectives in a problem usually give rise to a set of optimal solutions, largely known as Pareto-optimal solutions. Evolutionary algorithms have been recognized to be well suited for multi-objective optimization because of their capability to evolve a set of nondominated solutions distributed along the Pareto front. This has led to the development of many evolutionary multi-objective optimization algorithms among which Nondominated Sorting Genetic Algorithm (NSGA and its enhanced version NSGA-II) has been found effective in solving a wide variety of problems. Recently, we reported a genetic algorithm based technique for solving dynamic single-objective optimization problems, with single as well as multiple control variables, that appear in fed-batch bioreactor applications. The purpose of this study is to extend this methodology for solution of multi-objective optimal control problems under the framework of NSGA-II. The applicability of the technique is illustrated by solving two optimal control problems, taken from literature, which have usually been solved by several methods as single-objective dynamic optimization problems. (C) 2004 Elsevier Ltd. All rights reserved.
Resumo:
We present a new computationally efficient method for large-scale polypeptide folding using coarse-grained elastic networks and gradient-based continuous optimization techniques. The folding is governed by minimization of energy based on Miyazawa–Jernigan contact potentials. Using this method we are able to substantially reduce the computation time on ordinary desktop computers for simulation of polypeptide folding starting from a fully unfolded state. We compare our results with available native state structures from Protein Data Bank (PDB) for a few de-novo proteins and two natural proteins, Ubiquitin and Lysozyme. Based on our simulations we are able to draw the energy landscape for a small de-novo protein, Chignolin. We also use two well known protein structure prediction software, MODELLER and GROMACS to compare our results. In the end, we show how a modification of normal elastic network model can lead to higher accuracy and lower time required for simulation.
Resumo:
The optimal design of a multiproduct batch chemical plant is formulated as a multiobjective optimization problem, and the resulting constrained mixed-integer nonlinear program (MINLP) is solved by the nondominated sorting genetic algorithm approach (NSGA-II). By putting bounds on the objective function values, the constrained MINLP problem can be solved efficiently by NSGA-II to generate a set of feasible nondominated solutions in the range desired by the decision-maker in a single run of the algorithm. The evolution of the entire set of nondominated solutions helps the decision-maker to make a better choice of the appropriate design from among several alternatives. The large set of solutions also provides a rich source of excellent initial guesses for solution of the same problem by alternative approaches to achieve any specific target for the objective functions
Resumo:
The overall performance of random early detection (RED) routers in the Internet is determined by the settings of their associated parameters. The non-availability of a functional relationship between the RED performance and its parameters makes it difficult to implement optimization techniques directly in order to optimize the RED parameters. In this paper, we formulate a generic optimization framework using a stochastically bounded delay metric to dynamically adapt the RED parameters. The constrained optimization problem thus formulated is solved using traditional nonlinear programming techniques. Here, we implement the barrier and penalty function approaches, respectively. We adopt a second-order nonlinear optimization framework and propose a novel four-timescale stochastic approximation algorithm to estimate the gradient and Hessian of the barrier and penalty objectives and update the RED parameters. A convergence analysis of the proposed algorithm is briefly sketched. We perform simulations to evaluate the performance of our algorithm with both barrier and penalty objectives and compare these with RED and a variant of it in the literature. We observe an improvement in performance using our proposed algorithm over RED, and the above variant of it.
Resumo:
The present work concerns with the static scheduling of jobs to parallel identical batch processors with incompatible job families for minimizing the total weighted tardiness. This scheduling problem is applicable in burn-in operations and wafer fabrication in semiconductor manufacturing. We decompose the problem into two stages: batch formation and batch scheduling, as in the literature. The Ant Colony Optimization (ACO) based algorithm called ATC-BACO algorithm is developed in which ACO is used to solve the batch scheduling problems. Our computational experimentation shows that the proposed ATC-BACO algorithm performs better than the available best traditional dispatching rule called ATC-BATC rule.
Resumo:
The notion of optimization is inherent in protein design. A long linear chain of twenty types of amino acid residues are known to fold to a 3-D conformation that minimizes the combined inter-residue energy interactions. There are two distinct protein design problems, viz. predicting the folded structure from a given sequence of amino acid monomers (folding problem) and determining a sequence for a given folded structure (inverse folding problem). These two problems have much similarity to engineering structural analysis and structural optimization problems respectively. In the folding problem, a protein chain with a given sequence folds to a conformation, called a native state, which has a unique global minimum energy value when compared to all other unfolded conformations. This involves a search in the conformation space. This is somewhat akin to the principle of minimum potential energy that determines the deformed static equilibrium configuration of an elastic structure of given topology, shape, and size that is subjected to certain boundary conditions. In the inverse-folding problem, one has to design a sequence with some objectives (having a specific feature of the folded structure, docking with another protein, etc.) and constraints (sequence being fixed in some portion, a particular composition of amino acid types, etc.) while obtaining a sequence that would fold to the desired conformation satisfying the criteria of folding. This requires a search in the sequence space. This is similar to structural optimization in the design-variable space wherein a certain feature of structural response is optimized subject to some constraints while satisfying the governing static or dynamic equilibrium equations. Based on this similarity, in this work we apply the topology optimization methods to protein design, discuss modeling issues and present some initial results.
Resumo:
There are a number of large networks which occur in many problems dealing with the flow of power, communication signals, water, gas, transportable goods, etc. Both design and planning of these networks involve optimization problems. The first part of this paper introduces the common characteristics of a nonlinear network (the network may be linear, the objective function may be non linear, or both may be nonlinear). The second part develops a mathematical model trying to put together some important constraints based on the abstraction for a general network. The third part deals with solution procedures; it converts the network to a matrix based system of equations, gives the characteristics of the matrix and suggests two solution procedures, one of them being a new one. The fourth part handles spatially distributed networks and evolves a number of decomposition techniques so that we can solve the problem with the help of a distributed computer system. Algorithms for parallel processors and spatially distributed systems have been described.There are a number of common features that pertain to networks. A network consists of a set of nodes and arcs. In addition at every node, there is a possibility of an input (like power, water, message, goods etc) or an output or none. Normally, the network equations describe the flows amoungst nodes through the arcs. These network equations couple variables associated with nodes. Invariably, variables pertaining to arcs are constants; the result required will be flows through the arcs. To solve the normal base problem, we are given input flows at nodes, output flows at nodes and certain physical constraints on other variables at nodes and we should find out the flows through the network (variables at nodes will be referred to as across variables).The optimization problem involves in selecting inputs at nodes so as to optimise an objective function; the objective may be a cost function based on the inputs to be minimised or a loss function or an efficiency function. The above mathematical model can be solved using Lagrange Multiplier technique since the equalities are strong compared to inequalities. The Lagrange multiplier technique divides the solution procedure into two stages per iteration. Stage one calculates the problem variables % and stage two the multipliers lambda. It is shown that the Jacobian matrix used in stage one (for solving a nonlinear system of necessary conditions) occurs in the stage two also.A second solution procedure has also been imbedded into the first one. This is called total residue approach. It changes the equality constraints so that we can get faster convergence of the iterations.Both solution procedures are found to coverge in 3 to 7 iterations for a sample network.The availability of distributed computer systems — both LAN and WAN — suggest the need for algorithms to solve the optimization problems. Two types of algorithms have been proposed — one based on the physics of the network and the other on the property of the Jacobian matrix. Three algorithms have been deviced, one of them for the local area case. These algorithms are called as regional distributed algorithm, hierarchical regional distributed algorithm (both using the physics properties of the network), and locally distributed algorithm (a multiprocessor based approach with a local area network configuration). The approach used was to define an algorithm that is faster and uses minimum communications. These algorithms are found to converge at the same rate as the non distributed (unitary) case.