203 results for Infinite horizon problems


Relevance:

100.00%

Publisher:

Abstract:

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
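
The "twice the number of policy parameters" storage claim can be illustrated with a minimal sketch of a GPOMDP-style estimator: one eligibility trace z and one running estimate Δ, each the size of the parameter vector, with β discounting the trace. The toy environment (two states, observation equal to state, next state equal to the chosen action, reward 1 in state 1), the softmax policy, and the names `gpomdp` and `softmax` are assumptions made for illustration and do not come from the abstract.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def gpomdp(theta, beta, steps, seed=0):
    """Estimate the average-reward gradient on a toy problem.

    theta[y][u] is the preference for action u under observation y
    (a hypothetical parameter layout chosen for this sketch).
    """
    rng = random.Random(seed)
    n_obs, n_act = len(theta), len(theta[0])
    z = [[0.0] * n_act for _ in range(n_obs)]      # eligibility trace
    delta = [[0.0] * n_act for _ in range(n_obs)]  # running gradient estimate
    state = 0
    for t in range(steps):
        y = state  # toy assumption: the observation reveals the state
        probs = softmax(theta[y])
        u = rng.choices(range(n_act), weights=probs)[0]
        state = u  # toy dynamics: the action selects the next state
        reward = 1.0 if state == 1 else 0.0
        for yy in range(n_obs):
            for a in range(n_act):
                # Gradient of log softmax policy w.r.t. theta[yy][a].
                grad = 0.0
                if yy == y:
                    grad = (1.0 if a == u else 0.0) - probs[a]
                # Discounted trace, then running average of reward * trace.
                z[yy][a] = beta * z[yy][a] + grad
                delta[yy][a] += (reward * z[yy][a] - delta[yy][a]) / (t + 1)
    return delta


estimate = gpomdp([[0.0, 0.0], [0.0, 0.0]], beta=0.5, steps=20000)
print(estimate)
```

In this toy problem, raising the preference for action 1 raises the average reward, so the entries for action 1 should come out positive and those for action 0 negative. Larger β reduces bias at the cost of higher variance, which is the trade-off the abstract refers to.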

Relevance:

100.00%

Publisher:

Abstract:

We consider a robust filtering problem for uncertain discrete-time, homogeneous, first-order, finite-state hidden Markov models (HMMs). The class of uncertain HMMs considered is described by a conditional relative entropy constraint on measures perturbed from a nominal regular conditional probability distribution given the previous posterior state distribution and the latest measurement. Under this class of perturbations, a robust infinite horizon filtering problem is first formulated as a constrained optimization problem before being transformed via variational results into an unconstrained optimization problem; the latter can be elegantly solved using risk-sensitive information-state based filtering.
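
For orientation, the nominal model that the perturbations are measured against can be sketched as the standard finite-state HMM filter recursion: predict with the transition matrix, weight by the measurement likelihood, and normalize. This is only the nominal (non-robust) filter, not the risk-sensitive filter described in the abstract, and the function name and argument layout are assumptions.

```python
def hmm_filter_step(posterior, transition, likelihood):
    """One step of the nominal finite-state HMM filter.

    posterior[i]     : previous posterior probability of state i
    transition[i][j] : probability of moving from state i to state j
    likelihood[j]    : probability of the latest measurement given state j
    """
    n = len(posterior)
    # Prediction: propagate the previous posterior through the chain.
    predicted = [
        sum(posterior[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]
    # Correction: weight by the measurement likelihood and normalize.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


updated = hmm_filter_step(
    [0.5, 0.5],
    [[0.9, 0.1], [0.2, 0.8]],
    [0.7, 0.1],
)
print(updated)
```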

Relevance:

80.00%

Publisher:

Abstract:

This paper deals with constrained image-based visual servoing of circular and conical spiral motion about an unknown object approximating a single image point feature. Effective visual control of such trajectories has many applications for small unmanned aerial vehicles, including surveillance and inspection, forced landing (homing), and collision avoidance. A spherical camera model is used to derive a novel visual-predictive controller (VPC) using stability-based design methods for general nonlinear model-predictive control. In particular, a quasi-infinite horizon visual-predictive control scheme is derived. A terminal region, which is used as a constraint in the controller structure, can be used to guide appropriate reference image features for spiral tracking with respect to nominal stability and feasibility. Robustness properties are also discussed with respect to parameter uncertainty and additive noise. A comparison with competing visual-predictive control schemes is made, and some experimental results using a small quadrotor platform are given.

Relevance:

30.00%

Publisher:

Abstract:

Many large coal mining operations in Australia rely heavily on the rail network to transport coal from mines to coal terminals at ports for shipment. Over the last few years, due to fast-growing demand, the coal rail network has become one of the worst industrial bottlenecks in Australia. This provides a strong incentive to pursue better optimisation and control strategies for the operation of the whole rail transportation system under network and terminal capacity constraints. This PhD research aims to achieve a significant efficiency improvement in a coal rail network on the basis of the development of standard modelling approaches and generic solution techniques. Generally, the train scheduling problem can be modelled as a Blocking Parallel-Machine Job-Shop Scheduling (BPMJSS) problem. In a BPMJSS model for train scheduling, trains and sections are synonymous with jobs and machines respectively, and an operation is regarded as the movement/traversal of a train across a section. To begin, an improved shifting bottleneck procedure algorithm combined with metaheuristics has been developed to efficiently solve the Parallel-Machine Job-Shop Scheduling (PMJSS) problems without the blocking conditions. Due to the lack of buffer space, real-life train scheduling should consider blocking or hold-while-wait constraints, which means that a track section cannot release and must hold a train until the next section on the routing becomes available. As a consequence, the problem has been considered as BPMJSS with the blocking conditions. To develop efficient solution techniques for BPMJSS, extensive studies on non-classical scheduling problems regarding the various buffer conditions (i.e. blocking, no-wait, limited-buffer, unlimited-buffer and combined-buffer) have been done.
In this procedure, an alternative graph as an extension of the classical disjunctive graph is developed and specially designed for the non-classical scheduling problems such as the blocking flow-shop scheduling (BFSS), no-wait flow-shop scheduling (NWFSS), and blocking job-shop scheduling (BJSS) problems. By exploring the blocking characteristics based on the alternative graph, a new algorithm called the topological-sequence algorithm is developed for solving the non-classical scheduling problems. To demonstrate the advantages of the proposed algorithm, we compare it with two known algorithms (i.e. Recursive Procedure and Directed Graph) in the literature. Moreover, we define a new type of non-classical scheduling problem, called combined-buffer flow-shop scheduling (CBFSS), which covers four extreme cases: the classical flow-shop scheduling (FSS) with infinite buffer, the blocking FSS (BFSS) with no buffer, the no-wait FSS (NWFSS) and the limited-buffer FSS (LBFSS). After exploring the structural properties of CBFSS, we propose an innovative constructive algorithm named the LK algorithm to construct the feasible CBFSS schedule. Detailed numerical illustrations for the various cases are presented and analysed. By adjusting only the attributes in the data input, the proposed LK algorithm is generic and enables the construction of the feasible schedules for many types of non-classical scheduling problems with different buffer constraints. Inspired by the shifting bottleneck procedure algorithm for PMJSS and characteristic analysis based on the alternative graph for non-classical scheduling problems, a new constructive algorithm called the Feasibility Satisfaction Procedure (FSP) is proposed to obtain the feasible BPMJSS solution. A real-world train scheduling case is used for illustrating and comparing the PMJSS and BPMJSS models.
Some real-life applications, including considering the train length, upgrading the track sections, accelerating a tardy train and changing the bottleneck sections, are discussed. Furthermore, the BPMJSS model is generalised to be a No-Wait Blocking Parallel-Machine Job-Shop Scheduling (NWBPMJSS) problem for scheduling the trains with priorities, in which prioritised trains such as express passenger trains are considered simultaneously with non-prioritised trains such as freight trains. In this case, no-wait conditions, which are more restrictive constraints than blocking constraints, arise when considering the prioritised trains that should traverse continuously without any interruption or any unplanned pauses because of the high cost of waiting during travel. In comparison, non-prioritised trains are allowed to enter the next section immediately if possible or to remain in a section until the next section on the routing becomes available. Based on the FSP algorithm, a more generic algorithm called the SE algorithm is developed to solve a class of train scheduling problems in terms of different conditions in train scheduling environments. To construct the feasible train schedule, the proposed SE algorithm consists of many individual modules including the feasibility-satisfaction procedure, time-determination procedure, tune-up procedure and conflict-resolve procedure algorithms. To find a good train schedule, a two-stage hybrid heuristic algorithm called the SE-BIH algorithm is developed by combining the constructive heuristic (i.e. the SE algorithm) and the local-search heuristic (i.e. the Best-Insertion-Heuristic algorithm). To optimise the train schedule, a three-stage algorithm called the SE-BIH-TS algorithm is developed by combining the tabu search (TS) metaheuristic with the SE-BIH algorithm. Finally, a case study is performed for a complex real-world coal rail network under network and terminal capacity constraints.
The computational results indicate that the proposed methodology is promising, since it can be applied as a fundamental tool for modelling and solving many real-world scheduling problems.
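
The blocking (hold-while-wait) constraint can be made concrete with a small sketch for the simplest case: trains that traverse the same sequence of single-track sections in a fixed order, which is the blocking flow-shop special case rather than the full BPMJSS model. A train leaves a section only when it has finished traversing it and the preceding train has vacated the next section. The function name and data layout are assumptions, and this is the textbook departure-time recursion, not the thesis's FSP or SE algorithm.

```python
def blocking_departures(times):
    """Departure times under blocking for trains in a fixed order.

    times[j][i] is the traversal time of train j on section i.
    Returns dep, where dep[j][0] is the time train j enters section 0
    and dep[j][i + 1] is the time it leaves section i.
    """
    n_sections = len(times[0])
    schedule = []
    prev = None  # departure times of the preceding train
    for job in times:
        dep = [0.0] * (n_sections + 1)
        # A train enters section 0 once the preceding train has left it.
        dep[0] = prev[1] if prev is not None else 0.0
        for i in range(n_sections):
            finish = dep[i] + job[i]
            if prev is not None and i + 1 < n_sections:
                # Blocking: hold the train in section i until the
                # preceding train has left section i + 1.
                finish = max(finish, prev[i + 2])
            dep[i + 1] = finish
        schedule.append(dep)
        prev = dep
    return schedule


# Two trains, two sections: the second train finishes section 0 at
# time 5 but is held there until the first train leaves section 1.
print(blocking_departures([[3, 5], [2, 1]]))
```

With an unlimited buffer between sections, the second train would leave section 0 at time 5. Under blocking it keeps occupying the section until time 8, which is what delays any train following it.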

Relevance:

30.00%

Publisher:

Abstract:

This paper addresses the problem of joint identification of infinite-frequency added mass and fluid memory models of marine structures from finite frequency data. This problem is relevant for cases where the code used to compute the hydrodynamic coefficients of the marine structure does not give the infinite-frequency added mass. This case is typical of codes based on 2D-potential theory, since most 3D-potential-theory codes solve the boundary value problem associated with the infinite frequency. The method proposed in this paper is a simpler alternative to methods previously presented in the literature. The advantage of the proposed method is that the same identification procedure can be used to identify the fluid-memory models with or without having access to the infinite-frequency added mass coefficient. Therefore, it provides an extension that puts the two identification problems into the same framework. The method also exploits the constraints related to relative degree and low-frequency asymptotic values of the hydrodynamic coefficients derived from the physics of the problem, which are used as prior information to refine the obtained models.

Relevance:

20.00%

Publisher:

Abstract:

This paper uses data from a large national project on student-working to examine problems and challenges for school students working in part-time jobs. While literature has identified some potential problems and challenges, and some potential difficulties can be extrapolated from the nature of a young teenage workforce and the nature of the workplaces, these were largely absent in the two companies researched because the companies already had policies in place that addressed the potential problems. Some suggestions are made about how problems and challenges could be avoided in a wider range of adolescent workplaces.