878 resultados para semi-Markov decision process


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the limitation of current condition monitoring technologies, the estimates of asset health states may contain some uncertainties. A maintenance strategy ignoring this uncertainty of asset health state can cause additional costs or downtime. The partially observable Markov decision process (POMDP) is a commonly used approach to derive optimal maintenance strategies when asset health inspections are imperfect. However, existing applications of the POMDP to maintenance decision-making largely adopt the discrete time and state assumptions. The discrete-time assumption requires the health state transitions and maintenance activities only happen at discrete epochs, which cannot model the failure time accurately and is not cost-effective. The discrete health state assumption, on the other hand, may not be elaborate enough to improve the effectiveness of maintenance. To address these limitations, this paper proposes a continuous state partially observable semi-Markov decision process (POSMDP). An algorithm that combines the Monte Carlo-based density projection method and the policy iteration is developed to solve the POSMDP. Different types of maintenance activities (i.e., inspections, replacement, and imperfect maintenance) are considered in this paper. The next maintenance action and the corresponding waiting durations are optimized jointly to minimize the long-run expected cost per unit time and availability. The result of simulation studies shows that the proposed maintenance optimization approach is more cost-effective than maintenance strategies derived by another two approximate methods, when regular inspection intervals are adopted. The simulation study also shows that the maintenance cost can be further reduced by developing maintenance strategies with state-dependent maintenance intervals using the POSMDP. In addition, during the simulation studies the proposed POSMDP shows the ability to adopt a cost-effective strategy structure when multiple types of maintenance activities are involved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We introduce and study a class of non-stationary semi-Markov decision processes on a finite horizon. By constructing an equivalent Markov decision process, we establish the existence of a piecewise open loop relaxed control which is optimal for the finite horizon problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One of the main challenges facing online and offline path planners is the uncertainty in the magnitude and direction of the environmental energy because it is dynamic, changeable with time, and hard to forecast. This thesis develops an artificial intelligence for a mobile robot to learn from historical or forecasted data of environmental energy available in the area of interest which will help for a persistence monitoring under uncertainty using the developed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Optimal bang-coast maintenance policies for a machine, subject to failure, are considered. The approach utilizes a semi-Markov model for the system. A simplified model for modifying the probability of machine failure with maintenance is employed. A numerical example is presented to illustrate the procedure and results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Estimating and predicting degradation processes of engineering assets is crucial for reducing the cost and insuring the productivity of enterprises. Assisted by modern condition monitoring (CM) technologies, most asset degradation processes can be revealed by various degradation indicators extracted from CM data. Maintenance strategies developed using these degradation indicators (i.e. condition-based maintenance) are more cost-effective, because unnecessary maintenance activities are avoided when an asset is still in a decent health state. A practical difficulty in condition-based maintenance (CBM) is that degradation indicators extracted from CM data can only partially reveal asset health states in most situations. Underestimating this uncertainty in relationships between degradation indicators and health states can cause excessive false alarms or failures without pre-alarms. The state space model provides an efficient approach to describe a degradation process using these indicators that can only partially reveal health states. However, existing state space models that describe asset degradation processes largely depend on assumptions such as, discrete time, discrete state, linearity, and Gaussianity. The discrete time assumption requires that failures and inspections only happen at fixed intervals. The discrete state assumption entails discretising continuous degradation indicators, which requires expert knowledge and often introduces additional errors. The linear and Gaussian assumptions are not consistent with nonlinear and irreversible degradation processes in most engineering assets. This research proposes a Gamma-based state space model that does not have discrete time, discrete state, linear and Gaussian assumptions to model partially observable degradation processes. Monte Carlo-based algorithms are developed to estimate model parameters and asset remaining useful lives. In addition, this research also develops a continuous state partially observable semi-Markov decision process (POSMDP) to model a degradation process that follows the Gamma-based state space model and is under various maintenance strategies. Optimal maintenance strategies are obtained by solving the POSMDP. Simulation studies through the MATLAB are performed; case studies using the data from an accelerated life test of a gearbox and a liquefied natural gas industry are also conducted. The results show that the proposed Monte Carlo-based EM algorithm can estimate model parameters accurately. The results also show that the proposed Gamma-based state space model have better fitness result than linear and Gaussian state space models when used to process monotonically increasing degradation data in the accelerated life test of a gear box. Furthermore, both simulation studies and case studies show that the prediction algorithm based on the Gamma-based state space model can identify the mean value and confidence interval of asset remaining useful lives accurately. In addition, the simulation study shows that the proposed maintenance strategy optimisation method based on the POSMDP is more flexible than that assumes a predetermined strategy structure and uses the renewal theory. Moreover, the simulation study also shows that the proposed maintenance optimisation method can obtain more cost-effective strategies than a recently published maintenance strategy optimisation method by optimising the next maintenance activity and the waiting time till the next maintenance activity simultaneously.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we provide results for average reward BMDPs. We establish a fundamental relationship between the discounted and the average reward problems, prove the existence of Blackwell optimal policies and, for both notions of optimality, derive algorithms that converge to the optimal value function.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Exploiting wind-energy is one possible way to ex- tend flight duration for Unmanned Arial Vehicles. Wind-energy can also be used to minimise energy consumption for a planned path. In this paper, we consider uncertain time-varying wind fields and plan a path through them. A Gaussian distribution is used to determine uncertainty in the Time-varying wind fields. We use Markov Decision Process to plan a path based upon the uncertainty of Gaussian distribution. Simulation results that compare the direct line of flight between start and target point and our planned path for energy consumption and time of travel are presented. The result is a robust path using the most visited cell while sampling the Gaussian distribution of the wind field in each cell.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Exploiting wind-energy is one possible way to extend flight duration for Unmanned Arial Vehicles. Wind-energy can also be used to minimise energy consumption for a planned path. In this paper, we consider uncertain time-varying wind fields and plan a path through them. A Gaussian distribution is used to determine uncertainty in the Time-varying wind fields. We use Markov Decision Process to plan a path based upon the uncertainty of Gaussian distribution. Simulation results that compare the direct line of flight between start and target point and our planned path for energy consumption and time of travel are presented. The result is a robust path using the most visited cell while sampling the Gaussian distribution of the wind field in each cell.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large scale problems, we pursue the more modest goal of competing with a low-dimensional family of policies. We use the dual linear programming formulation of the MDP average cost problem, in which the variable is a stationary distribution over state-action pairs, and we consider a neighborhood of a low-dimensional subset of the set of stationary distributions (defined in terms of state-action features) as the comparison class. We propose a technique based on stochastic convex optimization and give bounds that show that the performance of our algorithm approaches the best achievable by any policy in the comparison class. Most importantly, this result depends on the size of the comparison class, but not on the size of the state space. Preliminary experiments show the effectiveness of the proposed algorithm in a queuing application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance on this setting and converges to a feasible point.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.