129 results for Markov decision process (POMDP)
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IP) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional "flat" dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated. (C) 2011 Elsevier B.V. All rights reserved.
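As a rough illustration of the robust dynamic programming idea in the abstract above, the sketch below runs value iteration on a flat (non-factored) MDP whose transition probabilities are known only as intervals. It is not the paper's factored algorithm: the interval representation, the greedy inner minimisation, the toy model, and the names `worst_case_expectation` and `robust_value_iteration` are all assumptions made for illustration.

```python
def worst_case_expectation(values, lower, upper):
    # The adversary picks p with lower <= p <= upper and sum(p) == 1
    # so as to minimise the expected next-state value.
    p = list(lower)
    slack = 1.0 - sum(lower)
    # Give the remaining probability mass to the lowest-valued states first.
    for i in sorted(range(len(values)), key=lambda j: values[j]):
        extra = min(upper[i] - lower[i], slack)
        p[i] += extra
        slack -= extra
    return sum(pi * vi for pi, vi in zip(p, values))


def robust_value_iteration(rewards, bounds, gamma=0.9, tol=1e-8):
    # rewards[s][a]: immediate reward for action a in state s.
    # bounds[s][a]: (lower, upper) probability lists over next states.
    n = len(rewards)
    v = [0.0] * n
    while True:
        new_v = [
            max(
                rewards[s][a] + gamma * worst_case_expectation(v, *bounds[s][a])
                for a in range(len(rewards[s]))
            )
            for s in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


# Hypothetical two-state, two-action model (numbers are made up).
rewards = [[0.0, 1.0], [2.0, 0.0]]
bounds = [
    [([0.1, 0.6], [0.4, 0.9]), ([0.5, 0.2], [0.8, 0.5])],
    [([0.0, 0.7], [0.3, 1.0]), ([0.4, 0.4], [0.6, 0.6])],
]
values = robust_value_iteration(rewards, bounds)
```

With simple interval bounds the inner minimisation has this closed-form greedy solution; more general constraint sets, such as those the abstract describes, would need a call to an optimisation routine at this step.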
Abstract:
The study of Information Technology (IT) outsourcing is relevant because companies are outsourcing their activities more than ever. An important IT outsourcing research area is the decision-making process. In other words, understanding how companies decide to outsource their IT operations is relevant from a research point of view. Therefore, the objective of this study is to understand the decision-making process used by Brazilian companies when outsourcing their IT operations. An analysis of the literature on this subject showed that six aspects are usually considered by companies in the evaluation of IT outsourcing service alternatives. This research verified how these six aspects are considered by Brazilian companies in IT outsourcing decisions. The survey showed that Brazilian companies consider all six aspects, but each of them has a different level of importance. The research also grouped the aspects according to their level of importance and interdependency, using factor analysis to understand the logic behind the IT outsourcing decision process. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Despite the established view that rice is a commodity and therefore hard to differentiate, there is a large number of products, varying in type, class, grade, packaging, brand, and so on. Prices vary significantly, both across brands, manufacturers, and stores and for the same product within a short period of time. Given these observations, the question arises of how consumers' rice purchasing strategy affects their expenditure. This study uses mathematical models to simulate the purchase decision process of consumers with different preference profiles when faced with the products on supermarket shelves in one city in the state of Rio Grande do Sul and another in São Paulo.
Abstract:
Background: The criteria and timing for nerve surgery in infants with obstetric brachial plexopathy remain controversial. Our aim was to develop a new method for early prognostic assessment to assist this decision process. Methods: Fifty-four patients with unilateral obstetric brachial plexopathy who were ten to sixty days old underwent bilateral motor-nerve-conduction studies of the axillary, musculocutaneous, proximal radial, distal radial, median, and ulnar nerves. The ratio between the amplitude of the compound muscle action potential of the affected limb and that of the healthy side was called the axonal viability index. The patients were followed and classified in three groups according to the clinical outcome. We analyzed the receiver operating characteristic curve of each index to define the best cutoff point to detect patients with a poor recovery. Results: The best cutoff points on the axonal viability index for each nerve (and its sensitivity and specificity) were <10% (88% and 89%, respectively) for the axillary nerve, 0% (88% and 73%) for the musculocutaneous nerve, <20% (82% and 97%) for the proximal radial nerve, <50% (82% and 97%) for the distal radial nerve, and <50% (59% and 97%) for the ulnar nerve. The indices from the proximal radial, distal radial, and ulnar nerves had better specificities compared with the most frequently used clinical criterion: absence of biceps function at three months of age. Conclusions: The axonal viability index yields an earlier and more specific prognostic estimation of obstetric brachial plexopathy than does the clinical criterion of biceps function, and we believe it may be useful in determining surgical indications in these patients.
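The index defined in the abstract above is a simple ratio, which the following sketch makes concrete. The cutoff values are the ones quoted in the abstract; the function names, the percentage scaling, and the sample amplitudes are assumptions for illustration only.

```python
def axonal_viability_index(affected_amplitude, healthy_amplitude):
    # Ratio of compound muscle action potential amplitudes
    # (affected limb / healthy side), expressed as a percentage.
    if healthy_amplitude <= 0:
        raise ValueError("healthy-side amplitude must be positive")
    return 100.0 * affected_amplitude / healthy_amplitude


# Cutoffs (in percent) quoted in the abstract; an index below the cutoff
# was associated with poor recovery. The musculocutaneous nerve, whose
# reported cutoff is 0%, is left out of this strict "below" comparison.
CUTOFFS_PERCENT = {
    "axillary": 10.0,
    "proximal radial": 20.0,
    "distal radial": 50.0,
    "ulnar": 50.0,
}


def below_cutoff(nerve, affected_amplitude, healthy_amplitude):
    index = axonal_viability_index(affected_amplitude, healthy_amplitude)
    return index < CUTOFFS_PERCENT[nerve]


# Hypothetical amplitudes in millivolts: 1.0 on the affected side, 4.0 on the healthy side.
example_index = axonal_viability_index(1.0, 4.0)
```

For the hypothetical amplitudes shown, the index is 25%, which falls below the distal radial cutoff but above the axillary cutoff.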
Abstract:
This paper deals with the long run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with compact action space depending on the state variable. The control variable acts on the jump rate and transition measure of the PDMP, and the running and boundary costs are assumed to be positive but not necessarily bounded. Our first main result is to obtain an optimality equation for the long run average cost in terms of a discrete-time optimality equation related to the embedded Markov chain given by the postjump location of the PDMP. Our second main result guarantees the existence of a feedback measurable selector for the discrete-time optimality equation by establishing a connection between this equation and an integro-differential equation. Our final main result is to obtain some sufficient conditions for the existence of a solution for a discrete-time optimality inequality and an ordinary optimal feedback control for the long run average cost using the so-called vanishing discount approach. Two examples are presented illustrating the possible applications of the results developed in the paper.
Abstract:
This work is concerned with the existence of an optimal control strategy for the long-run average continuous control problem of piecewise-deterministic Markov processes (PDMPs). In Costa and Dufour (2008), sufficient conditions were derived to ensure the existence of an optimal control by using the vanishing discount approach. These conditions were mainly expressed in terms of the relative difference of the alpha-discount value functions. The main goal of this paper is to derive tractable conditions directly related to the primitive data of the PDMP to ensure the existence of an optimal control. The present work can be seen as a continuation of the results derived in Costa and Dufour (2008). Our main assumptions are written in terms of some integro-differential inequalities related to the so-called expected growth condition, and geometric convergence of the post-jump location kernel associated to the PDMP. An example based on the capacity expansion problem is presented, illustrating the possible applications of the results developed in the paper.
Abstract:
The main goal of this paper is to establish some equivalence results on stability, recurrence, and ergodicity between a piecewise deterministic Markov process (PDMP) {X(t)} and an embedded discrete-time Markov chain {Theta(n)} generated by a Markov kernel G that can be explicitly characterized in terms of the three local characteristics of the PDMP, leading to tractable criterion results. First we establish some important results characterizing {Theta(n)} as a sampling of the PDMP {X(t)} and deriving a connection between the probability of the first return time to a set for the discrete-time Markov chains generated by G and the resolvent kernel R of the PDMP. From these results we obtain equivalence results regarding irreducibility, existence of sigma-finite invariant measures, and (positive) recurrence and (positive) Harris recurrence between {X(t)} and {Theta(n)}, generalizing the results of [F. Dufour and O. L. V. Costa, SIAM J. Control Optim., 37 (1999), pp. 1483-1502] in several directions. Sufficient conditions in terms of a modified Foster-Lyapunov criterion are also presented to ensure positive Harris recurrence and ergodicity of the PDMP. We illustrate the use of these conditions by showing the ergodicity of a capacity expansion model.
Abstract:
This article addresses the definition of the information handled by hospital systems as a fundamental step in choosing the type of information system to be used and the organizational level to be supported. The use of hospital management information systems improves the user's decision-making process by allowing the generation of control reports and the follow-up of procedures performed in the hospital.
Abstract:
The present paper proposes a flexible consensus scheme for group decision making, which allows one to obtain a consistent collective opinion from information provided by each expert in terms of multigranular fuzzy estimates. It is based on a linguistic hierarchical model with multigranular sets of linguistic terms, and the choice of the most suitable set is a prerogative of each expert. From the human viewpoint, using such a model is advantageous, since it permits each expert to utilize linguistic terms that reflect more adequately the level of uncertainty intrinsic to his evaluation. From the operational viewpoint, the advantage of using such a model lies in the fact that it allows one to express the linguistic information in a unique domain, without loss of information, during the discussion process. The proposed consensus scheme supposes that the moderator can interfere in the discussion process in different ways. The intervention can be a request to any expert to update his opinion or can be the adjustment of the weight of each expert's opinion. An optimal adjustment can be achieved through the execution of an optimization procedure that searches for the weights that maximize a corresponding soft consensus index. In order to demonstrate the usefulness of the presented consensus scheme, a technique for multicriteria analysis, based on fuzzy preference relation modeling, is utilized for solving a hypothetical enterprise strategy planning problem, generated with the use of the Balanced Scorecard methodology. (C) 2009 Elsevier Inc. All rights reserved.
Diagnostic errors and repetitive sequential classifications in on-line process control by attributes
Abstract:
The procedure of on-line process control by attributes, known as Taguchi's on-line process control, consists of inspecting the mth item (a single item) at every m produced items and deciding, at each inspection, whether the fraction of conforming items was reduced or not. If the inspected item is nonconforming, the production is stopped for adjustment. As the inspection system can be subject to diagnosis errors, a probabilistic model is developed that classifies the examined item repeatedly until a conforming or b non-conforming classifications are observed. The first event that occurs (a conforming classifications or b non-conforming classifications) determines the final classification of the examined item. Properties of an ergodic Markov chain were used to obtain the expression for the average cost of the control system, which can be optimized over three parameters: the sampling interval of the inspections (m); the number of repeated conforming classifications (a); and the number of repeated non-conforming classifications (b). The optimum design is compared with two alternative approaches. The first consists of a simple preventive policy: the production system is adjusted at every n produced items (no inspection is performed). The second classifies the examined item repeatedly r (fixed) times and considers it conforming if most classification results are conforming. Results indicate that the current proposal performs better than the procedure that fixes the number of repeated classifications and classifies the examined item as conforming if most classifications were conforming. On the other hand, depending on the degree of errors and costs, the preventive policy can on average be more economical than the alternatives that require inspection. A numerical example illustrates the proposed procedure. (C) 2009 Elsevier B. V. All rights reserved.
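The stopping rule described in the abstract above (classify repeatedly until a conforming or b non-conforming classifications are seen) can be sketched numerically. The code below assumes that repeated classifications of the same item are independent and that each returns "conforming" with a fixed probability p; this independence assumption and the name `prob_final_conforming` are illustrative and are not taken from the paper.

```python
from math import comb


def prob_final_conforming(p, a, b):
    # Probability that a "conforming" verdicts are reached before
    # b "non-conforming" verdicts, when each verdict is independently
    # "conforming" with probability p.
    # The a-th conforming verdict arrives after exactly k non-conforming
    # ones (k < b) with probability C(a - 1 + k, k) * p**a * (1 - p)**k.
    q = 1.0 - p
    return sum(comb(a - 1 + k, k) * p**a * q**k for k in range(b))


# With a = b = 2 and p = 0.5 the rule is symmetric.
symmetric_case = prob_final_conforming(0.5, 2, 2)  # 0.5
```

Under these assumptions, p would itself depend on the item's true state and on the two diagnostic error rates, so the function would be evaluated once for a truly conforming item and once for a truly non-conforming one.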
Abstract:
The procedure for online process control by attributes consists of inspecting a single item at every m produced items. It is decided on the basis of the inspection result whether the process is in-control (the conforming fraction is stable) or out-of-control (the conforming fraction is decreased, for example). Most articles about online process control have cited the stoppage of the production process for an adjustment when the inspected item is non-conforming (the production is then restarted in-control; this is called here a corrective adjustment). Moreover, the articles related to this subject do not present semi-economical designs (which may yield high quantities of non-conforming items), as they do not include a policy of preventive adjustments (in which case no item is inspected), which can be more economical, mainly if the inspected item can be misclassified. In this article, the choice between a preventive and a corrective adjustment is made at every m produced items. If a preventive adjustment is decided upon, then no item is inspected. Otherwise, the m-th item is inspected; if it conforms, the production goes on; if not, an adjustment takes place and the process restarts in-control. This approach is economically feasible for some practical situations, and the parameters of the proposed procedure are determined by minimizing an average cost function subject to some statistical restrictions (for example, to assure a minimal level, fixed in advance, of conforming items in the production process). Numerical examples illustrate the proposal.
Abstract:
The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) for the long run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with compact action space depending on the state variable. In order to do that we first derive some important properties for a pseudo-Poisson equation associated to the problem. Subsequently, it is shown that the convergence of the PIA to a solution satisfying the optimality equation holds under some classical hypotheses and that this optimal solution yields an optimal control strategy for the average control problem for the continuous-time PDMP in a feedback form.
Abstract:
We consider in this paper the optimal stationary dynamic linear filtering problem for continuous-time linear systems subject to Markovian jumps in the parameters (LSMJP) and additive noise (Wiener process). It is assumed that only an output of the system is available and therefore the values of the jump parameter are not accessible. It is a well-known fact that in this setting the optimal nonlinear filter is infinite dimensional, which makes linear filtering a natural, numerically tractable choice. The goal is to design a dynamic linear filter such that the closed loop system is mean square stable and minimizes the stationary expected value of the mean square estimation error. It is shown that an explicit analytical solution to this optimal filtering problem is obtained from the stationary solution associated to a certain Riccati equation. It is also shown that the problem can be formulated using a linear matrix inequalities (LMI) approach, which can be extended to consider convex polytopic uncertainties on the parameters of the possible modes of operation of the system and on the transition rate matrix of the Markov process. As far as the authors are aware, this is the first time that this stationary filtering problem (exact and robust versions) for LSMJP with no knowledge of the Markov jump parameters is considered in the literature. Finally, we illustrate the results with an example.
Abstract:
In this paper we obtain the linear minimum mean square estimator (LMMSE) for discrete-time linear systems subject to state and measurement multiplicative noises and Markov jumps on the parameters. It is assumed that the Markov chain is not available. By using geometric arguments we obtain a Kalman type filter conveniently implementable in a recurrence form. The stationary case is also studied and a proof for the convergence of the error covariance matrix of the LMMSE to a stationary value under the assumption of mean square stability of the system and ergodicity of the associated Markov chain is obtained. It is shown that there exists a unique positive semi-definite solution for the stationary Riccati-like filter equation and, moreover, this solution is the limit of the error covariance matrix of the LMMSE. The advantage of this scheme is that it is very easy to implement and all calculations can be performed offline. (c) 2011 Elsevier Ltd. All rights reserved.
Abstract:
PURPOSE: To investigate the facial symmetry of rats submitted to experimental mandibular condyle fracture and with protein undernutrition (8% of protein) by means of cephalometric measurements. METHODS: Forty-five adult Wistar rats were distributed in three groups: fracture group, submitted to condylar fracture with no changes in diet; undernourished fracture group, submitted to a low-protein diet and condylar fracture; undernourished group, kept until the end of the experiment without condylar fracture. Displaced fractures of the right condyle were induced under general anesthesia. The specimens were submitted to axial radiographic views, and cephalometric measurements were made using a computer system. The values obtained were subjected to statistical analyses among the groups and between the sides in each group. RESULTS: There was a significant decrease in the values of serum proteins and albumin in the undernourished fracture group. There was deviation of the median line of the mandible relative to the median line of the maxilla, significant in the undernourished fracture group, as well as asymmetry of the maxilla and mandible, especially in the final period of the experiment. CONCLUSION: Mandibular condyle fracture in rats with protein undernutrition induced an asymmetry of the mandible, also leading to consequences in the maxilla.