Digital Library

3 results for VARIABLE SAMPLING INTERVAL

at Massachusetts Institute of Technology

Policy Improvement for POMDPs Using Normalized Importance Sampling

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP and allows the experience to be gathered with an arbitrary set of policies. The return is estimated for any new policy of the POMDP. We motivate the estimator from function-approximation and importance-sampling points of view and derive its theoretical properties. Although the estimator is biased, it has low variance, and the bias is often irrelevant when the estimator is used for pairwise comparisons. We conclude by extending the estimator to policies with memory and compare its performance in a greedy search algorithm to the REINFORCE algorithm, showing an order of magnitude reduction in the number of trials required.


Estimating Dependency Structure as a Hidden Variable

Relevance:

20.00% 20.00%

Publisher:

Abstract:


Estimating Dependency Structure as a Hidden Variable

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper introduces a probability model, the mixture of trees, that can account for sparse, dynamically changing dependence relationships. We present a family of efficient algorithms that use EM and the Minimum Spanning Tree algorithm to find the ML and MAP mixture of trees for a variety of priors, including the Dirichlet and the MDL priors. We also show that the single-tree classifier acts like an implicit feature selector, thus making the classification performance insensitive to irrelevant attributes. Experimental results demonstrate the excellent performance of the new model both in density estimation and in classification.

