3 resultados para sampling procedure

em Massachusetts Institute of Technology


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Two methods of obtaining approximate solutions to the classic General Job-shop Scheduling Program are investigated. The first method is iterative. A sampling of the solution space is used to decide which of a collection of space pruning constraints are consistent with "good" schedules. The selected space pruning constraints are then used to reduce the search space and the sampling is repeated. This approach can be used either to verify whether some set of space pruning constraints can prune with discrimination or to generate solutions directly. Schedules can be represented as trajectories through a Cartesian space. Under the objective criteria of Minimum maximum Lateness family of "good" schedules (trajectories) are geometric neighbors (reside with some "tube") in this space. This second method of generating solutions takes advantage of this adjacency by pruning the space from the outside in thus converging gradually upon this "tube." One the average this methods significantly outperforms an array of the Priority Dispatch rules when the object criteria is that of Minimum Maximum Lateness. It also compares favorably with a recent relaxation procedure.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a framework for learning in hidden Markov models with distributed state representations. Within this framework, we derive a learning algorithm based on the Expectation--Maximization (EM) procedure for maximum likelihood estimation. Analogous to the standard Baum-Welch update rules, the M-step of our algorithm is exact and can be solved analytically. However, due to the combinatorial nature of the hidden state representation, the exact E-step is intractable. A simple and tractable mean field approximation is derived. Empirical results on a set of problems suggest that both the mean field approximation and Gibbs sampling are viable alternatives to the computationally expensive exact algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowle ge of the POMDP and allows the experience to be gathered with an arbitrary set of policies. The return is estimated for any new policy of the POMDP. We motivate the estimator from function-approximation and importance sampling points-of-view and derive its theoretical properties. Although the estimator is biased, it has low variance and the bias is often irrelevant when the estimator is used for pair-wise comparisons.We conclude by extending the estimator to policies with memory and compare its performance in a greedy search algorithm to the REINFORCE algorithm showing an order of magnitude reduction in the number of trials required.