855 results for "Faster convergence"


Abstract:

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward.
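The recursion itself is compact. The following Python fragment is a minimal sketch of the GPOMDP estimator as described above, keeping only an eligibility trace and a running average (hence twice the parameter storage). The `env`/`policy` interface (`reset`, `step`, `sample`, `grad_log_prob`) is a hypothetical one introduced for the example, not part of the paper.

```python
import numpy as np

def gpomdp_gradient(env, policy, theta, beta=0.9, T=100_000):
    """Hedged sketch of the GPOMDP estimator: a biased estimate of the gradient
    of the average reward, built from an eligibility trace z and a running
    average delta (twice the parameter storage, as noted in the abstract).

    Hypothetical interfaces assumed for this sketch:
      env.reset() -> observation
      env.step(action) -> (observation, reward)
      policy.sample(theta, obs) -> action drawn from mu_theta(.|obs)
      policy.grad_log_prob(theta, obs, action) -> grad of log mu_theta(action|obs)
    """
    z = np.zeros_like(theta)      # eligibility trace
    delta = np.zeros_like(theta)  # running gradient estimate
    obs = env.reset()
    for t in range(T):
        action = policy.sample(theta, obs)
        grad = policy.grad_log_prob(theta, obs, action)   # uses the observation the action was based on
        obs, reward = env.step(action)
        z = beta * z + grad               # beta trades bias for variance
        delta += (reward * z - delta) / (t + 1)   # incremental average of reward-weighted traces
    return delta
```

The returned vector would then feed a stochastic-gradient or conjugate-gradient update, as in the companion paper.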

Abstract:

A number of Game Strategies (GS) have been developed in past decades. They have been used in the fields of economics, engineering, computer science and biology due to their efficiency in solving design optimization problems. In addition, research in multi-objective (MO) and multidisciplinary design optimization (MDO) has focused on developing robust and efficient optimization methods to produce a set of high quality solutions with low computational cost. In this paper, two optimization techniques are considered: the first uses multi-fidelity hierarchical Pareto optimality; the second uses a combination of two Game Strategies, Nash equilibrium and Pareto optimality. The paper shows how Game Strategies can be hybridised and coupled to Multi-Objective Evolutionary Algorithms (MOEA) to accelerate convergence and to produce a set of high quality solutions. Numerical results obtained from both optimization methods are compared in terms of computational expense and model quality. The benefits of using Hybrid-Game Strategies are clearly demonstrated.
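For orientation, the Pareto-optimality ingredient reduces to a non-dominated filter over the objective vectors of the current population. The sketch below shows only that elementary building block (minimisation assumed), not the authors' hybrid Nash/Pareto algorithm.

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation):
    a is no worse in every objective and strictly better in at least one."""
    a, b = np.asarray(a), np.asarray(b)
    return np.all(a <= b) and np.any(a < b)

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors.
    This is the filter an MOEA applies to its population each generation."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(p)
    return front
```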

Abstract:

We consider complexity penalization methods for model selection. These methods aim to choose a model to optimally trade off estimation and approximation errors by minimizing the sum of an empirical risk term and a complexity penalty. It is well known that if we use a bound on the maximal deviation between empirical and true risks as a complexity penalty, then the risk of our choice is no more than the approximation error plus twice the complexity penalty. There are many cases, however, where complexity penalties like this give loose upper bounds on the estimation error. In particular, if we choose a function from a suitably simple convex function class with a strictly convex loss function, then the estimation error (the difference between the risk of the empirical risk minimizer and the minimal risk in the class) approaches zero at a faster rate than the maximal deviation between empirical and true risks. In this paper, we address the question of whether it is possible to design a complexity penalized model selection method for these situations. We show that, provided the sequence of models is ordered by inclusion, in these cases we can use tight upper bounds on estimation error as a complexity penalty. Surprisingly, this is the case even in situations when the difference between the empirical risk and true risk (and indeed the error of any estimate of the approximation error) decreases much more slowly than the complexity penalty. We give an oracle inequality showing that the resulting model selection method chooses a function with risk no more than the approximation error plus a constant times the complexity penalty.
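In symbols, the selection rule and the kind of guarantee described above look roughly as follows; the notation (nested classes F_k, empirical risk R̂_n, true risk R, penalty π_k, constant c) is introduced here purely for illustration.

```latex
% Penalized selection over nested models F_1 \subseteq F_2 \subseteq \dots
\hat f_k = \operatorname*{arg\,min}_{f \in F_k} \hat R_n(f), \qquad
\hat k   = \operatorname*{arg\,min}_{k} \Big\{ \hat R_n(\hat f_k) + \pi_k(n) \Big\}.
% Oracle inequality of the type described in the abstract:
R\big(\hat f_{\hat k}\big) \;\le\; \inf_k \Big\{ \inf_{f \in F_k} R(f) + c\,\pi_k(n) \Big\}.
```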

Abstract:

We study sample-based estimates of the expectation of the function produced by the empirical minimization algorithm. We investigate the extent to which one can estimate the rate of convergence of the empirical minimizer in a data-dependent manner. We establish three main results. First, we provide an algorithm that upper bounds the expectation of the empirical minimizer in a completely data-dependent manner. This bound is based on a structural result due to Bartlett and Mendelson, which relates expectations to sample averages. Second, we show that these structural upper bounds can be loose compared to previous bounds. In particular, we demonstrate a class for which the expectation of the empirical minimizer decreases as O(1/n) for sample size n, although the upper bound based on structural properties is Ω(1). Third, we show that this looseness of the bound is inevitable: we present an example that shows that a sharp bound cannot be universally recovered from empirical data.
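For concreteness, writing $Pf$ for the true expectation, $P_n f$ for the sample average and $\hat f$ for the empirical minimizer (notation introduced here for illustration), the quantity being bounded is the expected excess of the empirical minimizer:

```latex
\hat f = \operatorname*{arg\,min}_{f \in F} P_n f, \qquad
\mathcal{E}_n = \mathbb{E}\, P\hat f - \inf_{f \in F} P f .
% The abstract describes a class F for which \mathcal{E}_n = O(1/n),
% while the structural (uniform-deviation-based) upper bound is \Omega(1).
```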

Abstract:

In semisupervised learning (SSL), a predictive model is learned from a collection of labeled data and a typically much larger collection of unlabeled data. This paper presents a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.
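Graph-based SSL, one of the special cases listed above, already illustrates the data-dependent regularization idea. The sketch below computes the classical harmonic solution on a graph Laplacian; it is shown only as one special case for orientation, not as the MVPCR reduction itself.

```python
import numpy as np

def harmonic_ssl(W, y_labeled, labeled_idx):
    """Graph-based semisupervised learning: minimise f^T L f with the labeled
    values held fixed. W is a symmetric (n x n) affinity matrix, y_labeled the
    labels of the nodes listed in labeled_idx. Returns a value for every node.
    A minimal sketch of one special case of the framework."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # combinatorial graph Laplacian
    labeled = np.asarray(labeled_idx)
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    f = np.zeros(n)
    f[labeled] = y_labeled
    # Harmonic solution: f_u = -L_uu^{-1} L_ul f_l
    L_uu = L[np.ix_(unlabeled, unlabeled)]
    L_ul = L[np.ix_(unlabeled, labeled)]
    f[unlabeled] = np.linalg.solve(L_uu, -L_ul @ f[labeled])
    return f
```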

Abstract:

Background: Achieving health equity has been identified as a major challenge, both internationally and within Australia. Inequalities in cancer outcomes are well documented, and must be quantified before they can be addressed. One method of portraying geographical variation in data uses maps. Recently we have produced thematic maps showing the geographical variation in cancer incidence and survival across Queensland, Australia. This article documents the decisions and rationale used in producing these maps, with the aim of assisting others in producing chronic disease atlases. Methods: Bayesian hierarchical models were used to produce the estimates. Justification is provided for the cancers chosen, the geographical areas used, the modelling method, the outcome measures mapped, the production of the adjacency matrix, the assessment of convergence, the sensitivity analyses performed and the determination of significant geographical variation. Conclusions: Although careful consideration of many issues is required, chronic disease atlases are a useful tool for assessing and quantifying geographical inequalities. In addition, they help focus research efforts to investigate why the observed inequalities exist, which in turn inform advocacy, policy, support and education programs designed to reduce these inequalities.
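One of the items documented above is the production of the adjacency matrix. A minimal sketch of that single step is given below, assuming a precomputed neighbour list; the paper documents the decisions behind its own adjacency matrix rather than prescribing code.

```python
import numpy as np

def adjacency_matrix(neighbours):
    """Build the symmetric binary adjacency matrix used by spatial (e.g.
    conditional autoregressive) models. `neighbours` maps each area index
    (0..n-1) to the indices of areas sharing a border with it."""
    n = len(neighbours)
    A = np.zeros((n, n), dtype=int)
    for i, nbrs in neighbours.items():
        for j in nbrs:
            A[i, j] = A[j, i] = 1
    return A
```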

Abstract:

The uniformization method (also known as randomization) is a numerically stable algorithm for computing transient distributions of a continuous time Markov chain. When the solution is needed after a long run or when the convergence is slow, the uniformization method involves a large number of matrix-vector products. Despite this, the method remains very popular due to its ease of implementation and its reliability in many practical circumstances. Because calculating the matrix-vector product is the most time-consuming part of the method, overall efficiency in solving large-scale problems can be significantly enhanced if the matrix-vector product is made more economical. In this paper, we incorporate a new relaxation strategy into the uniformization method to compute the matrix-vector products only approximately. We analyze the error introduced by these inexact matrix-vector products and discuss strategies for refining the accuracy of the relaxation while reducing the execution cost. Numerical experiments drawn from computer systems and biological systems are given to show that significant computational savings are achieved in practical applications.
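For reference, the basic uniformization iteration (with exact matrix-vector products, i.e. the method the paper starts from rather than its relaxed variant) can be sketched as follows; the function name and dense-matrix setting are assumptions made for the example.

```python
import numpy as np

def uniformization(Q, p0, t, tol=1e-10, k_max=100_000):
    """Transient distribution p(t) = p0 * expm(Q t) of a CTMC via uniformization.
    Q: generator matrix (rows sum to zero), p0: initial distribution (row vector).
    A minimal dense sketch with exact matrix-vector products; the paper's
    contribution is precisely to compute these products only approximately."""
    Lam = np.max(np.abs(np.diag(Q)))          # uniformization rate Lambda
    P = np.eye(Q.shape[0]) + Q / Lam          # DTMC transition matrix
    weight = np.exp(-Lam * t)                 # Poisson(Lam*t) weight for k = 0
    v = np.asarray(p0, dtype=float).copy()    # holds p0 * P^k
    result = weight * v
    acc = weight
    for k in range(1, k_max):
        v = v @ P                             # one matrix-vector product per term
        weight *= Lam * t / k                 # next Poisson weight
        acc += weight
        result += weight * v
        if 1.0 - acc <= tol:                  # Poisson weights have summed to ~1
            break
    # Note: for very large Lam*t the k = 0 weight underflows; a robust code would
    # compute the Poisson weights with the Fox-Glynn method instead.
    return result
```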

Abstract:

BACKGROUND The transgenic adenocarcinoma of the mouse prostate (TRAMP) model closely mimics prostate cancer (PC) progression as it occurs in humans. However, the timing of disease incidence and progression (especially late stage) makes it logistically difficult to conduct experiments synchronously and economically. The development and characterization of androgen depletion independent (ADI) TRAMP sublines are reported. METHODS Sublines were derived from androgen-sensitive TRAMP-C1 and TRAMP-C2 cell lines by androgen deprivation in vitro and in vivo. Epithelial origin (cytokeratin) and expression of late stage biomarkers (E-cadherin and KAI-1) were evaluated using immunohistochemistry. Androgen receptor (AR) status was assessed through quantitative real time PCR, Western blotting, and immunohistochemistry. Coexpression of AR and E-cadherin was also evaluated. Clonogenicity and invasive potential were measured by soft agar and matrigel invasion assays. Proliferation/survival of sublines in response to androgen was assessed by WST-1 assay. In vivo growth of subcutaneous tumors was assessed in castrated and sham-castrated C57BL/6 mice. RESULTS The sublines were epithelial and displayed ADI in vitro and in vivo. Compared to the parental lines, they showed (1) significantly faster growth rates in vitro and in vivo independent of androgen depletion, and (2) greater tumorigenic and invasive potential in vitro. All showed substantial downregulation in expression levels of the tumor suppressor E-cadherin and the metastasis suppressor KAI-1. Interestingly, the percentage of cells expressing AR with downregulated E-cadherin was higher in ADI cells, suggesting a possible interaction between the two pathways. CONCLUSIONS The TRAMP model now encompasses ADI sublines potentially representing different phenotypes with increased tumorigenicity and invasiveness.

Abstract:

Many new applications in engineering and science are governed by fractional partial differential equations (FPDEs). Unlike normal partial differential equations (PDEs), an FPDE involves derivatives of fractional order, which leads to new challenges for numerical simulation because most existing numerical techniques were developed for PDEs of integer order. The currently dominant numerical method for FPDEs is the finite difference method (FDM), which has difficulty handling complex problem domains and irregular nodal distributions. This paper aims to develop an implicit meshless approach based on the moving least squares (MLS) approximation for numerical simulation of the fractional advection-diffusion equation (FADE), a typical FPDE. The discrete system of equations is obtained by using the MLS meshless shape functions and the meshless strong forms. The stability and convergence related to the time discretization of this approach are then discussed and theoretically proven. Several numerical examples with different problem domains and different nodal distributions are used to validate and investigate the accuracy and efficiency of the newly developed meshless formulation. It is concluded that the present meshless formulation is very effective for the modeling and simulation of the FADE.
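For orientation, a common building block for discretising the space-fractional term of a FADE is the shifted Grünwald-Letnikov formula; the sketch below computes its coefficients and applies it on a uniform 1-D grid. This is a standard finite-difference ingredient shown for context, not the authors' MLS meshless scheme.

```python
import numpy as np

def grunwald_weights(alpha, n):
    """Coefficients g_k = (-1)^k C(alpha, k) of the Grunwald-Letnikov series,
    via the recurrence g_0 = 1, g_k = g_{k-1} * (1 - (alpha + 1)/k)."""
    g = np.empty(n + 1)
    g[0] = 1.0
    for k in range(1, n + 1):
        g[k] = g[k - 1] * (1.0 - (alpha + 1.0) / k)
    return g

def shifted_gl_fractional_derivative(u, h, alpha):
    """Shifted Grunwald-Letnikov approximation of d^alpha u / dx^alpha
    (1 < alpha <= 2) at the interior nodes of a uniform grid with spacing h:
        D^alpha u(x_i) ~ h^(-alpha) * sum_{k=0..i+1} g_k * u(x_{i-k+1})."""
    n = len(u)
    g = grunwald_weights(alpha, n)
    d = np.zeros(n)
    for i in range(1, n - 1):
        s = 0.0
        for k in range(i + 2):          # uses nodes x_{i+1}, x_i, ..., x_0
            s += g[k] * u[i - k + 1]
        d[i] = s / h**alpha
    return d
```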

Abstract:

This paper aims to develop an implicit meshless approach based on the radial basis function (RBF) for numerical simulation of time fractional diffusion equations. The meshless RBF interpolation is first briefly reviewed. The discrete equations for the two-dimensional time fractional diffusion equation (FDE) are obtained by using the meshless RBF shape functions and the strong form of the time FDE. The stability and convergence of this meshless approach are discussed and theoretically proven. Numerical examples with different problem domains and different nodal distributions are studied to validate and investigate the accuracy and efficiency of the newly developed meshless approach. The results show that the present meshless formulation is very effective for the modeling and simulation of fractional differential equations.
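The time-fractional part of such a scheme is typically handled with the L1 approximation of the Caputo derivative of order 0 < γ < 1; the sketch below shows that single ingredient on a uniform time grid. It is a standard building block shown for context, not the authors' RBF formulation.

```python
import math
import numpy as np

def caputo_l1(u_history, dt, gamma):
    """L1 approximation of the Caputo fractional derivative of order
    0 < gamma < 1 at the latest time level, given u_history = [u^0, ..., u^n]
    on a uniform grid with step dt:
        D_t^gamma u(t_n) ~ dt^(-gamma) / Gamma(2 - gamma)
                           * sum_{j=0..n-1} b_j (u^{n-j} - u^{n-j-1}),
        b_j = (j + 1)^(1 - gamma) - j^(1 - gamma).
    The entries u^k may be scalars or arrays of nodal values."""
    u = [np.asarray(v, dtype=float) for v in u_history]
    n = len(u) - 1
    coef = dt ** (-gamma) / math.gamma(2.0 - gamma)
    acc = np.zeros_like(u[0])
    for j in range(n):
        b_j = (j + 1) ** (1.0 - gamma) - j ** (1.0 - gamma)
        acc = acc + b_j * (u[n - j] - u[n - j - 1])
    return coef * acc
```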

Abstract:

In this paper, a variable-order nonlinear cable equation is considered. A numerical method with first-order temporal accuracy and fourth-order spatial accuracy is proposed. The convergence and stability of the numerical method are analyzed by Fourier analysis. We also propose an improved numerical method with second-order temporal accuracy and fourth-order spatial accuracy. Finally, the results of a numerical example support the theoretical analysis.
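For context, fourth-order spatial accuracy is commonly obtained with the standard five-point central stencil for the second derivative. Whether the paper uses exactly this stencil or a compact variant is not stated in the abstract, so the formula below is illustrative only.

```latex
\frac{\partial^2 u}{\partial x^2}\Big|_{x_i}
  = \frac{-u_{i-2} + 16\,u_{i-1} - 30\,u_i + 16\,u_{i+1} - u_{i+2}}{12\,h^2}
    + O(h^4).
```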

Abstract:

This study investigated the hypothesis that muscle damage would be attenuated in muscles subjected to passive hyperthermia 1 day prior to exercise. Fifteen male students performed 24 maximal eccentric actions of the elbow flexors with one arm; the opposite arm performed the same exercise 2-4 weeks later. The elbow flexors of one arm received a microwave diathermy treatment that increased muscle temperature to over 40°C, 16-20 h prior to the exercise. The contralateral arm acted as an untreated control. Maximal voluntary isometric contraction strength (MVC), range of motion (ROM), upper arm circumference, muscle soreness, plasma creatine kinase activity and myoglobin concentration were measured 1 day prior to exercise, immediately before and after exercise, and daily for 4 days following exercise. Changes in the criterion measures were compared between conditions (treatment vs. control) using a two-way repeated measures ANOVA with a significance level of P < 0.05. All measures changed significantly following exercise, but the treatment arm showed a significantly faster recovery of MVC, a smaller change in ROM, and less muscle soreness compared with the control arm. However, the protective effect conferred by the diathermy treatment was significantly less effective compared with that seen in the second bout performed 4-6 weeks after the initial bout by a subgroup of the subjects (n = 11) using the control arm. These results suggest that passive hyperthermia treatment 1 day prior to eccentric exercise-induced muscle damage has a prophylactic effect, but the effect is not as strong as the repeated bout effect. © Springer-Verlag 2006.

Abstract:

In this research we examined, by means of case studies, the mechanisms by which relationships can be managed and by which communication and cooperation can be enhanced in developing sustainable supply chains. The research was predicated on the contention that the development of a sustainable supply chain depends, in part, on the transfer of knowledge and capabilities from the larger players in the supply chain. A sustainable supply chain requires proactive relationship management and the development of an appropriate organisational culture and trust. By legitimising individuals’ expectations of the type of culture which is appropriate to their company and by empowering employees to address mismatches that may occur, a situation can be created whereby the collaborating organisations develop their competences symbiotically and so facilitate a sustainable supply chain. Effective supply chain management enhances organisational performance and competitiveness through the management of operations across organisational boundaries. Relational contracting approaches facilitate the exchange of information and knowledge and build capacity in the supply chain, thus enhancing its sustainability. Relationship management also provides the conditions necessary for the development of collaborative and cooperative relationships. However, subcontractors and suppliers are often not empowered to attend project meetings or to have direct communication with project-based staff. With this being a common phenomenon in the construction industry, one might ask: what are the barriers to implementing relationship management throughout the supply chain? In other words, the problem addressed in this research is the engagement of the supply chain through relationship management.

Abstract:

We consider a stochastic regularization method for solving the backward Cauchy problem in Banach spaces. An order of convergence is obtained on sourcewise representative elements.
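For context, one standard formulation of the backward Cauchy problem (stated here for illustration; the paper works in a general Banach-space setting) is: given the final value of an abstract evolution equation, recover the initial state. The problem is ill posed, and convergence rates for regularization methods are typically obtained only for initial data satisfying a sourcewise representation, i.e. data lying in the range of a suitable smoothing operator built from A.

```latex
u'(t) = A\,u(t), \quad 0 < t < T, \qquad u(T) = \varphi \ \text{given};
\qquad \text{recover } u(0).
```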