69 results for Gradient terms
in Queensland University of Technology - ePrints Archive
Abstract:
Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple-agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
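The update behind the "twice the number of policy parameters" storage claim can be sketched as follows: one eligibility trace `z` and one running gradient estimate `delta`, each the size of the parameter vector, with β discounting the trace. This is only a minimal sketch. The toy problem (a single observation, two actions, reward equal to the chosen action) and the names `gpomdp_estimate`, `sigmoid`, `num_steps` and `seed` are assumptions made for illustration and do not come from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gpomdp_estimate(theta, beta, num_steps, seed=0):
    """Estimate d(average reward)/d(theta) on a hypothetical toy problem.

    Toy problem (an assumption, not from the paper): one observation,
    two actions, reward 1 for action 1 and 0 for action 0.
    Policy: P(action = 1) = sigmoid(theta), a single scalar parameter.
    """
    rng = random.Random(seed)
    z = 0.0      # eligibility trace, one entry per policy parameter
    delta = 0.0  # running gradient estimate, one entry per parameter
    p = sigmoid(theta)
    for t in range(num_steps):
        action = 1 if rng.random() < p else 0
        # Score function: d/dtheta of log P(action | theta).
        score = (1.0 - p) if action == 1 else -p
        reward = float(action)
        # Discounted trace of past score functions; beta must lie in [0, 1).
        z = beta * z + score
        # Running average of reward * trace.
        delta += (reward * z - delta) / (t + 1)
    return delta


if __name__ == "__main__":
    # For this toy problem the true gradient at theta = 0 is
    # sigmoid(0) * (1 - sigmoid(0)) = 0.25.
    print(gpomdp_estimate(theta=0.0, beta=0.0, num_steps=50000))
```

Because the toy problem has no hidden state, β = 0 already gives an unbiased estimate here. In a POMDP with slow mixing, a larger β would be needed to reduce bias, at the cost of higher variance, which is the trade-off the abstract describes.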
Abstract:
This research investigates the prevalence of sports-related terms among the Web sites of the world’s leading companies, the Fortune Global 500. An automated process copied about four gigabytes of textual data, around 70 million words, from their sites. The subsequent analysis revealed regional and industry differences in the distribution of sports-related terms, the popularity of tennis stars and few references to sports stars, especially in Asia.
Abstract:
In the policy debate about the need for legislation to prohibit the use of unfair terms in consumer contracts, substantive unfairness is often distinguished from procedural unfairness. Current consumer protection laws appear to offer the potential for relief on substantive unfairness grounds alone. However, a review of cases involving credit contracts shows this potential is rarely realised. This reluctance to provide relief for substantive injustice reflects a preoccupation with freedom and certainty of contract, the notions underpinning classical contract theories. As a class, consumers are vulnerable in the marketplace, and they do need protection from substantively unfair terms. A new framework for regulating consumer contracts is needed, one that relies less on classical contract theories and takes the reality of consumer contracting and consumer behavior as its starting point. Unfair contract terms legislation will be a step on the path towards this new framework.
Abstract:
This work investigates the effect of rib stiffeners on the free and forced vibration of a gradient coil in a Magnetic Resonance Imaging (MRI) scanner. Several reinforcement schemes are studied in this paper. One scheme utilizes the existing holes in the gradient coil structure (typically reserved for magnetic shims) to produce the reinforcement. Non-ferrous, non-magnetic carbon fibre rib stiffeners are employed to fill these holes in several ways to strengthen a gradient coil. Another scheme replaces the inner half of the gradient coil material with a grid of interconnected axial and circumferential rib stiffeners. It is found that the structural stiffness of the gradient coil increases substantially when the coil is reinforced by carbon fibre rib stiffeners. The reinforcement affects the noise and vibration response of the gradient coil structure in the following ways. It increases the frequency range of forced response of the gradient coil at low frequencies due to the increased resonant frequency of the fundamental mode of the coil. Secondly, it reduces the forced response amplitude of the coil structure (which is governed by the structural stiffness of the coil). Thirdly, it reduces the number of natural modes in the low and medium frequency range and therefore lessens the chance of the coil structure being excited resonantly by magnetic resonance signal acquisition sequences. It is shown that gradient coils modelled by solid finite element models have higher stiffness along the coil’s circumference and lower stiffness in the axial direction than those using shell finite element models.
Abstract:
Economists rely heavily on self-reported measures to examine the relationship between income and health. We directly compare survey responses of a self-reported measure of health that is commonly used in nationally representative surveys with objective measures of the same health condition. We focus on hypertension. We find no evidence of an income/health gradient using self-reported hypertension but a sizeable gradient when using objectively measured hypertension. We also find that the probability of a false negative reporting is significantly income graded. Our results suggest that using commonly available self-reported chronic health measures might underestimate true income-related inequalities in health.
Abstract:
Artificial neural networks (ANN) have demonstrated good predictive performance in a wide range of applications. They are, however, not considered sufficient for knowledge representation because of their inability to represent the reasoning process succinctly. This paper proposes a novel methodology Gyan that represents the knowledge of a trained network in the form of restricted first-order predicate rules. The empirical results demonstrate that an equivalent symbolic interpretation in the form of rules with predicates, terms and variables can be derived describing the overall behaviour of the trained ANN with improved comprehensibility while maintaining the accuracy and fidelity of the propositional rules.
Abstract:
The flexural capacity of a new cold-formed hollow flange channel section known as the LiteSteel beam (LSB) is limited by lateral distortional buckling for intermediate spans, which is characterised by simultaneous lateral deflection, twist and web distortion. Recent research has developed suitable design rules for the member capacity of LSBs. However, they are limited to a uniform moment distribution that rarely exists in practice. Many steel design codes have adopted equivalent uniform moment distribution factors to accommodate the effect of non-uniform moment distributions in design. But they were derived mostly based on the data for conventional hot-rolled, doubly symmetric I-beams subject to lateral torsional buckling. The effect of moment distribution for LSBs, and the suitability of the current steel design code rules to include this effect for LSBs, are not yet known. This paper presents the details of a research study based on finite element analyses of the lateral buckling strength of simply supported LSBs subject to moment gradient effects. It also presents the details of a number of LSB lateral buckling experiments undertaken to validate the results of finite element analyses. Finally, it discusses the suitability of the current design methods, and provides design recommendations for simply supported LSBs subject to moment gradient effects.
Abstract:
Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more robust. Single-channel spectral subtraction was originally designed to improve human speech intelligibility and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics are used to assess enhancement performance for intelligibility, not speech recognition, making them sub-optimal for ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally-efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) introducing phase spectrum information to enable spectral subtraction in the complex frequency domain. Likelihood-maximisation uses gradient-descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word sequence known a priori. Whilst this technique is shown to improve the ASR word accuracy performance, it is also identified to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes which are typically used in ASR feature extraction.
Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy in a wide range of SNR. Throughout the dissertation, consideration is given to practical implementation in vehicular environments which resulted in two novel contributions: a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Australian English In-Car Speech Corpus which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.
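For readers unfamiliar with the baseline being extended, conventional magnitude-only spectral subtraction for a single frame might look like the sketch below. It is not the dissertation's LIMA or complex-domain method. The function name `spectral_subtract` and the parameters `alpha` (over-subtraction factor) and `floor` (spectral floor) are hypothetical, and the noise magnitude estimate is assumed to be supplied by the caller.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from one noisy magnitude spectrum.

    alpha (over-subtraction factor) and floor (spectral floor, expressed as
    a fraction of the noisy magnitude) are hypothetical tuning parameters.
    Phase is ignored entirely, as in the conventional magnitude-only form.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Clamp to a small positive floor so that no bin goes negative.
        cleaned.append(max(y - alpha * n, floor * y))
    return cleaned


if __name__ == "__main__":
    # Three frequency bins with a flat noise estimate of 0.3.
    print(spectral_subtract([1.0, 0.5, 0.2], [0.3, 0.3, 0.3]))
```

The third bin shows the floor in action: the raw difference would be negative, so the output falls back to a small fraction of the noisy magnitude.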
Abstract:
An experimental investigation has been made of a round, non-buoyant plume of nitric oxide, NO, in a turbulent grid flow of ozone, O3, using the Turbulent Smog Chamber at the University of Sydney. The measurements have been made at a resolution not previously reported in the literature. The reaction is conducted at non-equilibrium so there is significant interaction between turbulent mixing and chemical reaction. The plume has been characterized by a set of constant initial reactant concentration measurements consisting of radial profiles at various axial locations. Whole plume behaviour can thus be characterized and parameters are selected for a second set of fixed physical location measurements where the effects of varying the initial reactant concentrations are investigated. Careful experiment design and specially developed chemiluminescent analysers, which measure fluctuating concentrations of reactive scalars, ensure that spatial and temporal resolutions are adequate to measure the quantities of interest. Conserved scalar theory is used to define a conserved scalar from the measured reactive scalars and to define frozen, equilibrium and reaction dominated cases for the reactive scalars. Reactive scalar means and the mean reaction rate are bounded by frozen and equilibrium limits but this is not always the case for the reactant variances and covariances. The plume reactant statistics are closer to the equilibrium limit than those for the ambient reactant. The covariance term in the mean reaction rate is found to be negative and significant for all measurements made. The Toor closure was found to overestimate the mean reaction rate by 15 to 65%. Gradient model turbulent diffusivities had significant scatter and were not observed to be affected by reaction. The ratio of turbulent diffusivities for the conserved scalar mean and that for the r.m.s. was found to be approximately 1. 
Estimates of the ratio of the dissipation timescales of around 2 were found downstream. Estimates of the correlation coefficient between the conserved scalar and its dissipation (parallel to the mean flow) were found to be between 0.25 and the significant value of 0.5. Scalar dissipations for non-reactive and reactive scalars were found to be significantly different. Conditional statistics are found to be a useful way of investigating the reactive behaviour of the plume, effectively decoupling the interaction of chemical reaction and turbulent mixing. It is found that conditional reactive scalar means lack significant transverse dependence as has previously been found theoretically by Klimenko (1995). It is also found that conditional variance around the conditional reactive scalar means is relatively small, simplifying the closure for the conditional reaction rate. These properties are important for the Conditional Moment Closure (CMC) model for turbulent reacting flows recently proposed by Klimenko (1990) and Bilger (1993). Preliminary CMC model calculations are carried out for this flow using a simple model for the conditional scalar dissipation. Model predictions and measured conditional reactive scalar means compare favorably. The reaction dominated limit is found to indicate the maximum reactedness of a reactive scalar and is a limiting case of the CMC model. Conventional (unconditional) reactive scalar means obtained from the preliminary CMC predictions using the conserved scalar p.d.f. compare favorably with those found from experiment except where measuring position is relatively far upstream of the stoichiometric distance. Recommendations include applying a full CMC model to the flow and investigations both of the less significant terms in the conditional mean species equation and the small variation of the conditional mean with radius. 
Forms for the p.d.f.s, in addition to those found from experiments, could be useful for extending the CMC model to reactive flows in the atmosphere.
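The "covariance term in the mean reaction rate" mentioned in this abstract can be made explicit with a short derivation. The sketch assumes a second-order reaction NO + O3 with a constant rate coefficient $k$; the symbols $c$, $k$ and $w$ are notation introduced here and are not taken from the thesis. Each concentration is split into a mean and a fluctuation, and the product is then averaged.

```latex
\begin{align}
  w &= k\, c_{\mathrm{NO}}\, c_{\mathrm{O_3}},
  \qquad c_i = \overline{c_i} + c_i', \quad \overline{c_i'} = 0, \\
  \overline{w} &= k \left( \overline{c_{\mathrm{NO}}}\;\overline{c_{\mathrm{O_3}}}
      + \overline{c_{\mathrm{NO}}'\, c_{\mathrm{O_3}}'} \right).
\end{align}
```

A negative covariance $\overline{c_{\mathrm{NO}}'\, c_{\mathrm{O_3}}'}$ therefore makes the mean reaction rate smaller than the product of the mean concentrations alone would suggest.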