40 results for Welfare State Models
Abstract:
State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. Copyright 2010 by the authors.
Abstract:
Obtaining accurate confidence measures for automatic speech recognition (ASR) transcriptions is an important task which stands to benefit from the use of multiple information sources. This paper investigates the application of conditional random field (CRF) models as a principled technique for combining multiple features from such sources. A novel method for combining suitably defined features is presented, allowing for confidence annotation using lattice-based features of hypotheses other than the lattice 1-best. The resulting framework is applied to different stages of a state-of-the-art large vocabulary speech recognition pipeline, and consistent improvements are shown over a sophisticated baseline system. Copyright © 2011 ISCA.
Abstract:
In this article, we develop a new Rao-Blackwellized Monte Carlo smoothing algorithm for conditionally linear Gaussian models. The algorithm is based on the forward-filtering backward-simulation Monte Carlo smoother concept and performs the backward simulation directly in the marginal space of the non-Gaussian state component while treating the linear part analytically. Unlike the previously proposed backward-simulation based Rao-Blackwellized smoothing approaches, it does not require sampling of the Gaussian state component and is also able to overcome certain normalization problems of two-filter smoother based approaches. The performance of the algorithm is illustrated in a simulated application. © 2012 IFAC.
Abstract:
Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character-level language models (LMs) can be used as a first-level approximation to allowed syllable sequences. To test this idea, word- and character-level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of text from a wide collection of sources. Both hypothesis- and model-based combination techniques were investigated to combine word- and character-level LMs. Significant character error rate reductions of up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history-dependent multi-level LM that performs a log-linear combination of character- and word-level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.
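As an illustration of the log-linear combination mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes two hypothetical probability tables over the same set of candidates (one from a word-level LM, one from a character-level LM) and hypothetical interpolation weights; the function name and parameters are invented for this example.

```python
import math


def log_linear_combine(word_probs, char_probs, word_weight=0.5, char_weight=0.5):
    """Combine two distributions over the same candidates log-linearly.

    word_probs and char_probs are hypothetical dicts mapping each candidate
    to a strictly positive probability. The weights are assumed values.
    """
    # Weighted sum of log-probabilities for every candidate.
    scores = {
        cand: word_weight * math.log(word_probs[cand])
        + char_weight * math.log(char_probs[cand])
        for cand in word_probs
    }
    # Renormalize with log-sum-exp so the result sums to one.
    max_score = max(scores.values())
    log_norm = max_score + math.log(
        sum(math.exp(s - max_score) for s in scores.values())
    )
    return {cand: math.exp(s - log_norm) for cand, s in scores.items()}


combined = log_linear_combine({"a": 0.5, "b": 0.5}, {"a": 0.8, "b": 0.2})
```

With equal weights the combined score is proportional to the geometric mean of the two model probabilities, so a candidate is favoured only when neither model assigns it a very low probability.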
Abstract:
The task of word-level confidence estimation (CE) for automatic speech recognition (ASR) systems stands to benefit from the combination of suitably defined input features from multiple information sources. However, the information sources of interest may not necessarily operate at the same level of granularity as the underlying ASR system. The research described here builds on previous work on confidence estimation for ASR systems using features extracted from word-level recognition lattices, by incorporating information at the sub-word level. Furthermore, the use of Conditional Random Fields (CRFs) with hidden states is investigated as a technique to combine information for word-level CE. Performance improvements are shown using the sub-word-level information in linear-chain CRFs with appropriately engineered feature functions, as well as when applying the hidden-state CRF model at the word level.
Abstract:
In natural languages, multiple word sequences can represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage, for example, when using n-gram language models (LMs). To handle this issue, this paper presents a novel form of language model, the paraphrastic LM. A phrase-level transduction model that is statistically learned from standard text data is used to generate paraphrase variants. LM probabilities are then estimated by maximizing their marginal probability. Significant error rate reductions of 0.5%-0.6% absolute were obtained on a state-of-the-art conversational telephone speech recognition task using a paraphrastic multi-level LM modelling both word and phrase sequences.
Abstract:
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.
Abstract:
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes can model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models, this allows us to automatically describe functions in simple terms. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains.
Abstract:
Solid-state dye-sensitized solar cells rely on effective infiltration of a solid-state hole-transporting material into the pores of a nanoporous TiO2 network to allow for dye regeneration and hole extraction. Using microsecond transient absorption spectroscopy and femtosecond photoluminescence upconversion spectroscopy, the hole-transfer yield from the dye to the hole-transporting material 2,2′,7,7′-tetrakis(N,N-di-p-methoxyphenylamine)-9,9′-spirobifluorene (spiro-OMeTAD) is shown to rise rapidly with higher pore-filling fractions as the dye-coated pore surface is increasingly covered with hole-transporting material. Once a pore-filling fraction of ≈30% is reached, further increases do not significantly change the hole-transfer yield. Using simple models of infiltration of spiro-OMeTAD into the TiO2 porous network, it is shown that this pore-filling fraction is less than the amount required to cover the dye surface with at least a single layer of hole-transporting material, suggesting that charge diffusion through the dye monolayer network precedes transfer to the hole-transporting material. Comparison of these results with device parameters shows that improvements of the power-conversion efficiency beyond ≈30% pore filling are not caused by a higher hole-transfer yield, but by a higher charge-collection efficiency, which is found to occur in steps. The observed sharp onsets in photocurrent and power-conversion efficiencies with increasing pore-filling fraction correlate well with percolation theory, predicting the points of cohesive pathway formation in successive spiro-OMeTAD layers adhered to the pore walls. From percolation theory it is predicted that, for standard mesoporous TiO2 with 20 nm pore size, the photocurrent should show no further improvement beyond an ≈83% pore-filling fraction. Solid-state dye-sensitized solar cells capable of complete hole transfer with pore-filling fractions as low as ≈30% are demonstrated. Improvements of device efficiencies beyond ≈30% are explained by a stepwise increase in charge-collection efficiency in agreement with percolation theory. Furthermore, it is predicted that, for a 20 nm pore size, the photocurrent reaches a maximum at ≈83% pore-filling fraction. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Abstract:
We present a method for producing dense Active Appearance Models (AAMs), suitable for video-realistic synthesis. To this end we estimate a joint alignment of all training images using a set of pairwise registrations and ensure that these pairwise registrations are only calculated between similar images. This is achieved by defining a graph on the image set whose edge weights correspond to registration errors and computing a bounded diameter minimum spanning tree (BDMST). Dense optical flow is used to compute pairwise registration and we introduce a flow refinement method to align small scale texture. Once registration between training images has been established we propose a method to add vertices to the AAM in a way that minimises error between the observed flow fields and a flow field interpolated between the AAM mesh points. We demonstrate a significant improvement in model compactness using the proposed method and show it dealing with cases that are problematic for current state-of-the-art approaches.
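The graph construction described in this abstract can be illustrated with a small sketch. It is a simplification under stated assumptions: it computes an ordinary minimum spanning tree with Prim's algorithm and does not enforce the diameter bound that a BDMST requires. The function name, the edge dictionary format, and the error values are hypothetical and do not come from the paper.

```python
import heapq


def minimum_spanning_tree(num_images, errors):
    """Return MST edges for a graph whose weights are registration errors.

    errors is a hypothetical dict mapping an image pair (i, j) to the
    pairwise registration error. Plain Prim's algorithm: no diameter bound.
    """
    # Build an undirected adjacency list of (error, source, destination).
    adjacency = {i: [] for i in range(num_images)}
    for (i, j), err in errors.items():
        adjacency[i].append((err, i, j))
        adjacency[j].append((err, j, i))

    visited = {0}
    heap = list(adjacency[0])
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < num_images:
        err, src, dst = heapq.heappop(heap)
        if dst in visited:
            continue
        # The cheapest edge leaving the visited set joins the tree.
        visited.add(dst)
        tree.append((src, dst, err))
        for edge in adjacency[dst]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)
    return tree


example_errors = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 5.0, (2, 3): 1.5, (1, 3): 4.0}
tree = minimum_spanning_tree(4, example_errors)
```

Registering only along tree edges means each image is aligned through a chain of low-error, similar-looking neighbours. Bounding the tree diameter, as the abstract describes, would additionally limit how long such a chain can become.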