39 resultados para maximum likelihood analysis


Relevância:

80.00% 80.00%

Publicador:

Resumo:

For many realistic scenarios, there are multiple factors that affect the clean speech signal. In this work approaches to handling two such factors, speaker and background noise differences, simultaneously are described. A new adaptation scheme is proposed. Here the acoustic models are first adapted to the target speaker via an MLLR transform. This is followed by adaptation to the target noise environment via model-based vector Taylor series (VTS) compensation. These speaker and noise transforms are jointly estimated, using maximum likelihood. Experiments on the AURORA4 task demonstrate that this adaptation scheme provides improved performance over VTS-based noise adaptation. In addition, this framework enables the speech and noise to be factorised, allowing the speaker transform estimated in one noise condition to be successfully used in a different noise condition. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recently there has been interest in structured discriminative models for speech recognition. In these models sentence posteriors are directly modelled, given a set of features extracted from the observation sequence, and hypothesised word sequence. In previous work these discriminative models have been combined with features derived from generative models for noise-robust speech recognition for continuous digits. This paper extends this work to medium to large vocabulary tasks. The form of the score-space extracted using the generative models, and parameter tying of the discriminative model, are both discussed. Update formulae for both conditional maximum likelihood and minimum Bayes' risk training are described. Experimental results are presented on small and medium to large vocabulary noise-corrupted speech recognition tasks: AURORA 2 and 4. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose a novel model for the spatio-temporal clustering of trajectories based on motion, which applies to challenging street-view video sequences of pedestrians captured by a mobile camera. A key contribution of our work is the introduction of novel probabilistic region trajectories, motivated by the non-repeatability of segmentation of frames in a video sequence. Hierarchical image segments are obtained by using a state-of-the-art hierarchical segmentation algorithm, and connected from adjacent frames in a directed acyclic graph. The region trajectories and measures of confidence are extracted from this graph using a dynamic programming-based optimisation. Our second main contribution is a Bayesian framework with a twofold goal: to learn the optimal, in a maximum likelihood sense, Random Forests classifier of motion patterns based on video features, and construct a unique graph from region trajectories of different frames, lengths and hierarchical levels. Finally, we demonstrate the use of Isomap for effective spatio-temporal clustering of the region trajectories of pedestrians. We support our claims with experimental results on new and existing challenging video sequences. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. © 2012 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method with two numerical examples. © 2012 IFAC.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we present an expectation-maximisation (EM) algorithm for maximum likelihood estimation in multiple target models (MTT) with Gaussian linear state-space dynamics. We show that estimation of sufficient statistics for EM in a single Gaussian linear state-space model can be extended to the MTT case along with a Monte Carlo approximation for inference of unknown associations of targets. The stochastic approximation EM algorithm that we present here can be used along with any Monte Carlo method which has been developed for tracking in MTT models, such as Markov chain Monte Carlo and sequential Monte Carlo methods. We demonstrate the performance of the algorithm with a simulation. © 2012 ISIF (Intl Society of Information Fusi).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the evolution of the variance. Moreover, functional parameters are usually learned by maximum likelihood, which can lead to over-fitting. To address these problems we introduce GP-Vol, a novel non-parametric model for time-changing variances based on Gaussian Processes. This new model can capture highly flexible functional relationships for the variances. Furthermore, we introduce a new online algorithm for fast inference in GP-Vol. This method is much faster than current offline inference procedures and it avoids overfitting problems by following a fully Bayesian approach. Experiments with financial data show that GP-Vol performs significantly better than current standard alternatives.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present novel batch and online (sequential) versions of the expectation-maximisation (EM) algorithm for inferring the static parameters of a multiple target tracking (MTT) model. Online EM is of particular interest as it is a more practical method for long data sets since in batch EM, or a full Bayesian approach, a complete browse of the data is required between successive parameter updates. Online EM is also suited to MTT applications that demand real-time processing of the data. Performance is assessed in numerical examples using simulated data for various scenarios. For batch estimation our method significantly outperforms an existing gradient based maximum likelihood technique, which we show to be significantly biased. © 2014 Springer Science+Business Media New York.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The importance of properly exploiting a classifier's inherent geometric characteristics when developing a classification methodology is emphasized as a prerequisite to achieving near optimal performance when carrying out thematic mapping. When used properly, it is argued that the long-standing maximum likelihood approach and the more recent support vector machine can perform comparably. Both contain the flexibility to segment the spectral domain in such a manner as to match inherent class separations in the data, as do most reasonable classifiers. The choice of which classifier to use in practice is determined largely by preference and related considerations, such as ease of training, multiclass capabilities, and classification cost. © 1980-2012 IEEE.