39 resultados para Maximum-likelihood-estimation


Relevância:

80.00% 80.00%

Publicador:

Resumo:

For many realistic scenarios, there are multiple factors that affect the clean speech signal. In this work approaches to handling two such factors, speaker and background noise differences, simultaneously are described. A new adaptation scheme is proposed. Here the acoustic models are first adapted to the target speaker via an MLLR transform. This is followed by adaptation to the target noise environment via model-based vector Taylor series (VTS) compensation. These speaker and noise transforms are jointly estimated, using maximum likelihood. Experiments on the AURORA4 task demonstrate that this adaptation scheme provides improved performance over VTS-based noise adaptation. In addition, this framework enables the speech and noise to be factorised, allowing the speaker transform estimated in one noise condition to be successfully used in a different noise condition. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recently there has been interest in structured discriminative models for speech recognition. In these models sentence posteriors are directly modelled, given a set of features extracted from the observation sequence, and hypothesised word sequence. In previous work these discriminative models have been combined with features derived from generative models for noise-robust speech recognition for continuous digits. This paper extends this work to medium to large vocabulary tasks. The form of the score-space extracted using the generative models, and parameter tying of the discriminative model, are both discussed. Update formulae for both conditional maximum likelihood and minimum Bayes' risk training are described. Experimental results are presented on small and medium to large vocabulary noise-corrupted speech recognition tasks: AURORA 2 and 4. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose a novel model for the spatio-temporal clustering of trajectories based on motion, which applies to challenging street-view video sequences of pedestrians captured by a mobile camera. A key contribution of our work is the introduction of novel probabilistic region trajectories, motivated by the non-repeatability of segmentation of frames in a video sequence. Hierarchical image segments are obtained by using a state-of-the-art hierarchical segmentation algorithm, and connected from adjacent frames in a directed acyclic graph. The region trajectories and measures of confidence are extracted from this graph using a dynamic programming-based optimisation. Our second main contribution is a Bayesian framework with a twofold goal: to learn the optimal, in a maximum likelihood sense, Random Forests classifier of motion patterns based on video features, and construct a unique graph from region trajectories of different frames, lengths and hierarchical levels. Finally, we demonstrate the use of Isomap for effective spatio-temporal clustering of the region trajectories of pedestrians. We support our claims with experimental results on new and existing challenging video sequences. © 2011 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. © 2012 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Antibodies are known to be essential in controlling Salmonella infection, but their exact role remains elusive. We recently developed an in vitro model to investigate the relative efficiency of four different human immunoglobulin G (IgG) subclasses in modulating the interaction of the bacteria with human phagocytes. Our results indicated that different IgG subclasses affect the efficacy of Salmonella uptake by human phagocytes. In this study, we aim to quantify the effects of IgG on intracellular dynamics of infection by combining distributions of bacterial numbers per phagocyte observed by fluorescence microscopy with a mathematical model that simulates the in vitro dynamics. We then use maximum likelihood to estimate the model parameters and compare them across IgG subclasses. The analysis reveals heterogeneity in the division rates of the bacteria, strongly suggesting that a subpopulation of intracellular Salmonella, while visible under the microscope, is not dividing. Clear differences in the observed distributions among the four IgG subclasses are best explained by variations in phagocytosis and intracellular dynamics. We propose and compare potential factors affecting the replication and death of bacteria within phagocytes, and we discuss these results in the light of recent findings on dormancy of Salmonella.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Electron multiplication charge-coupled devices (EMCCD) are widely used for photon counting experiments and measurements of low intensity light sources, and are extensively employed in biological fluorescence imaging applications. These devices have a complex statistical behaviour that is often not fully considered in the analysis of EMCCD data. Robust and optimal analysis of EMCCD images requires an understanding of their noise properties, in particular to exploit fully the advantages of Bayesian and maximum-likelihood analysis techniques, whose value is increasingly recognised in biological imaging for obtaining robust quantitative measurements from challenging data. To improve our own EMCCD analysis and as an effort to aid that of the wider bioimaging community, we present, explain and discuss a detailed physical model for EMCCD noise properties, giving a likelihood function for image counts in each pixel for a given incident intensity, and we explain how to measure the parameters for this model from various calibration images. © 2013 Hirsch et al.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We study unsupervised learning in a probabilistic generative model for occlusion. The model uses two types of latent variables: one indicates which objects are present in the image, and the other how they are ordered in depth. This depth order then determines how the positions and appearances of the objects present, specified in the model parameters, combine to form the image. We show that the object parameters can be learnt from an unlabelled set of images in which objects occlude one another. Exact maximum-likelihood learning is intractable. However, we show that tractable approximations to Expectation Maximization (EM) can be found if the training images each contain only a small number of objects on average. In numerical experiments it is shown that these approximations recover the correct set of object parameters. Experiments on a novel version of the bars test using colored bars, and experiments on more realistic data, show that the algorithm performs well in extracting the generating causes. Experiments based on the standard bars benchmark test for object learning show that the algorithm performs well in comparison to other recent component extraction approaches. The model and the learning algorithm thus connect research on occlusion with the research field of multiple-causes component extraction methods.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the evolution of the variance. Moreover, functional parameters are usually learned by maximum likelihood, which can lead to over-fitting. To address these problems we introduce GP-Vol, a novel non-parametric model for time-changing variances based on Gaussian Processes. This new model can capture highly flexible functional relationships for the variances. Furthermore, we introduce a new online algorithm for fast inference in GP-Vol. This method is much faster than current offline inference procedures and it avoids overfitting problems by following a fully Bayesian approach. Experiments with financial data show that GP-Vol performs significantly better than current standard alternatives.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The importance of properly exploiting a classifier's inherent geometric characteristics when developing a classification methodology is emphasized as a prerequisite to achieving near optimal performance when carrying out thematic mapping. When used properly, it is argued that the long-standing maximum likelihood approach and the more recent support vector machine can perform comparably. Both contain the flexibility to segment the spectral domain in such a manner as to match inherent class separations in the data, as do most reasonable classifiers. The choice of which classifier to use in practice is determined largely by preference and related considerations, such as ease of training, multiclass capabilities, and classification cost. © 1980-2012 IEEE.