364 results for mixture distribution
in Queensland University of Technology - ePrints Archive
Abstract:
Autonomous navigation and picture compilation tasks require robust feature descriptions or models. Given the non-Gaussian nature of sensor observations, it will be shown that Gaussian mixture models provide a general probabilistic representation that allows analytical solutions to the update and prediction operations in the general Bayesian filtering problem. Each operation in the Bayesian filter for Gaussian mixture models multiplicatively increases the number of parameters in the representation, leading to the need for a re-parameterisation step. A computationally efficient re-parameterisation step will be demonstrated, resulting in a compact and accurate estimate of the true distribution.
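The multiplicative growth described above can be illustrated with a minimal one-dimensional sketch. It assumes both the prior and the measurement likelihood are Gaussian mixtures given as lists of (weight, mean, variance) tuples; the function names and example numbers are hypothetical and are not taken from the paper, and the re-parameterisation step itself is not shown.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def multiply_mixtures(prior, likelihood):
    """Bayesian update: multiply two 1-D Gaussian mixtures and renormalise.

    Each mixture is a list of (weight, mean, variance) tuples. The product of
    an N-component and an M-component mixture has N * M components.
    """
    posterior = []
    for w1, m1, v1 in prior:
        for w2, m2, v2 in likelihood:
            # Product of two Gaussians is a scaled Gaussian.
            var = 1.0 / (1.0 / v1 + 1.0 / v2)
            mean = var * (m1 / v1 + m2 / v2)
            scale = normal_pdf(m1, m2, v1 + v2)
            posterior.append((w1 * w2 * scale, mean, var))
    total = sum(w for w, _, _ in posterior)
    return [(w / total, m, v) for w, m, v in posterior]

# Illustrative values only.
prior = [(0.5, -1.0, 1.0), (0.5, 2.0, 0.5)]
likelihood = [(0.7, 0.0, 1.0), (0.3, 3.0, 2.0)]
posterior = multiply_mixtures(prior, likelihood)
print(len(posterior))  # 2 * 2 = 4 components after a single update
```

Repeating the update with the same two-component likelihood doubles the component count each time, which is why some reduction step is needed in practice.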
Abstract:
Fractional anisotropy (FA), a very widely used measure of fiber integrity based on diffusion tensor imaging (DTI), is a problematic concept because it is influenced by several quantities, including the number of dominant fiber directions within each voxel, each fiber's anisotropy, and partial volume effects from neighboring gray matter. With high angular resolution diffusion imaging (HARDI) and the tensor distribution function (TDF), one can reconstruct multiple underlying fibers per voxel and their individual anisotropy measures by representing the diffusion profile as a probabilistic mixture of tensors. We found that FA, when compared with TDF-derived anisotropy measures, correlates poorly with individual fiber anisotropy, and may sub-optimally detect disease processes that affect myelination. By contrast, mean diffusivity (MD) as defined in standard DTI appears to be more accurate. Overall, we argue that novel measures derived from the TDF approach may yield more sensitive and accurate information than DTI-derived measures.
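The sensitivity of FA to the number of fiber directions can be seen from the standard eigenvalue definitions of FA and MD. The sketch below is only an illustration: it assumes, as a simplification not taken from the paper, that a single-tensor fit to two equally weighted perpendicular fibers behaves like the average of the two fiber tensors, and the eigenvalues are made-up numbers.

```python
import math

def mean_diffusivity(eigenvalues):
    """MD: the mean of the three diffusion tensor eigenvalues."""
    return sum(eigenvalues) / 3.0

def fractional_anisotropy(eigenvalues):
    """FA: sqrt(3/2) * ||lambda - MD|| / ||lambda||, which lies in [0, 1]."""
    md = mean_diffusivity(eigenvalues)
    num = sum((lam - md) ** 2 for lam in eigenvalues)
    den = sum(lam ** 2 for lam in eigenvalues)
    if den == 0.0:
        return 0.0
    return math.sqrt(1.5 * num / den)

# Hypothetical eigenvalues (arbitrary units) for one coherent fiber.
single_fiber = (1.7, 0.3, 0.3)
# Averaging diag(1.7, 0.3, 0.3) with diag(0.3, 1.7, 0.3) gives diag(1.0, 1.0, 0.3).
crossing_fibers = (1.0, 1.0, 0.3)

fa_single = fractional_anisotropy(single_fiber)
fa_crossing = fractional_anisotropy(crossing_fibers)
md_single = mean_diffusivity(single_fiber)
md_crossing = mean_diffusivity(crossing_fibers)
print(fa_single, fa_crossing, md_single, md_crossing)
```

Under this simplification FA drops in the crossing voxel even though each fiber is as anisotropic as before, while MD is unchanged because averaging preserves the trace.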
Abstract:
Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.
Abstract:
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via the Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test-based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results reflect uncertainty in the final model and report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally lightweight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
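The idea of a prior that pushes surplus component weights towards zero can be sketched with a symmetric Dirichlet prior on the mixture weights. This is not the Zmix algorithm or its R interface: the function name, the concentration values, and the threshold are assumptions chosen purely for illustration, and the sketch draws from the prior only rather than from a posterior.

```python
import random

def sample_dirichlet(alpha, k, rng):
    """Draw one weight vector from a symmetric Dirichlet(alpha, ..., alpha).

    Uses the standard construction: normalise k independent Gamma(alpha, 1) draws.
    """
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(draws)
        if total > 0.0:  # guard against all draws underflowing to zero
            return [d / total for d in draws]

rng = random.Random(0)
k = 10  # deliberately more components than any plausible truth (overfitting)

# Small alpha (illustrative): mass tends to concentrate on a few components.
sparse_weights = sample_dirichlet(0.1, k, rng)
# Large alpha (illustrative): mass tends to spread evenly across components.
even_weights = sample_dirichlet(10.0, k, rng)

def n_active(weights, threshold=0.01):
    """Count components whose weight exceeds a hypothetical threshold."""
    return sum(1 for w in weights if w > threshold)

print(n_active(sparse_weights), n_active(even_weights))
```

With a small concentration parameter most draws leave several components with negligible weight, which is the behaviour an overfitting approach relies on to signal how many groups are actually needed.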
Abstract:
In this paper, we examine approaches to estimating a Bayesian mixture model at both single and multiple time points for a sample of actual and simulated aerosol particle size distribution (PSD) data. For estimation of a mixture model at a single time point, we use Reversible Jump Markov Chain Monte Carlo (RJMCMC) to estimate the mixture model parameters, including the number of components, which is assumed to be unknown. We compare the results of this approach with those of an estimation method commonly used in the aerosol physics literature. As PSD data are often measured over time, frequently at short intervals, we also examine the use of an informative prior for estimation of the mixture parameters that takes into account the correlated nature of the parameters. The Bayesian mixture model offers a promising approach, providing advantages in both estimation and inference.
Abstract:
Many biological environments are crowded by macromolecules, organelles and cells, which can impede the transport of other cells and molecules. Previous studies have sought to describe these effects using either random walk models or fractional order diffusion equations. Here we examine the transport of both a single agent and a population of agents through an environment containing obstacles of varying size and shape, whose relative densities are drawn from a specified distribution. Our simulation results for a single agent indicate that smaller obstacles are more effective at retarding transport than larger obstacles; these findings are consistent with our simulations of the collective motion of populations of agents. To explore whether these kinds of stochastic random walk simulations can be described using a fractional order diffusion equation framework, we calibrate the solution of such a differential equation to our averaged agent density information. Our approach suggests that these commonly used differential equation models ought to be used with care, since we are unable to match the solution of a fractional order diffusion equation to our data in a consistent fashion over a finite time period.
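A minimal sketch of a random walk through a crowded environment is given below. It is far simpler than the simulations described above: it assumes a square lattice, single-site obstacles placed independently with a fixed density, and an agent whose move is aborted whenever the target site is occupied. All names and parameter values are hypothetical.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def make_obstacles(half_width, density, rng):
    """Place single-site obstacles at random, leaving the origin free."""
    sites = range(-half_width, half_width + 1)
    return {(x, y) for x in sites for y in sites
            if (x, y) != (0, 0) and rng.random() < density}

def walk(n_steps, obstacles, rng, start=(0, 0)):
    """Unbiased lattice walk; a move onto an obstacle site is aborted."""
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = rng.choice(STEPS)
        target = (x + dx, y + dy)
        if target not in obstacles:
            x, y = target
        path.append((x, y))
    return path

def mean_squared_displacement(n_walkers, n_steps, obstacles, seed=0):
    """Average squared distance from the origin after n_steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        x, y = walk(n_steps, obstacles, rng)[-1]
        total += x * x + y * y
    return total / n_walkers

rng = random.Random(42)
crowded = make_obstacles(half_width=30, density=0.2, rng=rng)
print(mean_squared_displacement(200, 200, set()))    # obstacle-free reference
print(mean_squared_displacement(200, 200, crowded))  # same walk with obstacles
```

Comparing the two averages gives a rough sense of how much the obstacle field slows the walk; studying obstacle size and shape would need obstacles that span several sites, which this sketch does not attempt.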