16 results for common factor models
Abstract:
A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $\mathbf{Y}$ is modeled as a linear superposition, $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
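A minimal generative sketch may make this construction concrete: the binary sparsity pattern of the loading matrix is drawn from the IBP, the non-zero weights and the factors are Gaussian, and the data are a noisy linear combination. This is only an illustrative forward sample under assumed settings (concentration alpha, Gaussian weights and noise, illustrative sizes), not the paper's model specification or inference code.

import numpy as np

rng = np.random.default_rng(0)

def sample_ibp(n_rows, alpha, rng):
    """Draw a binary matrix from the Indian Buffet Process with concentration alpha."""
    columns = []                                        # one list of 0/1 entries per latent feature
    for i in range(n_rows):
        for col in columns:                             # existing features: keep with prob m_k / (i + 1)
            col.append(1 if rng.random() < sum(col) / (i + 1) else 0)
        for _ in range(rng.poisson(alpha / (i + 1))):   # new features for this row
            columns.append([0] * i + [1])
    if not columns:
        return np.zeros((n_rows, 0), dtype=int)
    return np.array(columns).T                          # rows = observed dimensions, cols = features

D, N, alpha, noise_std = 20, 100, 2.0, 0.1              # illustrative sizes and settings
Z = sample_ibp(D, alpha, rng)                           # sparsity pattern; the number of features is not fixed
K = Z.shape[1]
G = Z * rng.normal(size=(D, K))                         # sparse loading matrix: IBP mask times Gaussian weights
X = rng.normal(size=(K, N))                             # latent factors
Y = G @ X + noise_std * rng.normal(size=(D, N))         # observed data
print("Y shape:", Y.shape, "  sampled number of latent features:", K)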
Abstract:
The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when compared with other methods for inferring sparsity. We focus on unsupervised latent variable models, and develop L1 minimising factor models, Bayesian variants of "L1", and Bayesian models with a stronger L0-like sparsity induced through spike-and-slab distributions. These spike-and-slab Bayesian factor models encourage sparsity while accounting for uncertainty in a principled manner and avoiding unnecessary shrinkage of non-zero values. We demonstrate on a number of data sets that in practice spike-and-slab Bayesian methods outperform L1 minimisation, even on a computational budget. We thus highlight the need to re-assess the wide use of L1 methods in sparsity-reliant applications, particularly when we care about generalising to previously unseen data, and provide an alternative that, over many varying conditions, provides improved generalisation performance.
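The contrast between L1 shrinkage and spike-and-slab inference shows up already in a single-coefficient example: under a Gaussian likelihood, the Laplace (L1) MAP estimate is soft thresholding, which shrinks every non-zero value by a fixed amount, whereas the exact posterior mean under a Bernoulli-Gaussian spike-and-slab prior gates small observations towards zero but leaves large ones nearly unshrunk. The settings below are illustrative, not the models or hyperparameters from the paper.

import numpy as np
from scipy.stats import norm

sigma_n = 1.0      # observation noise standard deviation (illustrative)
lam = 1.0          # Laplace (L1) rate parameter (illustrative)
pi_incl = 0.2      # spike-and-slab prior inclusion probability (illustrative)
sigma_s = 3.0      # slab standard deviation (illustrative)

def l1_map(y):
    """MAP estimate under a Laplace prior = soft thresholding (L1 shrinkage)."""
    return np.sign(y) * np.maximum(np.abs(y) - lam * sigma_n**2, 0.0)

def spike_slab_posterior_mean(y):
    """Exact posterior mean under a Bernoulli-Gaussian spike-and-slab prior."""
    m_slab = norm.pdf(y, 0.0, np.sqrt(sigma_s**2 + sigma_n**2))   # marginal if the coefficient is "on"
    m_spike = norm.pdf(y, 0.0, sigma_n)                           # marginal if the coefficient is exactly zero
    p_incl = pi_incl * m_slab / (pi_incl * m_slab + (1 - pi_incl) * m_spike)
    return p_incl * y * sigma_s**2 / (sigma_s**2 + sigma_n**2)

for y in (0.5, 2.0, 5.0):
    print(f"y = {y:4.1f}   L1 estimate = {l1_map(y):5.2f}   spike-and-slab mean = {spike_slab_posterior_mean(y):5.2f}")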
Abstract:
Campylobacter jejuni is one of the most common causes of acute enteritis in the developed world. The consumption of contaminated poultry, where C. jejuni is believed to be a commensal organism, is a major risk factor. However, the dynamics of this colonization process in commercially reared chickens are still poorly understood. Quantification of these dynamics of infection at an individual level is vital to understand transmission within populations and formulate new control strategies. There are multiple potential routes of introduction of C. jejuni into a commercial flock. Introduction is followed by a rapid increase in environmental levels of C. jejuni and in the level of colonization of individual broilers. Recent experimental and epidemiological evidence suggests that the celerity of this process could be masking a complex pattern of colonization and extinction of bacterial strains within individual hosts. Despite the rapidity of colonization, experimental transmission studies exhibit a highly variable and unexplained delay time in the initial stages of the process. We review past models of transmission of C. jejuni in broilers and consider simple modifications, motivated by the plausible biological mechanisms of clearance and latency, which could account for this delay. We show how simple mathematical models can be used to guide the focus of experimental studies by providing testable predictions based on our hypotheses. We conclude by suggesting that competition experiments could be used to further understand the dynamics and mechanisms underlying the colonization process. The population models for such competition processes have been extensively studied in other ecological and evolutionary contexts. However, C. jejuni can potentially adapt phenotypically through phase variation in gene expression, leading to a unification of ecological and evolutionary time-scales. For a theoretician, the colonization dynamics of C. jejuni offer an experimental system to explore these 'phylodynamics', the synthesis of population dynamics and evolutionary biology.
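As a purely illustrative companion to the modelling discussion, the toy flock-level sketch below adds a latent (exposed) class and clearance back to susceptibility to a simple transmission model; the latent class delays the apparent onset of colonization. Both the structure and the parameter values are assumptions for illustration, not the models reviewed in the paper.

# Toy flock-level colonization model with a latent (exposed) class and clearance back to
# the susceptible state; the latent class delays the apparent onset of colonization.
beta, sigma, gamma = 2.0, 0.5, 0.1     # transmission rate, 1/latent period, clearance rate (per day), illustrative
N = 20000.0                            # birds in the flock (illustrative)
S, E, C = N - 1.0, 0.0, 1.0            # a single colonized bird introduced at day 0
steps_per_day = 100
dt = 1.0 / steps_per_day
for day in range(1, 41):
    for _ in range(steps_per_day):     # forward Euler integration
        dS = gamma * C - beta * S * C / N
        dE = beta * S * C / N - sigma * E
        dC = sigma * E - gamma * C
        S, E, C = S + dt * dS, E + dt * dE, C + dt * dC
    if day % 5 == 0:
        print(f"day {day:2d}: colonized fraction = {C / N:.3f}")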
Abstract:
Observation shows that the watershed-scale models in common use in the United States (US) differ from those used in the European Union (EU). The question arises whether the difference in model use is due to familiarity or necessity. Do conditions in each continent require the use of unique watershed-scale models, or are models sufficiently customizable that independent development of models that serve the same purpose (e.g., continuous/event-based, lumped/distributed, field/watershed-scale) is unnecessary? This paper explores this question through the application of two continuous, semi-distributed, watershed-scale models (HSPF and HBV-INCA) to a rural catchment in southern England. The Hydrological Simulation Program-Fortran (HSPF) model is in wide use in the United States. The Integrated Catchments (INCA) model has been used extensively in Europe, and particularly in England. The results of simulation from both models are presented herein. Both models performed adequately according to the criteria set for them. This suggests that there was not a necessity to have alternative, yet similar, models. This partially supports a general conclusion that resources should be devoted towards training in the use of existing models rather than development of new models that serve a similar purpose to existing models. A further comparison of water quality predictions from both models may alter this conclusion.
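For concreteness, a typical goodness-of-fit criterion when comparing simulated and observed streamflow is the Nash-Sutcliffe efficiency; the sketch below computes it for two hypothetical model runs. Both the criterion and the data here are assumptions for illustration, not necessarily the criteria or results reported in the paper.

import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

# Hypothetical daily flows (m^3/s) from the two models against the same gauge record.
obs  = [1.2, 1.5, 2.1, 3.8, 2.9, 2.0, 1.6]
hspf = [1.1, 1.4, 2.3, 3.5, 3.1, 2.2, 1.5]
inca = [1.3, 1.6, 1.9, 3.2, 2.6, 2.1, 1.8]
print("HSPF NSE:", round(nash_sutcliffe(obs, hspf), 3))
print("INCA NSE:", round(nash_sutcliffe(obs, inca), 3))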
Abstract:
A workshop on the computational fluid dynamics (CFD) prediction of shock boundary-layer interactions (SBLIs) was held at the 48th AIAA Aerospace Sciences Meeting. As part of the workshop, numerous CFD analysts submitted solutions to four experimentally measured SBLIs. This paper describes the assessment of the CFD predictions. The assessment includes an uncertainty analysis of the experimental data, the definition of an error metric, and the application of that metric to the CFD solutions. The CFD solutions provided very similar levels of error, and in general it was difficult to discern clear trends in the data. For the Reynolds-Averaged Navier-Stokes (RANS) methods, the choice of turbulence model appeared to be the largest factor in solution accuracy. Large-eddy simulation methods produced error levels similar to RANS methods but provided superior predictions of normal stresses.
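As an illustration of the kind of comparison involved, the sketch below computes a generic discrepancy measure that discounts differences lying within the experimental uncertainty band. The metric, the profiles, and the uncertainties are hypothetical and are not the error metric defined by the workshop.

import numpy as np

def banded_error(cfd, measured, uncertainty):
    """Mean discrepancy outside the experimental uncertainty band, normalised by that band.
    A generic illustration only, not the SBLI workshop's error metric."""
    cfd, measured, u = np.asarray(cfd), np.asarray(measured), np.asarray(uncertainty)
    excess = np.maximum(np.abs(cfd - measured) - u, 0.0)
    return float(np.mean(excess / u))

# Hypothetical wall-pressure profile through an interaction (arbitrary units).
exp_p  = [1.00, 1.05, 1.30, 1.80, 2.10, 2.20]
u_p    = [0.02, 0.02, 0.05, 0.08, 0.06, 0.04]
rans_p = [1.00, 1.04, 1.38, 1.95, 2.05, 2.18]
les_p  = [1.01, 1.06, 1.33, 1.86, 2.12, 2.21]
print("RANS error:", round(banded_error(rans_p, exp_p, u_p), 2))
print("LES  error:", round(banded_error(les_p,  exp_p, u_p), 2))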
Abstract:
Model-based compensation schemes are a powerful approach for noise robust speech recognition. Recently there have been a number of investigations into adaptive training and estimating the noise models used for model adaptation. This paper examines the use of EM-based schemes for both canonical models and noise estimation, including discriminative adaptive training. One issue that arises when estimating the noise model is a mismatch between the noise estimation approximation and the final model compensation scheme. This paper proposes FA-style compensation, where this mismatch is eliminated, though at the expense of sensitivity to the initial noise estimates. EM-based discriminative adaptive training is evaluated on in-car and Aurora4 tasks. FA-style compensation is then evaluated in an incremental mode on the in-car task. © 2011 IEEE.
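A simplified analogue of EM-based noise estimation can be written down in the linear domain (rather than the log-spectral domain used in model-based compensation): given a known clean-signal GMM and additive Gaussian noise with known variance but unknown mean, EM alternates between component responsibilities under the compensated model and an update of the noise mean. The sketch below is this toy version only, with assumed parameters; it is not the paper's compensation or training scheme.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Known 1-D "clean" GMM and an additive noise mean to be estimated (all values illustrative).
weights = np.array([0.5, 0.5])
means = np.array([-2.0, 3.0])
variances = np.array([1.0, 0.5])
noise_var, true_noise_mean = 0.4, 1.5

comp = rng.choice(2, size=2000, p=weights)
clean = rng.normal(means[comp], np.sqrt(variances[comp]))
y = clean + rng.normal(true_noise_mean, np.sqrt(noise_var), size=clean.shape)

mu_n = 0.0                                   # initial noise-mean estimate
for _ in range(20):
    # E-step: responsibilities under the compensated model y | k ~ N(mu_k + mu_n, var_k + noise_var)
    lik = weights * norm.pdf(y[:, None], means + mu_n, np.sqrt(variances + noise_var))
    gamma = lik / lik.sum(axis=1, keepdims=True)
    # Posterior mean of the noise sample for each (observation, component) pair
    post_n = mu_n + noise_var / (variances + noise_var) * (y[:, None] - means - mu_n)
    # M-step: responsibility-weighted average gives the new noise-mean estimate
    mu_n = float(np.sum(gamma * post_n) / len(y))

print("estimated noise mean:", round(mu_n, 3), "  true value:", true_noise_mean)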
Abstract:
Upheaval buckling (UHB) is a common design issue for high-temperature buried pipelines. This paper highlights some of the key issues affecting out-of-straightness (OOS) assessment of pipelines. The following factors are discussed: uplift resistance soil models, uplift resistance in cohesive soils, uplift mobilisation, ratcheting, uplift resistance at low H/D ratios, and the correct methodology for load factor selection. A framework for determining ratcheting mobilisation is proposed. Further research is required to verify and validate this proposed framework. UHB assessments of three different diameter pipelines were carried out using the finite element package SAGE PROFILE, incorporating pipeline mobilisation, and the results are compared with the semi-analytical formulation proposed by Palmer et al. (1990). The paper also presents a summary of as-laid pipeline features based on projects over the past 10 years.
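For orientation, one commonly quoted form of peak uplift resistance for a pipeline buried in sand is R = γ' H D (1 + f H/D); the sketch below evaluates it over a range of cover depths. The choice of formula, the definition of H, and all input values are assumptions for illustration and are not necessarily the soil models or parameters used in the paper.

# One commonly quoted peak uplift resistance model for a pipeline buried in sand
# (R = gamma' * H * D * (1 + f * H / D)); definitions of H (to crown or to centre)
# vary between formulations. All values are hypothetical.
def uplift_resistance(gamma_sub, H, D, f):
    return gamma_sub * H * D * (1.0 + f * H / D)   # resistance per unit length, N/m

gamma_sub = 9.0e3        # submerged unit weight of backfill, N/m^3 (hypothetical)
D = 0.35                 # pipeline outside diameter, m (hypothetical)
f = 0.5                  # uplift coefficient for dense sand (hypothetical)
for H in (0.5, 1.0, 1.5):                          # cover depth, m
    R = uplift_resistance(gamma_sub, H, D, f)
    print(f"H/D = {H / D:4.2f}  uplift resistance = {R / 1e3:5.2f} kN/m")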
Abstract:
Vector Taylor Series (VTS) model-based compensation is a powerful approach for noise robust speech recognition. An important extension to this approach is VTS adaptive training (VAT), which allows canonical models to be estimated on diverse noise-degraded training data. These canonical models can be estimated using EM-based approaches, allowing simple extensions to discriminative VAT (DVAT). However, to ensure a diagonal corrupted speech covariance matrix, the Jacobian (loading matrix) relating the noise and clean speech is diagonalised. In this work an approach for yielding optimal diagonal loading matrices, based on minimising the expected KL-divergence between the distributions obtained with the diagonal loading matrix and the "correct" distributions, is proposed. The performance of DVAT using the standard and optimal diagonalisation was evaluated on both in-car collected data and the Aurora4 task. © 2012 IEEE.
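A closely related textbook fact gives some intuition for diagonalisation choices: among diagonal Gaussians with the same mean, the one minimising KL(p||q) from a full-covariance Gaussian p simply takes the diagonal of the covariance. The sketch below checks this numerically with a randomly generated covariance; it is not the paper's expected-KL derivation for the loading matrix.

import numpy as np

rng = np.random.default_rng(2)

def kl_full_vs_diag(Sigma, d):
    """KL( N(mu, Sigma) || N(mu, diag(d)) ) for equal means."""
    k = Sigma.shape[0]
    return 0.5 * (np.sum(np.diag(Sigma) / d) - k
                  + np.sum(np.log(d)) - np.linalg.slogdet(Sigma)[1])

A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 0.5 * np.eye(4)          # a random full covariance matrix

d_matched = np.diag(Sigma).copy()          # moment-matched diagonal (the KL-optimal choice)
print("KL with matched variances  :", round(float(kl_full_vs_diag(Sigma, d_matched)), 4))
print("KL with perturbed variances:", round(float(kl_full_vs_diag(Sigma, 1.3 * d_matched)), 4))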
Abstract:
A direct numerical simulation (DNS) database of freely propagating statistically planar turbulent premixed flames with a range of different turbulent Reynolds numbers has been used to assess the performance of algebraic flame surface density (FSD) models based on a fractal representation of the flame wrinkling factor. The turbulent Reynolds number Re_t has been varied by modifying the Karlovitz number Ka and the Damköhler number Da independently of each other in such a way that the flames remain within the thin reaction zones regime. It has been found that the turbulent Reynolds number and the Karlovitz number both have a significant influence on the fractal dimension, which is found to increase with increasing Re_t and Ka before reaching an asymptotic value for large values of Re_t and Ka. A parameterisation of the fractal dimension is presented in which the effects of the Reynolds and the Karlovitz numbers are explicitly taken into account. By contrast, the inner cut-off scale normalised by the Zel'dovich flame thickness η_i/δ_z does not exhibit any significant dependence on Re_t for the cases considered here. The performance of several algebraic FSD models has been assessed based on various criteria. Most of the algebraic models show a deterioration in performance with increasing LES filter width. © 2012 Mohit Katragadda et al.
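The generic fractal closure being assessed takes the wrinkling factor as a power law between the outer and inner cut-off scales, Ξ = (Δ/η_i)^(D-2), with the LES filter width Δ acting as the outer cut-off. The sketch below evaluates this form for a few assumed values of the fractal dimension and filter width; the numbers are illustrative, not results from the DNS database.

# Generic fractal wrinkling-factor closure: Xi = (Delta / eta_i) ** (D - 2), with the
# LES filter width Delta as the outer cut-off and eta_i as the inner cut-off.
def wrinkling_factor(filter_width, inner_cutoff, fractal_dim):
    return (filter_width / inner_cutoff) ** (fractal_dim - 2.0)

delta_z = 1.0                        # Zel'dovich flame thickness used for normalisation
eta_i = 4.0 * delta_z                # assumed inner cut-off scale
for D in (2.2, 2.35, 2.5):           # fractal dimension (increases with Re_t and Ka)
    for filt in (8.0, 16.0, 32.0):   # filter width in units of delta_z (illustrative)
        Xi = wrinkling_factor(filt * delta_z, eta_i, D)
        print(f"D = {D:4.2f}  Delta/delta_z = {filt:4.0f}  Xi = {Xi:5.2f}")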
Abstract:
We offer a solution to the problem of efficiently translating algorithms between different types of discrete statistical model. We investigate the expressive power of three classes of model (those with binary variables, with pairwise factors, and with planar topology) as well as their four intersections. We formalize a notion of "simple reduction" for the problem of inferring marginal probabilities and consider whether it is possible to "simply reduce" marginal inference from general discrete factor graphs to factor graphs in each of these seven subclasses. We characterize the reducibility of each class, showing in particular that the class of binary pairwise factor graphs is able to simply reduce only positive models. We also exhibit a continuous "spectral reduction" based on polynomial interpolation, which overcomes this limitation. Experiments assess the performance of standard approximate inference algorithms on the outputs of our reductions.
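To make the quantity being reduced concrete, the sketch below performs exact marginal inference on a tiny discrete factor graph by brute-force enumeration; the variables, cardinalities, and factor tables are illustrative. Reductions and approximate algorithms become relevant precisely when such enumeration is infeasible.

import itertools
import numpy as np

# Exact marginals on a tiny discrete factor graph by brute-force enumeration.
cards = {"a": 2, "b": 3, "c": 2}                         # variable cardinalities (illustrative)
factors = [                                              # (scope, non-negative table) pairs (illustrative)
    (("a", "b"), np.array([[1.0, 2.0, 0.5],
                           [0.5, 1.0, 2.0]])),
    (("b", "c"), np.array([[1.0, 0.1],
                           [0.1, 1.0],
                           [1.0, 1.0]])),
]

names = list(cards)
marginals = {v: np.zeros(cards[v]) for v in names}
Z = 0.0
for assignment in itertools.product(*(range(cards[v]) for v in names)):
    state = dict(zip(names, assignment))
    weight = np.prod([table[tuple(state[v] for v in scope)] for scope, table in factors])
    Z += weight
    for v in names:
        marginals[v][state[v]] += weight

for v in names:
    print(v, marginals[v] / Z)                           # exact marginal probabilities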
Abstract:
Switching between two modes of operation is a common property of biological systems. In continuous-time differential equation models, this is often realised by bistability, i.e. the existence of two asymptotically stable steady states. Several biological models are shown to exhibit delayed switching, with a pronounced transient phase, in particular for near-threshold perturbations. This study shows that this delay in switching from one mode to the other in response to a transient input is reflected in local properties of an unstable saddle point, which has a one-dimensional unstable manifold with a significantly slower eigenvalue than the stable ones. Thus, the trajectories first approximately converge to the saddle point, then linger along the saddle's unstable manifold before quickly approaching one of the stable equilibria. © 2010 IEEE.
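The mechanism can be illustrated with a textbook bistable system (a mutual-repression toggle switch), whose saddle has an unstable eigenvalue much slower than its stable one: trajectories started just either side of the saddle's stable manifold collapse onto the saddle quickly and only later drift apart towards the two stable states. This is an assumed illustrative system and parameter set, not one of the biological models analysed in the paper.

import numpy as np

# Mutual-repression toggle switch: bistable, with a saddle on the diagonal whose
# unstable eigenvalue is much slower than its stable one. Parameters are illustrative.
alpha, n = 3.0, 2.0

def f(state):
    x, y = state
    return np.array([alpha / (1.0 + y**n) - x,
                     alpha / (1.0 + x**n) - y])

dt, T = 0.01, 60.0
for eps in (+1e-2, -1e-2):                               # two near-threshold initial conditions
    s = np.array([1.2 + eps, 1.2 - eps])                 # just either side of the diagonal (stable manifold)
    for step in range(int(T / dt)):
        s = s + dt * f(s)                                # forward Euler
        if (step + 1) % 1000 == 0:                       # report every 10 time units
            print(f"eps={eps:+.0e}  t={(step + 1) * dt:5.1f}  x={s[0]:5.2f}  y={s[1]:5.2f}")
    print()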
Abstract:
Three questions have been prominent in the study of visual working memory limitations: (a) What is the nature of mnemonic precision (e.g., quantized or continuous)? (b) How many items are remembered? (c) To what extent do spatial binding errors account for working memory failures? Modeling studies have typically focused on comparing possible answers to a single one of these questions, even though the result of such a comparison might depend on the assumed answers to the other two. Here, we consider every possible combination of previously proposed answers to the individual questions. Each model is then a point in a 3-factor model space containing a total of 32 models, of which only 6 have been tested previously. We compare all models on data from 10 delayed-estimation experiments from 6 laboratories (for a total of 164 subjects and 131,452 trials). Consistently across experiments, we find that (a) mnemonic precision is not quantized but continuous and not equal but variable across items and trials; (b) the number of remembered items is likely to be variable across trials, with a mean of 6.4 in the best model (median across subjects); (c) spatial binding errors occur but explain only a small fraction of responses (16.5% at set size 8 in the best model). We find strong evidence against all 6 documented models. Our results demonstrate the value of factorial model comparison in working memory.
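The factorial construction itself is straightforward to sketch: every combination of one answer per question defines a model. The level names and the particular 4 x 4 x 2 split below are illustrative assumptions chosen to yield the stated 32 combinations; they are not necessarily the exact model variants compared in the paper.

import itertools

# Factorial model space: one model per combination of an answer to each question.
precision_levels = ["quantized-equal", "quantized-variable", "continuous-equal", "continuous-variable"]
item_number_levels = ["all-items", "fixed-number", "uniform-number", "poisson-number"]
binding_levels = ["no-binding-errors", "with-binding-errors"]

models = list(itertools.product(precision_levels, item_number_levels, binding_levels))
print(len(models), "models in the factorial space")      # 4 * 4 * 2 = 32
for m in models[:3]:                                     # a few example model labels
    print("  ", " / ".join(m))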