4 results for Dirichlet polynomials
at Duke University
Abstract:
Testing for differences within data sets is an important issue across various applications. Our work is primarily motivated by the analysis of microbiome composition, which has become increasingly relevant with the rise of DNA sequencing. We first review classical frequentist tests that are commonly used in tackling such problems. We then propose a Bayesian Dirichlet-multinomial framework for modeling the metagenomic data and for testing underlying differences between the samples. The parametric Dirichlet-multinomial model uses an intuitive hierarchical structure that allows flexibility in characterizing both the within-group variation and the cross-group difference, and it provides readily interpretable parameters. A computational method for evaluating the marginal likelihoods under the null and alternative hypotheses is also given. Through simulations, we show that our Bayesian model performs competitively against frequentist counterparts. We illustrate the method by analyzing metagenomic data from the Human Microbiome Project.
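To make the idea of comparing marginal likelihoods under the null and alternative hypotheses concrete, here is a minimal sketch. It is not the hierarchical model of the abstract: it assumes one count vector per group, a single shared Dirichlet prior `alpha`, and closed-form Dirichlet-multinomial marginals. The function names `log_beta` and `log_bayes_factor` are hypothetical, chosen for illustration.

```python
from math import lgamma


def log_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_bayes_factor(x1, x2, alpha):
    """Log Bayes factor of H1 (separate proportions) against H0 (shared proportions).

    x1, x2: taxon count vectors for the two groups (assumed setup).
    alpha:  Dirichlet prior parameters, one per taxon (assumed prior).
    The multinomial coefficients appear under both hypotheses and cancel.
    """
    post1 = [a + c for a, c in zip(alpha, x1)]
    post2 = [a + c for a, c in zip(alpha, x2)]
    pooled = [a + c1 + c2 for a, c1, c2 in zip(alpha, x1, x2)]
    # H1: each group has its own proportion vector drawn from Dirichlet(alpha).
    log_h1 = log_beta(post1) + log_beta(post2) - 2 * log_beta(alpha)
    # H0: both groups share one proportion vector drawn from Dirichlet(alpha).
    log_h0 = log_beta(pooled) - log_beta(alpha)
    return log_h1 - log_h0
```

A positive value favors a difference between the groups and a negative value favors shared proportions. For example, `log_bayes_factor([50, 30, 20], [10, 20, 70], [1, 1, 1])` is positive, while two nearly identical count vectors give a negative value.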
Abstract:
A tree-based dictionary learning model is developed for joint analysis of imagery and associated text. The dictionary learning may be applied directly to the imagery from patches, or to general feature vectors extracted from patches or superpixels (using any existing method for image feature extraction). Each image is associated with a path through the tree (from root to a leaf), and each of the multiple patches in a given image is associated with one node in that path. Nodes near the tree root are shared between multiple paths, representing image characteristics that are common among different types of images. Moving toward the leaves, nodes become specialized, representing details in image classes. If available, words (text) are also jointly modeled, with a path-dependent probability over words. The tree structure is inferred via a nested Dirichlet process, and a retrospective stick-breaking sampler is used to infer the tree depth and width.
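The retrospective idea mentioned above can be illustrated with a small sketch: stick-breaking weights are created lazily, only when a draw actually needs them, so no truncation level is fixed in advance. This is a simplified illustration of sampling from a single Dirichlet process prior, not the nested posterior sampler of the abstract. The class name `LazySticks` and the concentration parameter `gamma` are assumptions made for the example.

```python
import random


class LazySticks:
    """Stick-breaking weights of a DP(gamma), instantiated only on demand."""

    def __init__(self, gamma, rng):
        self.gamma = gamma      # concentration parameter (assumed name)
        self.rng = rng          # a random.Random instance
        self.weights = []       # sticks broken off so far
        self.remaining = 1.0    # stick length not yet broken off

    def _extend(self):
        # Break a Beta(1, gamma) fraction off the remaining stick.
        v = self.rng.betavariate(1.0, self.gamma)
        self.weights.append(self.remaining * v)
        self.remaining *= 1.0 - v

    def draw(self):
        """Sample a component index, creating new sticks only if needed."""
        u = self.rng.random()
        cumulative = 0.0
        k = 0
        while True:
            if k == len(self.weights):
                self._extend()
            cumulative += self.weights[k]
            at_last_stick = k == len(self.weights) - 1
            # The second condition guards against floating-point exhaustion.
            if u < cumulative or (at_last_stick and self.remaining < 1e-12):
                return k
            k += 1
```

Calling `LazySticks(2.0, random.Random(0)).draw()` repeatedly returns component indices. The number of instantiated sticks grows only as far as the draws require, which is the sense in which the effective width is inferred instead of fixed.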
Abstract:
We present a theory of hypoellipticity and unique ergodicity for semilinear parabolic stochastic PDEs with "polynomial" nonlinearities and additive noise, considered as abstract evolution equations in some Hilbert space. It is shown that if Hörmander's bracket condition holds at every point of this Hilbert space, then a lower bound on the Malliavin covariance operator μ_t can be obtained. Informally, this bound can be read as "Fix any finite-dimensional projection Π on a subspace of sufficiently regular functions. Then the eigenfunctions of μ_t with small eigenvalues have only a very small component in the image of Π." We also show how to use a priori bounds on the solutions to the equation to obtain good control on the dependency of the bounds on the Malliavin matrix on the initial condition. These bounds are sufficient in many cases to obtain the asymptotic strong Feller property introduced in [HM06]. One of the main novel technical tools is an almost sure bound from below on the size of "Wiener polynomials," where the coefficients are possibly non-adapted stochastic processes satisfying a Lipschitz condition. By exploiting the polynomial structure of the equations, this result can be used to replace Norris' lemma, which is unavailable in the present context. We conclude by showing that the two-dimensional stochastic Navier-Stokes equations and a large class of reaction-diffusion equations fit the framework of our theory.
Abstract:
A common challenge that users of academic databases face is making sense of their query outputs for knowledge discovery. This is exacerbated by the size and growth of modern databases. PubMed, a central index of biomedical literature, contains over 25 million citations, and can output search results containing hundreds of thousands of citations. Under these conditions, efficient knowledge discovery requires a different data structure than a chronological list of articles. It requires a method of conveying what the important ideas are, where they are located, and how they are connected; a method of allowing users to see the underlying topical structure of their search. This paper presents VizMaps, a PubMed search interface that addresses some of these problems. Given search terms, our main backend pipeline extracts relevant words from the title and abstract, and clusters them into discovered topics using Bayesian topic models, in particular Latent Dirichlet Allocation (LDA). It then outputs a visual, navigable map of the query results.