934 results for Bayesian methods


Relevance: 70.00%

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance: 70.00%

Abstract:

Two probabilistic interpretations of the n-tuple recognition method are put forward in order to allow this technique to be analysed with the same Bayesian methods used in connection with other neural network models. Elementary demonstrations are then given of the use of maximum likelihood and maximum entropy methods for tuning the model parameters and assisting their interpretation. One of the models can be used to illustrate the significance of overlapping n-tuple samples with respect to correlations in the patterns.
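As a concrete illustration of the underlying technique, the sketch below is a minimal n-tuple (WiSARD-style) classifier; it is not the probabilistic formulation put forward in the abstract, and the function names and toy patterns are illustrative.

```python
import random

# Minimal n-tuple classifier sketch (illustrative, not the paper's
# probabilistic model).  Bit positions are partitioned into random
# n-tuples; each class stores, per tuple, the set of addresses seen
# during training, and scoring counts how many tuples match.

def make_tuples(n_bits, n, seed=0):
    rng = random.Random(seed)
    positions = list(range(n_bits))
    rng.shuffle(positions)
    return [positions[i:i + n] for i in range(0, n_bits, n)]

def address(pattern, tup):
    # Read one n-tuple sample of the pattern as an integer address.
    return sum(pattern[p] << i for i, p in enumerate(tup))

def train(patterns, tuples):
    tables = [set() for _ in tuples]
    for pat in patterns:
        for table, tup in zip(tables, tuples):
            table.add(address(pat, tup))
    return tables

def score(pattern, tables, tuples):
    return sum(address(pattern, tup) in table
               for table, tup in zip(tables, tuples))

tuples = make_tuples(8, 2)
zeros = [[0] * 8, [0] * 7 + [1]]          # toy training patterns
ones = [[1] * 8, [1] * 7 + [0]]
model = {c: train(p, tuples) for c, p in [("0", zeros), ("1", ones)]}
pred = max(model, key=lambda c: score([1] * 8, model[c], tuples))
```

Overlapping n-tuple samples, as discussed in the abstract, would correspond to tuples sharing bit positions, which makes the per-tuple matches correlated.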

Relevance: 70.00%

Abstract:

The problem of evaluating different learning rules and other statistical estimators is analysed. A new general theory of statistical inference is developed by combining Bayesian decision theory with information geometry. It is coherent and invariant. For each sample a unique ideal estimate exists and is given by an average over the posterior. An optimal estimate within a model is given by a projection of the ideal estimate. The ideal estimate is a sufficient statistic of the posterior, so practical learning rules are functions of the ideal estimator. If the sole purpose of learning is to extract information from the data, the learning rule must also approximate the ideal estimator. This framework is applicable to both Bayesian and non-Bayesian methods, with arbitrary statistical models, and to supervised, unsupervised and reinforcement learning schemes.
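The "ideal estimate as an average over the posterior" can be made concrete in the simplest conjugate setting. The toy example below is illustrative and not taken from the paper: a Bernoulli parameter under a uniform Beta(1, 1) prior, whose posterior mean is Laplace's rule of succession.

```python
from fractions import Fraction

# Illustrative only: for a Bernoulli model with a uniform Beta(1, 1)
# prior, the posterior after s successes in n trials is
# Beta(1 + s, 1 + n - s), and the average over the posterior (its mean)
# is (s + 1) / (n + 2), Laplace's rule of succession.

def posterior_mean(successes, trials):
    return Fraction(successes + 1, trials + 2)

ideal = posterior_mean(7, 10)   # 8/12, i.e. 2/3; the MLE would be 7/10
```

Any practical learning rule that aims only to extract information from the data would, in this framework, approximate such posterior averages.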

Relevance: 70.00%

Abstract:

We present results comparing the performance of neural networks trained with two Bayesian methods, (i) the Evidence Framework of MacKay (1992) and (ii) a Markov Chain Monte Carlo method due to Neal (1996), on a task of classifying segmented outdoor images. We also investigate the use of the Automatic Relevance Determination method for input feature selection.

Relevance: 70.00%

Abstract:

Following adaptation to an oriented (1-d) signal in central vision, the orientation of subsequently viewed test signals may appear repelled away from or attracted towards the adapting orientation. Small angular differences between the adaptor and test yield 'repulsive' shifts, while large angular differences yield 'attractive' shifts. In peripheral vision, however, both small and large angular differences yield repulsive shifts. To account for these tilt after-effects (TAEs), a cascaded model of orientation estimation that is optimized using hierarchical Bayesian methods is proposed. The model accounts for orientation bias through adaptation-induced losses in information that arise because of signal uncertainties and neural constraints placed upon the propagation of visual information. Repulsive (direct) TAEs arise at early stages of visual processing from adaptation of orientation-selective units with peak sensitivity at the orientation of the adaptor (theta). Attractive (indirect) TAEs result from adaptation of second-stage units with peak sensitivity at theta and theta+90 degrees, which arise from an efficient stage of linear compression that pools across the responses of the first-stage orientation-selective units. A spatial orientation vector is estimated from the transformed oriented unit responses. The change from attractive to repulsive TAEs in peripheral vision can be explained by the differing harmonic biases resulting from constraints on signal power (in central vision) versus signal uncertainties in orientation (in peripheral vision). The proposed model is consistent with recent work by computational neuroscientists in supposing that visual bias reflects the adjustment of a rational system in the light of uncertain signals and system constraints.

Relevance: 70.00%

Abstract:

Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size even after the huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is the design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and for characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
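To convey the general idea of a Gaussian posterior approximation (though not the chapter's optimal construction, which is derived specifically for log-linear models with Diaconis--Ylvisaker priors), the sketch below computes a plain Laplace approximation, matching a Gaussian to the mode and curvature of a Beta posterior; the names and numbers are illustrative.

```python
import math

# Plain Laplace approximation sketch: match a Gaussian to the mode and
# curvature of a Beta(a, b) posterior.  This is NOT the chapter's
# optimal Gaussian approximation; it only conveys the general idea of
# approximating a posterior by a Gaussian.

def laplace_beta(a, b):
    mode = (a - 1) / (a + b - 2)
    # Negative second derivative of the log density at the mode.
    precision = (a - 1) / mode ** 2 + (b - 1) / (1 - mode) ** 2
    return mode, math.sqrt(1 / precision)

mu, sigma = laplace_beta(8, 4)   # approximate Beta(8, 4) by N(mu, sigma^2)
```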

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo (MCMC), the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
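A miniature version of the phenomenon can be sketched directly. The code below is an intercept-only, Albert-and-Chib-style probit data augmentation Gibbs sampler with a flat prior and simulated rare-event data; the sampler, the data and all names are illustrative, not the chapter's experiments.

```python
import random
from statistics import NormalDist

nd = NormalDist()  # standard normal helper for truncated sampling

def trunc_normal(mu, lower=None, upper=None, rng=random):
    # Inverse-CDF draw from N(mu, 1) truncated to (lower, upper).
    lo = nd.cdf(lower - mu) if lower is not None else 0.0
    hi = nd.cdf(upper - mu) if upper is not None else 1.0
    u = min(max(rng.uniform(lo, hi), 1e-12), 1.0 - 1e-12)
    return mu + nd.inv_cdf(u)

def gibbs_probit(y, n_iter, seed=1):
    rng = random.Random(seed)
    n, beta, draws = len(y), 0.0, []
    for _ in range(n_iter):
        # Latent z_i | beta: N(beta, 1), truncated by the sign of y_i.
        z = [trunc_normal(beta, lower=0.0, rng=rng) if yi
             else trunc_normal(beta, upper=0.0, rng=rng) for yi in y]
        # beta | z under a flat prior: N(mean(z), 1/n).
        beta = sum(z) / n + rng.gauss(0.0, 1.0) / n ** 0.5
        draws.append(beta)
    return draws

y = [1] * 5 + [0] * 995        # rare events: 5 successes in 1000 trials
draws = gibbs_probit(y, 200)   # beta creeps slowly toward roughly -2.5
```

With so few successes the chain takes many iterations to travel from its starting point to the high-posterior region, and successive draws remain highly autocorrelated, consistent with the spectral-gap behavior the chapter quantifies.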

Relevance: 70.00%

Abstract:

Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.

Relevance: 60.00%

Abstract:

Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because data for two of those genomes are no longer available in GenBank, in this work we analyze the phylogenetic relationship of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al., [26]). This distance method does not require sequence alignment, deriving the species phylogeny from overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian viruses 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.
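For flavor, an alignment-free comparison can be as simple as a k-mer frequency distance; the sketch below uses cosine distance on 3-mer counts and toy sequences, and is a generic stand-in rather than the authors' dynamical language model.

```python
import math
from collections import Counter
from itertools import product

# Generic alignment-free distance sketch (not the dynamical language
# model of the paper): map each sequence to its k-mer count vector and
# use one minus cosine similarity as the distance.

def kmer_vector(seq, k=3):
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[''.join(p)] for p in product('ACGT', repeat=k)]

def distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

g1 = "ACGTACGTACGTACGT" * 4   # toy "genomes", purely illustrative
g2 = "ACGTACGAACGTACGT" * 4   # near-identical to g1
g3 = "TTTTGGGGCCCCAAAA" * 4   # very different composition
d_close = distance(kmer_vector(g1), kmer_vector(g2))
d_far = distance(kmer_vector(g1), kmer_vector(g3))
```

Distances of this kind feed directly into standard distance-based tree building, with no alignment step.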

Relevance: 60.00%

Abstract:

Samples of sea water contain phytoplankton taxa in varying amounts, and marine scientists are interested in the relative abundance of each taxon. Their relative biomass can be ascertained indirectly by measuring the quantity of various pigments using high performance liquid chromatography. However, the conversion from pigment to taxa is mathematically non-trivial, as it is a positive matrix factorisation problem in which both matrices are unknown beyond the level of initial estimates. Prior information on the pigment-to-taxa conversion matrix is used to give the problem a unique solution. An iteration of two non-negative least squares algorithms gives satisfactory results. Analysis of sample data indicates good prospects for this type of analysis. An alternative, more computationally intensive approach using Bayesian methods is discussed.
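The alternating non-negative fit can be sketched with a generic stand-in: multiplicative-update NMF (Lee and Seung) on a simulated pigment matrix. This is not the paper's algorithm (which additionally constrains the conversion matrix with prior information), and all names and data are illustrative.

```python
import numpy as np

# Generic stand-in for the alternating non-negative least squares step:
# factorise P (samples x pigments) as B @ C with B, C >= 0, where B
# plays the role of taxon biomass per sample and C the pigment
# signature per taxon.  Multiplicative updates preserve non-negativity.

rng = np.random.default_rng(0)
true_B = rng.uniform(0, 1, (20, 3))   # 20 samples, 3 taxa
true_C = rng.uniform(0, 1, (3, 6))    # 3 taxa, 6 pigments
P = true_B @ true_C

B = rng.uniform(0.1, 1, (20, 3))      # positive random initial estimates
C = rng.uniform(0.1, 1, (3, 6))
eps = 1e-9                            # guards against division by zero
for _ in range(500):
    C *= (B.T @ P) / (B.T @ B @ C + eps)
    B *= (P @ C.T) / (B @ C @ C.T + eps)

rel_err = np.linalg.norm(P - B @ C) / np.linalg.norm(P)
```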

Relevance: 60.00%

Abstract:

A novel shape recognition algorithm was developed to autonomously classify the Northern Pacific sea star (Asterias amurensis) from benthic images collected by the Starbug AUV during 6 km of transects in the Derwent estuary. Despite the effects of scattering, attenuation, soft focus and motion blur within the underwater images, an optimal joint classification rate of 77.5% and a misclassification rate of 13.5% were achieved. The performance of the algorithm was largely attributed to its ability to recognise locally deformed sea star shapes created during the segmentation of the distorted images.

Relevance: 60.00%

Abstract:

Long-term systematic population monitoring data sets are rare but are essential in identifying changes in species abundance. In contrast, community groups and natural history organizations have collected many species lists. These represent a large, untapped source of information on changes in abundance but are generally considered of little value. The major problem with using species lists to detect population changes is that the amount of effort used to obtain the list is often uncontrolled and usually unknown. It has been suggested that using the number of species on the list, the "list length," can be a measure of effort. This paper significantly extends the utility of Franklin's approach using Bayesian logistic regression. We demonstrate the value of List Length Analysis to model changes in species prevalence (i.e., the proportion of lists on which the species occurs) using bird lists collected by a local bird club over 40 years around Brisbane, southeast Queensland, Australia. We estimate the magnitude and certainty of change for 269 bird species and calculate the probabilities that there have been declines and increases of given magnitudes. List Length Analysis confirmed suspected species declines and increases. This method is an important complement to systematically designed intensive monitoring schemes and provides a means of utilizing data that may otherwise be deemed useless. The results of List Length Analysis can be used for targeting species of conservation concern for listing purposes or for more intensive monitoring. While Bayesian methods are not essential for List Length Analysis, they can offer more flexibility in interrogating the data and are able to provide a range of parameters that are easy to interpret and can facilitate conservation listing and prioritization. © 2010 by the Ecological Society of America.
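The core of List Length Analysis, modelling a species' presence on a list as a function of time while using list length as a proxy for effort, can be sketched without the Bayesian machinery (which the abstract notes is optional for the basic method). The example below uses simulated lists and illustrative names, and fits the logistic regression by Newton iterations.

```python
import numpy as np

# Illustrative sketch of List Length Analysis on simulated data (not
# the Brisbane bird lists): the probability a species appears on a list
# is logistic in year, with log list length as the effort proxy.

rng = np.random.default_rng(42)
n = 2000
year = rng.uniform(0, 40, n)             # years since records began
length = rng.integers(5, 120, n)         # number of species on each list
X = np.column_stack([np.ones(n), year / 40, np.log(length)])
true_beta = np.array([-3.0, -1.0, 0.8])  # simulate a genuine decline
p = 1 / (1 + np.exp(-X @ true_beta))
y = rng.binomial(1, p)                   # 1 if the species is on the list

beta = np.zeros(3)
for _ in range(25):                      # Newton-Raphson for the MLE
    mu = 1 / (1 + np.exp(-X @ beta))
    W = mu * (1 - mu)
    H = X.T @ (X * W[:, None])           # Fisher information
    beta += np.linalg.solve(H, X.T @ (y - mu))

trend = beta[1]   # negative: prevalence declined over the period
```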

Relevance: 60.00%

Abstract:

Sepsid flies (Diptera: Sepsidae) are important model insects for sexual selection research. In order to develop mitochondrial (mt) genome data for this significant group, we sequenced the first complete mt genome of the sepsid fly Nemopoda mamaevi Ozerov, 1997. The circular 15,878 bp mt genome is typical of Diptera, containing all 37 genes usually present in bilaterian animals. We discovered inaccurate annotations of fly mt genomes previously deposited on GenBank and thus re-annotated all published mt genomes of Cyclorrhapha. These re-annotations were based on comparative analysis of homologous genes, and provide a statistical analysis of start and stop codon positions. We further detected two 18 bp conserved intergenic sequences, between tRNAGlu-tRNAPhe and ND1-tRNASer(UCN), across Cyclorrhapha; these are the mtTERM binding site motifs. Additionally, we compared the automated annotation software MITOS with hand annotation. Phylogenetic trees based on the mt genome data from Cyclorrhapha were inferred by maximum-likelihood and Bayesian methods, and strongly supported a close relationship between Sepsidae and the Tephritoidea.

Relevance: 60.00%

Abstract:

Gaussian processes (GPs) are promising Bayesian methods for classification and regression problems. Designing a GP classifier and making predictions with it are, however, computationally demanding, especially when the training set size is large. Sparse GP classifiers are known to overcome this limitation. In this letter, we propose and study a validation-based method for sparse GP classifier design. The proposed method uses a negative log predictive (NLP) loss measure, which is easy to compute for GP models. We use this measure for both basis vector selection and hyperparameter adaptation. The experimental results on several real-world benchmark data sets show better or comparable generalization performance over existing methods.
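The NLP measure itself is easy to state. The sketch below is illustrative and uses GP regression rather than the letter's classifiers: it computes the average negative log predictive density of held-out targets from the GP predictive mean and variance, with all names and the toy data being assumptions.

```python
import numpy as np

# Illustrative NLP computation for a GP regressor: score the model by
# the average negative log density it assigns to held-out targets.

def rbf(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_predict(xt, yt, xs, noise=0.1):
    K = rbf(xt, xt) + noise * np.eye(len(xt))
    Ks = rbf(xt, xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, yt))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    var = np.diag(rbf(xs, xs)) - np.sum(v ** 2, axis=0) + noise
    return mean, var

def nlp(y, mean, var):
    # Average negative log density of y under N(mean, var), pointwise.
    return np.mean(0.5 * np.log(2 * np.pi * var) + 0.5 * (y - mean) ** 2 / var)

xt = np.linspace(0, 5, 30)
yt = np.sin(xt)                     # noise-free toy targets
xs = np.linspace(0.1, 4.9, 10)
mean, var = gp_predict(xt, yt, xs)
loss = nlp(np.sin(xs), mean, var)
```

A lower NLP on held-out data indicates better-calibrated predictions, which is what makes it usable for basis vector selection and hyperparameter adaptation.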

Relevance: 60.00%

Abstract:

Many species inhabit fragmented landscapes, resulting either from anthropogenic or from natural processes. The ecological and evolutionary dynamics of spatially structured populations are affected by a complex interplay between endogenous and exogenous factors. The metapopulation approach, simplifying the landscape to a discrete set of patches of breeding habitat surrounded by unsuitable matrix, has become a widely applied paradigm for the study of species inhabiting highly fragmented landscapes. In this thesis, I focus on the construction of biologically realistic models and their parameterization with empirical data, with the general objective of understanding how the interactions between individuals and their spatially structured environment affect ecological and evolutionary processes in fragmented landscapes. I study two hierarchically structured model systems: the Glanville fritillary butterfly in the Åland Islands, and a system of two interacting aphid species in the Tvärminne archipelago, both located in south-western Finland. The interesting and challenging feature of both study systems is that the population dynamics occur over multiple spatial scales that are linked by various processes. My main emphasis is on the development of mathematical and statistical methodologies. For the Glanville fritillary case study, I first build a Bayesian framework for the estimation of death rates and capture probabilities from mark-recapture data, with the novelty of accounting for variation among individuals in capture probabilities and survival. I then characterize the dispersal phase of the butterflies by deriving a mathematical approximation of a diffusion-based movement model applied to a network of patches. I use the movement model as a building block to construct an individual-based evolutionary model for the Glanville fritillary butterfly metapopulation. I parameterize the evolutionary model using a pattern-oriented approach, and use it to study how the landscape structure affects the evolution of dispersal. For the aphid case study, I develop a Bayesian model of hierarchical multi-scale metapopulation dynamics, where the observed extinction and colonization rates are decomposed into intrinsic rates operating specifically at each spatial scale. In summary, I show how analytical approaches, hierarchical Bayesian methods and individual-based simulations can be used individually or in combination to tackle complex problems from many different viewpoints. In particular, hierarchical Bayesian methods provide a useful tool for decomposing ecological complexity into more tractable components.