888 resultados para likelihood-based inference


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Euastacus crayfish are endemic to freshwater ecosystems of the eastern coast of Australia. While recent evolutionary studies have focused on a few of these species, here we provide a comprehensive phylogenetic estimate of relationships among the species within the genus. We sequenced three mitochondrial gene regions (COI, 16S, and 12S) and one nuclear region (28S) from 40 species of the genus Euastacus, as well as one undescribed species. Using these data, we estimated the phylogenetic relationships within the genus using maximum-likelihood, parsimony, and Bayesian Markov Chain Monte Carlo analyses. Using Bayes factors to test different model hypotheses, we found that the best phylogeny supports monophyletic groupings of all but two recognized species and suggests a widespread ancestor that diverged by vicariance. We also show that Eitastacus and Astacopsis are most likely monophyletic sister genera. We use the resulting phylogeny as a framework to test biogeographic hypotheses relating to the diversification of the genus. (c) 2005 Elsevier Inc. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis is concerned with approximate inference in dynamical systems, from a variational Bayesian perspective. When modelling real world dynamical systems, stochastic differential equations appear as a natural choice, mainly because of their ability to model the noise of the system by adding a variant of some stochastic process to the deterministic dynamics. Hence, inference in such processes has drawn much attention. Here two new extended frameworks are derived and presented that are based on basis function expansions and local polynomial approximations of a recently proposed variational Bayesian algorithm. It is shown that the new extensions converge to the original variational algorithm and can be used for state estimation (smoothing). However, the main focus is on estimating the (hyper-) parameters of these systems (i.e. drift parameters and diffusion coefficients). The new methods are numerically validated on a range of different systems which vary in dimensionality and non-linearity. These are the Ornstein-Uhlenbeck process, for which the exact likelihood can be computed analytically, the univariate and highly non-linear, stochastic double well and the multivariate chaotic stochastic Lorenz '63 (3-dimensional model). The algorithms are also applied to the 40 dimensional stochastic Lorenz '96 system. In this investigation these new approaches are compared with a variety of other well known methods such as the ensemble Kalman filter / smoother, a hybrid Monte Carlo sampler, the dual unscented Kalman filter (for jointly estimating the systems states and model parameters) and full weak-constraint 4D-Var. Empirical analysis of their asymptotic behaviour as a function of observation density or length of time window increases is provided.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Diffusion processes are a family of continuous-time continuous-state stochastic processes that are in general only partially observed. The joint estimation of the forcing parameters and the system noise (volatility) in these dynamical systems is a crucial, but non-trivial task, especially when the system is nonlinear and multimodal. We propose a variational treatment of diffusion processes, which allows us to compute type II maximum likelihood estimates of the parameters by simple gradient techniques and which is computationally less demanding than most MCMC approaches. We also show how a cheap estimate of the posterior over the parameters can be constructed based on the variational free energy.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

* This paper was made according to the program № 14 of fundamental scientific research of the Presidium of the Russian Academy of Sciences, the project 06-I-П14-052

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical, such as the PARAFAC decomposition, data play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

When we study the variables that a ffect survival time, we usually estimate their eff ects by the Cox regression model. In biomedical research, e ffects of the covariates are often modi ed by a biomarker variable. This leads to covariates-biomarker interactions. Here biomarker is an objective measurement of the patient characteristics at baseline. Liu et al. (2015) has built up a local partial likelihood bootstrap model to estimate and test this interaction e ffect of covariates and biomarker, but the R code developed by Liu et al. (2015) can only handle one variable and one interaction term and can not t the model with adjustment to nuisance variables. In this project, we expand the model to allow adjustment to nuisance variables, expand the R code to take any chosen interaction terms, and we set up many parameters for users to customize their research. We also build up an R package called "lplb" to integrate the complex computations into a simple interface. We conduct numerical simulation to show that the new method has excellent fi nite sample properties under both the null and alternative hypothesis. We also applied the method to analyze data from a prostate cancer clinical trial with acid phosphatase (AP) biomarker.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Phylogenetic inference consist in the search of an evolutionary tree to explain the best way possible genealogical relationships of a set of species. Phylogenetic analysis has a large number of applications in areas such as biology, ecology, paleontology, etc. There are several criterias which has been defined in order to infer phylogenies, among which are the maximum parsimony and maximum likelihood. The first one tries to find the phylogenetic tree that minimizes the number of evolutionary steps needed to describe the evolutionary history among species, while the second tries to find the tree that has the highest probability of produce the observed data according to an evolutionary model. The search of a phylogenetic tree can be formulated as a multi-objective optimization problem, which aims to find trees which satisfy simultaneously (and as much as possible) both criteria of parsimony and likelihood. Due to the fact that these criteria are different there won't be a single optimal solution (a single tree), but a set of compromise solutions. The solutions of this set are called "Pareto Optimal". To find this solutions, evolutionary algorithms are being used with success nowadays.This algorithms are a family of techniques, which aren’t exact, inspired by the process of natural selection. They usually find great quality solutions in order to resolve convoluted optimization problems. The way this algorithms works is based on the handling of a set of trial solutions (trees in the phylogeny case) using operators, some of them exchanges information between solutions, simulating DNA crossing, and others apply aleatory modifications, simulating a mutation. The result of this algorithms is an approximation to the set of the “Pareto Optimal” which can be shown in a graph with in order that the expert in the problem (the biologist when we talk about inference) can choose the solution of the commitment which produces the higher interest. In the case of optimization multi-objective applied to phylogenetic inference, there is open source software tool, called MO-Phylogenetics, which is designed for the purpose of resolving inference problems with classic evolutionary algorithms and last generation algorithms. REFERENCES [1] C.A. Coello Coello, G.B. Lamont, D.A. van Veldhuizen. Evolutionary algorithms for solving multi-objective problems. Spring. Agosto 2007 [2] C. Zambrano-Vega, A.J. Nebro, J.F Aldana-Montes. MO-Phylogenetics: a phylogenetic inference software tool with multi-objective evolutionary metaheuristics. Methods in Ecology and Evolution. En prensa. Febrero 2016.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering, solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicabiity of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we consider Preference Inference based on a generalised form of Pareto order. Preference Inference aims at reasoning over an incomplete specification of user preferences. We focus on two problems. The Preference Deduction Problem (PDP) asks if another preference statement can be deduced (with certainty) from a set of given preference statements. The Preference Consistency Problem (PCP) asks if a set of given preference statements is consistent, i.e., the statements are not contradicting each other. Here, preference statements are direct comparisons between alternatives (strict and non-strict). It is assumed that a set of evaluation functions is known by which all alternatives can be rated. We consider Pareto models which induce order relations on the set of alternatives in a Pareto manner, i.e., one alternative is preferred to another only if it is preferred on every component of the model. We describe characterisations for deduction and consistency based on an analysis of the set of evaluation functions, and present algorithmic solutions and complexity results for PDP and PCP, based on Pareto models in general and for a special case. Furthermore, a comparison shows that the inference based on Pareto models is less cautious than some other types of well-known preference model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating non-critical in-car systems. Likelihood-maximising (LIMA) frameworks optimise speech enhancement algorithms based on recognised state sequences rather than traditional signal-level criteria such as maximising signal-to-noise ratio. Previously presented LIMA frameworks require calibration utterances to generate optimised enhancement parameters which are used for all subsequent utterances. Sub-optimal recognition performance occurs in noise conditions which are significantly different from that present during the calibration session - a serious problem in rapidly changing noise environments. We propose a dialog-based design which allows regular optimisation iterations in order to track the changing noise conditions. Experiments using Mel-filterbank spectral subtraction are performed to determine the optimisation requirements for vehicular environments and show that minimal optimisation assists real-time operation with improved speech recognition accuracy. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Image annotation is a significant step towards semantic based image retrieval. Ontology is a popular approach for semantic representation and has been intensively studied for multimedia analysis. However, relations among concepts are seldom used to extract higher-level semantics. Moreover, the ontology inference is often crisp. This paper aims to enable sophisticated semantic querying of images, and thus contributes to 1) an ontology framework to contain both visual and contextual knowledge, and 2) a probabilistic inference approach to reason the high-level concepts based on different sources of information. The experiment on a natural scene database from LabelMe database shows encouraging results.