1000 results for MCMC algorithm
Abstract:
This thesis presents Bayesian solutions to inference problems for three types of social network data structures: a single observation of a social network, repeated observations on the same social network, and repeated observations on a social network developing through time. A social network is conceived as a structure consisting of actors and their social interaction with each other. A common conceptualisation of social networks is to let the actors be represented by nodes in a graph, with edges between pairs of nodes that are relationally tied to each other according to some definition. Statistical analysis of social networks is to a large extent concerned with modelling these relational ties, which lends itself to empirical evaluation. The first paper deals with a family of statistical models for social networks called exponential random graphs, which take various structural features of the network into account. In general, the likelihood functions of exponential random graphs are only known up to a constant of proportionality. A procedure for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods is presented. The algorithm consists of two basic steps: one in which an ordinary Metropolis-Hastings updating step is used, and another in which an importance sampling scheme is used to calculate the acceptance probability of the Metropolis-Hastings step. In the second paper, a method for modelling reports given by actors (or other informants) on their social interaction with others is investigated in a Bayesian framework. The model contains two basic ingredients: the unknown network structure and functions that link this unknown network structure to the reports given by the actors. These functions take the form of probit link functions.
An intrinsic problem is that the model is not identified, meaning that there are combinations of values of the unknown structure and the parameters in the probit link functions that are observationally equivalent. Instead of using restrictions to achieve identification, it is proposed that the different observationally equivalent combinations of parameters and unknown structure be investigated a posteriori. Estimation of parameters is carried out using Gibbs sampling with a switching device that enables transitions between posterior modal regions. The main goal of the procedures is to provide tools for comparing different model specifications. Papers 3 and 4 propose Bayesian methods for longitudinal social networks. The premise of the models investigated is that overall change in social networks occurs as a consequence of sequences of incremental changes. Models for the evolution of social networks using continuous-time Markov chains are meant to capture these dynamics. Paper 3 presents an MCMC algorithm for exploring the posteriors of parameters for such Markov chains. More specifically, the unobserved evolution of the network in between observations is explicitly modelled, thereby avoiding the need to deal with explicit formulas for the transition probabilities. This enables likelihood-based parameter inference in a wider class of network evolution models than has been available before. Paper 4 builds on the proposed inference procedure of Paper 3 and demonstrates how to perform model selection for a class of network evolution models.
Abstract:
Assuming spatial invariance of the parameters of a rainfall-runoff model can prove to be a practical and valid solution when the aim is to estimate the water resource availability of an area. Hydrological simulation is a widely adopted tool, but it has some critical issues, mainly related to the need to calibrate the model parameters. When spatially distributed models are chosen, which are useful because they can account for the spatial variability of the phenomena that contribute to runoff formation, the problem usually lies in the large number of parameters involved. Assuming that some of these are homogeneous in space, and therefore take the same value across the different catchments, makes it possible to reduce the total number of parameters that require calibration. This assumption is tested on a statistical basis, using an estimate of parametric uncertainty obtained by means of an MCMC algorithm. The parameter distributions are found to be compatible, to varying degrees, across the catchments considered. When the goal is to estimate the water resource availability of ungauged catchments, the hypothesis of parameter invariance becomes even more important; this problem is usually addressed through lengthy parameter regionalisation analyses. Here, instead, a cross-calibration procedure is proposed that uses information from the gauged catchments most similar to the site of interest. The aim is to reach a fair compromise between the disadvantage of assuming the model parameters to be constant across the gauged catchments and the benefit of introducing, step by step, new and important information from the gauged catchments involved in the analysis.
The results demonstrate the usefulness of the proposed methodology: in the validation phase on the catchment treated as ungauged, good agreement can be achieved between the simulated and observed discharge series.
Abstract:
Permutation tests are useful for drawing inferences from imaging data because of their flexibility and ability to capture features of the brain that are difficult to capture parametrically. However, most implementations of permutation tests ignore important confounding covariates. To employ covariate control in a nonparametric setting we have developed a Markov chain Monte Carlo (MCMC) algorithm for conditional permutation testing using propensity scores. We present the first use of this methodology for imaging data. Our MCMC algorithm is an extension of algorithms developed to approximate exact conditional probabilities in contingency tables, logit, and log-linear models. An application of our non-parametric method to remove potential bias due to the observed covariates is presented.
Abstract:
Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.
Abstract:
In tests where a considerable number of individuals do not have sufficient time to answer all the items, we have what is called the speededness effect. Using the unidimensional Item Response Theory (IRT) model in tests with speededness can lead to a series of erroneous interpretations, since this model assumes that respondents have sufficient time to answer all the items. In this work, we develop a Bayesian analysis of the three-dimensional IRT model proposed by Wollack and Cohen (2005), considering a dependence structure between the prior distributions of the latent traits, which we model using copulas. We present an estimation procedure for the proposed model and carry out a comparative simulation study against the analysis by Bazan et al. (2010), which used independent prior distributions for the latent traits. Finally, we perform a sensitivity analysis of the model under study and present an application to a real data set from an EGRA subtest called Nonsense Words, administered in Peru in 2007. In this subtest, students are assessed orally by reading, sequentially, 50 nonsense words in 60 seconds, which characterises the presence of the speededness effect.
Abstract:
Background: Managed forests are a major component of tropical landscapes. Production forests as designated by national forest services cover up to 400 million ha, i.e. half of the forested area in the humid tropics. Forest management thus plays a major role in the global carbon budget, but a unified method to estimate carbon fluxes from tropical managed forests is lacking. In this study we propose a new time- and spatially-explicit methodology to estimate the above-ground carbon budget of selective logging at regional scale. Results: The yearly balance of a logging unit, i.e. the elementary management unit of a forest estate, is modelled by aggregating three sub-models encompassing (i) emissions from extracted wood, (ii) emissions from logging damage and deforested areas and (iii) carbon storage from post-logging recovery. Models are parametrised and uncertainties are propagated through an MCMC algorithm. As a case study, we used 38 years of National Forest Inventories in French Guiana, northeastern Amazonia, to estimate the above-ground carbon balance (i.e. the net carbon exchange with the atmosphere) of selectively logged forests. Over this period, the net carbon balance of selective logging in the French Guianan Permanent Forest Estate is estimated to lie between 0.12 and 1.33 Tg C, with a median value of 0.64 Tg C. Uncertainties over the model could be diminished by improving the accuracy of both the logging damage and the large woody necromass decay submodels. Conclusions: We propose an innovative carbon accounting framework relying upon basic logging statistics. This flexible tool allows the carbon budget of tropical managed forests to be estimated in a wide range of tropical regions.
Abstract:
We present a novel filtering algorithm for tracking multiple clusters of coordinated objects. Based on a Markov chain Monte Carlo (MCMC) mechanism, the new algorithm propagates a discrete approximation of the underlying filtering density. A dynamic Gaussian mixture model is utilized for representing the time-varying clustering structure. This involves point process formulations of typical behavioral moves such as birth and death of clusters as well as merging and splitting. For handling complex, possibly large scale scenarios, the sampling efficiency of the basic MCMC scheme is enhanced via the use of a Metropolis within Gibbs particle refinement step. As the proposed methodology essentially involves random set representations, a new type of estimator, termed the probability hypothesis density surface (PHDS), is derived for computing point estimates. It is further proved that this estimator is optimal in the sense of the mean relative entropy. Finally, the algorithm's performance is assessed and demonstrated in both synthetic and realistic tracking scenarios. © 2012 Elsevier Ltd. All rights reserved.
Abstract:
Standard Monte Carlo (sMC) simulation models have been widely used in AEC industry research to address system uncertainties. Although the benefits of probabilistic simulation analyses over deterministic methods are well documented, the sMC simulation technique is quite sensitive to the probability distributions of the input variables. This phenomenon becomes highly pronounced when the region of interest within the joint probability distribution (a function of the input variables) is small. In such cases, the standard Monte Carlo approach is often impractical from a computational standpoint. In this paper, a comparative analysis of standard Monte Carlo simulation to Markov Chain Monte Carlo with subset simulation (MCMC/ss) is presented. The MCMC/ss technique constitutes a more complex simulation method (relative to sMC), wherein a structured sampling algorithm is employed in place of completely randomized sampling. Consequently, gains in computational efficiency can be made. The two simulation methods are compared via theoretical case studies.
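The difficulty with small regions of interest can be illustrated with a minimal sketch. It is not taken from the paper: the standard normal tail event, the function names and the sample sizes are all assumptions chosen for illustration, and only plain Monte Carlo is shown (subset simulation itself is not implemented here).

```python
import math
import random


def smc_tail_probability(threshold, n_samples, seed=0):
    """Plain Monte Carlo estimate of P(Z > threshold) for Z ~ N(0, 1)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n_samples


def exact_tail_probability(threshold):
    """Exact P(Z > threshold), via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def relative_std_error(p, n_samples):
    """Coefficient of variation of the plain Monte Carlo estimator of p."""
    return math.sqrt((1.0 - p) / (p * n_samples))


if __name__ == "__main__":
    n = 10_000
    for threshold in (1.0, 4.0):  # hypothetical thresholds
        p = exact_tail_probability(threshold)
        print(threshold, p, smc_tail_probability(threshold, n),
              relative_std_error(p, n))
```

Under these assumptions, the relative error of the plain estimator grows roughly like 1/sqrt(p * n) as the target probability p shrinks, which is the kind of cost that structured sampling schemes aim to reduce.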
Abstract:
Pseudo-marginal methods such as the grouped independence Metropolis-Hastings (GIMH) and Markov chain within Metropolis (MCWM) algorithms have been introduced in the literature as an approach to perform Bayesian inference in latent variable models. These methods replace intractable likelihood calculations with unbiased estimates within Markov chain Monte Carlo algorithms. The GIMH method has the posterior of interest as its limiting distribution, but suffers from poor mixing if it is too computationally intensive to obtain high-precision likelihood estimates. The MCWM algorithm has better mixing properties, but less theoretical support. In this paper we propose to use Gaussian processes (GP) to accelerate the GIMH method, whilst using a short pilot run of MCWM to train the GP. Our new method, GP-GIMH, is illustrated on simulated data from a stochastic volatility and a gene network model.
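As a rough illustration of the pseudo-marginal idea, the sketch below runs a random-walk Metropolis-Hastings chain in which the likelihood is replaced by an unbiased Monte Carlo estimate. Everything in it is an assumption made for the example, not the paper's setup: a toy model y | z ~ N(z, 1), z | theta ~ N(theta, 1), a N(0, 10^2) prior on theta, and hypothetical function names. With `refresh_current=False` the estimate at the current state is reused, as in GIMH; with `refresh_current=True` it is re-estimated at every iteration, as in MCWM. The Gaussian process acceleration is not shown.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_estimate(theta, y, n_latent, rng):
    """Unbiased Monte Carlo estimate of p(y | theta), averaging over latent draws."""
    total = 0.0
    for _ in range(n_latent):
        z = rng.gauss(theta, 1.0)       # latent variable z | theta (toy assumption)
        total += normal_pdf(y, z, 1.0)  # observation density y | z (toy assumption)
    return total / n_latent


def pseudo_marginal_mh(y, n_iter, n_latent=20, step=2.0,
                       refresh_current=False, seed=0):
    """Random-walk MH using estimated likelihoods in the acceptance ratio."""
    rng = random.Random(seed)
    theta = 0.0
    like = likelihood_estimate(theta, y, n_latent, rng)
    chain = []
    for _ in range(n_iter):
        if refresh_current:
            # MCWM-style: re-estimate the likelihood at the current state.
            like = likelihood_estimate(theta, y, n_latent, rng)
        proposal = theta + rng.gauss(0.0, step)
        like_prop = likelihood_estimate(proposal, y, n_latent, rng)
        prior_ratio = normal_pdf(proposal, 0.0, 10.0) / normal_pdf(theta, 0.0, 10.0)
        if rng.random() < like_prop * prior_ratio / like:
            # GIMH-style: the accepted estimate is stored and reused.
            theta, like = proposal, like_prop
        chain.append(theta)
    return chain


if __name__ == "__main__":
    draws = pseudo_marginal_mh(y=1.5, n_iter=20_000)
    print(sum(draws[2_000:]) / len(draws[2_000:]))
```

In this toy model the marginal likelihood is available in closed form (y | theta ~ N(theta, 2)), so the posterior mean, about 1.47 for y = 1.5, can serve as a sanity check on the chain.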
Abstract:
This work presents new, efficient Markov chain Monte Carlo (MCMC) simulation methods for statistical analysis in various modelling applications. When using MCMC methods, the model is simulated repeatedly to explore the probability distribution describing the uncertainties in model parameters and predictions. In adaptive MCMC methods based on the Metropolis-Hastings algorithm, the proposal distribution needed by the algorithm learns from the target distribution as the simulation proceeds. Adaptive MCMC methods have been the subject of intensive research lately, as they make the methodology considerably easier to use. The lack of user-friendly computer programs has been a main obstacle to wider acceptance of the methods. This work provides two new adaptive MCMC methods: DRAM and AARJ. The DRAM method has been built especially to work in high-dimensional and non-linear problems. The AARJ method is an extension of DRAM for model selection problems, where the mathematical formulation of the model is uncertain and we want to fit several different models to the same observations simultaneously. The methods were developed with the needs of modelling applications typical in environmental sciences in mind. The development work has been pursued alongside several application projects. The applications presented in this work are: a wintertime oxygen concentration model for Lake Tuusulanjärvi and adaptive control of the aerator; a nutrition model for Lake Pyhäjärvi and lake management planning; validation of the algorithms of the GOMOS ozone remote sensing instrument on board the Envisat satellite of the European Space Agency; and the study of the effects of aerosol model selection on the GOMOS algorithm.
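The way a proposal distribution can learn from the target as the simulation proceeds can be sketched in one dimension. This is a minimal adaptive Metropolis example under stated assumptions, not the DRAM or AARJ method: the delayed rejection stage is omitted, and the function name, the Gaussian target and the tuning values are hypothetical. The factor 2.4 is the usual one-dimensional scaling applied to the empirical standard deviation of the chain history.

```python
import math
import random


def adaptive_metropolis(log_target, x0, n_iter, initial_sd=1.0,
                        adapt_start=100, seed=0):
    """1D random-walk Metropolis whose proposal scale adapts to the chain history."""
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    mean, m2, count = x0, 0.0, 1  # Welford running mean and squared deviations
    proposal_sd = initial_sd
    chain = []
    for i in range(n_iter):
        if i >= adapt_start and count > 1:
            variance = m2 / (count - 1)
            # Small constant keeps the proposal from collapsing to zero width.
            proposal_sd = 2.4 * math.sqrt(variance + 1e-6)
        candidate = x + rng.gauss(0.0, proposal_sd)
        log_p_cand = log_target(candidate)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_cand - log_p:
            x, log_p = candidate, log_p_cand
        chain.append(x)
        # Update the running statistics with the current state.
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return chain, proposal_sd


def example_log_target(x):
    """Hypothetical target: N(3, 0.5^2), up to an additive constant."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2


if __name__ == "__main__":
    samples, final_sd = adaptive_metropolis(example_log_target, x0=0.0, n_iter=5_000)
    print(sum(samples[1_000:]) / len(samples[1_000:]), final_sd)
```

Because the proposal scale is derived from the chain itself, no hand tuning of the step size is needed in this sketch, which is the ease-of-use argument made for adaptive methods.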