966 results for "Local likelihood function"
Abstract:
Background: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms, which allow one to obtain posterior distributions of parameters from simulations, without likelihood computations. Results: Here we present ABCtoolbox, a series of open-source programs to perform Approximate Bayesian Computation (ABC). It implements various ABC algorithms, including rejection sampling, MCMC without likelihood, a particle-based sampler, and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can interact with most simulation and summary-statistics computation programs. The usability of ABCtoolbox is demonstrated by inferring the evolutionary history of two lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates, and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from sampling parameters from prior distributions, through data simulation, computation of summary statistics, estimation of posterior distributions, model choice, and validation of the estimation procedure, to visualization of the results.
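The core rejection-sampling step that such ABC software automates can be sketched in a few lines. Below is a minimal, illustrative Python sketch; all function names and the toy normal-mean example are ours, not part of ABCtoolbox:

```python
import numpy as np

def abc_rejection(observed_stats, prior_sampler, simulator, summarize,
                  n_draws=20_000, tolerance=0.05):
    """Basic ABC rejection: keep parameter draws whose simulated
    summary statistics fall within `tolerance` of the observed ones."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler()                  # draw from the prior
        sim_stats = summarize(simulator(theta))  # simulate data, summarize
        if np.linalg.norm(sim_stats - observed_stats) < tolerance:
            accepted.append(theta)
    return np.asarray(accepted)  # approximate posterior sample

# Toy example: infer the mean of a normal with known variance.
rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=100)
posterior = abc_rejection(
    observed_stats=np.array([data.mean()]),
    prior_sampler=lambda: rng.uniform(-5, 5),
    simulator=lambda mu: rng.normal(mu, 1.0, size=100),
    summarize=lambda x: np.array([x.mean()]),
)
```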
Abstract:
In this paper, we focus on a model for two types of tumors. Tumor development can be described by four types of death rates and four tumor transition rates. We present a general semi-parametric model to estimate the tumor transition rates based on data from survival/sacrifice experiments. In the model, we assume that the tumor transition rates are proportional to a common parametric function, but make no assumption about the death rates from any state. We derive the likelihood function of the data observed in such an experiment and an EM algorithm that simplifies the estimation procedure. This article extends work on semi-parametric models for one type of tumor (see Portier and Dinse, and Dinse) to two types of tumors.
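The proportionality assumption can be written compactly; the symbols below are ours, since the abstract does not fix notation:

```latex
% Hedged sketch: tumor transition rates \lambda_j(t), j = 1,\dots,4, share a
% common parametric shape g(t;\beta) up to tumor-specific constants \theta_j:
\[
  \lambda_j(t) = \theta_j \, g(t;\beta), \qquad j = 1,\dots,4,
\]
% while the death rates from each state are left completely unspecified.
```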
Abstract:
Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet processes (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.
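The key reduction step, approximating an exponential-family likelihood in the random effect by a normal via a Laplace approximation, can be illustrated as follows; the Poisson model and all names are our illustrative choices, not the paper's:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def laplace_normal_approx(y, eta_fixed):
    """Return (mean, variance) of the normal approximation to the Poisson
    likelihood L(b) = exp(y*(eta_fixed + b) - exp(eta_fixed + b)) in b."""
    neg_loglik = lambda b: -(y * (eta_fixed + b) - np.exp(eta_fixed + b))
    b_hat = minimize_scalar(neg_loglik).x   # mode of the likelihood in b
    curvature = np.exp(eta_fixed + b_hat)   # -(d^2/db^2) log-likelihood
    return b_hat, 1.0 / curvature           # approximate N(b_hat, 1/curvature)

mean, var = laplace_normal_approx(y=3, eta_fixed=0.5)
# In the full algorithm, this normal approximation drives the Polya urn
# proposal, and a Metropolis-Hastings step corrects the approximation error.
```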
Abstract:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right-censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with unspecified baseline hazard, and that their joint distribution is obtained by transforming the survival outcomes to normal random variables whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models in which regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled via the transformed variables using flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibitive due to the high-dimensional intractable integration in the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study, and its performance is evaluated using simulations.
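The transformation at the heart of the model can be sketched as follows; the notation is ours and the paper's exact parametrization may differ:

```latex
% With marginal survival function S_i(t) under the Cox proportional hazards
% model, each survival time T_i is mapped to a standard normal variable
\[
  Y_i = \Phi^{-1}\!\bigl(1 - S_i(T_i)\bigr) \sim N(0,1),
\]
% and (Y_1,\dots,Y_n) is modeled as multivariate normal with a spatial
% correlation structure, e.g. \mathrm{corr}(Y_i,Y_j) = \rho(\|s_i - s_j\|).
```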
Abstract:
In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying recurrent event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the recurrent event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and the observation times are treated as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximizing a conditional likelihood function of the observed event counts and solving estimating equations. Large-sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumor study is presented to illustrate the use of the proposed methods.
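A generic form of such a frailty-inflated proportional rate model is sketched below, with notation of our own choosing since the abstract fixes no symbols:

```latex
% Given covariates Z and an unobserved frailty \gamma with E[\gamma | Z] = 1,
% the recurrent event process N(t) has rate
\[
  E\bigl[dN(t) \mid Z, \gamma\bigr] = \gamma \,\lambda_0(t)\, e^{\beta' Z}\, dt,
\]
% where the baseline rate \lambda_0(t) is left unspecified and the same
% frailty \gamma also drives the observation-time process, which is what
% makes the observation times informative.
```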
Abstract:
Geostatistics involves the fitting of spatially continuous models to spatially discrete data (Chilès and Delfiner, 1999). Preferential sampling arises when the process that determines the data locations and the process being modelled are stochastically dependent. Conventional geostatistical methods assume, if only implicitly, that sampling is non-preferential. However, these methods are often used in situations where sampling is likely to be preferential. For example, in mineral exploration, samples may be concentrated in areas thought likely to yield high-grade ore. We give a general expression for the likelihood function of preferentially sampled geostatistical data and describe how this can be evaluated approximately using Monte Carlo methods. We present a model for preferential sampling, and demonstrate through simulated examples that ignoring preferential sampling can lead to seriously misleading inferences. We describe an application of the model to a set of bio-monitoring data from Galicia, northern Spain, in which making allowance for preferential sampling materially changes the inferences.
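The kind of dependence the paper warns about is easy to reproduce in simulation. In the sketch below, the grid size, covariance range, and preferentiality parameter beta are our illustrative choices; locations are sampled preferentially in high-value regions of a latent Gaussian field, so the naive sample mean is biased upward:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
xs = np.linspace(0, 1, n)
X, Y = np.meshgrid(xs, xs)
pts = np.column_stack([X.ravel(), Y.ravel()])

# Latent Gaussian field with exponential covariance on the grid.
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
cov = np.exp(-d / 0.2)
S = np.linalg.cholesky(cov + 1e-8 * np.eye(n * n)) @ rng.standard_normal(n * n)

# Preferential design: inclusion probability proportional to exp(beta * S).
beta = 2.0
w = np.exp(beta * S)
idx = rng.choice(n * n, size=100, replace=False, p=w / w.sum())
obs = S[idx] + 0.1 * rng.standard_normal(100)   # noisy measurements

# High-S areas are over-sampled, so the naive mean of `obs` overestimates
# the field mean; a likelihood that honors the design corrects this.
print(S.mean(), obs.mean())
```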
Abstract:
Amyloids and prion proteins are clinically and biologically important beta-structures, whose supersecondary structures are difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Recent work has indicated the utility of pairwise probabilistic statistics in beta-structure prediction. We develop here a new strategy for beta-structure prediction, emphasizing the determination of beta-strands and pairs of beta-strands as fundamental units of beta-structure. Our program, BETASCAN, calculates likelihood scores for potential beta-strands and strand-pairs based on correlations observed in parallel beta-sheets. The program then determines the strands and pairs with the greatest local likelihood for all of the sequence's potential beta-structures. BETASCAN suggests multiple alternate folding patterns and assigns relative a priori probabilities based solely on amino acid sequence, probability tables, and pre-chosen parameters. The algorithm compares favorably with previous algorithms (BETAPRO, PASTA, SALSA, TANGO, and Zyggregator) in beta-structure prediction and amyloid propensity prediction. Accurate prediction is demonstrated for experimentally determined amyloid beta-structures, for a set of known beta-aggregates, and for the parallel beta-strands of beta-helices (amyloid-like globular proteins). BETASCAN is able both to detect beta-strands with higher sensitivity and to detect the edges of beta-strands in a richly beta-like sequence. For two proteins (Abeta and Het-s), there exist multiple sets of experimental data implying contradictory structures; BETASCAN is able to detect each competing structure as a potential structure variant. The ability to correlate multiple alternate beta-structures to experiment opens the possibility of computational investigation of prion strains and structural heterogeneity of amyloid. BETASCAN is publicly accessible on the Web at http://betascan.csail.mit.edu.
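To illustrate the flavor of window-based local likelihood scoring: BETASCAN's actual probability tables come from correlations observed in parallel beta-sheets and are not reproduced here; the per-residue table and scores below are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical per-residue log-odds (NOT BETASCAN's tables).
LOG_ODDS = {"V": 0.8, "I": 0.7, "F": 0.5, "Y": 0.4, "L": 0.3,
            "T": 0.1, "A": 0.0, "S": -0.1, "G": -0.5, "P": -1.5}

def strand_scores(seq, min_len=4, max_len=9):
    """Score every candidate strand (window) by summed log-odds and
    return candidates sorted best-first."""
    out = []
    for length in range(min_len, max_len + 1):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            score = sum(LOG_ODDS.get(aa, 0.0) for aa in window)
            out.append((score, start, window))
    return sorted(out, reverse=True)

best = strand_scores("SGAVVIFTLPGS")[:3]   # top-scoring candidate strands
```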
Abstract:
Passive positioning systems produce user location information for third-party providers of positioning services. Since the tracked wireless devices do not participate in the positioning process, passive positioning can rely only on simple, measurable radio signal parameters, such as timing or power information. In this work, we provide a passive tracking system for WiFi signals with an enhanced particle filter using fine-grained power-based ranging. Our proposed particle filter provides an improved likelihood function on the observation parameters and is equipped with a modified coordinated turn model to address the challenges of a passive positioning system. The anchor nodes for WiFi signal sniffing and target positioning use software-defined radio techniques to extract channel state information and mitigate multipath effects. By combining the enhanced particle filter with a set of enhanced ranging methods, our system can track mobile targets with an accuracy of 1.5 m at the 50th percentile and 2.3 m at the 90th percentile in a complex indoor environment. Our proposed particle filter significantly outperforms the typical bootstrap particle filter, the extended Kalman filter, and trilateration algorithms.
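The basic structure of a particle filter with a power-based ranging likelihood can be sketched as follows. The log-distance path-loss parameters and the random-walk motion model are our illustrative stand-ins for the paper's enhanced likelihood and coordinated turn model:

```python
import numpy as np

rng = np.random.default_rng(2)
ANCHORS = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
P0, PLE, SIGMA = -40.0, 2.0, 4.0   # dBm at 1 m, path-loss exponent, noise sd

def rssi_loglik(particles, rssi):
    """Log-likelihood of the observed RSSI vector given particle positions,
    under a log-distance path-loss model with Gaussian noise."""
    d = np.linalg.norm(particles[:, None, :] - ANCHORS[None, :, :], axis=-1)
    mu = P0 - 10.0 * PLE * np.log10(np.maximum(d, 0.1))
    return -0.5 * np.sum((rssi - mu) ** 2, axis=1) / SIGMA**2

def step(particles, rssi, motion_sd=0.5):
    """One predict-weight-resample cycle (bootstrap style)."""
    particles = particles + motion_sd * rng.standard_normal(particles.shape)
    logw = rssi_loglik(particles, rssi)
    w = np.exp(logw - logw.max())
    idx = rng.choice(len(particles), size=len(particles), p=w / w.sum())
    return particles[idx]

particles = rng.uniform(0, 10, size=(2000, 2))
particles = step(particles, rssi=np.array([-55.0, -60.0, -58.0]))
estimate = particles.mean(axis=0)    # position estimate
```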
Abstract:
Do siblings of centenarians tend to have longer life spans? To answer this question, the life spans of 184 siblings of 42 centenarians were evaluated. Two important questions were addressed in analyzing the sibling data. First, a standard needed to be established against which the life spans of the 184 siblings could be compared. In this report, an external reference population is constructed from the U.S. life tables; its estimated mortality rates are treated as baseline hazards from which the relative mortality of the siblings is estimated. Second, standard survival models, which assume independent observations, are invalid when within-family correlation exists, as they underestimate the true variance. Three methods that allow for such correlation are illustrated. First, the cumulative relative excess mortality between the siblings and their comparison group is calculated and used as an effective graphical tool, along with the product-limit estimator of the survival function; the variance estimator of the cumulative relative excess mortality is adjusted for potential within-family correlation using a Taylor linearization approach. Second, approaches that adjust for the inflated variance are examined: an adjusted one-sample log-rank test using the design effect originally proposed by Rao and Scott in the correlated binomial or Poisson setting, and a robust variance estimator derived from the log-likelihood function of a multiplicative model. Neither of these two approaches provides an estimate of the within-family correlation, but the comparison with the standard remains valid under dependence. Last, using the frailty model concept, the multiplicative model, in which the baseline hazards are known, is extended by adding a random frailty term based on the positive stable or the gamma distribution. Comparisons between the two frailty distributions are performed by simulation. Based on the results from the various approaches, it is concluded that the siblings of centenarians had significantly lower mortality rates than their cohorts. The frailty models also indicate significant correlations between the life spans of the siblings.
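The external-reference comparison is of the classic standardized-mortality-ratio form; a sketch in our own illustrative notation:

```latex
% With O observed sibling deaths and E expected deaths accumulated from the
% U.S. life-table hazard \lambda_{\mathrm{US}},
\[
  \widehat{\theta} = \frac{O}{E}, \qquad
  E = \sum_i \int_0^{t_i} \lambda_{\mathrm{US}}(a_i + u)\, du,
\]
% where a_i is the entry age and t_i the follow-up time of sibling i. The
% within-family correlation inflates \mathrm{Var}(\widehat{\theta}) above the
% independence (Poisson) value O/E^2, hence the adjusted variance estimators.
```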
Abstract:
Standard methods for testing safety data are needed to ensure the safe conduct of clinical trials. In particular, objective rules for reliably identifying unsafe treatments need to be put into place to help protect patients from unnecessary harm. Data monitoring committees (DMCs) are uniquely qualified to evaluate accumulating unblinded data and make recommendations about the continuing safe conduct of a trial. However, it is the trial leadership who must make the tough ethical decision about stopping a trial, and they could benefit from objective statistical rules that help them judge the strength of evidence contained in the blinded data. We design early stopping rules for harm that act as continuous safety screens for randomized controlled clinical trials with blinded treatment information, and which could be used by anyone, including trial investigators and trial leadership. A Bayesian framework, with emphasis on the likelihood function, is used to allow for continuous monitoring without adjusting for multiple comparisons. Close collaboration between the statistician and the clinical investigators will be needed to design safety screens with good operating characteristics. Though the math underlying this procedure may be computationally intensive, implementation of the statistical rules will be easy, and the continuous screening will give suitably early warning should real problems emerge. Trial investigators and trial leadership need these safety screens to help them effectively monitor the ongoing safe conduct of clinical trials with blinded data.
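A minimal sketch of a likelihood-based safety screen of this general kind; the Beta-Binomial model, threshold rate, and cutoff below are our illustrative choices, not the authors' rule:

```python
from scipy.stats import beta

def safety_flag(n_events, n_patients, p0=0.10, cutoff=0.95,
                prior_a=1.0, prior_b=1.0):
    """Flag if Pr(adverse-event rate > p0 | data) exceeds `cutoff` under a
    Beta-Binomial model with a Beta(prior_a, prior_b) prior. Because the
    rule is posterior/likelihood based, it can be applied continuously
    without multiplicity adjustment."""
    post = beta(prior_a + n_events, prior_b + n_patients - n_events)
    prob_harm = 1.0 - post.cdf(p0)    # posterior probability of excess risk
    return prob_harm > cutoff, prob_harm

# Re-evaluate the screen as blinded data accrue:
flag, prob = safety_flag(n_events=18, n_patients=100)
```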
Abstract:
In this work, we propose the use of a Bayesian method to estimate the memory parameter of a stochastic process with long memory when its likelihood function is intractable or unavailable. This approach provides an approximation to the posterior distribution of the memory and other parameters and is based on a simple application of the method known as approximate Bayesian computation (ABC). Some popular estimators of the memory parameter are reviewed and compared with this approach. Our proposal makes it feasible to solve complex problems from a Bayesian standpoint and, although approximate, performs very satisfactorily when compared with classical methods.
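For orientation, a standard definition of the long-memory parameter targeted by such an ABC procedure; the thesis's exact parametrization (e.g. ARFIMA) may differ:

```latex
% A stationary process has long memory with parameter d \in (0, 1/2) if its
% autocovariances satisfy
\[
  \gamma(k) \sim C\, k^{2d-1} \quad (k \to \infty), \qquad
  \sum_{k} |\gamma(k)| = \infty .
\]
% ABC targets the posterior of d by comparing summary statistics of the
% observed series with those of series simulated under candidate values of d.
```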
Abstract:
We introduce a family of rules for adjusting one's credences in response to learning the credences of others. These rules have a number of desirable features. 1. They yield the posterior credences that would result from updating by standard Bayesian conditionalization on one's peers' reported credences if one's likelihood function takes a particular simple form. 2. In their simplest form, they are symmetric among the agents in the group. 3. They map neatly onto the familiar Condorcet voting results. 4. They preserve shared agreement about independence in a wide range of cases. 5. They commute with conditionalization and with multiple peer updates. Importantly, these rules have a surprising property that we call synergy: peer testimony of credences can provide mutually supporting evidence, raising an individual's credence above any peer's initially reported credence. At first, this may seem to be a strike against them. We argue, however, that synergy is actually a desirable feature, and the failure of other updating rules to yield synergy is a strike against them.
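A worked instance of synergy under a multiplicative pooling rule of the simple form the abstract alludes to; the paper's exact rule may differ:

```latex
% Two peers each report credence 0.6 in H; pooling multiplicatively,
\[
  c'(H) = \frac{c_1(H)\,c_2(H)}{c_1(H)\,c_2(H) + c_1(\neg H)\,c_2(\neg H)}
        = \frac{0.6 \times 0.6}{0.36 + 0.16} \approx 0.692 ,
\]
% which exceeds both reports of 0.6: the mutually supporting testimony raises
% the pooled credence above any individual report -- this is synergy.
```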
Abstract:
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss the advantages conveyed by the definition of a probability density function for PCA.
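The EM iteration for probabilistic PCA can be sketched compactly; the closed-form updates below follow the standard Tipping and Bishop formulation, with dimensions and iteration count as our illustrative choices:

```python
import numpy as np

def ppca_em(X, q=2, n_iter=200):
    """EM for probabilistic PCA: latent model x = W z + mu + eps with
    z ~ N(0, I_q) and eps ~ N(0, sigma2 * I_d)."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)                 # center; ML mu is the sample mean
    S = Xc.T @ Xc / n                       # sample covariance
    W = np.random.default_rng(3).standard_normal((d, q))
    sigma2 = 1.0
    for _ in range(n_iter):
        M = W.T @ W + sigma2 * np.eye(q)    # E-step quantity
        Minv = np.linalg.inv(M)
        # M-step closed forms in terms of S:
        W_new = S @ W @ np.linalg.inv(sigma2 * np.eye(q) + Minv @ W.T @ S @ W)
        sigma2 = np.trace(S - S @ W @ Minv @ W_new.T) / d
        W = W_new
    return W, sigma2   # span(W) converges to the principal subspace

W, s2 = ppca_em(np.random.default_rng(4).standard_normal((500, 5)), q=2)
```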
Abstract:
The paper develops a novel realized matrix-exponential stochastic volatility model of multivariate returns and realized covariances that incorporates asymmetry and long memory (hereafter the RMESV-ALM model). The matrix-exponential transformation guarantees the positive definiteness of the dynamic covariance matrix. The contribution of the paper ties in with Robert Basmann's seminal work on the estimation of highly non-linear model specifications ("Causality tests and observationally equivalent representations of econometric models", Journal of Econometrics, 1988, 39(1-2), 69–104), especially in developing tests for leverage and spillover effects in the covariance dynamics. Efficient importance sampling is used to maximize the likelihood function of the RMESV-ALM model, and the finite-sample properties of the quasi-maximum likelihood estimator of the parameters are analysed. Using high-frequency data for three US financial assets, the new model is estimated and evaluated. The forecasting performance of the new model is compared with that of a novel dynamic realized matrix-exponential conditional covariance model. The volatility and co-volatility spillovers are examined via the news impact curves and the impulse response functions from returns to volatility and co-volatility.
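Why the matrix-exponential transformation guarantees a valid covariance matrix can be seen in a few lines; the 3x3 matrix below is an arbitrary example, not the model's fitted dynamics:

```python
import numpy as np
from scipy.linalg import expm

# For any symmetric A, expm(A) has eigenvalues exp(lambda_i) > 0, so it is
# automatically a symmetric positive definite (valid covariance) matrix.
rng = np.random.default_rng(5)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2.0                 # symmetric "log-covariance" matrix
Sigma = expm(A)                     # guaranteed symmetric positive definite

eigvals = np.linalg.eigvalsh(Sigma)
assert np.all(eigvals > 0)          # positive definite, no constraints needed
```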