146 results for Likelihood Functions
in Queensland University of Technology - ePrints Archive
Abstract:
We propose a simple method of constructing quasi-likelihood functions for dependent data based on conditional-mean-variance relationships, and apply the method to estimating the fractal dimension from box-counting data. Simulation studies were carried out to compare this method with the traditional methods. We also applied this technique to real data from fishing grounds in the Gulf of Carpentaria, Australia.
Abstract:
This paper presents a framework for performing real-time recursive estimation of landmarks’ visual appearance. Imaging data in its original high dimensional space is probabilistically mapped to a compressed low dimensional space through the definition of likelihood functions. The likelihoods are subsequently fused with prior information using a Bayesian update. This process produces a probabilistic estimate of the low dimensional representation of the landmark visual appearance. The overall filtering provides information complementary to the conventional position estimates which is used to enhance data association. In addition to robotics observations, the filter integrates human observations in the appearance estimates. The appearance tracks as computed by the filter allow landmark classification. The set of labels involved in the classification task is thought of as an observation space where human observations are made by selecting a label. The low dimensional appearance estimates returned by the filter allow for low cost communication in low bandwidth sensor networks. Deployment of the filter in such a network is demonstrated in an outdoor mapping application involving a human operator, a ground and an air vehicle.
Abstract:
In this paper we present a new simulation methodology in order to obtain exact or approximate Bayesian inference for models for low-valued count time series data that have computationally demanding likelihood functions. The algorithm fits within the framework of particle Markov chain Monte Carlo (PMCMC) methods. The particle filter requires only model simulations and, in this regard, our approach has connections with approximate Bayesian computation (ABC). However, an advantage of using the PMCMC approach in this setting is that simulated data can be matched with data observed one-at-a-time, rather than attempting to match on the full dataset simultaneously or on a low-dimensional non-sufficient summary statistic, which is common practice in ABC. For low-valued count time series data we find that it is often computationally feasible to match simulated data with observed data exactly. Our particle filter maintains $N$ particles by repeating the simulation until $N+1$ exact matches are obtained. Our algorithm creates an unbiased estimate of the likelihood, resulting in exact posterior inferences when included in an MCMC algorithm. In cases where exact matching is computationally prohibitive, a tolerance is introduced as per ABC. A novel aspect of our approach is that we introduce auxiliary variables into our particle filter so that partially observed and/or non-Markovian models can be accommodated. We demonstrate that Bayesian model choice problems can be easily handled in this framework.
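The exact-matching idea in the abstract above can be sketched for a single, simplified case. The sketch assumes a fully observed Markov count model, so it omits the auxiliary variables the abstract introduces for partially observed models. The toy model `inar_step` (binomial thinning plus a Bernoulli arrival) and all function names are hypothetical, chosen only for illustration. Each likelihood factor is estimated by simulating until $N+1$ exact matches occur and returning $N/(n-1)$, where $n$ is the number of simulations used; this is the standard unbiased estimator under inverse binomial sampling.

```python
import random


def match_probability_estimate(simulate, observed, n_particles, rng,
                               max_tries=1_000_000):
    """Estimate P(simulate() == observed) by simulating until
    n_particles + 1 exact matches occur (inverse binomial sampling)."""
    matches = 0
    tries = 0
    while matches < n_particles + 1:
        if tries >= max_tries:
            raise RuntimeError("too many simulations without enough matches")
        tries += 1
        if simulate(rng) == observed:
            matches += 1
    # Unbiased estimate of the match probability: N / (n - 1).
    return n_particles / (tries - 1)


def inar_step(prev, alpha, p, rng):
    """Hypothetical toy model: each of `prev` individuals survives with
    probability alpha, and one new arrival occurs with probability p."""
    survivors = sum(rng.random() < alpha for _ in range(prev))
    arrival = 1 if rng.random() < p else 0
    return survivors + arrival


def likelihood_estimate(counts, alpha, p, n_particles, seed=0):
    """Product of per-step match-probability estimates, conditioning on
    the first observation."""
    rng = random.Random(seed)
    estimate = 1.0
    for prev, cur in zip(counts, counts[1:]):
        estimate *= match_probability_estimate(
            lambda r: inar_step(prev, alpha, p, r), cur, n_particles, rng
        )
    return estimate
```

In a PMCMC setting, an estimate of this kind would stand in for the intractable likelihood inside a Metropolis-Hastings acceptance ratio. Where exact matches are too rare, the equality test would be relaxed to a tolerance, as the abstract notes.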
Abstract:
Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1–28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens.
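As a point of reference for the abstract above, the basic idea of ABC model choice can be sketched with a plain rejection sampler. This is not the regression-based algorithm with a reversible jump step that the abstract proposes; it is the simplest ABC variant, shown under assumed toy models. The two models (a binomial success probability drawn from Uniform(0, 0.5) or Uniform(0.5, 1)), the equal model prior, the tolerance, and all function names are hypothetical.

```python
import random


def simulate_successes(p, n_trials, rng):
    """Simulate the number of successes in n_trials Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n_trials))


def abc_model_choice(observed, n_trials, n_draws=20000, tolerance=1, seed=1):
    """Rejection ABC: draw a model and parameter from the prior, simulate,
    and keep the draw if the simulated count is within `tolerance` of the
    observed count. Returns estimated posterior model probabilities."""
    rng = random.Random(seed)
    accepted = [0, 0]
    for _ in range(n_draws):
        model = rng.randrange(2)  # equal prior probability for each model
        if model == 0:
            p = rng.uniform(0.0, 0.5)  # assumed prior under model 0
        else:
            p = rng.uniform(0.5, 1.0)  # assumed prior under model 1
        simulated = simulate_successes(p, n_trials, rng)
        if abs(simulated - observed) <= tolerance:
            accepted[model] += 1
    total = sum(accepted)
    if total == 0:
        raise ValueError("no draws accepted; increase n_draws or tolerance")
    return [count / total for count in accepted]
```

The proportion of accepted draws belonging to each model approximates its posterior probability. No likelihood is evaluated at any point; only simulation from the models is required.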
Abstract:
Background: A knowledge of energy expenditure in infancy is required for the estimation of recommended daily amounts of food energy, for designing artificial infant feeds, and as a reference standard for studies of energy metabolism in disease states. Objectives: The objective of this study was to construct centile reference charts for total energy expenditure (TEE) in infants across the first year of life. Methods: Repeated measures of TEE using the doubly labeled water technique were made in 162 infants at 1.5, 3, 6, 9 and 12 months. In total, 322 TEE measurements were obtained. The LMS method with maximum penalized likelihood was used to construct the centile reference charts. Centiles were constructed for TEE expressed as MJ/day and also expressed relative to body weight (BW) and fat-free mass (FFM). Results: TEE increased with age and was 1.40, 1.86, 2.64, 3.07 and 3.65 MJ/day at 1.5, 3, 6, 9 and 12 months, respectively. The standard deviations were 0.43, 0.47, 0.52, 0.66 and 0.88, respectively. TEE in MJ/kg increased from 0.29 to 0.36 and in MJ/day/kg FFM from 0.36 to 0.48. Conclusions: We have presented centile reference charts for TEE expressed as MJ/day and expressed relative to BW and FFM in infants across the first year of life. There was a wide variation or biological scatter in TEE values seen at all ages. We suggest that these centile charts may be used to assess and possibly quantify abnormal energy metabolism in disease states in infants.
Abstract:
OBJECTIVE: To further investigate a common variant (rs9939609) in the fat mass- and obesity-associated gene (FTO), which recent genome-wide association studies have shown to be associated with body mass index (BMI) and obesity. DESIGN: We examined the effect of this FTO variant on BMI in 3353 Australian adult male and female twins. RESULTS: The minor A allele of rs9939609 was associated with an increased BMI (P=0.0007). Each additional copy of the A allele was associated with a mean BMI increase of approximately 1.04 kg/m(2) (approximately 3.71 kg). Using variance components decomposition, we estimate that this single-nucleotide polymorphism accounts for approximately 3% of the genetic variance in BMI in our sample (approximately 2% of the total variance). By comparing intrapair variances of monozygotic twins of different genotypes we were able to perform a direct test of gene by environment (G x E) interaction in both sexes and gene by parity (G x P) interaction in women, but no evidence was found for either. CONCLUSIONS: In addition to supporting earlier findings that the rs9939609 variant in the FTO gene is associated with an increased BMI, our results indicate that the associated genetic effect does not interact with environment or parity.
Abstract:
Matrix function approximation is a current focus of worldwide interest and finds application in a variety of areas of applied mathematics and statistics. In this thesis we focus on the approximation of A^(-α/2)b, where A ∈ ℝ^(n×n) is a large, sparse symmetric positive definite matrix and b ∈ ℝ^n is a vector. In particular, we will focus on matrix function techniques for sampling from Gaussian Markov random fields in applied statistics and the solution of fractional-in-space partial differential equations. Gaussian Markov random fields (GMRFs) are multivariate normal random variables characterised by a sparse precision (inverse covariance) matrix. GMRFs are popular models in computational spatial statistics as the sparse structure can be exploited, typically through the use of the sparse Cholesky decomposition, to construct fast sampling methods. It is well known, however, that for sufficiently large problems, iterative methods for solving linear systems outperform direct methods. Fractional-in-space partial differential equations arise in models of processes undergoing anomalous diffusion. Unfortunately, as the fractional Laplacian is a non-local operator, numerical methods based on the direct discretisation of these equations typically require the solution of dense linear systems, which is impractical for fine discretisations. In this thesis, novel applications of Krylov subspace approximations to matrix functions for both of these problems are investigated. Matrix functions arise when sampling from a GMRF by noting that the Cholesky decomposition A = LL^T is, essentially, a "square root" of the precision matrix A. Therefore, we can replace the usual sampling method, which forms x = L^(-T)z, with x = A^(-1/2)z, where z is a vector of independent and identically distributed standard normal random variables.
Similarly, the matrix transfer technique can be used to build solutions to the fractional Poisson equation of the form ϕn = A^(-α/2)b, where A is the finite difference approximation to the Laplacian. Hence both applications require the approximation of f(A)b, where f(t) = t^(-α/2) and A is sparse. In this thesis we will compare the Lanczos approximation, the shift-and-invert Lanczos approximation, the extended Krylov subspace method, rational approximations and the restarted Lanczos approximation for approximating matrix functions of this form. A number of novel results are presented in this thesis. Firstly, we prove the convergence of the matrix transfer technique for the solution of the fractional Poisson equation and we give conditions under which the finite difference discretisation can be replaced by other methods for discretising the Laplacian. We then investigate a number of methods for approximating matrix functions of the form A^(-α/2)b and investigate stopping criteria for these methods. In particular, we derive a new method for restarting the Lanczos approximation to f(A)b. We then apply these techniques to the problem of sampling from a GMRF and construct a full suite of methods for sampling conditioned on linear constraints and approximating the likelihood. Finally, we consider the problem of sampling from a generalised Matern random field, which combines our techniques for solving fractional-in-space partial differential equations with our method for sampling from GMRFs.
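The claim in the abstract above, that x = A^(-1/2)z can replace x = L^(-T)z when sampling from a GMRF, follows from a short covariance calculation. The derivation below is a sketch that uses only the definitions stated in the abstract (A symmetric positive definite with Cholesky factor L, z standard normal) and is not taken from the thesis itself.

```latex
\begin{align*}
z &\sim \mathcal{N}(0, I), \qquad A = L L^{T},\\
\operatorname{Cov}\!\left(L^{-T} z\right)
  &= L^{-T}\, \mathbb{E}\!\left[z z^{T}\right] L^{-1}
   = \left(L L^{T}\right)^{-1} = A^{-1},\\
\operatorname{Cov}\!\left(A^{-1/2} z\right)
  &= A^{-1/2}\, \mathbb{E}\!\left[z z^{T}\right] A^{-1/2}
   = A^{-1}.
\end{align*}
```

Both vectors are zero-mean Gaussian with covariance A^(-1), so they share the same distribution. The second form needs only the action of A^(-1/2) on a vector, which is what makes iterative matrix function approximations applicable in place of a factorisation.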
Abstract:
A quasi-maximum likelihood procedure for estimating the parameters of multi-dimensional diffusions is developed in which the transitional density is a multivariate Gaussian density with first and second moments approximating the true moments of the unknown density. For affine drift and diffusion functions, the moments are exactly those of the true transitional density and for nonlinear drift and diffusion functions the approximation is extremely good and is as effective as alternative methods based on likelihood approximations. The estimation procedure generalises to models with latent factors. A conditioning procedure is developed that allows parameter estimation in the absence of proxies.
Abstract:
The method of generalized estimating equations (GEE) is a popular tool for analysing longitudinal (panel) data. Often, the covariates collected are time-dependent in nature, for example, age, relapse status, monthly income. When using GEE to analyse longitudinal data with time-dependent covariates, crucial assumptions about the covariates are necessary for valid inferences to be drawn. When those assumptions do not hold or cannot be verified, Pepe and Anderson (1994, Communications in Statistics, Simulations and Computation 23, 939–951) advocated using an independence working correlation assumption in the GEE model as a robust approach. However, using GEE with the independence correlation assumption may lead to significant efficiency loss (Fitzmaurice, 1995, Biometrics 51, 309–317). In this article, we propose a method that extracts additional information from the estimating equations that are excluded by the independence assumption. The method always includes the estimating equations under the independence assumption and the contribution from the remaining estimating equations is weighted according to the likelihood of each equation being a consistent estimating equation and the information it carries. We apply the method to a longitudinal study of the health of a group of Filipino children.
Abstract:
Robust methods are useful in making reliable statistical inferences when there are small deviations from the model assumptions. The widely used method of the generalized estimating equations can be "robustified" by replacing the standardized residuals with the M-residuals. If the Pearson residuals are assumed to be unbiased from zero, parameter estimators from the robust approach are asymptotically biased when error distributions are not symmetric. We propose a distribution-free method for correcting this bias. Our extensive numerical studies show that the proposed method can reduce the bias substantially. Examples are given for illustration.
Abstract:
The tissue kallikreins are serine proteases encoded by highly conserved multigene families. The rodent kallikrein (KLK) families are particularly large, consisting of 13–26 genes clustered in one chromosomal locus. It has been recently recognised that the human KLK gene family is of a similar size (15 genes) with the identification of another 12 related genes (KLK4-KLK15) within and adjacent to the original human KLK locus (KLK1-3) on chromosome 19q13.4. The structural organisation and size of these new genes is similar to that of other KLK genes except for additional exons encoding 5′ or 3′ untranslated regions. Moreover, many of these genes have multiple mRNA transcripts, a trait not observed with rodent genes. Unlike all other kallikreins, the KLK4-KLK15 encoded proteases are less related (25–44%) and do not contain a conventional kallikrein loop. Clusters of genes exhibit high prostatic (KLK2-4, KLK15) or pancreatic (KLK6-13) expression, suggesting evolutionary conservation of elements conferring tissue specificity. These genes are also expressed, to varying degrees, in a wider range of tissues, suggesting a functional involvement of these newer human kallikrein proteases in a diverse range of physiological processes.