5 resultados para MLE
em Queensland University of Technology - ePrints Archive
Resumo:
This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
Resumo:
Most unsignalised intersection capacity calculation procedures are based on gap acceptance models. Accuracy of critical gap estimation affects accuracy of capacity and delay estimation. Several methods have been published to estimate drivers’ sample mean critical gap, the Maximum Likelihood Estimation (MLE) technique regarded as the most accurate. This study assesses three novel methods; Average Central Gap (ACG) method, Strength Weighted Central Gap method (SWCG), and Mode Central Gap method (MCG), against MLE for their fidelity in rendering true sample mean critical gaps. A Monte Carlo event based simulation model was used to draw the maximum rejected gap and accepted gap for each of a sample of 300 drivers across 32 simulation runs. Simulation mean critical gap is varied between 3s and 8s, while offered gap rate is varied between 0.05veh/s and 0.55veh/s. This study affirms that MLE provides a close to perfect fit to simulation mean critical gaps across a broad range of conditions. The MCG method also provides an almost perfect fit and has superior computational simplicity and efficiency to the MLE. The SWCG method performs robustly under high flows; however, poorly under low to moderate flows. Further research is recommended using field traffic data, under a variety of minor stream and major stream flow conditions for a variety of minor stream movement types, to compare critical gap estimates using MLE against MCG. Should the MCG method prove as robust as MLE, serious consideration should be given to its adoption to estimate critical gap parameters in guidelines.