905 results for Methods : Statistical


Relevance: 20.00%

Publisher:

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of the recorded zeroes: these may represent a zero response given some threshold (presence) or indicate that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses while taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts, and the dingo, cypress and toad case studies described in the motivation chapter are examples of it.

Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for the parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem.

Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea that is present, though not fully developed, in the literature, and propose the integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces the background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process; the second allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer.
A major contribution of the thesis is the development, for the first time, of a fully Bayesian approach to inference for these hierarchical models. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
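As a concrete illustration of the path-sampling idea behind IMCS, the sketch below estimates a log NC ratio for a simplified autologistic model (a single interaction parameter rather than the three-parameter form studied in the thesis) by Gibbs-sampling the lattice at a grid of parameter values, averaging the canonical pair statistic, and integrating with the trapezoidal rule. This is not the thesis code; all sizes and iteration counts are arbitrary illustrative choices.

```python
# Illustrative sketch (not the thesis code): estimating a log normalization-constant
# (NC) ratio by path sampling, i.e. integrating a Monte Carlo estimate of the mean
# canonical statistic over a grid of parameter values, for a simplified autologistic
# model with a single interaction parameter. Lattice size, grid and iteration counts
# are arbitrary choices made for illustration.
import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep(x, alpha, beta):
    """One Gibbs sweep over a binary (0/1) lattice with 4-neighbour dependence."""
    n_r, n_c = x.shape
    for i in range(n_r):
        for j in range(n_c):
            s = (x[i - 1, j] if i > 0 else 0) + (x[i + 1, j] if i < n_r - 1 else 0) \
              + (x[i, j - 1] if j > 0 else 0) + (x[i, j + 1] if j < n_c - 1 else 0)
            p = 1.0 / (1.0 + np.exp(-(alpha + beta * s)))
            x[i, j] = rng.random() < p
    return x

def pair_statistic(x):
    """Canonical statistic paired with beta: the number of 1-1 neighbour pairs."""
    return np.sum(x[:-1, :] * x[1:, :]) + np.sum(x[:, :-1] * x[:, 1:])

def mean_pair_statistic(alpha, beta, n_burn=200, n_keep=300, shape=(15, 15)):
    x = (rng.random(shape) < 0.5).astype(float)
    for _ in range(n_burn):
        gibbs_sweep(x, alpha, beta)
    return np.mean([pair_statistic(gibbs_sweep(x, alpha, beta)) for _ in range(n_keep)])

# log Z(alpha, beta_1) - log Z(alpha, beta_0) equals the integral over beta of the
# expected pair statistic; approximate it with the trapezoidal rule on a coarse grid.
alpha, beta_grid = -0.5, np.linspace(0.0, 0.4, 9)
means = [mean_pair_statistic(alpha, b) for b in beta_grid]
print("estimated log NC ratio:", np.trapz(means, beta_grid))
```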

Relevance: 20.00%

Publisher:

Abstract:

This thesis applies Monte Carlo techniques to the study of X-ray absorptiometric methods of bone mineral measurement. These studies seek to obtain information that can be used in efforts to improve the accuracy of bone mineral measurements. A Monte Carlo computer code for X-ray photon transport at diagnostic energies has been developed from first principles. This development was undertaken because there was no readily available code that included electron binding energy corrections for incoherent scattering, and one of the objectives of the project was to study the effects of including these corrections in Monte Carlo models. The code includes the main Monte Carlo program plus utilities for dealing with input data. A number of geometrical subroutines that can be used to construct complex geometries have also been written. The accuracy of the Monte Carlo code has been evaluated against the predictions of theory and the results of experiments. The results show a high correlation with theoretical predictions. In comparisons of model results with those of direct experimental measurements, agreement to within the model and experimental variances is obtained. The code is an accurate and valid modelling tool.

A study of the significance of including electron binding energy corrections for incoherent scatter in the Monte Carlo code has been made. The results show this significance to be very dependent upon the type of application. The most significant effect is a reduction of low-angle scatter flux for high atomic number scatterers.

To apply the Monte Carlo code effectively to the study of bone mineral density measurement by photon absorptiometry, the results must be considered in the context of a theoretical framework for the extraction of energy-dependent information from planar X-ray beams. Such a theoretical framework is developed, and the two-dimensional nature of tissue decomposition based on attenuation measurements alone is explained. This theoretical framework forms the basis for analytical models of bone mineral measurement by dual energy X-ray photon absorptiometry techniques.

Monte Carlo models of dual energy X-ray absorptiometry (DEXA) have been established. These models have been used to study the contribution of scattered radiation to the measurements. It has been demonstrated that the measurement geometry has a significant effect upon the scatter contribution to the detected signal. For the geometry of the models studied in this work, the scatter has no significant effect upon the results of the measurements. The model has also been used to study a proposed technique which involves dual energy X-ray transmission measurements plus a linear measurement of the distance along the ray path, designated here as the DPA(+) technique. The addition of the linear measurement enables the tissue decomposition to be extended to three components; bone mineral, fat and lean soft tissue are the components considered here. The results of the model demonstrate that the measurement of bone mineral using this technique is stable over a wide range of soft tissue compositions, and hence indicate the potential to overcome a major problem of the two-component DEXA technique. However, the results also show that the accuracy of the DPA(+) technique is highly dependent upon the composition of the non-mineral components of bone, and that it has poorer precision (approximately twice the coefficient of variation) than the standard DEXA measurements. These factors may limit the usefulness of the technique.
These studies illustrate the value of Monte Carlo computer modelling of quantitative X-ray measurement techniques. The Monte Carlo models of bone densitometry measurement have:
1. demonstrated the significant effects of the measurement geometry upon the contribution of scattered radiation to the measurements;
2. demonstrated that the statistical precision of the proposed DPA(+) three tissue component technique is poorer than that of the standard DEXA two tissue component technique;
3. demonstrated that the proposed DPA(+) technique has difficulty providing accurate simultaneous measurement of body composition in terms of a three component model of fat, lean soft tissue and bone mineral; and
4. provided a knowledge base for input to decisions about development (or otherwise) of a physical prototype DPA(+) imaging system.
The Monte Carlo computer code, data, utilities and associated models represent a set of significant, accurate and valid modelling tools for quantitative studies of physical problems in the fields of diagnostic radiology and radiography.
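The tissue decompositions underlying these measurements reduce to small linear systems, which the hedged sketch below illustrates: two log-attenuation measurements determine two areal densities (the DEXA case), and adding a path-length measurement permits the three-component DPA(+) decomposition. All attenuation coefficients, densities and compositions are placeholder numbers, not tabulated physical data or values from the thesis.

```python
# Illustrative sketch of the linear decompositions described above. Every number is a
# placeholder chosen for the example, not tabulated physical data or thesis results.
import numpy as np

# Two-component DEXA: log-attenuations at two energies, L(E) = mu_b(E)*t_b + mu_s(E)*t_s,
# determine the areal densities (g/cm^2) of bone mineral (b) and soft tissue (s).
mu = np.array([[0.60, 0.25],             # [mu_b(E_low),  mu_s(E_low)]   (placeholders)
               [0.30, 0.18]])            # [mu_b(E_high), mu_s(E_high)]  (placeholders)
t_true = np.array([1.2, 18.0])           # "true" bone mineral and soft tissue
L = mu @ t_true                          # simulated noise-free measurements
print("DEXA recovers:", np.linalg.solve(mu, L))

# DPA(+): adding a measurement of the path length along the ray gives a third
# equation, so soft tissue can be split into fat (f) and lean (l) components.
A = np.array([[0.60, 0.20, 0.26],        # attenuation coefficients at E_low  (placeholders)
              [0.30, 0.16, 0.19],        # attenuation coefficients at E_high (placeholders)
              [1/3.0, 1/0.9, 1/1.06]])   # reciprocal densities: areal density -> thickness
t3_true = np.array([1.2, 3.0, 15.0])     # bone mineral, fat, lean (g/cm^2)
b = A @ t3_true                          # two log-attenuations plus path length (cm)
print("DPA(+) recovers:", np.linalg.solve(A, b))
```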

Relevance: 20.00%

Publisher:

Abstract:

Expert knowledge is valuable in many modelling endeavours, particularly where data are not extensive or sufficiently robust. In Bayesian statistics, expert opinion may be formulated as informative priors, to provide an honest reflection of the current state of knowledge before updating it with new information. Technology is increasingly being exploited to help support the process of eliciting such information. This paper reviews the benefits that have been gained from utilizing technology in this way. These benefits can be structured within a six-step elicitation design framework proposed recently (Low Choy et al., 2009). We assume that the purpose of elicitation is to formulate a Bayesian statistical prior, either to provide a standalone expert-defined model or for updating with new data within a Bayesian analysis. We also assume that the model has been pre-specified before selecting the software. In this case, technology has the most to offer in targeting what experts know (E2), eliciting and encoding expert opinions (E4), enhancing accuracy (E5), and providing an effective and efficient protocol (E6). Benefits include:
- providing an environment with familiar nuances (to make the expert comfortable) where experts can explore their knowledge from various perspectives (E2);
- automating tedious or repetitive tasks, thereby minimizing calculation errors, as well as encouraging interaction between elicitors and experts (E5);
- cognitive gains from educating users, enabling instant feedback (E2, E4-E5), and providing alternative methods of communicating assessments and feedback information, since experts think and learn differently; and
- ensuring that a repeatable and transparent protocol is used (E6).
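As an illustration of the encoding step (E4) and the instant-feedback benefit (E5), the sketch below fits a Beta prior to three quantiles supplied by an expert and reports the implied quantiles back. The choice of a Beta distribution, the quantile values and the least-squares criterion are assumptions made for illustration only, not prescriptions from the paper.

```python
# Small sketch of the "encoding" step (E4): turning an expert's elicited quantiles for
# a probability into a Beta prior by least squares. The quantile values and the use of
# a Beta distribution are illustrative assumptions, not part of the paper.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta

elicited = {0.1: 0.20, 0.5: 0.35, 0.9: 0.55}   # expert's 10th/50th/90th percentiles

def loss(log_params):
    a, b = np.exp(log_params)                  # keep shape parameters positive
    qs = beta.ppf(list(elicited.keys()), a, b)
    return np.sum((qs - np.array(list(elicited.values()))) ** 2)

res = minimize(loss, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)
print(f"encoded prior: Beta({a_hat:.2f}, {b_hat:.2f})")
# Instant feedback for the expert (E5): report back the quantiles implied by the fit.
print("implied 10/50/90% quantiles:", beta.ppf([0.1, 0.5, 0.9], a_hat, b_hat))
```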

Relevance: 20.00%

Publisher:

Abstract:

A national-level safety analysis tool is needed to complement existing analytical tools for assessing the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated that the models could not perform on par with their estimation-period performance. Recommendations for improving the state of the practice arising from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.
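A minimal sketch of the external-validation activity described above follows: a previously estimated crash prediction model is scored against an independent data set using simple agreement measures. The model form, its coefficients and the validation data are placeholders, not the IHSDM algorithms or the study data.

```python
# Generic sketch of external validation: scoring a previously estimated crash
# prediction model against an independent data set. The model form (a log-linear
# crash-flow function) and all numbers are illustrative placeholders.
import numpy as np

def predicted_crashes(major_aadt, minor_aadt, b0=-8.0, b1=0.6, b2=0.4):
    """Expected crashes per year from a placeholder log-linear crash-flow model."""
    return np.exp(b0 + b1 * np.log(major_aadt) + b2 * np.log(minor_aadt))

# Independent validation data: observed crash counts and traffic flows (made up).
observed   = np.array([3, 0, 5, 2, 1, 4])
major_aadt = np.array([12000, 4000, 18000, 9000, 6000, 15000])
minor_aadt = np.array([1500,   500,  2500, 1200,  800,  2000])

mu = predicted_crashes(major_aadt, minor_aadt)
mad  = np.mean(np.abs(observed - mu))          # mean absolute deviation
bias = np.mean(observed - mu)                  # mean prediction bias
chi2 = np.sum((observed - mu) ** 2 / mu)       # Pearson-type goodness-of-fit
print(f"MAD={mad:.2f}  bias={bias:.2f}  chi-square={chi2:.2f}")
```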

Relevance: 20.00%

Publisher:

Abstract:

With daily commercial and social activity in cities, regulation of train service in mass rapid transit railways is necessary to maintain service and passenger flow. Dwell-time adjustment at stations is one commonly used approach to regulating train service, but its control space is very limited. Coasting control is a viable means of meeting a specified run time on an inter-station run. The current practice is to start coasting at a fixed distance from the departed station; hence it is optimal only with respect to a nominal operational condition of the train schedule, not the current service demand. The advantage of coasting can only be fully secured when coasting points are determined in real time. However, identifying the necessary starting point(s) for coasting under the constraints of current service conditions is no simple task, as train movement is governed by a large number of factors. The feasibility and performance of classical and heuristic searching measures in locating coasting point(s) are studied with the aid of a single-train simulator, according to specified inter-station run times.
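The underlying problem is essentially a one-dimensional search: choose the coasting point so that the simulated run time matches the specified inter-station run time. The sketch below illustrates this with a crude constant-acceleration kinematic model standing in for the single-train simulator, and a bisection search standing in for the classical and heuristic measures studied in the paper; all parameters are illustrative assumptions.

```python
# Toy sketch of choosing a coasting point so that the simulated run time matches a
# specified target. The "simulator" is a crude constant-acceleration kinematic model,
# not the single-train simulator used in the paper; all parameters are illustrative.
import math

D = 2000.0                    # inter-station run length (m)
A, R, B = 1.0, 0.15, 1.0      # acceleration, coasting deceleration, braking deceleration (m/s^2)
V = 25.0                      # cruising speed (m/s)

def run_time(coast_start):
    """Run time (s) when coasting begins coast_start metres from the departed station."""
    d_acc, t_acc = V * V / (2 * A), V / A                 # accelerate to cruise speed
    t_cruise = (coast_start - d_acc) / V                  # cruise up to the coasting point
    s_coast = (2 * B * (D - coast_start) - V * V) / (2 * (B - R))   # coast until the
    v_brake = math.sqrt(V * V - 2 * R * s_coast)          # braking curve is reached
    return t_acc + t_cruise + (V - v_brake) / R + v_brake / B

def find_coasting_point(target_time, lo=350.0, hi=1350.0, tol=0.01):
    """Bisection: run time falls as coasting starts later, so bracket and halve."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if run_time(mid) > target_time:
            lo = mid          # coasting too early -> run too slow -> coast later
        else:
            hi = mid
    return 0.5 * (lo + hi)

target = 110.0                # specified inter-station run time (s)
x_c = find_coasting_point(target)
print(f"start coasting {x_c:.0f} m out; run time {run_time(x_c):.1f} s (target {target} s)")
```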

Relevance: 20.00%

Publisher:

Abstract:

A statistical modeling method to accurately determine combustion chamber resonance is proposed and demonstrated. This method uses Markov chain Monte Carlo (MCMC), through the Metropolis-Hastings (MH) algorithm, to yield a probability density function for the combustion chamber frequency and to find the best estimate of the resonant frequency, along with its uncertainty. The accurate determination of combustion chamber resonance is then used to investigate various engine phenomena, with appropriate uncertainty, for a range of engine cycles. It is shown that, when operating on various ethanol/diesel fuel combinations, a 20% substitution yields the least inter-cycle variability in combustion chamber resonance.
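A minimal sketch of this kind of approach, under assumed details the abstract does not specify: a synthetic noisy resonance trace is generated, and a random-walk Metropolis-Hastings sampler draws from the posterior of the resonant frequency (and in-phase/quadrature amplitudes), from which a best estimate and uncertainty are reported. The signal model, priors and tuning constants are illustrative only.

```python
# Illustrative sketch (not the paper's code) of using the Metropolis-Hastings
# algorithm to obtain a posterior for a resonant frequency from a noisy signal
# segment. The synthetic trace, priors and tuning constants are made-up assumptions.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic band-pass-filtered trace: a resonance near 5.2 kHz plus noise.
fs, f_true = 100_000.0, 5200.0
t = np.arange(0, 0.005, 1 / fs)
y = 0.8 * np.cos(2 * np.pi * f_true * t) + 0.3 * np.sin(2 * np.pi * f_true * t) \
    + rng.normal(0, 0.5, t.size)

def log_post(theta, sigma=0.5):
    """Log posterior for (frequency, cosine amp, sine amp) with a flat prior on a band."""
    f, a, b = theta
    if not (3000.0 < f < 8000.0):
        return -np.inf
    resid = y - a * np.cos(2 * np.pi * f * t) - b * np.sin(2 * np.pi * f * t)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

# Random-walk Metropolis-Hastings over the three parameters.
theta = np.array([5000.0, 0.5, 0.5])
step = np.array([5.0, 0.05, 0.05])
samples, lp = [], log_post(theta)
for _ in range(20_000):
    prop = theta + step * rng.normal(size=3)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    samples.append(theta[0])

post = np.array(samples[5000:])                 # drop burn-in
print(f"resonant frequency: {post.mean():.1f} Hz +/- {post.std():.1f} Hz")
```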

Relevance: 20.00%

Publisher:

Abstract:

The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for the hot stabilized option; the model is one of several new modal emissions models designed to predict hot stabilized emission rates for various motor vehicle groups as a function of the conditions under which the vehicles are operating. The validation of aggregate measurements, such as speed and acceleration profiles, is performed on an independent data set using three statistical criteria. The MEASURE algorithms have been shown to provide significant improvements in both average emission estimates and explanatory power over some earlier models for pollutants across almost every operating cycle tested.

Relevance: 20.00%

Publisher:

Abstract:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade, many crash models have accounted for extra-variation in crash counts: variation over and above that accounted for by the Poisson density. This extra-variation, or dispersion, is theorized to capture unaccounted-for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models, tantamount to assuming that unaccounted-for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions.

This study builds upon the work of Miaou and Lord, with exploration of additional dispersion functions, the use of an independent data set, and an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in the mean structure, while the remainder included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criterion (DIC) statistics.

The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences on expected crash counts are likely to differ from the factors that might help to explain unaccounted-for variation in crashes across sites.
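To make the dispersion question concrete, the sketch below fits two negative binomial specifications to made-up intersection data, one with a fixed dispersion parameter and one with dispersion modelled as a log-linear function of traffic flow, and compares them. It uses maximum likelihood rather than the paper's Bayesian MCMC/Gibbs estimation, and all data and coefficients are placeholders.

```python
# Sketch of the modelling question discussed above: does the negative binomial
# dispersion parameter vary with covariates? This simplified illustration compares
# two maximum-likelihood fits (fixed dispersion vs. dispersion as a log-linear
# function of flow) on made-up data; it is not the paper's Bayesian analysis.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(2)
n_sites = 200
log_flow = rng.normal(9.0, 0.6, n_sites)            # log of entering traffic flow
mu_true = np.exp(-6.0 + 0.8 * log_flow)             # "true" expected crash counts
alpha_true = np.exp(1.0 - 0.15 * log_flow)          # dispersion varying with flow
y = nbinom.rvs(1 / alpha_true, 1 / (1 + alpha_true * mu_true), random_state=rng)

def negloglik(params, dispersion_varies):
    """NB2 negative log-likelihood with mean exp(b0 + b1*flow) and dispersion alpha."""
    b0, b1 = params[0], params[1]
    mu = np.exp(b0 + b1 * log_flow)
    if dispersion_varies:
        alpha = np.exp(params[2] + params[3] * log_flow)
    else:
        alpha = np.exp(params[2])
    n, p = 1 / alpha, 1 / (1 + alpha * mu)           # scipy's (n, p) parameterisation
    return -np.sum(nbinom.logpmf(y, n, p))

fixed   = minimize(negloglik, [1.0, 0.0, 0.0], args=(False,), method="Nelder-Mead")
varying = minimize(negloglik, np.r_[fixed.x, 0.0], args=(True,), method="Nelder-Mead")
# A simple information-criterion comparison of the two dispersion structures.
print("AIC fixed dispersion:  ", 2 * fixed.fun + 2 * 3)
print("AIC varying dispersion:", 2 * varying.fun + 2 * 4)
```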

Relevance: 20.00%

Publisher:

Abstract:

Identification of hot spots, also known as sites with promise, black spots, accident-prone locations, or priority investigation locations, is an important and routine activity for improving the overall safety of roadway networks. Extensive literature focuses on methods for hot spot identification (HSID), and a subset of this considerable literature is dedicated to conducting performance assessments of various HSID methods. A central issue in comparing HSID methods is the development and selection of quantitative and qualitative performance measures or criteria. The authors contend that currently employed HSID assessment criteria, namely false positives and false negatives, are necessary but not sufficient, and that additional criteria are needed to exploit the ordinal nature of site ranking data.

With the intent to equip road safety professionals and researchers with more useful tools to compare the performances of various HSID methods and to improve the level of HSID assessments, this paper proposes four quantitative HSID evaluation tests that are, to the authors' knowledge, new and unique. These tests evaluate different aspects of HSID method performance, including reliability of results, ranking consistency, and false identification consistency and reliability. It is intended that road safety professionals apply these evaluation tests in addition to existing tests to compare the performances of various HSID methods, and then select the most appropriate HSID method to screen road networks for sites that require further analysis.

This work demonstrates the four new criteria using three years of Arizona road section accident data and four commonly applied HSID methods: accident frequency ranking, accident rate ranking, accident reduction potential, and empirical Bayes (EB). The EB HSID method reveals itself as the superior method in most of the evaluation tests. In contrast, identifying hot spots using accident rate rankings performs the least well among the tests. The accident frequency and accident reduction potential methods perform similarly, with slight differences explained. The authors believe that the four new evaluation tests offer insight into HSID performance heretofore unavailable to analysts and researchers.
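The flavour of a ranking-consistency test can be illustrated with a generic sketch: for two HSID methods (simple accident frequency ranking and an EB-style shrinkage ranking), count how many of the sites flagged in one period are flagged again in the next. The simulated data, the safety performance function and the threshold are assumptions made for illustration; this is not one of the paper's four specific tests.

```python
# Generic sketch of a ranking-consistency check for hot spot identification methods.
# Data, the "SPF" and the cut-off are illustrative assumptions, not the paper's tests.
import numpy as np

rng = np.random.default_rng(3)
n_sites, top_k = 300, 30
aadt = rng.lognormal(mean=8.5, sigma=0.5, size=n_sites)         # traffic exposure
spf_mu = np.exp(-5.0 + 0.7 * np.log(aadt))                      # SPF prediction (placeholder)
true_risk = spf_mu * rng.gamma(shape=3.0, scale=1 / 3.0, size=n_sites)  # latent site risk
crashes_p1 = rng.poisson(true_risk)                             # period 1 counts
crashes_p2 = rng.poisson(true_risk)                             # period 2 counts

def top_sites_frequency(counts):
    return set(np.argsort(counts)[-top_k:])

def top_sites_eb(counts, phi=3.0):
    """EB-style ranking: weight each site's count against its SPF prediction."""
    w = phi / (phi + spf_mu)                   # more weight on the SPF when counts are noisy
    eb = w * spf_mu + (1 - w) * counts
    return set(np.argsort(eb)[-top_k:])

for name, method in [("frequency", top_sites_frequency), ("EB-style", top_sites_eb)]:
    p1, p2 = method(crashes_p1), method(crashes_p2)
    consistency = len(p1 & p2) / top_k         # fraction re-identified in the next period
    print(f"{name:10s} ranking consistency across periods: {consistency:.2f}")
```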

Relevance: 20.00%

Publisher:

Abstract:

It is important to examine the nature of the relationships between roadway, environmental, and traffic factors and motor vehicle crashes, with the aim of improving the collective understanding of the causal mechanisms involved in crashes and of better predicting their occurrence. Statistical models of motor vehicle crashes are one path of inquiry often used to gain these initial insights. Recent efforts have focused on the estimation of negative binomial and Poisson regression models (and related variants) due to their relatively good fit to crash data. Of course, analysts constantly seek methods that offer greater consistency with the data-generating mechanism (motor vehicle crashes in this case), provide better statistical fit, and provide insight into data structure that was previously unavailable. One such opportunity exists with some types of crash data, in particular crash-level data that are collected across roadway segments, intersections, etc. It is argued in this paper that some crash data possess a hierarchical structure that has not routinely been exploited. This paper describes the application of binomial multilevel models of crash types using 548 motor vehicle crashes collected from 91 two-lane rural intersections in the state of Georgia. Crash prediction models are estimated for angle, rear-end, and sideswipe (both same-direction and opposite-direction) crashes. The contributions of the paper are the recognition of hierarchical data structure and the application of a theoretically appealing and suitable analysis approach for multilevel data, yielding insights into intersection-related crashes by crash type.
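A minimal sketch of a random-intercept binomial multilevel model of this kind is shown below, written with PyMC (a software assumption; the paper does not specify tools) and fitted to synthetic placeholder data in which crashes are nested within intersections.

```python
# Sketch of a random-intercept (multilevel) binomial logit of the kind described
# above, written with PyMC (an assumption; the paper does not name its software).
# Data are synthetic placeholders: a binary indicator of whether a crash was an
# angle crash, grouped by intersection, with one covariate.
import numpy as np
import pymc as pm

rng = np.random.default_rng(4)
n_int, crashes_per_int = 91, 6
intersection = np.repeat(np.arange(n_int), crashes_per_int)
log_minor_aadt = np.repeat(rng.normal(7.0, 0.7, n_int), crashes_per_int)
true_u = rng.normal(0.0, 0.8, n_int)                       # intersection-level effects
p = 1 / (1 + np.exp(-(-3.0 + 0.4 * log_minor_aadt + true_u[intersection])))
is_angle = rng.binomial(1, p)

with pm.Model():
    # Level 2: intersection random intercepts.
    sigma_u = pm.HalfNormal("sigma_u", 1.0)
    u = pm.Normal("u", 0.0, sigma_u, shape=n_int)
    # Level 1: crash-level logit with a covariate.
    b0 = pm.Normal("b0", 0.0, 5.0)
    b1 = pm.Normal("b1", 0.0, 5.0)
    pm.Bernoulli("is_angle", logit_p=b0 + b1 * log_minor_aadt + u[intersection],
                 observed=is_angle)
    idata = pm.sample(1000, tune=1000, chains=2, progressbar=False)

print(idata.posterior[["b1", "sigma_u"]].mean())
```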