12 resultados para Spatial Statistics, Autologistic Model
em Collection Of Biostatistics Research Archive
Resumo:
In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being more efficient computationally than other Bayesian approaches. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. Conclusions based on a real dataset of cancer cases in Taiwan are similar albeit less conclusive with respect to comparing the approaches. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models.
Resumo:
Geostatistics involves the fitting of spatially continuous models to spatially discrete data (Chil`es and Delfiner, 1999). Preferential sampling arises when the process that determines the data-locations and the process being modelled are stochastically dependent. Conventional geostatistical methods assume, if only implicitly, that sampling is non-preferential. However, these methods are often used in situations where sampling is likely to be preferential. For example, in mineral exploration samples may be concentrated in areas thought likely to yield high-grade ore. We give a general expression for the likelihood function of preferentially sampled geostatistical data and describe how this can be evaluated approximately using Monte Carlo methods. We present a model for preferential sampling, and demonstrate through simulated examples that ignoring preferential sampling can lead to seriously misleading inferences. We describe an application of the model to a set of bio-monitoring data from Galicia, northern Spain, in which making allowance for preferential sampling materially changes the inferences.
Resumo:
The AEGISS (Ascertainment and Enhancement of Gastrointestinal Infection Surveillance and Statistics) project aims to use spatio-temporal statistical methods to identify anomalies in the space-time distribution of non-specific, gastrointestinal infections in the UK, using the Southampton area in southern England as a test-case. In this paper, we use the AEGISS project to illustrate how spatio-temporal point process methodology can be used in the development of a rapid-response, spatial surveillance system. Current surveillance of gastroenteric disease in the UK relies on general practitioners reporting cases of suspected food-poisoning through a statutory notification scheme, voluntary laboratory reports of the isolation of gastrointestinal pathogens and standard reports of general outbreaks of infectious intestinal disease by public health and environmental health authorities. However, most statutory notifications are made only after a laboratory reports the isolation of a gastrointestinal pathogen. As a result, detection is delayed and the ability to react to an emerging outbreak is reduced. For more detailed discussion, see Diggle et al. (2003). A new and potentially valuable source of data on the incidence of non-specific gastro-enteric infections in the UK is NHS Direct, a 24-hour phone-in clinical advice service. NHS Direct data are less likely than reports by general practitioners to suffer from spatially and temporally localized inconsistencies in reporting rates. Also, reporting delays by patients are likely to be reduced, as no appointments are needed. Against this, NHS Direct data sacrifice specificity. Each call to NHS Direct is classified only according to the general pattern of reported symptoms (Cooper et al, 2003). The current paper focuses on the use of spatio-temporal statistical analysis for early detection of unexplained variation in the spatio-temporal incidence of non-specific gastroenteric symptoms, as reported to NHS Direct. Section 2 describes our statistical formulation of this problem, the nature of the available data and our approach to predictive inference. Section 3 describes the stochastic model. Section 4 gives the results of fitting the model to NHS Direct data. Section 5 shows how the model is used for spatio-temporal prediction. The paper concludes with a short discussion.
Resumo:
Professor Sir David R. Cox (DRC) is widely acknowledged as among the most important scientists of the second half of the twentieth century. He inherited the mantle of statistical science from Pearson and Fisher, advanced their ideas, and translated statistical theory into practice so as to forever change the application of statistics in many fields, but especially biology and medicine. The logistic and proportional hazards models he substantially developed, are arguably among the most influential biostatistical methods in current practice. This paper looks forward over the period from DRC's 80th to 90th birthdays, to speculate about the future of biostatistics, drawing lessons from DRC's contributions along the way. We consider "Cox's model" of biostatistics, an approach to statistical science that: formulates scientific questions or quantities in terms of parameters gamma in probability models f(y; gamma) that represent in a parsimonious fashion, the underlying scientific mechanisms (Cox, 1997); partition the parameters gamma = theta, eta into a subset of interest theta and other "nuisance parameters" eta necessary to complete the probability distribution (Cox and Hinkley, 1974); develops methods of inference about the scientific quantities that depend as little as possible upon the nuisance parameters (Barndorff-Nielsen and Cox, 1989); and thinks critically about the appropriate conditional distribution on which to base infrences. We briefly review exciting biomedical and public health challenges that are capable of driving statistical developments in the next decade. We discuss the statistical models and model-based inferences central to the CM approach, contrasting them with computationally-intensive strategies for prediction and inference advocated by Breiman and others (e.g. Breiman, 2001) and to more traditional design-based methods of inference (Fisher, 1935). We discuss the hierarchical (multi-level) model as an example of the future challanges and opportunities for model-based inference. We then consider the role of conditional inference, a second key element of the CM. Recent examples from genetics are used to illustrate these ideas. Finally, the paper examines causal inference and statistical computing, two other topics we believe will be central to biostatistics research and practice in the coming decade. Throughout the paper, we attempt to indicate how DRC's work and the "Cox Model" have set a standard of excellence to which all can aspire in the future.
Resumo:
Traditionally, the use of Bayes factors has required the specification of proper prior distributions on model parameters implicit to both null and alternative hypotheses. In this paper, I describe an approach to defining Bayes factors based on modeling test statistics. Because the distributions of test statistics do not depend on unknown model parameters, this approach eliminates the subjectivity normally associated with the definition of Bayes factors. For standard test statistics, including the _2, F, t and z statistics, the values of Bayes factors that result from this approach can be simply expressed in closed form.
Resumo:
Recent research highlights the promise of remotely-sensed aerosol optical depth (AOD) as a proxy for ground-level PM2.5. Particular interest lies in the information on spatial heterogeneity potentially provided by AOD, with important application to estimating and monitoring pollution exposure for public health purposes. Given the temporal and spatio-temporal correlations reported between AOD and PM2.5 , it is tempting to interpret the spatial patterns in AOD as reflecting patterns in PM2.5 . Here we find only limited spatial associations of AOD from three satellite retrievals with PM2.5 over the eastern U.S. at the daily and yearly levels in 2004. We then use statistical modeling to show that the patterns in monthly average AOD poorly reflect patterns in PM2.5 because of systematic, spatially-correlated error in AOD as a proxy for PM2.5 . Furthermore, when we include AOD as a predictor of monthly PM2.5 in a statistical prediction model, AOD provides little additional information to improve predictions of PM2.5 when included in a model that already accounts for land use, emission sources, meteorology and regional variability. These results suggest caution in using spatial variation in AOD to stand in for spatial variation in ground-level PM2.5 in epidemiological analyses and indicate that when PM2.5 monitoring is available, careful statistical modeling outperforms the use of AOD.
Resumo:
Functional neuroimaging techniques enable investigations into the neural basis of human cognition, emotions, and behaviors. In practice, applications of functional magnetic resonance imaging (fMRI) have provided novel insights into the neuropathophysiology of major psychiatric,neurological, and substance abuse disorders, as well as into the neural responses to their treatments. Modern activation studies often compare localized task-induced changes in brain activity between experimental groups. One may also extend voxel-level analyses by simultaneously considering the ensemble of voxels constituting an anatomically defined region of interest (ROI) or by considering means or quantiles of the ROI. In this work we present a Bayesian extension of voxel-level analyses that offers several notable benefits. First, it combines whole-brain voxel-by-voxel modeling and ROI analyses within a unified framework. Secondly, an unstructured variance/covariance for regional mean parameters allows for the study of inter-regional functional connectivity, provided enough subjects are available to allow for accurate estimation. Finally, an exchangeable correlation structure within regions allows for the consideration of intra-regional functional connectivity. We perform estimation for our model using Markov Chain Monte Carlo (MCMC) techniques implemented via Gibbs sampling which, despite the high throughput nature of the data, can be executed quickly (less than 30 minutes). We apply our Bayesian hierarchical model to two novel fMRI data sets: one considering inhibitory control in cocaine-dependent men and the second considering verbal memory in subjects at high risk for Alzheimer’s disease. The unifying hierarchical model presented in this manuscript is shown to enhance the interpretation content of these data sets.
Resumo:
We present a state-of-the-art application of smoothing for dependent bivariate binomial spatial data to Loa loa prevalence mapping in West Africa. This application is special because it starts with the non-spatial calibration of survey instruments, continues with the spatial model building and assessment and ends with robust, tested software that will be used by the field scientists of the World Health Organization for online prevalence map updating. From a statistical perspective several important methodological issues were addressed: (a) building spatial models that are complex enough to capture the structure of the data but remain computationally usable; (b)reducing the computational burden in the handling of very large covariate data sets; (c) devising methods for comparing spatial prediction methods for a given exceedance policy threshold.
Resumo:
Functional Magnetic Resonance Imaging (fMRI) is a non-invasive technique which is commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single voxel t-tests have been commonly used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of variance at each voxel is difficult. Thus, combining information across voxels in the statistical analysis of fMRI data is desirable in order to improve efficiency. Here we construct a hierarchical model and apply an Empirical Bayes framework on the analysis of group fMRI data, employing techniques used in high throughput genomic studies. The key idea is to shrink residual variances by combining information across voxels, and subsequently to construct an improved test statistic in lieu of the classical t-statistic. This hierarchical model results in a shrinkage of voxel-wise residual sample variances towards a common value. The shrunken estimator for voxelspecific variance components on the group analyses outperforms the classical residual error estimator in terms of mean squared error. Moreover, the shrunken test-statistic decreases false positive rate when testing differences in brain contrast maps across a wide range of simulation studies. This methodology was also applied to experimental data regarding a cognitive activation task.