7 results for Refractive errors - Epidemiology
in Collection Of Biostatistics Research Archive
Abstract:
High-throughput SNP arrays provide genotype estimates for up to one million loci and are widely used in genome-wide association studies. While these estimates are typically very accurate, genotyping errors do occur, and they can particularly influence the most extreme test statistics and p-values. Estimates of the genotype uncertainties are also available, although they are typically ignored. In this manuscript, we develop a framework for incorporating these genotype uncertainties into case-control studies under any genetic model. We verify that the assumption of a “local alternative” in the score test is very reasonable for effect sizes typically seen in SNP association studies, and show that the power of the score test is simply a function of the correlation between the genotype probabilities and the true genotypes. We demonstrate that the power to detect a true association can be substantially increased for difficult-to-call genotypes, resulting in improved inference in association studies.
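As an illustration of the kind of statistic the abstract alludes to, the sketch below computes a trend-type score test on expected genotype dosages rather than hard genotype calls. This is a simplified stand-in, not the authors' method; the toy data and the additive dosage coding are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sketch: a score statistic for a case-control SNP association,
# computed on genotype dosages E[G | array data] instead of hard calls.
def score_statistic(dosage, case):
    """Trend-type score test for a logistic model with intercept only under H0.

    dosage: expected genotype (0..2) under an additive model
    case:   0/1 disease status
    """
    p = case.mean()                                   # MLE of P(case) under H0
    u = np.sum((case - p) * dosage)                   # score for the genetic effect
    v = p * (1 - p) * np.sum((dosage - dosage.mean()) ** 2)  # its null variance
    return u / np.sqrt(v)                             # approximately N(0,1) under H0

# Toy data: dosages that track case status give a statistic of 2.0 here.
dosage = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
case = np.array([0, 0, 0, 1, 1, 1])
stat = score_statistic(dosage, case)  # → 2.0
```

Replacing `dosage` with hard genotype calls recovers the usual trend test; using probabilistic dosages is what lets uncertain genotypes contribute in proportion to their information.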
Abstract:
Despite the widespread popularity of linear models for correlated outcomes (e.g. linear mixed models and time series models), distribution diagnostic methodology remains relatively underdeveloped in this context. In this paper we present an easy-to-implement approach that lends itself to graphical displays of model fit. Our approach involves multiplying the estimated marginal residual vector by the Cholesky decomposition of the inverse of the estimated marginal variance matrix. The resulting "rotated" residuals are used to construct an empirical cumulative distribution function and pointwise standard errors. The theoretical framework, including conditions and asymptotic properties, involves technical details that are motivated by Lange and Ryan (1989), Pierce (1982), and Randles (1982). Our method appears to work well in a variety of circumstances, including models having independent units of sampling (clustered data) and models for which all observations are correlated (e.g., a single time series). Our methods can produce satisfactory results even for models that do not satisfy all of the technical conditions stated in our theory.
Abstract:
Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating the acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects, taking into account both within-site statistical uncertainty and across-site heterogeneity. Within a site, characteristics of time series data on air pollution and health (small pollution effects, missing data, highly correlated predictors, nonlinear confounding, etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to be combined. In this paper we investigate the impact of variance underestimation on the pooled relative rate estimate. We focus on two-stage normal-normal hierarchical models and on underestimation of the statistical variance at the first stage. Through mathematical considerations and simulation studies, we found that variance underestimation does not substantially affect the pooled estimate. However, some sensitivity of the pooled estimate to variance underestimation is observed when the number of sites is small and the underestimation is severe. These simulation results apply to any two-stage normal-normal hierarchical model for combining site-specific results, and they can easily be extended to more general hierarchical formulations. We also examined the impact of variance underestimation on the national average relative rate estimate from the National Morbidity Mortality Air Pollution Study, and found that variance underestimation of as much as 40% has little effect on the national average.
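The second-stage pooling, and the effect of first-stage variance underestimation, can be illustrated with an inverse-variance-weighted sketch. The site estimates, variances, heterogeneity value, and the 40% deflation below are all made-up numbers for illustration, not NMMAPS results.

```python
import numpy as np

# Second stage of a two-stage normal-normal hierarchical model (toy numbers).
def pooled_estimate(beta_hat, v, tau2):
    """Inverse-variance-weighted pooled effect, given site-specific estimates
    beta_hat, their first-stage statistical variances v, and heterogeneity tau2."""
    w = 1.0 / (v + tau2)
    return np.sum(w * beta_hat) / np.sum(w)

beta_hat = np.array([0.40, 0.60, 0.50, 0.55])  # site-specific effect estimates (toy)
v = np.array([0.020, 0.030, 0.025, 0.040])     # first-stage statistical variances (toy)
tau2 = 0.01                                    # across-site heterogeneity variance (toy)

full = pooled_estimate(beta_hat, v, tau2)
under = pooled_estimate(beta_hat, 0.6 * v, tau2)  # 40% underestimation of v
shift = abs(full - under)                         # small relative to the estimates
```

Because underestimating `v` mostly rescales all the weights together, the weighted average moves very little, consistent with the abstract's finding that the pooled estimate is robust unless the number of sites is small and the underestimation severe.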
Abstract:
Medical errors originating in health care facilities are a significant source of preventable morbidity, mortality, and healthcare costs. Voluntary error report systems that collect information on the causes and contributing factors of medical errors, regardless of the resulting harm, may be useful for developing effective harm prevention strategies. Some patient safety experts question the utility of data from errors that did not lead to harm to the patient, also called near misses. A near miss (a.k.a. close call) is an unplanned event that did not result in injury to the patient; only a fortunate break in the chain of events prevented injury. We use data from a large voluntary reporting system of 836,174 medication errors from 1999 to 2005 to provide evidence that the causes and contributing factors of errors that result in harm are similar to the causes and contributing factors of near misses. We develop Bayesian hierarchical models for estimating the log odds of selecting a given cause (or contributing factor) of error given that harm has occurred, and the log odds of selecting the same cause given that harm did not occur. The posterior distribution of the correlation between these two vectors of log odds is used as a measure of the evidence supporting the use of data from near misses and their causes and contributing factors to prevent medical errors. In addition, we identify the causes and contributing factors that have the highest or lowest log-odds ratio of harm versus no harm. These causes and contributing factors should also be a focus in the design of prevention strategies. This paper provides important evidence on the utility of data from near misses, which constitute the vast majority of errors in our data.
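A simplified empirical analogue of the comparison described above (not the Bayesian hierarchical model itself) is to correlate the log odds of each cause being selected in harm reports with the corresponding log odds in no-harm reports. The counts below are invented for illustration.

```python
import numpy as np

# Hypothetical counts of reports selecting each of four causes (toy numbers).
harm = np.array([120, 40, 15, 300])         # reports that resulted in harm, by cause
no_harm = np.array([1100, 380, 160, 2900])  # near misses, by cause

def log_odds(counts):
    """Log odds of selecting each cause among the reports in a group."""
    p = counts / counts.sum()
    return np.log(p / (1 - p))

# A high correlation suggests the causes of harmful errors resemble those of near misses.
r = np.corrcoef(log_odds(harm), log_odds(no_harm))[0, 1]
lor = log_odds(harm) - log_odds(no_harm)    # log-odds ratio of harm vs no harm, by cause
```

The paper works with the posterior distribution of this correlation under a hierarchical model, which also accounts for sampling variability in the counts; the plug-in version here only conveys the idea.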
Abstract:
In medical follow-up studies, ordered bivariate survival data are frequently encountered when bivariate failure events are used as the outcomes to identify the progression of a disease. In cancer studies, interest may focus on bivariate failure times, for example, time from birth to cancer onset and time from cancer onset to death. This paper considers a sampling scheme where the first failure event (cancer onset) is identified within a calendar time interval, the time of the initiating event (birth) can be retrospectively confirmed, and the occurrence of the second event (death) is observed subject to right censoring. To analyze this type of bivariate failure time data, it is important to recognize the presence of bias arising due to interval sampling. In this paper, nonparametric and semiparametric methods are developed to analyze bivariate survival data with interval sampling under stationary and semi-stationary conditions. Numerical studies demonstrate that the proposed estimation approaches perform well with practical sample sizes in different simulated models. We apply the proposed methods to SEER ovarian cancer registry data for illustration of the methods and theory.
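The sampling scheme can be mimicked in a short simulation; the calendar window, the distributions, and all rates below are assumptions for illustration, not properties of the SEER data or of the paper's estimators.

```python
import numpy as np

# Toy simulation of bivariate failure times under interval sampling (assumed setup).
rng = np.random.default_rng(1)
n = 10_000
birth = rng.uniform(1900.0, 1990.0, n)   # calendar time of the initiating event
t1 = rng.exponential(60.0, n)            # time from birth to cancer onset (toy rate)
t2 = rng.exponential(5.0, n)             # time from onset to death (toy rate)

onset = birth + t1
a, b = 1973.0, 2000.0                    # calendar sampling window (assumed)
sampled = (onset >= a) & (onset <= b)    # only subjects whose first event falls in [a, b]

death = onset + t2
observed_t2 = np.minimum(death, b)[sampled] - onset[sampled]  # right-censored second gap time
delta = (death <= b)[sampled]            # 1 if death observed, 0 if censored at b
```

The naive empirical distribution of `t1[sampled]` is biased relative to that of `t1`, because subjects are included only when their onset happens to fall in the window; this is the interval-sampling bias the paper's nonparametric and semiparametric estimators are designed to correct.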