5 resultados para multi-factor models
em Collection Of Biostatistics Research Archive
Resumo:
The etiology of complex diseases is heterogeneous. The presence of risk alleles in one or more genetic loci affects the function of a variety of intermediate biological pathways, resulting in the overt expression of disease. Hence, there is an increasing focus on identifying the genetic basis of disease by sytematically studying phenotypic traits pertaining to the underlying biological functions. In this paper we focus on identifying genetic loci linked to quantitative phenotypic traits in experimental crosses. Such genetic mapping methods often use a one stage design by genotyping all the markers of interest on the available subjects. A genome scan based on single locus or multi-locus models is used to identify the putative loci. Since the number of quantitative trait loci (QTLs) is very likely to be small relative to the number of markers genotyped, a one-stage selective genotyping approach is commonly used to reduce the genotyping burden, whereby markers are genotyped solely on individuals with extreme trait values. This approach is powerful in the presence of a single quantitative trait locus (QTL) but may result in substantial loss of information in the presence of multiple QTLs. Here we investigate the efficiency of sequential two stage designs to identify QTLs in experimental populations. Our investigations for backcross and F2 crosses suggest that genotyping all the markers on 60% of the subjects in Stage 1 and genotyping the chromosomes significant at 20% level using additional subjects in Stage 2 and testing using all the subjects provides an efficient approach to identify the QTLs and utilizes only 70% of the genotyping burden relative to a one stage design, regardless of the heritability and genotyping density. Complex traits are a consequence of multiple QTLs conferring main effects as well as epistatic interactions. We propose a two-stage analytic approach where a single-locus genome scan is conducted in Stage 1 to identify promising chromosomes, and interactions are examined using the loci on these chromosomes in Stage 2. We examine settings under which the two-stage analytic approach provides sufficient power to detect the putative QTLs.
Resumo:
Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects taking into account both within-site statistical uncertainty, and across-site heterogeneity. Within a site, characteristics of time series data of air pollution and health (small pollution effects, missing data, highly correlated predictors, non linear confounding etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to be combined. In this paper we investigate the impact of variance underestimation on the pooled relative rate estimate. We focus on two-stage normal-normal hierarchical models and on under- estimation of the statistical variance at the first stage. By mathematical considerations and simulation studies, we found that variance underestimation does not affect the pooled estimate substantially. However, some sensitivity of the pooled estimate to variance underestimation is observed when the number of sites is small and underestimation is severe. These simulation results are applicable to any two-stage normal-normal hierarchical model for combining information of site-specific results, and they can be easily extended to more general hierarchical formulations. We also examined the impact of variance underestimation on the national average relative rate estimate from the National Morbidity Mortality Air Pollution Study and we found that variance underestimation as much as 40% has little effect on the national average.
Resumo:
This paper proposes Poisson log-linear multilevel models to investigate population variability in sleep state transition rates. We specifically propose a Bayesian Poisson regression model that is more flexible, scalable to larger studies, and easily fit than other attempts in the literature. We further use hierarchical random effects to account for pairings of individuals and repeated measures within those individuals, as comparing diseased to non-diseased subjects while minimizing bias is of epidemiologic importance. We estimate essentially non-parametric piecewise constant hazards and smooth them, and allow for time varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming piecewise constant hazards. This relationship allows us to synthesize two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed.