7 results for Iterative power methods

in DigitalCommons@The Texas Medical Center


Relevance:

40.00%

Publisher:

Abstract:

Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. To find loci with a specific effect size, linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods such as the variance component approach. The increase in power comes at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs.

Nine linkage disequilibrium tests were examined by simulation. Five tests involved selecting isolated unrelated individuals, while four involved selecting parent-child trios (the Transmission Disequilibrium Test, TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and the distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects.

When the trait locus had more than two trait alleles, the maximum power of the tests was less than one. For the simulation methods used here, when there were more than two trait alleles there was a probability equal to 1 - heterozygosity of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium.

The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The TDT-based tests were not liable to any increase in error rates.

For all sample ascertainment costs, for recent mutations (<100 generations) linkage disequilibrium tests were less expensive to carry out than the variance component test. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test.
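As a point of reference for the TDT-based tests mentioned in this abstract, the following minimal sketch computes the standard McNemar-type TDT statistic for a biallelic marker. It is not the thesis's own code: the function name `tdt_statistic` and its two count arguments are hypothetical, and the four TDT variants the thesis actually compared are not specified in the abstract.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple:
    """Return (chi-square, p-value) for the basic TDT on one marker allele.

    transmitted:   number of heterozygous parents who passed the allele of
                   interest to the affected or selected child
    untransmitted: number of heterozygous parents who passed the other allele
    (Both argument names are illustrative assumptions.)
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    # McNemar-type statistic, asymptotically chi-square with 1 degree of freedom
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(tdt_statistic(60, 40))
```

Because only transmissions from heterozygous parents enter the statistic, the comparison is made within families, which is consistent with the abstract's observation that the TDT-based tests did not show inflated error rates under population admixture.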

Relevance:

30.00%

Publisher:

Abstract:

Objective. The purpose of the study is to provide a holistic depiction of behavioral and environmental factors contributing to risky sexual behaviors among predominantly high-school-educated, low-income African Americans residing in urban areas of Houston, TX, utilizing the Theory of Gender and Power, Situational/Environmental Variables Theory, and Sexual Script Theory. Methods. A cross-sectional study was conducted via questionnaires among 215 Houston area residents, of whom 149 were women and 66 were men. Measures used to assess behaviors of the population included a history of homelessness, use of crack/cocaine among several other illicit drugs, the type of sexual partner, age of participant, age of most recent sex partner, whether or not participants sought health care in the last 12 months, knowledge of partner's other sexual activities, symptoms of depression, and places where partners were met. To determine the risk of sexual encounters, a risk index employing the variables used to assess condom use was created, categorizing sexual encounters as unsafe or safe. Results. Variables meeting the significance level of p<.15 in the bivariate analysis for each theory were entered into a binary logistic regression analysis. The block for each theory was significant, suggesting that the grouping assignments of each variable by theory were significantly associated with unsafe sexual behaviors. Within the regression analysis, variables such as sex for drugs/money, low income, and crack use demonstrated an effect size of ≥ ±1, indicating that these variables had a significant effect on unsafe sexual behavioral practices. Conclusions. Variables assessing behavior and environment demonstrated a significant effect when categorized by relation to designated theories.

Relevance:

30.00%

Publisher:

Abstract:

Objectives. This paper seeks to assess the effect of regression model misspecification on statistical power in a variety of situations.

Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models had higher correlations and provided a better approximation of the true relationship, as illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate that, overall, misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on the sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model.

Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect of misspecification on statistical power when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate situations with an unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
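The correlation idea in this abstract can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the "true" relationship is taken to be quadratic on a uniform grid over [0, 1], the categorical model uses four equal-width bins, and the helper names `pearson` and `categorical_fit` are hypothetical. It is one plausible reading of the approach, not the procedure of Harris (2010) or of the paper.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)


def categorical_fit(x, y, n_bins):
    """Fitted values of a categorical model: the mean of y within each
    equal-width bin of x (bin edges are an illustrative assumption)."""
    lo, hi = min(x), max(x)

    def bin_of(value):
        return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)

    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        sums[bin_of(xi)] += yi
        counts[bin_of(xi)] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [means[bin_of(xi)] for xi in x]


# Assumed "correct" specification: a quadratic effect of x on the outcome.
x = [i / 1000 for i in range(1001)]
true_y = [value ** 2 for value in x]

# Correlation of the true form with a linear term in x. Pearson correlation is
# invariant to affine rescaling, so x stands in for the fitted straight line.
r_linear = pearson(true_y, x)

# Correlation of the true form with the fitted four-category step function.
r_categorical = pearson(true_y, categorical_fit(x, true_y, 4))

if __name__ == "__main__":
    print(f"linear: {r_linear:.3f}  categorical: {r_categorical:.3f}")
```

In this reading, a specification whose fitted values correlate more strongly with the true form loses less information, which is the sense in which the abstract treats correlation as a proxy for the loss of power.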

Relevance:

30.00%

Publisher:

Abstract:

Studies have shown that rare genetic variants have stronger effects in predisposing to common diseases, and several statistical methods have been developed for association studies involving rare variants. To better understand how these statistical methods perform, we compare two recently developed rare variant statistical methods (VT and C-alpha) on 10,000 simulated re-sequencing data sets with disease status and the corresponding 10,000 simulated null data sets. The SLC1A1 gene has been suggested to be associated with diastolic blood pressure (DBP) in previous studies. In the current study, we applied the VT and C-alpha methods to empirical re-sequencing data for the SLC1A1 gene from 300 whites and 200 blacks. We found that the VT method obtained higher power and performed better than the C-alpha method on the simulated data we used. Type I error was well controlled for both methods. In addition, neither the VT nor the C-alpha method found statistical evidence for an association between the SLC1A1 gene and DBP. Overall, our findings provide a comparison of the two statistical methods for future reference, along with preliminary findings on the association between the SLC1A1 gene and blood pressure.

Relevance:

30.00%

Publisher:

Abstract:

Determining the size and power of a test is a vital part of clinical trial design. This research focuses on the simulation of clinical trial data with time-to-event as the primary outcome. It investigates the impact of different recruitment patterns and time-dependent hazard structures on the size and power of the log-rank test. A non-homogeneous Poisson process is used to simulate entry times according to the different accrual patterns. A Weibull distribution is employed to simulate survival times according to the different hazard structures. The size of the log-rank test is estimated by simulating survival times with identical hazard rates in the treatment and control arms of the study, resulting in a hazard ratio of one. The power of the log-rank test at specific values of the hazard ratio (≠1) is estimated by simulating survival times with different, but proportional, hazard rates for the two arms of the study. Different shapes (constant, decreasing, or increasing) of the hazard function of the Weibull distribution are also considered to assess the effect of hazard structure on the size and power of the log-rank test.
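The simulation design described in this abstract can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the study's code: accrual is drawn from a non-homogeneous Poisson process with power-law intensity (conditional on the number of patients, entry times are then `accrual_period * U**(1/accrual_power)`), censoring is purely administrative at study end, the test is two-sided at the 5% level, and all function and parameter names are hypothetical.

```python
import random

CHI2_CRIT_1DF_05 = 3.841458820694124  # 95th percentile of chi-square, 1 df


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square; events and groups are coded 0/1."""
    data = sorted(zip(times, events, groups))
    n_at_risk = len(data)
    n1_at_risk = sum(g for _, _, g in data)
    o_minus_e, variance, i = 0.0, 0.0, 0
    while i < len(data):
        t = data[i][0]
        deaths = deaths1 = leaving = leaving1 = 0
        while i < len(data) and data[i][0] == t:  # handle tied times together
            _, event, group = data[i]
            deaths += event
            deaths1 += event * group
            leaving += 1
            leaving1 += group
            i += 1
        if deaths > 0 and n_at_risk > 1:
            p1 = n1_at_risk / n_at_risk
            o_minus_e += deaths1 - deaths * p1  # observed minus expected, group 1
            variance += (deaths * p1 * (1 - p1)
                         * (n_at_risk - deaths) / (n_at_risk - 1))
        n_at_risk -= leaving
        n1_at_risk -= leaving1
    return o_minus_e ** 2 / variance if variance > 0 else 0.0


def simulate_trial(n_per_arm, hazard_ratio, shape, control_scale,
                   accrual_period, follow_up, accrual_power, rng):
    """Simulate one two-arm trial with Weibull survival and staggered entry."""
    study_end = accrual_period + follow_up
    # Same Weibull shape in both arms gives proportional hazards; this scale
    # makes the treatment-to-control hazard ratio equal to hazard_ratio.
    treatment_scale = control_scale * hazard_ratio ** (-1.0 / shape)
    times, events, groups = [], [], []
    for group, scale in ((0, control_scale), (1, treatment_scale)):
        for _ in range(n_per_arm):
            entry = accrual_period * rng.random() ** (1.0 / accrual_power)
            survival = rng.weibullvariate(scale, shape)  # (scale, shape)
            censor_at = study_end - entry  # administrative censoring only
            times.append(min(survival, censor_at))
            events.append(1 if survival <= censor_at else 0)
            groups.append(group)
    return times, events, groups


def rejection_rate(n_sims, seed=0, **trial_kwargs):
    """Share of simulated trials in which the log-rank test rejects.
    With hazard_ratio=1 this estimates size; otherwise it estimates power."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        t, e, g = simulate_trial(rng=rng, **trial_kwargs)
        rejections += logrank_chi2(t, e, g) > CHI2_CRIT_1DF_05
    return rejections / n_sims


if __name__ == "__main__":
    design = dict(n_per_arm=100, shape=1.5, control_scale=12.0,
                  accrual_period=12.0, follow_up=24.0, accrual_power=2.0)
    print("size :", rejection_rate(500, hazard_ratio=1.0, **design))
    print("power:", rejection_rate(500, hazard_ratio=0.7, **design))
```

A Weibull `shape` of 1 gives a constant hazard, values below 1 a decreasing hazard, and values above 1 an increasing hazard, which matches the three hazard structures the abstract lists. Setting `accrual_power` to 1 gives uniform accrual, while larger values concentrate recruitment late in the accrual period.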

Relevance:

30.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully realize their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data. 
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis) and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one, respectively, of the main effects of the interacting factors must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the situations considered. The proposed models are applied to real data on gene and environment interactions from lung cancer and cutaneous melanoma case-control studies. Bayesian statistical models have the advantage of allowing useful prior information to be incorporated in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in parameter estimation and variable selection in most cases. 
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power in detecting non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) that were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model. 
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
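The 'strong', 'weak' and 'independent' hierarchy rules described in this abstract can be stated compactly as a constraint on which interaction terms are eligible for inclusion. The sketch below shows only that constraint, not the Bayesian mixture model itself. The function names and factor labels are hypothetical, and the weak rule is read here as "at least one main effect present", which is an assumption about the abstract's wording.

```python
from itertools import combinations


def interaction_allowed(main_a_included, main_b_included, rule):
    """Decide whether an A-by-B interaction may enter the model.

    rule: "strong"      -> both main effects must be in the model
          "weak"        -> at least one main effect must be in the model
          "independent" -> no hierarchical constraint
    """
    if rule == "strong":
        return main_a_included and main_b_included
    if rule == "weak":
        return main_a_included or main_b_included
    if rule == "independent":
        return True
    raise ValueError(f"unknown hierarchy rule: {rule!r}")


def eligible_interactions(factors, included_main_effects, rule):
    """List the pairwise interactions permitted under the given rule."""
    included = set(included_main_effects)
    return [
        (a, b)
        for a, b in combinations(factors, 2)
        if interaction_allowed(a in included, b in included, rule)
    ]


if __name__ == "__main__":
    factors = ["G1", "G2", "E"]  # illustrative gene and environment labels
    for rule in ("strong", "weak", "independent"):
        print(rule, eligible_interactions(factors, {"G1", "E"}, rule))
```

Tightening the rule from independent to weak to strong shrinks the set of candidate interactions, which is one way to see why the abstract reports fewer false positive interactions under the hierarchical models.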