7 results for Portmanteau test statistics
in DigitalCommons@The Texas Medical Center
Abstract:
Despite current enthusiasm for investigating gene-gene and gene-environment interactions, the essential issue of how to define and detect gene-environment interactions remains unresolved. In this report, we define gene-environment interaction as a stochastic dependence in the context of the effects of the genetic and environmental risk factors on the cause of phenotypic variation among individuals. We use mutual information, which is widely used in communication and complex system analysis, to measure gene-environment interactions. We investigate how gene-environment interactions generate a large difference in this information measure between the general population and a diseased population, which motivates us to develop mutual information-based statistics for testing gene-environment interactions. Using extensive simulation studies, we validated the null distribution and calculated the type I error rates of the mutual information-based statistics. We found that the new test statistics were more powerful than traditional logistic regression under several disease models. Finally, to further evaluate the performance of the new method, we applied the mutual information-based statistics to three real examples. The P-values for the mutual information-based statistics were much smaller than those obtained by other approaches, including logistic regression models.
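As a rough illustration of the information measure named in this abstract, the sketch below computes the mutual information between a genotype variable and an exposure variable from a joint probability table. It is not the authors' test statistic, whose exact form the abstract does not give. The function name and both example tables are hypothetical.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a joint probability table joint[g][e]."""
    row = [sum(r) for r in joint]            # marginal distribution of genotype
    col = [sum(c) for c in zip(*joint)]      # marginal distribution of exposure
    mi = 0.0
    for g, r in enumerate(joint):
        for e, p in enumerate(r):
            if p > 0:                        # cells with zero mass contribute nothing
                mi += p * math.log(p / (row[g] * col[e]))
    return mi

# Hypothetical joint distributions: rows are genotype classes, columns are exposure classes.
independent = [[0.25, 0.25], [0.25, 0.25]]   # genotype and exposure unrelated
dependent = [[0.40, 0.10], [0.10, 0.40]]     # genotype and exposure co-occur

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

Mutual information is zero exactly when the two variables are independent, so a gap between its value in a diseased sample and in the general population is the kind of contrast the abstract describes.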
Abstract:
Linkage and association studies are major analytical tools for finding susceptibility genes for complex diseases. With the availability of large collections of single nucleotide polymorphisms (SNPs) and rapid progress in high-throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. To avoid inflating the type I error rate in genome-wide linkage and association studies, the significance level for each independent linkage and/or association test must be adjusted for multiple testing, and this has led to suggested genome-wide significance cut-offs as low as 5 × 10⁻⁷. Almost no linkage and/or association study can meet such a stringent threshold using standard statistical methods. New statistics with high power are urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used with both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformations of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and that most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea for designing novel test statistics with high power and might open new ways to map susceptibility genes for complex diseases.
Abstract:
Monte Carlo simulation has been conducted to investigate parameter estimation and hypothesis testing in some well-known adaptive randomization procedures. The four urn models studied are Randomized Play-the-Winner (RPW), Randomized Pólya Urn (RPU), Birth and Death Urn with Immigration (BDUI), and Drop-the-Loser Urn (DL). Two sequential estimation methods, the sequential maximum likelihood estimation (SMLE) and the doubly adaptive biased coin design (DABC), are simulated at three optimal allocation targets that minimize the expected number of failures under the assumption of constant variance of simple difference (RSIHR), relative risk (ORR), and odds ratio (OOR), respectively. The log likelihood ratio test and three Wald-type tests (simple difference, log of relative risk, log of odds ratio) are compared in different adaptive procedures. Simulation results indicate that although RPW is slightly better at assigning more patients to the superior treatment, the DL method is considerably less variable and its test statistics have better normality. Compared with SMLE, DABC has a slightly higher overall response rate with lower variance, but larger bias and variance in parameter estimation. Additionally, the test statistics in SMLE have better normality and a lower type I error rate, and the power of hypothesis testing is more comparable with that of equal randomization. Usually, RSIHR has the highest power among the three optimal allocation ratios. However, the ORR allocation has better power and a lower type I error rate when the log of relative risk is the test statistic. The number of expected failures in ORR is smaller than in RSIHR. It is also shown that the simple difference of response rates has the worst normality among all four test statistics. The power of the hypothesis test is always inflated when the simple difference is used. On the other hand, the normality of the log likelihood ratio test statistic is robust to changes in the adaptive randomization procedure.
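To make the urn mechanics concrete, here is a minimal sketch of one Randomized Play-the-Winner trial in its usual textbook form. A ball is drawn with replacement to assign the arm; a success adds balls for the same arm and a failure adds balls for the other arm. The abstract does not report the dissertation's actual simulation settings, so the function name, the initial urn composition, the number of balls added, and the response probabilities are all assumptions.

```python
import random

def simulate_rpw(p_a, p_b, n_patients, initial=1, added=1, seed=0):
    """Simulate one RPW trial and return the number of patients assigned to each arm."""
    rng = random.Random(seed)
    urn = {"A": initial, "B": initial}       # balls of each type currently in the urn
    success_prob = {"A": p_a, "B": p_b}      # assumed true response rates
    assigned = {"A": 0, "B": 0}
    for _ in range(n_patients):
        total = urn["A"] + urn["B"]
        # Draw a ball (with replacement) to pick the treatment arm.
        arm = "A" if rng.random() < urn["A"] / total else "B"
        other = "B" if arm == "A" else "A"
        assigned[arm] += 1
        if rng.random() < success_prob[arm]:
            urn[arm] += added                # success rewards the same arm
        else:
            urn[other] += added              # failure rewards the opposite arm
    return assigned

# Hypothetical scenario: arm A has the higher response rate.
allocation = simulate_rpw(p_a=0.7, p_b=0.4, n_patients=200)
```

Repeating the call over many seeds and recording the allocation proportions gives the kind of Monte Carlo summary of allocation variability that the abstract compares across urn models.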
Abstract:
Interim clinical trial monitoring procedures were motivated by ethical and economic considerations. Classical Brownian motion (Bm) techniques have been widely used for statistical monitoring of clinical trials. Conditional power arguments and boundary-crossing probabilities based on α-spending functions are popular statistical hypothesis testing procedures under the assumption of Brownian motion. However, it is not rare for the assumptions of Brownian motion to be only partially met by trial data. Therefore, I used a more general stochastic process, fractional Brownian motion (fBm), to model the test statistics. Fractional Brownian motion does not have the Markov property: future observations depend not only on the present observations but also on past ones. In this dissertation, we simulated a wide range of fBm data, e.g., H = 0.5 (that is, classical Bm) vs. 0.5 < H < 1, with and without treatment effects. The performance of conditional power and boundary-crossing based interim analyses was then compared under the assumption that the data follow Bm or fBm. Our simulation study suggested that the conditional power and boundaries under fBm assumptions are generally higher than those under Bm assumptions when H > 0.5, and also match the empirical results better.
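The role of the Hurst parameter H can be seen from the standard covariance function of fractional Brownian motion, Cov(B_H(s), B_H(t)) = ½ (s^{2H} + t^{2H} − |t − s|^{2H}). The sketch below only evaluates that formula and the resulting correlation between two consecutive unit increments. It is not the dissertation's simulation code, and the function names are hypothetical.

```python
def fbm_covariance(s, t, hurst):
    """Covariance of standard fractional Brownian motion at times s and t."""
    return 0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(t - s) ** (2 * hurst))

def increment_correlation(hurst):
    """Correlation between the increments over [0, 1] and [1, 2].

    Both increments have unit variance, so the correlation equals
    Cov(B(1), B(2)) - Var(B(1)), which simplifies to 2**(2H - 1) - 1.
    """
    return fbm_covariance(1, 2, hurst) - fbm_covariance(1, 1, hurst)

classical = increment_correlation(0.5)    # H = 0.5: increments are uncorrelated (classical Bm)
persistent = increment_correlation(0.75)  # 0.5 < H < 1: increments are positively correlated
```

With H = 0.5 the covariance reduces to min(s, t), recovering classical Brownian motion. For 0.5 < H < 1 the positive correlation between increments is what breaks the Markov property mentioned in the abstract.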
Abstract:
In biomedical studies, the common data structures have been the matched (paired) and unmatched designs. Recently, many researchers have become interested in meta-analysis as a way to better understand a medical treatment from several clinical data sets. The hybrid design, which combines the two data structures, raises fundamental questions for statistical methods and challenges for statistical inference. The applicable methods depend on the underlying distribution. If the outcomes are normally distributed, we would use the classic paired and two-independent-sample t-tests on the matched and unmatched cases, respectively. If not, we can apply the Wilcoxon signed-rank and rank-sum tests to each case. To assess an overall treatment effect in a hybrid design, we can apply the inverse-variance weighting method used in meta-analysis. In the nonparametric case, we can use a test statistic that combines the two Wilcoxon test statistics. However, these two test statistics are not on the same scale. We propose the Hybrid Test Statistic based on the Hodges-Lehmann estimates of the treatment effects, which are medians on the same scale. For comparison with the proposed method, we use the classic meta-analysis t-test statistic based on the combined estimates of the treatment effects from the two t-test statistics. Theoretically, the relative efficiency of two unbiased estimators of a parameter is the ratio of their variances. Using the concept of Asymptotic Relative Efficiency (ARE) developed by Pitman, we show the ARE of the hybrid test statistic relative to the classic meta-analysis t-test statistic, using the Hodges-Lehmann estimators associated with the two test statistics. From several simulation studies, we calculate the empirical type I error rate and power of the test statistics. The proposed statistic would provide an effective tool to evaluate and understand the treatment effect in various public health studies as well as clinical trials.
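The two Hodges-Lehmann estimators mentioned here have simple textbook definitions: the median of the Walsh averages for paired differences, and the median of all pairwise differences for two independent samples. The sketch below computes both and then pools them with inverse-variance weights. The pooling step is an assumption borrowed from the meta-analysis method named in the abstract, which does not state how the Hybrid Test Statistic itself is formed. The data, the variances, and the function names are hypothetical.

```python
from statistics import median

def hl_one_sample(diffs):
    """Hodges-Lehmann estimate for paired differences: median of Walsh averages."""
    n = len(diffs)
    return median((diffs[i] + diffs[j]) / 2 for i in range(n) for j in range(i, n))

def hl_two_sample(treated, control):
    """Hodges-Lehmann shift estimate: median of all treated-minus-control differences."""
    return median(x - y for x in treated for y in control)

def inverse_variance_combine(estimates, variances):
    """Pool estimates with weights proportional to 1 / variance."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Hypothetical data: paired differences from the matched part,
# and two independent samples from the unmatched part.
matched_effect = hl_one_sample([1, 2, 3])
unmatched_effect = hl_two_sample([3, 4, 5], [1, 2, 3])

# Hypothetical variances of the two estimates (these would need to be estimated in practice).
pooled = inverse_variance_combine([matched_effect, unmatched_effect], [1.0, 3.0])
```

Because both estimates are location shifts on the original measurement scale, they can be averaged directly, which is the property the abstract contrasts with the two Wilcoxon statistics.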
Abstract:
Schizophrenia (SZ) is a complex disorder with high heritability and variable phenotypes, and efforts to find causal genes associated with its development have had limited success. Pathway-based analysis is an effective approach for investigating the molecular mechanisms of susceptibility genes associated with complex diseases. The etiology of complex diseases could involve a network of genetic factors, and interactions may occur among the genes. In this work we argue that some genes might have small effects that by themselves are neither sufficient nor necessary to cause the disease; however, their effects may induce slight changes in gene expression or affect protein function. Therefore, analyzing the gene-gene interaction mechanism within the disease pathway would play a crucial role in dissecting the genetic architecture of complex diseases, making pathway-based analysis a complementary approach to GWAS. In this study, we implemented three novel linkage disequilibrium-based statistics, the linear combination, the quadratic, and the decorrelation test statistics, to investigate the interaction between linked and unlinked genes in two independent case-control GWAS datasets for SZ, including participants of European (EA) and African (AA) ancestries. The EA population included 1,173 cases and 1,378 controls with 729,454 genotyped SNPs, while the AA population included 219 cases and 288 controls with 845,814 genotyped SNPs. Using the gene-gene interaction method, we identified 17,186 interacting gene-sets at a significant level in the EA dataset and 12,691 gene-sets in the AA dataset. We also identified 18,846 genes in the EA dataset and 19,431 genes in the AA dataset that were in the disease pathways. However, few genes were reported to be significantly associated with SZ. Our research determined the pathway characteristics of schizophrenia through the gene-gene interaction and gene-pathway based approaches. Our findings suggest that our methods can yield insightful inferences about the molecular mechanisms of common complex diseases.
Abstract:
Pathway-based genome-wide association study evolved from pathway analysis for microarray gene expression and is under rapid development as a complement to single-SNP based genome-wide association study. However, it faces new challenges, such as the summarization of SNP statistics into pathway statistics. The current study applies ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compares this method with two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic of all SNPs in each pathway as the statistic of the pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls, and 100 SNPs demonstrated that the KSIR method outperformed the other two methods in terms of causal pathway ranking and statistical power. The PCA method performed similarly to the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, consisting of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provide a reference for further methodology development and identify novel pathways that may be of importance to the development of ulcerative colitis.
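Of the three summarization methods compared, minP is the simplest to sketch: each pathway takes the smallest p-value among its SNPs as its own statistic, and pathways are then ranked by that value. The example below illustrates only this baseline, without any adjustment for pathway size, which the abstract does not discuss. The SNP identifiers, p-values, pathway names, and function names are hypothetical.

```python
def min_p_pathway_scores(snp_pvalues, pathways):
    """Assign each pathway the smallest p-value among its member SNPs."""
    return {
        name: min(snp_pvalues[snp] for snp in snps)
        for name, snps in pathways.items()
    }

def rank_pathways(scores):
    """Order pathway names from the smallest (most significant) score to the largest."""
    return sorted(scores, key=scores.get)

# Hypothetical single-SNP association p-values.
snp_pvalues = {"rs1": 0.20, "rs2": 0.003, "rs3": 0.04, "rs4": 0.50, "rs5": 0.01}

# Hypothetical SNP-to-pathway assignment.
pathways = {
    "pathway_A": ["rs1", "rs2"],
    "pathway_B": ["rs3", "rs4"],
    "pathway_C": ["rs5"],
}

scores = min_p_pathway_scores(snp_pvalues, pathways)
ranking = rank_pathways(scores)
```

Because minP keeps only one SNP per pathway, it ignores any joint signal spread across several SNPs, which is the information that dimension-reduction methods such as PCA or KSIR try to retain.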