8 resultados para Statistical Simulation

em DigitalCommons@The Texas Medical Center


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. ^ Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010).In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that comparing to linear or categorical models, the fractional polynomial models, with the higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of fractional polynomial model was close to that of linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model.^ Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Calcium levels in spines play a significant role in determining the sign and magnitude of synaptic plasticity. The magnitude of calcium influx into spines is highly dependent on influx through N-methyl D-aspartate (NMDA) receptors, and therefore depends on the number of postsynaptic NMDA receptors in each spine. We have calculated previously how the number of postsynaptic NMDA receptors determines the mean and variance of calcium transients in the postsynaptic density, and how this alters the shape of plasticity curves. However, the number of postsynaptic NMDA receptors in the postsynaptic density is not well known. Anatomical methods for estimating the number of NMDA receptors produce estimates that are very different than those produced by physiological techniques. The physiological techniques are based on the statistics of synaptic transmission and it is difficult to experimentally estimate their precision. In this paper we use stochastic simulations in order to test the validity of a physiological estimation technique based on failure analysis. We find that the method is likely to underestimate the number of postsynaptic NMDA receptors, explain the source of the error, and re-derive a more precise estimation technique. We also show that the original failure analysis as well as our improved formulas are not robust to small estimation errors in key parameters.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genetic anticipation is defined as a decrease in age of onset or increase in severity as the disorder is transmitted through subsequent generations. Anticipation has been noted in the literature for over a century. Recently, anticipation in several diseases including Huntington's Disease, Myotonic Dystrophy and Fragile X Syndrome were shown to be caused by expansion of triplet repeats. Anticipation effects have also been observed in numerous mental disorders (e.g. Schizophrenia, Bipolar Disorder), cancers (Li-Fraumeni Syndrome, Leukemia) and other complex diseases. ^ Several statistical methods have been applied to determine whether anticipation is a true phenomenon in a particular disorder, including standard statistical tests and newly developed affected parent/affected child pair methods. These methods have been shown to be inappropriate for assessing anticipation for a variety of reasons, including familial correlation and low power. Therefore, we have developed family-based likelihood modeling approaches to model the underlying transmission of the disease gene and penetrance function and hence detect anticipation. These methods can be applied in extended families, thus improving the power to detect anticipation compared with existing methods based only upon parents and children. The first method we have proposed is based on the regressive logistic hazard model. This approach models anticipation by a generational covariate. The second method allows alleles to mutate as they are transmitted from parents to offspring and is appropriate for modeling the known triplet repeat diseases in which the disease alleles can become more deleterious as they are transmitted across generations. ^ To evaluate the new methods, we performed extensive simulation studies for data simulated under different conditions to evaluate the effectiveness of the algorithms to detect genetic anticipation. Results from analysis by the first method yielded empirical power greater than 87% based on the 5% type I error critical value identified in each simulation depending on the method of data generation and current age criteria. Analysis by the second method was not possible due to the current formulation of the software. The application of this method to Huntington's Disease and Li-Fraumeni Syndrome data sets revealed evidence for a generation effect in both cases. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Interim clinical trial monitoring procedures were motivated by ethical and economic considerations. Classical Brownian motion (Bm) techniques for statistical monitoring of clinical trials were widely used. Conditional power argument and α-spending function based boundary crossing probabilities are popular statistical hypothesis testing procedures under the assumption of Brownian motion. However, it is not rare that the assumptions of Brownian motion are only partially met for trial data. Therefore, I used a more generalized form of stochastic process, called fractional Brownian motion (fBm), to model the test statistics. Fractional Brownian motion does not hold Markov property and future observations depend not only on the present observations but also on the past ones. In this dissertation, we simulated a wide range of fBm data, e.g., H = 0.5 (that is, classical Bm) vs. 0.5< H <1, with treatment effects vs. without treatment effects. Then the performance of conditional power and boundary-crossing based interim analyses were compared by assuming that the data follow Bm or fBm. Our simulation study suggested that the conditional power or boundaries under fBm assumptions are generally higher than those under Bm assumptions when H > 0.5 and also matches better with the empirical results. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mixed longitudinal designs are important study designs for many areas of medical research. Mixed longitudinal studies have several advantages over cross-sectional or pure longitudinal studies, including shorter study completion time and ability to separate time and age effects, thus are an attractive choice. Statistical methodology used in general longitudinal studies has been rapidly developing within the last few decades. Common approaches for statistical modeling in studies with mixed longitudinal designs have been the linear mixed-effects model incorporating an age or time effect. The general linear mixed-effects model is considered an appropriate choice to analyze repeated measurements data in longitudinal studies. However, common use of linear mixed-effects model on mixed longitudinal studies often incorporates age as the only random-effect but fails to take into consideration the cohort effect in conducting statistical inferences on age-related trajectories of outcome measurements. We believe special attention should be paid to cohort effects when analyzing data in mixed longitudinal designs with multiple overlapping cohorts. Thus, this has become an important statistical issue to address. ^ This research aims to address statistical issues related to mixed longitudinal studies. The proposed study examined the existing statistical analysis methods for the mixed longitudinal designs and developed an alternative analytic method to incorporate effects from multiple overlapping cohorts as well as from different aged subjects. The proposed study used simulation to evaluate the performance of the proposed analytic method by comparing it with the commonly-used model. Finally, the study applied the proposed analytic method to the data collected by an existing study Project HeartBeat!, which had been evaluated using traditional analytic techniques. Project HeartBeat! is a longitudinal study of cardiovascular disease (CVD) risk factors in childhood and adolescence using a mixed longitudinal design. The proposed model was used to evaluate four blood lipids adjusting for age, gender, race/ethnicity, and endocrine hormones. The result of this dissertation suggest the proposed analytic model could be a more flexible and reliable choice than the traditional model in terms of fitting data to provide more accurate estimates in mixed longitudinal studies. Conceptually, the proposed model described in this study has useful features, including consideration of effects from multiple overlapping cohorts, and is an attractive approach for analyzing data in mixed longitudinal design studies.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An interim analysis is usually applied in later phase II or phase III trials to find convincing evidence of a significant treatment difference that may lead to trial termination at an earlier point than planned at the beginning. This can result in the saving of patient resources and shortening of drug development and approval time. In addition, ethics and economics are also the reasons to stop a trial earlier. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inferences can be obtained at different stages of the trials. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical powers and the empirical type I errors among the simulated datasets with different sample sizes, different effect sizes, different correlation coefficients for matched pairs, and different data distributions, respectively, in the interim and final analysis with 4 different group sequential methods. Empirical powers and empirical type I errors were also compared to those estimated by using the meta-analysis t-test among the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for data with normally distributed observations, the hybrid statistic has a greater power for data observed from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Powers rose with the increase in sample size, effect size, and correlation coefficient for the matched pairs. In addition, lower type I errors were observed estimated by using the hybrid statistic, which indicates that this test is also conservative for data with outliers in the interim analysis of clinical trials.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Phase I clinical trial is considered the "first in human" study in medical research to examine the toxicity of a new agent. It determines the maximum tolerable dose (MTD) of a new agent, i.e., the highest dose in which toxicity is still acceptable. Several phase I clinical trial designs have been proposed in the past 30 years. The well known standard method, so called the 3+3 design, is widely accepted by clinicians since it is the easiest to implement and it does not need a statistical calculation. Continual reassessment method (CRM), a design uses Bayesian method, has been rising in popularity in the last two decades. Several variants of the CRM design have also been suggested in numerous statistical literatures. Rolling six is a new method introduced in pediatric oncology in 2008, which claims to shorten the trial duration as compared to the 3+3 design. The goal of the present research was to simulate clinical trials and compare these phase I clinical trial designs. Patient population was created by discrete event simulation (DES) method. The characteristics of the patients were generated by several distributions with the parameters derived from a historical phase I clinical trial data review. Patients were then selected and enrolled in clinical trials, each of which uses the 3+3 design, the rolling six, or the CRM design. Five scenarios of dose-toxicity relationship were used to compare the performance of the phase I clinical trial designs. One thousand trials were simulated per phase I clinical trial design per dose-toxicity scenario. The results showed the rolling six design was not superior to the 3+3 design in terms of trial duration. The time to trial completion was comparable between the rolling six and the 3+3 design. However, they both shorten the duration as compared to the two CRM designs. Both CRMs were superior to the 3+3 design and the rolling six in accuracy of MTD estimation. The 3+3 design and rolling six tended to assign more patients to undesired lower dose levels. The toxicities were slightly greater in the CRMs.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize some additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to the increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in the genetics/genomics researches and demonstrating superiority over some regular approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert its functionalities and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) Deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) Developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing historical data. We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. Our proposed models hold the hierarchical constraints, that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions and successfully identifying the reported associations. This is practically appealing for the study of investigating the causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework, by which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for the gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for the gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle the related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for the gene-environment interactions on behalf of the success on balancing the statistical efficiency and bias in a unified model. By extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both of genetic/genomic and clinical studies.