986 resultados para Biology, Biostatistics|Health Sciences, Epidemiology|Health Sciences, Oncology
Resumo:
Background and purpose. Brain lesions in acute ischemic stroke measured by imaging tools provide important clinical information for diagnosis and final infarct volume has been considered as a potential surrogate marker for clinical outcomes. Strong correlations have been found between lesion volume and clinical outcomes in the NINDS t-PA Stroke Trial but little has been published about lesion location and clinical outcomes. Studies of the National Institute of Neurological Disorders and Stroke (NINDS) t-PA Stroke Trial data found the direction of the t-PA treatment effect on a decrease in CT lesion volume was consistent with the observed clinical effects at 3 months, but measure of t-PA treatment benefits using CT lesion volumes showed a diminished statistical significance, as compared to using clinical scales. ^ Methods. We used the global test to evaluate the hypothesis that lesion locations were strongly associated with clinical outcomes within each treatment group at 3 months after stroke. The anatomic locations of CT scans were used for analysis. We also assessed the effect of t-PA on lesion location using a global statistical test. ^ Results. In the t-PA group, patients with frontal lesions had larger infarct volumes and worse NIHSS score at 3 months after stroke. The clinical status of patients with frontal lesions in t-PA group was less likely to be affected by lesion volume, as compared to those who had no frontal lesions in at 3 months. For patients within the placebo group, both brain stem and internal capsule locations were significantly associated with a lower odd of having favorable outcomes at 3 months. Using a global test we could not detect a significant effect of t-PA treatment on lesion location although differences between two treatment groups in the proportion of lesion findings in each location were found. ^ Conclusions. Frontal, brain stem, and internal capsule locations were significantly related to clinical status at 3 months after stroke onset. We detect no significant t-PA effect on all 9 locations although proportion of lesion findings in differed among locations between the two treatment groups.^
Resumo:
An interim analysis is usually applied in later phase II or phase III trials to find convincing evidence of a significant treatment difference that may lead to trial termination at an earlier point than planned at the beginning. This can result in the saving of patient resources and shortening of drug development and approval time. In addition, ethics and economics are also the reasons to stop a trial earlier. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inferences can be obtained at different stages of the trials. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical powers and the empirical type I errors among the simulated datasets with different sample sizes, different effect sizes, different correlation coefficients for matched pairs, and different data distributions, respectively, in the interim and final analysis with 4 different group sequential methods. Empirical powers and empirical type I errors were also compared to those estimated by using the meta-analysis t-test among the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for data with normally distributed observations, the hybrid statistic has a greater power for data observed from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Powers rose with the increase in sample size, effect size, and correlation coefficient for the matched pairs. In addition, lower type I errors were observed estimated by using the hybrid statistic, which indicates that this test is also conservative for data with outliers in the interim analysis of clinical trials.^
Resumo:
This study analyzed the relationship between fasting blood glucose (FBG) and 8-year mortality in the Hypertension Detection Follow-up Program (HDFP) population. Fasting blood glucose (FBG) was examined both as a continuous variable and by specified FBG strata: Normal (FBG 60–100 mg/dL), Impaired (FBG ≥100 and ≤125 mg/dL), and Diabetic (FBG>125 mg/dL or pre-existing diabetes) subgroups. The relationship between type 2 diabetes was examined with all-cause mortality. This thesis described and compared the characteristics of fasting blood glucose strata by recognized glucose cut-points; described the mortality rates in the various fasting blood glucose strata using Kaplan-Meier mortality curves, and compared the mortality risk of various strata using Cox Regression analysis. Overall, mortality was significantly greater among Referred Care (RC) participants compared to Stepped Care (SC) {HR = 1.17; 95% CI (1.052,1.309); p-value = 0.004}, as reported by the HDFP investigators in 1979. Compared with SC participants, the RC mortality rate was significantly higher for the Normal FBG group {HR = 1.18; 95% CI (1.029,1.363); p-value = 0.019} and the Impaired FBG group, {HR = 1.34; 95% CI (1.036,1.734); p-value = 0.026,}. However, for the diabetic group, 8-year mortality did not differ significantly between the RC and SC groups after adjusting for race, gender, age, smoking status among Diabetic individuals {HR = 1.03; 95% CI (0.816,1.303); p-value = 0.798}. This latter finding is possibly due to a lack of a treatment difference of hypertension among Diabetic participants in both RC and SC groups. The largest difference in mortality between RC and SC was in the Impaired subgroup, suggesting that hypertensive patients with FBG between 100 and 125 mg/dL would benefit from aggressive antihypertensive therapy.^
Resumo:
This study proposed a novel statistical method that modeled the multiple outcomes and missing data process jointly using item response theory. This method follows the "intent-to-treat" principle in clinical trials and accounts for the correlation between outcomes and missing data process. This method may provide a good solution to chronic mental disorder study. ^ The simulation study demonstrated that if the true model is the proposed model with moderate or strong correlation, ignoring the within correlation may lead to overestimate of the treatment effect and result in more type I error than specified level. Even if the within correlation is small, the performance of proposed model is as good as naïve response model. Thus, the proposed model is robust for different correlation settings if the data is generated by the proposed model.^
Resumo:
There are two practical challenges in the phase I clinical trial conduct: lack of transparency to physicians, and the late onset toxicity. In my dissertation, Bayesian approaches are used to address these two problems in clinical trial designs. The proposed simple optimal designs cast the dose finding problem as a decision making process for dose escalation and deescalation. The proposed designs minimize the incorrect decision error rate to find the maximum tolerated dose (MTD). For the late onset toxicity problem, a Bayesian adaptive dose-finding design for drug combination is proposed. The dose-toxicity relationship is modeled using the Finney model. The unobserved delayed toxicity outcomes are treated as missing data and Bayesian data augment is employed to handle the resulting missing data. Extensive simulation studies have been conducted to examine the operating characteristics of the proposed designs and demonstrated the designs' good performances in various practical scenarios.^
Resumo:
Mixture modeling is commonly used to model categorical latent variables that represent subpopulations in which population membership is unknown but can be inferred from the data. In relatively recent years, the potential of finite mixture models has been applied in time-to-event data. However, the commonly used survival mixture model assumes that the effects of the covariates involved in failure times differ across latent classes, but the covariate distribution is homogeneous. The aim of this dissertation is to develop a method to examine time-to-event data in the presence of unobserved heterogeneity under a framework of mixture modeling. A joint model is developed to incorporate the latent survival trajectory along with the observed information for the joint analysis of a time-to-event variable, its discrete and continuous covariates, and a latent class variable. It is assumed that the effects of covariates on survival times and the distribution of covariates vary across different latent classes. The unobservable survival trajectories are identified through estimating the probability that a subject belongs to a particular class based on observed information. We applied this method to a Hodgkin lymphoma study with long-term follow-up and observed four distinct latent classes in terms of long-term survival and distributions of prognostic factors. Our results from simulation studies and from the Hodgkin lymphoma study demonstrated the superiority of our joint model compared with the conventional survival model. This flexible inference method provides more accurate estimation and accommodates unobservable heterogeneity among individuals while taking involved interactions between covariates into consideration.^
Resumo:
Pathway based genome wide association study evolves from pathway analysis for microarray gene expression and is under rapid development as a complementary for single-SNP based genome wide association study. However, it faces new challenges, such as the summarization of SNP statistics to pathway statistics. The current study applies the ridge regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compared this method to the other two widely used methods, the minimal-p-value (minP) approach of assigning the best test statistics of all SNPs in each pathway as the statistics of the pathway and the principal component analysis (PCA) method of utilizing PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls and100 SNPs demonstrated that KSIR method outperformed the other two methods in terms of causal pathway ranking and the statistical power. PCA method showed similar performance as the minP method. KSIR method also showed a better performance over the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset consisting of 1762 cases, 3773 controls as the discovery cohort and 591 cases, 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provided a reference for further methodology development and identified novel pathways that may be of importance to the development of ulcerative colitis.^
New methods for quantification and analysis of quantitative real-time polymerase chain reaction data
Resumo:
Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The currently used methods for PCR data analysis, including the threshold cycle (CT) method, linear and non-linear model fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate, and therefore can distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each two consecutive PCR cycles, we subtracted the fluorescence in the former cycle from that in the later cycle, transforming the n cycle raw data into n-1 cycle data. Then linear regression was applied to the natural logarithm of the transformed data. Finally, amplification efficiencies and the initial DNA molecular numbers were calculated for each PCR run. To evaluate this new method, we compared it in terms of accuracy and precision with the original linear regression method with three background corrections, being the mean of cycles 1-3, the mean of cycles 3-7, and the minimum. Three criteria, including threshold identification, max R2, and max slope, were employed to search for target data points. Considering that PCR data are time series data, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and when the linear mixed model was adopted, the taking-difference linear regression method was superior as it gave an accurate estimation of initial DNA amount and a reasonable estimation of PCR amplification efficiencies. When the criteria of max R2 and max slope were used, the original linear regression method gave an accurate estimation of initial DNA amount. Overall, the taking-difference linear regression method avoids the error in subtracting an unknown background and thus it is theoretically more accurate and reliable. This method is easy to perform and the taking-difference strategy can be extended to all current methods for qPCR data analysis.^
Resumo:
The genomic era brought by recent advances in the next-generation sequencing technology makes the genome-wide scans of natural selection a reality. Currently, almost all the statistical tests and analytical methods for identifying genes under selection was performed on the individual gene basis. Although these methods have the power of identifying gene subject to strong selection, they have limited power in discovering genes targeted by moderate or weak selection forces, which are crucial for understanding the molecular mechanisms of complex phenotypes and diseases. Recent availability and rapid completeness of many gene network and protein-protein interaction databases accompanying the genomic era open the avenues of exploring the possibility of enhancing the power of discovering genes under natural selection. The aim of the thesis is to explore and develop normal mixture model based methods for leveraging gene network information to enhance the power of natural selection target gene discovery. The results show that the developed statistical method, which combines the posterior log odds of the standard normal mixture model and the Guilt-By-Association score of the gene network in a naïve Bayes framework, has the power to discover moderate/weak selection gene which bridges the genes under strong selection and it helps our understanding the biology under complex diseases and related natural selection phenotypes.^
Resumo:
With most clinical trials, missing data presents a statistical problem in evaluating a treatment's efficacy. There are many methods commonly used to assess missing data; however, these methods leave room for bias to enter the study. This thesis was a secondary analysis on data taken from TIME, a phase 2 randomized clinical trial conducted to evaluate the safety and effect of the administration timing of bone marrow mononuclear cells (BMMNC) for subjects with acute myocardial infarction (AMI).^ We evaluated the effect of missing data by comparing the variance inflation factor (VIF) of the effect of therapy between all subjects and only subjects with complete data. Through the general linear model, an unbiased solution was made for the VIF of the treatment's efficacy using the weighted least squares method to incorporate missing data. Two groups were identified from the TIME data: 1) all subjects and 2) subjects with complete data (baseline and follow-up measurements). After the general solution was found for the VIF, it was migrated Excel 2010 to evaluate data from TIME. The resulting numerical value from the two groups was compared to assess the effect of missing data.^ The VIF values from the TIME study were considerably less in the group with missing data. By design, we varied the correlation factor in order to evaluate the VIFs of both groups. As the correlation factor increased, the VIF values increased at a faster rate in the group with only complete data. Furthermore, while varying the correlation factor, the number of subjects with missing data was also varied to see how missing data affects the VIF. When subjects with only baseline data was increased, we saw a significant rate increase in VIF values in the group with only complete data while the group with missing data saw a steady and consistent increase in the VIF. The same was seen when we varied the group with follow-up only data. This essentially showed that the VIFs steadily increased when missing data is not ignored. When missing data is ignored as with our comparison group, the VIF values sharply increase as correlation increases.^
Resumo:
The development of targeted therapy involve many challenges. Our study will address some of the key issues involved in biomarker identification and clinical trial design. In our study, we propose two biomarker selection methods, and then apply them in two different clinical trial designs for targeted therapy development. In particular, we propose a Bayesian two-step lasso procedure for biomarker selection in the proportional hazards model in Chapter 2. In the first step of this strategy, we use the Bayesian group lasso to identify the important marker groups, wherein each group contains the main effect of a single marker and its interactions with treatments. In the second step, we zoom in to select each individual marker and the interactions between markers and treatments in order to identify prognostic or predictive markers using the Bayesian adaptive lasso. In Chapter 3, we propose a Bayesian two-stage adaptive design for targeted therapy development while implementing the variable selection method given in Chapter 2. In Chapter 4, we proposed an alternate frequentist adaptive randomization strategy for situations where a large number of biomarkers need to be incorporated in the study design. We also propose a new adaptive randomization rule, which takes into account the variations associated with the point estimates of survival times. In all of our designs, we seek to identify the key markers that are either prognostic or predictive with respect to treatment. We are going to use extensive simulation to evaluate the operating characteristics of our methods.^
Resumo:
Prevalent sampling is an efficient and focused approach to the study of the natural history of disease. Right-censored time-to-event data observed from prospective prevalent cohort studies are often subject to left-truncated sampling. Left-truncated samples are not randomly selected from the population of interest and have a selection bias. Extensive studies have focused on estimating the unbiased distribution given left-truncated samples. However, in many applications, the exact date of disease onset was not observed. For example, in an HIV infection study, the exact HIV infection time is not observable. However, it is known that the HIV infection date occurred between two observable dates. Meeting these challenges motivated our study. We propose parametric models to estimate the unbiased distribution of left-truncated, right-censored time-to-event data with uncertain onset times. We first consider data from a length-biased sampling, a specific case in left-truncated samplings. Then we extend the proposed method to general left-truncated sampling. With a parametric model, we construct the full likelihood, given a biased sample with unobservable onset of disease. The parameters are estimated through the maximization of the constructed likelihood by adjusting the selection bias and unobservable exact onset. Simulations are conducted to evaluate the finite sample performance of the proposed methods. We apply the proposed method to an HIV infection study, estimating the unbiased survival function and covariance coefficients. ^
Resumo:
Phase I clinical trial is mainly designed to determine the maximum tolerated dose (MTD) of a new drug. Optimization of phase I trial design is crucial to minimize the number of enrolled patients exposed to unsafe dose levels and to provide reliable information to the later phases of clinical trials. Although it has been criticized about its inefficient MTD estimation, nowadays the traditional 3+3 method remains dominant in practice due to its simplicity and conservative estimation. There are many new designs that have been proven to generate more credible MTD estimation, such as the Continual Reassessment Method (CRM). Despite its accepted better performance, the CRM design is still not widely used in real trials. There are several factors that contribute to the difficulties of CRM adaption in practice. First, CRM is not widely accepted by the regulatory agencies such as FDA in terms of safety. It is considered to be less conservative and tend to expose more patients above the MTD level than the traditional design. Second, CRM is relatively complex and not intuitive for the clinicians to fully understand. Third, the CRM method take much more time and need statistical experts and computer programs throughout the trial. The current situation is that the clinicians still tend to follow the trial process that they are comfortable with. This situation is not likely to change in the near future. Based on this situation, we have the motivation to improve the accuracy of MTD selection while follow the procedure of the traditional design to maintain simplicity. We found that in 3+3 method, the dose transition and the MTD determination are relatively independent. Thus we proposed to separate the two stages. The dose transition rule remained the same as 3+3 method. After getting the toxicity information from the dose transition stage, we combined the isotonic transformation to ensure the monotonic increasing order before selecting the optimal MTD. To compare the operating characteristics of the proposed isotonic method and the other designs, we carried out 10,000 simulation trials under different dose setting scenarios to compare the design characteristics of the isotonic modified method with standard 3+3 method, CRM, biased coin design (BC) and k-in-a-row design (KIAW). The isotonic modified method improved MTD estimation of the standard 3+3 in 39 out of 40 scenarios. The improvement is much greater when the target is 0.3 other than 0.25. The modified design is also competitive when comparing with other selected methods. A CRM method performed better in general but was not as stable as the isotonic method throughout the different dose settings. The results demonstrated that our proposed isotonic modified method is not only easily conducted using the same procedure as 3+3 but also outperforms the conventional 3+3 design. It can also be applied to determine MTD for any given TTL. These features make the isotonic modified method of practical value in phase I clinical trials.^
Resumo:
Cervical cancer is the leading cause of death and disease from malignant neoplasms among women in developing countries. Even though the Pap smear has significantly decreased the number of deaths from cervical cancer in the past years, it has its limitations. Researchers have developed an automated screening machine which can potentially detect abnormal cases that are overlooked by conventional screening. The goal of quantitative cytology is to classify the patient's tissue sample based on quantitative measurements of the individual cells. It is also much cheaper and potentially can take less time. One of the major challenges of collecting cells with a cytobrush is the possibility of not sampling any existing dysplastic cells on the cervix. Being able to correctly classify patients who have disease without the presence of dysplastic cells could improve the accuracy of quantitative cytology algorithms. Subtle morphologic changes in normal-appearing tissues adjacent to or distant from malignant tumors have been shown to exist, but a comparison of various statistical methods, including many recent advances in the statistical learning field, has not previously been done. The objective of this thesis is to use different classification methods applied to quantitative cytology data for the detection of malignancy associated changes (MACs). In this thesis, Elastic Net is the best algorithm. When we applied the Elastic Net algorithm to the test set, we combined the training set and validation set as "training" set and used 5-fold cross validation to choose the parameter for Elastic Net. It has a sensitivity of 47% at 80% specificity, an AUC 0.52, and a partial AUC 0.10 (95% CI 0.09-0.11).^
Resumo:
Conservative procedures in low-dose risk assessment are used to set safety standards for known or suspected carcinogens. However, the assumptions upon which the methods are based and the effects of these methods are not well understood.^ To minimize the number of false-negatives and to reduce the cost of bioassays, animals are given very high doses of potential carcinogens. Results must then be extrapolated to much smaller doses to set safety standards for risks such as one per million. There are a number of competing methods that add a conservative safety factor into these calculations.^ A method of quantifying the conservatism of these methods was described and tested on eight procedures used in setting low-dose safety standards. The results using these procedures were compared by computer simulation and by the use of data from a large scale animal study.^ The method consisted of determining a "true safe dose" (tsd) according to an assumed underlying model. If one assumed that Y = the probability of cancer = P(d), a known mathematical function of the dose, then by setting Y to some predetermined acceptable risk, one can solve for d, the model's "true safe dose".^ Simulations were generated, assuming a binomial distribution, for an artificial bioassay. The eight procedures were then used to determine a "virtual safe dose" (vsd) that estimates the tsd, assuming a risk of one per million. A ratio R = ((tsd-vsd)/vsd) was calculated for each "experiment" (simulation). The mean R of 500 simulations and the probability R $<$ 0 was used to measure the over and under conservatism of each procedure.^ The eight procedures included Weil's method, Hoel's method, the Mantel-Byran method, the improved Mantel-Byran, Gross's method, fitting a one-hit model, Crump's procedure, and applying Rai and Van Ryzin's method to a Weibull model.^ None of the procedures performed uniformly well for all types of dose-response curves. When the data were linear, the one-hit model, Hoel's method, or the Gross-Mantel method worked reasonably well. However, when the data were non-linear, these same methods were overly conservative. Crump's procedure and the Weibull model performed better in these situations. ^