Abstract:
Objective: The primary objective of our study was to evaluate the effect of metformin in patients with metastatic renal cell carcinoma (mRCC) and diabetes who were on frontline therapy with tyrosine kinase inhibitors. The effect of therapy was described in terms of overall survival and progression-free survival. Comparisons were made between diabetic mRCC patients on frontline therapy who received metformin and those who received insulin. Exploratory analyses were also performed comparing non-diabetic mRCC patients receiving frontline therapy with diabetic mRCC patients receiving metformin.
Methods: The study design is a retrospective case series describing the response rate of frontline therapy in combination with metformin for mRCC patients with type 2 diabetes mellitus. The cohort was selected from a database generated to assess tyrosine kinase inhibitor therapy-associated hypertension in metastatic renal cell carcinoma at MD Anderson Cancer Center. Patients of all ethnic and racial backgrounds who had been started on frontline therapy for metastatic renal cell carcinoma were selected for the study. Patients who took frontline therapy for less than 3 months or were lost to follow-up were excluded. The exposure variable was treatment with metformin, defined as having taken metformin for type 2 diabetes at any time from the diagnosis of metastatic renal cell carcinoma. The outcomes assessed were date of death or last available follow-up for overall survival, and date of radiologically documented disease progression for time to progression. Response rates were compared across covariates known to be strongly associated with renal cell carcinoma.
Results: For the primary analysis comparing the insulin and metformin groups, there were 82 patients, of whom 50 took insulin therapy and 32 took metformin therapy for type 2 diabetes. For the exploratory analysis, we compared the 32 diabetic patients on metformin to 146 non-diabetic patients not on metformin. Baseline characteristics were compared between groups. The time from the start of treatment to the date of progression of renal cell carcinoma, and to the date of death or last follow-up, was used for the survival analyses.
In the primary analysis there was a significant difference in time to progression between patients receiving metformin and those receiving insulin, and this difference was also seen in the exploratory analysis. The median time to progression in the primary analysis was 1259 days (95% CI: 659-1832 days) for patients on metformin compared with 540 days (95% CI: 350-894 days) for patients receiving insulin (p = 0.024). The median time to progression in the exploratory analysis was 1259 days (95% CI: 659-1832 days) for patients on metformin compared with 279 days (95% CI: 202-372 days) in the non-diabetic group (p < 0.0001).
The median overall survival was 1004 days in the metformin group (95% CI: 761-1212 days) compared with 816 days (95% CI: 558-1405 days) in the insulin group (p = 0.91). In the exploratory analysis, the median overall survival was 1004 days in the metformin group (95% CI: 761-1212 days) compared with 766 days (95% CI: 649-965 days) in the non-diabetic group (p = 0.78).
Metformin was observed to improve progression-free survival in both the primary and exploratory analyses (HR = 0.52 for metformin vs insulin and HR = 0.36 for metformin vs the non-diabetic group, respectively).
Conclusion: In laboratory studies and a few clinical studies, metformin has been shown to have dual benefits in patients with cancer and type 2 diabetes, through its action on the mammalian target of rapamycin (mTOR) pathway and its effect in lowering blood glucose by increasing the sensitivity of insulin receptors to insulin. Several studies in breast cancer patients have documented a beneficial effect of metformin use, quantified by pathological remission, in patients receiving breast cancer treatment. Combining metformin with frontline therapy for renal cell carcinoma may provide a significant benefit in prolonging overall survival in patients with metastatic renal cell carcinoma and diabetes.
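For readers who want to reproduce this style of analysis, the sketch below shows how median time to progression and a log-rank comparison between a metformin group and an insulin group could be computed with the Python lifelines package. The file name and column names (ttp_days, progressed, group) are hypothetical, not those of the study.

```python
# Illustrative sketch only (not the study's code): Kaplan-Meier medians and a
# log-rank comparison of time to progression between two treatment groups.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.read_csv("mrcc_cohort.csv")  # placeholder file name
met = df[df["group"] == "metformin"]
ins = df[df["group"] == "insulin"]

km = KaplanMeierFitter()
for name, sub in [("metformin", met), ("insulin", ins)]:
    km.fit(sub["ttp_days"], event_observed=sub["progressed"], label=name)
    print(name, "median time to progression:", km.median_survival_time_)

# Log-rank test for the difference in time to progression between the groups.
result = logrank_test(met["ttp_days"], ins["ttp_days"],
                      event_observed_A=met["progressed"],
                      event_observed_B=ins["progressed"])
print("log-rank p-value:", result.p_value)
```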
Abstract:
Colorectal cancer (CRC) is the third leading cancer in both incidence and mortality in Texas. This study investigated the adherence of CRC treatment to standard treatment guidelines and the association between standard treatment and CRC survival in Texas. We used Texas Cancer Registry (TCR) and Medicare linked data to study CRC treatment patterns and factors associated with standard treatment in patients who were older than 65 years and were diagnosed in 2001 through 2007. We also determined whether adherence to standard treatment affects patients' survival. Multiple logistic regression and Cox regression analyses were used to analyze the data. Both regression models were adjusted for demographic and tumor characteristics. We found that among the 3977 regional colon cancer patients aged 80 years or younger, 60.2% received chemotherapy, in adherence with the recommended treatment guidelines. Patients with younger age, female gender, higher education and lower comorbidity score were more likely to adhere to this chemotherapy guideline. Patients in this cohort who adhered to chemotherapy had better survival than those who did not (HR: 0.76, 95% CI: 0.68-0.84). Among the 12709 colon cancer patients treated with surgery, 49.3% had more than 12 lymph nodes removed, in adherence with the treatment guidelines. Patients with younger age, female gender, higher education, regional stage, larger tumor size and lower comorbidity score were more likely to adhere to this surgery guideline. Patients in this cohort with more than 12 lymph nodes removed had better survival (HR: 0.86, 95% CI: 0.82-0.91). Among the 1211 regional rectal cancer patients aged 80 years or younger, 63.2% were adherent to the radiation treatment guideline. Patients with smaller tumor size and lower comorbidity score were more likely to adhere to this radiation guideline. There was no significant survival difference between radiation-adherent and non-adherent patients (HR: 1.03, 95% CI: 0.82-1.29). Among the 1122 regional rectal cancer patients aged 80 years or younger who were treated with surgery, 76.0% received postoperative chemotherapy, in adherence with the treatment guidelines. Younger age and lower comorbidity score were associated with a higher adherence rate. Patients in this cohort who were adherent to adjuvant chemotherapy had better survival than those who were not (HR: 0.60, 95% CI: 0.45-0.79).
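As an illustration only (not the authors' code), the two adjusted models described above could be set up as follows in Python, with hypothetical column names for adherence, survival time, vital status, and the demographic and tumor covariates.

```python
# Illustrative sketch: multiple logistic regression for guideline adherence and
# a Cox model for survival, both adjusted for demographic and tumor characteristics.
import pandas as pd
import statsmodels.formula.api as smf
from lifelines import CoxPHFitter

df = pd.read_csv("tcr_medicare_crc.csv")  # placeholder file name

# Factors associated with receiving guideline-adherent treatment.
adherence_model = smf.logit(
    "adherent ~ age + female + education + comorbidity + stage + tumor_size",
    data=df).fit()
print(adherence_model.summary())

# Association between adherence and survival, adjusted for the same covariates.
cph = CoxPHFitter()
cph.fit(df[["survival_months", "died", "adherent", "age", "female",
            "education", "comorbidity", "stage", "tumor_size"]],
        duration_col="survival_months", event_col="died")
cph.print_summary()  # hazard ratios with 95% confidence intervals
```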
Abstract:
Cervical cancer is the leading cause of death and disease from malignant neoplasms among women in developing countries. Although the Pap smear has significantly decreased the number of deaths from cervical cancer in recent years, it has its limitations. Researchers have developed an automated screening machine which can potentially detect abnormal cases that are overlooked by conventional screening. The goal of quantitative cytology is to classify a patient's tissue sample based on quantitative measurements of the individual cells. It is also much cheaper and potentially less time-consuming. One of the major challenges of collecting cells with a cytobrush is the possibility of not sampling any existing dysplastic cells on the cervix. Being able to correctly classify patients who have disease without the presence of dysplastic cells could improve the accuracy of quantitative cytology algorithms. Subtle morphologic changes in normal-appearing tissues adjacent to or distant from malignant tumors have been shown to exist, but a comparison of various statistical methods, including many recent advances in the statistical learning field, has not previously been done. The objective of this thesis is to apply different classification methods to quantitative cytology data for the detection of malignancy-associated changes (MACs). Among the methods considered, Elastic Net performed best. Before applying the Elastic Net algorithm to the test set, we combined the training and validation sets into a single training set and used 5-fold cross-validation to choose the Elastic Net tuning parameter. The final model achieved a sensitivity of 47% at 80% specificity, an AUC of 0.52, and a partial AUC of 0.10 (95% CI: 0.09-0.11).
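A minimal sketch of this kind of Elastic Net classification workflow is shown below, assuming the cytology features and labels are available as NumPy arrays; the tuning grid, file names, and train/test split are illustrative, and scikit-learn's max_fpr option yields a standardized (McClish) partial AUC rather than the raw value.

```python
# Minimal sketch: elastic-net penalized logistic regression with 5-fold CV,
# reporting AUC, partial AUC, and sensitivity at 80% specificity.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import roc_auc_score, roc_curve

X = np.load("cytology_features.npy")   # placeholder data files
y = np.load("cytology_labels.npy")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# 5-fold cross-validation selects the penalty strength and mixing parameter.
search = GridSearchCV(
    LogisticRegression(penalty="elasticnet", solver="saga", max_iter=5000),
    param_grid={"C": [0.01, 0.1, 1.0], "l1_ratio": [0.1, 0.5, 0.9]},
    scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

scores = search.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, scores)
print("AUC:", roc_auc_score(y_test, scores))
# Note: max_fpr gives the standardized (McClish) partial AUC over FPR <= 0.2.
print("standardized partial AUC:", roc_auc_score(y_test, scores, max_fpr=0.2))
print("sensitivity at 80% specificity:", np.interp(0.20, fpr, tpr))
```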
Abstract:
The hierarchical linear growth model (HLGM), as a flexible and powerful analytic method, has played an increasingly important role in psychology, public health and the medical sciences in recent decades. Most researchers who use HLGM are interested in the treatment effect on individual trajectories, which is captured by the cross-level interaction effects. However, the statistical hypothesis test for a cross-level interaction in HLGM only tells us whether there is a significant group difference in the average rate of change, rate of acceleration or higher polynomial effect; it fails to convey the magnitude of the difference between the group trajectories at specific time points. Thus, reporting and interpreting effect sizes in HLGM has received increasing emphasis in recent years, owing to the limitations of, and growing criticism of, statistical hypothesis testing. However, most researchers fail to report these model-implied effect sizes for group trajectory comparisons and their corresponding confidence intervals in HLGM analyses, because appropriate, standard functions for estimating effect sizes associated with the model-implied difference between group trajectories in HLGM are lacking, as are computing packages in popular statistical software to calculate them automatically.
The present project is the first to establish appropriate computing functions to assess the standardized difference between group trajectories in HLGM. We proposed two functions to estimate effect sizes for the model-based difference between group trajectories at specific time points, and we also suggested robust effect sizes to reduce the bias of the estimated effect sizes. We then applied the proposed functions to estimate the population effect sizes (d) and robust effect sizes (du) for the cross-level interaction in HLGM using three simulated datasets, compared three methods of constructing confidence intervals around d and du, and recommended the best one for application. Finally, we constructed 95% confidence intervals, using the most suitable method, for the effect sizes obtained from the three simulated datasets.
The effect sizes between group trajectories for the three simulated longitudinal datasets indicated that even when the statistical hypothesis test shows no significant difference between group trajectories, the effect sizes between these trajectories can still be large at some time points. Therefore, effect sizes between group trajectories in an HLGM analysis provide additional and meaningful information for assessing the group effect on individual trajectories. In addition, we compared three methods of constructing 95% confidence intervals around the corresponding effect sizes, which address the uncertainty of the effect sizes as estimates of the population parameter. We suggest the noncentral t-distribution-based method when its assumptions hold, and the bootstrap bias-corrected and accelerated method when the assumptions are not met.
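The proposed functions themselves are not reproduced here, but the sketch below illustrates the underlying idea under simple assumptions: fit a linear growth model with a group-by-time interaction, take the model-implied difference between group trajectories at a chosen time point, and standardize it by a pooled outcome SD. The column names (y, time, group coded 0/1, id) and the choice of standardizer are assumptions, not the author's specification.

```python
# Rough sketch only, not the proposed functions: a linear growth model and a
# model-implied standardized group difference at a chosen time point.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("longitudinal.csv")  # placeholder file name

# Hierarchical linear growth model with random intercepts and slopes per subject.
fit = smf.mixedlm("y ~ time * group", df, groups=df["id"], re_formula="~time").fit()

def trajectory_effect_size(result, t, sd):
    """Model-implied group difference at time t, standardized by sd."""
    gap = result.params["group"] + result.params["time:group"] * t
    return gap / sd

pooled_sd = df.groupby("group")["y"].std().mean()   # one simple choice of standardizer
print("d at t = 12:", trajectory_effect_size(fit, 12, pooled_sd))
```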
Abstract:
Conservative procedures in low-dose risk assessment are used to set safety standards for known or suspected carcinogens. However, the assumptions upon which the methods are based and the effects of these methods are not well understood.
To minimize the number of false negatives and to reduce the cost of bioassays, animals are given very high doses of potential carcinogens. Results must then be extrapolated to much smaller doses to set safety standards for risks such as one per million. There are a number of competing methods that add a conservative safety factor into these calculations.
A method of quantifying the conservatism of these procedures was described and tested on eight procedures used in setting low-dose safety standards. The results of these procedures were compared by computer simulation and by the use of data from a large-scale animal study.
The method consisted of determining a "true safe dose" (tsd) according to an assumed underlying model. If one assumes that Y = the probability of cancer = P(d), a known mathematical function of the dose, then by setting Y to some predetermined acceptable risk, one can solve for d, the model's "true safe dose".
Simulations were generated, assuming a binomial distribution, for an artificial bioassay. The eight procedures were then used to determine a "virtual safe dose" (vsd) that estimates the tsd, assuming a risk of one per million. A ratio R = (tsd - vsd)/vsd was calculated for each "experiment" (simulation). The mean of R over 500 simulations and the probability that R < 0 were used to measure the over- and under-conservatism of each procedure.
The eight procedures were Weil's method, Hoel's method, the Mantel-Bryan method, the improved Mantel-Bryan method, Gross's method, fitting a one-hit model, Crump's procedure, and applying Rai and Van Ryzin's method to a Weibull model.
None of the procedures performed uniformly well for all types of dose-response curves. When the data were linear, the one-hit model, Hoel's method, or the Gross-Mantel method worked reasonably well. However, when the data were non-linear, these same methods were overly conservative. Crump's procedure and the Weibull model performed better in these situations.
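None of the eight procedures is reproduced here, but the toy simulation below illustrates the tsd/vsd comparison under an assumed one-hit model, P(d) = 1 - exp(-λd): the true safe dose is derived from the known model, a high-dose bioassay is simulated from binomial counts, the model is refit by maximum likelihood, and R = (tsd - vsd)/vsd summarizes how conservative the estimated safe dose is. Doses, group sizes, and the true parameter are made up.

```python
# Toy tsd/vsd simulation under an assumed one-hit model, P(d) = 1 - exp(-lam * d).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
true_lam, risk = 0.5, 1e-6
doses = np.array([0.25, 0.5, 1.0, 2.0])       # high experimental doses
n_per_group = 50

tsd = -np.log(1.0 - risk) / true_lam          # true safe dose at risk one per million

def neg_loglik(lam, d, y, n):
    p = np.clip(1.0 - np.exp(-lam * d), 1e-12, 1 - 1e-12)
    return -np.sum(y * np.log(p) + (n - y) * np.log(1.0 - p))

R_values = []
for _ in range(500):
    y = rng.binomial(n_per_group, 1.0 - np.exp(-true_lam * doses))
    fit = minimize_scalar(neg_loglik, bounds=(1e-6, 50.0), method="bounded",
                          args=(doses, y, n_per_group))
    vsd = -np.log(1.0 - risk) / fit.x          # estimated ("virtual") safe dose
    R_values.append((tsd - vsd) / vsd)

R_values = np.array(R_values)
print("mean R:", R_values.mean(), "  P(R < 0):", np.mean(R_values < 0))
```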
Abstract:
It is well known that an identification problem exists in the analysis of age-period-cohort data because of the relationship among the three factors (date of birth + age at death = date of death). There are numerous suggestions about how to analyze such data, but no single solution has been satisfactory. The purpose of this study is to provide another analytic method by extending Cox's life-table regression model with time-dependent covariates. The new approach has the following features: (1) It is based on the conditional maximum likelihood procedure using the proportional hazards function described by Cox (1972), treating the age factor as the underlying hazard in order to estimate the parameters for the cohort and period factors. (2) The model is flexible, so that both the cohort and period factors can be treated as dummy or continuous variables, and parameter estimates can be obtained for numerous combinations of variables, as in a regression analysis. (3) The model is applicable even when the time periods are unequally spaced.
Two specific models are considered to illustrate the new approach and are applied to U.S. prostate cancer data. We find that there are significant differences between all cohorts and a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age, indicating that older people have a much higher risk than younger people. A log transformation of relative risk shows that prostate cancer risk declined in recent cohorts under both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period-factor model (0 0 0 1 1 1 1). These latter results are similar to those of the previous study by Holford (1983).
The new approach offers a general method for analyzing age-period-cohort data without imposing any arbitrary constraint in the model.
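Schematically, and with hypothetical column names, the modeling idea can be expressed as a Cox fit in which age serves as the time scale (so the baseline hazard absorbs the age effect) while cohort and period enter as covariates; in the full approach the period factor would be handled as a time-dependent covariate rather than a fixed one.

```python
# Schematic sketch with assumed columns (entry_age, exit_age, died, cohort, period).
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("prostate_mortality.csv")   # placeholder file name

cph = CoxPHFitter()
# Age is the time scale (entry_age handles late entry); cohort is dummy-coded,
# and period is a simple indicator here (e.g., the 0/1 coding mentioned above).
cph.fit(df, duration_col="exit_age", event_col="died", entry_col="entry_age",
        formula="C(cohort) + period")
cph.print_summary()   # log relative risks for the cohort and period factors
```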
Abstract:
False-positive and false-negative rates were calculated for five different designs of the trend test, and it was demonstrated that a design suggested by Portier and Hoel in 1984 for a different problem produced the lowest false-positive and false-negative rates when applied to historical spontaneous tumor rate data for Fischer rats.
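The five designs compared in this work are not reproduced here; as a generic illustration only, the sketch below estimates the false-positive rate of one common dose-response trend test (the Cochran-Armitage statistic) by simulating bioassays with a constant spontaneous tumor rate and no dose effect. Doses, group sizes, and the background rate are assumptions.

```python
# Generic illustration: false-positive rate of a Cochran-Armitage trend test
# estimated by simulation under a constant spontaneous tumor rate.
import numpy as np
from scipy.stats import norm

def cochran_armitage_z(responders, group_sizes, doses):
    r, n, d = map(np.asarray, (responders, group_sizes, doses))
    N, pbar = n.sum(), responders.sum() / n.sum()
    t = np.sum(d * (r - n * pbar))
    var = pbar * (1 - pbar) * (np.sum(n * d ** 2) - np.sum(n * d) ** 2 / N)
    return t / np.sqrt(var) if var > 0 else 0.0

rng = np.random.default_rng(0)
doses = np.array([0.0, 1.0, 2.0, 3.0])
group_sizes = np.array([50, 50, 50, 50])
background_rate = 0.05                         # spontaneous tumor rate, no dose effect

n_sims, rejections = 10000, 0
for _ in range(n_sims):
    r = rng.binomial(group_sizes, background_rate)
    z = cochran_armitage_z(r, group_sizes, doses)
    rejections += norm.sf(z) < 0.05            # one-sided test for an increasing trend
print("estimated false-positive rate:", rejections / n_sims)
```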
Abstract:
The problem of analyzing data with updated measurements in the time-dependent proportional hazards model arises frequently in practice. One available option is to reduce the number of intervals (or updated measurements) included in the Cox regression model. We empirically investigated the bias of the estimator of the time-dependent covariate coefficient while varying the failure rate, sample size, true values of the parameters and the number of intervals. We also evaluated how often a time-dependent covariate needs to be collected and assessed the effect of sample size and failure rate on the power of testing a time-dependent effect.
A time-dependent proportional hazards model with two binary covariates was considered. The time axis was partitioned into k intervals. The baseline hazard was assumed to be 1, so that the failure times were exponentially distributed within the ith interval. A type II censoring model was adopted to characterize the failure rate. The factors of interest were sample size (500, 1000), type II censoring with failure rates of 0.05, 0.10, and 0.20, and three values for each of the non-time-dependent and time-dependent covariates (1/4, 1/2, 3/4).
The mean bias of the estimator of the coefficient of the time-dependent covariate decreased as sample size and the number of intervals increased, whereas it increased as the failure rate and the true values of the covariates increased. The mean bias of the estimator was smallest when all of the updated measurements were used in the model, compared with two models that used only selected measurements of the time-dependent covariate. For the model that included all the measurements, the coverage rates of the estimator of the coefficient of the time-dependent covariate were in most cases 90% or more, except when the failure rate was high (0.20). The power associated with testing a time-dependent effect was highest when all of the measurements of the time-dependent covariate were used. An example from the Systolic Hypertension in the Elderly Program Cooperative Research Group is presented.
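A condensed sketch of this type of simulation, with illustrative parameter values (the study's type II censoring scheme is not reproduced): counting-process (start, stop] records are generated with a unit baseline hazard within each interval and a binary covariate updated at every interval, and a time-varying Cox model is fit so the estimated coefficient can be compared with its true value.

```python
# Illustrative simulation: piecewise-exponential failure times with an updated
# binary covariate z and a fixed binary covariate x, fit as (start, stop] records.
import numpy as np
import pandas as pd
from lifelines import CoxTimeVaryingFitter

rng = np.random.default_rng(1)
n, k = 500, 5                       # subjects and number of unit-length intervals
beta_z, beta_x = 0.5, 0.5           # true coefficients

rows = []
for i in range(n):
    x = rng.binomial(1, 0.5)
    for j in range(k):
        z = rng.binomial(1, 0.5)                      # updated measurement for this interval
        t = rng.exponential(1.0 / np.exp(beta_z * z + beta_x * x))  # baseline hazard = 1
        event = t < 1.0
        rows.append({"id": i, "start": j, "stop": j + min(t, 1.0),
                     "event": int(event), "z": z, "x": x})
        if event:
            break

ctv = CoxTimeVaryingFitter()
ctv.fit(pd.DataFrame(rows), id_col="id", event_col="event",
        start_col="start", stop_col="stop")
ctv.print_summary()   # compare the estimated coefficient of z with beta_z = 0.5
```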
Abstract:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items that varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and whether the values of the dependent variable were generated under model fit or lack of fit.
The study found that the Ĉg statistic was adequate in tests of significance for most situations. However, when testing data that deviate from a logistic model, the statistic has low power to detect such deviation. Although groupings of the estimated probabilities into anywhere from 8 to 30 quantiles were studied, the deciles-of-risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a chi-square distribution in that setting, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns.
The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential, since the statistic did not perform as desired when there was moderate to severe collinearity among the covariates.
Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal-size groups by separating ties should be avoided.
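For reference, a minimal sketch of the deciles-of-risk version of the statistic, computed from observed outcomes and fitted probabilities; the logistic fit and the simulated data below are illustrative only.

```python
# Minimal sketch of the Hosmer-Lemeshow "deciles of risk" statistic.
import numpy as np
import pandas as pd
from scipy.stats import chi2
from sklearn.linear_model import LogisticRegression

def hosmer_lemeshow(y, p_hat, g=10):
    df = pd.DataFrame({"y": y, "p": p_hat})
    df["decile"] = pd.qcut(df["p"], q=g, labels=False, duplicates="drop")
    obs = df.groupby("decile")["y"].sum()         # observed events per group
    exp = df.groupby("decile")["p"].sum()         # expected events per group
    n = df.groupby("decile")["y"].count()
    stat = (((obs - exp) ** 2) / (exp * (1 - exp / n))).sum()
    return stat, chi2.sf(stat, df=len(obs) - 2)   # referred to chi-square on g - 2 df

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + x))))   # data from a true logistic model
p_hat = LogisticRegression().fit(x.reshape(-1, 1), y).predict_proba(x.reshape(-1, 1))[:, 1]
print(hosmer_lemeshow(y, p_hat))
```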
Abstract:
Multi-center clinical trials are very common in the development of new drugs and devices. One concern in such trials is the effect of individual investigational sites enrolling small numbers of patients on the overall result: can the presence of small centers cause an ineffective treatment to appear effective when the treatment-by-center interaction is not statistically significant?
In this research, simulations are used to study the effect that centers enrolling few patients may have on the analysis of clinical trial data. A multi-center clinical trial with 20 sites is simulated to investigate the effect of a new treatment in comparison to a placebo. Twelve of the 20 investigational sites are considered small, each enrolling fewer than four patients per treatment group. Three clinical trials are simulated, with sample sizes of 100, 170 and 300. The simulated data are generated under various scenarios, one in which the treatment should be considered effective and another in which it is not. Qualitative interactions are also produced within the small sites to further investigate the effect of small centers under various conditions.
Standard analysis of variance methods and the "sometimes-pool" testing procedure are applied to the simulated data. One model investigates the treatment and center effects and the treatment-by-center interaction; another investigates the treatment effect alone. These analyses are used to determine the power to detect treatment-by-center interactions and the probability of type I error.
We find that it is difficult to detect treatment-by-center interactions when only a few investigational sites enrolling a limited number of patients participate in the interaction. However, we find no increased risk of type I error in these situations. In a pooled analysis, when the treatment is not effective, the probability of finding a significant treatment effect in the absence of a significant treatment-by-center interaction is well within standard limits of type I error.
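The sketch below simulates a trial of this general shape (20 centers, 12 of them small) under a null treatment effect and fits both an interaction model and a pooled treatment-only model; enrollment numbers, variances, and effect sizes are placeholders rather than the values used in the study.

```python
# Illustrative multi-center trial simulation and the two ANOVA models described above.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
per_arm = [3] * 12 + [10] * 8            # patients per treatment arm at small/large centers

rows = []
for center, n_arm in enumerate(per_arm):
    for trt in (0, 1):
        y = rng.normal(loc=0.0, scale=1.0, size=n_arm)   # null scenario: treatment not effective
        rows += [{"center": center, "trt": trt, "y": v} for v in y]
df = pd.DataFrame(rows)

full = smf.ols("y ~ C(trt) * C(center)", data=df).fit()
print(sm.stats.anova_lm(full, typ=2))    # treatment, center, and interaction tests

pooled = smf.ols("y ~ C(trt)", data=df).fit()
print(sm.stats.anova_lm(pooled, typ=2))  # pooled analysis, treatment effect alone
```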
Abstract:
The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R x C contingency table of discrete data that explains the dependence of the observations. The statistical power of the method will be determined empirically by computer simulation to judge its efficiency relative to presently existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (total sample size over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying inter-population DNA variation within the racial groups; where such variation is significant, the analysis will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.
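The proposed method is not reproduced here; the sketch below only illustrates the general idea of isolating a minimal subset of rows responsible for the dependence in an R x C table, using a greedy search driven by the ordinary chi-square test of independence on a made-up table.

```python
# Rough illustration: greedily remove the rows that contribute most to dependence
# until a chi-square test of independence on the remaining table is non-significant.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[30, 10,  5],
                  [ 8, 40, 12],
                  [10, 12, 11],
                  [ 9, 11, 10]])     # made-up R x C counts

remaining = list(range(table.shape[0]))
removed = []
while len(remaining) > 2:
    stat, p, _, _ = chi2_contingency(table[remaining])
    if p >= 0.05:                    # independence no longer rejected
        break
    # Remove the row whose deletion most reduces the chi-square statistic.
    best = min(remaining,
               key=lambda r: chi2_contingency(table[[i for i in remaining if i != r]])[0])
    remaining.remove(best)
    removed.append(best)

print("rows accounting for the dependence:", removed)
```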
Abstract:
Many statistical studies feature data with both exact-time and interval-censored events. While a number of methods currently exist to handle interval-censored events and multivariate exact-time events separately, few techniques exist to deal with their combination. This thesis develops a theoretical framework for analyzing a multivariate endpoint comprised of a single interval-censored event plus an arbitrary number of exact-time events. The approach fuses the exact-time events, modeled using the marginal method of Wei, Lin, and Weissfeld, with a piecewise-exponential interval-censored component. The resulting model incorporates more of the information in the data and also removes some of the biases associated with the exclusion of interval-censored events. A simulation study demonstrates that our approach produces reliable estimates for the model parameters and their variance-covariance matrix. As a real-world data example, we apply this technique to the Systolic Hypertension in the Elderly Program (SHEP) clinical trial, which features three correlated events: clinical non-fatal myocardial infarction, fatal myocardial infarction (two exact-time events), and silent myocardial infarction (one interval-censored event).
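As a rough sketch of the piecewise-exponential ingredient alone (the coupling with the Wei-Lin-Weissfeld marginal model for the exact-time events is not shown), the code below maximizes the log-likelihood of interval-censored observations under a piecewise-constant hazard; the breakpoints and the (left, right] intervals are made up.

```python
# Sketch: maximum likelihood for interval-censored data under a piecewise-constant hazard.
import numpy as np
from scipy.optimize import minimize

cuts = np.array([0.0, 1.0, 2.0])                 # hazard pieces: [0,1), [1,2), [2,inf)

def cum_hazard(t, log_rates):
    rates = np.exp(log_rates)
    edges = np.append(cuts, np.inf)
    exposure = np.clip(t[:, None] - edges[:-1], 0.0, np.diff(edges))
    return exposure @ rates

def neg_loglik(log_rates, left, right):
    # P(left < T <= right) = S(left) - S(right); right = inf means right-censored.
    s_left = np.exp(-cum_hazard(left, log_rates))
    s_right = np.zeros_like(s_left)
    finite = np.isfinite(right)
    s_right[finite] = np.exp(-cum_hazard(right[finite], log_rates))
    return -np.sum(np.log(np.clip(s_left - s_right, 1e-12, None)))

left = np.array([0.0, 0.5, 1.0, 2.0, 0.0, 3.0, 1.5, 0.0])
right = np.array([1.0, 2.0, np.inf, 4.0, 0.5, np.inf, 3.0, 2.0])

fit = minimize(neg_loglik, x0=np.zeros(len(cuts)), args=(left, right))
print("estimated piecewise hazard rates:", np.exp(fit.x))
```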
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data with a three-category outcome variable. The advantage of this model is that it permits a different number of measurements for each subject and allows the duration between two consecutive measurement times to be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, the model can also estimate the transition probability for each individual subject. The Monte Carlo simulation method will be used to investigate the goodness of fit of the model compared with that of other models. A public health example will be used to demonstrate the application of this method.
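The core computation behind such a model can be sketched as follows: given a generator matrix Q (the values below are assumed for illustration only), the transition-probability matrix over any gap t, regular or not, is P(t) = expm(Qt), which is what allows each subject's measurement spacing to differ; covariate effects would enter by modeling the rates in Q as functions of the independent variables.

```python
# Minimal sketch of three-state continuous-time Markov chain transition probabilities.
import numpy as np
from scipy.linalg import expm

# Generator matrix: rows sum to zero; off-diagonal entries are instantaneous transition rates.
Q = np.array([[-0.30,  0.20,  0.10],
              [ 0.15, -0.40,  0.25],
              [ 0.05,  0.10, -0.15]])

for t in (0.5, 1.0, 2.7):             # irregular gaps between consecutive measurements
    P = expm(Q * t)
    print(f"P({t}):\n{P.round(3)}")   # entry (i, j): P(state j at time t | state i at time 0)
```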