9 resultados para model testing
em DigitalCommons@The Texas Medical Center
Resumo:
ACCURACY OF THE BRCAPRO RISK ASSESSMENT MODEL IN MALES PRESENTING TO MD ANDERSON FOR BRCA TESTING Publication No. _______ Carolyn A. Garby, B.S. Supervisory Professor: Banu Arun, M.D. Hereditary Breast and Ovarian Cancer (HBOC) syndrome is due to mutations in BRCA1 and BRCA2 genes. Women with HBOC have high risks to develop breast and ovarian cancers. Males with HBOC are commonly overlooked because male breast cancer is rare and other male cancer risks such as prostate and pancreatic cancers are relatively low. BRCA genetic testing is indicated for men as it is currently estimated that 4-40% of male breast cancers result from a BRCA1 or BRCA2 mutation (Ottini, 2010) and management recommendations can be made based on genetic test results. Risk assessment models are available to provide the individualized likelihood to have a BRCA mutation. Only one study has been conducted to date to evaluate the accuracy of BRCAPro in males and was based on a cohort of Italian males and utilized an older version of BRCAPro. The objective of this study is to determine if BRCAPro5.1 is a valid risk assessment model for males who present to MD Anderson Cancer Center for BRCA genetic testing. BRCAPro has been previously validated for determining the probability of carrying a BRCA mutation, however has not been further examined particularly in males. The total cohort consisted of 152 males who had undergone BRCA genetic testing. The cohort was stratified by indication for genetic counseling. Indications included having a known familial BRCA mutation, having a personal diagnosis of a BRCA-related cancer, or having a family history suggestive of HBOC. Overall there were 22 (14.47%) BRCA1+ males and 25 (16.45%) BRCA2+ males. Receiver operating characteristic curves were constructed for the cohort overall, for each particular indication, as well as for each cancer subtype. Our findings revealed that the BRCAPro5.1 model had perfect discriminating ability at a threshold of 56.2 for males with breast cancer, however only 2 (4.35%) of 46 were found to have BRCA2 mutations. These results are significantly lower than the high approximation (40%) reported in previous literature. BRCAPro does perform well in certain situations for men. Future investigation of male breast cancer and men at risk for BRCA mutations is necessary to provide a more accurate risk assessment.
Resumo:
In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could hopefully reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any size, but also a variety of covariates. I will then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that there is substantial improvement in the estimation of association between haplotypes and disease, with the use of a Bayesian mathematical method to infer haplotypes that uses prior information from known genetics sources. Analysis based on haplotypes using information from publically available genetic sources generally show increased odds ratios and smaller p-values in both the Hodgkin, Glioma, and Lung data sets. For instance, the Bayesian Joint Logistic Model (BJLM) inferred haplotype TC had a substantially higher estimated effect size (OR=12.16, 95% CI = 2.47-90.1 vs. 9.24, 95% CI = 1.81-47.2) and more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease compared to a traditional logistic regression approach. Also, the effect sizes of haplotypes modeled with recessive genetic effects were higher (and had more significant p-values) when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index compared to those developed with haplo.stats for lung cancer. Future analysis for this work could be to incorporate the 1000 Genomes project, which offers a larger selection of SNPs can be incorporated into the information from known genetic sources as well. Other future analysis include testing non-binary outcomes, like the levels of biomarkers that are present in lung cancer (NNK), and extending this analysis to full GWAS studies.
Resumo:
The current analysis examined the association of several demographic and behavioral variables with prior HIV testing within a population of injection drug users (IDUs) living in Harris County, Texas in 2005 (n=563). After completing the initial univariate analyses of all potential predictors, a multivariable model was created. This model was designed to guide future intervention efforts. Data used in this analysis were collected by the University of Texas School of Public Health in association with the Houston Department of Health and Human Services for the first IDU cycle of the National HIV Behavioral Surveillance System. About 76% of the IDUs reported previously being tested for HIV. Demographic variables that displayed a significant association with prior testing during the univariate analyses include age, race/ethnicity, birth outside the United States, education level, recent arrest, and current health insurance coverage. Several drug-related and sexual behaviors also demonstrated significant associations with prior testing, including age of first injection drug use, heroin use, methamphetamine use, source of needles or syringes, consistent use of new needles, recent visits to a shooting gallery or similar location, previous alcohol or drug treatment, condom use during their most recent sexual encounter, and having sexual partners who also used injection drugs. Additionally, the univariate analyses revealed that recent use of health or HIV prevention services was associated with previously testing for HIV. The final multivariable model included age, race/ethnicity, recent arrest, previous alcohol or drug treatment, and heroin use. ^
Resumo:
Tuberculosis (TB) is an infectious disease of great public health importance, particularly to institutions that provide health care to large numbers of TB patients such as Parkland Hospital in Dallas, TX. The purpose of this retrospective chart review was to analyze differences in TB positive and TB negative patients to better understand whether or not there were variables that could be utilized to develop a predictive model for use in the emergency department to reduce the overall number of suspected TB patients being sent to respiratory isolation for TB testing. This study included patients who presented to the Parkland Hospital emergency department between November 2006 and December 2007 and were isolated and tested for TB. Outcome of TB was defined as a positive sputum AFB test or a positive M. tuberculosis culture result. Data were collected utilizing the UT Southwestern Medical Center computerized database OACIS and included demographic information, TB risk factors, physical symptoms, and clinical results. Only two variables were significantly (P<0.05) related to TB outcome: dyspnea (shortness of breath) (P<0.001) and abnormal x-ray (P<0.001). Marginally significant variables included hemoptysis (P=0.06), weight loss (P=0.11), night sweats (P=0.20), history of homelessness or incarceration (P=0.15), and history of positive skin PPD (P=0.19). Using a combination of significant and marginally significant variables, a predictive model was designed which demonstrated a specificity of 24% and a sensitivity of 70%. In conclusion, a predictive model for TB outcome based on patients who presented to the Parkland Hospital emergency department between November 2006 and December 2007 was unsuccessful given the limited number of variables that differed significantly between TB positive and TB negative patients. It is suggested that a future prospective cohort study should be implemented to collect data on TB positive and TB negative patients. It may be possible that a more thorough prospective collection of data may lead to clearer comparisons between TB positive and TB negative patients and ultimately to the design of a more sensitive predictive model for TB outcome. ^
Resumo:
Background. At present, prostate cancer screening (PCS) guidelines require a discussion of risks, benefits, alternatives, and personal values, making decision aids an important tool to help convey information and to help clarify values. Objective: The overall goal of this study is to provide evidence of the reliability and validity of a PCS anxiety measure and the Decisional Conflict Scale (DCS). Methods. Using data from a randomized, controlled PCS decision aid trial that measured PCS anxiety at baseline and DCS at baseline (T0) and at two-weeks (T2), four psychometric properties were assessed: (1) internal consistency reliability, indicated by factor analysis intraclass correlations and Cronbach's α; (2) construct validity, indicated by patterns of Pearson correlations among subscales; (3) discriminant validity, indicated by the measure's ability to discriminate between undecided men and those with a definite screening intention; and (4) factor validity and invariance using confirmatory factor analyses (CFA). Results. The PCS anxiety measure had adequate internal consistency reliability and good construct and discriminant validity. CFAs indicated that the 3-factor model did not have adequate fit. CFAs for a general PCS anxiety measure and a PSA anxiety measure indicated adequate fit. The general PCS anxiety measure was invariant across clinics. The DCS had adequate internal consistency reliability except for the support subscale and had adequate discriminate validity. Good construct validity was found at the private clinic, but was only found for the feeling informed subscale at the public clinic. The traditional DCS did not have adequate fit at T0 or at T2. The alternative DCS had adequate fit at T0 but was not identified at T2. Factor loadings indicated that two subscales, feeling informed and feeling clear about values, were not distinct factors. Conclusions. Our general PCS anxiety measure can be used in PCS decision aid studies. The alternative DCS may be appropriate for men eligible for PCS. Implications: More emphasis needs to be placed on the development of PCS anxiety items relating to testing procedures. We recommend that the two DCS versions be validated in other samples of men eligible for PCS and in other health care decisions that involve uncertainty. ^
Resumo:
Interaction effect is an important scientific interest for many areas of research. Common approach for investigating the interaction effect of two continuous covariates on a response variable is through a cross-product term in multiple linear regression. In epidemiological studies, the two-way analysis of variance (ANOVA) type of method has also been utilized to examine the interaction effect by replacing the continuous covariates with their discretized levels. However, the implications of model assumptions of either approach have not been examined and the statistical validation has only focused on the general method, not specifically for the interaction effect.^ In this dissertation, we investigated the validity of both approaches based on the mathematical assumptions for non-skewed data. We showed that linear regression may not be an appropriate model when the interaction effect exists because it implies a highly skewed distribution for the response variable. We also showed that the normality and constant variance assumptions required by ANOVA are not satisfied in the model where the continuous covariates are replaced with their discretized levels. Therefore, naïve application of ANOVA method may lead to an incorrect conclusion. ^ Given the problems identified above, we proposed a novel method modifying from the traditional ANOVA approach to rigorously evaluate the interaction effect. The analytical expression of the interaction effect was derived based on the conditional distribution of the response variable given the discretized continuous covariates. A testing procedure that combines the p-values from each level of the discretized covariates was developed to test the overall significance of the interaction effect. According to the simulation study, the proposed method is more powerful then the least squares regression and the ANOVA method in detecting the interaction effect when data comes from a trivariate normal distribution. The proposed method was applied to a dataset from the National Institute of Neurological Disorders and Stroke (NINDS) tissue plasminogen activator (t-PA) stroke trial, and baseline age-by-weight interaction effect was found significant in predicting the change from baseline in NIHSS at Month-3 among patients received t-PA therapy.^
Resumo:
Conventional designs of animal bioassays allocate the same number of animals into control and dose groups to explore the spontaneous and induced tumor incidence rates, respectively. The purpose of such bioassays are (a) to determine whether or not the substance exhibits carcinogenic properties, and (b) if so, to estimate the human response at relatively low doses. In this study, it has been found that the optimal allocation to the experimental groups which, in some sense, minimize the error of the estimated response for low dose extrapolation is associated with the dose level and tumor risk. The number of dose levels has been investigated at the affordable experimental cost. The pattern of the administered dose, 1 MTD, 1/2 MTD, 1/4 MTD,....., etc. plus control, gives the most reasonable arrangement for the low dose extrapolation purpose. The arrangement of five dose groups may make the highest dose trivial. A four-dose design can circumvent this problem and has also one degree of freedom for testing the goodness-of-fit of the response model.^ An example using the data on liver tumors induced in mice in a lifetime study of feeding dieldrin (Walker et al., 1973) is implemented with the methodology. The results are compared with conclusions drawn from other studies. ^
Resumo:
Hierarchical linear growth model (HLGM), as a flexible and powerful analytic method, has played an increased important role in psychology, public health and medical sciences in recent decades. Mostly, researchers who conduct HLGM are interested in the treatment effect on individual trajectories, which can be indicated by the cross-level interaction effects. However, the statistical hypothesis test for the effect of cross-level interaction in HLGM only show us whether there is a significant group difference in the average rate of change, rate of acceleration or higher polynomial effect; it fails to convey information about the magnitude of the difference between the group trajectories at specific time point. Thus, reporting and interpreting effect sizes have been increased emphases in HLGM in recent years, due to the limitations and increased criticisms for statistical hypothesis testing. However, most researchers fail to report these model-implied effect sizes for group trajectories comparison and their corresponding confidence intervals in HLGM analysis, since lack of appropriate and standard functions to estimate effect sizes associated with the model-implied difference between grouping trajectories in HLGM, and also lack of computing packages in the popular statistical software to automatically calculate them. ^ The present project is the first to establish the appropriate computing functions to assess the standard difference between grouping trajectories in HLGM. We proposed the two functions to estimate effect sizes on model-based grouping trajectories difference at specific time, we also suggested the robust effect sizes to reduce the bias of estimated effect sizes. Then, we applied the proposed functions to estimate the population effect sizes (d ) and robust effect sizes (du) on the cross-level interaction in HLGM by using the three simulated datasets, and also we compared the three methods of constructing confidence intervals around d and du recommended the best one for application. At the end, we constructed 95% confidence intervals with the suitable method for the effect sizes what we obtained with the three simulated datasets. ^ The effect sizes between grouping trajectories for the three simulated longitudinal datasets indicated that even though the statistical hypothesis test shows no significant difference between grouping trajectories, effect sizes between these grouping trajectories can still be large at some time points. Therefore, effect sizes between grouping trajectories in HLGM analysis provide us additional and meaningful information to assess group effect on individual trajectories. In addition, we also compared the three methods to construct 95% confident intervals around corresponding effect sizes in this project, which handled with the uncertainty of effect sizes to population parameter. We suggested the noncentral t-distribution based method when the assumptions held, and the bootstrap bias-corrected and accelerated method when the assumptions are not met.^
Resumo:
The problem of analyzing data with updated measurements in the time-dependent proportional hazards model arises frequently in practice. One available option is to reduce the number of intervals (or updated measurements) to be included in the Cox regression model. We empirically investigated the bias of the estimator of the time-dependent covariate while varying the effect of failure rate, sample size, true values of the parameters and the number of intervals. We also evaluated how often a time-dependent covariate needs to be collected and assessed the effect of sample size and failure rate on the power of testing a time-dependent effect.^ A time-dependent proportional hazards model with two binary covariates was considered. The time axis was partitioned into k intervals. The baseline hazard was assumed to be 1 so that the failure times were exponentially distributed in the ith interval. A type II censoring model was adopted to characterize the failure rate. The factors of interest were sample size (500, 1000), type II censoring with failure rates of 0.05, 0.10, and 0.20, and three values for each of the non-time-dependent and time-dependent covariates (1/4,1/2,3/4).^ The mean of the bias of the estimator of the coefficient of the time-dependent covariate decreased as sample size and number of intervals increased whereas the mean of the bias increased as failure rate and true values of the covariates increased. The mean of the bias of the estimator of the coefficient was smallest when all of the updated measurements were used in the model compared with two models that used selected measurements of the time-dependent covariate. For the model that included all the measurements, the coverage rates of the estimator of the coefficient of the time-dependent covariate was in most cases 90% or more except when the failure rate was high (0.20). The power associated with testing a time-dependent effect was highest when all of the measurements of the time-dependent covariate were used. An example from the Systolic Hypertension in the Elderly Program Cooperative Research Group is presented. ^