14 resultados para REGRESSION APPROACH

em DigitalCommons@The Texas Medical Center


Relevância:

60.00% 60.00%

Publicador:

Resumo:

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could hopefully reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any size, but also a variety of covariates. I will then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that there is substantial improvement in the estimation of association between haplotypes and disease, with the use of a Bayesian mathematical method to infer haplotypes that uses prior information from known genetics sources. Analysis based on haplotypes using information from publically available genetic sources generally show increased odds ratios and smaller p-values in both the Hodgkin, Glioma, and Lung data sets. For instance, the Bayesian Joint Logistic Model (BJLM) inferred haplotype TC had a substantially higher estimated effect size (OR=12.16, 95% CI = 2.47-90.1 vs. 9.24, 95% CI = 1.81-47.2) and more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease compared to a traditional logistic regression approach. Also, the effect sizes of haplotypes modeled with recessive genetic effects were higher (and had more significant p-values) when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index compared to those developed with haplo.stats for lung cancer. Future analysis for this work could be to incorporate the 1000 Genomes project, which offers a larger selection of SNPs can be incorporated into the information from known genetic sources as well. Other future analysis include testing non-binary outcomes, like the levels of biomarkers that are present in lung cancer (NNK), and extending this analysis to full GWAS studies.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study investigates the degree to which gender, ethnicity, relationship to perpetrator, and geomapped socio-economic factors significantly predict the incidence of childhood sexual abuse, physical abuse and non- abuse. These variables are then linked to geographic identifiers using geographic information system (GIS) technology to develop a geo-mapping framework for child sexual and physical abuse prevention.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A Bayesian approach to estimation of the regression coefficients of a multinominal logit model with ordinal scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function. The link function is treated as an arbitrary scalar function. Then the Gauss-Markov theorem is used to determine a function of the link which produces a random vector of coefficients. The posterior distribution of the random vector of coefficients is used to estimate the regression coefficients. The method described is referred to as a Bayesian generalized least square (BGLS) analysis. Two cases involving multinominal logit models are described. Case I involves a cumulative logit model and Case II involves a proportional-odds model. All inferences about the coefficients for both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model. The method is not complex or computationally intensive. The BGLS method offers several advantages over Bayesian approaches. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The considerable search for synergistic agents in cancer research is motivated by the therapeutic benefits achieved by combining anti-cancer agents. Synergistic agents make it possible to reduce dosage while maintaining or enhancing a desired effect. Other favorable outcomes of synergistic agents include reduction in toxicity and minimizing or delaying drug resistance. Dose-response assessment and drug-drug interaction analysis play an important part in the drug discovery process, however analysis are often poorly done. This dissertation is an effort to notably improve dose-response assessment and drug-drug interaction analysis. The most commonly used method in published analysis is the Median-Effect Principle/Combination Index method (Chou and Talalay, 1984). The Median-Effect Principle/Combination Index method leads to inefficiency by ignoring important sources of variation inherent in dose-response data and discarding data points that do not fit the Median-Effect Principle. Previous work has shown that the conventional method yields a high rate of false positives (Boik, Boik, Newman, 2008; Hennessey, Rosner, Bast, Chen, 2010) and, in some cases, low power to detect synergy. There is a great need for improving the current methodology. We developed a Bayesian framework for dose-response modeling and drug-drug interaction analysis. First, we developed a hierarchical meta-regression dose-response model that accounts for various sources of variation and uncertainty and allows one to incorporate knowledge from prior studies into the current analysis, thus offering a more efficient and reliable inference. Second, in the case that parametric dose-response models do not fit the data, we developed a practical and flexible nonparametric regression method for meta-analysis of independently repeated dose-response experiments. Third, and lastly, we developed a method, based on Loewe additivity that allows one to quantitatively assess interaction between two agents combined at a fixed dose ratio. The proposed method makes a comprehensive and honest account of uncertainty within drug interaction assessment. Extensive simulation studies show that the novel methodology improves the screening process of effective/synergistic agents and reduces the incidence of type I error. We consider an ovarian cancer cell line study that investigates the combined effect of DNA methylation inhibitors and histone deacetylation inhibitors in human ovarian cancer cell lines. The hypothesis is that the combination of DNA methylation inhibitors and histone deacetylation inhibitors will enhance antiproliferative activity in human ovarian cancer cell lines compared to treatment with each inhibitor alone. By applying the proposed Bayesian methodology, in vitro synergy was declared for DNA methylation inhibitor, 5-AZA-2'-deoxycytidine combined with one histone deacetylation inhibitor, suberoylanilide hydroxamic acid or trichostatin A in the cell lines HEY and SKOV3. This suggests potential new epigenetic therapies in cell growth inhibition of ovarian cancer cells.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A case-series analysis of approximately 811 cancer patients who developed Candidemia between 1989 and 1998 and seen at M. D. Anderson Cancer Center, was studied to assess the impact and timing of central venous catheter (CVC) removal on the outcome of fungal bloodstream infections in cancer patients with primary catheter-related Candidemia as well as secondary infections. ^ This study explored the diagnosis and the management of vascular catheter-associated fungemia in patients with cancer. The microbiologic and clinical factors were determined to predict catheter-related Candidemia. Those factors included, in addition to basic demographics, the underlying malignancy, chemotherapy, neutropenia, and other salient data. Statistical analyses included univariate and multivariate logistic regression to determine the outcome of Candidemia in relation to the timing of catheter removal, type of species, and to identify predictors of catheter-related infections. ^ The conclusions of the study aim at enhancing our mastery of issues involving CVC removal and potentially will have an impact on the management of nosocomial bloodstream infections related to timing of CVC removal and the optimal duration of treatment of catheter-related Candidemia. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Purpose. To examine the association between living in proximity to Toxics Release Inventory (TRI) facilities and the incidence of childhood cancer in the State of Texas. ^ Design. This is a secondary data analysis utilizing the publicly available Toxics release inventory (TRI), maintained by the U.S. Environmental protection agency that lists the facilities that release any of the 650 TRI chemicals. Total childhood cancer cases and childhood cancer rate (age 0-14 years) by county, for the years 1995-2003 were used from the Texas cancer registry, available at the Texas department of State Health Services website. Setting: This study was limited to the children population of the State of Texas. ^ Method. Analysis was done using Stata version 9 and SPSS version 15.0. Satscan was used for geographical spatial clustering of childhood cancer cases based on county centroids using the Poisson clustering algorithm which adjusts for population density. Pictorial maps were created using MapInfo professional version 8.0. ^ Results. One hundred and twenty five counties had no TRI facilities in their region, while 129 facilities had at least one TRI facility. An increasing trend for number of facilities and total disposal was observed except for the highest category based on cancer rate quartiles. Linear regression analysis using log transformation for number of facilities and total disposal in predicting cancer rates was computed, however both these variables were not found to be significant predictors. Seven significant geographical spatial clusters of counties for high childhood cancer rates (p<0.05) were indicated. Binomial logistic regression by categorizing the cancer rate in to two groups (<=150 and >150) indicated an odds ratio of 1.58 (CI 1.127, 2.222) for the natural log of number of facilities. ^ Conclusion. We have used a unique methodology by combining GIS and spatial clustering techniques with existing statistical approaches in examining the association between living in proximity to TRI facilities and the incidence of childhood cancer in the State of Texas. Although a concrete association was not indicated, further studies are required examining specific TRI chemicals. Use of this information can enable the researchers and public to identify potential concerns, gain a better understanding of potential risks, and work with industry and government to reduce toxic chemical use, disposal or other releases and the risks associated with them. TRI data, in conjunction with other information, can be used as a starting point in evaluating exposures and risks. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The need for timely population data for health planning and Indicators of need has Increased the demand for population estimates. The data required to produce estimates is difficult to obtain and the process is time consuming. Estimation methods that require less effort and fewer data are needed. The structure preserving estimator (SPREE) is a promising technique not previously used to estimate county population characteristics. This study first uses traditional regression estimation techniques to produce estimates of county population totals. Then the structure preserving estimator, using the results produced in the first phase as constraints, is evaluated.^ Regression methods are among the most frequently used demographic methods for estimating populations. These methods use symptomatic indicators to predict population change. This research evaluates three regression methods to determine which will produce the best estimates based on the 1970 to 1980 indicators of population change. Strategies for stratifying data to improve the ability of the methods to predict change were tested. Difference-correlation using PMSA strata produced the equation which fit the data the best. Regression diagnostics were used to evaluate the residuals.^ The second phase of this study is to evaluate use of the structure preserving estimator in making estimates of population characteristics. The SPREE estimation approach uses existing data (the association structure) to establish the relationship between the variable of interest and the associated variable(s) at the county level. Marginals at the state level (the allocation structure) supply the current relationship between the variables. The full allocation structure model uses current estimates of county population totals to limit the magnitude of county estimates. The limited full allocation structure model has no constraints on county size. The 1970 county census age - gender population provides the association structure, the allocation structure is the 1980 state age - gender distribution.^ The full allocation model produces good estimates of the 1980 county age - gender populations. An unanticipated finding of this research is that the limited full allocation model produces estimates of county population totals that are superior to those produced by the regression methods. The full allocation model is used to produce estimates of 1986 county population characteristics. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Coronary perfusion with thrombolytic therapy and selective reperfusion by percutaneous transluminal coronary angioplasty (PTCA) were examined in the Corpus Christi Heart Project, a population-based surveillance program for hospitalized acute myocardial infarction (MI) patients in a biethnic community of Mexican-Americans (MAs) and non-Hispanic whites (NHWs). Results were based on 250 (12.4%) patients who received thromobolytic therapy in a cohort of 2011 acute MI cases. Out of these 107 (42.8%) underwent PTCA with a mean follow-up of 25 months. There were 186 (74.4%) men and 64 (25.6%) women; 148 (59.2%) were NHWs, 86 (34.4%) were MAs. Thrombolysis and PTCA were performed less frequently in women than in men, and less frequently in MAs than in NHWs.^ According to the coronary reperfusion interventions used, patients were divided in two groups, those that received no-PTCA (57.2%) and the other that underwent PTCA (42.8%) after thrombolysis. The case-fatality rate was higher in no-PTCA patients than in the PTCA (7.7% versus 5.6%), as was mortality at one year (16.2% versus 10.5%). Reperfusion was successful in 48.0% in the entire cohort and (51.4% versus 45.6%) in the PTCA and no-PTCA groups. Mortality in the successful reperfusion patients was 5.0% compared to 22.3% in the unsuccessful reperfusion group (p = 0.00016, 95% CI: 1.98-11.6).^ Cardiac catheterization was performed in 86.4% thrombolytic patients. Severe stenosis ($>$75%) obstruction was present most commonly in the left descending artery (52.8%) and in the right coronary artery (52.8%). The occurrence of adverse in-hospital clinical events was higher in the no-PTCA as compared to the PTCA and catheterized patients with the exception of reperfusion arrythmias (p = 0.140; Fisher's exact test p = 0.129).^ Cox regression analysis was used to study the relationship between selected variables and mortality. Apart from successful reperfusion, age group (p = 0.028, 95% CI: 2.1-12.42), site of acute MI index (p = 0.050) and ejection-fraction (p = 0.052) were predictors of long-term survival. The ejection-fraction in the PTCA group was higher than (median 78% versus 53%) in the no-PTCA group. Assessed by logistic regression analysis history of high cholesterol ($>$200mg/dl) and diabetes mellites did have significant prognostic value (p = 0.0233; p = 0.0318) in long-term survival irrespective of treatment status.^ In conclusion, the results of this study support the idea that the use of PTCA as a selective intervention following thrombolysis improves survival of patients with acute MI. The use of PTCA in this setting appears to be safe. However, we can not exclude the possibility that some of these results may have occurred due to the exclusion from PTCA of high risk patients (selection bias). ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias.^ Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used predictive distribution as weight to calculate the average of the logistic regressions performing on all possible values of missing observations, and the second method (RS) used a variant of resampling technique. Additional seven methods were compared with these two approaches in a simulation study. They are: (1) Analysis based on only the complete cases, (2) Substituting the mean of the observed values for the missing value, (3) An imputation technique based on the proportions of observed data, (4) Regressing the partially observed covariates on the remaining continuous covariates, (5) Regressing the partially observed covariates on the remaining continuous covariates conditional on response variable, (6) Regressing the partially observed covariates on the remaining continuous covariates and response variable, and (7) EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is an obvious carcinogen for lung cancer. Since CBMN (Cytokinesis-blocked micronucleus) has been found to be extremely sensitive to NNK-induced genetic damage, it is a potential important factor to predict the lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by CBMN assay has not been rigorously examined. ^ This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes were usually not observed very long due to laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell for each individual. A joint likelihood between the resampled Markov chains and the logistic regression model including transition probabilities of this chain as covariates was established. The Maximum likelihood estimation was applied to carry on the statistical test for comparison. The ability of this approach to increase discriminating power to predict lung cancer was compared to a baseline "non-genetic" model. ^ Our method offered an option to understand the association between the dynamic cell information and lung cancer. Our study indicated the extent of DNA damage/non-damage using the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pancreatic cancer is the 4th most common cause for cancer death in the United States, accompanied by less than 5% five-year survival rate based on current treatments, particularly because it is usually detected at a late stage. Identifying a high-risk population to launch an effective preventive strategy and intervention to control this highly lethal disease is desperately needed. The genetic etiology of pancreatic cancer has not been well profiled. We hypothesized that unidentified genetic variants by previous genome-wide association study (GWAS) for pancreatic cancer, due to stringent statistical threshold or missing interaction analysis, may be unveiled using alternative approaches. To achieve this aim, we explored genetic susceptibility to pancreatic cancer in terms of marginal associations of pathway and genes, as well as their interactions with risk factors. We conducted pathway- and gene-based analysis using GWAS data from 3141 pancreatic cancer patients and 3367 controls with European ancestry. Using the gene set ridge regression in association studies (GRASS) method, we analyzed 197 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Using the logistic kernel machine (LKM) test, we analyzed 17906 genes defined by University of California Santa Cruz (UCSC) database. Using the likelihood ratio test (LRT) in a logistic regression model, we analyzed 177 pathways and 17906 genes for interactions with risk factors in 2028 pancreatic cancer patients and 2109 controls with European ancestry. After adjusting for multiple comparisons, six pathways were marginally associated with risk of pancreatic cancer ( P < 0.00025): Fc epsilon RI signaling, maturity onset diabetes of the young, neuroactive ligand-receptor interaction, long-term depression (Ps < 0.0002), and the olfactory transduction and vascular smooth muscle contraction pathways (P = 0.0002; Nine genes were marginally associated with pancreatic cancer risk (P < 2.62 × 10−5), including five reported genes (ABO, HNF1A, CLPTM1L, SHH and MYC), as well as four novel genes (OR13C4, OR 13C3, KCNA6 and HNF4 G); three pathways significantly interacted with risk factors on modifying the risk of pancreatic cancer (P < 2.82 × 10−4): chemokine signaling pathway with obesity ( P < 1.43 × 10−4), calcium signaling pathway (P < 2.27 × 10−4) and MAPK signaling pathway with diabetes (P < 2.77 × 10−4). However, none of the 17906 genes tested for interactions survived the multiple comparisons corrections. In summary, our current GWAS study unveiled unidentified genetic susceptibility to pancreatic cancer using alternative methods. These novel findings provide new perspectives on genetic susceptibility to and molecular mechanisms of pancreatic cancer, once confirmed, will shed promising light on the prevention and treatment of this disease. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is well known that an identification problem exists in the analysis of age-period-cohort data because of the relationship among the three factors (date of birth + age at death = date of death). There are numerous suggestions about how to analyze the data. No one solution has been satisfactory. The purpose of this study is to provide another analytic method by extending the Cox's lifetable regression model with time-dependent covariates. The new approach contains the following features: (1) It is based on the conditional maximum likelihood procedure using a proportional hazard function described by Cox (1972), treating the age factor as the underlying hazard to estimate the parameters for the cohort and period factors. (2) The model is flexible so that both the cohort and period factors can be treated as dummy or continuous variables, and the parameter estimations can be obtained for numerous combinations of variables as in a regression analysis. (3) The model is applicable even when the time period is unequally spaced.^ Two specific models are considered to illustrate the new approach and applied to the U.S. prostate cancer data. We find that there are significant differences between all cohorts and there is a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age indicating that old people have much higher risk than young people. A log transformation of relative risk shows that the prostate cancer risk declined in recent cohorts for both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period factor model (0 0 0 1 1 1 1). These latter results are similar to the previous study by Holford (1983).^ The new approach offers a general method to analyze the age-period-cohort data without using any arbitrary constraint in the model. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit.^ The study found that the $\rm\ C$g* statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a X$\sp2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns.^ The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates.^ Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided. ^