897 results for multivariable regression
Abstract:
This dissertation develops and explores the methodology for the use of cubic spline functions in assessing time-by-covariate interactions in Cox proportional hazards regression models. These interactions indicate violations of the proportional hazards assumption of the Cox model. Use of cubic spline functions allows for the investigation of the shape of a possible covariate time-dependence without having to specify a particular functional form. Cubic spline functions yield both a graphical method and a formal test for the proportional hazards assumption as well as a test of the nonlinearity of the time-by-covariate interaction. Five existing methods for assessing violations of the proportional hazards assumption are reviewed and applied along with cubic splines to three well-known two-sample datasets. An additional dataset with three covariates is used to explore the use of cubic spline functions in a more general setting.
Abstract:
A Bayesian approach to estimation of the regression coefficients of a multinomial logit model with ordinal scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function. The link function is treated as an arbitrary scalar function. Then the Gauss-Markov theorem is used to determine a function of the link which produces a random vector of coefficients. The posterior distribution of the random vector of coefficients is used to estimate the regression coefficients. The method described is referred to as a Bayesian generalized least squares (BGLS) analysis. Two cases involving multinomial logit models are described. Case I involves a cumulative logit model and Case II involves a proportional-odds model. All inferences about the coefficients for both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model. The method is not complex or computationally intensive. The BGLS method offers several advantages over other Bayesian approaches.
Abstract:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias.
Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used the predictive distribution as a weight to calculate the average of the logistic regressions performed over all possible values of the missing observations, and the second method (RS) used a variant of a resampling technique. Seven additional methods were compared with these two approaches in a simulation study: (1) analysis based on only the complete cases, (2) substituting the mean of the observed values for the missing value, (3) an imputation technique based on the proportions of observed data, (4) regressing the partially observed covariates on the remaining continuous covariates, (5) regressing the partially observed covariates on the remaining continuous covariates conditional on the response variable, (6) regressing the partially observed covariates on the remaining continuous covariates and the response variable, and (7) the EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed.
Abstract:
A large number of ridge regression estimators have been proposed and used with little knowledge of their true distributions. Because of this lack of knowledge, these estimators cannot be used to test hypotheses or to form confidence intervals.
This paper presents a basic technique for deriving the exact distribution functions for a class of generalized ridge estimators. The technique is applied to five prominent generalized ridge estimators. Graphs of the resulting distribution functions are presented. The actual behavior of these estimators is found to be considerably different from the behavior generally assumed for ridge estimators.
This paper also uses the derived distributions to examine the mean squared error properties of the estimators. A technique for developing confidence intervals based on the generalized ridge estimators is also presented.
Abstract:
The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke.
Abstract:
Traditional comparison of standardized mortality ratios (SMRs) can be misleading if the age-specific mortality ratios are not homogeneous. For this reason, a regression model has been developed which incorporates the mortality ratio as a function of age. This model is then applied to mortality data from an occupational cohort study. The nature of the occupational data necessitates the investigation of mortality ratios which increase with age. These occupational data are used primarily to illustrate and develop the statistical methodology.
The age-specific mortality ratio (MR) for the covariates of interest can be written as $MR_{ij\ldots m} = \mu_{ij\ldots m}/\theta_{ij\ldots m} = r \cdot \exp(Z'_{ij\ldots m}\beta)$, where $\mu_{ij\ldots m}$ and $\theta_{ij\ldots m}$ denote the force of mortality in the study and chosen standard populations in the $ij\ldots m$th stratum, respectively, $r$ is the intercept, $Z_{ij\ldots m}$ is the vector of covariables associated with the $i$th age interval, and $\beta$ is a vector of regression coefficients associated with these covariables. A Newton-Raphson iterative procedure has been used for determining the maximum likelihood estimates of the regression coefficients.
This model provides a statistical method for a logical and easily interpretable explanation of an occupational cohort mortality experience. Since it gives a reasonable fit to the mortality data, it can also be concluded that the model is fairly realistic. The traditional statistical method for the analysis of occupational cohort mortality data is to present a summary index such as the SMR under the assumption of constant (homogeneous) age-specific mortality ratios. Since the mortality ratios for occupational groups usually increase with age, the homogeneity assumption of the age-specific mortality ratios is often untenable. The traditional method of comparing SMRs under the homogeneity assumption is a special case of this model, without age as a covariate.
This model also provides a statistical technique to evaluate the relative risk between two SMRs or a dose-response relationship among several SMRs. The model presented has application in the medical, demographic and epidemiologic areas. The methods developed in this thesis are suitable for future analyses of mortality or morbidity data when the age-specific mortality/morbidity experience is a function of age or when an interaction effect between confounding variables needs to be evaluated.
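As a minimal illustration of the log-linear form MR = r · exp(Z'β) described in this abstract, the following Python sketch evaluates the mortality ratio across age intervals. The intercept, the coefficient, and the age scores are hypothetical values chosen only for illustration; they are not estimates from the thesis. Setting the age coefficient to zero reproduces the constant-ratio (homogeneous SMR) special case mentioned above.

```python
import math

def mortality_ratio(r, z, beta):
    """Return r * exp(z . beta) for one stratum (log-linear mortality ratio)."""
    if len(z) != len(beta):
        raise ValueError("z and beta must have the same length")
    return r * math.exp(sum(zi * bi for zi, bi in zip(z, beta)))

# Hypothetical values, for illustration only (not taken from the thesis).
r = 1.2                      # intercept: mortality ratio when all covariables are 0
beta_age = [0.15]            # assumed coefficient for a single age score
age_scores = [0, 1, 2, 3]    # assumed coding of four successive age intervals

# Mortality ratio that increases with age.
ratios = [mortality_ratio(r, [a], beta_age) for a in age_scores]

# With a zero age coefficient the ratio is the same in every interval,
# which is the homogeneity assumption behind a single SMR.
constant = [mortality_ratio(r, [a], [0.0]) for a in age_scores]

for a, mr, c in zip(age_scores, ratios, constant):
    print(f"age score {a}: MR = {mr:.3f} (homogeneous case: {c:.3f})")
```

In this form each unit increase in the age score multiplies the mortality ratio by exp(β), which is why a positive age coefficient corresponds to ratios that rise with age.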
Abstract:
One of the difficulties in the practical application of ridge regression is that, for a given data set, it is unknown whether a selected ridge estimator has smaller squared error than the least squares estimator. The concept of the improvement region is defined, and a technique is developed which obtains approximate confidence intervals for the value of ridge k which produces the maximum reduction in mean squared error. Two simulation experiments were conducted to investigate how accurate these approximate confidence intervals might be.
Abstract:
The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is a recognized lung carcinogen. Since the cytokinesis-blocked micronucleus (CBMN) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important factor for predicting lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined.
This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes are usually not observed for long because of laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell for each individual. A joint likelihood between the resampled Markov chains and the logistic regression model, which includes the transition probabilities of this chain as covariates, was established. Maximum likelihood estimation was used to carry out the statistical tests for comparison. The ability of this approach to increase discriminating power to predict lung cancer was compared to a baseline "non-genetic" model.
Our method offers an option for understanding the association between dynamic cell information and lung cancer. Our study indicated that the extent of DNA damage/non-damage measured using the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
Abstract:
Preventable hospitalizations (PHs) are hospitalizations that can be avoided with appropriate and timely care in the ambulatory setting and hence are closely associated with primary care access in a community. Increased primary care availability and health insurance coverage may increase primary care access, and consequently may be significantly associated with risks and costs of PHs.
Objective. To estimate the risk and cost of PHs; to determine the association of primary care availability and health insurance coverage with the risk and costs of PHs, first alone and then simultaneously; and finally, to estimate the impact of expansions in primary care availability and health insurance coverage on the burden of PHs among non-elderly adult residents of Harris County.
Methods. The study population was residents of Harris County, age 18 to 64, who had at least one hospital discharge in a Texas hospital in 2008. The primary independent variables were availability of primary care physicians, availability of primary care safety net clinics and health insurance coverage. The primary dependent variables were PHs and associated hospitalization costs. The Texas Health Care Information Collection (THCIC) Inpatient Discharge data were used to obtain information on the number and costs of PHs in the study population. Risk of PHs in the study population, as well as average and total costs of PHs, were calculated. Multivariable logistic regression models and two-step Heckman regression models with log-transformed costs were used to determine the association of primary care availability and health insurance coverage with the risk and costs of PHs respectively, while controlling for individual predisposing, enabling and need characteristics. Predicted PH risk and cost were used to calculate the predicted burden of PHs in the study population and the impact of expansions in primary care availability and health insurance coverage on the predicted burden.
Results. In 2008, hospitalized non-elderly adults in Harris County had 11,313 PHs and a corresponding PH risk of 8.02%. Congestive heart failure was the most common PH. PHs imposed a total economic burden of about $84 million at an average of $7,449 per PH. Higher primary care safety net availability was significantly associated with a lower risk of PHs in the final risk model, but only in the uninsured. A unit increase in safety net availability was associated with a 23% decline in PH odds in the uninsured, compared to only a 4% decline in the insured. Higher primary care physician availability was associated with increased PH costs in the final cost model (β=0.0020; p<0.05). Lack of health insurance coverage increased the risk of PH, with the uninsured having 30% higher odds of PHs (OR=1.299; p<0.05), but reduced the cost of a PH by 7% (β=-0.0668; p<0.05). Expansions in primary care availability and health insurance coverage were associated with a reduction of about $1.6 million in PH burden at the highest level of expansion.
Conclusions. Availability of primary care resources and health insurance coverage in hospitalized non-elderly adults in Harris County are significantly associated with the risk and costs of PHs. Expansions in these primary care access factors can be expected to produce significant reductions in the burden of PHs in Harris County.
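The headline figures in these results can be cross-checked with simple arithmetic. The Python sketch below is an illustrative check, not part of the original analysis. It multiplies the reported number of PHs by the reported average cost and converts the reported odds ratio and log-cost coefficient into percentage changes. Reading the log-cost coefficient through exp(β) − 1 is an assumption about how the coefficient is interpreted, since the abstract does not state the conversion used.

```python
import math

# Figures reported in the abstract.
ph_count = 11_313          # preventable hospitalizations in 2008
avg_cost = 7_449           # average cost per PH, in dollars
or_uninsured = 1.299       # odds ratio for PH, uninsured vs insured
beta_uninsured = -0.0668   # coefficient for uninsured in the log-cost model

# Total burden implied by count x average cost.
total_burden = ph_count * avg_cost

# Percentage change in odds implied by the odds ratio.
pct_higher_odds = (or_uninsured - 1) * 100

# Two readings of a coefficient on log-transformed cost (assumed interpretation):
pct_cost_approx = beta_uninsured * 100                   # small-coefficient approximation
pct_cost_exact = (math.exp(beta_uninsured) - 1) * 100    # exact back-transformation

print(f"Implied total burden: ${total_burden:,} (about ${total_burden / 1e6:.1f} million)")
print(f"Odds of PH in the uninsured: {pct_higher_odds:.1f}% higher")
print(f"Cost change, approximate reading: {pct_cost_approx:.1f}%")
print(f"Cost change, exact reading: {pct_cost_exact:.1f}%")
```

The product of the count and the average cost is on the order of tens of millions of dollars, and both readings of the cost coefficient give a reduction of roughly 6.5% to 6.7%, in line with the reported 7%.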
Abstract:
Choline and betaine are important methyl donors that contribute to protein and phospholipid synthesis and DNA methylation. They can either be obtained through diet or synthesized de novo. Evidence from human and animal research indicates that choline metabolic pathways may be activated during a variety of diseases, including cancer. Studies have been conducted to investigate the role of dietary intake of choline and betaine on cancers, but results vary among studies by cancer types, and no such study had been conducted for lung cancer. We conducted a case-control study to explore the association between choline and betaine dietary intake and lung cancer. A total of 2807 cases and 2919 controls were included in the study. After adjusting for total calorie intake, age, sex, race and smoking status, multivariable logistic regression analysis revealed a significant negative association between choline/betaine intake and lung cancer. Specifically, we observed that higher choline intake was associated with reduced lung cancer odds, and the association did not differ significantly by smoking status. A similar negative trend was observed in the association between betaine intake and lung cancer after adjusting for total calorie intake, age, sex, smoking status, race, and pack-years of smoking. However, this association was strongly affected by smoking. No significant association was observed between increased betaine intake and lung cancer among never smokers, but higher betaine intake was strongly associated with reduced lung cancer odds among smokers, and lower odds ratios were observed among current smokers than among former smokers. Our results suggest that high intake of choline may be protective against lung cancer independent of smoking status, while high betaine intake may mitigate the adverse effect of smoking on lung cancer, and help prevent lung cancer among smokers.
Abstract:
Purpose of the Study: This study compared the prevalence of periodontal disease between Mexican American elderly and European American elderly residing in three socio-economically distinct neighborhoods in San Antonio, Texas.
Study Group: Subjects for the original protocol were participants of the Oral Health: San Antonio Longitudinal Study of Aging (OH: SALSA), which began with National Institutes of Health (NIH) funding in 1993 (M.J. Saunders, PI). The cohort in the study was the individuals who had been enrolled in Phases I and III of the San Antonio Heart Study (SAHS). This SAHS/SALSA sample is a community-based probability sample of Mexican American and European American residents from three socio-economically distinct San Antonio neighborhoods: low-income barrio, middle-income transitional, and upper-income suburban. The OH: SALSA cohort was established between July 1993 and May 1998 by sampling two subsets of the SAHS cohort. These subsets included the San Antonio Longitudinal Study of Aging (SALSA) cohort, composed of the oldest members of the SAHS (age 65+ yrs. old), and a younger set of controls (age 35-64 yrs. old) sampled from the remainder of the SAHS cohort.
Methods: The study used simple descriptive statistics to describe the sociodemographic characteristics and periodontal disease indicators of the OH: SALSA participants. Means and standard deviations were used to summarize continuous measures. Proportions were used to summarize categorical measures. Simple m × n chi-square statistics were used to compare ethnic differences. A multivariable ordered logit regression was used to estimate the prevalence of periodontal disease and test ethnic group and neighborhood differences in the prevalence of periodontal disease. A multivariable model adjustment for socio-economic status (income and education), gender, and age (treated as confounders) was applied.
Summary: In the unadjusted and adjusted models, Mexican American elderly demonstrated the greatest prevalence of periodontitis, p < 0.05. Mexican American elderly in barrio neighborhoods demonstrated the greatest prevalence of severe periodontitis, with unadjusted prevalence rates of 31.7%, 22.3%, and 22.4% for Mexican American elderly in barrio, transitional, and suburban neighborhoods, respectively. Also, Mexican American elderly had adjusted prevalence rates of 29.4%, 23.7%, and 20.4% for barrio, transitional, and suburban neighborhoods, respectively.
Conclusion: This study indicates that the prevalence of periodontal disease is an important oral health issue among the Mexican American elderly. The results suggest that the socioeconomic status of the residential neighborhood increased the risk for severe periodontal disease among the Mexican American elderly when compared to European American elderly. A viable approach to recognizing oral health disparities in our growing population of Mexican American elderly is imperative for the provision of special care programs that will help increase the quality of care in this minority population.
Abstract:
Background and aim. Hepatitis B virus (HBV) and hepatitis C virus (HCV) co-infection is associated with increased risk of cirrhosis, decompensation, hepatocellular carcinoma, and death. Yet epidemiologic data on co-infection in the United States are sparse. Therefore, the aim of this study was to determine the prevalence and determinants of HBV co-infection in a large United States population of HCV patients.
Methods. The National Veterans Affairs HCV Clinical Case Registry was used to identify patients tested for HCV during 1997–2005. HCV exposure was defined as two positive HCV tests (antibody, RNA or genotype) or one positive test combined with an ICD-9 code for HCV. HCV infection was defined as only a positive HCV RNA or genotype. HBV exposure was defined as a positive test for hepatitis B core antibodies, hepatitis B surface antigen, HBV DNA, hepatitis B e antigen, or hepatitis B e antibody. HBV infection was defined as only a positive test for hepatitis B surface antigen, HBV DNA, or hepatitis B e antigen within one year before or after the HCV index date. The prevalence of exposure to HBV in patients with HCV exposure and the prevalence of HBV infection in patients with HCV infection were determined. Multivariable logistic regression was used to identify demographic and clinical determinants of co-infection.
Results. Among 168,239 patients with HCV exposure, 58,415 patients had HBV exposure for a prevalence of 34.7% (95% CI 34.5–35.0). Among 102,971 patients with HCV infection, 1,431 patients had HBV co-infection for a prevalence of 1.4% (95% CI 1.3–1.5). The independent determinants for an increased risk of HBV co-infection were male sex, positive HIV status, a history of hemophilia, sickle cell anemia or thalassemia, history of blood transfusion, cocaine and other drug use. Age >50 years and Hispanic ethnicity were associated with a decreased risk of HBV co-infection.
Conclusions. This is the largest cohort study in the United States on the prevalence of HBV co-infection. Among veterans with HCV, exposure to HBV is common (∼35%), but HBV co-infection is relatively low (1.4%). There is an increased risk of co-infection with younger age, male sex, HIV, and drug use, with decreased risk in Hispanics.
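The reported prevalences and their confidence intervals can be approximately reproduced from the counts given in the results. The Python sketch below uses a normal-approximation (Wald) interval, which is an assumption: the abstract does not say which interval method was used, so small differences in the last decimal place are possible. The function name `prevalence_ci` is a hypothetical helper introduced for this illustration.

```python
import math

def prevalence_ci(cases, total, z=1.96):
    """Return (prevalence, lower, upper) using a Wald (normal-approximation) interval."""
    p = cases / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Counts reported in the abstract.
exposure = prevalence_ci(58_415, 168_239)     # HBV exposure among HCV-exposed patients
coinfection = prevalence_ci(1_431, 102_971)   # HBV co-infection among HCV-infected patients

for label, (p, lo, hi) in [("HBV exposure", exposure), ("HBV co-infection", coinfection)]:
    print(f"{label}: {p * 100:.2f}% (95% CI {lo * 100:.2f} to {hi * 100:.2f})")
```

With samples this large the intervals are narrow, which is why the reported bounds differ from the point estimates by only a few tenths of a percentage point.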
Abstract:
Hepatitis B virus (HBV) is a significant cause of liver diseases and related complications worldwide. Both injecting and non-injecting drug users are at increased risk of contracting HBV infection. Scientific evidence suggests that drug users have a subnormal response to HBV vaccination and that their seroprotection rates are lower than those in the general population, potentially due to vaccine factors, host factors, or both. The purpose of this systematic review is to examine the rates of seroprotection following HBV vaccination in drug-using populations and to conduct a meta-analysis to identify the factors associated with varying seroprotection rates. Seroprotection is defined as developing an anti-HBs antibody level of ≥ 10 mIU/ml after receiving the HBV vaccine. Original research articles were searched using online databases and reference lists of shortlisted articles. HBV vaccine intervention studies reporting seroprotection rates in drug users and published in English during or after 1989 were eligible. Out of 235 citations reviewed, 11 studies were included in this review. The reported seroprotection rates ranged from 54.5% to 97.1%. Use of a combination vaccine (HAV and HBV) (risk ratio [RR] 12.91, 95% CI 2.98-55.86, p = 0.003), measurement of anti-HBs with microparticle immunoassay (RR 3.46, 95% CI 1.11-10.81, p = 0.035) and anti-HBs antibody measurement at 2 months after the last HBV vaccine dose (RR 4.11, 95% CI 1.55-10.89, p = 0.009) were significantly associated with higher seroprotection rates. Although statistically nonsignificant, the variables mean age >30 years, higher prevalence of anti-HBc antibody and anti-HIV antibody in the sample population, and current drug use (not in drug rehabilitation treatment) were strongly associated with decreased seroprotection rates. Proportion of injecting drug users, vaccine dose and accelerated vaccine schedule were not predictors of heterogeneity across studies. Studies examined in this review were significantly heterogeneous (Q = 180.850, p < 0.001), and the factors identified should be considered when comparing immune response across studies. The combination vaccine showed promising results; however, its effectiveness compared to standard HBV vaccine needs to be examined systematically. Immune response in drug users can possibly be improved by the use of bivalent vaccines, booster doses, and improving vaccine completion rates through integrated public programs and incentives.
Abstract:
Trastuzumab is a humanized monoclonal antibody developed specifically for breast cancer patients whose tumors overexpress HER2/neu. Although highly effective and well tolerated, it has been reported to be associated with congestive heart failure (CHF) in clinical trial settings (up to 27%). The rate of Trastuzumab-related CHF in the general population, especially among older breast cancer patients receiving long-term Trastuzumab treatment, remains unknown. This thesis examined the rates and risk factors associated with Trastuzumab-related CHF in a large population of older breast cancer patients. A retrospective cohort study using the existing Surveillance, Epidemiology and End Results (SEER) and Medicare linked de-identified database was performed. The study cohort comprised breast cancer patients aged ≥ 66 years with stage I-IV disease diagnosed in 1998-2007 who were fully covered by Medicare without HMO enrollment within 1 year before and after the month of first diagnosis and who received their first chemotherapy no earlier than 30 days prior to diagnosis. The primary outcome was a diagnosis of CHF after starting chemotherapy, with no CHF claims on or before the cancer diagnosis date. ICD-9 and HCPCS codes were used to identify claims for Trastuzumab use, chemotherapy, comorbidities and CHF. Statistical analyses included comparison of characteristics, Kaplan-Meier estimates of CHF rates over long-term follow-up, and a multivariable Cox regression model using Trastuzumab as a time-dependent variable. Of the 17,684 patients in the selected cohort, 2,037 (12%) received Trastuzumab. Among them, 35% (714 of 2,037) were diagnosed with CHF, compared to 31% (4,784 of 15,647) of other chemotherapy recipients (p<.0001). After 10 years of follow-up, 65% of Trastuzumab users had developed CHF, compared to 47% of their counterparts. After adjusting for patient demographic, tumor and clinical characteristics, older breast cancer patients who used Trastuzumab showed a significantly higher risk of developing CHF than other chemotherapy recipients (HR 1.69, 95% CI 1.54 - 1.85). This risk increased with age (p-value < .0001). Among Trastuzumab users, the following covariates also significantly increased the risk of CHF: older age, stage IV disease, non-Hispanic black race, unmarried status, comorbidities, anthracycline use, taxane use, and lower educational level. It is concluded that older breast cancer patients who used Trastuzumab had a 69% higher risk of developing CHF than non-users, much higher than the 27% increase reported in younger clinical trial patients. Older age, non-Hispanic black race, unmarried status, comorbidity, and combined use with anthracycline or taxane also significantly increased the risk of CHF development in older patients treated with Trastuzumab.
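The crude comparison of CHF frequency between Trastuzumab users and other chemotherapy recipients can be checked from the counts in the abstract. The Python sketch below computes the two crude proportions and a Pearson chi-square test on the 2 × 2 table. The choice of test (Pearson chi-square without continuity correction) is an assumption, since the abstract reports only the p-value and not the test used. This crude comparison is separate from the adjusted hazard ratio, which comes from the Cox model.

```python
import math

# Counts reported in the abstract.
chf_tras, n_tras = 714, 2_037        # CHF cases and total among Trastuzumab users
chf_other, n_other = 4_784, 15_647   # CHF cases and total among other chemotherapy recipients

# Crude proportions with CHF in each group.
prop_tras = chf_tras / n_tras
prop_other = chf_other / n_other

# 2 x 2 table cells: a, b = Trastuzumab with/without CHF; c, d = others with/without CHF.
a, b = chf_tras, n_tras - chf_tras
c, d = chf_other, n_other - chf_other
n = a + b + c + d

# Pearson chi-square statistic for a 2 x 2 table (no continuity correction; assumed test).
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper-tail p-value for 1 degree of freedom: P(chi2_1 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"CHF among Trastuzumab users: {prop_tras * 100:.1f}%")
print(f"CHF among other recipients:  {prop_other * 100:.1f}%")
print(f"Pearson chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

Under this assumed test, the difference between the two crude proportions is consistent with the reported p < .0001.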