881 results for Regression-based decomposition.
Abstract:
Background. A few studies have reported gender differences along the colorectal cancer (CRC) continuum, but none has done so longitudinally to compare a cancer and a non-cancer population. Objectives and Methods. To examine gender differences in colorectal cancer screening (CRCS); to examine trends in gender differences in CRC screening among two groups of patients (Medicare beneficiaries with and without cancer); to examine gender differences in CRC incidence; and to examine any differences over time. In Paper 1, the study population consisted of men and women, ages 67–89 years, with CRC (73,666) or without any cancer (39,006), residing in 12 U.S. Surveillance, Epidemiology, and End Results (SEER) regions. Crude and age-adjusted percentages and odds ratios of receiving fecal occult blood test (FOBT), sigmoidoscopy (SIG), or colonoscopy (COL) were calculated. Multivariable logistic regression was used to assess the effect of gender on the odds of receiving CRC screening over time. In Paper 2, age-adjusted incidence rates and proportions over time were reported across race, CRC subsite, CRC stage and SEER region for 373,956 patients, ages 40+ years, residing in 9 SEER regions and diagnosed with malignant CRC. Results. Overall, women had higher CRC screening rates than men, and screening rates in general were higher in the SEER sample of persons with a CRC diagnosis. Significant temporal divergence in FOBT screening was observed between men and women in both cohorts. Although the largest temporal increases in screening rates were found for COL, especially among the cohort with CRC, little change in the gender gap was observed over time. Receipt of FOBT was significantly associated with female gender, especially in the period of full Medicare coverage. Receipt of COL was significantly associated with male gender, especially in the period of limited Medicare coverage. Overall, approximately equal numbers of men (187,973) and women (185,983) were diagnosed with malignant CRC. Men had significantly higher age-adjusted CRC incidence rates than women across all categories of age, race, subsite, stage and SEER region, even though rates declined in all categories over time. Significant moderate increases in rate difference occurred among 40- to 59-year-olds; significant reductions occurred among patients age 70+, within subsite rectum, unstaged and distant stage CRC, and eastern and western SEER regions. Conclusions. Persistent gender differences in CRC incidence across time may have implications for gender-based interventions that take age into consideration. A shift toward proximal cancer was observed over time for both genders, but the high proportion of men who develop rectal cancer suggests that a greater proportion of men may need to be targeted with newer screening methods such as fecal DNA or COL. Although previous reports have documented higher CRC screening among men, higher incidence of CRC observed among men suggests that higher risk categories of men are probably not being reached. FOBT utilization rates among women have increased over time and the gender gap has widened between 1998 and 2005. COL utilization is associated with male gender but the differences over time are small.
Abstract:
Objectives. Triple-negative breast cancers (TNBC) lack expression of estrogen receptors (ER) and progesterone receptors (PR) and lack Her2 gene amplification. Current literature has identified TNBC and over-expression of cyclo-oxygenase-2 (COX-2) protein in primary breast cancer as independent markers of poor prognosis in terms of overall and distant disease-free survival. The purpose of this study was to compare COX-2 over-expression in TNBC patients to that in patients who expressed one or more of the three tumor markers (i.e., ER, PR, and/or Her2). Methods. Using a secondary data analysis, a cross-sectional design was implemented to examine the association of interest. Data collected from two ongoing protocols titled "LAB04-0657: a model for COX-2 mediated bone metastasis (Specific aim 3)" and "LAB04-0698: correlation of circulating tumor cells and COX-2 expression in primary breast cancer metastasis" were used for analysis. A sample of 125 female patients was analyzed using Chi-square tests and logistic regression models. Results. COX-2 over-expression was present in 33% (41/125) of patients, and 28% (35/124) of patients were identified as having TNBC. TNBC status was associated with elevated COX-2 expression (OR= 3.34; 95% CI= 1.40–8.22) and high tumor grade (OR= 4.09; 95% CI= 1.58–10.82). In a multivariable analysis, TNBC status was an important predictor of COX-2 expression after adjusting for age, menopausal status, BMI, and lymph node status (OR= 3.31; 95% CI: 1.26–8.67; p=0.01). Conclusion. TNBC is associated with COX-2 expression, a known marker of poor prognosis in patients with operable breast cancer. Replication of these results in a study with a larger sample size, or a future randomized clinical trial demonstrating an improved prognosis with COX-2 suppression in these patients, would support this hypothesis.
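The odds ratios and confidence intervals quoted above come from the study's own data and models. As a minimal sketch of how a crude (unadjusted) odds ratio and a Wald 95% interval are computed from a 2x2 table, assuming made-up counts that are not taken from the study and a hypothetical helper name `odds_ratio_wald`:

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_hat) - z * se_log_or)
    high = math.exp(math.log(or_hat) + z * se_log_or)
    return or_hat, low, high

# Hypothetical counts for illustration only (not the study's data).
or_hat, low, high = odds_ratio_wald(20, 10, 10, 20)
print(f"OR = {or_hat:.2f}, 95% CI = {low:.2f} to {high:.2f}")
```

An adjusted odds ratio, such as the multivariable estimate in the abstract, would instead come from the exponentiated coefficient of a fitted logistic regression model.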
Abstract:
The objective of this dissertation was to determine the initiation and completion rates of adjuvant chemotherapy, its toxicity, and the compliance rates of post-treatment surveillance for elderly patients with colon cancer using the linked Surveillance, Epidemiology, and End Results – Medicare database. The first study assessed the initiation and completion rate of 5-fluorouracil-based adjuvant chemotherapy and its relationship with patient characteristics. Of the 12,265 patients diagnosed with stage III colon adenocarcinoma in 1991-2005, 64.4% received adjuvant chemotherapy within 3 months after tumor resection and 40% of them completed the treatment. Age, marital status, and comorbidity score were significant predictors of chemotherapy initiation and completion. The second study estimated the incidence rate of toxicity-related endpoints among stage III colon adenocarcinoma patients treated with chemotherapy in 1991-2005. Of the 12,099 patients, 63.9% underwent chemotherapy and had volume depletion disorder (3-month cumulative incidence rate [CIR]=9.1%), agranulocytosis (CIR=3.4%), diarrhea (CIR=2.4%), and nausea and vomiting (CIR=2.3%). Cox regression analysis confirmed this association (HR=2.76; 95% CI=2.42-3.15). The risk of ischemic heart disease was slightly associated with chemotherapy (HR=1.08), but significantly so among patients aged <75 with no comorbidity (HR=1.70). The third study determined the adherence rate for follow-up care among patients diagnosed with stage I-III colon adenocarcinoma in 2000 - June 2002. We identified 7,348 patients with a median follow-up of 59 months. The adherence rate was 83.9% for office visits, 29.4% for CEA tests, and 74.3% for colonoscopy. Overall, 25.2% met the recommended post-treatment care. Younger age at diagnosis, white race, being married, advanced stage, fewer comorbidities, and chemotherapy use were significantly associated with guideline adherence. In conclusion, not all colon cancer patients received chemotherapy. Receiving chemotherapy was associated with increased risk of developing gastrointestinal, hematological and cardiac toxicities. Patients were more likely to comply with the schedule for office visits and colonoscopy than with the schedule for CEA tests.
Abstract:
Background. It is important to understand the association between diet and risk of pancreatic cancer in order to better understand the etiology of pancreatic cancer. Objectives. Describe the dietary patterns of cases of adenocarcinoma of the pancreas and non-cancer controls and evaluate the odds of having a healthy eating pattern among cases and non-cancer controls. Design and Methods. An ongoing hospital-based case-control study was conducted in Houston, Texas from 2000-2008 with 678 pancreatic adenocarcinoma cases and 724 controls. Participants completed a food frequency questionnaire and a risk factor questionnaire. Dietary patterns were derived by principal component analysis, and associations between dietary patterns and pancreatic cancer risk were assessed using unconditional logistic regression. Results. Two dietary patterns were derived: fruit-vegetable and high fat-meat. There were no statistically significant associations between the fruit-vegetable pattern and pancreatic cancer. An inverse association was seen between the high fat-meat pattern and pancreatic cancer risk when comparing those in the upper intake quintile to those scoring in the lowest quintile after adjusting for demographic and risk factor variables (OR=0.67, p=0.03). In sex-stratified analysis adjusted for demographic and risk factor variables, females scoring in the upper intake quintile of the fruit-vegetable pattern had a 49% lower risk of pancreatic cancer compared to females scoring in the lowest quintile (OR=0.51, p=0.03). An inverse relationship was also seen for the high fat-meat pattern when comparing females in the upper intake quintile to females in the lowest quintile (OR=0.50, p=0.03). In males, neither dietary pattern was significantly associated with pancreatic cancer. Conclusions. The current findings for the fruit-vegetable pattern are similar to those of previous studies and support the hypothesis that there is an inverse association between a “healthy” diet (comprising fruits, vegetables, and whole grains) and risk of having pancreatic cancer (in females only). However, the inverse relationship between the high fat-meat pattern and risk of pancreatic cancer is contrary to other results. Further research on dietary patterns and pancreatic cancer risk may lead to a better understanding of the etiology of pancreatic cancer.
Abstract:
Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with the higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate that, overall, misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model. Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
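The correlation between a correct and a misspecified functional form can be computed directly when both forms are simple. The following is a minimal sketch under assumed conditions that are not taken from the paper: the exposure is spread evenly over [0, 1], the true relationship is quadratic, and the analyst fits a linear term. The helper name `pearson` is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Assumed setup (illustrative only): exposure on an even grid over [0, 1].
xs = [i / 1000 for i in range(1001)]
true_form = [x ** 2 for x in xs]   # assumed correct specification
linear_form = xs                   # misspecified: linear in x

r_linear = pearson(true_form, linear_form)
print(f"correlation between quadratic and linear forms: {r_linear:.3f}")
```

For a uniform exposure the analytic value is sqrt(15/16), about 0.968, so the grid-based figure should land close to that. A correlation near 1 indicates that the misspecified form tracks the true one closely over the observed range.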
Abstract:
The standard analyses of survival data involve the assumption that survival and censoring are independent. When censoring and survival are related, the phenomenon is known as informative censoring. This paper examines the effects of an informative censoring assumption on the hazard function and the estimated hazard ratio provided by the Cox model. The limiting factor in all analyses of informative censoring is the problem of non-identifiability. Non-identifiability implies that it is impossible to distinguish a situation in which censoring and death are independent from one in which there is dependence. However, it is possible that informative censoring occurs. Examination of the literature indicates how others have approached the problem and covers the relevant theoretical background. Three models are examined in detail. The first model uses conditionally independent marginal hazards to obtain the unconditional survival function and hazards. The second model is based on the Gumbel Type A method for combining independent marginal distributions into bivariate distributions using a dependency parameter. Finally, a formulation based on a compartmental model is presented and its results described. For the latter two approaches, the resulting hazard is used in the Cox model in a simulation study. The unconditional survival distribution formed from the first model involves dependency, but the crude hazard resulting from this unconditional distribution is identical to the marginal hazard, and inferences based on the hazard are valid. The hazard ratios formed from two distributions following the Gumbel Type A model are biased by a factor dependent on the amount of censoring in the two populations and the strength of the dependency of death and censoring in the two populations. The Cox model estimates this biased hazard ratio. In general, the hazard resulting from the compartmental model is not constant, even if the individual marginal hazards are constant, unless censoring is non-informative. The hazard ratio tends to a specific limit. Methods of evaluating situations in which informative censoring is present are described, and the relative utility of the three models examined is discussed.
Abstract:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias. Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates, including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used the predictive distribution as a weight to average the logistic regressions fitted over all possible values of the missing observations, and the second method (RS) used a variant of a resampling technique. Seven additional methods were compared with these two approaches in a simulation study. They were: (1) Analysis based on only the complete cases, (2) Substituting the mean of the observed values for the missing value, (3) An imputation technique based on the proportions of observed data, (4) Regressing the partially observed covariates on the remaining continuous covariates, (5) Regressing the partially observed covariates on the remaining continuous covariates conditional on response variable, (6) Regressing the partially observed covariates on the remaining continuous covariates and response variable, and (7) EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed.
Abstract:
A large number of ridge regression estimators have been proposed and used with little knowledge of their true distributions. Because of this lack of knowledge, these estimators cannot be used to test hypotheses or to form confidence intervals. This paper presents a basic technique for deriving the exact distribution functions for a class of generalized ridge estimators. The technique is applied to five prominent generalized ridge estimators. Graphs of the resulting distribution functions are presented. The actual behavior of these estimators is found to be considerably different from the behavior which is generally assumed for ridge estimators. This paper also uses the derived distributions to examine the mean squared error properties of the estimators. A technique for developing confidence intervals based on the generalized ridge estimators is also presented.
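For orientation, a generalized ridge estimator is commonly written in canonical form as below. The abstract gives no formulas, so the notation is an assumption: $Q$ is the orthogonal matrix of eigenvectors of $X'X$, $Z = XQ$, $\Lambda = \mathrm{diag}(\lambda_i)$ holds the eigenvalues, $\alpha = Q'\beta$, and $K = \mathrm{diag}(k_i)$ with $k_i \ge 0$.

```latex
\hat{\alpha} = \Lambda^{-1} Z'y
\quad \text{(least squares)},
\qquad
\hat{\alpha}(K) = (\Lambda + K)^{-1} Z'y,
\qquad
\hat{\alpha}_i(K) = \frac{\lambda_i}{\lambda_i + k_i}\,\hat{\alpha}_i .
```

With fixed $k_i$ the estimator is a linear function of $y$. When the $k_i$ are themselves estimated from the data, as is typical in practice, the estimator becomes nonlinear in $y$, which is one reason its sampling distribution need not follow the form usually assumed.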
Abstract:
The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke.
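The category probabilities of a polychotomous logistic model can be sketched as follows, assuming the common baseline-category parameterization in which the last category serves as the reference. The function name, the coefficient values and the three outcome labels are illustrative assumptions, not taken from the dissertation.

```python
import math

def polychotomous_probs(x, betas):
    """Category probabilities with the last category as reference.

    x: covariate vector (include 1.0 for an intercept).
    betas: one coefficient vector per non-reference category.
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(s) for s in scores)
    probs = [math.exp(s) / denom for s in scores]
    probs.append(1.0 / denom)  # reference category
    return probs

# Hypothetical three-category outcome: cardiovascular death,
# noncardiovascular death, alive (reference). Coefficients are made up.
x = [1.0, 0.5]                       # intercept and one covariate
betas = [[-2.0, 1.0], [-2.5, 0.4]]
probs = polychotomous_probs(x, betas)
print([round(p, 3) for p in probs])
```

Maximum likelihood fitting by Newton-Raphson, as described in the abstract, would iterate on the coefficient vectors using the score and information matrix of this likelihood; that step is omitted here.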
Abstract:
Purpose: School districts in the U.S. regularly offer foods that compete with the USDA reimbursable meal, known as 'a la carte' foods. These foods must adhere to state nutritional regulations; however, the implementation of these regulations often differs across districts. The purpose of this study is to compare the effects of two methods of offering a la carte foods on students' lunch intake: 1) an extensive a la carte program in which schools have a separate area for a la carte food sales that includes non-reimbursable entrees; and 2) a moderate a la carte program, which offers a la carte foods on the same serving line as reimbursable meals. Methods: Direct observation was used to assess children's lunch consumption in six schools across two districts in Central Texas (n=373 observations). Schools were matched on socioeconomic status. Data collectors were randomly assigned to students, and recorded foods obtained, foods consumed, source of food, gender, grade, and ethnicity. Observations were entered into a nutrient database program, FIAS Millennium Edition, to obtain nutritional information. Differences in energy and nutrient intake across lunch sources and districts were assessed using ANOVA and independent t-tests. A linear regression model was applied to control for potential confounders. Results: Students at schools with extensive a la carte programs consumed significantly more calories, carbohydrates, total fat, saturated fat, calcium, and sodium compared to students in schools with moderate a la carte offerings (p<.05). Students in the extensive a la carte program consumed approximately 94 calories more than students in the moderate a la carte program. There was no significant difference in energy consumption between students who consumed any amount of a la carte foods and students who consumed none. In both districts, students who consumed a la carte offerings were more likely to consume sugar-sweetened beverages, sweets, chips, and pizza compared to students who consumed no a la carte foods. Conclusion: The amount, type and method of offering a la carte foods can significantly affect student dietary intake. This pilot study indicates that when a la carte foods are more available, students consume more calories. Findings underscore the need for further investigation on how availability of a la carte foods affects children's diets. Guidelines for school a la carte offerings should be maximized to encourage the consumption of healthful foods and appropriate energy intake.
Abstract:
Objective: The objective of this study is to investigate the association between processed and unprocessed red meat consumption and prostate cancer (PCa) stage in a homogeneous Mexican-American population. Methods: This population-based case-control study had a total of 582 participants (287 cases with histologically confirmed adenocarcinoma of the prostate gland and 295 age- and ethnicity-matched controls), all residing in the Southeast region of Texas from 1998 to 2006. All questionnaire information was collected using a validated data collection instrument. Statistical Analysis: Descriptive analyses included Student's t-test and Pearson's Chi-square tests. Odds ratios and 95% confidence intervals were calculated to quantify the association between nutritional factors and PCa stage. A multivariable model was used for unconditional logistic regression. Results: After adjusting for relevant covariates, those who consumed high amounts of processed red meat had non-significantly increased odds of being diagnosed with localized PCa (OR = 1.60, 95% CI: 0.85 - 3.03) and total PCa (OR = 1.43, 95% CI: 0.81 - 2.52) but not advanced PCa (OR = 0.91, 95% CI: 1.37 - 2.23). Interestingly, high consumption of carbohydrates showed a significant reduction in the odds of being diagnosed with total PCa and advanced PCa (OR = 0.43, 95% CI: 0.24 - 0.77; OR = 0.27, 95% CI: 0.10 - 0.71, respectively). However, consuming high amounts of energy from protein and fat was shown to increase the odds of being diagnosed with advanced PCa (OR = 4.62, 95% CI: 1.69 - 12.59; OR = 2.61, 95% CI: 1.04 - 6.58, respectively). Conclusion: Mexican-Americans who consumed high amounts of energy from protein and fat had increased odds of being diagnosed with advanced PCa, while high amounts of carbohydrates reduced the odds of being diagnosed with total and advanced PCa.
Abstract:
The pathway-based genome-wide association study evolved from pathway analysis for microarray gene expression and is under rapid development as a complement to the single-SNP-based genome-wide association study. However, it faces new challenges, such as the summarization of SNP statistics to pathway statistics. The current study applied the ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compared this method to two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic of all SNPs in each pathway as the statistic of the pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls and 100 SNPs demonstrated that the KSIR method outperformed the other two methods in terms of causal pathway ranking and statistical power. The PCA method showed performance similar to that of the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, consisting of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provide a reference for further methodology development and identify novel pathways that may be of importance to the development of ulcerative colitis.
Abstract:
It is well known that an identification problem exists in the analysis of age-period-cohort data because of the relationship among the three factors (date of birth + age at death = date of death). There are numerous suggestions about how to analyze the data. No one solution has been satisfactory. The purpose of this study is to provide another analytic method by extending Cox's life-table regression model with time-dependent covariates. The new approach contains the following features: (1) It is based on the conditional maximum likelihood procedure using a proportional hazard function described by Cox (1972), treating the age factor as the underlying hazard to estimate the parameters for the cohort and period factors. (2) The model is flexible so that both the cohort and period factors can be treated as dummy or continuous variables, and the parameter estimates can be obtained for numerous combinations of variables as in a regression analysis. (3) The model is applicable even when the time periods are unequally spaced. Two specific models are considered to illustrate the new approach and are applied to the U.S. prostate cancer data. We find that there are significant differences between all cohorts and there is a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age, indicating that old people have much higher risk than young people. A log transformation of relative risk shows that the prostate cancer risk declined in recent cohorts for both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period factor model (0 0 0 1 1 1 1). These latter results are similar to the previous study by Holford (1983). The new approach offers a general method to analyze the age-period-cohort data without using any arbitrary constraint in the model.
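The identification problem mentioned at the start of this abstract can be made concrete with a small sketch. Because birth year equals period minus age, a design matrix with linear age, period and cohort columns is rank-deficient. The age and period values and the helper `matrix_rank` below are illustrative assumptions. This shows only the problem, not the Cox-model approach proposed in the study.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical ages at death and calendar periods (illustration only).
ages = [40, 50, 60]
periods = [1970, 1980, 1990]
cells = list(product(ages, periods))

# Columns: intercept, age, period, cohort (birth year = period - age).
apc = [[1, age, period, period - age] for age, period in cells]
ap = [[1, age, period] for age, period in cells]

rank_apc = matrix_rank(apc)
rank_ap = matrix_rank(ap)
print(rank_apc, rank_ap)
```

Both ranks come out as 3: adding the cohort column contributes no new information, so the three linear effects cannot be separated without some further structure or constraint.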
Abstract:
Structural decomposition techniques based on input-output tables have become a widely used tool for analyzing long-term economic growth. However, due to limitations of data, such techniques have never been applied to China's regional economies. Fortunately, in 2003, China's Interregional Input-Output Table for 1987 and Multi-regional Input-Output Table for 1997 were published, making decomposition analysis of China's regional economies possible. This paper first estimates the interregional input-output table in constant prices by using an alternative approach, the Grid-Search method, and then applies the standard input-output decomposition technique to China's regional economies for 1987-97. Based on the decomposition results, the contributions to output growth of different factors are summarized at the regional and industrial level. Furthermore, interdependence between China's regional economies is measured and explained by aggregating the decomposition factors into the intraregional multiplier-related effect, the feedback-related effect, and the spillover-related effect. Finally, the performance of China's industrial and regional development policies implemented in the 1990s is briefly discussed based on the analytical results of the paper.
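A textbook two-factor structural decomposition can be sketched as follows. With output x = L f, where L = (I - A)^-1 is the Leontief inverse and f is final demand, the change in output splits exactly into a technology effect and a final-demand effect when the two polar decompositions are averaged. The two-sector coefficients and demands are made-up numbers, the function names are illustrative, and the paper's own decomposition (with multiplier, feedback and spillover effects across regions) is more detailed than this sketch.

```python
def leontief_inverse(a):
    """Return (I - A)^-1 for a 2x2 input coefficient matrix A."""
    m00, m01 = 1 - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], 1 - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]

def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def decompose(a0, f0, a1, f1):
    """Split output growth into technology and final-demand effects."""
    l0, l1 = leontief_inverse(a0), leontief_inverse(a1)
    x0, x1 = matvec(l0, f0), matvec(l1, f1)
    dl = [[l1[i][j] - l0[i][j] for j in range(2)] for i in range(2)]
    df = [f1[i] - f0[i] for i in range(2)]
    # Average of the two polar forms, which makes the split exact.
    tech = [0.5 * (p + q) for p, q in zip(matvec(dl, f0), matvec(dl, f1))]
    demand = [0.5 * (p + q) for p, q in zip(matvec(l0, df), matvec(l1, df))]
    total = [x1[i] - x0[i] for i in range(2)]
    return total, tech, demand

# Hypothetical two-sector economy in two years (illustration only).
a0, f0 = [[0.20, 0.30], [0.10, 0.40]], [100.0, 200.0]
a1, f1 = [[0.25, 0.30], [0.10, 0.35]], [120.0, 210.0]
total, tech, demand = decompose(a0, f0, a1, f1)
print(total, tech, demand)
```

For each sector, the technology and demand effects add up to the total change in output, so the contributions can be reported as shares of growth.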
Abstract:
The increasing importance of vertical specialisation (VS) trade has been a notable feature of rapid economic globalisation and regional integration. In an attempt to understand countries’ depth of participation in global production chains, many Input-Output based VS indicators have been developed. However, most of them focus on showing the overall magnitude of a country’s VS trade, rather than explaining the roles that specific sectors or products play in VS trade and what factors make the VS change over time. Changes in vertical specialisation indicators are, in fact, determined by mixed and complex factors such as import substitution ratios, types of exported goods and domestic production networks. In this paper, decomposition techniques are applied to VS measurement based on the OECD Input-Output database. The decomposition results not only help us understand the structure of VS at detailed sector and product levels, but also show us the contributions of trade dependency, industrial structures of foreign trade and domestic production system to a country’s vertical specialisation trade.