23 results for Regression-based decomposition.
Abstract:
The standard analyses of survival data involve the assumption that survival and censoring are independent. When censoring and survival are related, the phenomenon is known as informative censoring. This paper examines the effects of an informative censoring assumption on the hazard function and the estimated hazard ratio provided by the Cox model.
The limiting factor in all analyses of informative censoring is the problem of non-identifiability. Non-identifiability implies that it is impossible to distinguish, from the observed data alone, a situation in which censoring and death are independent from one in which there is dependence. Informative censoring may nevertheless occur. Examination of the literature indicates how others have approached the problem and covers the relevant theoretical background.
Three models are examined in detail. The first model uses conditionally independent marginal hazards to obtain the unconditional survival function and hazards. The second model is based on the Gumbel Type A method for combining independent marginal distributions into bivariate distributions using a dependency parameter. Finally, a formulation based on a compartmental model is presented and its results described. For the latter two approaches, the resulting hazard is used in the Cox model in a simulation study.
The unconditional survival distribution formed from the first model involves dependency, but the crude hazard resulting from this unconditional distribution is identical to the marginal hazard, and inferences based on the hazard are valid. The hazard ratios formed from two distributions following the Gumbel Type A model are biased by a factor dependent on the amount of censoring in the two populations and the strength of the dependency of death and censoring in the two populations. The Cox model estimates this biased hazard ratio. In general, the hazard resulting from the compartmental model is not constant, even if the individual marginal hazards are constant, unless censoring is non-informative. The hazard ratio tends to a specific limit.
Methods of evaluating situations in which informative censoring is present are described, and the relative utility of the three models examined is discussed.
Abstract:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias.
Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates, including partially observed binary covariates. The data were assumed to be missing at random (MAR). The first method (PD) used the predictive distribution as a weight to average the logistic regressions fitted over all possible values of the missing observations, and the second method (RS) used a variant of a resampling technique. Seven additional methods were compared with these two approaches in a simulation study: (1) analysis based only on the complete cases; (2) substituting the mean of the observed values for the missing value; (3) an imputation technique based on the proportions of observed data; (4) regressing the partially observed covariates on the remaining continuous covariates; (5) regressing the partially observed covariates on the remaining continuous covariates conditional on the response variable; (6) regressing the partially observed covariates on the remaining continuous covariates and the response variable; and (7) the EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus, for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed.
Abstract:
A large number of ridge regression estimators have been proposed and used with little knowledge of their true distributions. Because of this lack of knowledge, these estimators cannot be used to test hypotheses or to form confidence intervals.
This paper presents a basic technique for deriving the exact distribution functions for a class of generalized ridge estimators. The technique is applied to five prominent generalized ridge estimators. Graphs of the resulting distribution functions are presented. The actual behavior of these estimators is found to be considerably different from the behavior that is generally assumed for ridge estimators.
This paper also uses the derived distributions to examine the mean squared error properties of the estimators. A technique for developing confidence intervals based on the generalized ridge estimators is also presented.
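As a rough illustration of the kind of estimator the abstract above discusses, the sketch below applies generalized ridge shrinkage in canonical (orthogonalized) form, where each least-squares coefficient is multiplied by lambda_i / (lambda_i + k_i). The function name, the eigenvalues, and the k values are hypothetical choices for illustration; the abstract does not say which five estimators were studied.

```python
def generalized_ridge(alpha_ols, eigenvalues, ks):
    """Shrink canonical OLS coefficients: alpha_i * lambda_i / (lambda_i + k_i).

    alpha_ols   -- least-squares coefficients in the canonical basis
    eigenvalues -- eigenvalues lambda_i of X'X (assumed positive)
    ks          -- non-negative ridge constants k_i, one per coefficient
    """
    if not (len(alpha_ols) == len(eigenvalues) == len(ks)):
        raise ValueError("all inputs must have the same length")
    return [a * lam / (lam + k) for a, lam, k in zip(alpha_ols, eigenvalues, ks)]


# Hypothetical values, chosen only to show the shrinkage pattern.
alpha_ols = [2.0, -1.5, 0.8]
eigenvalues = [10.0, 1.0, 0.1]
ks = [0.5, 0.5, 0.5]

shrunk = generalized_ridge(alpha_ols, eigenvalues, ks)
for a, s in zip(alpha_ols, shrunk):
    print(f"OLS {a:+.3f} -> ridge {s:+.3f}")
```

Coefficients tied to small eigenvalues are shrunk the most. When the k_i are themselves estimated from the data, the shrinkage factor becomes random, which is one plausible reason normal-theory approximations need not describe such estimators well.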
Abstract:
The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke.
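The Newton-Raphson procedure mentioned in the abstract above can be sketched for the simplest case. This is a minimal illustration that assumes a binary (not polychotomous) response with one covariate and an intercept; the function name and the toy data are hypothetical and are not taken from the dissertation.

```python
import math


def fit_logistic_newton(xs, ys, iterations=25, tol=1e-10):
    """Maximum likelihood fit of P(y=1|x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score vector (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve information * delta = score for the 2x2 system.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical, non-separable toy data.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic_newton(xs, ys)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

The polychotomous model follows the same pattern with one coefficient vector per non-reference category and a correspondingly larger information matrix.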
Abstract:
Purpose: School districts in the U.S. regularly offer foods that compete with the USDA reimbursable meal, known as "a la carte" foods. These foods must adhere to state nutritional regulations; however, the implementation of these regulations often differs across districts. The purpose of this study is to compare the effects of two methods of offering a la carte foods on students' lunch intake: 1) an extensive a la carte program, in which schools have a separate area for a la carte food sales that includes non-reimbursable entrees; and 2) a moderate a la carte program, which offers a la carte foods on the same serving line as reimbursable meals.
Methods: Direct observation was used to assess children's lunch consumption in six schools across two districts in Central Texas (n=373 observations). Schools were matched on socioeconomic status. Data collectors were randomly assigned to students and recorded foods obtained, foods consumed, source of food, gender, grade, and ethnicity. Observations were entered into a nutrient database program, FIAS Millennium Edition, to obtain nutritional information. Differences in energy and nutrient intake across lunch sources and districts were assessed using ANOVA and independent t-tests. A linear regression model was applied to control for potential confounders.
Results: Students at schools with extensive a la carte programs consumed significantly more calories, carbohydrates, total fat, saturated fat, calcium, and sodium than students in schools with moderate a la carte offerings (p<.05). Students in the extensive a la carte program consumed approximately 94 calories more than students in the moderate a la carte program. There was no significant difference in energy consumption between students who consumed any amount of a la carte foods and students who consumed none. In both districts, students who consumed a la carte offerings were more likely to consume sugar-sweetened beverages, sweets, chips, and pizza than students who consumed no a la carte foods.
Conclusion: The amount, type, and method of offering a la carte foods can significantly affect student dietary intake. This pilot study indicates that when a la carte foods are more available, students consume more calories. Findings underscore the need for further investigation of how the availability of a la carte foods affects children's diets. Guidelines for school a la carte offerings should be maximized to encourage the consumption of healthful foods and appropriate energy intake.
Abstract:
Objective: The objective of this study is to investigate the association between processed and unprocessed red meat consumption and prostate cancer (PCa) stage in a homogeneous Mexican-American population.
Methods: This population-based case-control study had a total of 582 participants (287 cases with histologically confirmed adenocarcinoma of the prostate gland and 295 age- and ethnicity-matched controls), all residing in the Southeast region of Texas from 1998 to 2006. All questionnaire information was collected using a validated data collection instrument.
Statistical Analysis: Descriptive analyses included Student's t-test and Pearson's chi-square tests. Odds ratios and 95% confidence intervals were calculated to quantify the association between nutritional factors and PCa stage. A multivariable model was used for unconditional logistic regression.
Results: After adjusting for relevant covariates, those who consumed high amounts of processed red meat had non-significantly increased odds of being diagnosed with localized PCa (OR = 1.60, 95% CI: 0.85 - 3.03) and total PCa (OR = 1.43, 95% CI: 0.81 - 2.52), but not advanced PCa (OR = 0.91, 95% CI: 0.37 - 2.23). Interestingly, high consumption of carbohydrates was associated with a significant reduction in the odds of being diagnosed with total PCa and advanced PCa (OR = 0.43, 95% CI: 0.24 - 0.77; OR = 0.27, 95% CI: 0.10 - 0.71, respectively). However, consuming high amounts of energy from protein and fat was associated with increased odds of being diagnosed with advanced PCa (OR = 4.62, 95% CI: 1.69 - 12.59; OR = 2.61, 95% CI: 1.04 - 6.58, respectively).
Conclusion: Mexican-Americans who consumed high amounts of energy from protein and fat had increased odds of being diagnosed with advanced PCa, while high amounts of carbohydrates were associated with reduced odds of being diagnosed with total and advanced PCa.
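For readers less familiar with how odds ratios and their 95% confidence intervals are obtained, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. It is only an illustration: the counts are hypothetical, the function name is an assumption, and the study above reports adjusted estimates from a multivariable logistic model rather than this crude calculation.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a -- exposed cases      b -- unexposed cases
    c -- exposed controls   d -- unexposed controls
    z -- normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_wald_ci(a=40, b=60, c=30, d=70)
print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f} - {upper:.2f}")
```

Because the interval is built on the log scale, the point estimate always lies inside it and equals the geometric mean of the two limits, which gives a quick consistency check on reported results.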
Abstract:
Pathway-based genome-wide association study evolved from pathway analysis of microarray gene expression data and is under rapid development as a complement to single-SNP-based genome-wide association study. However, it faces new challenges, such as the summarization of SNP statistics into pathway statistics. The current study applies ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compares this method with two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic among all SNPs in each pathway as the statistic of the pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls, and 100 SNPs demonstrated that the KSIR method outperformed the other two methods in terms of causal pathway ranking and statistical power. The PCA method performed similarly to the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, consisting of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provide a reference for further methodology development and identify novel pathways that may be of importance to the development of ulcerative colitis.
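The minP summarization described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the SNP identifiers, p-values, pathway names, and function name are all hypothetical, and the study's actual pipeline is not reproduced here.

```python
def min_p_pathway_stats(snp_pvalues, pathways):
    """Assign each pathway the smallest p-value among its SNPs.

    snp_pvalues -- dict mapping SNP id to its single-SNP association p-value
    pathways    -- dict mapping pathway name to a list of SNP ids
    SNPs without a p-value are skipped; pathways with no usable SNP get None.
    """
    stats = {}
    for name, snps in pathways.items():
        observed = [snp_pvalues[s] for s in snps if s in snp_pvalues]
        stats[name] = min(observed) if observed else None
    return stats


# Hypothetical inputs for illustration only.
snp_pvalues = {"snp1": 0.20, "snp2": 0.003, "snp3": 0.04, "snp4": 0.51}
pathways = {
    "pathway_A": ["snp1", "snp2"],
    "pathway_B": ["snp3", "snp4"],
    "pathway_C": ["snp5"],  # no genotyped SNP in this pathway
}

pathway_stats = min_p_pathway_stats(snp_pvalues, pathways)
print(pathway_stats)
```

A raw minimum tends to favor pathways containing more SNPs, so in practice it is usually calibrated, for example by permutation, before pathways are ranked.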
Abstract:
It is well known that an identification problem exists in the analysis of age-period-cohort data because of the exact linear relationship among the three factors (date of birth + age at death = date of death). Numerous suggestions have been made about how to analyze such data, but no single solution has been satisfactory. The purpose of this study is to provide another analytic method by extending Cox's life-table regression model with time-dependent covariates. The new approach has the following features: (1) It is based on the conditional maximum likelihood procedure using a proportional hazard function described by Cox (1972), treating the age factor as the underlying hazard to estimate the parameters for the cohort and period factors. (2) The model is flexible, so that both the cohort and period factors can be treated as dummy or continuous variables, and parameter estimates can be obtained for numerous combinations of variables, as in a regression analysis. (3) The model is applicable even when the time periods are unequally spaced.
Two specific models are considered to illustrate the new approach and are applied to U.S. prostate cancer data. We find that there are significant differences between all cohorts and that there is a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age, indicating that older people have a much higher risk than younger people. A log transformation of relative risk shows that the prostate cancer risk declined in recent cohorts for both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period factor model (0 0 0 1 1 1 1). These latter results are similar to those of the previous study by Holford (1983).
The new approach offers a general method for analyzing age-period-cohort data without using any arbitrary constraint in the model.
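The identification problem stated at the start of the abstract above can be made concrete with a small numerical check. The sketch assumes purely linear age, period, and cohort effects, and every name and number in it is hypothetical; it illustrates the problem only, not the Cox-based method the abstract proposes.

```python
def linear_apc_predictor(age, period, coef_age, coef_period, coef_cohort):
    """Linear predictor with age, period, and cohort terms.

    The cohort (birth year) is determined by the other two factors:
    cohort = period - age.
    """
    cohort = period - age
    return coef_age * age + coef_period * period + coef_cohort * cohort


# Two different coefficient sets: shift age and cohort up by d, period down by d.
base = (0.08, 0.01, -0.02)
d = 0.3
shifted = (base[0] + d, base[1] - d, base[2] + d)

# Hypothetical grid of ages at death and calendar years of death.
cells = [(age, period) for age in range(50, 85, 5) for period in range(1950, 1985, 5)]

max_gap = max(
    abs(linear_apc_predictor(a, p, *base) - linear_apc_predictor(a, p, *shifted))
    for a, p in cells
)
print(f"largest difference in fitted values: {max_gap:.2e}")
```

Both coefficient sets give the same fitted value in every cell (up to rounding), so the data cannot tell them apart unless some constraint or additional structure is imposed.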