30 results for nonparametric regression
in DigitalCommons@The Texas Medical Center
Abstract:
The considerable search for synergistic agents in cancer research is motivated by the therapeutic benefits achieved by combining anti-cancer agents. Synergistic agents make it possible to reduce dosage while maintaining or enhancing a desired effect. Other favorable outcomes of synergistic agents include reduced toxicity and minimized or delayed drug resistance. Dose-response assessment and drug-drug interaction analysis play an important part in the drug discovery process; however, these analyses are often poorly done. This dissertation is an effort to improve dose-response assessment and drug-drug interaction analysis. The most commonly used method in published analyses is the Median-Effect Principle/Combination Index method (Chou and Talalay, 1984). This method is inefficient because it ignores important sources of variation inherent in dose-response data and discards data points that do not fit the Median-Effect Principle. Previous work has shown that the conventional method yields a high rate of false positives (Boik, Boik, Newman, 2008; Hennessey, Rosner, Bast, Chen, 2010) and, in some cases, low power to detect synergy. There is a great need to improve the current methodology. We developed a Bayesian framework for dose-response modeling and drug-drug interaction analysis. First, we developed a hierarchical meta-regression dose-response model that accounts for various sources of variation and uncertainty and allows one to incorporate knowledge from prior studies into the current analysis, thus offering more efficient and reliable inference. Second, for cases in which parametric dose-response models do not fit the data, we developed a practical and flexible nonparametric regression method for meta-analysis of independently repeated dose-response experiments.
Third, we developed a method based on Loewe additivity that allows one to quantitatively assess the interaction between two agents combined at a fixed dose ratio. The proposed method gives a comprehensive and honest account of uncertainty in drug interaction assessment. Extensive simulation studies show that the novel methodology improves the screening process for effective/synergistic agents and reduces the incidence of type I error. We consider a study that investigates the combined effect of DNA methylation inhibitors and histone deacetylation inhibitors in human ovarian cancer cell lines. The hypothesis is that the combination of DNA methylation inhibitors and histone deacetylation inhibitors will enhance antiproliferative activity in human ovarian cancer cell lines compared to treatment with each inhibitor alone. By applying the proposed Bayesian methodology, in vitro synergy was declared for the DNA methylation inhibitor 5-AZA-2'-deoxycytidine combined with one histone deacetylation inhibitor, either suberoylanilide hydroxamic acid or trichostatin A, in the cell lines HEY and SKOV3. This suggests potential new epigenetic therapies for inhibiting the growth of ovarian cancer cells.
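As a rough illustration of the Loewe additivity idea mentioned in this abstract, the sketch below computes an interaction index for two agents given together. It assumes each single-agent curve follows the median-effect form fa/(1-fa) = (d/dm)^m; the function names and the parameter values are hypothetical, and this point estimate is not the Bayesian procedure developed in the dissertation.

```python
def dose_for_effect(fa, dm, m):
    """Invert the median-effect equation fa/(1-fa) = (d/dm)**m for the dose d."""
    if not 0.0 < fa < 1.0:
        raise ValueError("fraction affected must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def loewe_index(d1, d2, fa, agent1, agent2):
    """Loewe interaction index d1/D1 + d2/D2.

    d1, d2 : doses of each agent in the combination producing effect fa
    D1, D2 : single-agent doses that would produce the same effect
    An index below 1 suggests synergy, 1 additivity, above 1 antagonism.
    """
    big_d1 = dose_for_effect(fa, *agent1)
    big_d2 = dose_for_effect(fa, *agent2)
    return d1 / big_d1 + d2 / big_d2


# Hypothetical single-agent parameters (dm = median-effect dose, m = slope).
agent_a = (2.0, 1.0)
agent_b = (8.0, 1.0)

# Hypothetical combination at a fixed 1:4 dose ratio that reaches 50% effect.
index = loewe_index(0.5, 2.0, 0.5, agent_a, agent_b)
additive = loewe_index(1.0, 4.0, 0.5, agent_a, agent_b)
print(index, additive)
```

With these made-up numbers the first combination needs only a quarter of each single-agent dose, so its index falls below 1; the second sits exactly on the additivity line.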
Abstract:
A non-parametric method was developed and tested to compare the partial areas under two correlated Receiver Operating Characteristic (ROC) curves. Based on the theory of generalized U-statistics, formulas were derived for computing the ROC area and the variance and covariance between the portions of two ROC curves. A practical SAS application was also developed to facilitate the calculations. The accuracy of the non-parametric method was evaluated by comparing it to other methods. When our method was applied to the data from a published ROC analysis of CT images, our results were very close to the published ones. A hypothetical example was used to demonstrate the effects of two crossed ROC curves: the two ROC areas are the same, but each portion of the area between the two ROC curves was found to be significantly different by the partial ROC curve analysis. For computation of ROC curves on a large scale, such as from a logistic regression model, we applied our method to a breast cancer study with Medicare claims data. It yielded the same ROC area as the SAS Logistic procedure. Our method also provides an alternative to the global summary of ROC area comparison by directly comparing the true-positive rates for two regression models and by determining the range of false-positive values where the models differ.
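The U-statistic view of the ROC area can be sketched in a few lines. The example below computes only the full area as the Mann-Whitney proportion of correctly ordered (non-diseased, diseased) pairs, counting ties as one half; the partial-area, variance, and covariance formulas derived in the abstract are not reproduced, and the scores are invented for illustration.

```python
def auc_u_statistic(negatives, positives):
    """Nonparametric ROC area: mean of the kernel over all (negative, positive) pairs.

    The kernel is 1 when the positive case scores higher, 0.5 on a tie, 0 otherwise.
    """
    total = 0.0
    for x in negatives:
        for y in positives:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(negatives) * len(positives))


# Hypothetical test scores for non-diseased and diseased subjects.
healthy = [0.1, 0.3, 0.4, 0.6]
diseased = [0.35, 0.6, 0.7, 0.9]

auc = auc_u_statistic(healthy, diseased)
print(auc)
```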
Abstract:
This study investigates the degree to which gender, ethnicity, relationship to perpetrator, and geomapped socio-economic factors significantly predict the incidence of childhood sexual abuse, physical abuse, and non-abuse. These variables are then linked to geographic identifiers using geographic information system (GIS) technology to develop a geo-mapping framework for child sexual and physical abuse prevention.
Abstract:
BACKGROUND: Obesity is a systemic disorder associated with an increase in left ventricular mass and premature death and disability from cardiovascular disease. Although bariatric surgery reverses many of the hormonal and hemodynamic derangements, the long-term collective effects on body composition and left ventricular mass have not been considered before. We hypothesized that the decrease in fat mass and lean mass after weight loss surgery is associated with a decrease in left ventricular mass. METHODS: Fifteen severely obese women (mean body mass index [BMI]: 46.7 ± 1.7 kg/m²) with medically controlled hypertension underwent bariatric surgery. Left ventricular mass and plasma markers of systemic metabolism, together with BMI, waist and hip circumferences, body composition (fat mass and lean mass), and resting energy expenditure, were measured at 0, 3, 9, 12, and 24 months. RESULTS: Left ventricular mass continued to decrease linearly over the entire period of observation, while rates of weight loss, loss of lean mass, loss of fat mass, and resting energy expenditure all plateaued at 9 months (P < .001 for all). Parameters of systemic metabolism normalized by 9 months and showed no further change at 24 months after surgery. CONCLUSIONS: Even though parameters of obesity, including BMI and body composition, plateau, the benefits of bariatric surgery on systemic metabolism and left ventricular mass are sustained. We propose that the progressive decrease of left ventricular mass after weight loss surgery is regulated by neurohumoral factors and may contribute to improved long-term survival.
Abstract:
Several studies have shown that successful Employee Assistance Programs (EAPs) have strong management endorsement. Strong management endorsement is defined as positive support in utilizing EAP services for themselves and their employees. This study focuses solely on middle management as opposed to upper or general management support. The study further examines success or lack of success of an EAP by the utilization rate, defined as the number of employees who access EAP services over a one-year period. An analytical cross-sectional design was used to compare and observe differences between two groups of middle managers (utilizers and nonutilizers). Middle manager data were collected through a mail questionnaire. The study focused on identifying predictors that influence middle managers' utilization rate, specifically: attitude toward EAPs, EAP knowledge level, attitude toward mental health professionals, age, gender, years worked as a middle manager, education level, training, and other possible predictors of utilization. The overall hypothesis states that middle manager utilizers of EAP services have more positive attitudes and a better understanding of their EAP than middle management nonutilizers. As predicted, nonparametric bivariate results showed significant differences between the two groups. Middle managers in the utilization group (n = 473) tended to show more positive attitudes toward their EAP and mental health professionals and demonstrated greater EAP knowledge compared to the nonutilization group (n = 154). These findings support past studies on variables that influence EAP utilization rates. Further variables found to influence middle management utilization were identified by multivariate logistic regression results.
These variables were gender (female supervisors), educational levels of employees supervised (employees with lower levels of education), number of employees supervised (the greater the number supervised, the more likely to utilize), managerial EAP training (trained supervisors), and awareness that problems do influence an employee's productivity. These findings strengthen the assertion that middle management's attitudes, as well as other variables, may influence utilization. Study findings add new information about important variables specifically influencing middle management who utilize EAPs. An understanding of these variables is essential in developing competent EAP training and orientation programs for middle managers.
Abstract:
The adult male golden hamster, when exposed to blinding (BL), short photoperiod (SP), or daily melatonin injections (MEL), demonstrates dramatic reproductive collapse. This collapse can be blocked by removal of the pineal gland prior to treatment. Reproductive collapse is characterized by a dramatic decrease in both testicular weight and serum gonadotropin titers. The present study was designed to examine the interactions of the hypothalamus and pituitary gland during testicular regression, and to specifically compare and contrast changes caused by the three commonly employed methods of inducing testicular regression (BL, SP, MEL). Hypothalamic LHRH content was altered by all three treatments. There was an initial increase in content of LHRH that occurred concomitantly with the decreased serum gonadotropin titers, followed by a precipitous decline in LHRH content which reflected the rapid increases in both serum LH and FSH which occur during spontaneous testicular recrudescence. In vitro pituitary responsiveness was altered by all three treatments: there was a decline in basal and maximally stimulatable release of both LH and FSH which paralleled the fall of serum gonadotropins. During recrudescence both basal and maximal release dramatically increased in a manner comparable to serum hormone levels. While all three treatments were equally effective in their ability to induce changes at all levels of the endocrine system, there were important temporal differences in the effects of the various treatments. Melatonin injections induced the most rapid changes in endocrine parameters, followed by exposure to short photoperiod. Blinding required the most time to induce the same changes. This study has demonstrated that pineal-mediated testicular regression is a process which involves dynamic changes in multiply-dependent endocrine relationships, and proper evaluation of these changes must be performed with specific temporal events in mind.
Abstract:
Ordinal logistic regression models are used to analyze a dependent variable with multiple outcomes that can be ranked, but they have been underutilized. In this study, we describe four logistic regression models for analyzing an ordinal response variable. In this methodological study, four regression models are proposed. The first model is the multinomial logistic model. The second is the adjacent-category logit model. The third is the proportional odds model, and the fourth is the continuation-ratio model. We illustrate and compare the fit of these models using data from the survey designed by the University of Texas School of Public Health research project PCCaSO (Promoting Colon Cancer Screening in people 50 and Over) to study patients' confidence in the completion of colorectal cancer screening (CRCS). The purpose of this study is twofold: first, to provide a synthesized review of models for analyzing data with an ordinal response, and second, to evaluate their usefulness in epidemiological research, with particular emphasis on model formulation, interpretation of model coefficients, and their implications. The four ordinal logistic models used in this study are (1) the multinomial logistic model, (2) the adjacent-category logistic model [9], (3) the continuation-ratio logistic model [10], and (4) the proportional odds logistic model [11]. We recommend that the analyst perform (1) goodness-of-fit tests and (2) sensitivity analysis by fitting and comparing different models.
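To make one of these formulations concrete, the sketch below evaluates category probabilities under a proportional odds model. It assumes the common parameterization logit P(Y <= j | x) = a_j - beta * x (sign conventions vary between software packages); the cutpoints, coefficient, and covariate value are hypothetical and are unrelated to the PCCaSO data.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(cutpoints, beta, x):
    """Category probabilities when logit P(Y <= j | x) = cutpoints[j] - beta * x.

    cutpoints must be increasing; the result has len(cutpoints) + 1 entries.
    """
    cumulative = [logistic(a - beta * x) for a in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


def cumulative_odds(cutpoints, beta, x, j):
    """Odds of Y <= j at covariate value x."""
    p = logistic(cutpoints[j] - beta * x)
    return p / (1.0 - p)


# Hypothetical thresholds for a three-level response and a single covariate.
cutpoints = [-1.0, 1.0]
beta = 0.5
probs = proportional_odds_probs(cutpoints, beta, x=2.0)
print(probs)
```

The defining feature is that a one-unit change in x multiplies the cumulative odds by the same factor, exp(-beta) under this sign convention, at every cutpoint. The adjacent-category and continuation-ratio models relax or replace that assumption with different logits.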
Abstract:
Ordinal outcomes are frequently employed in diagnosis and clinical trials. Clinical trials of Alzheimer's disease (AD) treatments are a case in point, using the status of mild, moderate, or severe disease as outcome measures. As in many other outcome-oriented studies, the disease status may be misclassified. This study estimates the extent of misclassification in an ordinal outcome such as disease status. This study also estimates the extent of misclassification of a predictor variable such as genotype status. An ordinal logistic regression model is commonly used to model the relationship between disease status, the effect of treatment, and other predictive factors. A simulation study was done. First, data based on a set of hypothetical parameters and hypothetical rates of misclassification were created. Next, the maximum likelihood method was employed to generate likelihood equations accounting for misclassification. The Nelder-Mead simplex method was used to solve for the misclassification and model parameters. Finally, this method was applied to an AD dataset to detect the amount of misclassification present. The estimates of the ordinal regression model parameters were close to the hypothetical parameters: β1 was hypothesized at 0.50 and the mean estimate was 0.488; β2 was hypothesized at 0.04 and the mean of the estimates was 0.04. Although the estimates for the rates of misclassification of X1 were not as close as those for β1 and β2, they validate this method. X1 0-1 misclassification was hypothesized as 2.98% and the mean of the simulated estimates was 1.54%, and, in the best case, the misclassification of k from high to medium was hypothesized at 4.87% and had a sample mean of 3.62%. In the AD dataset, the estimate for the odds ratio of X1 (having both copies of the APOE 4 allele) changed from 1.377 to 1.418, demonstrating that the estimates of the odds ratio changed when the analysis included adjustment for misclassification.
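One standard way a likelihood can account for outcome misclassification is to mix the true category probabilities through a misclassification matrix, so that the observed-category probability is the sum over true categories of P(true) times P(observed | true). The sketch below shows only that step; whether the study used exactly this form is an assumption, and all probabilities and names here are hypothetical.

```python
import math


def observed_probs(true_probs, misclass):
    """Observed-category probabilities given misclass[k][j] = P(observed j | true k)."""
    n_true = len(true_probs)
    n_obs = len(misclass[0])
    return [
        sum(true_probs[k] * misclass[k][j] for k in range(n_true))
        for j in range(n_obs)
    ]


def log_likelihood(counts, true_probs, misclass):
    """Multinomial log-likelihood (up to a constant) of observed category counts."""
    probs = observed_probs(true_probs, misclass)
    return sum(n * math.log(p) for n, p in zip(counts, probs))


# Hypothetical true probabilities for mild / moderate / severe status.
true_probs = [0.5, 0.3, 0.2]

# Hypothetical misclassification rates; each row sums to 1.
misclass = [
    [0.95, 0.05, 0.00],
    [0.03, 0.94, 0.03],
    [0.00, 0.05, 0.95],
]

obs = observed_probs(true_probs, misclass)
print(obs)
```

In a full analysis the true probabilities would come from the ordinal regression model, and a numerical optimizer such as Nelder-Mead would maximize the log-likelihood jointly over the regression and misclassification parameters.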
Abstract:
SNP genotyping arrays have been developed to characterize single-nucleotide polymorphisms (SNPs) and DNA copy number variations (CNVs). The quality of the inferences about copy number can be affected by many factors including batch effects, DNA sample preparation, signal processing, and analytical approach. Nonparametric and model-based statistical algorithms have been developed to detect CNVs from SNP genotyping data. However, these algorithms lack specificity to detect small CNVs due to the high false positive rate when calling CNVs based on the intensity values. Association tests based on detected CNVs therefore lack power even if the CNVs affecting disease risk are common. In this research, by combining an existing Hidden Markov Model (HMM) and the logistic regression model, a new genome-wide logistic regression algorithm was developed to detect CNV associations with diseases. We showed that the new algorithm is more sensitive and can be more powerful in detecting CNV associations with diseases than an existing popular algorithm, especially when the CNV association signal is weak and a limited number of SNPs are located in the CNV.
Abstract:
Introduction. Despite the ban of lead-containing gasoline and paint, childhood lead poisoning remains a public health issue. Furthermore, a Medicaid-eligible child is 8 times more likely to have an elevated blood lead level (EBLL) than a non-Medicaid child, which is the primary reason for the early detection lead screening mandate for ages 12 and 24 months among the Medicaid population. Based on field observations, there was evidence that suggested a screening compliance issue. Objective. The purpose of this study was to analyze blood lead screening compliance in previously lead poisoned Medicaid children and test for an association between timely lead screening and timely childhood immunizations. The mean months between follow-up tests were also examined for a significant difference between the non-compliant and compliant lead screened children. Methods. Access to the surveillance data of all childhood lead poisoned cases in Bexar County was granted by the San Antonio Metropolitan Health District. A database was constructed and analyzed using descriptive statistics, logistic regression methods and non-parametric tests. Lead screening at 12 months of age was analyzed separately from lead screening at 24 months. The small portion of the population who were also related were included in one analysis and removed from a second analysis to check for significance. Gender, ethnicity, age of home, and having a sibling with an EBLL were ruled out as confounders for the association tests but ethnicity and age of home were adjusted in the nonparametric tests. Results. There was a strong significant association between lead screening compliance at 12 months and childhood immunization compliance, with or without including related children (p<0.00). However, there was no significant association between the two variables at the age of 24 months. 
Furthermore, there was no significant difference between the median of the mean months of follow-up blood tests among the non-compliant and compliant lead-screened population in the 12-month screening group, but there was a significant difference in the 24-month screening group (p<0.01). Discussion. Descriptive statistics showed that 61% and 56% of the previously lead-poisoned Medicaid population did not receive their 12- and 24-month mandated lead screening on time, respectively. This suggests that their elevated blood lead level could have been diagnosed earlier in their childhood. Furthermore, a child who is compliant with their lead screening at 12 months of age is 2.36 times more likely to also receive their childhood immunizations on time compared to a child who was not compliant with their 12-month screening. Even though no statistically significant association was found for the 24-month group, the public health significance of a screening compliance issue is no less important. The Texas Medicaid program needs to enforce lead screening compliance because it is evident that there has been no monitoring system in place. Further recommendations include an increased focus on parental education and on the importance of taking children for wellness exams on time.
Abstract:
Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical, and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies demonstrating that overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model. Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification.
Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
Abstract:
The standard analyses of survival data involve the assumption that survival and censoring are independent. When censoring and survival are related, the phenomenon is known as informative censoring. This paper examines the effects of an informative censoring assumption on the hazard function and the estimated hazard ratio provided by the Cox model. The limiting factor in all analyses of informative censoring is the problem of non-identifiability. Non-identifiability implies that it is impossible to distinguish a situation in which censoring and death are independent from one in which there is dependence. However, it is possible that informative censoring occurs. Examination of the literature indicates how others have approached the problem and covers the relevant theoretical background. Three models are examined in detail. The first model uses conditionally independent marginal hazards to obtain the unconditional survival function and hazards. The second model is based on the Gumbel Type A method for combining independent marginal distributions into bivariate distributions using a dependency parameter. Finally, a formulation based on a compartmental model is presented and its results described. For the latter two approaches, the resulting hazard is used in the Cox model in a simulation study. The unconditional survival distribution formed from the first model involves dependency, but the crude hazard resulting from this unconditional distribution is identical to the marginal hazard, and inferences based on the hazard are valid. The hazard ratios formed from two distributions following the Gumbel Type A model are biased by a factor dependent on the amount of censoring in the two populations and the strength of the dependency of death and censoring in the two populations. The Cox model estimates this biased hazard ratio.
In general, the hazard resulting from the compartmental model is not constant, even if the individual marginal hazards are constant, unless censoring is non-informative. The hazard ratio tends to a specific limit. Methods of evaluating situations in which informative censoring is present are described, and the relative utility of the three models examined is discussed.
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C_p and S_p, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP_m) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of C_p and S_p. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on β) unbiased; however, MSEP_m is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP_m and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples.
Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
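For readers unfamiliar with PRESS, the sketch below computes it for a one-predictor least-squares line using the leverage shortcut (each leave-one-out residual equals the ordinary residual divided by one minus its leverage) and checks the result against explicit refitting. The data and function names are hypothetical; the abstract's study concerns multiple regression with stochastic regressors, which this toy case does not reproduce.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, slope


def press(xs, ys):
    """PRESS via the identity e_(i) = e_i / (1 - h_ii)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    intercept, slope = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - xbar) ** 2 / sxx
        residual = y - (intercept + slope * x)
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_brute_force(xs, ys):
    """PRESS by refitting the line with each observation left out in turn."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(rest_x, rest_y)
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total


# Hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]

print(press(xs, ys), press_brute_force(xs, ys))
```

The shortcut needs only one fit of the full sample, which is what makes a leave-one-out assessment practical when the whole sample is also used for model development.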
Abstract:
This dissertation develops and explores the methodology for the use of cubic spline functions in assessing time-by-covariate interactions in Cox proportional hazards regression models. These interactions indicate violations of the proportional hazards assumption of the Cox model. Use of cubic spline functions allows for the investigation of the shape of a possible covariate time-dependence without having to specify a particular functional form. Cubic spline functions yield both a graphical method and a formal test for the proportional hazards assumption as well as a test of the nonlinearity of the time-by-covariate interaction. Five existing methods for assessing violations of the proportional hazards assumption are reviewed and applied along with cubic splines to three well-known two-sample datasets. An additional dataset with three covariates is used to explore the use of cubic spline functions in a more general setting.
Abstract:
A Bayesian approach to estimation of the regression coefficients of a multinomial logit model with ordinal scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function. The link function is treated as an arbitrary scalar function. Then the Gauss-Markov theorem is used to determine a function of the link which produces a random vector of coefficients. The posterior distribution of the random vector of coefficients is used to estimate the regression coefficients. The method described is referred to as a Bayesian generalized least squares (BGLS) analysis. Two cases involving multinomial logit models are described. Case I involves a cumulative logit model and Case II involves a proportional-odds model. All inferences about the coefficients for both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model. The method is not complex or computationally intensive. The BGLS method offers several advantages over Bayesian approaches.