15 results for Partial least square regression

in DigitalCommons@The Texas Medical Center


Relevance: 100.00%

Abstract:

A Bayesian approach to estimating the regression coefficients of a multinomial logit model with ordinal-scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function, which is treated as an arbitrary scalar function. The Gauss-Markov theorem is then used to determine a function of the link that produces a random vector of coefficients, and the posterior distribution of this random vector is used to estimate the regression coefficients. The method is referred to as a Bayesian generalized least squares (BGLS) analysis. Two cases involving multinomial logit models are described: Case I involves a cumulative logit model and Case II a proportional-odds model. All inferences about the coefficients in both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model, and it is neither complex nor computationally intensive. The BGLS method offers several advantages over other Bayesian approaches.

Relevance: 100.00%

Abstract:

The purpose of this research is to examine the relative profitability of the firm within the nursing facility industry in Texas. An examination is made of the variables expected to affect profitability and of importance to the design and implementation of regulatory policy. To facilitate this inquiry, the specific questions addressed are: (1) Do differences in ownership form affect profitability (defined as operating income before fixed costs)? (2) What impact does regional location have on profitability? (3) Do patient case-mix and access to care by Medicaid patients differ between proprietary and non-profit firms and between facilities located in urban versus rural regions, and what association exists between these variables and profitability? (4) Are economies of scale present in the nursing home industry? (5) Do nursing facilities operate in a competitive output market characterized by the inability of a single firm to influence market price?

Prior studies have principally employed a cost function to assess efficiency differences between classifications of nursing facilities. The inherent weakness of this approach is that it considers only technical efficiency, rather than both technical and price efficiency, which are the two components of overall economic efficiency. One firm is more technically efficient than another if it is able to produce a given quantity of output at the least possible cost. Price efficiency means that scarce resources are being directed towards their most valued use. Assuming similar prices in both input and output markets, differences in overall economic efficiency between firm classes are assessed through profitability, hence a profit function.

Using the framework of the profit function, data from the 1990 Medicaid Cost Reports for Texas, and Ordinary Least Squares regression, the study found (1) similar profitability between nursing facilities organized as for-profit versus non-profit and located in urban versus rural regions, (2) an inverse association of both payor-mix and patient case-mix with profitability, (3) strong evidence for the presence of scale economies, and (4) the existence of a competitive market structure. The paper concludes with implications regarding reimbursement methodology and construction moratorium policies in Texas.

Relevance: 100.00%

Abstract:

The desire to promote efficient allocation of health resources and effective patient care has focused attention on home care as an alternative to acute hospital service. In particular, clinical home care is suggested as a substitute for the final days of a hospital stay. This dissertation evaluates the relationship between hospital and home care services for residents of British Columbia, Canada, beginning in 1993/94, using data from the British Columbia Linked Health database.

Lengths of stay for patients referred to home care following hospital discharge are compared to those for patients not referred to home care. Ordinary least squares regression analysis adjusts for age, gender, admission severity, comorbidity, complications, income, and other patient, physician, and hospital characteristics. Home care clients tend to have longer stays in hospital than patients not referred to home care (β = 2.54, p = 0.0001). Longer hospital stays are evident for all home care client groups as well as for both older and younger patients. Sensitivity analyses for referral time to direct care and for extreme lengths of stay are consistent with these findings. Two-stage regression analysis indicates that selection bias is not significant.

Patients referred to clinical home care also have different health service utilization following discharge compared to patients not referred to home care. Home care nursing clients use more medical services to complement home care. Rehabilitation clients initially substitute home care for physiotherapy services but later are more likely to be admitted to residential care. All home care clients are more likely to be readmitted to hospital during the one-year follow-up period. There is also a strong complementary association between direct care referral and homemaker support. Rehabilitation clients have a greater risk of dying during the year following discharge.

These results suggest that home care is currently used as a complement to, rather than a substitute for, some acute health services. Organizational and resource issues may contribute to the longer stays by home care clients. Program planning and policies are required if home care is to provide an effective substitute for acute hospital days.

Relevance: 100.00%

Abstract:

Interaction effects are of scientific interest in many areas of research. A common approach to investigating the interaction effect of two continuous covariates on a response variable is to include a cross-product term in a multiple linear regression. In epidemiological studies, a two-way analysis of variance (ANOVA) type of method has also been used to examine the interaction effect by replacing the continuous covariates with their discretized levels. However, the implications of the model assumptions of either approach have not been examined, and statistical validation has focused only on the general method, not specifically on the interaction effect.

In this dissertation, we investigated the validity of both approaches based on the mathematical assumptions for non-skewed data. We showed that linear regression may not be an appropriate model when the interaction effect exists because it implies a highly skewed distribution for the response variable. We also showed that the normality and constant variance assumptions required by ANOVA are not satisfied in the model where the continuous covariates are replaced with their discretized levels. Therefore, naïve application of the ANOVA method may lead to an incorrect conclusion.

Given the problems identified above, we proposed a novel method, modified from the traditional ANOVA approach, to rigorously evaluate the interaction effect. The analytical expression of the interaction effect was derived based on the conditional distribution of the response variable given the discretized continuous covariates. A testing procedure that combines the p-values from each level of the discretized covariates was developed to test the overall significance of the interaction effect. According to the simulation study, the proposed method is more powerful than least squares regression and the ANOVA method in detecting the interaction effect when the data come from a trivariate normal distribution. The proposed method was applied to a dataset from the National Institute of Neurological Disorders and Stroke (NINDS) tissue plasminogen activator (t-PA) stroke trial, and the baseline age-by-weight interaction effect was found to be significant in predicting the change from baseline in NIHSS at Month 3 among patients who received t-PA therapy.

Relevance: 100.00%

Abstract:

The purpose of this study was to examine, in the context of an economic model of health production, the relationship between inputs (health-influencing activities) and fitness.

Primary data were collected from 204 employees of a large insurance company at the time of their enrollment in an industrially-based health promotion program. The inputs of production included medical care use, exercise, smoking, drinking, eating, coronary disease history, and obesity. The variables of age, gender, and education, which are known to affect the production process, were also examined. Two estimates of fitness were used: self-report and a physiologic estimate based on exercise treadmill performance. Ordinary least squares and two-stage least squares regression analyses were used to estimate the fitness production functions.

In the production of self-reported fitness status, the coefficients for the exercise, smoking, eating, and drinking production inputs and for the control variable of gender were statistically significant and possessed theoretically correct signs. In the production of physiologic fitness, exercise, smoking, and gender were statistically significant. Exercise and gender were theoretically consistent while smoking was not. Results are compared with previous analyses of health production.

Relevance: 30.00%

Abstract:

A non-parametric method was developed and tested to compare the partial areas under two correlated Receiver Operating Characteristic (ROC) curves. Based on the theory of generalized U-statistics, mathematical formulas were derived for computing the ROC area and the variance and covariance between the portions of two ROC curves. A practical SAS application was also developed to facilitate the calculations. The accuracy of the non-parametric method was evaluated by comparing it to other methods. When our method was applied to the data from a published ROC analysis of CT images, the results were very close to the published ones. A hypothetical example was used to demonstrate the effects of two crossed ROC curves: the two ROC areas are the same, yet each portion of the area between the two ROC curves was found to be significantly different by the partial ROC curve analysis. For computation of ROC curves with large scales, such as from a logistic regression model, we applied our method to a breast cancer study with Medicare claims data. It yielded the same ROC area computation as the SAS Logistic procedure. Our method also provides an alternative to the global summary of ROC area comparison by directly comparing the true-positive rates for two regression models and by determining the range of false-positive values where the models differ.

Relevance: 30.00%

Abstract:

Mycobacterium avium complex (MAC) is a ubiquitous organism responsible for most pulmonary and disseminated disease caused by nontuberculous mycobacteria (NTM). Though MAC lung disease without predisposing factors is uncommon, in recent years it has been increasingly described in middle-aged and elderly women. Recognition and correct diagnosis are often delayed due to the indolent nature of the disease. It is unclear whether these women have significant clinical disease or whether their airways are simply colonized by the bacterium. This study describes the clinical presentation, identifies risk factors, and describes the clinical significance of MAC lung disease in HIV-negative women aged 50 or older.

A hybrid study design utilizing both cross-sectional and case-control methodologies was used. A comparison population was selected from previously identified tuberculosis suspects found throughout Harris County. The study population had at least one acid-fast bacillus (AFB) pulmonary culture performed between 1/1/1998 and 12/31/2000. Clinical presentation and symptoms were analyzed using a cross-sectional design. Past medical history and other risk factors were evaluated using a traditional case-control study design. Differences in categorical variables were assessed with the chi-square or Fisher's exact test as appropriate. Odds ratios and 95% confidence intervals were used to evaluate associations. Multivariate logistic regression was used to identify predictive factors for MAC. All statistical tests were two-sided and P-values <0.05 were considered statistically significant.

Culture-confirmed MAC pulmonary cases were more likely to be white and to have bronchiectasis, scoliosis, evidence of cavitation and pleural changes on chest radiography, and granulomas on histopathologic examination than women whose pulmonary cultures were AFB negative. After controlling for selected risk factors, white race continued to be significantly associated with MAC lung disease (OR = 4.6, 95% CI = 2.3, 9.2). In addition, asthma history, smoking history, and alcohol use were less likely to be evident among MAC cases in a multivariate analysis. Right upper and right middle lobe disease was further noted among clinically significant cases. Based on population data, MAC lung disease appears to represent a significant clinical syndrome in HIV-negative women, thus supporting the theory of the Lady Windermere Syndrome.

Relevance: 30.00%

Abstract:

Human Papillomavirus (HPV) is the most common sexually transmitted disease in the United States. Although HPV prevalence is high in the United States, only a limited number of research studies focus on Hispanics, who have higher incidence rates of cervical cancer than their non-Hispanic counterparts. The HPV vaccine introduced in 2006 may offer a feasible solution to the issues surrounding the high prevalence of HPV. Due to the high prevalence of HPV infection among adolescents and young adults, it has been suggested that HPV vaccination begin prior to the onset of sexual activity and focus on non-sexually active adolescents and pre-adolescents. Consequently, it has become increasingly important to assess knowledge and awareness of HPV in order to develop effective intervention strategies. This pilot study evaluated the knowledge and health beliefs of Hispanic parents regarding HPV and the HPV vaccine using a newly developed questionnaire based on the constructs of the Health Belief Model. The sample was recruited from an ob-gyn office in El Paso, Texas. Descriptive data show that the majority of the sample was female (94.1%), Hispanic (76.5%), Catholic (64.7%), and had at least a high school education (55.9%). Chi-square analysis revealed that the following variables differed between parents who intended to vaccinate their child against HPV and those who did not: religion (p=0.038), the perceived severity item "HPV infections are easily treated" (p=0.052), the perceived benefits item "It is better to vaccinate a child against an STI before they become sexually active" (p=0.014), and the perceived barriers item "The HPV vaccine may have serious side effects that could harm my child" (p=0.004). Univariate logistic regression indicated that religion (OR = 4.8, CI: 1.04, 21.8) and "The HPV vaccine may have serious side effects that could harm my child" (OR = 15.9, CI: 1.73, 145.8) were significant predictors of parental intention to vaccinate. Multivariate logistic regression, using backwards elimination, indicated that religion (OR = 7.7, CI: 1.25, 47.8) and "The HPV vaccine may have serious side effects that could harm my child" (OR = 7.6, CI: 1.15, 50.2) were the best predictive variables for parental intention to vaccinate.
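
The wide confidence intervals reported above are typical of odds ratios estimated from small samples. As a rough illustration only, the sketch below computes an unadjusted odds ratio and a Wald-type 95% interval from a 2 x 2 table. The counts are hypothetical and are not taken from the study, and the function name `odds_ratio_ci` is an assumption introduced here; the study itself used logistic regression rather than this table-based shortcut.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2 x 2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    All four cells must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the usual large-sample (Woolf) formula
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for a sample of 34 respondents (illustrative only)
or_hat, ci_low, ci_high = odds_ratio_ci(12, 5, 6, 11)
print(f"OR = {or_hat:.2f}, 95% CI: {ci_low:.2f}, {ci_high:.2f}")
```

With cell counts this small, the standard error of the log odds ratio is large, so the interval spans more than an order of magnitude even when the point estimate is well above 1.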

Relevance: 30.00%

Abstract:

Acute kidney injury (AKI) in hospitalized pediatric patients can be a significant event that results in increased patient morbidity and mortality. The incidence of medication-associated AKI is increasing in the pediatric population. Currently, there are no data to quantify the risks of developing AKI for various potentially nephrotoxic medications. The primary objective of this study was to determine the odds of nephrotoxic medication exposure in hospitalized pediatric patients with AKI as defined by the pediatric-modified RIFLE (pRIFLE) criteria. A retrospective case-control study was performed with patients who developed AKI, as defined by the pRIFLE criteria, as cases, and patients without AKI as controls, matched by age category, gender, and disease state. Patients between 1 day and 18 years of age who were admitted to a non-intensive care unit at Texas Children's Hospital for at least 3 days and had at least 2 serum creatinine values drawn were included. Patient data were analyzed with Student's t test, the Mann-Whitney U test, chi-square analysis, ANOVA, and conditional logistic regression.

Out of 1,660 patients identified for inclusion, 561 (33.8%) had AKI, and 357 cases were matched with 357 controls to form pairs. Of the cases, 441 were category 'R', 117 were category 'I', and 3 were category 'F'; no patient died. Cases with AKI were significantly younger than controls (p < 0.05). Patients with AKI had significantly longer hospital lengths of stay, increased hospital costs, and exposure to more nephrotoxic medications for a longer period of time than patients without AKI. Patients with AKI had greater odds of exposure to one or more nephrotoxic medications than patients without AKI (OR 1.3, 95% CI 1.1–1.4, p < 0.05). Percent changes in estimated creatinine clearance (eCCl) from baseline were greatest with an increased number of nephrotoxic medication exposures.

Exposure to potentially nephrotoxic medications may place pediatric patients at greater risk of acute kidney injury. Multiple nephrotoxic medication exposures may confer a greater risk of developing acute kidney injury and result in increased hospital costs and patient morbidity. Due to the high percentage of patients who were exposed to potentially nephrotoxic medications, monitoring and medication selection strategies may need to be altered to prevent or minimize risk.

Relevance: 30.00%

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance include parametric ($\mathrm{MSEP}_m$) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.

Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP}_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP}_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development and that a leave-one-out statistic (e.g. PRESS) be used for assessment.
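
The leave-one-out statistic recommended above can be computed without refitting the model n times, because each deleted residual equals the ordinary residual divided by one minus that observation's leverage. The following is a minimal sketch for the single-predictor case only; the function name `press_statistic` and the data are illustrative assumptions, not material from the dissertation.

```python
def press_statistic(x, y):
    """PRESS for a simple linear regression y = b0 + b1 * x.

    Uses the identity e_(i) = e_i / (1 - h_i), where h_i is the leverage
    of observation i, so no model is refitted.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    press = 0.0
    for xi, yi in zip(x, y):
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        residual = yi - (b0 + b1 * xi)
        press += (residual / (1 - leverage)) ** 2
    return press

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2]
print(press_statistic(x, y))
```

Dividing PRESS by n gives a leave-one-out estimate of mean squared prediction error that uses the whole sample for fitting, which is the property the abstract recommends.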

Relevance: 30.00%

Abstract:

One of the difficulties in the practical application of ridge regression is that, for a given data set, it is unknown whether a selected ridge estimator has smaller squared error than the least squares estimator. The concept of the improvement region is defined, and a technique is developed which obtains approximate confidence intervals for the value of ridge k which produces the maximum reduction in mean squared error. Two simulation experiments were conducted to investigate how accurate these approximate confidence intervals might be.
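
For readers unfamiliar with the ridge parameter k, the sketch below shows how the ridge estimator shrinks the coefficients relative to least squares by solving (X'X + kI)b = X'y for two centered, nearly collinear predictors. It does not implement the improvement region or the confidence intervals described in the abstract; the function name `ridge_two_predictors`, the data, and the value k = 0.5 are illustrative assumptions.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate for two centered predictors and a centered response.

    Solves (X'X + k*I) b = X'y with Cramer's rule; k = 0 gives least squares.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical, nearly collinear centered data (illustrative only)
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.2, 1.9, 4.0]

b_ols = ridge_two_predictors(x1, x2, y, k=0.0)
b_ridge = ridge_two_predictors(x1, x2, y, k=0.5)
print("least squares:", b_ols)
print("ridge, k=0.5: ", b_ridge)
```

The shrinkage trades bias for variance; whether a particular k actually reduces mean squared error for a given data set is the question the abstract addresses.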

Relevance: 30.00%

Abstract:

The study objectives were to (i) describe the frequency and priority of family meals, (ii) compare the family mealtime environment by gender and SES, (iii) examine the association between family meals and weight status among adolescents living in New Delhi, India, and (iv) examine the association between family meals and eating patterns (healthy/unhealthy) among adolescent boys and girls living in New Delhi, India. Survey and anthropometric data were collected from 8th and 10th grade students (n=1818) from four Government (public) schools and four private schools who participated in the HRIDAY study. Chi-square tests were used to evaluate whether the distributions of outcomes and exposure varied by gender and SES groups. Logistic regression models were used to estimate the association of weight status (underweight/normal weight vs. overweight/obese) with frequency of family meals as the main exposure. Overall, the prevalence of obesity was higher in the mid-to-high SES group and in boys. Over half of the participants had 7 or more family meals in the past week. No statistically significant association was seen between family meals and weight status. The majority of the participants believed that eating healthy food and maintaining a healthy weight were important and that eating at least one family meal was important. The majority of the participants who ate 3 or more family meals ate healthy food and also ate fast food. Intervention strategies should focus on the high-risk group. Private schools are appropriate settings for interventions. Eating with families should be encouraged, and future research should examine family meal patterns.

Relevance: 30.00%

Abstract:

The infant mortality rate (IMR) is considered to be one of the most important indices of a country's well-being. Countries around the world and health organizations such as the World Health Organization are dedicating their resources, knowledge, and energy to reducing infant mortality rates. The well-known Millennium Development Goal 4 (MDG 4), whose aim is to achieve a two-thirds reduction of the under-five mortality rate between 1990 and 2015, is an example of this commitment.

In this study our goal is to model the trends of IMR from the 1950s to the 2010s for selected countries. We would like to know how the IMR is changing over time and how it differs across countries.

IMR data collected over time form a time series. The repeated observations in an IMR time series are not statistically independent, so in modeling the trend of IMR it is necessary to account for these correlations. We propose to use the generalized least squares method in a general linear model setting to deal with the variance-covariance structure in our model. To estimate the variance-covariance matrix, we refer to time-series models, especially the autoregressive and moving average models. Furthermore, we compare results from the general linear model with a correlation structure to those from the ordinary least squares method, which does not take the correlation structure into account, to check how much the estimates change.
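
As a hedged illustration of how generalized least squares can account for serial correlation, the sketch below fits a linear trend under first-order autoregressive, AR(1), errors using the Prais-Winsten transformation followed by ordinary least squares on the transformed data. It assumes the autocorrelation `rho` is known, whereas in practice it would be estimated; the function name `gls_ar1_trend` and the series are hypothetical and are not the study's data or its exact model.

```python
import math

def gls_ar1_trend(y, rho):
    """GLS fit of y_t = b0 + b1 * t + e_t with AR(1) errors, rho assumed known.

    Applies the Prais-Winsten transformation, then solves the 2 x 2
    normal equations of OLS on the transformed columns.
    """
    n = len(y)
    w = math.sqrt(1 - rho ** 2)
    # Transformed intercept column, time column, and response
    c0 = [w] + [1 - rho] * (n - 1)
    c1 = [0.0] + [t - rho * (t - 1) for t in range(1, n)]
    ys = [w * y[0]] + [y[t] - rho * y[t - 1] for t in range(1, n)]
    a11 = sum(u * u for u in c0)
    a12 = sum(u * v for u, v in zip(c0, c1))
    a22 = sum(v * v for v in c1)
    g1 = sum(u * z for u, z in zip(c0, ys))
    g2 = sum(v * z for v, z in zip(c1, ys))
    det = a11 * a22 - a12 * a12
    return ((a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)

# Hypothetical IMR-like series (deaths per 1,000 live births), illustrative only
series = [52.0, 49.5, 48.1, 45.0, 43.8, 40.9, 39.5, 36.2]
print("rho = 0.0 (same as OLS):", gls_ar1_trend(series, 0.0))
print("rho = 0.6 (AR(1) GLS):  ", gls_ar1_trend(series, 0.6))
```

Setting `rho` to 0 reproduces ordinary least squares, so running both calls side by side mirrors the comparison the abstract describes.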

Relevance: 30.00%

Abstract:

The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit.

The study found that the $C_g^*$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into 8 to 30 quantiles was studied, the deciles-of-risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns.

The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential, since the statistic did not perform as desired when there was moderate to severe collinearity among covariates.

Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal-size groups by separating ties should be avoided.
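
To make the grouping idea concrete, the sketch below computes a Hosmer-Lemeshow-type statistic by sorting observations on their estimated probabilities, cutting them into equal-size groups, and summing the squared differences between observed and expected event counts. It is a simplified illustration with hypothetical data and an assumed function name, `hosmer_lemeshow`. Note that it cuts groups purely by position and may therefore separate tied probabilities, which the abstract advises against, so a production version would need explicit tie handling.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow-type statistic for binary outcomes y and fitted probabilities p.

    Sums (observed - expected)^2 / (m * p_bar * (1 - p_bar)) over groups of
    observations ordered by fitted probability. Requires 0 < p_bar < 1 per group.
    """
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        m = len(chunk)
        expected = sum(prob for prob, _ in chunk)
        observed = sum(outcome for _, outcome in chunk)
        p_bar = expected / m
        statistic += (observed - expected) ** 2 / (m * p_bar * (1 - p_bar))
    return statistic

# Hypothetical fitted probabilities and outcomes (illustrative only)
probs = [0.05 + 0.9 * i / 39 for i in range(40)]
outcomes = [1 if i >= 20 else 0 for i in range(40)]
print(hosmer_lemeshow(outcomes, probs, groups=4))
```

The resulting value is conventionally referred to a chi-square distribution with the number of groups minus two degrees of freedom, subject to the caveats about power and covariate patterns discussed in the abstract.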

Relevance: 30.00%

Abstract:

The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R × C contingency table of discrete data that explains the dependence of observations. The statistical power of the method will be determined empirically by computer simulation to judge its efficiency relative to existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (the total sample size is over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying whether inter-population DNA variation within the racial groups is significant, and the method will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.