19 results for Categorical variable
in DigitalCommons@The Texas Medical Center
Physical activity and survival after a first myocardial infarction: The Corpus Christi Heart Project
Abstract:
Previous studies have demonstrated that habitual physical activity is associated with a reduced risk of incident coronary heart disease (CHD). However, the role of physical activity in lowering the risk of all-cause mortality, CHD mortality, reinfarction, or receipt of a revascularization procedure after a first myocardial infarction (MI) remains unresolved, particularly in minority populations. To investigate the associations between physical activity and risk of all-cause mortality, CHD mortality, reinfarction, and receipt of a revascularization procedure, this study was conducted among Mexican-American and non-Hispanic white women and men who survived a first MI. The Corpus Christi Heart Project, a population-based cardiovascular surveillance study, provided data that included vital status, survival time, medical history, and CHD risk factor information, including level of physical activity, among Mexican-American and non-Hispanic white adults who had experienced a first MI between May 1988 and April 1990. MI patients were interviewed at baseline and annually thereafter until their death or through May 1995. A categorical variable was created to reflect change in level of physical activity following the first MI; categories included (1) sedentary with no change, (2) decreased activity, (3) increased activity, and (4) moderate activity with no change (the referent group). Proportional hazards regression analyses were used to assess the relationship of level of physical activity and risk of death, reinfarction, or receipt of a revascularization procedure, adjusting for age, sex, ethnicity, severity of MI, and CHD risk factor status. Over a 7-year follow-up period, the relative risk (95% confidence interval) of all-cause mortality was 4.67 (2.27, 9.60) for the sedentary-no change group, 2.33 (0.96, 5.67) for the decreased activity group, and 0.52 (0.11, 2.41) for the increased activity group.
The relative risk of CHD mortality was 6.92 (2.05, 23.34) for the sedentary-no change group, 2.40 (0.55, 10.51) for the decreased activity group, and 1.58 (0.26, 9.65) for the increased activity group. The relative risk for reinfarction was 2.50 (1.52, 4.10) for the sedentary-no change group, 2.26 (1.24, 4.12) for the decreased activity group, and 0.52 (0.21, 1.32) for the increased activity group. Finally, the relative risk for receipt of a revascularization procedure was 0.65 (0.39, 1.07) for the sedentary-no change group, 0.45 (0.22, 0.92) for the decreased activity group, and 1.01 (0.51, 2.02) for the increased activity group. No interactions were observed for ethnicity or severity of first MI. These results are consistent with the hypothesis that moderate physical activity is independently associated with a lower risk of all-cause mortality, CHD mortality, and reinfarction, but not revascularization, among Mexican-American and non-Hispanic white, female and male, first MI patients. These results also support the current recommendation that physical activity plays an important role in the secondary prevention of CHD.
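As a reading aid for the intervals reported above, the following minimal Python sketch recovers the point estimate and the standard error of the log relative risk from each all-cause mortality interval. It assumes the intervals are Wald-type 95% intervals that are symmetric on the log scale, which is usual for proportional hazards output but is not stated in the abstract; the function name `log_scale_summary` is hypothetical.

```python
import math

def log_scale_summary(lower, upper, z=1.96):
    """Return (implied point estimate, standard error of the log relative risk).

    Assumes a Wald interval symmetric on the log scale (an assumption,
    not something the abstract states).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    estimate = math.exp((log_lower + log_upper) / 2)   # geometric mean of the bounds
    std_error = (log_upper - log_lower) / (2 * z)      # half-width divided by z
    return estimate, std_error

# All-cause mortality results copied from the abstract: (reported RR, lower, upper).
all_cause = {
    "sedentary, no change": (4.67, 2.27, 9.60),
    "decreased activity": (2.33, 0.96, 5.67),
    "increased activity": (0.52, 0.11, 2.41),
}

for group, (reported, lower, upper) in all_cause.items():
    implied, se = log_scale_summary(lower, upper)
    print(f"{group}: reported {reported:.2f}, implied {implied:.2f}, SE(log RR) {se:.2f}")
```

Under that assumption, each reported relative risk should sit close to the geometric mean of its interval bounds, which gives a quick consistency check when reading such tables.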
Abstract:
Can the early identification of the species of staphylococcus responsible for infection by the use of Real Time PCR technology influence the approach to the treatment of these infections? This study was a retrospective cohort study in which two groups of patients were compared. The first group, ‘Physician Aware’, consisted of patients in whom physicians were informed of specific staphylococcal species and antibiotic sensitivity (using RT-PCR) at the time of notification of the gram stain. The second group, ‘Physician Unaware’, consisted of patients in whom treating physicians received the same information 24–72 hours later as a result of blood culture and antibiotic sensitivity determination. The approach to treatment was compared between the ‘Physician Aware’ and ‘Physician Unaware’ groups for three different microbiological diagnoses, namely MRSA, MSSA and no-SA (or coagulase negative Staphylococcus). For a diagnosis of MRSA, the mean time interval to the initiation of Vancomycin therapy was 1.08 hours in the ‘Physician Aware’ group as compared to 5.84 hours in the ‘Physician Unaware’ group (p=0.34). For a diagnosis of MSSA, the mean time interval to the initiation of specific anti-MSSA therapy with Nafcillin was 5.18 hours in the ‘Physician Aware’ group as compared to 49.8 hours in the ‘Physician Unaware’ group (p=0.007). Also, for the same diagnosis, the mean duration of empiric therapy in the ‘Physician Aware’ group was 19.68 hours as compared to 80.75 hours in the ‘Physician Unaware’ group (p=0.003). For a diagnosis of no-SA or coagulase negative staphylococcus, the mean duration of empiric therapy was 35.65 hours in the ‘Physician Aware’ group as compared to 44.38 hours in the ‘Physician Unaware’ group (p=0.07).
However, when treatment was considered a categorical variable and after exclusion of all cases where anti-MRS therapy was used for unrelated conditions, only 20 of 72 cases in the ‘Physician Aware’ group received treatment as compared to 48 of 106 cases in the ‘Physician Unaware’ group. Conclusions. Earlier diagnosis of MRSA may not alter final treatment outcomes. However, earlier identification may lead to the earlier institution of measures to limit the spread of infection. The early diagnosis of MSSA infection does lead to treatment with specific antibiotic therapy at an earlier stage of treatment. Also, the duration of empiric therapy is greatly reduced by early diagnosis. The early diagnosis of coagulase negative staphylococcal infection leads to a lower rate of unnecessary treatment for these infections, as they are commonly considered contaminants.
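The categorical comparison above is given only as raw counts. The short Python sketch below converts those counts into proportions (about 27.8% versus 45.3%) and a crude ratio. It is plain arithmetic on the reported numbers, not a reproduction of the study's analysis, and the variable names are illustrative.

```python
def proportion(events, total):
    """Share of cases that received treatment."""
    return events / total

# Counts copied from the abstract.
aware = proportion(20, 72)      # 'Physician Aware' group
unaware = proportion(48, 106)   # 'Physician Unaware' group

difference = unaware - aware    # absolute difference in proportions
ratio = aware / unaware         # crude (unadjusted) ratio of proportions

print(f"Aware: {aware:.1%}, Unaware: {unaware:.1%}")
print(f"Difference: {difference:.1%}, ratio: {ratio:.2f}")
```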
Abstract:
Renal insufficiency is one of the most common co-morbidities present in heart failure (HF) patients. It has a significant impact on mortality and adverse outcomes. Cystatin C has been shown to be a promising marker of renal function. A systematic review of all the published studies evaluating the prognostic role of cystatin C in both acute and chronic HF was undertaken. A comprehensive literature search was conducted involving various terms for 'cystatin C' and 'heart failure' in PubMed MEDLINE and Embase using the Scopus database. A total of twelve observational studies were selected in this review for detailed assessment. Six studies were performed in acute HF patients and six were performed in chronic HF patients. Cystatin C was used as a continuous variable, as quartiles/tertiles or as a categorical variable in these studies. Different mortality endpoints were reported in these studies. All twelve studies demonstrated a significant association of cystatin C with mortality. This association was found to be independent of other baseline risk factors that are known to impact HF outcomes. In both acute and chronic HF, cystatin C was not only a strong predictor of outcomes but also a better prognostic marker than creatinine and estimated glomerular filtration rate (eGFR). A combination of cystatin C with other biomarkers such as N-terminal pro-B-type natriuretic peptide (NT-proBNP) or creatinine also improved the risk stratification. The plausible mechanisms are renal dysfunction, inflammation or a direct effect of cystatin C on ventricular remodeling. Either alone or in combination, cystatin C is a better, accurate and reliable biomarker for HF prognosis.
Abstract:
Ordinal outcomes are frequently employed in diagnosis and clinical trials. Clinical trials of Alzheimer's disease (AD) treatments are a case in point, using the status of mild, moderate or severe disease as outcome measures. As in many other outcome-oriented studies, the disease status may be misclassified. This study estimates the extent of misclassification in an ordinal outcome such as disease status. Also, this study estimates the extent of misclassification of a predictor variable such as genotype status. An ordinal logistic regression model is commonly used to model the relationship between disease status, the effect of treatment, and other predictive factors. A simulation study was done. First, data based on a set of hypothetical parameters and hypothetical rates of misclassification were created. Next, the maximum likelihood method was employed to generate likelihood equations accounting for misclassification. The Nelder-Mead Simplex method was used to solve for the misclassification and model parameters. Finally, this method was applied to an AD dataset to detect the amount of misclassification present. The estimates of the ordinal regression model parameters were close to the hypothetical parameters. β1 was hypothesized at 0.50 and the mean estimate was 0.488; β2 was hypothesized at 0.04 and the mean of the estimates was 0.04. Although the estimates for the rates of misclassification of X1 were not as close as those for β1 and β2, they validate this method. The X1 0-1 misclassification was hypothesized as 2.98% and the mean of the simulated estimates was 1.54% and, in the best case, the misclassification of k from high to medium was hypothesized at 4.87% and had a sample mean of 3.62%. In the AD dataset, the estimate for the odds ratio of X1 of having both copies of the APOE 4 allele changed from an estimate of 1.377 to an estimate of 1.418, demonstrating that the estimates of the odds ratio changed when the analysis includes adjustment for misclassification.
Abstract:
The need for timely population data for health planning and indicators of need has increased the demand for population estimates. The data required to produce estimates are difficult to obtain and the process is time consuming. Estimation methods that require less effort and fewer data are needed. The structure preserving estimator (SPREE) is a promising technique not previously used to estimate county population characteristics. This study first uses traditional regression estimation techniques to produce estimates of county population totals. Then the structure preserving estimator, using the results produced in the first phase as constraints, is evaluated. Regression methods are among the most frequently used demographic methods for estimating populations. These methods use symptomatic indicators to predict population change. This research evaluates three regression methods to determine which will produce the best estimates based on the 1970 to 1980 indicators of population change. Strategies for stratifying data to improve the ability of the methods to predict change were tested. Difference-correlation using PMSA strata produced the equation which fit the data the best. Regression diagnostics were used to evaluate the residuals. The second phase of this study is to evaluate use of the structure preserving estimator in making estimates of population characteristics. The SPREE estimation approach uses existing data (the association structure) to establish the relationship between the variable of interest and the associated variable(s) at the county level. Marginals at the state level (the allocation structure) supply the current relationship between the variables. The full allocation structure model uses current estimates of county population totals to limit the magnitude of county estimates. The limited full allocation structure model has no constraints on county size.
The 1970 county census age-gender population provides the association structure; the allocation structure is the 1980 state age-gender distribution. The full allocation model produces good estimates of the 1980 county age-gender populations. An unanticipated finding of this research is that the limited full allocation model produces estimates of county population totals that are superior to those produced by the regression methods. The full allocation model is used to produce estimates of 1986 county population characteristics.
Abstract:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit. The study found that the $C_g^*$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns. The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations.
Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates. Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided.
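To make the statistic discussed above concrete, here is a minimal pure-Python sketch of the usual Hosmer-Lemeshow calculation: observations are sorted by fitted probability, grouped into roughly equal quantiles, and the squared gaps between observed and expected event counts are summed. The function name `hosmer_lemeshow` and the grouping rule are assumptions for illustration, not the dissertation's simulation code. In line with the abstract's advice, this sketch never splits tied probabilities across groups, so groups may be unequal in size.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees of freedom) for 0/1 outcomes y and fitted probabilities p."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    target = n / groups            # ideal group size
    bins, current = [], []
    for i, (prob, outcome) in enumerate(pairs):
        current.append((prob, outcome))
        is_last = i + 1 == n
        reached_target = len(bins) < groups - 1 and (i + 1) >= target * (len(bins) + 1)
        # Close the group only where the next probability differs, so ties stay together.
        if not is_last and reached_target and pairs[i + 1][0] != prob:
            bins.append(current)
            current = []
    if current:
        bins.append(current)

    statistic = 0.0
    for group in bins:
        size = len(group)
        observed = sum(outcome for _, outcome in group)
        expected = sum(prob for prob, _ in group)
        mean_prob = expected / size
        variance = size * mean_prob * (1 - mean_prob)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, len(bins) - 2

# Tiny illustrative data set (hypothetical values), using two groups.
stat, df = hosmer_lemeshow([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2)
print(stat, df)
```

A p-value would normally come from comparing the statistic with a chi-square distribution on the returned degrees of freedom; that step is left out here to keep the sketch free of third-party dependencies.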
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data that have three categories in the outcome variable. The advantage of this model is that it permits a different number of measurements for each subject, and the duration between two consecutive time points of measurement can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, this model can also estimate the transition probability for each subject. The Monte Carlo simulation method will be used to investigate the goodness of model fitting compared with that obtained from other models. A public health example will be used to demonstrate the application of this method.
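The reason a continuous-time chain copes with irregular visit spacing is that its transition probabilities over any interval t follow from a single generator matrix Q as P(t) = exp(Qt). The Python sketch below illustrates this for three states with a truncated power series. The generator values are made up for illustration, and the function names are hypothetical; the dissertation's estimation procedure is not reproduced here.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transition_matrix(q, t, terms=40):
    """Approximate P(t) = exp(Q t) with a truncated power series.

    Adequate for small matrices and moderate t; production code would
    normally use a library matrix exponential instead.
    """
    size = len(q)
    qt = [[q[i][j] * t for j in range(size)] for i in range(size)]
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]          # current series term, starts at identity
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, qt)]  # (Qt)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(size)] for i in range(size)]
    return result

# Hypothetical generator: rows sum to zero, state 2 is absorbing.
Q = [[-0.3, 0.2, 0.1],
     [0.1, -0.4, 0.3],
     [0.0, 0.0, 0.0]]

# Transition probabilities over two unequal gaps between measurements.
P_short = transition_matrix(Q, 0.5)
P_long = transition_matrix(Q, 1.5)
for row in P_short:
    print([round(x, 4) for x in row])
```

Because one Q yields P(t) for every t, each subject's observed gaps can enter the likelihood with their own transition matrix, which is what allows differing numbers and spacings of measurements.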
Abstract:
The Lyme disease agent Borrelia burgdorferi can persistently infect humans and other animals despite active host immune responses. This is facilitated, in part, by the vls locus, a complex system consisting of the vlsE expression site and an adjacent set of 11 to 15 silent vls cassettes. Segments of nonexpressed cassettes recombine with the vlsE region during infection of mammalian hosts, resulting in combinatorial antigenic variation of the VlsE outer surface protein. We now demonstrate that synthesis of VlsE is regulated during the natural mammal-tick infectious cycle, being activated in mammals but repressed during tick colonization. Examination of cultured B. burgdorferi cells indicated that the spirochete controls vlsE transcription levels in response to environmental cues. Analysis of PvlsE::gfp fusions in B. burgdorferi indicated that VlsE production is controlled at the level of transcriptional initiation, and regions of 5' DNA involved in the regulation were identified. Electrophoretic mobility shift assays detected qualitative and quantitative changes in patterns of protein-DNA complexes formed between the vlsE promoter and cytoplasmic proteins, suggesting the involvement of DNA-binding proteins in the regulation of vlsE, with at least one protein acting as a transcriptional activator.
Abstract:
An exact knowledge of the kinetic nature of the interaction between the stimulatory G protein ($G_{\rm s}$) and the adenylyl cyclase catalytic unit (C) is essential for interpreting the effects of $G_{\rm s}$ mutations and expression levels on cellular response to a wide variety of hormones, drugs, and neurotransmitters. In particular, insight as to the association of these proteins could lead to progress in tumor biology, where single spontaneous mutations in G proteins have been associated with the formation of tumors (118). The question this work attempts to answer is whether the adenylyl cyclase activation by epinephrine stimulated $\beta_2$-adrenergic receptors occurs via $G_{\rm s}$ proteins by a $G_{\rm s}$ to C shuttle or $G_{\rm s}$-C precoupled mechanism. The two forms of activation are distinguishable by the effect of $G_{\rm s}$ levels on epinephrine stimulated EC50 values for cyclase activation. We have made stable transfectants of S49 cyc$^-$ cells with the gene for the $\alpha$ protein of $G_{\rm s}$ ($\alpha_{\rm s}$), which is under the control of the mouse mammary tumor virus LTR promoter (110). Expression of $G_{\rm s}\alpha$ was then controlled by incubation of the cells for various times with 5 $\mu$M dexamethasone. Expression of $G_{\rm s}\alpha$ led to the appearance of GTP shifts in the competitive binding of epinephrine with $^{125}$ICYP to the $\beta$-adrenergic receptors and to agonist dependent adenylyl cyclase activity. High expression of $G_{\rm s}\alpha$ resulted in lower EC50 values for the adenylyl cyclase activity in response to epinephrine than did low expression. By kinetic modelling, this result is consistent with the existence of a shuttle mechanism for adenylyl cyclase activation by hormones. One item of concern that remains to be addressed is the extent to which activation of adenylyl cyclase occurs by a "pure" shuttle mechanism.
Kinetic and biochemical experiments by other investigators have revealed that adenylyl cyclase activation by hormones may occur via a $G_{\rm s}$-C precoupled mechanism (80, 94, 97). Activation of adenylyl cyclase, therefore, probably does not occur by either a pure "Shuttle" or "$G_{\rm s}$-C Precoupled" mechanism, but rather by a "Hybrid" mechanism. The extent to which either the shuttle or precoupled mechanism contributes to hormone stimulated adenylyl cyclase activity is the subject of on-going research.
Abstract:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data are multiply imputed. Missing data of predictor variables are multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data.
With all data types, a fully observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms in complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis Random Forests was applied to a large-scale lung cancer case-control study. A novel way of automatically selecting prognostic factors was proposed. Also, a synthetic positive control was used to validate the Random Forests method. Throughout this study we showed that Random Forests can deal with a large number of weak input variables without overfitting. It can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities. Random Forests can deal with large-scale data sets without rigorous data preprocessing. It has a robust variable importance ranking measure. Proposed is a novel variable selection method in the context of Random Forests that uses the data noise level as the cut-off value to determine the subset of the important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors for complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments. When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression. This study suggests that Random Forests is recommended for such high-dimensional data. One can use Random Forests to select the important variables and then use logistic regression or Random Forests itself to estimate the effect size of the predictors and to classify new observations. We also found that the mean decrease of accuracy is a more reliable variable ranking measurement than the mean decrease of Gini.
Abstract:
The discrete-time Markov chain is commonly used in describing changes of health states for chronic diseases in a longitudinal study. Statistical inferences on comparing treatment effects or on finding determinants of disease progression usually require estimation of transition probabilities. In many situations when the outcome data have some missing observations or the variable of interest (called a latent variable) cannot be measured directly, the estimation of transition probabilities becomes more complicated. In the latter case, a surrogate variable that is easier to access and can gauge the characteristics of the latent one is usually used for data analysis. This dissertation research proposes methods to analyze longitudinal data (1) that have a categorical outcome with missing observations or (2) that use complete or incomplete surrogate observations to analyze the categorical latent outcome. For (1), different missing mechanisms were considered for empirical studies using methods that include the EM algorithm, Monte Carlo EM and a procedure that is not a data augmentation method. For (2), the hidden Markov model with the forward-backward procedure was applied for parameter estimation. This method was also extended to cover the computation of standard errors. The proposed methods were demonstrated with the schizophrenia example. The relevance to public health, the strengths and limitations, and possible future research were also discussed.
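As a rough illustration of the hidden Markov machinery mentioned above, the Python sketch below implements only the forward pass, which computes the likelihood of a sequence of surrogate observations given a latent categorical chain. All probabilities and names are hypothetical, and the backward pass, parameter estimation and standard errors described in the abstract are not covered. A brute-force sum over every latent path is included purely as a check.

```python
from itertools import product

def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of the observed surrogate sequence via the forward recursion."""
    n_states = len(initial)
    # alpha[s] = P(observations so far, latent state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)

def brute_force_likelihood(initial, transition, emission, observations):
    """Same likelihood by enumerating every latent path (only feasible for tiny cases)."""
    n_states = len(initial)
    total = 0.0
    for path in product(range(n_states), repeat=len(observations)):
        prob = initial[path[0]] * emission[path[0]][observations[0]]
        for prev, cur, obs in zip(path, path[1:], observations[1:]):
            prob *= transition[prev][cur] * emission[cur][obs]
        total += prob
    return total

# Hypothetical two-state latent chain with a binary surrogate measurement.
initial = [0.6, 0.4]
transition = [[0.8, 0.2],
              [0.3, 0.7]]
emission = [[0.9, 0.1],   # P(surrogate = 0 or 1 | latent state 0)
            [0.2, 0.8]]   # P(surrogate = 0 or 1 | latent state 1)
observed = [0, 1, 1, 0]

print(forward_likelihood(initial, transition, emission, observed))
```

The forward recursion costs time linear in the sequence length, whereas the enumeration grows exponentially, which is why the recursive form is the practical choice for longitudinal data.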
Abstract:
Studies on the relationship between psychosocial determinants and HIV risk behaviors have produced little evidence to support hypotheses based on theoretical relationships. One limitation inherent in many articles in the literature is the method of measurement of the determinants and the analytic approach selected. To reduce the misclassification associated with unit scaling of measures specific to internalized homonegativity, I evaluated the psychometric properties of the Reactions to Homosexuality scale in a confirmatory factor analytic framework. In addition, I assessed the measurement invariance of the scale across racial/ethnic classifications in a sample of men who have sex with men. The resulting measure contained eight items loading on three first-order factors. Invariance assessment identified metric and partial strong invariance between racial/ethnic groups in the sample. Application of the updated measure to a structural model allowed for the exploration of direct and indirect effects of internalized homonegativity on unprotected anal intercourse. Pathways identified in the model show that drug and alcohol use at last sexual encounter, the number of sexual partners in the previous three months and sexual compulsivity all contribute directly to risk behavior. Internalized homonegativity reduced the likelihood of exposure to drugs, alcohol or higher numbers of partners. For men who developed compulsive sexual behavior as a coping strategy for internalized homonegativity, there was an increase in the prevalence odds of risk behavior. In the final stage of the analysis, I conducted a latent profile analysis of the items in the updated Reactions to Homosexuality scale. This analysis identified five distinct profiles, which suggested that the construct was not homogeneous in samples of men who have sex with men.
Lack of prior consideration of these distinct manifestations of internalized homonegativity may have contributed to the analytic difficulty in identifying a relationship between the trait and high-risk sexual practices.
Abstract:
Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with the higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate that overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model. Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification.
Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
Abstract:
This study evaluates the effectiveness of the Children and Youth Projects' Adolescent Family Life Program, a comprehensive program serving pregnant and parenting adolescents in the economically disadvantaged area of West Dallas. The underlying question asked is what are the relative contributions of the comprehensive, school-linked Adolescent Family Life (AFL) Program compared with the Maternal Health and Family Planning Program (MHFPP), a categorical provider of family planning and reproductive services, towards meeting the immediate and intermediate term needs of adolescent mothers. Also addressed are the protective effects of participation in the Dallas Independent School District Health Special Program, a segregated school for pregnant adolescents. A cohort of 339 West Dallas adolescent mothers who delivered babies during a two-year period, 1986 through 1987, is monitored by linking records from Parkland Hospital, the primary provider of hospital services to indigent women in Dallas, the Dallas Independent School District, and the prenatal care providers, the AFL and MHFP Programs. Information is collected on each teen describing her demographic, fertility, service utilization and educational characteristics. The study tests the hypothesis that adolescents receiving services from the comprehensive AFL program will be less likely to have a repeat birth and to discontinue school during the 24 month study period, compared with categorical provider clients. Although the study finds that there are no statistically significant differences in repeat deliveries, using survival analysis, or in school continuation between programs, important findings are revealed about the ethnic differences. Black and Hispanic fertility and educational behaviors are compared, and their implications for program design and evaluation discussed.