25 results for Automatic Analysis of Multivariate Categorical Data Sets
in DigitalCommons@The Texas Medical Center
Abstract:
Background. The purpose of this study was to investigate the association of periodontal disease with sex hormones. If periodontal disease is associated with abnormal levels of sex hormones, this may indicate a link between periodontal disease and prostate cancer. Methods. All participants were drawn from the Third National Health and Nutrition Examination Survey (NHANES III). Serum samples for hormone measurements (testosterone, free testosterone, estradiol, free estradiol, and sex hormone binding globulin (SHBG)) and periodontal examination data were available for 1,101 of these men. Results. After adjusting for known risk factors, periodontal disease was significantly associated with the sex hormones testosterone, free testosterone, estradiol, and free estradiol. The association between periodontal disease and sex hormone levels did not differ significantly between ethnic groups. Discussion. The results indicate the need for further study of periodontal disease and serum levels of testosterone, free testosterone, estradiol, and free estradiol in men.
Abstract:
The need for timely population data for health planning and indicators of need has increased the demand for population estimates. The data required to produce estimates are difficult to obtain, and the process is time consuming. Estimation methods that require less effort and fewer data are needed. The structure preserving estimator (SPREE) is a promising technique not previously used to estimate county population characteristics. This study first uses traditional regression estimation techniques to produce estimates of county population totals. Then the structure preserving estimator, using the results produced in the first phase as constraints, is evaluated.
Regression methods are among the most frequently used demographic methods for estimating populations. These methods use symptomatic indicators to predict population change. This research evaluates three regression methods to determine which produces the best estimates based on the 1970 to 1980 indicators of population change. Strategies for stratifying data to improve the ability of the methods to predict change were tested. Difference-correlation using PMSA strata produced the equation that fit the data best. Regression diagnostics were used to evaluate the residuals.
The second phase of this study evaluates the use of the structure preserving estimator in making estimates of population characteristics. The SPREE estimation approach uses existing data (the association structure) to establish the relationship between the variable of interest and the associated variable(s) at the county level. Marginals at the state level (the allocation structure) supply the current relationship between the variables. The full allocation structure model uses current estimates of county population totals to limit the magnitude of county estimates. The limited full allocation structure model has no constraints on county size. The 1970 county census age-gender population provides the association structure; the allocation structure is the 1980 state age-gender distribution.
The full allocation model produces good estimates of the 1980 county age-gender populations. An unanticipated finding of this research is that the limited full allocation model produces estimates of county population totals that are superior to those produced by the regression methods. The full allocation model is used to produce estimates of 1986 county population characteristics.
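The SPREE step in this abstract can be illustrated with iterative proportional fitting (raking), the procedure on which structure preserving estimation is commonly built. What follows is a minimal sketch under stated assumptions, not the study's implementation: the function name `spree_ipf`, the toy counts, and the choice to constrain both county totals and state age-gender margins (mirroring the full allocation structure) are all hypothetical.

```python
def spree_ipf(association, row_targets, col_targets, iterations=100):
    """Rake a county x category table so its margins match new targets.

    association : rows (counties) of old counts, e.g. from a past census
    row_targets : current county totals (hypothetical constraint values)
    col_targets : current state totals per age-gender category
    """
    table = [[float(x) for x in row] for row in association]
    for _ in range(iterations):
        # Scale each row to its target county total.
        for i, row in enumerate(table):
            s = sum(row)
            if s > 0:
                table[i] = [x * row_targets[i] / s for x in row]
        # Scale each column to its target state category total.
        for j in range(len(col_targets)):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= col_targets[j] / s
    return table


# Toy example: 3 counties x 2 age-gender categories (made-up numbers).
old_counts = [[40, 60], [30, 20], [50, 50]]
county_totals = [120.0, 60.0, 90.0]   # sums to the same grand total (270)
state_margins = [150.0, 120.0]        # as the state margins
estimate = spree_ipf(old_counts, county_totals, state_margins)
```

Raking rescales rows and columns only, so the cross-product ratios of the old table (its association structure) carry over unchanged while the margins move to the current values. Dropping the row-scaling step would correspond loosely to a model with no constraint on county size.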
Abstract:
The role of clinical chemistry has traditionally been to evaluate acutely ill or hospitalized patients. Traditional statistical methods have serious drawbacks in that they use univariate techniques. To demonstrate alternative methodology, a multivariate analysis of covariance model was developed and applied to the data from the Cooperative Study of Sickle Cell Disease (CSSCD).
The purpose of developing the model for the laboratory data from the CSSCD was to evaluate the comparability of the results from the different clinics. Several variables were incorporated into the model in order to control for possible differences among the clinics that might confound any real laboratory differences.
Differences for LDH, alkaline phosphatase, and SGOT were identified that will necessitate adjustments by clinic whenever these data are used. In addition, aberrant clinic values for LDH, creatinine, and BUN were identified.
The use of any statistical technique, including multivariate analysis, without thoughtful consideration may lead to spurious conclusions that may not be corrected for some time, if ever. However, the advantages of multivariate analysis far outweigh its potential problems. If its use increases as it should, its application to the analysis of laboratory data in prospective patient monitoring, quality control programs, and interpretation of data from cooperative studies could well have a major impact on the health and well-being of a large number of individuals.
Abstract:
Mixture modeling is commonly used to model categorical latent variables that represent subpopulations in which population membership is unknown but can be inferred from the data. In recent years, finite mixture models have been applied to time-to-event data. However, the commonly used survival mixture model assumes that the effects of the covariates on failure times differ across latent classes but that the covariate distribution is homogeneous. The aim of this dissertation is to develop a method to examine time-to-event data in the presence of unobserved heterogeneity under a framework of mixture modeling. A joint model is developed to incorporate the latent survival trajectory along with the observed information for the joint analysis of a time-to-event variable, its discrete and continuous covariates, and a latent class variable. It is assumed that both the effects of covariates on survival times and the distribution of covariates vary across latent classes. The unobservable survival trajectories are identified by estimating the probability that a subject belongs to a particular class based on observed information. We applied this method to a Hodgkin lymphoma study with long-term follow-up and observed four distinct latent classes in terms of long-term survival and distributions of prognostic factors. Our results from simulation studies and from the Hodgkin lymphoma study demonstrated the superiority of our joint model over the conventional survival model. This flexible inference method provides more accurate estimation and accommodates unobservable heterogeneity among individuals while taking involved interactions between covariates into consideration.
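The class-membership probability mentioned in this abstract follows from Bayes' rule in any finite mixture: the posterior weight of a class is proportional to its prior weight times the density of the observed data under that class. The sketch below is illustrative only. It assumes an uncensored event time with an exponential density per class; the rates, weights, and function names are hypothetical, and the dissertation's joint model, which also includes covariate distributions, is not reproduced here.

```python
import math


def exponential_density(t, rate):
    """Density of an exponential event time (an assumed, simplified model)."""
    return rate * math.exp(-rate * t)


def class_posteriors(weights, densities):
    """Posterior class probabilities: prior weight x density, normalized."""
    joint = [w * d for w, d in zip(weights, densities)]
    total = sum(joint)
    return [j / total for j in joint]


# Two hypothetical latent classes: long survivors (low rate) and short survivors.
class_weights = [0.5, 0.5]
class_rates = [0.1, 1.0]
t_observed = 5.0
posterior = class_posteriors(
    class_weights, [exponential_density(t_observed, r) for r in class_rates]
)
```

With these made-up values, a subject whose event occurs at time 5 is assigned mostly to the low-rate class, since such a late event is unlikely under the high-rate class.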
Abstract:
In this dissertation, we propose a continuous-time Markov chain model for longitudinal data in which the outcome variable has three categories. The advantage of this model is that it permits a different number of measurements for each subject, and the interval between two consecutive measurement times can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, the model can also estimate the transition probability for each subject. Monte Carlo simulation will be used to compare the goodness of fit of this model with that of other models. A public health example will be used to demonstrate the application of this method.
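In a continuous-time Markov chain, the transition probabilities over an interval of length t are given by the matrix exponential P(t) = exp(Qt) of the generator matrix Q, which is why irregular gaps between measurements pose no difficulty. The following is a minimal sketch of that relationship for three states. The generator values are hypothetical, the helper names are illustrative, and the maximum likelihood estimation of Q described in the abstract is not shown.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=20):
    """Return P(t) = exp(Q t) for a generator matrix Q.

    Uses a truncated Taylor series on Q t / 2**squarings, then squares the
    result `squarings` times (scaling and squaring).
    """
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        term = mat_mul(term, a)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical generator for a three-category outcome; each row sums to zero.
Q = [[-0.5, 0.3, 0.2],
     [0.1, -0.4, 0.3],
     [0.2, 0.2, -0.4]]

# Irregular gaps between visits are handled by plugging in the actual gap.
P_short = transition_matrix(Q, 0.5)
P_long = transition_matrix(Q, 2.0)
```

Each row of the returned matrix is a probability distribution over the three categories at the later time point, given the category at the earlier one.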
Abstract:
Any functionally important mutation is embedded in an evolutionary matrix of other mutations. Cladistic analysis, based on this, is a method of investigating gene effects using a haplotype phylogeny to define a set of tests which localize causal mutations to branches of the phylogeny. Previous implementations of cladistic analysis have not addressed the issue of analyzing data from related individuals, though in human studies, family data are usually needed to obtain unambiguous haplotypes. In this study, a method of cladistic analysis is described in which haplotype effects are parameterized in a linear model which accounts for familial correlations. The method was used to study the effect of apolipoprotein (Apo) B gene variation on total-, LDL-, and HDL-cholesterol, triglyceride, and Apo B levels in 121 French families. Five polymorphisms defined Apo B haplotypes: the signal peptide insertion/deletion, Bsp 1286I, XbaI, MspI, and EcoRI. Eleven haplotypes were found, and a haplotype phylogeny was constructed and used to define a set of tests of haplotype effects on lipid and Apo B levels.
This new method of cladistic analysis, the parametric method, found significant effects for single haplotypes for all variables. For HDL-cholesterol, 3 clusters of evolutionarily related haplotypes affecting levels were found. Haplotype effects accounted for about 10% of the genetic variance of triglyceride and HDL-cholesterol levels. The results of the parametric method were compared to those of a method of cladistic analysis based on permutational testing. The permutational method detected fewer haplotype effects, even when modified to account for correlations within families. Simulation studies exploring these differences found evidence of systematic errors in the permutational method due to the process by which haplotype groups were selected for testing.
The applicability of cladistic analysis to human data was shown. The parametric method is suggested as an improvement over the permutational method. This study has identified candidate haplotypes for sequence comparisons in order to locate the functional mutations in the Apo B gene which may influence plasma lipid levels.
Abstract:
Temperature sensitive (ts) mutant viruses have helped elucidate replication processes in many viral systems. Several panels of replication-defective ts mutants in which viral RNA synthesis is abolished at the nonpermissive temperature (RNA⁻) have been isolated for mouse hepatitis virus, MHV (Robb et al., 1979; Koolen et al., 1983; Martin et al., 1988; Schaad et al., 1990). However, no one had investigated genetic or phenotypic relationships between these different mutant panels. In order to determine how the panel of MHV-JHM RNA⁻ ts mutants (Robb et al., 1979) was genetically related to other described MHV RNA⁻ ts mutants, the MHV-JHM mutants were tested for complementation with representatives from two different sets of MHV-A59 ts mutants (Koolen et al., 1983; Schaad et al., 1990). The three ts mutant panels together were found to comprise eight genetically distinct complementation groups. Of these eight complementation groups, three are unique to their particular mutant panel; genetically equivalent mutants were not observed within the other two mutant panels. Two complementation groups were common to all three mutant panels. The three remaining complementation groups overlapped two of the three mutant sets. Mutants MHV-JHM tsA204 and MHV-A59 ts261 were shown to be within one of these overlapping complementation groups. The phenotype of the MHV-JHM mutants within this complementation class has been previously characterized (Leibowitz et al., 1982; Leibowitz et al., 1990). When these mutants were grown at the permissive temperature, then shifted up to the nonpermissive temperature at the start of RNA synthesis, genome-length RNA and leader RNA fragments accumulated, but no subgenomic mRNA was synthesized. MHV-A59 ts261 produced leader RNA fragments identical to those observed with MHV-JHM tsA204. Thus, these two MHV RNA⁻ ts mutants that were genetically equivalent by complementation testing were phenotypically similar as well. Recombination frequencies obtained from crosses of MHV-A59 ts261 with several of the gene 1 MHV-A59 mutants indicated that the causal mutation(s) of MHV-A59 ts261 was located near the overlapping junction of ORF1a and ORF1b, in the 3′ end of ORF1a or the 5′ end of ORF1b. Sequence analysis of this junction and 1400 nucleotides into the 5′ end of ORF1b of MHV-A59 ts261 revealed one nucleotide change from the wildtype MHV-A59. This substitution at nucleotide 13,598 (A to G) was a silent mutation in the ORF1a reading frame, but resulted in an amino acid change in the ORF1b gene product (I to V). This amino acid change would be expressed only in the readthrough translation product produced upon successful ribosome frameshifting. A revertant of MHV-A59 ts261 (R2) also retained this guanine residue, but had a second substitution at nucleotide 14,475 in ORF1b. This mutation results in the substitution of valine for an isoleucine.
The data presented here suggest that the mutation in MHV-A59 ts261 (nucleotide 13,598) would be responsible for the MHV-JHM complementation group A phenotype. A second-site reversion at nucleotide 14,475 may correct this defect in the revertant. Sequencing of gene 1 immediately upstream of nucleotide 13,296 and downstream of nucleotide 15,010 must be conducted to test this hypothesis.
Abstract:
This research examines the prevalence of alcohol and illicit substance use in the United States and Mexico and associated socio-demographic characteristics. The sources of data for this study are public domain data from the U.S. National Household Survey of Drug Abuse, 1988 (n = 8,814), and the Mexican National Survey of Addictions, 1988 (n = 12,579). In addition, this study discusses methodologic issues in cross-cultural and cross-national comparison of behavioral and epidemiologic data from population-based samples. The extent to which patterns of substance abuse vary among subgroups of the U.S. and Mexican populations is assessed, as well as the comparability and equivalence of measures of alcohol and drug use in these national samples.
The prevalence of alcohol use was somewhat similar in the two countries for all three measures of use: lifetime, past year, and past year heavy use (85.0%, 68.1%, and 39.6% for the U.S. and 72.6%, 47.7%, and 45.8% for Mexico, respectively). The use of illegal substances varied widely between countries, with U.S. respondents reporting significantly higher levels of use than their Mexican counterparts. For example, reported use of any illicit substance in lifetime and past year was 34.2% and 11.6% for the U.S., and 3.3% and 0.6% for Mexico. Despite these differences in prevalence, two demographic characteristics, gender and age, were important correlates of use in both countries. Men in both countries were more likely to report use of alcohol and illicit substances than women. Generally speaking, a greater proportion of respondents 18 years of age or older in both countries reported use of alcohol for all three measures than younger respondents, and a greater proportion of respondents between the ages of 18 and 34 years reported use of illicit substances during lifetime and past year than any other age group.
Additional substantive research investigating population-based samples and at-risk subgroups is needed to understand the underlying mechanisms of these associations. Further development of cross-culturally meaningful survey methods is warranted to validate comparisons of substance use across countries and societies.
Abstract:
Tup1 forms a complex with Ssn6 in yeast. The Ssn6-Tup1 complex is recruited via direct interactions with specific DNA binding proteins to a specific promoter region and mediates repression of several sets of genes, including a-cell specific genes (asgs) in α cells. It has been shown that repression of asgs also requires histone H4 and that Tup1 can directly interact with H3 and H4 in vitro. To address whether histone H3 is required for the repression of asgs, I have examined the effect of H3 and H4 mutations on the expression of an α2-controlled LacZ reporter. Assay of β-galactosidase shows that mutations in either H3 or H4 cause a weak derepression of the reporter gene. Some double mutations result in a stronger derepression, while others do not. The H3 N-terminal deletion also leads to a slightly decreased expression of the reporter gene in α cells. Our data suggest that the N-termini of both H3 and H4 are cooperatively involved in the repression of a-cell specific genes in α cells, possibly through their interaction with Tup1.
GCN5 was originally identified as a transcriptional regulator required to activate a subset of genes in yeast. Recently, it has been shown that GCN5 encodes the catalytic subunit of a nuclear histone acetyltransferase, providing the first direct link between histone acetylation and gene transcription. Recombinant Gcn5p (rGcn5p) exhibits a limited substrate specificity in vitro. However, neither the specificity of this enzyme in vivo nor the importance of particular acetylated residues to transcription or cell growth is well defined. In order to define the sites of histone acetylation mediated by Gcn5p in vivo and assess the significance of histone acetylation, more than 30 yeast strains have been constructed to bear specific H3 and/or H4 mutations in the presence or absence of GCN5 function. Our genetic data suggest that Gcn5p may have additional targets in vivo that were not identified as targets of rGcn5p by previous studies. Western analysis using antibodies specifically recognizing particular acetylated isoforms of H3 and H4 led us to conclude that Gcn5p is necessary for full acetylation of multiple sites in both H3 and H4 in vivo. Consistent with these observations, rGcn5p still acetylates histones H3 and H4 bearing mutations either in H3 K14 or H4 K8,16, sites previously identified as the targets of acetylation by rGcn5p in H3 and H4. Our data also demonstrated that Gcn5p-mediated acetylation events are important for normal progression of the cell cycle and for transcriptional activation. Furthermore, a critical overall level of acetylation is essential for cell viability.
Abstract:
Many studies in biostatistics deal with binary data. Some of these studies involve correlated observations, which can complicate the analysis of the resulting data. Studies of this kind typically arise when a high degree of commonality exists between test subjects. If there is a natural hierarchy in the data, multilevel analysis is an appropriate tool for the analysis. Two examples are measurements on identical twins and the study of symmetrical organs or appendages, as in ophthalmic studies. Although this type of matching appears ideal for the purposes of comparison, analysis of the resulting data while ignoring the effect of intra-cluster correlation has been shown to produce biased results.
This paper will explore the use of multilevel modeling of simulated binary data with predetermined levels of correlation. Data will be generated using the Beta-Binomial method with varying degrees of correlation between the lower level observations. The data will be analyzed using the multilevel software package MLwiN (Woodhouse et al., 1995). Comparisons between the specified intra-cluster correlation of these data and the correlations estimated by multilevel analysis will be used to examine the accuracy of this technique in analyzing this type of data.
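The data-generation step described in this abstract can be sketched as follows. Under a beta-binomial model, each cluster draws its own success probability from a Beta(a, b) distribution, and the intra-cluster correlation equals 1 / (a + b + 1), so a and b can be chosen to hit a predetermined mean and correlation. This is an illustrative sketch only: the function names, sample sizes, and parameter values are hypothetical, and the correlation is recovered with a simple moment estimator for pairs rather than with the multilevel model the paper fits.

```python
import random


def simulate_clusters(n_clusters, cluster_size, mean_p, icc, seed=0):
    """Simulate correlated binary data from a beta-binomial model.

    Each cluster draws its own success probability from Beta(a, b), chosen
    so that the mean is `mean_p` and the intra-cluster correlation is `icc`
    (for the beta-binomial, icc = 1 / (a + b + 1)).
    """
    rng = random.Random(seed)
    total = (1.0 - icc) / icc          # a + b
    a = mean_p * total
    b = (1.0 - mean_p) * total
    data = []
    for _ in range(n_clusters):
        p = rng.betavariate(a, b)
        data.append([1 if rng.random() < p else 0 for _ in range(cluster_size)])
    return data


def estimate_icc_pairs(data):
    """Moment estimate of the correlation between the two members of a pair."""
    n = len(data)
    p_hat = sum(y1 + y2 for y1, y2 in data) / (2.0 * n)
    both = sum(y1 * y2 for y1, y2 in data) / n
    return (both - p_hat ** 2) / (p_hat * (1.0 - p_hat))


# Hypothetical settings: 20,000 pairs (e.g. twins), mean 0.3, correlation 0.4.
pairs = simulate_clusters(n_clusters=20000, cluster_size=2, mean_p=0.3, icc=0.4)
icc_hat = estimate_icc_pairs(pairs)
```

Comparing `icc_hat` with the specified value across a range of settings mirrors, in a simplified form, the accuracy comparison the abstract proposes.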
Abstract:
Cancer is the second leading cause of death in the United States. With the advent of new technologies, changes in health care delivery, and the multiplicity of provider types that patients must see, cancer care management has become increasingly complex. The availability of cancer health information has been shown to help cancer patients cope with the management and effects of their cancers. As a result, more cancer patients are using the internet to find resources that can aid in decision-making and recovery.
The Health Information National Trends Survey (HINTS) is a nationally representative survey designed to collect information about the experiences of cancer and non-cancer adults with health information sources. The HINTS survey focused on both conventional sources and newer technologies, particularly the internet. This study is a descriptive analysis of the HINTS 2003 and HINTS 2005 survey data. The purpose of the research is to explore the general trends in health information seeking and use by US adults, and especially by cancer patients.
From 2003 to 2005, internet use for various health-related activities appears to have increased among adults with and without cancer. Differences were found between the groups in general trust in information media, particularly the internet. Non-cancer respondents tended to have greater trust in information media than cancer respondents.
The latter portion of this work examined characteristics of HINTS respondents that were thought to be relevant to how much trust individuals placed in the internet as a source of health information. Trust in health information from the internet was significantly greater among younger adults, higher-earning households, internet users, online seekers of health or cancer information, and those who found online cancer information useful.
Abstract:
In the United States, “binge” drinking among college students is an emerging public health concern due to its significant physical and psychological effects on young adults. The focus is on identifying interventions that can help decrease high-risk drinking behavior among this group of drinkers. One such intervention is motivational interviewing (MI), a client-centered therapy that aims at resolving client ambivalence by developing discrepancy and engaging the client in change talk. Of late, there is growing interest in determining the active ingredients that influence the alliance between the therapist and the client. This study is a secondary analysis of the data obtained from the Southern Methodist Alcohol Research Trial (SMART) project, a dismantling trial of MI and feedback among heavy drinking college students. The present project examines the relationship between therapist and client language in MI sessions in a sample of “binge” drinking college students. Of the 126 SMART tapes, 30 tapes (‘MI with feedback’ group = 15, ‘MI only’ group = 15) were randomly selected for this study. MISC 2.1, a mutually exclusive and exhaustive coding system, was used to code the audio/videotaped MI sessions. Therapist and client language were analyzed for communication characteristics. Overall, therapists adopted an MI-consistent style and clients were found to engage in change talk. Counselor acceptance, empathy, spirit, and complex reflections were all significantly related to client change talk (p-values ranged from 0.001 to 0.047). Additionally, therapist ‘advice without permission’ and MI-inconsistent therapist behaviors were strongly correlated with client sustain talk (p-values ranged from 0.006 to 0.048). Simple linear regression models showed a significant relationship between MI-consistent (MICO) therapist language (independent variable) and change talk (dependent variable), and between MI-inconsistent (MIIN) therapist language (independent variable) and sustain talk (dependent variable). The study has several limitations, such as small sample size, self-selection bias, poor inter-rater reliability for the global scales, and the lack of a temporal measure of therapist and client language. Future studies might consider a larger sample size to obtain more statistical power. In addition, the correlation between therapist language, client language, and drinking outcome needs to be explored.
Abstract:
We conducted a nested case-control study to determine the significant risk factors for developing encephalitis from West Nile virus (WNV) infection. The purpose of this research project was to expand the previously published Houston study of 2002–2004 patients to include data on Houston patients from four additional years (2005–2008), to determine whether the larger sample size revealed any differences in the risk factors associated with the more severe outcomes of WNV infection, encephalitis and death. A re-analysis of the risk factors for encephalitis and death was conducted on all of the patients from 2002–2008 and was the focus of this research. This analysis showed that the risk factors for encephalitis and death differ with the increased sample size. Retrospective medical chart reviews were completed for the 265 confirmed WNV hospitalized patients; 153 patients had encephalitis (WNE), and 112 had either viral syndrome with fever (WNF) or meningitis (WNM); a total of 22 patients died. Univariate logistic regression analyses of demographic, comorbidity, and social risk factors were conducted in a manner similar to the previous study to determine the risk factors for developing encephalitis from WNV. A multivariate model was developed using model-building strategies for multivariate logistic regression analysis. The hypothesis of this study was that additional risk factors would be shown to be significant with the increase in sample size of the dataset. This analysis, with a greater sample size and increased power, supports the hypothesis in that additional risk factors were shown to be statistically associated with the more severe outcomes of WNV infection (WNE or death). Based on univariate logistic regression results, even though age of 20–44 years was statistically significant as a protective factor against developing WNE in the original study, it was not significant in the expanded sample. This study showed chronic alcohol abuse to be a significant WNE risk factor, although it was not significant in the original analysis. Other WNE risk factors that were significant in this analysis but not in the original analysis were cancer not in remission > 5 years, history of stroke, and chronic renal disease. When comparing the two analyses with death as an outcome, two risk factors that were significant in the original analysis but not in the expanded dataset analysis were diabetes mellitus and immunosuppression. Three risk factors that were significant in this expanded analysis but not in the original study were illicit drug use, heroin or opiate use, and injection drug use. However, with the multiple logistic regression models, the same independent risk factors for developing encephalitis, age and history of hypertension (including drug-induced hypertension), were consistent in both studies.
Abstract:
Introduction. Food frequency questionnaires (FFQ) are used to study the association between dietary intake and disease. An instructional video may offer a low-cost, practical method of dietary assessment training for participants, thereby reducing recall bias in FFQs. There is little evidence in the literature of the effect of using instructional videos on FFQ-based intake. Objective. This analysis compared the reported energy and macronutrient intake of two groups that were randomized either to watch an instructional video before completing an FFQ or to view the same instructional video after completing the same FFQ. Methods. In the parent study, a diverse group of students, faculty, and staff from Houston Community College were randomized to two groups, stratified by ethnicity, and completed an FFQ. The "video before" group watched an instructional video about completing the FFQ prior to answering the FFQ. The "video after" group watched the instructional video after completing the FFQ. The two groups were compared on mean daily energy (kcal/day), fat (g/day), protein (g/day), carbohydrate (g/day), and fiber (g/day) intakes using descriptive statistics and one-way ANOVA. Demographic, height, and weight information was collected. Dietary intakes were adjusted for total energy intake before the comparative analysis. BMI and age were ruled out as potential confounders. Results. There were no significant differences between the two groups in mean daily dietary intakes of energy, total fat, protein, carbohydrates, and fiber. However, a pattern of higher energy intake and lower fiber intake was reported in the group that viewed the instructional video before completing the FFQ compared to those who viewed the video after. Discussion. Analysis of the difference between reported intake of energy and macronutrients showed an overall pattern, albeit not statistically significant, of higher intake in the video before versus the video after group. Application of instructional videos for dietary assessment may require further research to address the validity of reported dietary intakes in those who are randomized to watch an instructional video before reporting diet compared to a control group that does not view a video.
Abstract:
Introduction. Despite the ban of lead-containing gasoline and paint, childhood lead poisoning remains a public health issue. Furthermore, a Medicaid-eligible child is 8 times more likely to have an elevated blood lead level (EBLL) than a non-Medicaid child, which is the primary reason for the early detection lead screening mandate for ages 12 and 24 months among the Medicaid population. Field observations suggested a screening compliance issue. Objective. The purpose of this study was to analyze blood lead screening compliance in previously lead poisoned Medicaid children and test for an association between timely lead screening and timely childhood immunizations. The mean months between follow-up tests were also examined for a significant difference between the non-compliant and compliant lead screened children. Methods. Access to the surveillance data of all childhood lead poisoned cases in Bexar County was granted by the San Antonio Metropolitan Health District. A database was constructed and analyzed using descriptive statistics, logistic regression methods, and non-parametric tests. Lead screening at 12 months of age was analyzed separately from lead screening at 24 months. The small portion of the population who were related to one another were included in one analysis and removed from a second analysis to check for significance. Gender, ethnicity, age of home, and having a sibling with an EBLL were ruled out as confounders for the association tests, but ethnicity and age of home were adjusted for in the nonparametric tests. Results. There was a strong significant association between lead screening compliance at 12 months and childhood immunization compliance, with or without including related children (p<0.00). However, there was no significant association between the two variables at the age of 24 months. Furthermore, there was no significant difference between the median of the mean months of follow-up blood tests among the non-compliant and compliant lead screened population in the 12-month screening group, but there was a significant difference in the 24-month screening group (p<0.01). Discussion. Descriptive statistics showed that 61% and 56% of the previously lead poisoned Medicaid population did not receive their 12- and 24-month mandated lead screening on time, respectively. This suggests that their elevated blood lead level may have been diagnosed earlier in their childhood. Furthermore, a child who is compliant with their lead screening at 12 months of age is 2.36 times more likely to also receive their childhood immunizations on time compared to a child who was not compliant with their 12-month screening. Even though no statistically significant association was found for the 24-month group, the public health significance of a screening compliance issue is no less important. The Texas Medicaid program needs to enforce lead screening compliance because it is evident that there has been no monitoring system in place. Further recommendations include an increased focus on parental education and on the importance of taking children for wellness exams on time.