35 results for Multivariate data analysis
Abstract:
To reach the goals established by the Institute of Medicine (IOM) and the Centers for Disease Control's (CDC) STOP TB USA, measures must be taken to curtail a future peak in tuberculosis (TB) incidence and speed the currently stagnant rate of TB elimination. Both efforts will require, at minimum, the consideration and understanding of the third dimension of TB transmission: the location-based spread of an airborne pathogen among persons known and unknown to each other. This consideration will require an elucidation of the areas within the U.S. that have endemic TB. The Houston Tuberculosis Initiative (HTI) was a population-based active surveillance of confirmed Houston/Harris County TB cases from 1995–2004. Strengths of this dataset include the molecular characterization of laboratory-confirmed cases, the collection of geographic locations (including home addresses) frequented by cases, and an HTI time period that parallels a decline in TB incidence in the United States (U.S.). The HTI dataset was used in this secondary data analysis to implement a GIS analysis of TB cases, the locations frequented by cases, and their association with risk factors for TB transmission.
This study reports, for the first time, the incidence of TB among the homeless in Houston, Texas. The homeless are an at-risk population for TB disease, yet they are also a population whose TB incidence has been unknown and unreported because they have not been enumerated. The first section of this dissertation identifies local areas in Houston with endemic TB disease. Many Houston TB cases who reported living in these endemic areas also share the TB risk factor of current or recent homelessness. Merging the 2004–2005 Houston enumeration of the homeless with historical HTI surveillance data of TB cases in Houston enabled this first-time report of TB risk among the homeless in Houston. 
The homeless were more likely to be US-born, belong to a genotypic cluster, and belong to a cluster of a larger size. The calculated average incidence among homeless persons was 411/100,000, compared to 9.5/100,000 among housed persons. These alarming rates are not driven by a co-infection but by social determinants. Unsheltered persons were hospitalized for more days and required more follow-up time by staff than those who reported a steady housing situation. The homeless are a specific example of the increased targeting of prevention dollars that could occur if TB rates were reported for specific areas with known health disparities rather than as a generalized rate normalized over a diverse population.
It has been estimated that 27% of Houstonians use public transportation. The city layout allows bus routes to run like veins connecting even the most diverse of populations within the metropolitan area. In a secondary data analysis, frequent bus use (defined as riding a route weekly) among TB cases was assessed for its relationship with known TB risk factors. The spatial distribution of genotypic clusters associated with bus use was assessed, along with the reported routes and epidemiologic links among cases belonging to the identified clusters.
TB cases who reported frequent bus use were more likely to have demographic and social risk factors associated with poverty, immune suppression and health disparities. An equal proportion of bus riders and non-bus riders were cultured for Mycobacterium tuberculosis, yet 75% of bus riders were genotypically clustered, indicating recent transmission, compared to 56% of non-bus riders (OR = 2.4, 95% CI (2.0, 2.8), p < 0.001). Bus riders had a mean cluster size of 50.14 vs. 28.9 for non-bus riders (p < 0.001). Second-order spatial analysis of clustered fingerprint 2 (n = 122), a Beijing family cluster, revealed geographic clustering among cases based on their report of bus use. 
Univariate and multivariate analysis of routes reported by cases belonging to these clusters found that 10 of the 14 clusters were associated with bus use. Individual Metro routes, including one route servicing the local hospitals, were found to be risk factors for belonging to a cluster shown to be endemic in Houston. The routes themselves geographically connect the census tracts previously identified as having endemic TB. Of the Houston Metro routes investigated, 78% (15/23) had one or more print groups reporting frequent use for every HTI study year. We present data on three specific but clonally related print groups and show that bus use is clustered in time by route and is the only known link between cases in one of the three prints: print 22. (Abstract shortened by UMI.)
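As a reading aid for the effect sizes quoted in this abstract, the short sketch below recomputes a crude odds ratio from the two rounded clustering proportions (75% vs. 56%) and a crude incidence rate ratio from the two reported incidence figures (411 vs. 9.5 per 100,000). The function name `odds_ratio` is a hypothetical helper, and the calculation is only an approximation from rounded percentages: the abstract does not say whether its OR was crude or adjusted, and the case counts needed for the confidence interval are not given.

```python
def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Crude odds ratio from two outcome proportions (each strictly between 0 and 1)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


# Proportions genotypically clustered, as reported in the abstract (rounded).
crude_or = odds_ratio(0.75, 0.56)

# Incidence per 100,000 among homeless vs. housed persons, as reported.
rate_ratio = 411 / 9.5

print(f"crude OR from rounded proportions: {crude_or:.2f}")
print(f"crude incidence rate ratio: {rate_ratio:.1f}")
```

The crude odds ratio computed this way rounds to the reported value of 2.4, which suggests the published figure is at least consistent with the quoted percentages.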
Abstract:
These three manuscripts are presented as a PhD dissertation on the use of GeoVis applications to evaluate telehealth programs. The primary aim of this research was to understand how GeoVis applications can be designed and developed using a combination of the Human Centered (HC) approach and Cognitive Fit Theory (CFT), and in turn used to evaluate a telehealth program in Brazil.
First manuscript: The first manuscript in this dissertation presents background on the use of GeoVisualization to facilitate visual exploration of public health data. It covers the challenges associated with the adoption of existing GeoVis applications. It combines the principles of the Human Centered approach and Cognitive Fit Theory into a framework that lays the foundation of this research. The framework is then used to propose the design, development and evaluation of “the SanaViz” to evaluate telehealth data in Brazil, as a proof of concept.
Second manuscript: The second manuscript is a methods paper that describes the approaches that can be employed to design and develop “the SanaViz” based on the proposed framework. After the various elements of the HC approach and CFT are defined, a mixed-methods approach is applied using card sorting and sketching techniques. A representative sample of 20 study participants currently involved in the telehealth program at the NUTES telehealth center at UFPE, Recife, Brazil was enrolled. The findings of this manuscript helped us understand the needs of the diverse group of telehealth users and the tasks that they perform, and helped us determine the essential features that might need to be included in the proposed GeoVis application “the SanaViz”. 
Third manuscript: The third manuscript used a mixed-methods approach to compare the effectiveness and usefulness of the HC GeoVis application “the SanaViz” against a conventional GeoVis application, “Instant Atlas”. The same group of 20 study participants who had earlier participated during Aim 2 was enrolled, and a combination of quantitative and qualitative assessments was carried out. Effectiveness was gauged by the time that the participants took to complete the tasks using both GeoVis applications, the ease with which they completed the tasks and the number of attempts taken to complete each task. Usefulness was assessed with the System Usability Scale (SUS), a validated questionnaire tested in prior studies. In-depth interviews were conducted to gather opinions about both GeoVis applications. This manuscript helped us demonstrate the usefulness and effectiveness of HC GeoVis applications in facilitating visual exploration of telehealth data, as a proof of concept. Together, these three manuscripts address the challenges of combining the principles of the Human Centered approach and Cognitive Fit Theory to design and develop GeoVis applications as a method to evaluate telehealth data. To our knowledge, this is the first study to explore the usefulness and effectiveness of GeoVis to facilitate visual exploration of telehealth data. The results of the research enabled us to develop a framework for the design and development of GeoVis applications related to public health and especially telehealth. The results of our study showed that varied users were involved with the telehealth program and identified the tasks that they performed. Further, they enabled us to identify the components that might be essential to include in these GeoVis applications. 
The results of our research yielded the following findings: (a) Telehealth users vary in their level of understanding of GeoVis. (b) Interaction features such as zooming, sorting, linking and multiple views, and representation features such as bar charts and choropleth maps, were considered the most essential features of the GeoVis applications. (c) Comparing and sorting were two important tasks that the telehealth users would perform for exploratory data analysis. (d) An HC GeoVis prototype application is more effective and useful for exploration of telehealth data than a conventional GeoVis application. Future studies should incorporate the proposed HC GeoVis framework to enable comprehensive assessment of the users and the tasks they perform, in order to identify the features that might need to be part of the GeoVis applications. The results of this study demonstrate a novel approach to comprehensively and systematically enhance the evaluation of telehealth programs using the proposed GeoVis framework.
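The abstract names the System Usability Scale (SUS) as its usefulness measure but does not describe the scoring. A minimal sketch of the standard SUS scoring rule follows, assuming the usual ten-item questionnaire with responses on a 1 to 5 scale; the function name `sus_score` and the example responses are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten responses on a 1-5 scale.

    Odd-numbered items are positively worded and contribute (response - 1);
    even-numbered items are negatively worded and contribute (5 - response).
    The summed contributions (0-40) are multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical participant who answers "neutral" (3) on every item.
neutral = sus_score([3] * 10)
# Hypothetical participant with the most favourable answer on every item.
best = sus_score([5, 1] * 5)
print(neutral, best)
```

Comparing two applications, as the third manuscript does, would then amount to computing one such score per participant per application and comparing the two sets of scores.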
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some conventional approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their capabilities and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data. 
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis) and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects of the interacting factors must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the considered situations. The proposed models are applied to real data on gene and environment interactions from lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models have the advantage of allowing useful prior information to be incorporated in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases. 
Our proposed models impose hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating the causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model. 
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
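The 'strong', 'weak' and 'independent' hierarchy rules described in this abstract can be illustrated with a small sketch of the inclusion constraint alone. This is not the authors' Bayesian mixture model: it only shows which pairwise interactions each rule would permit, given a set of main effects already in the model. It reads 'weak' as "at least one main effect present", which is an interpretation of the abstract's wording; the function names and the factor labels `G1`, `G2` and `E` are hypothetical.

```python
from itertools import combinations


def interaction_allowed(a_included: bool, b_included: bool, rule: str) -> bool:
    """Return True if an interaction between factors a and b may enter the model."""
    if rule == "strong":       # both main effects must be present
        return a_included and b_included
    if rule == "weak":         # at least one main effect must be present
        return a_included or b_included
    if rule == "independent":  # no hierarchical constraint
        return True
    raise ValueError(f"unknown rule: {rule}")


def candidate_interactions(included_mains, all_factors, rule):
    """List the pairwise interactions permitted under the given hierarchy rule."""
    return [
        (a, b)
        for a, b in combinations(all_factors, 2)
        if interaction_allowed(a in included_mains, b in included_mains, rule)
    ]


factors = ["G1", "G2", "E"]   # hypothetical: two genes and one exposure
included = {"G1"}             # only the G1 main effect is in the model

strong_pairs = candidate_interactions(included, factors, "strong")
weak_pairs = candidate_interactions(included, factors, "weak")
independent_pairs = candidate_interactions(included, factors, "independent")
print(strong_pairs, weak_pairs, independent_pairs)
```

With a single main effect included, the strong rule admits no interactions, the weak rule admits only those involving `G1`, and the independent rule admits all three pairs, which is the sense in which the hierarchical rules shrink the space of candidate interactions.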
Abstract:
The current study is a secondary data analysis of a prospective cohort study that examined demographic and psychosocial variables and their associations with physical activity levels in Mexican-American adolescents in Houston, Texas. Body image, subjective social status, and anxiety were the main variables of interest. The sample included 952 unrelated Mexican-American adolescents. The majority (84.2%) of the study population did not meet physical activity standards prescribed by the CDC.
In a multivariate model controlling for age, socioeconomic status, gender, general body image, preferred body image, subjective social status, and anxiety, gender and subjective social status were found to be the strongest determinants of physical activity levels. Males and those with a high subjective social status were more likely to participate in physical activity than those with low subjective social status. Lower levels of anxiety and a more positive body image were also found to be associated with higher levels of physical activity. In multivariate analyses, gender and subjective social status showed the strongest associations with physical activity.
Abstract:
The purpose of this study was to understand the scope of breast cancer disparities within the Texas Medical Center. The goal was to increase the awareness of breast cancer disparities at the health care organization level, and to foster the development of organizational interventions to reduce breast cancer disparities. The study seeks to answer the following questions: 1. Are hospitals in the Texas Medical Center implementing interventions to reduce breast cancer disparities? 2. What are their interventions for reducing the effects of nonclinical factors on breast cancer treatment disparities? 3. What are their measures for monitoring, continuously improving, and evaluating the success of their interventions?
This research project was designed as a mixed-methods case study. Quantitative breast cancer data for the years 2000-2009 were obtained from the Texas Cancer Registry (TCR). Qualitative data were collected through a total of 20 semi-structured interviews with administrators, physicians and nurses at five hospitals (A, B, C, D and E) in the Texas Medical Center (TMC). For quantitative analysis, the study was limited to early stage breast cancer patients: local and regional. The dependent variable was receipt of standard treatment: Surgery (Yes/No), BCS vs Mastectomy, Chemotherapy (Yes/No) and Radiation after BCS (Yes/No). The main independent variable was race: non-Hispanic White (NHW), non-Hispanic Black (NHB), and Hispanic. Other covariates included age at diagnosis, diagnosis date, percent poverty, grade, stage, and regional nodes. Multivariate logistic regression was used to test the adjusted association between receipt of standard care and race. Qualitative data were analyzed with the ATLAS.ti 7 software (ATLAS.ti GmbH, Berlin). 
Though there were significant differences by race for all dependent variables when the data were analyzed as a single group of all hospitals, at the level of the individual hospitals the results were not consistent by race/ethnicity across all dependent variables for hospitals A, B, and E. There were no racial differences in adjusted analysis for receipt of chemotherapy for the individual hospitals of interest in this study. For hospitals C and D, no racial disparities in treatment were observed in adjusted multivariable analysis. All organizations in this study were aware of the body of research which shows that there are disparities in breast cancer outcomes for patient population groups. However, qualitative data analysis found that there were differences in interest among hospitals in addressing breast cancer disparities in their patient population groups. Some organizations were actively implementing directed measures to reduce the breast cancer disparity gap in outcomes for patients, and others were not. Despite the differences in levels of interest, quantitative data analysis showed that organizations in the Texas Medical Center were making progress in reducing the burden of breast cancer disparities in the patient populations being served.