8 results for databases and data mining
in DigitalCommons@The Texas Medical Center
Abstract:
Academic and industrial research in the late 1990s has brought about an exponential explosion of DNA sequence data. Automated expert systems are being created to help biologists extract patterns, trends and links from this ever-deepening ocean of information. Two such systems, aimed at retrieving and subsequently utilizing phylogenetically relevant information, have been developed in this dissertation, the major objective of which was to automate the often difficult and confusing phylogenetic reconstruction process.
Popular phylogenetic reconstruction methods, such as distance-based methods, attempt to find an optimal tree topology (one that reflects the relationships among related sequences and their evolutionary history) by searching through the topology space. Various compromises between the fast (but incomplete) and exhaustive (but computationally prohibitive) search heuristics have been suggested. An intelligent compromise algorithm that relies on a flexible “beam” search principle from the Artificial Intelligence domain and uses the pre-computed local topology reliability information to adjust the beam search space continuously is described in the second chapter of this dissertation.
However, sometimes even a (virtually) complete distance-based method is inferior to the significantly more elaborate (and computationally expensive) maximum likelihood (ML) method. In fact, depending on the nature of the sequence data in question, either method might prove to be superior. Therefore, it is difficult (even for an expert) to tell a priori which phylogenetic reconstruction method, whether distance-based, ML or maybe maximum parsimony (MP), should be chosen for any particular data set.
A number of factors, often hidden, influence the performance of a method. For example, it is generally understood that for a phylogenetically “difficult” data set more sophisticated methods (e.g., ML) tend to be more effective and thus should be chosen. However, it is the interplay of many factors that one needs to consider in order to avoid choosing an inferior method (potentially a costly mistake, both in terms of computational expense and in terms of reconstruction accuracy).
Chapter III of this dissertation details a phylogenetic reconstruction expert system that selects the proper method automatically. It uses a classifier (a Decision Tree-inducing algorithm) to map a new data set to the proper phylogenetic reconstruction method.
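The beam search principle mentioned in this abstract can be illustrated with a minimal, generic sketch. It is not the dissertation's algorithm: the toy search tree, the scores, and the names `beam_search`, `expand`, `score`, `is_complete`, and `beam_width` are all hypothetical, and the step-dependent `beam_width` callback only stands in for the reliability-driven adjustment of the search space that the abstract describes.

```python
from typing import Callable, Dict, Iterable, List, Tuple, TypeVar

S = TypeVar("S")


def beam_search(
    start: S,
    expand: Callable[[S], Iterable[S]],
    score: Callable[[S], float],
    is_complete: Callable[[S], bool],
    beam_width: Callable[[int], int],
) -> S:
    """Keep only the best `beam_width(step)` partial solutions at each step."""
    beam: List[S] = [start]
    step = 0
    while not all(is_complete(state) for state in beam):
        candidates: List[S] = []
        for state in beam:
            if is_complete(state):
                candidates.append(state)  # finished states stay in the running
            else:
                candidates.extend(expand(state))
        if not candidates:
            break  # nothing left to expand; keep the current beam
        candidates.sort(key=score, reverse=True)
        beam = candidates[: max(1, beam_width(step))]
        step += 1
    return max(beam, key=score)


# Hypothetical toy problem: the locally best first move ("A") leads to a
# worse final result than the locally second-best move ("B").
State = Tuple[str, ...]
TOY_TREE: Dict[State, List[State]] = {
    (): [("A",), ("B",)],
    ("A",): [("A", "C")],
    ("B",): [("B", "D")],
}
TOY_SCORES: Dict[State, float] = {
    (): 0.0,
    ("A",): 5.0,
    ("B",): 4.0,
    ("A", "C"): 6.0,
    ("B", "D"): 14.0,
}


def expand(state: State) -> List[State]:
    return TOY_TREE.get(state, [])


def score(state: State) -> float:
    return TOY_SCORES[state]


def is_complete(state: State) -> bool:
    return state not in TOY_TREE


greedy = beam_search((), expand, score, is_complete, lambda step: 1)
wider = beam_search((), expand, score, is_complete, lambda step: 2)
print(greedy, wider)
```

With a beam of width 1 the search behaves greedily and ends at `("A", "C")`; widening the beam to 2 keeps the weaker-looking branch alive long enough to reach `("B", "D")`. This is the trade-off between fast and exhaustive search that the abstract refers to.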
Abstract:
People often use tools to search for information. To improve the quality of an information search, it is important to understand how internal information, which is stored in the user’s mind, and external information, which is represented by the interface of the tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects the user’s search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task on relational data. Based on the model and taxonomy, I have also developed prototype interfaces for search tasks on relational data. These prototypes were used in experiments. The experiments described in this study use a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and performed one-dimensional nominal, ordinal, interval, and ratio search tasks over table and graph displays. Participants also performed the same task and display combinations for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance of relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors in determining search efficiency and effectiveness. In particular, the more external representations are used, the better the search task performance, and the results suggest that ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.
Abstract:
Background: The US has higher rates of teen births and sexually transmitted infections (STI) than other developed countries. Texas youth are disproportionately impacted. Purpose: To review local, state, and national data on teens’ engagement in sexual risk behaviors to inform policy and practice related to teen sexual health. Methods: 2009 middle school and high school Youth Risk Behavior Survey (YRBS) data, and data from All About Youth, a middle school study conducted in a large urban school district in Texas, were analyzed to assess the prevalence of sexual initiation, including the initiation of non-coital sex, and the prevalence of sexual risk behaviors among Texas and US youth. Results: A substantial proportion of middle and high school students are having sex. Sexual initiation begins as early as 6th grade and increases steadily through 12th grade with almost two-thirds of high school seniors being sexually experienced. Many teens are not protecting themselves from unintended pregnancy or STIs – nationally, 80% and 39% of high school students did not use birth control pills or a condom respectively the last time they had sex. Many middle and high school students are engaging in oral and anal sex, two behaviors which increase the risk of contracting an STI and HIV. In Texas, an estimated 689,512 out of 1,327,815 public high school students are sexually experienced – over half (52%) of the total high school population. Texas students surpass their US peers in several sexual risk behaviors including number of lifetime sexual partners, being currently sexually active, and not using effective methods of birth control or dual protection when having sex. They are also less likely to receive HIV/AIDS education in school. 
Conclusion: Changes in policy and practice, including implementation of evidence-based sex education programs in middle and high schools and increased access to integrated, teen-friendly sexual and reproductive health services, are urgently needed at the state and national levels to address these issues effectively.
Abstract:
Cardiovascular disease (CVD) is highly preventable, yet it is a leading cause of death among women in Texas. The primary goals of this research were to examine past and current trends of CVD and to identify whether there is an association between insurance coverage and mortality from CVD among women aged 60–65 in Texas between 2000 and 2011.
The systematic review follows the guidelines and recommendations set by the Centre for Reviews and Dissemination for conducting reviews in health care. Over 47 citations of peer-reviewed articles from the Ovid MEDLINE and PubMed databases and five websites were identified, of which 7 studies met inclusion criteria for the first systematic review, which examined the trends of CVD in Texas. Ten citations of peer-reviewed articles from the Ovid MEDLINE and PubMed databases and five websites were reviewed for the second systematic review (which studied the association between insurance coverage and cardiovascular health among Texas women 60–64 years of age), of which 3 studies met inclusion criteria and were included in the research. The results of the study highlighted key gaps in the existing literature and important areas for further research, and determined directions for future public health CVD prevention programs in Texas.
Based on the conducted research, the major determinants of premature mortality among women attributed to cardiovascular disease are individual-level characteristics, specifically sex, age, race/ethnicity, and education. The results indicate that African American and non-Hispanic white women are more likely to have higher CVD mortality rates than Hispanic women due to a higher prevalence of cardiac risk factors. The data also show higher levels of mortality from CVD in the southeastern United States, with Texas ranking third among states in the prevalence of CVD among women. According to the Texas Department of State Health Services, there are approximately 56,000 deaths caused by CVD annually in Texas, which represents about one death every ten minutes. Coronary artery disease and stroke were the causes of 31.2 percent of all female deaths in Texas in 2009, meaning that approximately 68 women die from some form of cardiac disease in Texas each day.
The data of the reviewed studies indicate that women's lack of health insurance was significantly associated with a higher prevalence of cardiovascular disease. Uninsured women were more likely to be unaware of their risk factors and more likely to have undiagnosed diabetes, a comorbidity factor of CVD. One of the studies also reports a strong correlation between state rates of uninsurance and lower rates of preventive care. Given these strong correlations, those who were chronically uninsured were at a higher risk of mortality than the insured, due to prolonged periods of time without basic access to preventive and medical care.
Suggested recommendations to decrease CVD mortality rates in Texas are consistent with the existing literature and include state policy development that addresses the elimination of health disparities, consideration by legislative policymakers of the potential benefits of universal health coverage, and maintenance of solid partnerships between public health agencies and hospitals to educate on, diagnose, and treat CVD among the female population in Texas.
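The two rate statements quoted in this abstract (56,000 CVD deaths per year, and 68 female cardiac deaths per day at 31.2 percent of all female deaths) can be sanity-checked with simple arithmetic. The sketch below uses only those quoted figures; the function names are hypothetical, and the annual totals it derives are implied by the quoted numbers rather than reported by the source.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def minutes_between_events(events_per_year: float) -> float:
    """Average gap in minutes between events spread evenly over a year."""
    return MINUTES_PER_YEAR / events_per_year


def events_per_year_from_daily(events_per_day: float) -> float:
    """Annual count implied by an average daily count."""
    return events_per_day * 365


# Quoted figure: about 56,000 CVD deaths per year in Texas.
gap = minutes_between_events(56_000)
print(f"One CVD death every {gap:.1f} minutes")

# Quoted figures: about 68 female cardiac deaths per day, 31.2% of female deaths.
implied_cardiac = events_per_year_from_daily(68)
implied_all_female = implied_cardiac / 0.312  # derived, not a reported value
print(f"Implied female cardiac deaths per year: {implied_cardiac:.0f}")
print(f"Implied total female deaths per year: {implied_all_female:.0f}")
```

The first result, roughly 9.4 minutes, is consistent with the abstract's "about one death every ten minutes".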
Abstract:
Background. The United Nations' Millennium Development Goal (MDG) 4 aims for a two-thirds reduction in death rates for children under the age of five by 2015. The greatest risk of death is in the first week of life, yet most of these deaths can be prevented by such simple interventions as improved hygiene, exclusive breastfeeding, and thermal care. In Nigeria, deaths that occur in the first month of life make up 28% of all deaths under five years, a statistic that has remained unchanged despite various child health policies. This paper addresses the challenges of reducing the neonatal mortality rate in Nigeria by examining the literature on the efficacy of home-based newborn care interventions and policies that have been implemented successfully in India.
Methods. I compared similarities and differences between India and Nigeria using qualitative descriptions and available quantitative data on various health indicators. The analysis included identifying policy-related factors and community approaches contributing to India's newborn survival rates. Databases and reference lists of articles were searched for randomized controlled trials of community health worker interventions shown to reduce neonatal mortality rates.
Results. While it appears that Nigeria spends more money than India on health per capita ($136 vs. $132, respectively) and as a percentage of GDP (5.8% vs. 4.2%, respectively), it still lags behind India in its neonatal, infant, and under-five mortality rates (40 vs. 32 deaths/1000 live births, 88 vs. 48 deaths/1000 live births, and 143 vs. 63 deaths/1000 live births, respectively). Both countries have comparably low numbers of healthcare providers. Unlike their counterparts in Nigeria, Indian community health workers receive training on how to deliver postnatal care in the home setting and are monetarily compensated. Gender-related power differences still play a role in the societal structure of both countries. A search for randomized controlled trials of home-based newborn care strategies yielded three relevant articles. Community health workers trained to educate mothers and provide a preventive package of interventions involving clean cord care, thermal care, breastfeeding promotion, and danger sign recognition during multiple postnatal visits in rural India, Bangladesh, and Pakistan reduced neonatal mortality rates by 54%, 34%, and 15–20%, respectively.
Conclusion. Access to advanced technology is not necessary to reduce neonatal mortality rates in resource-limited countries. To address the urgency of neonatal mortality, countries with weak health systems need to start at the community level and invest in cost-effective, evidence-based newborn care interventions that utilize available human resources. While more randomized controlled studies are urgently needed, the currently available evidence on models of postnatal care provision demonstrates that home-based care and health education provided by community health workers can reduce neonatal mortality rates in the immediate future.
Abstract:
Intensive family preservation services (IFPS), designed to stabilize at-risk families and avert out-of-home care, have been the focus of many randomized, experimental studies. Employing a retrospective “clinical data-mining” (CDM) methodology (Epstein, 2001), this study makes use of available information extracted from client records in one IFPS agency over the course of two years. The primary goal of this descriptive and associational study was to gain a clearer understanding of IFPS service delivery and effectiveness. Interventions provided to families are delineated and assessed for their impact on improved family functioning, their impact on the reduction of family violence, as well as placement prevention. Findings confirm the use of a wide range of services consistent with IFPS program theory. Because the study employs a quasi-experimental, retrospective use of available information, clinical outcomes described cannot be causally attributed to interventions employed as with randomized controlled trials. With regard to service outcomes, findings suggest that family education, empowerment services and advocacy are most influential in placement prevention and in ameliorating unmanageable behaviors in children as well as the incidence of family violence.
Abstract:
Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, some models have been developed using published exposure databases with their corresponding exposure determinants. These models are designed to be applied to reported exposure determinants obtained from study subjects or to exposure levels assigned by an industrial hygienist, so that quantitative exposure estimates can be obtained.
In an effort to improve the prediction accuracy and generalizability of these models, and taking into account that the limitations encountered in previous studies might be due to limitations in the applicability of traditional statistical methods and concepts, the use of computer science-derived data analysis methods, predominantly machine learning approaches, was proposed and explored in this study.
The goal of this study was to develop a set of models using decision trees/ensemble and neural network methods to predict occupational outcomes based on literature-derived databases, and to compare, using cross-validation and data splitting techniques, the resulting prediction capacity to that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines, and the continuous case, where the result of the exposure is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1-trichloroethane, methylene dichloride, and trichloroethylene were used.
When compared to regression estimations, results showed better accuracy of decision trees/ensemble techniques for the categorical case, while neural networks were better for estimation of continuous exposure values. Overrepresentation of classes and overfitting were the main causes of poor neural network performance and accuracy. Estimations based on literature-based databases using machine learning techniques might provide an advantage when they are applied to other methodologies that combine “expert inputs” with current exposure measurements, like the Bayesian Decision Analysis tool. The use of machine learning techniques to more accurately estimate exposures from literature-based exposure databases might represent the starting point for independence from expert judgment.
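The cross-validation comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the toy data are invented for illustration, the two models (a mean-only baseline and a one-variable least-squares line) are simple stand-ins for the decision tree, neural network, and regression models the study actually compared, and the names `k_fold_indices`, `fit_mean`, `fit_line`, and `cross_val_mse` are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence

Model = Callable[[float], float]
Fitter = Callable[[Sequence[float], Sequence[float]], Model]


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k interleaved, non-overlapping folds."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Baseline: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cross_val_mse(fit: Fitter, xs: Sequence[float], ys: Sequence[float], k: int) -> float:
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_errors.append(mean([(model(xs[i]) - ys[i]) ** 2 for i in test_idx]))
    return mean(fold_errors)


# Invented toy data with an exactly linear relationship.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

mse_mean = cross_val_mse(fit_mean, xs, ys, k=5)
mse_line = cross_val_mse(fit_line, xs, ys, k=5)
print(f"baseline MSE: {mse_mean:.3f}, line MSE: {mse_line:.3f}")
```

Because every model is scored only on data it was not fitted to, the comparison penalizes overfitting, which the abstract names as one cause of poor neural network performance.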