2 results for Mining claims
in DigitalCommons@The Texas Medical Center
Abstract:
Academic and industrial research in the late 1990s brought about an explosion of DNA sequence data. Automated expert systems are being created to help biologists extract patterns, trends and links from this ever-deepening ocean of information. Two such systems, aimed at retrieving and subsequently using phylogenetically relevant information, were developed in this dissertation, whose major objective was to automate the often difficult and confusing phylogenetic reconstruction process.

Popular phylogenetic reconstruction methods, such as distance-based methods, attempt to find an optimal tree topology (one that reflects the relationships among related sequences and their evolutionary history) by searching through the topology space. Various compromises between fast (but incomplete) and exhaustive (but computationally prohibitive) search heuristics have been suggested. The second chapter of this dissertation describes an intelligent compromise algorithm that relies on a flexible “beam” search principle from the Artificial Intelligence domain and uses pre-computed local topology reliability information to adjust the beam search space continuously.

However, sometimes even a (virtually) complete distance-based method is inferior to the significantly more elaborate (and computationally expensive) maximum likelihood (ML) method. In fact, depending on the nature of the sequence data in question, either method might prove superior. It is therefore difficult, even for an expert, to tell a priori which phylogenetic reconstruction method (distance-based, ML or perhaps maximum parsimony (MP)) should be chosen for a particular data set.

A number of factors, often hidden, influence the performance of a method. For example, it is generally understood that for a phylogenetically “difficult” data set more sophisticated methods (e.g., ML) tend to be more effective and thus should be chosen.
However, it is the interplay of many factors that one needs to consider in order to avoid choosing an inferior method (potentially a costly mistake, in terms of both computational expense and reconstruction accuracy).

Chapter III of this dissertation details a phylogenetic reconstruction expert system that selects the proper method automatically. It uses a classifier (a Decision Tree-inducing algorithm) to map a new data set to the proper phylogenetic reconstruction method.
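The beam search principle named in this abstract can be illustrated with a minimal, generic sketch. It is not the dissertation's algorithm: the abstract does not say how topologies are represented or scored, so `expand`, `score`, `is_complete` and `beam_width` are hypothetical callbacks supplied by the caller. Passing the beam width as a function of the step number only hints at the idea of adjusting the search space as the search proceeds.

```python
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")  # a search state, e.g. a partially built tree (assumption)


def beam_search(start: S,
                expand: Callable[[S], Iterable[S]],
                score: Callable[[S], float],
                is_complete: Callable[[S], bool],
                beam_width: Callable[[int], int]) -> S:
    """Return the lowest-scoring complete state found by beam search.

    All callbacks are hypothetical placeholders: expand yields successor
    states, score returns a cost (lower is better), is_complete marks
    finished states, and beam_width gives the beam size for each step.
    """
    beam: List[S] = [start]
    step = 0
    while not all(is_complete(s) for s in beam):
        candidates: List[S] = []
        for state in beam:
            if is_complete(state):
                candidates.append(state)          # keep finished states
            else:
                candidates.extend(expand(state))  # grow unfinished ones
        if not candidates:
            raise ValueError("search space exhausted before completion")
        candidates.sort(key=score)
        # Keep only the best few candidates; a width of 1 is greedy search,
        # an unbounded width is exhaustive search.
        beam = candidates[:max(1, beam_width(step))]
        step += 1
    return min(beam, key=score)
```

A wider beam moves the search toward the exhaustive end of the trade-off described above, and a narrower one toward the fast but incomplete end.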
Abstract:
Few recent estimates of childhood asthma incidence exist in the literature, although the importance of incidence surveillance for understanding asthma risk factors has been recognized. Asthma prevalence, morbidity and mortality reports have repeatedly shown that low-income children are disproportionately affected by the disease. The aim of this study was to demonstrate the utility of Medicaid claims data for providing statewide estimates of asthma incidence. Medicaid Analytic Extract (MAX) data for Texas children aged 0-17 enrolled in Medicaid between 2004 and 2007 were used to estimate incidence overall and by age group, gender, race and county of residence. A period of 13 or more months of continuous enrollment was required in order to distinguish incident from prevalent cases identified in the claims data. Age-adjusted incidence of asthma was 4.26/100 person-years during 2005-2007, higher than reported in other populations. Incidence rates decreased with age, were higher for males than females, differed by race, and tended to be higher in rural than urban areas. With this study, we were able to demonstrate the utility of MAX data for estimating asthma incidence and to create a dataset of incident cases for use in further analysis.

In subsequent analyses, we investigated a possible association between ambient air pollutants and incident asthma among Medicaid-enrolled children in Harris County, Texas, between 2005 and 2007. This population is at high risk for asthma and lives in an area with historically poor air quality. We used a time-stratified case-crossover design and conditional logistic regression to calculate odds ratios, adjusted for weather variables and aeroallergens, to assess the effect of increases in ozone, NO2 and PM2.5 concentrations on the risk of developing asthma.
Our results show that a 10 ppb increase in ozone was significantly associated with asthma during the warm season (May-October), with the strongest effect seen when a 6-day cumulative lag period was used to compute the exposure metric (OR=1.05, 95% CI, 1.02–1.08). Similar results were seen for NO2 and PM2.5 (OR=1.07, 95% CI, 1.03–1.11 and OR=1.12, 95% CI, 1.03–1.22, respectively). PM2.5 also had a significant effect in the cold season (November-April) with a 5-day cumulative lag (OR=1.11, 95% CI, 1.00–1.22). Compared with children in the lowest quartile of ozone (O3) exposure, children in the highest quartile had a 20% higher risk. This study indicates that these pollutants are associated with newly diagnosed childhood asthma in this low-income urban population, particularly during the summer months.
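Two calculations mentioned in this abstract can be sketched briefly. The abstract does not state how the time strata were defined, so the referent rule below (same weekday within the same calendar month and year as the case day) is an assumption based on a common convention for time-stratified case-crossover designs, not a description of the study's procedure. Both function names are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def incidence_per_100_person_years(cases: int, person_years: float) -> float:
    """Crude incidence rate: new cases per 100 person-years at risk."""
    return 100.0 * cases / person_years


def referent_days(case_day: date) -> List[date]:
    """Referent (control) days for one case day.

    Assumed stratum: the same calendar month and year as the case day.
    Referents are the other days in that stratum that fall on the same
    weekday, so each case is compared with itself on nearby days.
    """
    days: List[date] = []
    current = case_day.replace(day=1)
    while current.month == case_day.month:
        if current.weekday() == case_day.weekday() and current != case_day:
            days.append(current)
        current += timedelta(days=1)
    return days
```

Under this rule, pollutant levels on the case day are compared with levels on its referent days in the conditional logistic regression, which controls for day-of-week effects and for factors that change slowly from month to month.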