33 results for Project analysis
Abstract:
Children who experience early pubertal development have an increased risk of developing cancer (breast, ovarian, and testicular), osteoporosis, insulin resistance, and obesity as adults. Early pubertal development has been associated with depression, aggressiveness, and increased sexual prowess. Possible explanations for the decline in age of pubertal onset include genetics, exposure to environmental toxins, better nutrition, and a reduction in childhood infections. In this study we (1) evaluated the association between 415 single nucleotide polymorphisms (SNPs) from hormonal pathways and early puberty, defined as menarche prior to age 12 in females and Tanner Stage 2 development prior to age 11 in males, and (2) measured endocrine hormone trajectories (estradiol, testosterone, and DHEAS) in relation to age, race, and Tanner Stage in a cohort of children from Project HeartBeat! At the end of the 4-year study, 193 females had onset of menarche and 121 males had pubertal staging at age 11. African American females had a younger mean age at menarche than Non-Hispanic White females. African American females and males had a lower mean age at each pubertal stage (1-5) than Non-Hispanic White females and males. African American females had higher mean BMI measures at each pubertal stage than Non-Hispanic White females. Of the 415 SNPs evaluated in females, 22 SNPs were associated with early menarche when adjusted for race (p<0.05), but none remained significant after adjusting for multiple testing by False Discovery Rate (p<0.00017). In males, 17 SNPs were associated with early pubertal development when adjusted for race (p<0.05), but none remained significant when adjusted for multiple testing (p<0.00017).
There were 4955 hormone measurements taken during the 4-year study period from 632 African American and Non-Hispanic White males and females.
On average, African American females started and ended the pubertal process at a younger age than Non-Hispanic White females. The mean age of Tanner Stage 2 breast development in African American and Non-Hispanic White females was 9.7 (S.D.=0.8) and 10.2 (S.D.=1.1) years, respectively. There was a significant difference by race in mean age for each pubertal stage, except Tanner Stage 1 for pubic hair development. Both estradiol and DHEAS levels in females varied significantly with age, but not by race. Estradiol and DHEAS levels increased from Tanner Stage 1 to Tanner Stage 5.
African American males had a lower mean age at each Tanner Stage of development than Non-Hispanic White males. The mean age of Tanner Stage 2 genital development in African American and Non-Hispanic White males was 10.5 (S.D.=1.1) and 10.8 (S.D.=1.1) years, respectively, but this difference was not significant (p=0.11). Testosterone levels varied significantly with age and race. Non-Hispanic White males had higher levels of testosterone than African American males from Tanner Stages 1-4. Testosterone levels increased for both races from Tanner Stage 1 to Tanner Stage 5. Testosterone levels had the steepest increase from ages 11-15 for both races. DHEAS levels in males varied significantly with age, but not by race. DHEAS levels had the steepest increase from ages 14-17.
In conclusion, African American males and females experience pubertal onset at a younger age than Non-Hispanic White males and females, but in this study, we could not find a specific gene that explained the observed variation in age of pubertal onset. Future studies with larger study populations may provide a better understanding of the contribution of genes in early pubertal onset.
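The abstract above reports that SNPs nominally significant at p<0.05 did not survive a False Discovery Rate adjustment. It does not state which FDR procedure was used, so the following is only a minimal sketch assuming the common Benjamini-Hochberg step-up rule. The function name `benjamini_hochberg` and all p-values are hypothetical illustrations, not data from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a test is rejected at FDR level alpha.

    Assumed procedure: Benjamini-Hochberg step-up (not confirmed by the abstract).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values: only the two smallest survive the adjustment.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example))

# Hypothetical scenario shaped like the abstract: 415 tests, 22 of them
# nominally significant at p = 0.01, none of which survive the adjustment.
snp_like = [0.01] * 22 + [0.5] * 393
print(sum(benjamini_hochberg(snp_like)))  # prints 0
```

With 415 tests, the smallest p-value would need to fall below roughly 0.05/415 to be rejected at rank 1 under this rule, which illustrates why modest nominal associations can disappear after adjustment.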
Abstract:
This study is a secondary analysis of merged data on emergency room visits and daily ozone and PM2.5 levels. Although the adverse health effects of ozone and fine particulate matter have been documented in the literature, evidence regarding the health risks of these two pollutants in Harris County, Texas, is limited. Harris County (Houston) is distinctive enough, given its ozone and industry issues, that analyzing these relationships in this setting is informative. The objective of this study was to investigate the association between the joint exposure to ozone and fine particulate matter, and emergency room diagnoses of chronic obstructive pulmonary disease and cardiovascular disease in Harris County, Texas, from 2004 to 2009, with zero and one day lags.
The study variables were daily emergency room visits for Harris County, Texas, from 2004 to 2009, temperature, relative humidity, east wind component, north wind component, ozone, and fine particulate matter. Information about each patient's age, race, and gender was also included. The two dichotomous outcomes were emergency room visit diagnoses for chronic obstructive pulmonary disease and cardiovascular disease. Estimates of ozone and PM2.5 were interpolated using kriging, in which estimates of the two pollutants were predicted from monitoring data for every case residence zip code for every day of the six years, over 3 million estimates (one of each pollutant for each case in the database).
Logistic regressions were conducted to estimate odds ratios of the two outcomes. Three analyses were conducted: one for all records, another for visits during the four months of April and September of 2005 and 2009, and a third one for visits from zip codes that are close to PM2.5 monitoring stations (east area of Harris County). The last two analyses were designed to investigate special temporal and spatial characteristics of the associations.
The dataset included all ER visits surveyed by Safety Net from 2004 to 2009, exceeding 3 million visits for all causes. There were 95,765 COPD and 96,596 CVD cases during this six-year period. A 1-μg/m3 increase in PM2.5 on the same day was associated with a 1.0% increase in the odds of chronic obstructive pulmonary disease emergency room diagnoses, a 0.4% increase in the odds of cardiovascular disease emergency room diagnoses, and a 0.2% increase in the odds of cardiovascular disease emergency room diagnoses on the following day. A 1-ppb increase in ozone was associated with a 0.1% increase in the odds of chronic obstructive pulmonary disease emergency room diagnoses on the same day. These four percentages add up to 1.7% of ER visits. That is, over the six-year period, a one-unit increase in both ozone and PM2.5 (joint increase) resulted in about 55,286 (3,252,102 × 0.017) extra ER visits for CVD or COPD, or 9,214 extra ER visits per year.
After adjustment for age, race, gender, day of the week, temperature, relative humidity, east wind component, north wind component, and wind speed, there were statistically significant associations between emergency room chronic obstructive pulmonary disease diagnosis in Harris County, Texas, and joint exposure to ozone and fine particulate matter for the same day; and between emergency room cardiovascular disease diagnosis and exposure to PM2.5 of the same day and the previous day.
Despite the small association between the two air pollutants and the health outcomes, this study points to important findings, namely the need to identify reasons for the increase in CVD and COPD ER visits over the course of the project, the statistical association between humidity (or whatever other variables for which it may serve as a surrogate) and CVD and COPD cases, and the confirmatory finding that males and blacks have higher odds for the two outcomes, consistent with other studies.
An important finding of this research is that the number and distribution of PM2.5 monitors in Harris County, although not evenly spaced geographically, are adequate to detect a significant association between exposure and the two outcomes. In addition, this study points to other potential factors that contribute to the rising incidence rates of CVD and COPD ER visits in Harris County, such as population increases, patient history, lifestyle, and other pollutants. Finally, results of validation using a subset of the data demonstrate the robustness of the models.
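The extra-visit figures quoted in the abstract above can be reproduced with a few lines of arithmetic. The sketch below mirrors the abstract's own back-of-envelope calculation, which treats the summed percentage increases in odds as a share of all ER visits. The variable names are illustrative assumptions; the numbers are taken from the abstract.

```python
from decimal import Decimal

# Percentage increases in odds reported in the abstract.
pct_increases = [
    Decimal("1.0"),  # COPD, PM2.5, same day
    Decimal("0.4"),  # CVD, PM2.5, same day
    Decimal("0.2"),  # CVD, PM2.5, following day
    Decimal("0.1"),  # COPD, ozone, same day
]
combined_pct = sum(pct_increases)  # 1.7 (percent)

total_visits = 3252102  # all-cause ER visits over the study period
years = 6               # 2004 to 2009

# The abstract's simplification: apply the combined percentage to all visits.
extra_visits = round(total_visits * combined_pct / 100)
per_year = round(Decimal(extra_visits) / years)

print(combined_pct, extra_visits, per_year)
```

Decimal arithmetic is used here only so that the four percentages sum to exactly 1.7 without floating-point noise.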
Abstract:
The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one value per variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU," provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series-based models are infeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit," presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that using the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis.
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy compared with the baseline multivariate model, but diminished classification accuracy compared with adding only the trend analysis features (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
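The abstract above contrasts a single point-in-time value with a trend derived from a window of readings, but it does not specify which trend analysis was used. As a minimal sketch, assuming an ordinary least-squares slope as a stand-in for that analysis, a trend feature could be computed as follows. The function name `trend_slope` and the pulse oximetry readings are hypothetical.

```python
def trend_slope(times, values):
    """Least-squares slope of values against times (units of value per unit time).

    Assumed stand-in for the abstract's unspecified trend analysis.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Sum of squared deviations in time; zero means all timestamps coincide.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("timestamps must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


# Hypothetical pulse oximetry window: minutes elapsed and SpO2 (%).
minutes = [0, 5, 10, 15, 20]
spo2 = [98, 97, 95, 92, 90]

snapshot_feature = spo2[-1]                 # traditional single-value feature
trend_feature = trend_slope(minutes, spo2)  # derived time series feature

print(snapshot_feature, trend_feature)
```

The snapshot value alone does not show whether the patient is stable or declining, while the negative slope captures the deterioration across the window. This is the kind of distinction the abstract argues a single-value model cannot represent.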