7 results for Large modeling projects

in DigitalCommons@The Texas Medical Center


Relevance: 30.00%

Abstract:

A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.

Relevance: 30.00%

Abstract:

Colorectal cancer is a complex disease that is thought to arise when cells accumulate mutations that allow for uncontrolled growth. There are several recognized mechanisms for generating such mutations in sporadic colon cancer, one of which is chromosomal instability (CIN). One hypothesized driver of CIN in cancer is the improper repair of dysfunctional telomeres. Telomeres comprise the linear ends of chromosomes and play a dual role in cancer. Their length is maintained by the ribonucleoprotein telomerase, which is not normally expressed in somatic cells, so telomeres continuously shorten as cells divide. Critically shortened telomeres are considered dysfunctional because they are recognized as sites of DNA damage, and cells respond by entering replicative senescence or apoptosis, a p53-dependent process that is the mechanism for telomere-induced tumor suppression. Loss of this checkpoint and improper repair of dysfunctional telomeres can initiate a cycle of fusion, bridge and breakage that can lead to chromosomal changes and genomic instability, a process that can lead to transformation of normal cells to cancer cells. Mouse models of telomere dysfunction are currently based on knocking out the telomerase protein or RNA component; however, the naturally long telomeres of mice require multiple generational crosses of telomerase null mice to achieve critically short telomeres. Shelterin is a complex of six core proteins that bind specifically to telomeres. Pot1a is a highly conserved member of this complex that specifically binds to the telomeric single-stranded 3’ G-rich overhang. Previous work in our lab has shown that Pot1a is essential for chromosomal end protection, as deletion of Pot1a in murine embryonic fibroblasts (MEFs) leads to open telomere ends that initiate a DNA damage response mediated by ATR, resulting in p53-dependent cellular senescence. 
Loss of Pot1a in the background of p53 deficiency results in increased aberrant homologous recombination at telomeres and elevated genomic instability, which allows Pot1a-/-, p53-/- MEFs to form tumors when injected into SCID mice. These phenotypes are similar to those seen in cells with critically shortened telomeres. In this work, we created a mouse model of telomere dysfunction in the gastrointestinal tract through the conditional deletion of Pot1a that recapitulates the microscopic features seen in severe telomere attrition. Combined intestinal loss of Pot1a and p53 led to the formation of invasive adenocarcinomas in the small and large intestines. The tumors formed with long latency and low multiplicity and had complex genomes due to chromosomal instability, features similar to those seen in sporadic human colorectal cancers. Taken together, we have developed a novel mouse model of intestinal tumorigenesis based on genomic instability driven by telomere dysfunction.

Relevance: 30.00%

Abstract:

This dissertation examined body mass index (BMI) growth trajectories and the effects of gender, ethnicity, dietary intake, and physical activity (PA) on BMI growth trajectories among 3rd to 12th graders (9-18 years of age). Growth curve model analysis was performed using data from The Child and Adolescent Trial for Cardiovascular Health (CATCH) study. The study population included 2909 students who were followed up from grades 3-12. The main outcome was BMI at grades 3, 4, 5, 8, and 12.
The results revealed that BMI growth differed across two distinct developmental periods of childhood and adolescence. The rate of BMI growth was faster in middle childhood (9-11 years old or 3rd - 5th grades) than in adolescence (11-18 years old or 5th - 12th grades). Students with higher BMI at 3rd grade (baseline) had faster rates of BMI growth. Three groups of students with distinct BMI growth trajectories were identified: high, average, and low.
Black and Hispanic children were more likely to be in the groups with higher baseline BMI and faster rates of BMI growth over time. The effects of gender or ethnicity on BMI growth differed across the three groups. The effects of ethnicity on BMI growth weakened as the children aged. The effects of gender on BMI growth were attenuated in the groups with a large proportion of black and Hispanic children, i.e., the “high” or “average” BMI trajectory group. After controlling for gender, ethnicity, and age at baseline, in the “high BMI trajectory” group, the rate of yearly BMI growth in middle childhood increased by 0.102 for every 500-kcal increase (p=0.049). No significant effects of percentage of energy from total fat and saturated fat on BMI growth were found. Baseline BMI increased by 0.041 for every 30-minute increase in moderate-to-vigorous PA (MVPA) in the “low BMI trajectory” group, while baseline BMI decreased by 0.345 for every 30-minute increase in vigorous PA (VPA) in the “high BMI trajectory” group.
Childhood overweight and obesity interventions should start at the earliest possible ages, prior to 3rd grade, and continue through grade school. Interventions should focus on all children, but specifically on black and Hispanic children, who are more likely to be at highest risk. Promoting VPA earlier in childhood is important for preventing overweight and obesity among children and adolescents. Interventions should target total energy intake, rather than only percentage of energy from total fat or saturated fat.

Relevance: 30.00%

Abstract:

Background. Colorectal cancer (CRC) is the third most commonly diagnosed cancer (excluding skin cancer) in both men and women in the United States, with an estimated 148,810 new cases and 49,960 deaths in 2008 (1). Racial/ethnic disparities have been reported across the CRC care continuum. Studies have documented racial/ethnic disparities in CRC screening (2-9), but only a few studies have looked at these differences in CRC screening over time (9-11). No studies have compared these trends in a population with CRC and without cancer. Additionally, although there is evidence suggesting that hospital factors (e.g. teaching hospital status and NCI designation) are associated with CRC survival (12-16), no studies have sought to explain the racial/ethnic differences in survival by looking at differences in socio-demographics, tumor characteristics, screening, co-morbidities, treatment, as well as hospital characteristics.
Objectives and Methods. The overall goals of this dissertation were to describe the patterns and trends of racial/ethnic disparities in CRC screening (i.e. fecal occult blood test (FOBT), sigmoidoscopy (SIG) and colonoscopy (COL)) and to determine if racial/ethnic disparities in CRC survival are explained by differences in socio-demographic, tumor characteristics, screening, co-morbidities, treatment, and hospital factors. These goals were accomplished in a two-paper format.
In Paper 1, "Racial/Ethnic Disparities and Trends in Colorectal Cancer Screening in Medicare Beneficiaries with Colorectal Cancer and without Cancer in SEER Areas, 1992-2002", the study population consisted of 50,186 Medicare beneficiaries diagnosed with CRC from 1992 to 2002 and 62,917 Medicare beneficiaries without cancer during the same time period. Both cohorts were aged 67 to 89 years and resided in 16 Surveillance, Epidemiology and End Results (SEER) regions of the United States. 
Screening procedures between 6 months and 3 years prior to the date of diagnosis for CRC patients and prior to the index date for persons without cancer were identified in Medicare claims. The crude and age-gender-adjusted percentages and odds ratios of receiving FOBT, SIG, or COL were calculated. Multivariable logistic regression was used to assess the effect of race/ethnicity on the odds of receiving CRC screening over time.
Paper 2, "Racial/Ethnic Disparities in Colorectal Cancer Survival: To what extent are racial/ethnic disparities in survival explained by racial differences in socio-demographics, screening, co-morbidities, treatment, tumor or hospital characteristics", included a cohort of 50,186 Medicare beneficiaries who were diagnosed with CRC from 1992 to 2002, resided in 16 SEER regions of the United States, and were identified in the SEER-Medicare linked database. Survival was estimated using the Kaplan-Meier method. Cox proportional hazard modeling was used to estimate hazard ratios (HR) of mortality and 95% confidence intervals (95% CI).
Results. The screening analysis demonstrated racial/ethnic disparities in screening over time among the cohort without cancer. From 1992 to 1995, Blacks and Hispanics were less likely than Whites to receive FOBT (OR=0.75, 95% CI: 0.65-0.87; OR=0.50, 95% CI: 0.34-0.72, respectively) but their odds of screening increased from 2000 to 2002 (OR=0.79, 95% CI: 0.72-0.85; OR=0.67, 95% CI: 0.54-0.75, respectively). 
Blacks and Hispanics were less likely than Whites to receive SIG from 1992 to 1995 (OR=0.75, 95% CI: 0.57-0.98; OR=0.29, 95% CI: 0.12-0.71, respectively), but their odds of screening increased from 2000 to 2002 (OR=0.79, 95% CI: 0.68-0.93; OR=0.50, 95% CI: 0.35-0.72, respectively).
The survival analysis showed that Blacks had worse CRC-specific survival than Whites (HR: 1.33, 95% CI: 1.23-1.44), but this was reduced for stages I-III disease after full adjustment for socio-demographic, tumor characteristics, screening, co-morbidities, treatment and hospital characteristics (aHR=1.24, 95% CI: 1.14-1.35). Socioeconomic status, tumor characteristics, treatment and co-morbidities contributed to the reduction in hazard ratios between Blacks and Whites with stage I-III disease. Asians had better survival than Whites before (HR: 0.73, 95% CI: 0.64-0.82) and after (aHR: 0.80, 95% CI: 0.70-0.92) adjusting for all predictors for stage I-III disease. For stage IV, both Asians and Hispanics had better survival than Whites, and after full adjustment, survival improved (aHR=0.73, 95% CI: 0.63-0.84; aHR=0.74, 95% CI: 0.61-0.92, respectively).
Conclusion. Screening disparities remain between Blacks and Whites, and Hispanics and Whites, but have decreased in recent years. Future studies should explore other factors that may contribute to screening disparities, such as physician recommendations and language/cultural barriers in this and younger populations.
There were substantial racial/ethnic differences in CRC survival among older Whites, Blacks, Asians and Hispanics. Co-morbidities, SES, tumor characteristics, treatment and other predictor variables contributed to, but did not fully explain, the CRC survival differences between Blacks and Whites. Future research should examine the role of quality of care, particularly the benefit of treatment and post-treatment surveillance, in racial disparities in survival.
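The abstract above reports crude odds ratios with 95% confidence intervals. As a rough illustration of how such a figure is obtained from a 2x2 table, here is a minimal Python sketch using the standard Wald (Woolf) interval on the log odds ratio. The function name and the counts are hypothetical and are not taken from the study, which also reports adjusted estimates from multivariable logistic regression that this sketch does not reproduce.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald (Woolf) confidence interval from a 2x2 table.

    a: screened, comparison group      b: not screened, comparison group
    c: screened, reference group       d: not screened, reference group
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the dissertation).
odds_ratio, lower, upper = odds_ratio_ci(a=150, b=850, c=200, d=800)
print(f"OR={odds_ratio:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An interval that lies entirely below 1, as in this made-up table, is what the abstract means by a group being "less likely" to receive screening than the reference group.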

Relevance: 30.00%

Abstract:

The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one-value-per-variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data and to build and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are infeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model. 
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
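The abstract above describes turning a window of raw time series readings into a small number of trend features that are then used as candidate inputs to a predictive model. The Python sketch below shows one possible such operation, a least-squares slope plus two simple summaries. The function name, the choice of features, and the sample readings are assumptions made for illustration; the abstract leaves the specific operations to the modeler and does not state which ones the authors used.

```python
def trend_features(times, values):
    """Summarize one time series window as a few candidate trend features.

    times:  observation times (e.g. minutes), same length as values
    values: measured readings for a single variable
    """
    n = len(values)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    # Ordinary least-squares slope of values against times.
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    return {
        "slope": sxy / sxx,                # units of the reading per time unit
        "mean": v_mean,                    # level over the window
        "delta": values[-1] - values[0],   # net change over the window
    }


# Hypothetical pulse oximetry readings (percent), one per minute.
minutes = [0, 1, 2, 3, 4]
spo2 = [98, 97, 95, 94, 92]
features = trend_features(minutes, spo2)
print(features)
```

A window like this contributes three values to the feature set instead of five raw readings, which is in the spirit of the abstract's observation that trend features generalized better than the raw elements.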

Relevance: 30.00%

Abstract:

Essential biological processes are governed by organized, dynamic interactions between multiple biomolecular systems. Complexes are thus formed to enable the biological function and are disassembled once the process is completed. Examples of such processes include the translation of messenger RNA into protein by the ribosome, the folding of proteins by chaperonins, and the entry of viruses into host cells. Understanding these fundamental processes by characterizing the molecular mechanisms that enable them would allow the (better) design of therapies and drugs. Such molecular mechanisms may be revealed through the structural elucidation of the biomolecular assemblies at the core of these processes. Various experimental techniques may be applied to investigate the molecular architecture of biomolecular assemblies. High-resolution techniques, such as X-ray crystallography, may solve the atomic structure of the system, but are typically constrained to biomolecules of reduced flexibility and dimensions. In particular, X-ray crystallography requires the sample to form a three-dimensional (3D) crystal lattice, which is technically difficult, if not impossible, to obtain, especially for large, dynamic systems. Often these techniques solve the structure of the different constituent components within the assembly, but encounter difficulties when investigating the entire system. On the other hand, imaging techniques, such as cryo-electron microscopy (cryo-EM), are able to depict large systems in a near-native environment, without requiring the formation of crystals. The structures solved by cryo-EM cover a wide range of resolutions, from a very low level of detail where only the overall shape of the system is visible, to high resolutions that approach, but do not yet reach, an atomic level of detail. 
In this dissertation, several modeling methods are introduced to either integrate cryo-EM datasets with structural data from X-ray crystallography, or to directly interpret the cryo-EM reconstruction. Such computational techniques were developed with the goal of creating an atomic model for the cryo-EM data. The low-resolution reconstructions lack the level of detail to permit a direct atomic interpretation, i.e. one cannot reliably locate the atoms or amino-acid residues within the structure obtained by cryo-EM. Therefore, one needs to consider additional information, for example, structural data from other sources such as X-ray crystallography, in order to enable such a high-resolution interpretation. Modeling techniques are thus developed to integrate the structural data from the different biophysical sources, examples including the work described in manuscripts I and II of this dissertation. At intermediate and high resolution, cryo-EM reconstructions depict consistent 3D folds such as tubular features, which in general correspond to alpha-helices. Such features can be annotated and later used to build the atomic model of the system (see manuscript III for this alternative). Three manuscripts are presented as part of the PhD dissertation, each introducing a computational technique that facilitates the interpretation of cryo-EM reconstructions. The first manuscript is an application paper that describes a heuristic to generate the atomic model for the protein envelope of the Rift Valley fever virus. The second manuscript introduces the evolutionary tabu search strategies to enable the integration of multiple component atomic structures with the cryo-EM map of their assembly. Finally, the third manuscript further develops the latter technique and applies it to annotate consistent 3D patterns in intermediate-resolution cryo-EM reconstructions. 
The first manuscript, titled "An assembly model for Rift Valley fever virus", was submitted for publication in the Journal of Molecular Biology. The cryo-EM structure of the Rift Valley fever virus was previously solved at 27 Å resolution by Dr. Freiberg and collaborators. This reconstruction shows the overall shape of the virus envelope, yet the reduced level of detail prevents a direct atomic interpretation. High-resolution structures are not yet available for the entire virus nor for the two different component glycoproteins that form its envelope. However, homology models may be generated for these glycoproteins based on similar structures that are available at atomic resolutions. The manuscript presents the steps required to identify an atomic model of the entire virus envelope, based on the low-resolution cryo-EM map of the envelope and the homology models of the two glycoproteins. Starting with the results of the exhaustive search to place the two glycoproteins, the model is built iteratively by running multiple multi-body refinements to hierarchically generate models for the different regions of the envelope. The generated atomic model is supported by prior knowledge regarding virus biology and contains valuable information about the molecular architecture of the system. It provides the basis for further investigations seeking to reveal different processes in which the virus is involved, such as assembly or fusion. The second manuscript was recently published in the Journal of Structural Biology (doi:10.1016/j.jsb.2009.12.028) under the title "Evolutionary tabu search strategies for the simultaneous registration of multiple atomic structures in cryo-EM reconstructions". This manuscript introduces the evolutionary tabu search strategies applied to enable a multi-body registration. This technique is a hybrid approach that combines a genetic algorithm with a tabu search strategy to promote the proper exploration of the high-dimensional search space. 
As with the Rift Valley fever virus, it is common that the structure of a large multi-component assembly is available at low resolution from cryo-EM, while high-resolution structures are solved for the different components but are lacking for the entire system. Evolutionary tabu search strategies enable the building of an atomic model for the entire system by considering the different components simultaneously. Such registration indirectly introduces spatial constraints, as all components need to be placed within the assembly, enabling proper docking in the low-resolution map of the entire assembly. Along with the method description, the manuscript covers the validation, presenting the benefit of the technique in both synthetic and experimental test cases. This approach successfully docked multiple components up to resolutions of 40 Å. The third manuscript is entitled "Evolutionary Bidirectional Expansion for the Annotation of Alpha Helices in Electron Cryo-Microscopy Reconstructions" and was submitted for publication in the Journal of Structural Biology. The modeling approach described in this manuscript applies the evolutionary tabu search strategies in combination with the bidirectional expansion to annotate secondary structure elements in intermediate-resolution cryo-EM reconstructions. In particular, secondary structure elements such as alpha helices show consistent patterns in cryo-EM data, and are visible as rod-like patterns of high density. The evolutionary tabu search strategy is applied to identify the placement of the different alpha helices, while the bidirectional expansion characterizes their length and curvature. The manuscript presents the validation of the approach at resolutions ranging between 6 and 14 Å, a level of detail where alpha helices are visible. Up to a resolution of 12 Å, the method measures sensitivities between 70-100% as estimated in experimental test cases, i.e. 
70-100% of the alpha-helices were correctly predicted in an automatic manner in the experimental data. The three manuscripts presented in this PhD dissertation cover different computational methods for the integration and interpretation of cryo-EM reconstructions. The methods were developed in the molecular modeling software Sculptor (http://sculptor.biomachina.org) and are available to the scientific community interested in the multi-resolution modeling of cryo-EM data. The work spans a wide range of resolutions, covering multi-body refinement and registration at low resolution along with annotation of consistent patterns at high resolution. Such methods are essential for the modeling of cryo-EM data, and may be applied in other fields where similar spatial problems are encountered, such as medical imaging.
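The abstract above describes a hybrid that combines a genetic algorithm with a tabu search strategy. The toy Python sketch below illustrates that general idea only: a small population evolves by selection, crossover and mutation, while a short tabu list of recently found generation bests blocks new offspring from landing in the same spots and so pushes exploration elsewhere. It is not the algorithm implemented in Sculptor. The one-dimensional search space, the scoring function, and every name and parameter value are assumptions chosen to keep the example short; the real registration problem searches over positions and orientations of several components.

```python
import math
import random
from collections import deque


def evolutionary_tabu_search(score, bounds, pop_size=20, generations=60,
                             tabu_radius=0.05, tabu_size=10, seed=0):
    """Maximize score(x) on [lo, hi] with a GA plus a short tabu list."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    tabu = deque(maxlen=tabu_size)   # recently exploited positions
    best = max(population, key=score)
    history = []                     # best score seen after each generation
    for _ in range(generations):
        # Selection: the better half of the population survives as parents.
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            # Crossover (midpoint) followed by Gaussian mutation.
            child = 0.5 * (p1 + p2) + rng.gauss(0.0, 0.1 * (hi - lo))
            child = min(hi, max(lo, child))
            # Tabu rule: discard offspring that fall in a tabu region.
            if any(abs(child - t) < tabu_radius for t in tabu):
                continue
            children.append(child)
        population = parents + children
        generation_best = max(population, key=score)
        if score(generation_best) > score(best):
            best = generation_best
        tabu.append(generation_best)
        history.append(score(best))
    return best, history


def toy_density(x):
    """Hypothetical score with a tall peak near x=2 and a lower one near x=-2."""
    return math.exp(-(x - 2.0) ** 2) + 0.5 * math.exp(-(x + 2.0) ** 2)


best, history = evolutionary_tabu_search(toy_density, bounds=(-5.0, 5.0))
print(f"best x = {best:.3f}, score = {history[-1]:.3f}")
```

The tabu list is deliberately short and the tabu radius small relative to the search interval, so that rejected offspring never exhaust the space; these are design choices of the sketch, not values from the manuscript.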

Relevance: 30.00%

Abstract:

Development of homology modeling methods will remain an area of active research. These methods aim to model increasingly accurate three-dimensional structures of therapeutically relevant proteins that have not yet been crystallized, e.g. Class A G-Protein Coupled Receptors (GPCRs). Incorporating protein flexibility is one way to achieve this goal. Here, I will discuss the enhancement and validation of the ligand-steered modeling method, originally developed by Dr. Claudio Cavasotto, via cross modeling of the newly crystallized GPCR structures. This method uses known ligands and known experimental information to optimize relevant protein binding sites by incorporating protein flexibility. Using a single template structure, the ligand-steered models were able to reasonably reproduce the binding sites and the co-crystallized native ligand poses of the β2 adrenergic and Adenosine 2A receptors. They also performed better than the chosen template and crude models in small-scale high-throughput docking experiments and compound selectivity studies. Next, the application of this method to develop high-quality homology models of Cannabinoid Receptor 2, an emerging non-psychotic pain management target, is discussed. These models were validated by their ability to rationalize structure-activity relationship data for two series of compounds, inverse agonists and agonists. The method was also applied to improve the virtual screening performance of the β2 adrenergic crystal structure by optimizing the binding site using β2-specific compounds. These results show the feasibility of optimizing only the pharmacologically relevant protein binding sites and the applicability of the method to structure-based drug design projects.