102 results for Data sets storage
Abstract:
The ability to obtain gene expression profiles from human disease specimens provides an opportunity to identify relevant gene pathways, but is limited by the absence of data sets spanning a broad range of conditions. Here, we analyzed publicly available microarray data from 16 diverse skin conditions in order to gain insight into disease pathogenesis. Unsupervised hierarchical clustering separated samples by disease as well as common cellular and molecular pathways. Disease-specific signatures were leveraged to build a multi-disease classifier, which predicted the diagnosis of publicly and prospectively collected expression profiles with 93% accuracy. In one sample, the molecular classifier differed from the initial clinical diagnosis and correctly predicted the eventual diagnosis as the clinical presentation evolved. Finally, integration of IFN-regulated gene programs with the skin database revealed a significant inverse correlation between IFN-β and IFN-γ programs across all conditions. Our study provides an integrative approach to the study of gene signatures from multiple skin conditions, elucidating mechanisms of disease pathogenesis. In addition, these studies provide a framework for developing tools for personalized medicine toward the precise prediction, prevention, and treatment of disease on an individual level.
Abstract:
To better understand the relationship between tumor-host interactions and the efficacy of chemotherapy, we have developed an analytical approach to quantify several biological processes observed in gene expression data sets. We tested the approach on tumor biopsies from individuals with estrogen receptor-negative breast cancer treated with chemotherapy. We report that increased stromal gene expression predicts resistance to preoperative chemotherapy with 5-fluorouracil, epirubicin and cyclophosphamide (FEC) in subjects in the EORTC 10994/BIG 00-01 trial. The predictive value of the stromal signature was successfully validated in two independent cohorts of subjects who received chemotherapy but not in an untreated control group, indicating that the signature is predictive rather than prognostic. The genes in the signature are expressed in reactive stroma, according to reanalysis of data from microdissected breast tumor samples. These findings identify a previously undescribed resistance mechanism to FEC treatment and suggest that antistromal agents may offer new ways to overcome resistance to chemotherapy.
Abstract:
The ratio of resting metabolic rate (RMR) to fat-free mass (FFM) is often used to compare individuals of different body sizes. Because RMR has not been well described over the full range of FFM, a literature review was conducted among groups with a wide range of FFM. It included 31 data sets comprising a total of 1111 subjects: 118 infants and preschoolers, 323 adolescents, and 670 adults; FFM ranged from 2.8 to 106 kg. The relationship of RMR to FFM was found to be nonlinear, and the average slopes of the regression equations of the three groups differed significantly (P < 0.0001). For only the youngest group did the intercept approach zero. The lower slopes of RMR on FFM, at higher measures of FFM, corresponded to relatively greater proportions of less metabolically active muscle mass and to lesser proportions of more metabolically active nonmuscle organ mass. Because the contribution of FFM to RMR is not constant, an arithmetic error is introduced when the ratio of RMR to FFM is used. Hence, alternative methods should be used to compare individuals with markedly different FFM.
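The arithmetic error described in this abstract can be illustrated with a minimal sketch. It assumes a simple linear model, RMR = intercept + slope × FFM, and the coefficients below are hypothetical values chosen only for illustration; they are not taken from the review. When the intercept is non-zero, two individuals lying on exactly the same regression line still get different RMR/FFM ratios.

```python
# Hypothetical coefficients for illustration only; not values from the review.
SLOPE = 20.0       # kcal/day per kg of fat-free mass (assumed)
INTERCEPT = 400.0  # kcal/day (assumed non-zero intercept)


def rmr_from_ffm(ffm_kg, slope=SLOPE, intercept=INTERCEPT):
    """Predicted RMR (kcal/day) under a linear model: intercept + slope * FFM."""
    return intercept + slope * ffm_kg


def rmr_per_kg(ffm_kg, slope=SLOPE, intercept=INTERCEPT):
    """RMR/FFM ratio for an individual lying exactly on the regression line."""
    return rmr_from_ffm(ffm_kg, slope, intercept) / ffm_kg


if __name__ == "__main__":
    for ffm in (40.0, 60.0, 80.0):
        with_intercept = rmr_per_kg(ffm)
        without_intercept = rmr_per_kg(ffm, intercept=0.0)
        print(f"FFM={ffm:5.1f} kg  ratio={with_intercept:6.2f}  "
              f"ratio if intercept were 0={without_intercept:6.2f}")
```

Under these assumed coefficients the ratio falls as FFM rises even though both individuals follow the same RMR-FFM relationship; the ratio is constant only when the intercept is zero.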
Abstract:
PURPOSE: Our purpose was to develop and assess a BRAF-mutant gene expression signature for colon cancer (CC) and to study its prognostic implications. MATERIALS AND METHODS: A set of 668 stage II and III CC samples from the PETACC-3 (Pan-European Trials in Alimentary Tract Cancers) clinical trial was used to assess differential gene expression between c.1799T>A (p.V600E) BRAF mutant and non-BRAF, non-KRAS mutant cancers (double wild type) and to construct a gene expression-based classifier for detecting BRAF mutant samples with high sensitivity. The classifier was validated in independent data sets, and survival rates were compared between classifier-positive and classifier-negative tumors. RESULTS: A 64-gene classifier was developed with 96% sensitivity and 86% specificity for detecting BRAF mutant tumors in PETACC-3 and independent samples. A subpopulation of BRAF wild-type patients (30% of KRAS mutants, 13% of double wild type) showed a gene expression pattern and had poor overall survival and survival after relapse, similar to those observed in BRAF-mutant patients. Thus they form a distinct prognostic subgroup within their mutation class. CONCLUSION: A characteristic pattern of gene expression is associated with and accurately predicts BRAF mutation status and, in addition, identifies a population of BRAF mutated-like KRAS mutants and double wild-type patients with similarly poor prognosis. This suggests a common biology between these tumors and provides a novel classification tool for cancers, adding prognostic and biologic information that is not captured by the mutation status alone. These results may guide therapeutic strategies for this patient segment and may help in population stratification for clinical trials.
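For readers less familiar with the two performance figures quoted in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The counts are hypothetical, chosen only so that the arithmetic lands on 96% and 86%; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly mutant samples that the classifier flags as mutant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly wild-type samples that the classifier flags as wild type."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from PETACC-3.
    tp, fn = 48, 2     # 50 mutant samples, 48 detected
    tn, fp = 430, 70   # 500 wild-type samples, 430 correctly negative
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"specificity = {specificity(tn, fp):.2f}")
```

In this framing, the classifier-positive wild-type samples (the false positives) correspond to the "BRAF mutated-like" subgroup that the abstract identifies as prognostically distinct.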
Abstract:
PURPOSE: To improve the risk stratification of patients with rhabdomyosarcoma (RMS) through the use of clinical and molecular biologic data. PATIENTS AND METHODS: Two independent data sets of gene-expression profiling for 124 and 101 patients with RMS were used to derive prognostic gene signatures by using a meta-analysis. These and a previously published metagene signature were evaluated by using cross validation analyses. A combined clinical and molecular risk-stratification scheme that incorporated the PAX3/FOXO1 fusion gene status was derived from 287 patients with RMS and evaluated. RESULTS: We showed that our prognostic gene-expression signature and the one previously published performed well with reproducible and significant effects. However, their effect was reduced when cross validated or tested in independent data and did not add new prognostic information over the fusion gene status, which is simpler to assay. Among nonmetastatic patients, patients who were PAX3/FOXO1 positive had a significantly poorer outcome compared with both alveolar-negative and PAX7/FOXO1-positive patients. Furthermore, a new clinicomolecular risk score that incorporated fusion gene status (negative and PAX3/FOXO1 and PAX7/FOXO1 positive), Intergroup Rhabdomyosarcoma Study TNM stage, and age showed a significant increase in performance over the current risk-stratification scheme. CONCLUSION: Gene signatures can improve current stratification of patients with RMS but will require complex assays to be developed and extensive validation before clinical application. A significant majority of their prognostic value was encapsulated by the fusion gene status. A continuous risk score derived from the combination of clinical parameters with the presence or absence of PAX3/FOXO1 represents a robust approach to improving current risk-adapted therapy for RMS.
Abstract:
BACKGROUND/OBJECTIVES: Preoperative nutrition has been shown to reduce morbidity after major gastrointestinal (GI) surgery in selected patients at risk. In a recent randomized trial (NCT00512213), however, almost half of the patients did not consume the recommended dose of the nutritional intervention. The present study aimed to identify the risk factors for noncompliance. SUBJECTS/METHODS: Demographic (n=5) and nutritional (n=21) parameters for this retrospective analysis were obtained from a prospectively maintained database. The outcome of interest was compliance with the allocated intervention (ingestion of ⩾11/15 preoperative oral nutritional supplement units). Uni- and multivariate analyses of potential risk factors for noncompliance were performed. RESULTS: The final analysis included 141 patients with complete data sets for the purpose of the study. Fifty-nine patients (42%) were considered noncompliant. Univariate analysis identified low C-reactive protein levels (P=0.015), decreased recent food intake (P=0.032) and, as a trend, low hemoglobin (P=0.065) and low pre-albumin (P=0.056) levels as risk factors for decreased compliance. However, none of them was retained as an independent risk factor after multivariate analysis. Interestingly, 17 potential explanatory parameters, such as upper GI cancer, weight loss, reduced appetite or co-morbidities, did not show any significant correlation with reduced intake of nutritional supplements. CONCLUSIONS: Reduced compliance with preoperative nutritional interventions remains a major issue because the expected benefit depends on the actual intake. Seemingly obvious reasons could not be retained as valid explanations. Compliance thus seems to be primarily a question of will and information; the importance of nutritional supplementation needs to be emphasized through specific patient education.
Abstract:
Advances in flow cytometry and other single-cell technologies have enabled high-dimensional, high-throughput measurements of individual cells as well as the interrogation of cell population heterogeneity. However, in many instances, computational tools to analyze the wealth of data generated by these technologies are lacking. Here, we present a computational framework for unbiased combinatorial polyfunctionality analysis of antigen-specific T-cell subsets (COMPASS). COMPASS uses a Bayesian hierarchical framework to model all observed cell subsets and select those most likely to have antigen-specific responses. Cell-subset responses are quantified by posterior probabilities, and human subject-level responses are quantified by two summary statistics that describe the quality of an individual's polyfunctional response and can be correlated directly with clinical outcome. Using three clinical data sets of cytokine production, we demonstrate how COMPASS improves characterization of antigen-specific T cells and reveals cellular 'correlates of protection/immunity' in the RV144 HIV vaccine efficacy trial that are missed by other methods. COMPASS is available as open-source software.
Abstract:
BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate to assess the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogenetic-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web, which is freely accessible at http://coev.vital-it.ch/ and was tested in most modern web browsers.
Abstract:
BACKGROUND: Several distributions of country-specific blood pressure (BP) percentiles by sex, age, and height for children and adolescents have been established worldwide. However, there are no globally unified BP references for defining elevated BP in children and adolescents, which limits international comparisons of the prevalence of pediatric elevated BP. We aimed to establish international BP references for children and adolescents by using 7 nationally representative data sets (China, India, Iran, Korea, Poland, Tunisia, and the United States). METHODS AND RESULTS: Data on BP for 52 636 nonoverweight children and adolescents aged 6 to 19 years were obtained from 7 large nationally representative cross-sectional surveys in China, India, Iran, Korea, Poland, Tunisia, and the United States. BP values were obtained with certified mercury sphygmomanometers in all 7 countries by using standard procedures for BP measurement. Smoothed BP percentiles (50th, 90th, 95th, and 99th) by age and height were estimated by using the Generalized Additive Model for Location Scale and Shape model. BP values were similar between males and females until the age of 13 years and were higher in males than females thereafter. In comparison with the BP levels of the 90th and 95th percentiles of the US Fourth Report at median height, systolic BP of the corresponding percentiles of these international references was lower, whereas diastolic BP was similar. CONCLUSIONS: These international BP references will be a useful tool for international comparison of the prevalence of elevated BP in children and adolescents and may help to identify hypertensive youths in diverse populations.
Abstract:
As increasingly large molecular data sets are collected for phylogenomics, the conflicting phylogenetic signal among gene trees poses challenges to resolve some difficult nodes of the Tree of Life. Among these nodes, the phylogenetic position of the honey bees (Apini) within the corbiculate bee group remains controversial, despite its considerable importance for understanding the emergence and maintenance of eusociality. Here, we show that this controversy stems in part from pervasive phylogenetic conflicts among GC-rich gene trees. GC-rich genes typically have a high nucleotidic heterogeneity among species, which can induce topological conflicts among gene trees. When retaining only the most GC-homogeneous genes or using a nonhomogeneous model of sequence evolution, our analyses reveal a monophyletic group of the three lineages with a eusocial lifestyle (honey bees, bumble bees, and stingless bees). These phylogenetic relationships strongly suggest a single origin of eusociality in the corbiculate bees, with no reversal to solitary living in this group. To accurately reconstruct other important evolutionary steps across the Tree of Life, we suggest removing GC-rich and GC-heterogeneous genes from large phylogenomic data sets. Interpreted as a consequence of genome-wide variations in recombination rates, this GC effect can affect all taxa featuring GC-biased gene conversion, which is common in eukaryotes.
Abstract:
Mammalian physiology and behavior follow daily rhythms that are orchestrated by endogenous timekeepers known as circadian clocks. Rhythms in transcription are considered the main mechanism to engender rhythmic gene expression, but important roles for posttranscriptional mechanisms have recently emerged as well (reviewed in Lim and Allada (2013) [1]). We have recently reported on the use of ribosome profiling (RPF-seq), a method based on the high-throughput sequencing of ribosome protected mRNA fragments, to explore the temporal regulation of translation efficiency (Janich et al., 2015 [2]). Through the comparison of around-the-clock RPF-seq and matching RNA-seq data we were able to identify 150 genes, involved in ribosome biogenesis, iron metabolism and other pathways, whose rhythmicity is generated entirely at the level of protein synthesis. The temporal transcriptome and translatome data sets from this study have been deposited in NCBI's Gene Expression Omnibus under the accession number GSE67305. Here we provide additional information on the experimental setup and on important optimization steps pertaining to the ribosome profiling technique in mouse liver and to data analysis.
Abstract:
Aim: Emerging polyploids may depend on environmental niche shifts for successful establishment. Using the alpine plant Ranunculus kuepferi as a model system, we explore the niche shift hypothesis at different spatial resolutions and in contrasting parts of the species range. Location: European Alps. Methods: We sampled 12 individuals from each of 102 populations of R. kuepferi across the Alps, determined their ploidy levels, derived coarse-grain (100 × 100 m) environmental descriptors for all sampling sites by downscaling WorldClim maps, and calculated fine-scale environmental descriptors (2 × 2 m) from indicator values of the vegetation accompanying the sampled individuals. Both coarse and fine-scale variables were further computed for 8239 vegetation plots from across the Alps. Subsequently, we compared niche optima and breadths of diploid and tetraploid cytotypes by combining principal components analysis and kernel smoothing procedures. Comparisons were done separately for coarse and fine-grain data sets and for sympatric, allopatric and the total set of populations. Results: All comparisons indicate that the niches of the two cytotypes differ in optima and/or breadths, but results vary in important details. The whole-range analysis suggests differentiation along the temperature gradient to be most important. However, sympatric comparisons indicate that this climatic shift was not a direct response to competition with diploid ancestors. Moreover, fine-grained analyses demonstrate niche contraction of tetraploids, especially in the sympatric range, that goes undetected with coarse-grained data. Main conclusions: Although the niche optima of the two cytotypes differ, separation along ecological gradients was probably less decisive for polyploid establishment than a shift towards facultative apomixis, a particularly effective strategy to avoid minority cytotype exclusion. In addition, our results suggest that coarse-grained analyses overestimate niche breadths of widely distributed taxa. Niche comparison analyses should hence be conducted at environmental data resolutions appropriate for the organism and question under study.