104 results for microarray data classification
Abstract:
BACKGROUND: Synthesis of the Staphylococcus aureus peptidoglycan pentaglycine interpeptide bridge is catalyzed by the nonribosomal peptidyl transferases FemX, FemA and FemB. Inactivation of the femAB operon reduces the interpeptide to a monoglycine, leading to a poorly crosslinked peptidoglycan. femAB mutants show a reduced growth rate and are hypersusceptible to virtually all antibiotics, including methicillin, making FemAB a potential target to restore beta-lactam susceptibility in methicillin-resistant S. aureus (MRSA). Cis-complementation with wild type femAB only restores synthesis of the pentaglycine interpeptide and methicillin resistance, but the growth rate remains low. This study characterizes the adaptations that ensured survival of the cells after femAB inactivation. RESULTS: In addition to slow growth, the cis-complemented femAB mutant showed temperature sensitivity and a higher methicillin resistance than the wild type. Transcriptional profiling paired with reporter metabolite analysis revealed multiple changes in the global transcriptome. A number of transporters for sugars, glycerol, and glycine betaine, some of which could serve as osmoprotectants, were upregulated. Striking differences were found in the transcription of several genes involved in nitrogen metabolism and the arginine-deiminase pathway, an alternative for ATP production. In addition, microarray data indicated enhanced expression of virulence factors that correlated with premature expression of the global regulators sae, sarA, and agr. CONCLUSION: Survival under conditions preventing normal cell wall formation triggered complex adaptations that incurred a fitness cost, showing the remarkable flexibility of S. aureus to circumvent cell wall damage. Potential FemAB inhibitors would have to be used in combination with other antibiotics to prevent selection of resistant survivors.
Abstract:
Under optimal non-physiological conditions of low concentrations and low temperatures, proteins may spontaneously fold to the native state, as all the information for folding lies in the amino acid sequence of the polypeptide. However, under conditions of stress or high protein crowding, as inside cells, a polypeptide may misfold and enter an aggregation pathway resulting in the formation of misfolded conformers and fibrils, which can be toxic and lead to neurodegenerative illnesses, such as Alzheimer's, Parkinson's or Huntington's diseases, and aging in general. To avert and revert protein misfolding and aggregation, cells have evolved a set of proteins called molecular chaperones. Here, I focussed on the human cytosolic chaperones Hsp70 (DnaK) and Hsp110, the co-chaperone Hsp40 (DnaJ), and the chaperonin CCT (GroEL). The cytosolic molecular chaperones Hsp70s/Hsp110s and the chaperonins are highly upregulated in bacterial and human cells under different stresses and are involved both in the prevention and the reversion of protein misfolding and aggregation. Hsp70 works in collaboration with Hsp40 to reactivate misfolded or aggregated proteins in a strictly ATP-dependent manner. Chaperonins (CCT and GroEL) also unfold and reactivate stably misfolded proteins, but we found that they needed to use the energy of ATP hydrolysis in order to evict over-sticky misfolded intermediates that inhibited the unfoldase catalytic sites. In this study, we initially characterized a particular type of inactive misfolded monomeric luciferase and rhodanese species that were obtained by repeated cycles of freeze-thawing (FT). These stable misfolded monomeric conformers (FT-luciferase and FT-rhodanese) had exposed hydrophobic residues and were enriched with wrong β-sheet structures (Chapter 2).
Using FT-luciferase as substrate, we found that the Hsp70 orthologs called Hsp110 (Sse in yeast) acted similarly to Hsp70: they were bona fide ATP-fuelled polypeptide unfoldases and much more than mere nucleotide exchange factors, as generally thought. Moreover, we found that Hsp110 collaborated with Hsp70 in the disaggregation of stable protein aggregates, in which Hsp70 and Hsp110 acted as equal partners that synergistically combined their individual ATP-consuming polypeptide unfoldase activities to reactivate the misfolded/aggregated proteins (Chapter 3). Using FT-rhodanese as substrate, we found that chaperonins (GroEL and CCT) could catalytically reactivate misfolded rhodanese monomers in the absence of ATP. Also, our results suggested that encaging of an unfolding polypeptide inside the GroEL cavity under a GroES cap was not an obligatory step, as generally thought (Chapter 4). Further, we investigated the role of Hsp40, a J-protein co-chaperone of Hsp70, in targeting misfolded polypeptide substrates onto Hsp70 for unfolding. We found that even a large excess of monomeric unfolded α-synuclein did not inhibit DnaJ, whereas, in contrast, stable misfolded α-synuclein oligomers strongly inhibited the DnaK-mediated chaperone reaction by sequestering the DnaJ co-chaperone. This work revealed that DnaJ could specifically distinguish and bind potentially toxic stably aggregated species, such as the soluble α-synuclein oligomers involved in Parkinson's disease, and, with the help of DnaK and ATP, convert them into harmless natively unfolded α-synuclein monomers (Chapter 5). Finally, our meta-analysis of microarray data of plant and animal tissues treated with various chemicals and abiotic stresses revealed possible co-expressions between core chaperone machineries and their co-chaperone regulators.
It clearly showed that protein misfolding in the cytosol elicited a different response, consisting of upregulating the synthesis mainly of cytosolic chaperones, from protein misfolding in the endoplasmic reticulum (ER), which elicited a typical unfolded protein response (UPR), consisting of upregulating the synthesis mainly of ER chaperones. We proposed that drugs that best mimicked heat or UPR stress at increasing the chaperone load in the cytoplasm or ER, respectively, may prove effective at combating protein misfolding diseases and aging (Chapter 6). - Under optimal conditions of low concentration and low temperature, proteins spontaneously adopt their native fold, because all the necessary information lies in the amino acid sequence of the polypeptide. In contrast, under conditions of stress or high protein concentration, as inside a cell, a polypeptide can misfold and enter an aggregation process leading to the formation of conformers and fibrils that can be toxic and cause neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease or Huntington's chorea. To prevent or correct protein misfolding, cells have evolved proteins called chaperones. In this work, I focussed on the cytosolic chaperones Hsp70 (DnaK) and Hsp110, the co-chaperone Hsp40 (DnaJ), the CCT/TRiC complex and GroEL. In bacteria and humans, the cytosolic chaperones Hsp70s/Hsp110s and the chaperonins are strongly activated by various stress conditions and are all involved in preventing and correcting protein misfolding and aggregation. Hsp70 collaborates with Hsp40 to reactivate aggregated or misfolded proteins, and their action requires ATP.
The chaperonins (GroEL) also unfold and reactivate stably misfolded proteins, but we found that they use ATP to release sticky misfolded intermediates from the unfolding catalytic site. We initially characterized a particular type of stable misfolded monomeric luciferase and rhodanese species obtained after several repeated freeze-thaw (FT) cycles. These monomers exposed hydrophobic residues and were enriched in abnormal β-sheets. They could nevertheless be reactivated by the chaperones Hsp70+Hsp40 (DnaK+DnaJ) and ATP, or by Hsp60 (GroEL) without ATP (Chapter 2). Using FT-luciferase as substrate, we found that Hsp110 (an ortholog of Hsp70) was a genuine, strictly ATP-dependent unfoldase. Moreover, we found that Hsp110 collaborated with Hsp70 in the disaggregation of stable protein aggregates by combining their ATP-consuming unfoldase activities (Chapter 3). Using FT-rhodanese, we found that the chaperonins (GroEL and CCT) could catalytically reactivate misfolded monomers in the absence of ATP. Our results also suggested that the capture of an unfolding polypeptide inside the GroEL cavity under a GroES lid might not be an obligatory step of the mechanism, as is commonly accepted in the literature (Chapter 4). In addition, we studied the role of Hsp40, a co-chaperone of Hsp70, in targeting misfolded polypeptide substrates to Hsp70. This work revealed that DnaJ could distinguish and bind (toxic) misfolded polypeptides, such as the α-synuclein oligomers in Parkinson's disease, and clearly differentiate them from harmless α-synuclein monomers (Chapter 5).
Finally, a meta-analysis of microarray data from plant and animal tissues treated with various chemical and abiotic stresses revealed a possible co-expression of the chaperone machinery and co-chaperone regulators. This meta-analysis also clearly shows that protein misfolding in the cytosol leads to the synthesis of mainly cytosolic chaperones, whereas protein misfolding in the endoplasmic reticulum (ER) triggers a typical unfolded protein response (UPR), which consists mainly of the synthesis of ER-localized chaperones. We hypothesize that drugs that best mimic heat stress or UPR stress could prove effective in combating protein misfolding and aging (Chapter 6).
Abstract:
The ability to obtain gene expression profiles from human disease specimens provides an opportunity to identify relevant gene pathways, but is limited by the absence of data sets spanning a broad range of conditions. Here, we analyzed publicly available microarray data from 16 diverse skin conditions in order to gain insight into disease pathogenesis. Unsupervised hierarchical clustering separated samples by disease as well as common cellular and molecular pathways. Disease-specific signatures were leveraged to build a multi-disease classifier, which predicted the diagnosis of publicly and prospectively collected expression profiles with 93% accuracy. In one sample, the molecular classifier differed from the initial clinical diagnosis and correctly predicted the eventual diagnosis as the clinical presentation evolved. Finally, integration of IFN-regulated gene programs with the skin database revealed a significant inverse correlation between IFN-β and IFN-γ programs across all conditions. Our study provides an integrative approach to the study of gene signatures from multiple skin conditions, elucidating mechanisms of disease pathogenesis. In addition, these studies provide a framework for developing tools for personalized medicine toward the precise prediction, prevention, and treatment of disease on an individual level.
Abstract:
Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.
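The model counts in this abstract follow from a full factorial design. A minimal sketch, assuming placeholder labels (the abstract gives only the counts, not the method or endpoint names):

```python
from itertools import product

# Placeholder labels: hypothetical names, only the counts come from the abstract.
feature_selectors = [f"fs_{i}" for i in range(1, 6)]   # 5 univariate feature-selection methods
classifiers = [f"clf_{i}" for i in range(1, 9)]        # 8 classifiers
endpoints = [f"endpoint_{i}" for i in range(1, 4)]     # 3 clinical endpoints

# One predictor per (feature selection, classifier) pair.
predictors = list(product(feature_selectors, classifiers))

# One model per (endpoint, predictor) combination.
models = list(product(endpoints, predictors))

print(len(predictors))  # 40
print(len(models))      # 120
```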
Abstract:
This study presents classification criteria for two-class Cannabis seedlings. As the cultivation of drug-type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask laboratories to determine the chemotype of cannabis plants from seized material in order to ascertain whether the plantation is legal. In this study, the classification analysis is based on data obtained from the relative proportion of three major leaf compounds measured by gas chromatography interfaced with mass spectrometry (GC-MS). The aim is to discriminate between drug-type (illegal) and fiber-type (legal) cannabis at an early stage of growth. A Bayesian procedure is proposed: a Bayes factor is computed and classification is performed on the basis of the decision maker's specifications (i.e. prior probability distributions on cannabis type and consequences of classification measured by losses). Classification rates are computed with two statistical models and the results are compared. A sensitivity analysis is then performed to assess the robustness of the classification criteria.
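The decision rule described here (Bayes factor, prior probabilities, losses) can be illustrated with a minimal sketch. The function name, argument names, and numeric values below are hypothetical, and the Bayes factor is taken as given rather than computed from GC-MS data, since the abstract does not specify the statistical models.

```python
def classify(bayes_factor, prior_drug, loss_missed_drug, loss_false_drug):
    """Pick the class with the smaller expected loss.

    bayes_factor     : p(data | drug type) / p(data | fiber type), assumed given
    prior_drug       : prior probability that the seedling is drug type
    loss_missed_drug : loss for calling a drug-type plant "fiber"
    loss_false_drug  : loss for calling a fiber-type plant "drug"
    """
    prior_odds = prior_drug / (1.0 - prior_drug)
    posterior_odds = bayes_factor * prior_odds
    p_drug = posterior_odds / (1.0 + posterior_odds)

    # Expected loss of each possible decision under the posterior.
    expected_loss_if_fiber = loss_missed_drug * p_drug
    expected_loss_if_drug = loss_false_drug * (1.0 - p_drug)

    decision = "drug" if expected_loss_if_drug < expected_loss_if_fiber else "fiber"
    return decision, p_drug


# Hypothetical numbers: the same evidence leads to different decisions
# once a false "drug" call is made ten times more costly.
print(classify(3.0, 0.5, 1.0, 1.0))
print(classify(3.0, 0.5, 1.0, 10.0))
```

With symmetric losses the rule reduces to choosing the class with the higher posterior probability; asymmetric losses shift the threshold on the Bayes factor.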
Abstract:
The 2008 Data Fusion Contest organized by the IEEE Geoscience and Remote Sensing Data Fusion Technical Committee deals with the classification of high-resolution hyperspectral data from an urban area. Unlike in previous editions of the contest, the goal was not only to identify the best algorithm but also to provide a collaborative effort: the decision fusion of the best individual algorithms aimed at further improving the classification performance, and the best algorithms were ranked according to their relative contribution to the decision fusion. This paper presents the five awarded algorithms and the conclusions of the contest, stressing the importance of decision fusion, dimension reduction, and supervised classification methods, such as neural networks and support vector machines.
Abstract:
BACKGROUND: Prognosis prediction for resected primary colon cancer is based on the Tumor Node Metastasis (TNM) staging system. We investigated whether four well-documented gene expression risk scores can improve patient stratification. METHODS: Microarray-based versions of the risk scores were applied to a large independent cohort of 688 stage II/III tumors from the PETACC-3 trial. Prognostic value for relapse-free survival (RFS), survival after relapse (SAR), and overall survival (OS) was assessed by regression analysis. Improvement over a reference prognostic model was assessed with the area under the curve (AUC) of receiver operating characteristic (ROC) curves. All statistical tests were two-sided, except the AUC increase. RESULTS: All four risk scores (RSs) showed a statistically significant association (single-test, P < .0167) with OS or RFS in univariate models, but with HRs below 1.38 per interquartile range. Three scores were predictors of shorter RFS, one of shorter SAR. Each RS could only marginally improve an RFS or OS model with the known factors T-stage, N-stage, and microsatellite instability (MSI) status (AUC gains < 0.025 units). The pairwise interscore discordance was never high (maximal Spearman correlation = 0.563). A combined score showed a trend to higher prognostic value and higher AUC increase for OS (HR = 1.74, 95% confidence interval [CI] = 1.44 to 2.10, P < .001, AUC from 0.6918 to 0.7321) and RFS (HR = 1.56, 95% CI = 1.33 to 1.84, P < .001, AUC from 0.6723 to 0.6945) than any single score. CONCLUSIONS: The four tested gene expression-based risk scores provide prognostic information but contribute only marginally to improving models based on established risk factors. A combination of the risk scores might provide more robust information. Predictors of RFS and SAR might need to be different.
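To make the notion of an "AUC gain" concrete, here is a minimal sketch for a binary outcome. It is a simplification: the study works with survival endpoints, and its exact AUC procedure is not described in the abstract. The function name and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = event, 0 = no event.
labels = [0, 0, 1, 1]
reference_model = [0.10, 0.40, 0.35, 0.80]   # e.g. clinical factors only
extended_model = [0.10, 0.30, 0.35, 0.80]    # e.g. clinical factors plus a risk score

auc_ref = roc_auc(reference_model, labels)
auc_ext = roc_auc(extended_model, labels)
print(auc_ref, auc_ext, auc_ext - auc_ref)   # 0.75 1.0 0.25
```

The gains reported in the abstract (below 0.025 units for single scores) are far smaller than in this toy example.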
Abstract:
The DRG classification provides a useful tool for the evaluation of hospital care. Indicators such as readmission and mortality rates adjusted for the hospital case mix could be adopted in Switzerland at the price of minor additions to the hospital discharge record. The additional information required to build patient histories and to identify deaths occurring after hospital discharge is detailed.
Abstract:
The paper deals with the development and application of a generic methodology for automatic processing (mapping and classification) of environmental data. The General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve the problem of spatial data mapping (regression). The Probabilistic Neural Network (PNN) is considered as an automatic tool for spatial classification. The automatic tuning of isotropic and anisotropic GRNN/PNN models using a cross-validation procedure is presented. Results are compared with the k-Nearest-Neighbours (k-NN) interpolation algorithm using an independent validation data set. Real case studies are based on decision-oriented mapping and classification of radioactively contaminated territories.
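A GRNN prediction is a Gaussian-kernel weighted average of the training values (Nadaraya-Watson regression), so its single isotropic bandwidth can be tuned by cross-validation. The sketch below illustrates that idea with leave-one-out cross-validation; the function names, candidate bandwidths, and toy data are assumptions, and the paper's own tuning procedure (including the anisotropic case) may differ.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Isotropic GRNN: Gaussian-kernel weighted mean of the training values."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_mse(train_x, train_y, sigma):
    """Leave-one-out mean squared error for a given bandwidth."""
    errors = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        pred = grnn_predict(train_x[i], rest_x, rest_y, sigma)
        errors.append((pred - train_y[i]) ** 2)
    return sum(errors) / len(errors)


def tune_sigma(train_x, train_y, candidates):
    """Pick the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda s: loo_mse(train_x, train_y, s))


# Toy spatial data (hypothetical): value = x + y on a unit square.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_y = [0.0, 1.0, 1.0, 2.0]

best_sigma = tune_sigma(train_x, train_y, [0.1, 0.5, 1.0, 2.0])
center_value = grnn_predict((0.5, 0.5), train_x, train_y, best_sigma)
print(best_sigma, center_value)
```

An anisotropic variant would replace the single `sigma` with one bandwidth per coordinate, at the cost of a larger search space.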
Abstract:
Expression data contribute significantly to the biological value of the sequenced human genome, providing extensive information about gene structure and the pattern of gene expression. ESTs, together with SAGE libraries and microarray experiment information, provide a broad and rich view of the transcriptome. However, it is difficult to perform large-scale expression mining of the data generated by these diverse experimental approaches. Not only is the data stored in disparate locations, but there is frequent ambiguity in the meaning of terms used to describe the source of the material used in the experiment. Untangling semantic differences between the data provided by different resources is therefore largely reliant on the domain knowledge of a human expert. We present here eVOC, a system which associates labelled target cDNAs for microarray experiments, or cDNA libraries and their associated transcripts, with controlled terms in a set of hierarchical vocabularies. eVOC consists of four orthogonal controlled vocabularies suitable for describing the domains of human gene expression data, including Anatomical System, Cell Type, Pathology and Developmental Stage. We have curated and annotated 7016 cDNA libraries represented in dbEST, as well as 104 SAGE libraries, with expression information, and provide this as an integrated, public resource that allows the linking of transcripts and libraries with expression terms. Both the vocabularies and the vocabulary-annotated libraries can be retrieved from http://www.sanbi.ac.za/evoc/. Several groups are involved in developing this resource with the aim of unifying transcript expression information.
Abstract:
OBJECTIVES: This study aimed to investigate whether data from medical teleconsultations may contribute to influenza surveillance. METHODS: International Classification of Primary Care 2nd Edition (ICPC-2) codes were used to analyse the proportion of teleconsultations due to influenza-related symptoms. Results were compared with the weekly Swiss Sentinel reports. RESULTS: When using the ICPC-2 code for fever, we could reproduce the seasonal influenza peaks of the winter seasons 07/08, 08/09 and 09/10 as depicted by the Sentinel data. For the pandemic influenza 09/10, we detected a much higher first peak in summer 2009, which correlated with a potential underreporting in the Sentinel system. CONCLUSIONS: ICPC-2 data from medical teleconsultations allow influenza surveillance in real time and correlate very well with the Swiss Sentinel system.
Abstract:
MOTIVATION: Microarray results accumulated in public repositories are widely reused in meta-analytical studies and secondary databases. The quality of the data obtained with this technology varies from experiment to experiment, and an efficient method for quality assessment is necessary to ensure their reliability. RESULTS: The lack of a good benchmark has hampered evaluation of existing methods for quality control. In this study, we propose a new independent quality metric that is based on evolutionary conservation of expression profiles. We show, using 11 large organ-specific datasets, that IQRray, a new quality metric that we developed, exhibits the highest correlation with this reference metric among the 14 metrics tested. IQRray outperforms other methods in identifying poor-quality arrays in datasets composed of arrays from many independent experiments. In contrast, the performance of methods designed for detecting outliers in a single experiment, such as Normalized Unscaled Standard Error and Relative Log Expression, was low because these methods cannot detect datasets containing only low-quality arrays and because their scores cannot be directly compared between experiments. AVAILABILITY AND IMPLEMENTATION: The R implementation of IQRray is available at: ftp://lausanne.isb-sib.ch/pub/databases/Bgee/general/IQRray.R. CONTACT: Marta.Rosikiewicz@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Abstract:
To compare the impact of meeting specific classification criteria [modified New York (mNY), European Spondyloarthropathy Study Group (ESSG), and Assessment of SpondyloArthritis international Society (ASAS) criteria] on anti-tumor necrosis factor (anti-TNF) drug retention, and to determine predictive factors of better drug survival. All patients fulfilling the ESSG criteria for axial spondyloarthritis (SpA) with available data on the axial ASAS and mNY criteria, and who had received at least one anti-TNF treatment were retrospectively retrieved in a single academic institution in Switzerland. Drug retention was computed using survival analysis (Kaplan-Meier), adjusted for potential confounders. Of the 137 patients classified as having axial SpA using the ESSG criteria, 112 also met the ASAS axial SpA criteria, and 77 fulfilled the mNY criteria. Drug retention rates at 12 and 24 months for the first biologic therapy were not significantly different between the diagnostic groups. Only the small ASAS non-classified axial SpA group (25 patients) showed a nonsignificant trend toward shorter drug survival. Elevated CRP level, but not the presence of bone marrow edema on magnetic resonance imaging (MRI) scans, was associated with significantly better drug retention (OR 7.9, ICR 4-14). In this cohort, anti-TNF drug survival was independent of the classification criteria. Elevated CRP level, but not positive MRI, was associated with better drug retention.
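Drug retention in this abstract is estimated with the Kaplan-Meier method. A minimal, unadjusted sketch of that estimator follows; the study additionally adjusts for confounders, which is not shown here. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier retention curve.

    durations : follow-up time per patient (e.g. months on the drug)
    events    : 1 if the drug was discontinued at that time, 0 if censored
    Returns a list of (time, retention probability) at each event time.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e == 1})
    retention = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        stopped = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        retention *= 1.0 - stopped / at_risk
        curve.append((t, retention))
    return curve


# Hypothetical follow-up data for six patients (months, discontinuation flag).
durations = [6, 12, 12, 18, 24, 24]
events = [1, 1, 0, 1, 0, 0]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored patients (still on the drug at last follow-up) stay in the at-risk count until their last observed time, which is what distinguishes this estimate from a simple proportion.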