47 results for DNA-microarray data
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Background: With the development of DNA hybridization microarray technologies, it is now possible to simultaneously assess the expression levels of thousands to tens of thousands of genes. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Due to technical biases, normalization of the intensity levels is a prerequisite to performing further statistical analyses. Therefore, choosing a suitable approach for normalization can be critical and deserves judicious consideration. Results: Here, we considered three commonly used normalization approaches, namely Loess, Splines and Wavelets, and two non-parametric regression methods that have yet to be used for normalization, namely Kernel smoothing and Support Vector Regression. The results obtained were compared using artificial microarray data and benchmark studies. The results indicate that Support Vector Regression is the most robust to outliers and that Kernel smoothing is the worst normalization technique, while no practical differences were observed between Loess, Splines and Wavelets. Conclusion: In light of our results, Support Vector Regression is favored for microarray normalization because it is more robust than the other methods in estimating the normalization curve.
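To make the idea of a normalization curve concrete, the following is a minimal sketch of intensity-dependent normalization on two-channel data, using Gaussian kernel smoothing (one of the five methods named in the abstract). It is chosen here only because it fits in a few lines of standard-library Python, not because it is recommended; the abstract reports it as the least robust option. The function names, the bandwidth value, and the toy intensities are all assumptions made for illustration and do not come from the paper.

```python
import math

def ma_values(red, green):
    """Convert two-channel intensities to (A, M) values on the log2 scale.

    A is the average log-intensity, M is the log-ratio between channels.
    """
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    return a, m

def kernel_smooth(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate of y at x0 using a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

def normalize(red, green, bandwidth=1.0):
    """Subtract the fitted intensity-dependent trend from each log-ratio."""
    a, m = ma_values(red, green)
    return [mi - kernel_smooth(a, m, ai, bandwidth) for ai, mi in zip(a, m)]

# Hypothetical toy data: a constant dye bias (red is always twice green).
green = [100.0, 200.0, 400.0, 800.0, 1600.0]
red = [2.0 * g for g in green]
normalized = normalize(red, green)
```

In this toy case the bias is the same at every intensity, so the fitted curve is flat and the normalized log-ratios are all approximately zero. Real arrays show intensity-dependent curvature and outliers, which is where the choice of smoother matters.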
Abstract:
Background: From shotgun libraries used for the genomic sequencing of the phytopathogenic bacterium Xanthomonas axonopodis pv. citri (XAC), clones that were representative of the largest possible number of coding sequences (CDSs) were selected to create a DNA microarray platform on glass slides (XACarray). The creation of the XACarray established a tool capable of providing data for the analysis of global genome expression in this organism. Findings: The inserts from the selected clones were amplified by PCR with the universal oligonucleotide primers M13R and M13F. The obtained products were purified and fixed in duplicate on glass slides specific for use in DNA microarrays. The number of spots on the microarray totaled 6,144 and included 768 positive controls and 624 negative controls per slide. Validation of the platform was performed through hybridization of total DNA probes from XAC labeled with different fluorophores, Cy3 and Cy5. In this validation assay, 86% of all PCR products fixed on the glass slides were confirmed to present a hybridization signal greater than twice the standard deviation of the global median signal-to-noise ratio. Conclusions: Our validation of the XACarray platform using DNA-DNA hybridization revealed that it can be used to evaluate the expression of 2,365 individual CDSs from all major functional categories, which corresponds to 52.7% of the annotated CDSs of the XAC genome. As a proof of concept, we used this platform in a previous work to verify the absence of genomic regions that could not be detected by sequencing in related strains of Xanthomonas.
Abstract:
Background: The search for enriched (also known as over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for system-level analysis. This procedure tries to summarize the information by focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the latter has been used to deal with the ontology term enrichment problem. Results: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existing pipelines; Windows, to be controlled interactively; and as a web tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion: The Bayesian model accounts for the fact that not all the genes from a given category may be observable in microarray data, due to low-intensity signal, quality filters, genes that were not spotted, and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.
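For context, the "common significance analysis" that the abstract contrasts with BayGO is often carried out as a one-sided hypergeometric test. The sketch below illustrates that conventional test only; it is not BayGO's Bayesian model, and the function name and toy numbers are assumptions made for illustration.

```python
from math import comb

def hypergeom_enrichment_pvalue(n_genes, n_in_term, n_selected, n_overlap):
    """Return P(X >= n_overlap) under sampling without replacement.

    n_genes    : genes on the array (the background)
    n_in_term  : background genes annotated with the ontology term
    n_selected : genes in the list of interest
    n_overlap  : selected genes that carry the term
    """
    upper = min(n_in_term, n_selected)
    total = comb(n_genes, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_genes - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20 genes on the array, 5 annotated with the term,
# 5 selected as differentially expressed, and all 5 selected carry the term.
p_value = hypergeom_enrichment_pvalue(20, 5, 5, 5)
```

A small p-value says the overlap is unlikely by chance, but it does not by itself measure how strongly the term is associated with differential expression, which is the distinction the abstract draws.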
Abstract:
Background: Smallpox is a lethal disease that was endemic in many parts of the world until it was eradicated by massive immunization. Due to its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand the survival of smallpox-infected macaques, using systems biology approaches. Our goal is to improve knowledge about the progression of this disease. Results: We used KEGG pathway annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights into the host response to this disease, such as increased expression of the cytokines and ECM receptors in the individuals with higher survival times. These results suggest that these gene groups may influence an effective host response to smallpox. Conclusion: Macaques with higher survival times clearly express some specific pathways not previously identified using regular gene-by-gene approaches. Our work also shows how third-party analysis of public datasets can be important in supporting new hypotheses for relevant biological problems.
Abstract:
A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. Usually, this is done using a statistic and a cutoff value that together separate differentially from nondifferentially expressed genes. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes in the study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using small sample sizes, which are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the bacterium Escherichia coli.
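The "statistic plus cutoff" baseline described above can be sketched with Welch's t statistic, which allows the treatment and control variances to differ. This illustrates only the standard approach the authors compare against, not their Bayesian procedure; the gene names, expression values, and cutoff of 4.0 are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(treatment, control):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se2 = variance(treatment) / len(treatment) + variance(control) / len(control)
    return (mean(treatment) - mean(control)) / math.sqrt(se2)

def flag_genes(expression, cutoff):
    """Return the genes whose absolute t statistic exceeds the chosen cutoff."""
    return [
        gene
        for gene, (treat, ctrl) in expression.items()
        if abs(welch_t(treat, ctrl)) > cutoff
    ]

# Hypothetical log-expression values with three replicates per condition.
expression = {
    "geneA": ([8.0, 8.2, 7.9], [5.0, 5.1, 4.9]),
    "geneB": ([6.0, 6.3, 5.8], [6.1, 5.9, 6.2]),
}
flagged = flag_genes(expression, cutoff=4.0)
```

With only three replicates per group the variance estimates are noisy, which is the small-sample setting the abstract targets.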
Abstract:
Patients with type 2 diabetes mellitus (T2DM) exhibit insulin resistance associated with obesity and inflammatory response, besides an increased level of oxidative DNA damage as a consequence of the hyperglycemic condition and the generation of reactive oxygen species (ROS). In order to provide information on the mechanisms involved in the pathophysiology of T2DM, we analyzed the transcriptional expression patterns exhibited by peripheral blood mononuclear cells (PBMCs) from patients with T2DM compared to non-diabetic subjects, by investigating several biological processes: inflammatory and immune responses, responses to oxidative stress and hypoxia, fatty acid processing, and DNA repair. PBMCs were obtained from 20 T2DM patients and eight non-diabetic subjects. Total RNA was hybridized to Agilent whole human genome 4x44K one-color oligo-microarray. Microarray data were analyzed using the GeneSpring GX 11.0 software (Agilent). We used BRB-ArrayTools software (gene set analysis - GSA) to investigate significant gene sets and the Genomica tool to study a possible influence of clinical features on gene expression profiles. We showed that PBMCs from T2DM patients presented significant changes in gene expression, exhibiting 1320 differentially expressed genes compared to the control group. A great number of genes were involved in biological processes implicated in the pathogenesis of T2DM. Among the genes with high fold-change values, the up-regulated ones were associated with fatty acid metabolism and protection against lipid-induced oxidative stress, while the down-regulated ones were implicated in the suppression of pro-inflammatory cytokine production and DNA repair. Moreover, we identified two significant signaling pathways: adipocytokine, related to insulin resistance; and ceramide, related to oxidative stress and induction of apoptosis. In addition, expression profiles were not influenced by patient features, such as age, gender, obesity, pre/post-menopause age, neuropathy, glycemia, and HbA(1c) percentage. Hence, by studying expression profiles of PBMCs, we identified quantitative and qualitative differences and similarities between T2DM patients and non-diabetic individuals, contributing new perspectives for a better understanding of the disease. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Iphisa elegans Gray, 1851 is a ground-dwelling lizard widespread over Amazonia that displays a broadly conserved external morphology over its range. This wide geographical distribution and conservation of body form contrast with the expected poor dispersal ability of the species, the tumultuous past of Amazonia, and the previously documented prevalence of cryptic species in widespread terrestrial organisms in this region. Here we investigate this homogeneity by examining hemipenial morphology and conducting phylogenetic analyses of mitochondrial (CYTB) and nuclear (C-MOS) DNA sequence data from 49 individuals sampled across Amazonia. We detected remarkable variation in hemipenial morphology within this species, with multiple cases of sympatric occurrence of distinct hemipenial morphotypes. Phylogenetic analyses revealed highly divergent lineages corroborating the patterns suggested by the hemipenial morphotypes, including co-occurrence of different lineages. The degrees of genetic and morphological distinctness, as well as instances of sympatry among mtDNA lineages/morphotypes without nuDNA allele sharing, suggest that I. elegans is a complex of cryptic species. An extensive and integrative taxonomic revision of the I. elegans complex throughout its wide geographical range is needed. (c) 2012 The Linnean Society of London, Zoological Journal of the Linnean Society, 2012, 166, 361-376.
Abstract:
Xylella fastidiosa inhabits the plant xylem, a nutrient-poor environment, so mechanisms to sense and respond to adverse environmental conditions are extremely important for bacterial survival in the plant host. Although the complete genome sequences of different Xylella strains have been determined, little is known about stress responses and gene regulation in these organisms. In this work, a DNA microarray was constructed containing 2,600 ORFs identified in the genome sequencing project of Xylella fastidiosa strain 9a5c, and used to check global gene expression differences in the bacterium when it infects a symptomatic and a tolerant citrus tree. Different patterns of expression were found in each variety, suggesting that the bacteria respond differently according to each plant xylem environment. The global gene expression profile was determined and several genes related to bacterial survival under stress conditions were found to be differentially expressed between varieties, suggesting the involvement of different strategies for adaptation to the environment. The expression patterns of some genes related to the heat shock response, toxin and detoxification processes, adaptation to atypical conditions, and repair systems, as well as some regulatory genes, are discussed in this paper. The DNA microarray proved to be a powerful technique for global transcriptome analyses. This is one of the first studies of Xylella fastidiosa gene expression in vivo, and it helped to increase insight into stress responses and possible bacterial survival mechanisms in the nutrient-poor environment of xylem vessels.
Abstract:
Breast cancer metastasis is a leading cause of death by malignancy in women worldwide. Efforts are being made to further characterize the rate-limiting steps of cancer metastasis, i.e. extravasation of circulating tumor cells and colonization of secondary organs. In this study, we investigated whether angiotensin II, a major vasoactive peptide both produced locally and released in the bloodstream, may trigger activating signals that contribute to cancer cell extravasation and metastasis. We used an experimental in vivo model of cancer metastasis in which bioluminescent breast tumor cells (D3H2LN) were injected intra-cardiacally into nude mice in order to recapitulate the late and essential steps of metastatic dissemination. Real-time intravital imaging studies revealed that angiotensin II accelerates the formation of metastatic foci at secondary sites. Pre-treatment of cancer cells with the peptide increases the number of mice with metastases, as well as the number and size of metastases per mouse. In vitro, angiotensin II contributes to each sequential step of cancer metastasis by promoting cancer cell adhesion to endothelial cells, trans-endothelial migration and tumor cell migration across extracellular matrix. At the molecular level, a total of 102 genes differentially expressed following angiotensin II pretreatment were identified by comparative DNA microarray. Angiotensin II regulates two groups of connected genes related to its precursor angiotensinogen. Among those, up-regulated MMP2/MMP9 and ICAM1 stand at the crossroad of a network of genes involved in cell adhesion, migration and invasion. Our data suggest that targeting angiotensin II production or action may represent a valuable therapeutic option to prevent metastatic progression of invasive breast tumors.
Abstract:
Background and Aim: The identification of gastric carcinomas (GC) has traditionally been based on histomorphology. Recently, DNA microarrays have successfully been used to identify tumors through clustering of the expression profiles. Random forest clustering is widely used for tissue microarrays and other immunohistochemical data, because it handles highly skewed tumor marker expressions well, and weighs the contribution of each marker according to its relatedness with other tumor markers. In the present study, we identified biologically and clinically meaningful groups of GC by hierarchical clustering analysis of immunohistochemical protein expression. Methods: We selected 28 proteins (p16, p27, p21, cyclin D1, cyclin A, cyclin B1, pRb, p53, c-met, c-erbB-2, vascular endothelial growth factor, transforming growth factor [TGF]-beta I, TGF-beta II, MutS homolog-2, bcl-2, bax, bak, bcl-x, adenomatous polyposis coli, clathrin, E-cadherin, beta-catenin, mucin [MUC] 1, MUC2, MUC5AC, MUC6, matrix metalloproteinase [MMP]-2, and MMP-9) to be investigated by immunohistochemistry in 482 GC. The data were analyzed using a random forest clustering method. Results: Proteins related to cell cycle, growth factor, cell motility, cell adhesion, apoptosis, and matrix remodeling were highly expressed in GC. We identified protein expressions associated with poor survival in diffuse-type GC. Conclusions: Based on the expression analysis of 28 proteins, we identified two groups of GC that could not be explained by any clinicopathological variables, and a subgroup of long-surviving diffuse-type GC patients with a distinct molecular profile. These results provide not only a new molecular basis for understanding the biological properties of GC, but also better prediction of survival than the classic pathological grouping.
Abstract:
Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. As with other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly online tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
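To illustrate why the simplex constraint matters, the sketch below computes a distance between tag-count libraries using the centered log-ratio (CLR) transform and the resulting Aitchison distance, a common choice in compositional data analysis. This is an assumption made for illustration and is not taken from the Simcluster source; the function names and counts are hypothetical, and all counts are assumed strictly positive (zeros would need separate handling).

```python
import math

def closure(counts):
    """Rescale counts so that the components sum to one (a point on the simplex)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the CLR-transformed compositions."""
    cx, cy = clr(closure(x)), clr(closure(y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

# Hypothetical tag counts for three transcripts in three libraries.
library_a = [10, 30, 60]
library_b = [100, 300, 600]  # same proportions as library_a, sequenced 10x deeper
library_c = [60, 30, 10]

d_ab = aitchison_distance(library_a, library_b)
d_ac = aitchison_distance(library_a, library_c)
```

Libraries A and B differ only in sequencing depth, so their distance is approximately zero, whereas a plain Euclidean distance on raw counts would report them as far apart. A and C have genuinely different proportions and remain well separated.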
Abstract:
Background: Several mathematical and statistical methods have been proposed in the last few years to analyze microarray data. Most of those methods involve complicated formulas and software implementations that require advanced computer programming skills. Researchers from other areas may experience difficulties when attempting to use those methods in their research. Here we present a user-friendly toolbox that allows large-scale gene expression analysis to be carried out by biomedical researchers with limited programming skills. Results: Here, we introduce a user-friendly toolbox called GEDI (Gene Expression Data Interpreter), an extensible, open-source, and freely available tool that we believe will be useful to a wide range of laboratories, and to researchers with no background in Mathematics and Computer Science, allowing them to analyze their own data by applying both classical and advanced approaches developed and recently published by Fujita et al. Conclusion: GEDI is an integrated user-friendly viewer that combines the state-of-the-art SVR, DVAR and SVAR algorithms previously developed by us. It facilitates the application of SVR, DVAR and SVAR beyond the mathematical formulas present in the corresponding publications, and allows one to better understand the results by means of the available visualizations. Both running the statistical methods and visualizing the results are carried out within the graphical user interface, rendering these algorithms accessible to the broad community of researchers in Molecular Biology.
Abstract:
Background: In the alpha subclass of proteobacteria, iron homeostasis is controlled by diverse iron-responsive regulators. Caulobacter crescentus, an important freshwater α-proteobacterium, uses the ferric uptake repressor (Fur) for this purpose. However, the impact of iron availability on the C. crescentus transcriptome and an overall perspective of the regulatory networks involved remain unknown. Results: In this work we report the identification of iron-responsive and Fur-regulated genes in C. crescentus using microarray-based global transcriptional analyses. We identified 42 genes that were strongly upregulated both by mutation of fur and by iron limitation. Among them are genes involved in iron uptake (four TonB-dependent receptor gene clusters, and feoAB) and riboflavin biosynthesis, and genes encoding hypothetical proteins. Most of these genes are associated with predicted Fur binding sites, implicating them as direct targets of Fur-mediated repression. These data were validated by β-galactosidase and EMSA assays for two operons encoding putative transporters. The role of Fur as a positive regulator is also evident, given that 27 genes were downregulated both by mutation of fur and under low-iron conditions. As expected, this group includes many genes involved in energy metabolism, mostly iron-using enzymes. Surprisingly, this group also includes TonB-dependent receptor genes and the genes fixK, fixT and ftrB, encoding an oxygen signaling network required for growth during hypoxia. Bioinformatics analyses suggest that positive regulation by Fur is mainly indirect. In addition to the Fur modulon, iron limitation altered the expression of 113 more genes, including induction of genes involved in Fe-S cluster assembly, oxidative stress and heat shock response, as well as repression of genes implicated in amino acid metabolism, chemotaxis and motility. Conclusions: Using a global transcriptional approach, we determined the C. crescentus iron stimulon. Many but not all of the iron-responsive genes were directly or indirectly controlled by Fur. The iron limitation stimulon overlaps with other regulatory systems, such as the RpoH and FixK regulons. Altogether, our results showed that adaptation of C. crescentus to iron limitation not only involves increasing the transcription of iron-acquisition systems and decreasing the production of iron-using proteins, but also includes novel genes and regulatory mechanisms.
Abstract:
Background: Papaya (Carica papaya L.) is a commercially important crop that produces climacteric fruits with a soft and sweet pulp that contains a wide range of health-promoting phytochemicals. Despite its importance, little is known about transcriptional modifications during papaya fruit ripening and their control. In this study we report the analysis of the ripe papaya transcriptome using a cross-species (XSpecies) microarray technique based on the phylogenetic proximity between papaya and Arabidopsis thaliana. Results: Papaya transcriptome analyses resulted in the identification of 414 ripening-related genes, some of which had their expression validated by qPCR. The transcription profile was compared with that from ripening tomato and grape. There were many similarities between papaya and tomato, especially with respect to the expression of genes encoding proteins involved in primary metabolism, regulation of transcription, biotic and abiotic stress, and cell wall metabolism. XSpecies microarray data indicated that transcription factors (TFs) of the MADS-box, NAC and AP2/ERF gene families were involved in the control of papaya ripening and revealed that cell wall-related gene expression in papaya had similarities to the expression profiles seen in Arabidopsis during hypocotyl development. Conclusion: The cross-species array experiment identified a ripening-related set of genes in papaya, allowing the comparison of transcription control between papaya and other fruit-bearing taxa during the ripening process.
Abstract:
Introduction: Ovarian adenocarcinoma is frequently detected at a late stage, when therapy efficacy is limited and death occurs in up to 50% of the cases. A potential novel treatment for this disease is a monoclonal antibody that recognizes the sodium-dependent phosphate transporter protein 2b (NaPi2b). Materials and Methods: To better understand the expression of this protein in different histologic types of ovarian carcinomas, we immunostained 50 tumor samples with the anti-NaPi2b monoclonal antibody MX35 and, in parallel, assessed the expression of the gene encoding NaPi2b (SLC34A2) by in silico analysis of microarray data. Results: Both approaches detected higher expression of NaPi2b (SLC34A2) in ovarian carcinoma than in normal tissue. Moreover, a comprehensive analysis indicates that SLC34A2 is the only one of the several phosphate transporter genes whose expression differentiates normal from carcinoma samples, suggesting it might play a major role in ovarian carcinomas. Immunohistochemical and mRNA expression data have also shown that 2 histologic subtypes of ovarian carcinoma express particularly high levels of NaPi2b: serous and clear cell adenocarcinomas. Serous adenocarcinomas are the most frequent, whereas clear cell carcinomas are rare and have a worse prognosis. Conclusion: This identification of subgroups of patients expressing NaPi2b may be important in selecting cohorts who should most likely be included in future clinical trials, as a humanized version of MX35 has recently been developed.