977 results for Gene-ontology
Summary:
Background: RNAs transcribed from intronic regions of genes are involved in a number of processes related to post-transcriptional control of gene expression. However, the complement of human genes in which introns are transcribed, and the number of intronic transcriptional units and their tissue expression patterns are not known. Results: A survey of mRNA and EST public databases revealed more than 55,000 totally intronic noncoding (TIN) RNAs transcribed from the introns of 74% of all unique RefSeq genes. Guided by this information, we designed an oligoarray platform containing sense and antisense probes for each of 7,135 randomly selected TIN transcripts plus the corresponding protein-coding genes. We identified exonic and intronic tissue-specific expression signatures for human liver, prostate and kidney. The most highly expressed antisense TIN RNAs were transcribed from introns of protein-coding genes significantly enriched (p = 0.002 to 0.022) in the 'Regulation of transcription' Gene Ontology category. RNA polymerase II inhibition resulted in increased expression of a fraction of intronic RNAs in cell cultures, suggesting that other RNA polymerases may be involved in their biosynthesis. Members of a subset of intronic and protein-coding signatures transcribed from the same genomic loci have correlated expression patterns, suggesting that intronic RNAs regulate the abundance or the pattern of exon usage in protein-coding messages. Conclusion: We have identified diverse intronic RNA expression patterns, pointing to distinct regulatory roles. This gene-oriented approach, using a combined intron-exon oligoarray, should permit further comparative analysis of intronic transcription under various physiological and pathological conditions, thus advancing current knowledge about the biological functions of these noncoding RNAs.
Summary:
Background: The insect exoskeleton provides shape, waterproofing, and locomotion via attached somatic muscles. The exoskeleton is renewed during molting, a process regulated by ecdysteroid hormones. The holometabolous pupa transforms into an adult during the imaginal molt, when the epidermis synthesizes the definitive exoskeleton that then differentiates progressively. An important issue in insect development concerns how the exoskeletal regions are constructed to provide their morphological, physiological and mechanical functions. We used whole-genome oligonucleotide microarrays to screen for genes involved in exoskeletal formation in the honeybee thoracic dorsum. Our analysis included three sampling times during the pupal-to-adult molt, i.e., before, during and after the ecdysteroid-induced apolysis that triggers synthesis of the adult exoskeleton. Results: Gene ontology annotation based on orthologous relationships with Drosophila melanogaster genes placed the honeybee differentially expressed genes (DEGs) into distinct categories of Biological Process and Molecular Function, depending on developmental time, revealing the functional elements required for adult exoskeleton formation. Of the 1,253 unique DEGs, 547 were upregulated in the thoracic dorsum after apolysis, suggesting induction by the ecdysteroid pulse. The upregulated gene set included 20 of the 47 cuticular protein (CP) genes that were previously identified in the honeybee genome, and three novel putative CP genes that do not belong to a known CP family. In situ hybridization showed that two of the novel genes were abundantly expressed in the epidermis during adult exoskeleton formation, strongly implicating them as genuine CP genes. Conserved sequence motifs identified the CP genes as members of the CPR, Tweedle, Apidermin, CPF, CPLCP1 and Analogous-to-Peritrophins families.
Furthermore, 28 of the 36 muscle-related DEGs were upregulated during the de novo formation of striated fibers attached to the exoskeleton. A search for cis-regulatory motifs in the 5′-untranslated region of the DEGs revealed potential binding sites for known transcription factors. Construction of a regulatory network showed that various upregulated CP- and muscle-related genes (15 and 21 genes, respectively) share common elements, suggesting co-regulation during thoracic exoskeleton formation. Conclusions: These findings help reveal molecular aspects of rigid thoracic exoskeleton formation during the ecdysteroid-coordinated pupal-to-adult molt in the honeybee.
Summary:
Background: Propolis is a natural product of plant resins collected by honeybees (Apis mellifera) from various plant sources. Our previous studies indicated that propolis sensitivity is dependent on the mitochondrial function and that vacuolar acidification and autophagy are important for yeast cell death caused by propolis. Here, we extended our understanding of propolis-mediated cell death in the yeast Saccharomyces cerevisiae by applying systems biology tools to analyze the transcriptional profiling of cells exposed to propolis. Methods: We have used transcriptional profiling of S. cerevisiae exposed to propolis. We validated our findings by using real-time PCR of selected genes. Systems biology tools (physical protein-protein interaction [PPPI] network) were applied to analyse the propolis-induced transcriptional behavior, aiming to identify which pathways are modulated by propolis in S. cerevisiae and potentially influencing cell death. Results: We were able to observe 1,339 genes modulated in at least one time point when compared to the reference time (propolis untreated samples) (t-test, p-value 0.01). Enrichment analysis performed with the Gene Ontology (GO) Term Finder tool showed enrichment for several biological categories among the genes up-regulated in the microarray hybridization, such as transport and transmembrane transport and response to stress. Real-time RT-PCR analysis of selected genes showed that our microarray hybridization approach was capable of providing information about S. cerevisiae gene expression modulation with a considerably high level of confidence. Finally, a physical protein-protein (PPPI) network design and global topological analysis stressed the importance of these pathways in the response of S. cerevisiae to propolis and were correlated with the transcriptional data obtained through the microarray analysis. Conclusions: In summary, our data indicate that propolis is largely affecting several pathways in the eukaryotic cell.
However, the most prominent pathways are related to oxidative stress, mitochondrial electron transport chain, vacuolar acidification, regulation of macroautophagy associated with protein target to vacuole, cellular response to starvation, and negative regulation of transcription from RNA polymerase II promoter. Our work emphasizes again the importance of S. cerevisiae as a model system to understand at molecular level the mechanism whereby propolis causes cell death in this organism at the concentration herein tested. Our study is the first one that investigates systematically by using functional genomics how propolis influences and modulates the mRNA abundance of an organism and may stimulate further work on the propolis-mediated cell death mechanisms in fungi.
Summary:
Background: Bone fractures and loss represent significant costs for the public health system and often affect the patients' quality of life; therefore, understanding the molecular basis for bone regeneration is essential. Cytokines, such as IL-6, IL-10 and TNFα, secreted by inflammatory cells at the lesion site, at the very beginning of the repair process, act as chemotactic factors for mesenchymal stem cells, which proliferate and differentiate into osteoblasts through the autocrine and paracrine action of bone morphogenetic proteins (BMPs), mainly BMP-2. Although it is known that BMP-2 binds to ActRI/BMPR and activates the SMAD 1/5/8 downstream effectors, little is known about the intracellular mechanisms participating in osteoblastic differentiation. We assessed differences in the phosphorylation status of different cellular proteins upon BMP-2 osteogenic induction of isolated murine skin mesenchymal stem cells using Triplex Stable Isotope Dimethyl Labeling coupled with LC/MS. Results: From 150 μg of starting material, 2,264 proteins were identified and quantified at five different time points, 235 of which are differentially phosphorylated. Kinase motif analysis showed that several substrates display phosphorylation sites for Casein Kinase, p38, CDK and JNK. Gene ontology analysis showed an increase in biological processes related to signaling and differentiation at early time points after BMP2 induction. Moreover, proteins involved in cytoskeleton rearrangement, Wnt and Ras pathways were found to be differentially phosphorylated during all time points studied. Conclusions: Taken together, these data allow new insights into the intracellular substrates that are phosphorylated early on during BMP2-driven osteoblastic differentiation of skin-derived mesenchymal stem cells.
Summary:
Background: Intronic and intergenic long noncoding RNAs (lncRNAs) are emerging gene expression regulators. The molecular pathogenesis of renal cell carcinoma (RCC) is still poorly understood, and in particular, limited studies are available for intronic lncRNAs expressed in RCC. Methods: Microarray experiments were performed with custom-designed arrays enriched with probes for lncRNAs mapping to intronic genomic regions. Samples from 18 primary RCC tumors and 11 nontumor adjacent matched tissues were analyzed. Meta-analyses were performed with microarray expression data from three additional human tissues (normal liver, prostate tumor and kidney nontumor samples), and with large-scale public data for epigenetic regulatory marks and for evolutionarily conserved sequences. Results: A signature of 29 intronic lncRNAs differentially expressed between RCC and nontumor samples was obtained (false discovery rate (FDR) <5%). A signature of 26 intronic lncRNAs significantly correlated with the RCC five-year patient survival outcome was identified (FDR <5%, p-value ≤0.01). We identified 4303 intronic antisense lncRNAs expressed in RCC, of which 22% were significantly (p <0.05) cis correlated with the expression of the mRNA in the same locus across RCC and three other human tissues. Gene Ontology (GO) analysis of those loci pointed to 'regulation of biological processes' as the main enriched category. A module map analysis of the protein-coding genes significantly (p <0.05) trans correlated with the 20% most abundant lncRNAs identified 51 enriched GO terms (p <0.05). We determined that 60% of the expressed lncRNAs are evolutionarily conserved. At the genomic loci containing the intronic RCC-expressed lncRNAs, a strong association (p <0.001) was found between their transcription start sites and genomic marks such as CpG islands, RNA Pol II binding and histone methylation and acetylation. Conclusion: Intronic antisense lncRNAs are widely expressed in RCC tumors.
Some of them are significantly altered in RCC in comparison with nontumor samples. The majority of these lncRNAs is evolutionarily conserved and possibly modulated by epigenetic modifications. Our data suggest that these RCC lncRNAs may contribute to the complex network of regulatory RNAs playing a role in renal cell malignant transformation.
Summary:
To identify the regions of recurrent copy number abnormality in osteosarcoma and their effect on gene expression, we performed an integrated genome-wide high-resolution array CGH (aCGH) and gene expression profiling analysis on 40 human OS tissues and 12 OS cell lines. This analysis identified several recurrent chromosome regions that contain genes that show a gene dosage effect on gene expression. A further search, performed on those genes that were over-expressed and localized in the frequently amplified chromosomal regions, greatly reduced the number of candidate genes while their characterization using gene ontology (GO) analysis suggests the importance of the deregulation of the G1-to-S phase in the development of the disease. We also identified frequent deletions on 3q in the vicinity of LSAMP and performed a fine mapping analysis of the breakpoints. We precisely mapped the breakpoints in several instances and demonstrated that the majority do not involve the LSAMP gene itself, and that they appear to form by a process of non-homologous end joining. In addition, aCGH analysis revealed frequent gains of IGF1R that were highly correlated with its protein level. Blockade of IGF1R in OS cell lines with high copy number gain led to growth inhibition suggesting that IGF1R may be a viable drug target in OS, particularly in patients with copy number driven overexpression of this receptor.
Summary:
Bioinformatics, in the last few decades, has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome has been obtained, the major problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, due to the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution is given by cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) inside clusters by means of a statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is possible by means of cluster-specific HMM profiles that can be used to calculate reliable template-to-target alignments even in the case of distantly related proteins (sequence identity < 30%). Other BAR+ based applications have been developed during my doctorate, including the prediction of magnesium binding sites in human proteins, the classification of the ABC transporter superfamily and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present, BAR+ is freely available as a web server for the functional and structural annotation of protein sequences at http://bar.biocomp.unibo.it/bar2.0.
Summary:
Columnar apple trees (Malus x domestica) represent an economically interesting growth form because of their striking phenotype. This extreme form of spur-type growth is characterized by an overall very slender, columnar habit, which permits dense planting and, with it, increased yields compared with normally growing trees. The phenotype is caused by the presence of a single dominant allele of the Columnar (Co) gene. Apart from the approximate localization of the gene on chromosome 10, nothing is known so far about its possible identity and function.
In the present work, a first attempt was made to gain insights into the transcriptome of the shoot apical meristem (SAM) of columnar apple trees using next-generation sequencing (NGS) technologies and RNA-Seq. It was shown that, regardless of the time at which the material was sampled, several hundred genes are differentially regulated. These can be grouped functionally into several overrepresented categories, some of which can in turn be associated with the columnar phenotype. Through further expression studies (microarrays, qRT-PCR), earlier results concerning hormone homeostasis were confirmed at the gene level and new findings were obtained that offer a possible explanation for the phenotype. Furthermore, comparison of all expression studies performed revealed an enrichment of significantly differentially regulated genes on chromosome 10, which points to a "selective sweep". A potential epigenetic regulation of these genes by the gene product of Co could therefore be possible. More than half of these genes can, moreover, be directly associated with the columnar phenotype on the basis of their function.
These results show that the presence of the Co allele brings about massive changes in gene regulation in the SAM, and that some of these differentially regulated genes are very likely involved in establishing the columnar phenotype. Although the function of the Co gene product could not be conclusively clarified, the results allow coherent hypotheses in this regard.
Summary:
Mesenchymal stem cells (MSC) are representatives of the adult stem cells. Owing to their great plasticity, they hold immense potential for clinical use in the form of stem cell therapies. Cells of this type occur predominantly in the bone marrow of the large long bones and can differentiate into bone, cartilage and fat cells. MSC make an important contribution to regenerative processes, for example the healing of fractures. Broad studies already demonstrate therapeutically promising applications for more complex diseases as well (e.g. osteoporosis). Often, lineages specifically differentiated from MSC in cell culture are used for this purpose. This requires controlled steering of the differentiation processes in vitro. The differentiation of a stem cell is based on a complex change in its gene expression. Gene expression patterns for the maintenance and proliferation of stem cells must be replaced by those that serve lineage-specific differentiation. The transcriptomic reorientation that accompanies differentiation is fundamental to understanding these processes and has so far been investigated only insufficiently. The aim of the present work is a transcriptome-wide, comparative gene expression analysis of mesenchymal stem cells and their in vitro differentiated lineages using plasmid DNA microarrays and next-generation sequencing techniques (RNA-Seq, Illumina platform). In this work, domestic cattle (Bos taurus) served as the model organism, because it is genetically highly similar to humans and bone marrow is readily available as a source of MSC. Primary cultures of mesenchymal stem cells were successfully isolated from bovine bone marrow. In vitro cell culture experiments were carried out to differentiate the cells into osteoblasts, chondrocytes and adipocytes. For gene expression analysis, RNA was isolated from young MSC and from a long-term MSC culture ("old MSC"), as well as from the differentiated cell lines, and amplified where necessary for subsequent experiments. The success of the differentiations was demonstrated by the expression of specific marker genes and by histological staining. Differentiation into osteoblasts and adipocytes proved successful, whereas differentiation into chondrocytes could not be achieved despite various modifications to the protocol. A comparative hybridization to determine differential gene expression (MSC vs. differentiated cells) using self-made plasmid DNA microarrays yielded new candidate genes: destrin and enpp1, among others, for osteogenesis, and sema3c for undifferentiated MSC. Elucidating their biological function in future experiments should yield promising results. Analysis of transcriptome-wide gene expression by NGS provided an even more comprehensive view of the differentiation process. The expression profiles of young MSC and adipocytes were highly similar, as were those of old MSC (a long-term culture) and osteoblasts. The old MSC showed clear signs of spontaneous differentiation in the osteogenic direction. Analysis of the 100 most highly expressed genes of each cell line revealed, for young MSC and adipocytes, particularly genes of the extracellular matrix (e.g. col1a1,6; fn1 and many more). Osteoblasts and old MSC, by contrast, more strongly express genes related to oxidative phosphorylation as well as ribosomal proteins. Examination of differential gene expression (young MSC vs. differentiated cells) followed by pathway analysis and Gene Ontology enrichment statistics supports these results, above all for osteoblasts, where genes for the regulation of bone development and mineralization now additionally come to the fore. For adipocytes, genes of the "Jak-STAT signaling pathway", of focal adhesion, and of the "Cytokine-cytokine receptor interaction pathway" provided very interesting insights into the biology of this cell type that certainly require further investigation. In undifferentiated MSC, differential gene expression analysis identified the non-canonical part of the WNT signaling pathway as potentially highly influential for maintaining stem cell status. The results discussed here show by way of example that high-throughput gene expression analysis in particular offers valuable insights into the complex biology of stem cell differentiation. As a basis for subsequent work, interesting genes were identified and hypotheses were formulated about their influence on stem cell properties and differentiation processes. To allow better insight into the course of differentiation, NGS analyses could in future be carried out at different time points of differentiation. In addition, further efforts toward successfully establishing chondrogenic differentiation would be desirable in order to analyze completely the gene expression of the trilineage differentiation potential of MSC.
Summary:
Outcrossing Arabidopsis species that diverged from their inbreeding relative Arabidopsis thaliana 5 million yr ago and display a biogeographical pattern of interspecific sympatry vs intraspecific allopatry provide an ideal model for studying impacts of gene introgression and polyploidization on species diversification. Flow cytometry analyses detected ploidy polymorphisms of 2x and 4x in Arabidopsis lyrata ssp. kamchatica of Taiwan. Genomic divergence between species/subspecies was estimated based on 98 randomly chosen nuclear genes. Multilocus analyses revealed a mosaic genome in diploid A. l. kamchatica composed of Arabidopsis halleri-like and A. lyrata-like alleles. Coalescent analyses suggest that the segregation of ancestral polymorphisms alone cannot explain the high inconsistency between gene trees across loci, and that gene introgression via diploid A. l. kamchatica likely distorts the molecular phylogenies of Arabidopsis species. However, not all genes migrated across species freely. Gene ontology analyses suggested that some nonmigrating genes were constrained by natural selection. High levels of estimated ancestral polymorphisms between A. halleri and A. lyrata suggest that gene flow between these species has not completely ceased since their initial isolation. Polymorphism data of extant populations also imply recent gene flow between the species. Our study reveals that interspecific gene flow affects the genome evolution in Arabidopsis.
Summary:
Streptococcus pneumoniae is the most common pathogen causing non-epidemic bacterial meningitis worldwide. The immune response and inflammatory processes contribute to the pathophysiology. Hence, the anti-inflammatory dexamethasone is advocated as adjuvant treatment although its clinical efficacy remains a question at issue. In experimental models of pneumococcal meningitis, dexamethasone increased neuronal damage in the dentate gyrus. Here, we investigated expressional changes in the hippocampus and cortex at 72 h after infection when dexamethasone was given to infant rats with pneumococcal meningitis. Nursing Wistar rats were intracisternally infected with Streptococcus pneumoniae to induce experimental meningitis or were sham-infected with pyrogen-free saline. Besides antibiotics, animals were either treated with dexamethasone or saline. Expressional changes were assessed by the use of GeneChip® Rat Exon 1.0 ST Arrays and quantitative real-time PCR. Protein levels of brain-derived neurotrophic factor, cytokines and chemokines were evaluated in immunoassays using Luminex xMAP® technology. In infected animals, 213 and 264 genes were significantly regulated by dexamethasone in the hippocampus and cortex respectively. Separately for the cortex and the hippocampus, Gene Ontology analysis identified clusters of biological processes which were assigned to the predefined categories "inflammation", "growth", "apoptosis" and others. Dexamethasone affected the expression of genes and protein levels of chemokines reflecting diminished activation of microglia. Dexamethasone-induced changes of genes related to apoptosis suggest the downregulation of the Akt-survival pathway and the induction of caspase-independent apoptosis. Signalling of pro-neurogenic pathways such as transforming growth factor pathway was reduced by dexamethasone resulting in a lack of pro-survival triggers. 
The anti-inflammatory properties of dexamethasone were observed on gene and protein level in experimental pneumococcal meningitis. Further dexamethasone-induced expressional changes reflect an increase of pro-apoptotic signals and a decrease of pro-neurogenic processes. The findings may help to identify potential mechanisms leading to apoptosis by dexamethasone in experimental pneumococcal meningitis.
Summary:
Background: Levels of differentiation among populations depend both on demographic and selective factors: genetic drift and local adaptation increase population differentiation, which is eroded by gene flow and balancing selection. We describe here the genomic distribution and the properties of genomic regions with unusually high and low levels of population differentiation in humans to assess the influence of selective and neutral processes on human genetic structure. Methods: Individual SNPs of the Human Genome Diversity Panel (HGDP) showing significantly high or low levels of population differentiation were detected under a hierarchical-island model (HIM). A Hidden Markov Model allowed us to detect genomic regions or islands of high or low population differentiation. Results: Under the HIM, only 1.5% of all SNPs are significant at the 1% level, but their genomic spatial distribution is significantly non-random. We find evidence that local adaptation shaped high-differentiation islands, as they are enriched for non-synonymous SNPs and overlap with previously identified candidate regions for positive selection. Moreover, there is a negative relationship between the size of islands and recombination rate, which is stronger for islands overlapping with genes. Gene ontology analysis supports the role of diet as a major selective pressure in those highly differentiated islands. Low-differentiation islands are also enriched for non-synonymous SNPs, and contain an overly high proportion of genes belonging to the 'Oncogenesis' biological process.
Conclusions: Even though selection seems to be acting in shaping islands of high population differentiation, neutral demographic processes might have promoted the appearance of some genomic islands since i) as much as 20% of islands are in non-genic regions, ii) these non-genic islands are on average two times shorter than genic islands, suggesting a more rapid erosion by recombination, and iii) most loci are strongly differentiated between Africans and non-Africans, a result consistent with known human demographic history.
Summary:
The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology. We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics data by proposing two statistical tests, namely edge permutation and node label permutation tests. We demonstrate the use of the proposed tests by finding significant association between a Gene Ontology-derived "predictome" and data obtained from mRNA expression and phenotypic experiments for Saccharomyces cerevisiae. Moreover, we employ the graph theoretic framework to recast a surprising discrepancy presented in Giaever et al. (2002) between gene expression and knockout phenotype, using expression data from a different set of experiments.
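As an illustration of the node label permutation idea mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes two undirected graphs over the same node set and uses the number of shared edges as the test statistic; the abstract does not state which statistic was used, and the names `edge_overlap` and `node_label_permutation_test` are hypothetical.

```python
import random


def edge_overlap(edges_a, edges_b):
    """Count undirected edges that appear in both edge lists."""
    set_a = {frozenset(edge) for edge in edges_a}
    set_b = {frozenset(edge) for edge in edges_b}
    return len(set_a & set_b)


def node_label_permutation_test(nodes, edges_a, edges_b, n_perm=1000, seed=0):
    """Empirical p-value for the edge overlap between two graphs.

    Assumption: the null model shuffles the node labels of graph B,
    which keeps B's topology but breaks its link to graph A.
    """
    rng = random.Random(seed)
    nodes = list(nodes)
    observed = edge_overlap(edges_a, edges_b)
    hits = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted_b = [(relabel[u], relabel[v]) for u, v in edges_b]
        if edge_overlap(edges_a, permuted_b) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    toy_nodes = list(range(8))
    toy_path = [(i, i + 1) for i in range(7)]
    print(node_label_permutation_test(toy_nodes, toy_path, toy_path, n_perm=200))
```

An edge permutation test would follow the same pattern but would shuffle which node pairs carry edges instead of relabeling the nodes.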
Summary:
Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data from poplar roots under nitrogen starvation and water deprivation conditions. We found that a low concentration of nitrogen led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures, through which we successfully identified biologically important genes. Differentially Expressed Genes (DEGs) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (in same domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes that changed during nitrogen starvation. Based on the above analyses, we examined the local Gene Regulatory Network (GRN) and identified a number of transcription factors. Testing showed that one of them is a hierarchically high-ranked transcription factor that affects root growth under nitrogen starvation. Analyzing gene expression data is tedious and time-consuming. To avoid manual analysis, we attempted to automate a computational pipeline that can now be used for identification of DEGs and protein domain analysis in a single run. It is implemented in Perl and R scripts.
Summary:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for developing reliable methods to systematically and automatically suggest protein functions and relationships between them. With the available PPI data, it is now possible to study the functions and relationships in the context of a large-scale network. To date, several network-based schemes have been provided to effectively annotate protein functions on a large scale. However, due to the inherent noise in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work in a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of the work is that their algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods which we applied to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work from Samanta and Liang, our algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins.
Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis on a subcluster enriched in the transforming growth factor β signaling pathway (P < 10^-50), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotations in this post-genomic era.
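To make the common-neighbor idea concrete, here is a minimal sketch that scores how surprising it is for two proteins to share a given number of interaction partners. It assumes a plain hypergeometric tail probability, which is not necessarily the exact formula used in this study or by Samanta and Liang; the function name `shared_neighbor_pvalue` and its parameters are hypothetical.

```python
from math import comb


def shared_neighbor_pvalue(n_total, deg_a, deg_b, shared):
    """P(X >= shared) under a hypergeometric null.

    Assumption: protein B picks its deg_b partners uniformly at random
    from n_total candidate proteins, deg_a of which are partners of A.
    X is the number of partners the two proteins have in common.
    """
    upper = min(deg_a, deg_b)
    denominator = comb(n_total, deg_b)
    tail = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Two proteins with 3 partners each in a 10-protein network sharing all 3.
    print(shared_neighbor_pvalue(10, 3, 3, 3))
```

A small p-value flags a pair as a candidate functional association. With many pairs tested, a multiple-testing correction would be needed before quoting false discovery rates like those reported above.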