977 results for Gene-ontology
Abstract:
The haploid phase of spermatogenesis (spermiogenesis) is characterized by a major change in chromatin structure and in the DNA topology of the spermatid. The mechanisms by which this change occurs, and the proteins involved, are not yet fully understood. My work established the presence of transient double-strand breaks during this remodeling using the comet assay and pulsed-field gel electrophoresis. Immunofluorescence on tissue sections and the use of a highly active nuclear extract confirmed the presence of topoisomerases as well as markers of repair systems. The repair proteins identified belong to error-prone systems, so this structural reorganization of chromatin could be genetically unstable and could explain the paternal bias observed for de novo mutations in recent studies involving high-throughput screens. A technique allowing the specific immunocapture of double-strand breaks was developed and applied to murine spermatids representing different stages of differentiation. High-throughput sequencing results showed that the double-strand breaks (hotspots) of spermiogenesis occur mostly in intergenic DNA, notably in LINE1 sequences, satellite DNA and simple repeats. The hotspots also contain binding motifs for proteins of the FOX and PRDM families, whose functions include binding and locally remodeling condensed chromatin. In addition, the binding motif of the BRCA1 protein is enriched in the double-strand break hotspots. BRCA1 acts, among other roles, in DNA repair by non-homologous end joining (NHEJ) and in the repair of DNA-topoisomerase adducts.
Remarkably, the recognition motif of the SPO11 protein, which is involved in the formation of meiotic breaks, was enriched in the hotspots, suggesting that the meiotic machinery may also be used during spermiogenesis to form the breaks. Finally, although the hotspots are located mainly in intergenic sequences, the genes targeted are involved in brain and neuron development. These results are consistent with the predominantly paternal origin observed for de novo mutations associated with autism spectrum disorders and schizophrenia, and with their increase with the father's age. Because the processes of spermatid chromatin remodeling are evolutionarily conserved, these results suggest that chromatin remodeling during spermiogenesis represents an additional mechanism contributing to the formation of de novo mutations, explaining the paternal bias observed for certain types of mutations.
Abstract:
Differences in gene expression of human bone marrow stromal cells (hBMSCs) during culture in three-dimensional (3D) nanofiber scaffolds or on two-dimensional (2D) films were investigated via pathway analysis of microarray mRNA expression profiles. Previous work has shown that hBMSC culture in nanofiber scaffolds can induce osteogenic differentiation in the absence of osteogenic supplements (OS). Analysis using ontology databases revealed that nanofibers and OS regulated similar pathways and that both were enriched for TGF-beta and cell-adhesion/ECM-receptor pathways. The most notable difference between the two was that nanofibers had stronger enrichment for cell-adhesion/ECM-receptor pathways. Comparison of nanofiber scaffolds with flat films yielded stronger differences in gene expression than comparison of nanofibers made from different polymers, suggesting that substrate structure had stronger effects on cell function than substrate polymer composition. These results demonstrate that physical (nanofibers) and biochemical (OS) signals regulate similar ontological pathways, suggesting that these cues use similar molecular mechanisms to control hBMSC differentiation. Published by Elsevier Ltd.
Abstract:
BACKGROUND: Broccoli consumption has been associated with a reduced risk of prostate cancer. Isothiocyanates (ITCs) derived from glucosinolates that accumulate in broccoli are dietary compounds that may mediate these health effects. Sulforaphane (SF, 4-methylsulphinylbutyl ITC) derives from heading broccoli (calabrese) and iberin (IB, 3-methylsulphinylpropyl ITC) from sprouting broccoli. While there are many studies regarding the biological activity of SF, mainly undertaken with cancerous cells, there are few studies associated with IB. METHODS: Primary epithelial and stromal cells were derived from benign prostatic hyperplasia tissue. Affymetrix U133 Plus 2.0 whole genome arrays were used to compare global gene expression between these cells, and to quantify changes in gene expression following exposure to physiologically appropriate concentrations of SF and IB. Ontology and pathway analyses were used to interpret results. Changes in expression of a subset of genes were confirmed by real-time RT-PCR. RESULTS: Global gene expression profiling identified epithelial and stromal-specific gene expression profiles. SF induced more changes in epithelial cells, whereas IB was more effective in stromal cells. Although IB and SF induced different changes in gene expression in both epithelial and stromal cells, these were associated with similar pathways, such as cell cycle and detoxification. Both ITCs increased expression of PLAGL1, a tumor suppressor gene, in stromal cells and suppressed expression of the putative tumor promoting genes IFITM1, CSPG2, and VIM in epithelial cells. CONCLUSION: These data suggest that IB and SF both alter genes associated with cancer prevention, and IB should be investigated further as a potential chemopreventative agent.
Abstract:
This thesis studies human gene expression space using high throughput gene expression data from DNA microarrays. In molecular biology, high throughput techniques allow numerical measurements of expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving the access to its data. It also contributed to the creation of several new tools for microarray data manipulation and the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for automatic conversion of human readable free text microarray data annotations into categorised format. The data comparability and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computation of a range of microarray data quality metrics and exclusion of incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Soldatova, L., Clare, A., Sparkes, A. and King, R. D. (2006) An ontology for a robot scientist. Bioinformatics 2006 22: 464-471
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Background: A current challenge in gene annotation is to define gene function in the context of a network of relationships instead of single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of this network interact with each other and keep their functions stable. However, in general there is not sufficient data to accurately recover GNs from expression levels, which leads to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several kinds of biological information in inference methods has recently increased significantly, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and the possibility of discovering new relationships between genes when the expression data are examined. Although several works in data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain from using biological information and expression profiles together. We also evaluated and compared the gain from adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs in a eukaryotic organism (P. falciparum).
Our results indicate that information based on direct interaction can produce a higher improvement in the gain than data about a less specific relationship such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.
Abstract:
Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data.
Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.
Abstract:
Background There is evidence that certain mutations in the double-strand break repair pathway ataxia-telangiectasia mutated gene act in a dominant-negative manner to increase the risk of breast cancer. There are also some reports to suggest that the amino acid substitution variants T2119C Ser707Pro and C3161G Pro1054Arg may be associated with breast cancer risk. We investigate the breast cancer risk associated with these two nonconservative amino acid substitution variants using a large Australian population-based case–control study. Methods The polymorphisms were genotyped in more than 1300 cases and 600 controls using 5' exonuclease assays. Case–control analyses and genotype distributions were compared by logistic regression. Results The 2119C variant was rare, occurring at frequencies of 1.4 and 1.3% in cases and controls, respectively (P = 0.8). There was no difference in genotype distribution between cases and controls (P = 0.8), and the TC genotype was not associated with increased risk of breast cancer (adjusted odds ratio = 1.08, 95% confidence interval = 0.59–1.97, P = 0.8). Similarly, the 3161G variant was no more common in cases than in controls (2.9% versus 2.2%, P = 0.2), there was no difference in genotype distribution between cases and controls (P = 0.1), and the CG genotype was not associated with an increased risk of breast cancer (adjusted odds ratio = 1.30, 95% confidence interval = 0.85–1.98, P = 0.2). This lack of evidence for an association persisted within groups defined by the family history of breast cancer or by age. Conclusion The 2119C and 3161G amino acid substitution variants are not associated with moderate or high risks of breast cancer in Australian women.
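The odds ratios, confidence intervals and P values reported above can be cross-checked against each other. The sketch below is not the authors' analysis: it assumes the intervals are Wald intervals on the log-odds scale (the usual output of logistic regression), and the function name `wald_summary` is hypothetical. Under that assumption, the geometric mean of the interval bounds should recover the odds ratio, and the interval width gives the standard error from which a two-sided P value follows.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z and two-sided p from an odds ratio and its 95% CI.

    Assumes a Wald interval: exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    # The interval spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    # For a symmetric log-scale interval this equals the odds ratio.
    midpoint = math.sqrt(ci_low * ci_high)
    return midpoint, se, z, p


# Values taken from the abstract: TC genotype (2119C) and CG genotype (3161G).
for label, (or_, lo, hi) in {
    "2119C TC": (1.08, 0.59, 1.97),
    "3161G CG": (1.30, 0.85, 1.98),
}.items():
    mid, se, z, p = wald_summary(or_, lo, hi)
    print(f"{label}: midpoint={mid:.2f} se={se:.3f} z={z:.2f} p={p:.2f}")
```

For both variants the recovered midpoint and P value agree with the reported figures to the precision given, which is consistent with the assumption of log-scale Wald intervals.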
Abstract:
The tissue kallikreins are serine proteases encoded by highly conserved multigene families. The rodent kallikrein (KLK) families are particularly large, consisting of 13–26 genes clustered in one chromosomal locus. It has been recently recognised that the human KLK gene family is of a similar size (15 genes) with the identification of another 12 related genes (KLK4-KLK15) within and adjacent to the original human KLK locus (KLK1-3) on chromosome 19q13.4. The structural organisation and size of these new genes are similar to those of other KLK genes except for additional exons encoding 5′ or 3′ untranslated regions. Moreover, many of these genes have multiple mRNA transcripts, a trait not observed with rodent genes. Unlike all other kallikreins, the KLK4-KLK15 encoded proteases are less related (25–44%) and do not contain a conventional kallikrein loop. Clusters of genes exhibit high prostatic (KLK2-4, KLK15) or pancreatic (KLK6-13) expression, suggesting evolutionary conservation of elements conferring tissue specificity. These genes are also expressed, to varying degrees, in a wider range of tissues, suggesting a functional involvement of these newer human kallikrein proteases in a diverse range of physiological processes.