922 results for High-Throughput Nucleotide Sequencing
Abstract:
Chromatin immunoprecipitation (ChIP) provides a means of enriching DNA associated with transcription factors, histone modifications, and indeed any other proteins for which suitably characterized antibodies are available. Over the years, sequence detection has progressed from quantitative real-time PCR and Southern blotting to microarrays (ChIP-chip) and now high-throughput sequencing (ChIP-seq). This progression has vastly increased the sequence coverage and data volumes generated. This in turn has enabled informaticians to predict the identity of multi-protein complexes on DNA based on the overrepresentation of sequence motifs in DNA enriched by ChIP with a single antibody against a single protein. In the course of the development of high-throughput sequencing, little has changed in the ChIP methodology until recently. In the last three years, a number of modifications have been made to the ChIP protocol with the goal of enhancing the sensitivity of the method and further reducing the levels of nonspecific background sequences in ChIPped samples. In this chapter, we provide a brief commentary on these methodological changes and describe a detailed ChIP-exo method able to generate narrower peaks and greater peak coverage from ChIPped material.
Abstract:
Hereditary optic neuropathies (HON) are a genetic cause of visual impairment characterized by degeneration of retinal ganglion cells. The majority of HON are caused by pathogenic variants in mtDNA genes and in the OPA1 gene. However, several other genes can cause optic atrophy and can only be identified by high-throughput genetic analysis. Whole Exome Sequencing (WES) is becoming the primary choice in rare disease molecular diagnosis, being both cost effective and informative. We performed WES on a cohort of 106 cases, comprising 74 patients with isolated ON (ON) and 32 patients with syndromic ON (sON). The total diagnostic yield amounts to 27%, slightly higher for syndromic ON (31%) than for isolated ON (26%). The majority of genes found are related to mitochondrial function and already reported to harbour HON pathogenic variants: ACO2, AFG3L2, C19orf12, DNAJC30, FDXR, MECR, MTFMT, NDUFAF2, NDUFB11, NDUFV2, OPA1, PDSS1, SDHA, SSBP1, and WFS1. Among these, OPA1, ACO2, and WFS1 were confirmed as the most relevant genetic causes of ON. Moreover, several genes were identified, especially in sON patients, with direct impairment of non-mitochondrial molecular pathways: from autophagy and the ubiquitin system (LYST, SNF8, WDR45, UCHL1), to neural cell development and function (KIF1A, GFAP, EPHB2, CACNA1A, CACNA1F), but also vitamin metabolism (SLC52A2, BTD), cilia structure (USH2A), and nuclear pore shuttling (NUTF2). Functional validation in a yeast model was performed for pathogenic variants detected in the MECR, MTFMT, SDHA, and UCHL1 genes. For SDHA and UCHL1, muscle biopsy and fibroblast cell lines from patients were also analysed, pointing to possible pathogenic mechanisms that will be investigated in further studies. In conclusion, WES proved to be an efficient tool when applied to our ON cohort, both for identifying common disease genes and for discovering novel genes. It is therefore recommended to consider WES in the molecular diagnostic pipeline for ON, as for other rare genetic diseases.
Abstract:
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by integrating diverse databases, in response to the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions. IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.
Abstract:
Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
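The idea that genomic coordinates act as a common key can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for an in-process database and uses hypothetical table names, column names and feature records; it is not the Gaggle Genome Browser's own schema or code.

```python
import sqlite3


def overlapping_features(genes, peaks):
    """Pair genes and peaks whose closed intervals overlap on the same sequence.

    Both arguments are lists of (name, chrom, start_pos, end_pos) tuples.
    The table and column names are hypothetical, chosen for this sketch only.
    """
    conn = sqlite3.connect(":memory:")  # in-process, in-memory database
    for table in ("genes", "peaks"):
        conn.execute(
            f"CREATE TABLE {table} "
            "(name TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER)"
        )
    conn.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", genes)
    conn.executemany("INSERT INTO peaks VALUES (?, ?, ?, ?)", peaks)
    # Two closed intervals overlap when each one starts before the other ends.
    rows = conn.execute(
        """
        SELECT g.name, p.name
        FROM genes AS g
        JOIN peaks AS p
          ON g.chrom = p.chrom
         AND g.start_pos <= p.end_pos
         AND p.start_pos <= g.end_pos
        ORDER BY g.name, p.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Made-up example records, not data from the paper.
    genes = [("geneA", "chr1", 100, 500), ("geneB", "chr1", 900, 1200)]
    peaks = [("peak1", "chr1", 450, 600), ("peak2", "chr1", 700, 800)]
    print(overlapping_features(genes, peaks))
```

Any data type that carries a sequence name, a start and an end can be joined to any other in this way, which is the sense in which coordinates serve as a shared key.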
Abstract:
Background: High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) or Sequencing-by-Synthesis (SBS), represent powerful techniques that provide global transcription profiles of different cell types through sequencing of short fragments of transcripts, called sequence tags. These techniques have improved our understanding of the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification and selection of tags considered more reliable for further gene expression analysis. Results: This methodology was applied to a public SAGE dataset. In order to compare data before and after filtering, a hierarchical clustering analysis was performed on samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that it is possible to find more congruous clusters after using the S3T scoring system. Conclusion: These results substantiate the proposed application to generate more reliable data. This is a significant contribution to the determination of global gene expression profiles. The library analysis with S3T is freely available at http://gdm.fmrp.usp.br/s3t/. S3T source code and datasets can also be downloaded from the aforementioned website.
Abstract:
Microarray gene expression profiling is a high-throughput system used to identify differentially expressed genes and regulation patterns, and to discover new tumor markers. As the molecular pathogenesis of meningiomas and schwannomas, characterized by NF2 gene alterations, remains unclear and suitable molecular targets need to be identified, we used low density cDNA microarrays to establish expression patterns of 96 cancer-related genes on 23 schwannomas, 42 meningiomas and 3 normal cerebral meninges. We also performed a mutational analysis of the NF2 gene (PCR, dHPLC, Sequencing and MLPA), a search for 22q LOH and an analysis of gene silencing by promoter hypermethylation (MS-MLPA). Results showed a high frequency of NF2 gene mutations (40%), increased 22q LOH as aggressiveness increased, frequent losses and gains by MLPA in benign meningiomas, and gene expression silencing by hypermethylation. Array analysis showed decreased expression of 7 genes in meningiomas. Unsupervised analyses identified 2 molecular subgroups for both meningiomas and schwannomas showing 38 and 20 differentially expressed genes, respectively, and 19 genes differentially expressed between the two tumor types. These findings provide a molecular subgroup classification for meningiomas and schwannomas with possible implications for clinical practice.
Abstract:
Breast cancer is the most common form of cancer among women and the identification of markers to discriminate tumorigenic from normal cells, as well as the different stages of this pathology, is of critical importance. Two-dimensional electrophoresis has been used before for studying breast cancer, but the progressive completion of human genomic sequencing and the introduction of mass spectrometry, combined with advanced bioinformatics for protein identification, have considerably increased the possibilities for characterizing new markers and therapeutic targets. Breast cancer proteomics has already identified markers of potential clinical interest (such as the molecular chaperone 14-3-3 sigma) and technological innovations such as large scale and high throughput analysis are now driving the field. Methods in functional proteomics have also been developed to study the intracellular signaling pathways that underlie the development of breast cancer. As illustrated with fibroblast growth factor-2, a mitogen and motogen factor for breast cancer cells, proteomics is a powerful approach to identify signaling proteins and to decipher the complex signaling circuitry involved in tumor growth. Together with genomics, proteomics is well on the way to molecularly characterizing the different types of breast tumor, and thus defining new therapeutic targets for future treatment.
Abstract:
Dissertation presented to obtain a Ph.D. degree in Biology, speciality Microbiology, by Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
Dissertation presented to obtain the Master Degree in Molecular, Genetics and Biomedicine
Abstract:
"Tissue engineering: part A", vol. 21, suppl. 1 (2015)
Abstract:
Under the framework of constraint-based modeling, genome-scale metabolic models (GSMMs) have been used for several tasks, such as metabolic engineering and phenotype prediction. More recently, their application in health-related research has spanned drug discovery, biomarker identification and host-pathogen interactions, targeting diseases such as cancer, Alzheimer's disease, obesity or diabetes. In recent years, the development of novel techniques for genome sequencing and other high-throughput methods, together with advances in Bioinformatics, allowed the reconstruction of GSMMs for human cells. Considering the diversity of cell types and tissues present in the human body, it is imperative to develop tissue-specific metabolic models. Methods to automatically generate these models, based on generic human metabolic models and a plethora of omics data, have been proposed. However, their results have not yet been adequately and critically evaluated and compared. This work presents a survey of the most important tissue or cell type specific metabolic model reconstruction methods, which use literature, transcriptomics, proteomics and metabolomics data, together with a global template model. As a case study, we analyzed the consistency between several omics data sources and reconstructed distinct metabolic models of hepatocytes using different methods and data sources as inputs. The results show that omics data sources overlap poorly and, in some cases, are even contradictory. Additionally, the hepatocyte metabolic models generated are in many cases not able to perform metabolic functions known to be present in the liver tissue. We conclude that reliable methods for a priori omics data integration are required to support the reconstruction of complex models of human cells.
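One simple way to quantify how well omics data sources agree is a pairwise Jaccard index over the sets of genes or reactions each source reports as present. The sketch below is an assumption chosen for illustration: the abstract does not state which consistency measure the authors used, and the source names and identifiers are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |A and B| / |A or B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(sources):
    """Return {(name_x, name_y): jaccard} for every pair of named sources.

    `sources` maps a source name to the identifiers it reports as present.
    Pairs are keyed in alphabetical order of the source names.
    """
    return {
        (x, y): jaccard(sources[x], sources[y])
        for x, y in combinations(sorted(sources), 2)
    }


if __name__ == "__main__":
    # Hypothetical identifiers, not data from the study.
    example = {
        "transcriptomics": {"G1", "G2", "G3"},
        "proteomics": {"G2", "G3", "G4"},
        "literature": {"G5"},
    }
    for pair, score in pairwise_overlap(example).items():
        print(pair, round(score, 2))
```

Values near 0 indicate sources that share few identifiers. Contradictory evidence, where one source marks an item present and another marks it absent, would need a separate comparison that this sketch does not cover.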
Abstract:
Master's dissertation in Molecular Genetics
Abstract:
Purpose of review: Elucidating the genetic background of Parkinson disease and essential tremor is crucial to understand the pathogenesis and improve diagnostic and therapeutic strategies. Recent findings: A number of approaches have been applied, including familial and association studies and studies of gene expression profiles, to identify genes involved in susceptibility to Parkinson disease. These studies have nominated a number of candidate Parkinson disease genes and novel loci including Omi/HtrA2, GIGYF2, FGF20, PDXK, EIF4G1 and PARK16. A recent notable finding has been the confirmation of the role of heterozygous mutations in glucocerebrosidase (GBA) as risk factors for Parkinson disease. Finally, association studies have nominated genetic variation in the leucine-rich repeat and Ig containing 1 gene (LINGO1) as a risk factor for both Parkinson disease and essential tremor, providing the first genetic evidence of a link between the two conditions. Summary: Although genes undoubtedly remain to be identified, considerable progress has been achieved in the understanding of the genetic basis of Parkinson disease. This same effort is now required for essential tremor. The use of next-generation high-throughput sequencing and genotyping technologies will help pave the way for future insight leading to advances in diagnosis, prevention and cure.
Abstract:
RNA polymerase III (Pol III) occurs in two versions, one containing the POLR3G subunit and the other the closely related POLR3GL subunit. It is not clear whether these two Pol III forms have the same function, in particular whether they recognize the same target genes. We show that the POLR3G and POLR3GL genes arose from a DNA-based gene duplication, probably in a common ancestor of vertebrates. POLR3G- as well as POLR3GL-containing Pol III are present in cultured cell lines and in normal mouse liver, although the relative amounts of the two forms vary, with the POLR3G-containing Pol III relatively more abundant in dividing cells. Genome-wide chromatin immunoprecipitations followed by high-throughput sequencing (ChIP-seq) reveal that both forms of Pol III occupy the same target genes, in very constant proportions within one cell line, suggesting that the two forms of Pol III have a similar function with regard to specificity for target genes. In contrast, the POLR3G promoter, but not the POLR3GL promoter, binds the transcription factor MYC, as do all other promoters of genes encoding Pol III subunits. Thus, the POLR3G/POLR3GL duplication did not lead to neo-functionalization of the gene product (at least with regard to target gene specificity) but rather to neo-functionalization of the transcription units, which acquired different mechanisms of regulation, thus likely affording greater regulation potential to the cell.
Abstract:
BACKGROUND: The Nuclear Factor I (NFI) family of DNA binding proteins (also called CCAAT box transcription factors or CTF) is involved in both DNA replication and gene expression regulation. Using chromatin immunoprecipitation and high-throughput sequencing (ChIP-Seq), we performed a genome-wide mapping of NFI DNA binding sites in primary mouse embryonic fibroblasts. RESULTS: We found that in vivo and in vitro NFI DNA binding specificities are indistinguishable, as in vivo ChIP-Seq NFI binding sites matched predictions based on previously established position weight matrix models of its in vitro binding specificity. Combining ChIP-Seq with mRNA profiling data, we found that NFI preferentially associates with highly expressed genes that it up-regulates, while binding sites were under-represented at expressed but unregulated genes. Genomic binding also correlated with markers of transcribed genes such as histone modifications H3K4me3 and H3K36me3, even outside of annotated transcribed loci, implicating NFI in the control of the deposition of these modifications. Positional correlation between + and - strand ChIP-Seq tags revealed that, in contrast to other transcription factors, NFI associates with a nucleosomal length of cleavage-resistant DNA, suggesting an interaction with positioned nucleosomes. In addition, NFI binding prominently occurred at boundaries displaying discontinuities in histone modifications specific to expressed and silent chromatin, such as loci subject to parental allele-specific imprinted expression. CONCLUSIONS: Our data thus suggest that NFI nucleosomal interaction may contribute to the partitioning of distinct chromatin domains and to epigenetic gene expression regulation. NFI ChIP-Seq and input control DNA data were deposited at the Gene Expression Omnibus (GEO) repository under accession number GSE15844. Gene expression microarray data for mouse embryonic fibroblasts are available under GEO accession number GSE15871.
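Predicting binding sites from a position weight matrix generally means sliding the matrix along a sequence and summing per-position log-odds scores. The following is a minimal sketch under stated assumptions: the count matrix is a toy four-column example, not the published NFI model, the background is uniform, only the forward strand is scanned, and sequences are assumed to contain only uppercase A, C, G and T.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background.

    `counts` is a list with one dict per motif position, mapping each base to a count.
    The pseudocount avoids taking the log of zero for unobserved bases.
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + len(BASES) * pseudocount
        matrix.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background) for b in BASES}
        )
    return matrix


def best_hit(sequence, matrix):
    """Return (offset, score) of the highest-scoring window, or None if the sequence is too short."""
    width = len(matrix)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (offset, score)
    return best


if __name__ == "__main__":
    # Toy counts favouring the made-up motif TGGC at every position.
    toy_counts = [
        {"A": 0, "C": 0, "G": 0, "T": 8},
        {"A": 0, "C": 0, "G": 8, "T": 0},
        {"A": 0, "C": 0, "G": 8, "T": 0},
        {"A": 0, "C": 8, "G": 0, "T": 0},
    ]
    print(best_hit("AATGGCAA", log_odds_matrix(toy_counts)))
```

A genome-wide comparison with ChIP-Seq peaks would also scan the reverse strand and apply a score threshold, both of which are left out here for brevity.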