969 results for transcription factor binding sites


Relevance:

100.00% 100.00%

Publisher:

Abstract:

HTPSELEX is a public database providing access to primary and derived data from high-throughput SELEX experiments aimed at characterizing the binding specificity of transcription factors. The resource is primarily intended to serve computational biologists interested in building models of transcription factor binding sites from large sets of binding sequences. The guiding principle is to make available all information that is relevant for this purpose. For each experiment, we try to provide accurate information about the protein material used, details of the wet lab protocol, an archive of sequencing trace files, assembled clone sequences (concatemers) and complete sets of in vitro selected protein-binding tags. In addition, we offer binding site models derived in-house. HTPSELEX also offers reasonably large SELEX libraries obtained with conventional low-throughput protocols. The FTP site contains the trace archives and database flatfiles. The web server offers user-friendly interfaces for viewing individual entries and quality-controlled download of SELEX sequence libraries according to a user-defined sequencing quality threshold. HTPSELEX is available from ftp://ftp.isrec.isb-sib.ch/pub/databases/htpselex/ and http://www.isrec.isb-sib.ch/htpselex.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription factors. A generalized profile was used as a predictive quantitative model for binding sites, and its parameters were estimated from in vitro-selected ligands using standard hidden Markov model training algorithms. Computer simulations showed that several thousand low- to medium-affinity sequences are required to generate a profile of desired accuracy. To produce data on this scale, we applied high-throughput genomics methods to the biochemical problem addressed here. A method combining systematic evolution of ligands by exponential enrichment (SELEX) and serial analysis of gene expression (SAGE) protocols was coupled to an automated quality-controlled sequence extraction procedure based on Phred quality scores. This allowed the sequencing of a database of more than 10,000 potential DNA ligands for the CTF/NFI transcription factor. The resulting binding-site model defines the sequence specificity of this protein with a high degree of accuracy not achieved earlier and thereby makes it possible to identify previously unknown regulatory sequences in genomic DNA. A covariance analysis of the selected sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism.
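
The quality-controlled extraction step mentioned above relies on Phred scores, where a score Q corresponds to a base-calling error probability of 10^(-Q/10). The following is a minimal Python sketch of such a filter, not the authors' pipeline: the acceptance rule (every base must reach the threshold), the default threshold of 20, and the toy tags are all assumptions made for illustration.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def passes_quality(quals, min_q=20):
    """Illustrative rule (an assumption): accept a tag only if every base
    has a Phred score of at least min_q."""
    return all(q >= min_q for q in quals)


def filter_tags(tags, min_q=20):
    """Keep the sequences whose per-base qualities all pass the threshold.

    `tags` is a list of (sequence, per-base quality list) pairs.
    """
    return [
        seq
        for seq, quals in tags
        if len(seq) == len(quals) and passes_quality(quals, min_q)
    ]


# Hypothetical tags with made-up quality values.
example_tags = [
    ("TTGGCACTGTGCCAA", [35] * 15),
    ("TTGGCTATATGCCAA", [35] * 7 + [12] + [35] * 7),  # one low-quality base
]

accepted = filter_tags(example_tags, min_q=20)
```

Raising `min_q` trades library size for sequence accuracy, which is the kind of choice a user-defined quality threshold exposes.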

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the most important issues in molecular biology is to understand the regulatory mechanisms that control gene expression. Gene expression is often regulated by proteins called transcription factors, which bind to short (5 to 20 base pairs), degenerate segments of DNA. Experimental efforts towards understanding the sequence specificity of transcription factors are laborious and expensive, but can be substantially accelerated with the use of computational predictions. This thesis describes the use of algorithms and resources for transcription factor binding site analysis in addressing quantitative modelling, where probabilistic models are built to represent the binding properties of a transcription factor and can be used to find new functional binding sites in genomes. Initially, an open-access database (HTPSELEX) was created, holding high-quality binding sequences for two eukaryotic families of transcription factors, namely CTF/NF1 and LEF1/TCF. The binding sequences were elucidated using a recently described experimental procedure called HTP-SELEX, which allows the generation of a large number (> 1000) of binding sites using mass sequencing technology. For each HTP-SELEX experiment we also provide accurate primary experimental information about the protein material used, details of the wet lab protocol, an archive of sequencing trace files, and assembled clone sequences of binding sequences. The database also offers reasonably large SELEX libraries obtained with conventional low-throughput protocols. The database is available at http://wwwisrec.isb-sib.ch/htpselex/ and ftp://ftp.isrec.isb-sib.ch/pub/databases/htpselex. The Expectation-Maximisation (EM) algorithm is one of the frequently used methods to estimate probabilistic models that represent the sequence specificity of transcription factors.
We present computer simulations to estimate the precision of EM-estimated models as a function of data set parameters (such as the length of the initial sequences, the number of initial sequences, and the percentage of non-binding sequences). We observed a remarkable robustness of the EM algorithm with regard to the length of the training sequences and the degree of contamination. The HTPSELEX database and the benchmarked results of the EM algorithm formed part of the foundation for the subsequent project, in which a statistical framework based on hidden Markov models was developed to represent the sequence specificity of the transcription factors CTF/NF1 and LEF1/TCF using the HTP-SELEX experimental data. The hidden Markov model framework is capable of both predicting and classifying CTF/NF1 and LEF1/TCF binding sites. A covariance analysis of the binding sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism. We next tested the LEF1/TCF model by computing binding scores for a set of LEF1/TCF binding sequences for which relative affinities were determined experimentally using non-linear regression. The predicted and experimentally determined binding affinities correlated well.
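
The covariance analysis mentioned above looks for pairs of positions whose base preferences are not independent. The abstract does not say which statistic was used; as one common choice, the hedged Python sketch below computes the mutual information between two columns of a set of aligned sites. The function name and the toy sites are hypothetical.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites.

    A value of 0 means the two positions look independent in this sample;
    larger values indicate coupled base preferences.
    """
    n = len(sites)
    count_i = Counter(s[i] for s in sites)
    count_j = Counter(s[j] for s in sites)
    count_ij = Counter((s[i], s[j]) for s in sites)

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy aligned sites (made up): positions 0 and 3 always co-vary (A..T or T..A),
# while positions 0 and 1 vary independently of each other.
toy_sites = ["ACGT", "AGGT", "TCGA", "TGGA"]

coupled = mutual_information(toy_sites, 0, 3)
independent = mutual_information(toy_sites, 0, 1)
```

With real data, small samples inflate mutual information estimates, so observed values would normally be compared against shuffled columns before being interpreted.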

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Toward the goal of identifying complete sets of transcription factor (TF)-binding sites in the genomes of several gamma proteobacteria, and hence describing their transcription regulatory networks, we present a phylogenetic footprinting method for identifying these sites. Probable transcription regulatory sites upstream of Escherichia coli genes were identified by cross-species comparison using an extended Gibbs sampling algorithm. Close examination of a study set of 184 genes with documented transcription regulatory sites revealed that when orthologous data were available from at least two other gamma proteobacterial species, 81% of our predictions corresponded with the documented sites, and 67% corresponded when data from only one other species were available. That the remaining predictions included bona fide TF-binding sites was proven by affinity purification of a putative transcription factor (YijC) bound to such a site upstream of the fabA gene. Predicted regulatory sites for 2097 E. coli genes are available at http://www.wadsworth.org/resnres/bioinfo/.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have used a multiplex selection approach to construct a library of DNA-protein interaction sites recognized by many of the DNA-binding proteins present in a cell type. An estimated minimum of two-thirds of the binding sites present in a library prepared from activated Jurkat T cells represent authentic transcription factor binding sites. We used the library for isolation of "optimal" binding site probes that facilitated cloning of a factor and to identify binding activities induced within 2 hr of activation of Jurkat cells. Since a large fraction of the oligonucleotides obtained appear to represent "optimal" binding sites for sequence-specific DNA-binding proteins, it is feasible to construct a catalog of consensus binding sites for DNA-binding proteins in a given cell type. Qualitative and quantitative comparisons of the catalogs of binding site sequences from various cell types could provide valuable insights into the process of differentiation acting at the level of transcriptional control.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Accurate prediction of transcription factor binding sites is needed to unravel the function and regulation of genes discovered in genome sequencing projects. To evaluate current computer prediction tools, we have begun a systematic study of the sequence-specific DNA-binding of a transcription factor belonging to the CTF/NFI family. Using a systematic collection of rationally designed oligonucleotides combined with an in vitro DNA binding assay, we found that the sequence specificity of this protein cannot be represented by a simple consensus sequence or weight matrix. For instance, CTF/NFI uses a flexible DNA binding mode that allows for variations of the binding site length. From the experimental data, we derived a novel prediction method using a generalised profile as a binding site predictor. Experimental evaluation of the generalised profile indicated that it accurately predicts the binding affinity of the transcription factor to natural or synthetic DNA sequences. Furthermore, the in vitro measured binding affinities of a subset of oligonucleotides were found to correlate with their transcriptional activities in transfected cells. The combined computational-experimental approach exemplified in this work thus resulted in an accurate prediction method for CTF/NFI binding sites potentially functioning as regulatory regions in vivo.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

rSNP_Guide is a novel curated database system for analysis of transcription factor (TF) binding to target sequences in regulatory gene regions altered by mutations. It accumulates experimental data on naturally occurring site variants in regulatory gene regions and site-directed mutations. This database system also contains the web tools for SNP analysis, i.e., active applet applying weight matrices to predict the regulatory site candidates altered by a mutation. The current version of the rSNP_Guide is supplemented by six sub-databases: (i) rSNP_DB, on DNA–protein interaction caused by mutation; (ii) SYSTEM, on experimental systems; (iii) rSNP_BIB, on citations to original publications; (iv) SAMPLES, on experimentally identified sequences of known regulatory sites; (v) MATRIX, on weight matrices of known TF sites; (vi) rSNP_Report, on characteristic examples of successful rSNP_Tools implementation. These databases are useful for the analysis of natural SNPs and site-directed mutations. The databases are available through the Web, http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/.
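
To illustrate how a weight matrix can flag a regulatory site candidate altered by a mutation, the Python sketch below scores a reference and a mutant allele against a log-odds matrix. This is a generic illustration, not the rSNP_Guide applet: the training sites, the pseudocount, the uniform background and all function names are assumptions.

```python
import math

BASES = "ACGT"


def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(len(sites[0])):
        column = [s[pos] for s in sites]
        matrix.append(
            {
                b: math.log2(((column.count(b) + pseudocount) / total) / background)
                for b in BASES
            }
        )
    return matrix


def score(matrix, seq):
    """Sum the per-position log-odds of seq (same length as the matrix)."""
    return sum(col[base] for col, base in zip(matrix, seq))


# Hypothetical training sites for an unnamed factor.
training_sites = ["TTGGCA", "TTGGCT", "TTGGCA", "CTGGCA"]
pwm = log_odds_matrix(training_sites)

reference_allele = "TTGGCA"
mutant_allele = "TTGACA"  # G>A substitution at the fourth position

# Positive delta: the mutation lowers the predicted site score.
delta = score(pwm, reference_allele) - score(pwm, mutant_allele)
```

A large drop in score suggests the mutation may weaken the site, while a gain for a different factor's matrix would suggest a newly created site. Either way the result is a prediction to be checked experimentally.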

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.
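
The idea of aligning sequences of factor labels rather than nucleotides can be sketched with a plain global alignment. The Python code below is a simplified Needleman-Wunsch over label lists; it is not the adapted restriction-map algorithm from the abstract and ignores site coordinates and prediction scores. The scoring parameters and the label lists are hypothetical.

```python
def align_labels(a, b, match=1.0, mismatch=-1.0, gap=-0.5):
    """Globally align two lists of TF labels.

    Returns the optimal score and the index pairs of matched labels.
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the corner to recover which labels were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            if a[i - 1] == b[j - 1]:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical label maps for two orthologous promoters.
human_map = ["SP1", "CEBP", "TBP"]
mouse_map = ["SP1", "NFKB", "CEBP", "TBP"]

best_score, matched = align_labels(human_map, mouse_map)
```

Because only labels are compared, two promoters with no detectable nucleotide similarity can still align if they carry the same factors in the same order.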

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Information about the genomic coordinates and the sequence of experimentally identified transcription factor binding sites is found scattered under a variety of diverse formats. The availability of standard collections of such high-quality data is important to design, evaluate and improve novel computational approaches to identify binding motifs on promoter sequences from related genes. ABS (http://genome.imim.es/datasets/abs2005/index.html) is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from bibliography. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple and easy-to-use web interface facilitates data retrieval allowing different views of the information. In addition, the release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection and an evaluation tool to aid during the training and the assessment of motif-finding programs.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Expression control in synthetic genetic circuitry, for example, for construction of sensitive biosensors, is hampered by the lack of DNA parts that maintain ultralow background yet achieve high output upon signal integration by the cells. Here, we demonstrate how placement of auxiliary transcription factor binding sites within a regulatable promoter context can yield an important gain in signal-to-noise output ratios from prokaryotic biosensor circuits. As a proof of principle, we use the arsenite-responsive ArsR repressor protein from Escherichia coli and its cognate operator. Additional ArsR operators placed downstream of its target promoter can act as a transcription roadblock in a distance-dependent manner and reduce background expression of downstream-placed reporter genes. We show that the transcription roadblock functions both in cognate and heterologous promoter contexts. Secondary ArsR operators placed upstream of their promoter can also improve signal-to-noise output while maintaining effector dependency. Importantly, background control can be released through the addition of micromolar concentrations of arsenite. The ArsR-operator system thus provides a flexible system for additional gene expression control, which, given the extreme sensitivity to micrograms per liter effector concentrations, could be applicable in more general contexts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

CIITA is a master transactivator of the major histocompatibility complex class II genes, which are involved in antigen presentation. Defects in CIITA result in fatal immunodeficiencies. CIITA activation is also the control point for the induction of major histocompatibility complex class II and associated genes by interferon-γ, but CIITA does not bind directly to DNA. Expression of CIITA in G3A cells, which lack endogenous CIITA, followed by in vivo genomic footprinting, now reveals that CIITA is required for the assembly of transcription factor complexes on the promoters of this gene family, including DRA, Ii, and DMB. CIITA-dependent promoter assembly occurs in interferon-γ-inducible cell types, but not in B lymphocytes. Dissection of the CIITA protein indicates that transactivation and promoter loading are inseparable and reveals a requirement for a GTP binding motif. These findings suggest that CIITA may be a new class of transactivator.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Transcription factors must be able to access their DNA binding sites to either activate or repress transcription. However, DNA wrapping and compaction into chromatin occludes most binding sites from ready access by proteins. Pioneer transcription factors are capable of binding their DNA elements within a condensed chromatin context and then reducing the level of nucleosome occupancy so that the chromatin structure is more accessible. This altered accessibility increases the probability of other transcription factors binding to their own DNA binding elements. My hypothesis is that Foxa1, a ‘pioneer’ transcription factor, activates alpha-fetoprotein (AFP) expression by binding DNA in a chromatinized environment, reducing the nucleosome occupancy and facilitating binding of additional transcription factors. Using retinoic acid-differentiated mouse embryonic stem cells, we illustrate a mechanism for activation of the tumor marker AFP by the pioneer transcription factor Foxa1 and the TGF-β downstream effector transcription factors Smad2 and Smad4. In differentiating embryonic stem cells, binding of the Foxa1 forkhead box transcription factor to chromatin reduces nucleosome occupancy and levels of linker histone H1 at the AFP distal promoter. The more accessible DNA is subsequently bound by the Smad2 and Smad4 transcription factors, concurrent with activation of transcription. Chromatin immunoprecipitation analyses combined with siRNA-mediated knockdown indicate that Smad protein binding and the reduction of nucleosome occupancy at the AFP distal promoter are dependent on Foxa1. In addition to facilitating transcription factor binding, Foxa1 is also associated with histone modifications related to active gene expression. Acetylation of lysine 9 on histone H3, a mark that is associated with active transcription, is dependent on Foxa1, while methylation of H3K4, also associated with active transcription, is independent of Foxa1.
I propose that Foxa1 potentiates a region of chromatin to respond to Smad proteins, leading to active expression of AFP. These studies demonstrate one mechanism whereby a transcription factor can alter the accessibility of chromatin to additional transcription factors by altering nucleosome positions. Specifically, Foxa1 exposes DNA so that Smad4 can bind to its regulatory element and activate transcription of the tumor-marker gene AFP.