980 results for DNA-Binding Proteins -- genetics
Abstract:
The protein sequence deduced from the open reading frame of a human placental cDNA encoding a cAMP-responsive enhancer (CRE)-binding protein (CREB-327) has structural features characteristic of several other transcriptional transactivator proteins, including jun, fos, C/EBP, myc, and CRE-BP1. Results of Southwestern analysis of nuclear extracts from several different cell lines show that there are multiple CRE-binding proteins, which vary in size in cell lines derived from different tissues and animal species. To examine the molecular diversity of CREB-327 and related proteins at the nucleic acid level, we used labeled cDNAs from human placenta that encode two different CRE-binding proteins (CREB-327 and CRE-BP1) to probe Northern and Southern blots. Both probes hybridized to multiple fragments on Southern blots of genomic DNA from various species. By comparison, when a human placental c-jun probe was hybridized to the same blot, a single fragment was detected in most cases, consistent with the intronless nature of the human c-jun gene. The CREB-327 probe hybridized to multiple mRNAs, derived from human placenta, ranging in size from 2 to 9 kilobases. In contrast, the CRE-BP1 probe identified a single 4-kilobase mRNA. Sequence analyses of several overlapping human genomic cosmid clones containing CREB-327 sequences, in conjunction with polymerase chain reaction, indicate that the CREB-327/341 cDNAs are composed of at least eight or nine exons, and analyses of human placental cDNAs provide direct evidence for at least one alternatively spliced exon. Analyses of mouse/hamster-human hybridoma DNAs by Southern blotting and polymerase chain reaction localize the CREB-327/341 gene to human chromosome 2. The results indicate that there is a dichotomy of CREB-like proteins: those that are related by overall structure and DNA-binding specificity, and those that are related by close similarities of primary sequences.
Abstract:
SUMMARY: Eukaryotic DNA interacts with nuclear proteins through non-covalent ionic interactions. Proteins can recognize specific nucleotide sequences based on steric interactions with the DNA, and these specific protein-DNA interactions are the basis for many nuclear processes, e.g. gene transcription, chromosomal replication, and recombination. A new technology termed ChIP-Seq has recently been developed for the analysis of protein-DNA interactions on a whole-genome scale; it is based on immunoprecipitation of chromatin followed by high-throughput DNA sequencing. ChIP-Seq is a novel technique with great potential to replace older techniques for mapping protein-DNA interactions. In this thesis, we bring some new insights into ChIP-Seq data analysis. First, we point out some common and so far unknown artifacts of the method. The distribution of sequence tags in the genome is not uniform, and we have found extreme hot-spots of tag accumulation over specific loci in the human and mouse genomes. These artifactual sequence tag accumulations will create false peaks in every ChIP-Seq dataset, and we propose different filtering methods to reduce the number of false positives. Next, we propose random sampling as a powerful analytical tool in ChIP-Seq data analysis that could be used to infer biological knowledge from massive ChIP-Seq datasets. We created an unbiased random sampling algorithm and used this methodology to reveal some of the important biological properties of Nuclear Factor I (NFI) DNA-binding proteins. Finally, by analyzing the ChIP-Seq data in detail, we revealed that Nuclear Factor I transcription factors mainly act as activators of transcription, and that they are associated with specific chromatin modifications that are markers of open chromatin. We speculate that NFI factors only interact with the DNA wrapped around the nucleosome.
We also found multiple loci that indicate possible chromatin barrier activity of NFI proteins, which could suggest the use of NFI binding sequences as chromatin insulators in biotechnology applications.
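The hot-spot filtering idea mentioned in the summary above can be illustrated with a minimal sketch. The thesis abstract does not describe its actual filtering methods, so everything here is an assumption made for illustration: the function names `find_hotspot_bins` and `filter_tags`, the fixed-width binning, and the "more than `fold` times the median bin count" rule are all hypothetical.

```python
from collections import Counter
from statistics import median


def find_hotspot_bins(tag_positions, bin_size=1000, fold=10.0):
    """Return indices of bins whose tag count exceeds `fold` times the
    median count of non-empty bins (a hypothetical hot-spot rule)."""
    counts = Counter(pos // bin_size for pos in tag_positions)
    if not counts:
        return set()
    typical = median(counts.values())
    return {b for b, n in counts.items() if n > fold * typical}


def filter_tags(tag_positions, bin_size=1000, fold=10.0):
    """Drop every tag that falls inside a flagged hot-spot bin."""
    hot = find_hotspot_bins(tag_positions, bin_size, fold)
    return [p for p in tag_positions if p // bin_size not in hot]


# Toy data: five scattered tags plus 200 tags piled onto one 1 kb bin.
tags = [100, 1500, 2500, 3500, 4500] + [10_000 + i for i in range(200)]
kept = filter_tags(tags)
```

In this toy input only the bin starting at position 10,000 is flagged, so `kept` contains the five scattered tags. A real analysis would likely need per-chromosome handling and a control-based threshold, which this sketch does not attempt.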
Abstract:
Telomeres, the natural ends of chromosomes, need to be protected from chromosome end fusions, aberrant homologous recombination and degradation. In humans, chromosome ends are specified through arrays of tandemly repeated 5'-TTAGGG-3' hexamers, ending in a 3' overhang. A complex formed by the six proteins TRF1, TRF2, hRap1, TIN2, TPP1 and POT1 specifically associates with and protects telomeres. Telomeres are maintained by semiconservative DNA replication and by a specialized reverse transcriptase, telomerase, that carries an RNA subunit which templates new telomeric repeat synthesis. The telomeric single-stranded (ss) DNA-binding protein POT1 protects the telomeric 3' overhang and modulates telomerase-mediated telomere elongation. It is possible that POT1 also influences DNA synthesis during semiconservative DNA replication, which is initiated by the DNA polymerase alpha-primase complex. The heterotrimeric ss DNA-binding protein RPA plays essential roles during DNA replication. RPA binds to ss DNA with high affinity in order to stabilize ss DNA and facilitate nascent strand synthesis at the replication fork. Here we investigate how the two proteins RPA and POT1 contribute to telomere maintenance by regulating semiconservative DNA replication and telomerase. Using chromatin immunoprecipitation experiments, we show that RPA associates with telomeres during S-phase. Analysis of telomere structure in cells shRNA-depleted for RPA and POT1 reveals that loss of RPA and POT1 causes exposure of single-stranded DNA at telomeres, suggestive of incomplete DNA replication. Biochemical experiments using purified recombinant POT1 and RPA show that saturating telomeric oligonucleotides with POT1 or RPA reduces the primase activity of the DNA polymerase alpha-primase complex and the overall activity of telomerase.
POT1 and RPA also increase primer extension by the DNA polymerase alpha-primase complex and the processivity of telomerase under certain conditions, although POT1 increases these activities to a greater extent than RPA. We propose that POT1 is required for proper replication of the lagging strand of telomeres and that some phenotypes observed in POT1-depleted cells may stem from incomplete DNA replication rather than de-protection of the single-stranded overhang.
Abstract:
The LIM domain-binding protein Ldb1 is an essential cofactor of LIM-homeodomain (LIM-HD) and LIM-only (LMO) proteins in development. The stoichiometry of Ldb1, LIM-HD, and LMO proteins is tightly controlled in the cell and is likely a critical determinant of their biological actions. Single-stranded DNA-binding proteins (SSBPs) were recently shown to interact with Ldb1 and are also important in developmental programs. We establish here that two mammalian SSBPs, SSBP2 and SSBP3, contribute to an erythroid DNA-binding complex that contains the transcription factors Tal1 and GATA-1, the LIM domain protein Lmo2, and Ldb1 and binds a bipartite E-box-GATA DNA sequence motif. In addition, SSBP2 was found to augment transcription of the Protein 4.2 (P4.2) gene, a direct target of the E-box-GATA-binding complex, in an Ldb1-dependent manner and to increase endogenous Ldb1 and Lmo2 protein levels, E-box-GATA DNA-binding activity, and P4.2 and beta-globin expression in erythroid progenitors. Finally, SSBP2 was demonstrated to inhibit Ldb1 and Lmo2 interaction with the E3 ubiquitin ligase RLIM, prevent RLIM-mediated Ldb1 ubiquitination, and protect Ldb1 and Lmo2 from proteasomal degradation. These results define a novel biochemical function for SSBPs in regulating the abundance of LIM domain and LIM domain-binding proteins.
Abstract:
The products of the recF, recO, and recR genes are thought to interact and assist RecA in the utilization of single-stranded DNA precomplexed with single-stranded DNA binding protein (Ssb) during synapsis. Using immunoprecipitation, size-exclusion chromatography, and Ssb protein affinity chromatography in the absence of any nucleotide cofactors, we have obtained the following results: (i) RecF interacts with RecO; (ii) RecF interacts with RecR in the presence of RecO to form a complex consisting of RecF, RecO, and RecR (RecF–RecO–RecR); (iii) RecF interacts with Ssb protein in the presence of RecO. These data suggest that RecO mediates the interactions of RecF protein with RecR and with Ssb protein. Incubation of RecF, RecO, RecR, and Ssb proteins resulted in the formation of RecF–RecO–Ssb complexes; i.e., RecR was excluded. Preincubation of RecF, RecO, and RecR proteins prior to addition of Ssb protein resulted in the formation of complexes consisting of RecF, RecO, RecR, and Ssb proteins. These data suggest that one role of RecF is to stabilize the interaction of RecR with RecO in the presence of Ssb protein. Finally, we found that interactions of RecF with RecO are lost in the presence of ATP. We discuss these results to explain how the RecF–RecO–RecR complex functions as an anti-Ssb factor.
Abstract:
Ethylene-responsive element-binding proteins (EREBPs) of tobacco (Nicotiana tabacum L.) bind to the GCC box of many pathogenesis-related (PR) gene promoters, including osmotin (PR-5). The two GCC boxes on the osmotin promoter are known to be required, but not sufficient, for maximal ethylene responsiveness. EREBPs participate in the signal transduction pathway leading from exogenous ethylene application and pathogen infection to PR gene induction. In this study EREBP3 was used as bait in a yeast two-hybrid interaction trap with a tobacco cDNA library as prey to isolate signal transduction pathway intermediates that interact with EREBPs. One of the strongest interactors was found to encode a nitrilase-like protein (NLP). Nitrilase is an enzyme involved in auxin biosynthesis. NLP interacted with other EREBP family members, namely tobacco EREBP2 and tomato (Lycopersicon esculentum L.) Pti4/5/6. The EREBP2-EREBP3 interaction with NLP required part of the DNA-binding domain. The specificity of interaction was further confirmed by protein-binding studies in solution. We propose that the EREBP-NLP interaction serves to regulate PR gene expression by sequestration of EREBPs in the cytoplasm.
Abstract:
We studied transcription initiation in the mitochondria of higher plants, with particular respect to promoter structures. Conserved elements of these promoters have been successfully identified by in vitro transcription systems in different species, whereas the protein components involved are still unknown. Proteins binding to double-stranded oligonucleotides representing different parts of the pea (Pisum sativum) mitochondrial atp9 promoter were analyzed by denaturation-renaturation chromatography and mobility-shift experiments. Two DNA-protein complexes were detected, which appeared to be sequence specific in competition experiments. Purification by hydroxyapatite, phosphocellulose, and reversed-phase high-pressure liquid chromatography separated two polypeptides with apparent molecular masses of 32 and 44 kD. Both proteins bound to conserved structures of the pea atp9 and the heterologous Oenothera berteriana atp1 promoters and to sequences just upstream. Possible functions of these proteins in mitochondrial promoter recognition are discussed.
Abstract:
We have identified a class of proteins that bind single-stranded telomeric DNA and are required for the nuclear organization of telomeres and/or telomere-associated proteins. Rlf6p was identified by its sequence similarity to Gbp1p, a single-stranded telomeric DNA-binding protein from Chlamydomonas reinhardtii. Rlf6p and Gbp1p bind yeast single-stranded G-strand telomeric DNA. Both proteins include at least two RNA recognition motifs, which are found in many proteins that interact with single-stranded nucleic acids. Disruption of RLF6 alters the distribution of repressor/activator protein 1 (Rap1p), a telomere-associated protein. In wild-type yeast cells, Rap1p localizes to a small number of perinuclear spots, while in rlf6 cells Rap1p appears diffuse and nuclear. Interestingly, telomere position effect and telomere length control, which require RAP1, are unaffected by rlf6 mutations, demonstrating that Rap1p localization can be uncoupled from other Rap1p-dependent telomere functions. In addition, expression of Chlamydomonas GBP1 restores perinuclear, punctate Rap1p localization in rlf6 mutant cells. The functional complementation of a fungal gene by an algal gene suggests that Rlf6p and Gbp1p are members of a conserved class of single-stranded telomeric DNA-binding proteins that influence nuclear organization. Furthermore, it demonstrates that, despite their unusual codon bias, C. reinhardtii genes can be efficiently translated in Saccharomyces cerevisiae cells.
Abstract:
DNA-binding proteins are crucial for various cellular processes and hence have become an important target for both basic research and drug development. With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable to establish an automated method for rapidly and accurately identifying DNA-binding proteins based on their sequence information alone. Because all biological species have developed from a very limited number of ancestral species, it is important to take evolutionary information into account in developing such a high-throughput tool. In view of this, a new predictor was proposed by incorporating the evolutionary information into the general form of pseudo amino acid composition via the top-n-gram approach. Comparison of the new predictor with existing methods, via both a jackknife test and an independent data-set test, showed that the new predictor outperformed its counterparts. It is anticipated that the new predictor may become a useful vehicle for identifying DNA-binding proteins. It has not escaped our notice that the novel approach of incorporating evolutionary information into the formulation of statistical samples can be used to identify many other protein attributes as well.
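The jackknife test named in the abstract above is commonly understood as leave-one-out validation: each sample is held out in turn, the model is built on the rest, and the held-out sample is predicted. The sketch below illustrates only that loop. The nearest-centroid classifier, the function names, and the toy feature vectors are stand-ins chosen for brevity; they are not the predictor or the features described in the abstract.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x
    (a deliberately simple stand-in classifier)."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(X, y, predict=nearest_centroid_predict):
    """Leave-one-out loop: hold out each sample once and predict it
    from a model built on all remaining samples."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy two-feature samples: label 1 = binding, 0 = non-binding (made up).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = [1, 1, 1, 0, 0, 0]
score = jackknife_accuracy(X, y)
```

Because every sample is used for testing exactly once and the split is deterministic, the jackknife gives a single reproducible score for a given dataset, at the cost of training the model once per sample.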
Abstract:
DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Many methods have been proposed so far, but most of them rely on a single classifier and cannot make full use of the large number of negative samples to improve predictive performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot, with performance improvements in the range of 3.97-9.52% in accuracy (ACC) and 0.08-0.19 in Matthews correlation coefficient (MCC). Furthermore, when the benchmark dataset was expanded with negative samples, enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. These results indicate that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public. © 2014 Ruifeng Xu et al.
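The two metrics quoted in the abstract above, ACC and MCC, can be computed from the four cells of a binary confusion matrix using their standard definitions. The sketch below shows this; the counts in the example are made up for illustration and do not come from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when any marginal
    total is zero, a common convention for the undefined case."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 40 true positives, 45 true negatives,
# 5 false positives, 10 false negatives.
acc_value = accuracy(40, 45, 5, 10)
mcc_value = mcc(40, 45, 5, 10)
```

MCC ranges from -1 to 1 and stays informative when the classes are imbalanced, which is relevant when a benchmark is expanded with many negative samples, whereas ACC alone can look high simply because the majority class is predicted well.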
Abstract:
Background: DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Identification of DNA-binding proteins is one of the major challenges in the field of genome annotation. Several computational methods have been proposed in the literature for DNA-binding protein identification. However, most of them cannot provide a valuable knowledge base for our understanding of DNA-protein interactions. Results: We first presented a new protein sequence encoding method called PSSM Distance Transformation, and then constructed a DNA-binding protein identification method (SVM-PSSM-DT) by combining PSSM Distance Transformation with a support vector machine (SVM). First, the PSSM profiles are generated by using the PSI-BLAST program to search the non-redundant (NR) database. Next, the PSSM profiles are transformed into uniform numeric representations by the distance transformation scheme. Lastly, the resulting uniform numeric representations are fed into an SVM classifier, which predicts whether a sequence can bind to DNA. In a benchmark test on 525 DNA-binding and 550 non-DNA-binding proteins using jackknife validation, the present model achieved an ACC of 79.96%, an MCC of 0.622 and an AUC of 86.50%. This performance is considerably better than that of most existing state-of-the-art predictive methods. When tested on a recently constructed independent dataset, PDB186, SVM-PSSM-DT also achieved the best performance, with an ACC of 80.00%, an MCC of 0.647 and an AUC of 87.40%, and outperformed some existing state-of-the-art methods. Conclusions: The experimental results demonstrate that PSSM Distance Transformation is a viable protein sequence encoding method and that SVM-PSSM-DT is a useful tool for identifying DNA-binding proteins.
A user-friendly web-server for SVM-PSSM-DT was constructed, which is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/PSSM-DT/.
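The step in which variable-length PSSM profiles become "uniform numeric representations" can be pictured with a small sketch. The abstract does not give the formula, so the transformation below is an assumption for illustration only: for each lag d and each pair of score columns (i, j), it averages the product of column i at position k and column j at position k + d. The function name `distance_transform` and the toy three-column profiles are hypothetical; a real PSSM has 20 columns.

```python
def distance_transform(pssm, max_lag):
    """Map a PSSM (list of rows, one row per residue position) to a
    fixed-length vector of n_cols * n_cols * max_lag features.

    Assumed form: feature(i, j, d) is the mean over positions k of
    pssm[k][i] * pssm[k + d][j]. This is not taken from the paper.
    """
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    features = []
    for lag in range(1, max_lag + 1):
        span = n_rows - lag  # number of position pairs at this lag
        for i in range(n_cols):
            for j in range(n_cols):
                total = sum(pssm[k][i] * pssm[k + lag][j] for k in range(span))
                features.append(total / span if span > 0 else 0.0)
    return features


# Two toy profiles of different lengths (4 and 6 positions, 3 columns).
short_profile = [[1, 0, 2], [0, 1, 1], [2, 1, 0], [1, 1, 1]]
long_profile = [[0, 1, 1], [1, 0, 2], [1, 1, 0], [2, 0, 1], [0, 2, 1], [1, 1, 1]]
vec_short = distance_transform(short_profile, max_lag=2)
vec_long = distance_transform(long_profile, max_lag=2)
```

Both vectors have the same length (3 x 3 x 2 = 18) even though the profiles differ in length, which is the property a fixed-input classifier such as an SVM needs.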
Abstract:
STAT transcription factors are expressed in many cell types and bind to similar sequences. However, different STAT gene knock-outs show very distinct phenotypes. To determine whether differences between the binding specificities of STAT proteins account for these effects, we compared the sequences bound by STAT1, STAT5A, STAT5B, and STAT6. One sequence set was selected from random oligonucleotides by recombinant STAT1, STAT5A, or STAT6. For another set including many weak binding sites, we quantified the relative affinities to STAT1, STAT5A, STAT5B, and STAT6. We compared the results to the binding sites in natural STAT target genes identified by others. The experiments confirmed the similar specificity of different STAT proteins. Detailed analysis indicated that STAT5A specificity is more similar to that of STAT6 than that of STAT1, as expected from the evolutionary relationships. The preference of STAT6 for sites in which the half-palindromes (TTC) are separated by four nucleotides (N(4)) was confirmed, but analysis of weak binding sites showed that STAT6 binds fairly well to N(3) sites. As previously reported, STAT1 and STAT5 prefer N(3) sites; however, STAT5A, but not STAT1, weakly binds N(4) sites. None of the STATs bound to half-palindromes. There were no specificity differences between STAT5A and STAT5B.
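The N(3) and N(4) site classes discussed in the abstract above can be located in a sequence with a short scan. The sketch assumes the full palindromic site has the form TTC, then a spacer of three or four arbitrary nucleotides, then GAA (the reverse complement of TTC); the abstract itself only names the TTC half-palindrome and the spacer lengths. The function name `find_stat_sites` and the test sequence are invented for illustration.

```python
import re


def find_stat_sites(seq):
    """Return sorted (start, spacer_length, site) tuples for every
    TTC-N3-GAA and TTC-N4-GAA match, including overlapping ones."""
    seq = seq.upper()
    hits = []
    for spacer in (3, 4):
        # Lookahead so that overlapping candidate sites are all reported.
        pattern = re.compile(r"(?=(TTC[ACGT]{%d}GAA))" % spacer)
        for match in pattern.finditer(seq):
            hits.append((match.start(), spacer, match.group(1)))
    return sorted(hits)


# Made-up sequence containing one N(3) site and one N(4) site.
example = "AATTCCTGGAATTTTCACGTGAACC"
sites = find_stat_sites(example)
```

For the example sequence this reports an N(3) site at position 2 and an N(4) site at position 13 (0-based). The scan only finds exact matches to the assumed pattern, so it says nothing about the weak binding sites or relative affinities measured in the study.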
Abstract:
The nuclear factor I (NFI) family consists of sequence-specific DNA-binding proteins that activate both transcription and adenovirus DNA replication. We have characterized three new members of the NFI family that belong to the Xenopus laevis NFI-X subtype and differ in their C-termini. We show that these polypeptides can activate transcription in HeLa and Drosophila Schneider line 2 cells, using an activation domain that is subdivided into adjacent variable and subtype-specific domains each having independent activation properties in chimeric proteins. Together, these two domains constitute the full NFI-X transactivation potential. In addition, we find that the X. laevis NFI-X proteins are capable of activating adenovirus DNA replication through their conserved N-terminal DNA-binding domains. Surprisingly, their in vitro DNA-binding activities are specifically inhibited by a novel repressor domain contained within the C-terminal part, while the dimerization and replication functions per se are not affected. However, inhibition of DNA-binding activity in vitro is relieved within the cell, as transcriptional activation occurs irrespective of the presence of the repressor domain. Moreover, the region comprising the repressor domain participates in transactivation. Mechanisms that may allow the relief of DNA-binding inhibition in vivo and trigger transcriptional activation are discussed.