70 results for conserved noncoding sequence
Abstract:
The HIV-1 transcript is alternatively spliced to over 30 different mRNAs. Whether RNA secondary structure can influence HIV-1 RNA alternative splicing has not previously been examined. Here we have determined the secondary structure of the HIV-1/BRU RNA segment, containing the alternative A3, A4a, A4b, A4c and A5 3′ splice sites. Site A3, required for tat mRNA production, is contained in the terminal loop of a stem–loop structure (SLS2), which is highly conserved in HIV-1 and related SIVcpz strains. The exon splicing silencer (ESS2) acting on site A3 is located in a long irregular stem–loop structure (SLS3). Two SLS3 domains were protected by nuclear components under splicing condition assays. One contains the A4c branch points and a putative SR protein binding site. The other one is adjacent to ESS2. Unexpectedly, only the 3′ A residue of ESS2 was protected. The suboptimal A3 polypyrimidine tract (PPT) is base paired. Using site-directed mutagenesis and transfection of a mini-HIV-1 cDNA into HeLa cells, we found that, in a wild-type PPT context, a mutation of the A3 downstream sequence that reinforced SLS2 stability decreased site A3 utilization. This was not the case with an optimized PPT. Hence, sequence and secondary structure of the PPT may cooperate in limiting site A3 utilization.
Abstract:
The RegA proteins from the bacteriophage T4 and RB69 are translational repressors that control the expression of multiple phage mRNAs. RegA proteins from the two phages share 78% sequence identity; however, in vivo expression studies have suggested that the RB69 RegA protein binds target RNAs with a higher affinity than T4 RegA protein. To study the RNA binding properties of T4 and RB69 RegA proteins more directly, the binding sites of RB69 RegA protein on synthetic RNAs corresponding to the translation initiation region of two RB69 target genes were mapped by RNase protection assays. These assays revealed that RB69 RegA protein protects nucleotides –9 to –3 (relative to the start codon) on RB69 gene 44, which contains the sequence GAAAAUU. On RB69 gene 45, the protected site (nucleotides –8 to –3) contains a similar purine-rich sequence: GAAAUA. Interestingly, T4 RegA protein protected the same nucleotides on these RNAs. To examine the specificity of RNA binding, quantitative RNA gel shift assays were performed with synthetic RNAs corresponding to recognition elements (REs) in three T4 and three RB69 mRNAs. Comparative gel shift assays demonstrated that RB69 RegA protein has an ∼7-fold higher affinity for T4 gene 44 RE RNA than T4 RegA protein. RB69 RegA protein also binds RB69 gene 44 RE RNA with a 4-fold higher affinity than T4 RegA protein. On the other hand, T4 RegA exhibited a higher affinity than RB69 RegA protein for RB69 gene 45 RE RNA. With respect to their affinities for cognate RNAs, both RegA proteins exhibited the following hierarchy of affinities: gene 44 > gene 45 > regA. Interestingly, T4 RegA exhibited the highest affinity towards RB69 gene 45 RE RNA, whereas RB69 RegA protein had the highest affinity for T4 gene 44 RE RNA. The helix–loop groove RNA binding motif of T4 RegA protein is fully conserved in RB69 RegA protein. 
However, homology modeling of the structure of RB69 RegA protein reveals that the divergent residues are clustered in two areas of the surface, and that there are two large areas of high conservation near the helix–loop groove, which may also play a role in RNA binding.
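The coordinates quoted in this abstract (nucleotides –9 to –3 relative to the start codon) can be illustrated with a minimal sketch. It assumes the common convention that position –1 is the nucleotide immediately upstream of the A of AUG and that there is no position 0; the abstract does not state its convention. The function name `upstream_window` and the flanking nucleotides in the toy RNA are hypothetical; only the protected sequence GAAAAUU comes from the abstract.

```python
def upstream_window(rna, first, last):
    """Return nucleotides at positions first..last (both negative, inclusive)
    relative to the first AUG in rna. Assumes position -1 is the base
    immediately 5' of the A of AUG (an assumed convention)."""
    start = rna.find("AUG")
    if start == -1:
        raise ValueError("no AUG start codon found")
    if not (first <= last < 0):
        raise ValueError("expected first <= last < 0")
    if start + first < 0:
        raise ValueError("window extends past the 5' end of the RNA")
    # Position -k sits at 0-based index start - k.
    return rna[start + first : start + last + 1]

# Toy RNA: flanking bases are invented for illustration only.
toy_rna = "CC" + "GAAAAUU" + "AC" + "AUG" + "GCU"
print(upstream_window(toy_rna, -9, -3))
```

A window of –9 to –3 spans seven nucleotides, which matches the length of GAAAAUU; the –8 to –3 window reported for gene 45 spans six, matching GAAAUA.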
Abstract:
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. The derivation and significance of CKAAPs starting from pairwise structure alignments is described fully in Reddy et al. [Reddy,B.V.B., Li,W.W., Shindyalov,I.N. and Bourne,P.E. (2000) Proteins, in press]. The CKAAPs identified from this theoretical analysis are provided to experimentalists and theoreticians for potential use in protein engineering and modeling. It has been suggested that CKAAPs may be crucial features for protein folding, structural stability and function. Over 170 substructures, as defined by the Combinatorial Extension (CE) database, which are found in approximately 3000 representative polypeptide chains have been analyzed and are available in the CKAAPs DB. CKAAPs DB also provides CKAAPs of the representative set of proteins derived from the CE and FSSP databases. Thus the database contains over 5000 representative polypeptide chains, covering all known structures in the PDB. A web interface to a relational database permits fast retrieval of structure-sequence alignments, CKAAPs and associated statistics. Users may query by PDB ID, protein name, function and Enzyme Classification number. Users may also submit protein alignments of their own to obtain CKAAPs. An interface to display CKAAPs on each structure from a web browser is also being implemented. CKAAPs DB is maintained by the San Diego Supercomputer Center and accessible at the URL http://ckaaps.sdsc.edu.
Abstract:
The chloroplast gene rbcL encodes the large subunit of the CO2-fixing enzyme ribulose-bisphosphate carboxylase. In previous work a target for photo-accelerated degradation of Chlamydomonas reinhardtii rbcL transcripts in vivo was found to lie within the first 63 nucleotides, and a sequence element required for increasing the longevity of transcripts of rbcL-reporter genes was found to occur between nucleotides 170 and 350. Photo-accelerated degradation of rbcL transcripts has been found to require nucleotides 21 to 41. Transcript nucleotides lying between 329 and 334 and between 14 and 27 are essential for stabilizing transcripts in vivo; mutations in either region reduce the longevity of transcripts. It is postulated that the effectiveness of photo-accelerated endonuclease attacks on the nucleotide 21 to 41 region is reduced by physical blockage or distortion of the target sequence by interacting proteins that associate with nucleotides in the 14 to 27 and 329 to 334 regions of the transcripts. Both the nucleotide +329 to +334 stabilizing sequence of rbcL and a transcription enhancing sequence that lies between +126 and +170 encode well conserved (cyanobacteria through angiosperms) amino acid sequences; the evolution of expression control elements within the protein coding sequence of rbcL is considered.
Abstract:
The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA, were identified in the repeat region of all prion mRNAs available in GenBank, but not those of birds and the red slider turtle. An alignment of these mRNAs, which share low sequence homology, shows several co-variations that maintain the pseudoknot pattern. The presence of pseudoknots in yeast Sup35p and Rnq1 suggests acquisition in the prokaryotic era. Computer-generated three-dimensional structures of the human prion pseudoknot highlight protein and RNA interaction domains, which suggest a possible effect in prion protein translation. The role of pseudoknots in prion diseases is discussed, as individuals with extra copies of the 24 nt repeat develop the familial form of Creutzfeldt–Jakob disease.
Abstract:
SF3b155 is an essential spliceosomal protein, highly conserved during evolution. It has been identified as a subunit of splicing factor SF3b, which, together with a second multimeric complex termed SF3a, interacts specifically with the 12S U2 snRNP and converts it into the active 17S form. The protein displays a characteristic intranuclear localization. It is diffusely distributed in the nucleoplasm but highly concentrated in defined intranuclear structures termed “speckles,” a subnuclear compartment enriched in small ribonucleoprotein particles and various splicing factors. The primary sequence of SF3b155 suggests a multidomain structure, different from those of other nuclear speckles components. To identify which part of SF3b155 determines its specific intranuclear localization, we have constructed expression vectors encoding a series of epitope-tagged SF3b155 deletion mutants as well as chimeric combinations of SF3b155 sequences with the soluble cytoplasmic protein pyruvate kinase. Following transfection of cultured mammalian cells, we have identified (i) a functional nuclear localization signal of the monopartite type (KRKRR, amino acids 196–200) and (ii) a molecular segment with multiple threonine-proline repeats (amino acids 208–513), which is essential and sufficient to confer a specific accumulation in nuclear speckles. This latter sequence element, in particular amino acids 208–440, is required for correct subcellular localization of SF3b155 and is also sufficient to target a reporter protein to nuclear speckles. Moreover, this “speckle-targeting sequence” transfers the capacity for interaction with other U2 snRNP components.
Abstract:
It has been suggested that delayed DNA replication underlies fragility at common human fragile sites, but specific sequences responsible for expression of these inducible fragile sites have not been identified. One approach to identify such cis-acting sequences within the large nonexonic regions of fragile sites would be to identify conserved functional elements within orthologous fragile sites by interspecies sequence comparison. This study describes a comparison of orthologous fragile regions, the human FRA3B/FHIT and the murine Fra14A2/Fhit locus. We sequenced over 600 kbp of the mouse Fra14A2, covering the region orthologous to the fragile epicenter of FRA3B, and determined the Fhit deletion break points in a mouse kidney cancer cell line (RENCA). The murine Fra14A2 locus, like the human FRA3B, was characterized by a high AT content. Alignment of the two sequences showed that this fragile region was stable in evolution despite its susceptibility to mitotic recombination on inhibition of DNA replication. There were also several unusual highly conserved regions (HCRs). The positions of predicted matrix attachment regions (MARs), possibly related to replication origins, were not conserved. Of known fragile region landmarks, five cancer cell break points, one viral integration site, and one aphidicolin break cluster were located within or near HCRs. Thus, comparison of orthologous fragile regions has identified highly conserved sequences with possible functional roles in maintenance of fragility.
Abstract:
We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the blocks and prints databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs whose probabilities of matching a false positive range from 10⁻¹⁰ to 10⁻⁵. Highly specific motifs are well suited for searching entire proteomes, while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 proteins of unknown function in the yeast genome.
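The idea that a disjunction of subfamily motifs can cover a whole superfamily can be sketched as follows. This is not the emotif algorithm or its motif syntax: motifs are written here as ordinary regular expressions, and the sequences, motif strings and function names are all hypothetical toy values chosen only to show how sensitivity and false-positive rate might be measured.

```python
import re

# Hypothetical toy sequences; NOT real protein data.
FAMILY = ["ACDKGHL", "ACEKGHL", "SCDRGHM", "SCERGHM"]
BACKGROUND = ["MKTLLVA", "GHLACDK", "ACDKGAL"]

# Each motif is a regular expression; [DE] means "D or E at this position".
# One motif per assumed subfamily.
SUBFAMILY_MOTIFS = ["AC[DE]KGHL", "SC[DE]RGHM"]

def matches_any(seq, motifs):
    """True if at least one motif (i.e. the disjunction) occurs in seq."""
    return any(re.search(m, seq) for m in motifs)

def sensitivity(motifs, positives):
    """Fraction of known family members matched by the motif set."""
    return sum(matches_any(s, motifs) for s in positives) / len(positives)

def false_positive_rate(motifs, negatives):
    """Fraction of non-member sequences matched by the motif set."""
    return sum(matches_any(s, motifs) for s in negatives) / len(negatives)

print(sensitivity(SUBFAMILY_MOTIFS[:1], FAMILY))  # one subfamily motif alone
print(sensitivity(SUBFAMILY_MOTIFS, FAMILY))      # disjunction of both motifs
print(false_positive_rate(SUBFAMILY_MOTIFS, BACKGROUND))
```

In this toy setting a single subfamily motif matches only half of the family, while the disjunction matches all members without matching any background sequence.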
Abstract:
A whole genome cattle-hamster radiation hybrid cell panel was used to construct a map of 54 markers located on bovine chromosome 5 (BTA5). Of the 54 markers, 34 are microsatellites selected from the cattle linkage map and 20 are genes. Among the 20 mapped genes, 10 are new assignments that were made by using the comparative mapping by annotation and sequence similarity strategy. A LOD-3 radiation hybrid framework map consisting of 21 markers was constructed. The relatively low retention frequency of markers on this chromosome (19%) prevented unambiguous ordering of the other 33 markers. The length of the map is 398.7 cR, corresponding to a ratio of ≈2.8 cR5,000/cM. Type I genes were binned for comparison of gene order among cattle, humans, and mice. Multiple internal rearrangements within conserved syntenic groups were apparent upon comparison of gene order on BTA5 and HSA12 and HSA22. A similarly high number of rearrangements were observed between BTA5 and MMU6, MMU10, and MMU15. The detailed comparative map of BTA5 should facilitate identification of genes affecting economically important traits that have been mapped to this chromosome and should contribute to our understanding of mammalian chromosome evolution.
Abstract:
The Deleted in AZoospermia (DAZ) genes encode potential RNA-binding proteins that are expressed exclusively in prenatal and postnatal germ cells and are strong candidates for human fertility factors. Here we report the identification of an additional member of the DAZ gene family, which we have called BOULE. With the identification of this gene, it is clear that the human DAZ gene family contains at least three members: DAZ, a Y-chromosome gene cluster that arose 30–40 million years ago and whose deletion is linked to infertility in men; DAZL, the “father” of DAZ, a gene that maps to human chromosome 3 and has homologs required for both female and male germ cell development in other organisms; and BOULE, a gene that we propose is the “grandfather” of DAZ and maps to human chromosome 2. Human and mouse BOULE resemble the invertebrate meiotic regulator Boule, the proposed ortholog of DAZ, in sequence and expression pattern and hence likely perform a similar meiotic function. In contrast, the previously identified human DAZ and DAZL are expressed much earlier than BOULE in prenatal germ stem cells and spermatogonia; DAZL also is expressed in female germ cells. These data suggest that homologs of the DAZ gene family can be grouped into two subfamilies (BOULE and DAZL) and that members of the DAZ family evolved from an ancestral meiotic regulator, Boule, to assume distinct, yet overlapping, functions in germ cell development.
Abstract:
The CCAAT motif is found in the promoters of many eukaryotic genes. In yeast a single complex of three proteins, termed HAP2, HAP3, and HAP5, binds to this sequence, and in mammals the three components of the equivalent complex (called variously NF-Y, CBF, or CP1) are also represented by single genes. Here we report the presence of multiple genes for each of the components of the CCAAT-binding complex, HAP2,3,5, from Arabidopsis. Three independent Arabidopsis HAP subunit 2 (AtHAP2) cDNAs were cloned by functional complementation of a yeast hap2 mutant, and two independent forms each of AtHAP3 and AtHAP5 cDNAs were detected in the expressed sequence tag database. Additional homologs (two of AtHAP3 and one of AtHAP5) have been identified from available Arabidopsis genomic sequences. Northern-blot analysis indicated ubiquitous expression for each AtHAP2 and AtHAP5 cDNA in a range of tissues, whereas expression of each AtHAP3 cDNA was under developmental and/or environmental regulation. The unexpected presence of multiple forms of each HAP homolog in Arabidopsis, compared with the single genes in yeast and vertebrates, suggests that the HAP2,3,5 complex may play diverse roles in gene transcription in higher plants.
Abstract:
Two transcription factors, C1 (a Myb-domain protein) and B (a basic-helix-loop-helix protein), mediate transcriptional activation of the anthocyanin-biosynthetic genes of maize (Zea mays). To begin to assess the mechanism of activation, the sequences required for C1- and B-mediated induction have been determined for the promoter of a2, a gene that encodes an anthocyanin-biosynthetic enzyme. Analysis of a series of 7- to 13-base-pair substitutions revealed two regions crucial for activation. One region, centered at −99, contained a C1-binding site; mutation of this site abolished C1 binding. The other crucial region was adjacent, centered at −91. C1 binding was not detected at this site, and mutation of this site did not prevent C1 binding at −99. An oligonucleotide dimer containing these two crucial elements was sufficient for C1 and B activation of a heterologous promoter. These data suggest that activation of the anthocyanin genes involves C1 and another factor binding at closely adjacent sites. Mutating a previously postulated anthocyanin consensus sequence within a2 did not significantly reduce activation by C1 and B. However, sequence comparisons of the crucial a2 regions with sequences important for C1- and B-mediated activation in two other anthocyanin promoters led to a revised consensus element shared by these promoters.
Abstract:
The psbA2 gene of a unicellular cyanobacterium, Microcystis aeruginosa K-81, encodes a D1 protein homolog in the reaction center of photosynthetic Photosystem II. The expression of the psbA2 transcript has been shown to be light-dependent as assessed under light and dark (12/12 h) cycling conditions. We aligned the 5′-untranslated leader regions (UTRs) of psbAs from different photosynthetic organisms and identified a conserved sequence, UAAAUAAA or the ‘AU-box’, just upstream of the SD sequences. To clarify the role of 5′-upstream cis-elements containing the AU-box for light-dependent expression of psbA2, a series of deletion and point mutations in the region were introduced into the genome of heterologous cyanobacterium Synechococcus sp. strain PCC 7942, and psbA2 expression was examined. A clear pattern of light-dependent expression was observed in recombinant cyanobacteria carrying the K-81 psbA2 –38/+36 region (which includes the minimal promoter element and a light-dependent cis-element with the AU-box), +1 indicating the transcription start site. A constitutive pattern of expression, in which the transcripts remained almost stable under dark conditions, was obtained in cells harboring the –38/+14 region (the minimal element), indicating that the +14/+36 region with the AU-box is important for the observed light-dependent expression. Point mutations analyses within the AU-box also revealed that changes in number, direction and identity (as assayed by adenine/uridine nucleotide substitutions) influenced the light-dependent pattern of expression. The level of psbA2 transcripts increased markedly in CG- or deletion-box mutants in the dark, strongly indicating that the AU- (AT-) box acts as a negative cis-element. Furthermore, characterization of transcript accumulation in cells treated with rifampicin suggests that psbA2 5′-mRNA is unstable in the dark, supporting the view that the light-dependent expression is controlled at the post-transcriptional level. 
We discuss various mechanisms that may lead to altered mRNA stability such as the binding of factor(s) or ribosomes to the 5′-UTR and possible roles of the AU-box motif and the SD sequence.
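Locating the AU-box relative to the SD sequence in a 5′-UTR, as described in this abstract, amounts to a simple motif scan. The sketch below assumes an illustrative SD-like core of GGAG and uses an invented toy leader sequence; neither is taken from the abstract, which gives only the AU-box sequence UAAAUAAA. The helper name `find_all` is likewise hypothetical.

```python
def find_all(seq, motif):
    """Return 0-based start indices of every occurrence of motif in seq,
    including overlapping ones."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions

AU_BOX = "UAAAUAAA"   # conserved sequence named in the abstract
SD_LIKE = "GGAG"      # assumed Shine-Dalgarno-like core, for illustration only

# Invented toy 5'-UTR followed by a start codon; NOT the real psbA2 leader.
toy_utr = "GCCAUAAAUAAACUGGAGAUCAUG"

au_hits = find_all(toy_utr, AU_BOX)
sd_hits = find_all(toy_utr, SD_LIKE)
for au in au_hits:
    for sd in sd_hits:
        if au + len(AU_BOX) <= sd:
            gap = sd - (au + len(AU_BOX))
            print(f"AU-box at {au} lies upstream of SD-like site at {sd} (gap {gap} nt)")
```

The same scan could be run over aligned leaders from several organisms to check whether the box sits at a consistent distance from the SD sequence.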
Abstract:
The absence of the fragile X mental retardation protein (FMRP), encoded by the FMR1 gene, is responsible for pathologic manifestations in the Fragile X Syndrome, the most frequent cause of inherited mental retardation. FMRP is an RNA-binding protein associated with polysomes as part of a messenger ribonucleoprotein (mRNP) complex. Although its function is poorly understood, various observations suggest a role in local protein translation at neuronal dendrites and in dendritic spine maturation. We present here the identification of CYFIP1/2 (Cytoplasmic FMRP Interacting Proteins) as FMRP interactors. CYFIP1/2 share 88% amino acid sequence identity and represent the two members in humans of a highly conserved protein family. Remarkably, whereas CYFIP2 also interacts with the FMRP-related proteins FXR1P/2P, CYFIP1 interacts exclusively with FMRP. FMRP–CYFIP interaction involves the domain of FMRP also mediating homo- and heteromerization, thus suggesting a competition between interaction among the FXR proteins and interaction with CYFIP. CYFIP1/2 are proteins of unknown function, but CYFIP1 has recently been shown to interact with the small GTPase Rac1, which is implicated in development and maintenance of neuronal structures. Consistent with FMRP and Rac1 localization in dendritic fine structures, CYFIP1/2 are present in synaptosomal extracts.
Abstract:
We have analyzed the level of intraindividual sequence variability (heteroplasmy) of mtDNA in human brain by denaturing gradient gel electrophoresis and sequencing. Single base substitutions, as well as insertions or deletions of single bases, were numerous in the noncoding control region (D-loop), and 35–45% of the molecules from a single tissue showed sequence differences. By contrast, heteroplasmy in coding regions was not detected. The lower level of heteroplasmy in the coding regions is indicative of selection against deleterious mutations. Similar levels of heteroplasmy were found in two brain regions from the same individual, while no heteroplasmy was detected in blood. Thus, heteroplasmy seems to be more frequent in nonmitotic tissues. We observed a 7.7-fold increase in the frequency of deletions/insertions and a 2.2-fold increase in the overall frequency of heteroplasmic mutations in two individuals aged 96 and 99, relative to an individual aged 28. Our results show that intraindividual sequence variability occurs at a high frequency in the noncoding regions of normal human brain and indicate that small insertions and deletions might accumulate with age at a lower rate than large rearrangements.