281 results for SPLICEOSOMAL INTRONS
Abstract:
Chlorarachniophyte algae contain a complex, multi-membraned chloroplast derived from the endosymbiosis of a eukaryotic alga. The vestigial nucleus of the endosymbiont, called the nucleomorph, contains only three small linear chromosomes with a haploid genome size of 380 kb and is the smallest known eukaryotic genome. Nucleotide sequence data from a subtelomeric fragment of chromosome III were analyzed as a preliminary investigation of the coding capacity of this vestigial genome. Several housekeeping genes including U6 small nuclear RNA (snRNA), ribosomal proteins S4 and S13, a core protein of the spliceosome [small nuclear ribonucleoprotein (snRNP) E], and a clp-like protease (clpP) were identified. Expression of these genes was confirmed by combinations of Northern blot analysis, in situ hybridization, immunocytochemistry, and cDNA analysis. The protein-encoding genes are typically eukaryotic in overall structure and their messenger RNAs are polyadenylylated. A novel feature is the abundance of 18-, 19-, or 20-nucleotide introns, the smallest spliceosomal introns known. Two of the genes, U6 and S13, overlap, while another two genes, snRNP E and clpP, are cotranscribed in a single mRNA. The overall gene organization is extraordinarily compact, making the nucleomorph a unique model for eukaryotic genomics.
Abstract:
Introns are portions of genes that are transcribed into messenger RNA but removed during splicing, before the gene products are synthesized. Eukaryotes have spliceosomal introns, which are removed from messenger RNA by spliceosomes. Introns enable several important processes, such as alternative splicing, nonsense-mediated mRNA decay, and the encoding of functional RNAs. These roles raise questions about the influence of natural selection on intron evolution. We are interested in mutations that can modify a gene's products by changing the splice sites of its introns. Such mutations can affect how an organism functions and are therefore an interesting subject of study, but no software currently exists for studying them adequately. The goal of our project was therefore to design a method for detecting and analyzing changes in the splice sites of spliceosomal introns. We ultimately developed a method that identifies the evolutionary events affecting spliceosomal introns in a given set of species. The method was run on a set of oomycete species. Several of the detected events changed both splice sites and proteins, but many of the events found modified introns without affecting the gene products. Our method still lacks a final step for in-depth analysis of the collected data. Nevertheless, the current method is easily reproducible and automates genome analysis for event detection. The files it produces can then be analyzed in each study to answer specific questions.
Abstract:
Intron splicing is one of the most important steps involved in the maturation process of a pre-mRNA. Although the sequence profiles around the splice sites have been studied extensively, the levels of sequence identity between the exonic sequences preceding the donor sites and the intronic sequences preceding the acceptor sites have not been examined as thoroughly. In this study we investigated identity patterns between the last 15 nucleotides of the exonic sequence preceding the 5' splice site and the intronic sequence preceding the 3' splice site in a set of human protein-coding genes that do not exhibit intron retention. We found that almost 60% of consecutive exons and introns in human protein-coding genes share at least two identical nucleotides at their 3' ends and, on average, the sequence identity length is 2.47 nucleotides. Based on our findings we conclude that the 3' ends of exons and introns tend to have longer identical sequences when taken from the same gene than when taken from different genes. Our results hold even if the pairs are non-consecutive in the transcription order. (C) 2012 Elsevier Ltd. All rights reserved.
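The 3'-end comparison described in this abstract can be illustrated with a minimal sketch. It assumes that "sequence identity length" means the number of matching nucleotides counted leftward from the 3' ends of the two 15-nucleotide windows; the abstract does not spell out the exact definition, and the function names and example sequences below are hypothetical.

```python
from statistics import mean


def shared_3prime_length(exon: str, intron: str, window: int = 15) -> int:
    """Count identical nucleotides at the 3' ends of an exon and an intron.

    Assumption: identity is measured as the longest common suffix of the
    last `window` nucleotides of each sequence.
    """
    exon_tail = exon[-window:].upper()
    intron_tail = intron[-window:].upper()
    matched = 0
    # Walk both tails from the 3' end toward the 5' end.
    for a, b in zip(reversed(exon_tail), reversed(intron_tail)):
        if a != b:
            break
        matched += 1
    return matched


def mean_shared_length(pairs, window: int = 15) -> float:
    """Average shared 3' length over (exon, intron) sequence pairs."""
    return mean(shared_3prime_length(e, i, window) for e, i in pairs)


# Toy example: the exon ends in ...AAG and the intron in ...CAG,
# so the two share "AG" at their 3' ends.
print(shared_3prime_length("ATGGCCAAG", "GTAAGTTTTCTTACAG"))
```

Applying `mean_shared_length` to same-gene pairs and to pairs drawn from different genes would give the kind of comparison the abstract reports, under the stated assumption about how identity is measured.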
Abstract:
The gene encoding the glycolytic enzyme triose-phosphate isomerase (TPI; EC 5.3.1.1) has been central to the long-standing controversy on the origin and evolutionary significance of spliceosomal introns by virtue of its pivotal support for the introns-early view, or exon theory of genes. Putative correlations between intron positions and TPI protein structure have led to the conjecture that the gene was assembled by exon shuffling, and five TPI intron positions are old by the criterion of being conserved between animals and plants. We have sequenced TPI genes from three diverse eukaryotes--the basidiomycete Coprinus cinereus, the nematode Caenorhabditis elegans, and the insect Heliothis virescens--and have found introns at seven novel positions that disrupt previously recognized gene/protein structure correlations. The set of 21 TPI introns now known is consistent with a random model of intron insertion. Twelve of the 21 TPI introns appear to be of recent origin since each is present in but a single examined species. These results, together with their implication that as more TPI genes are sequenced more intron positions will be found, render TPI untenable as a paradigm for the introns-early theory and, instead, support the introns-late view that spliceosomal introns have been inserted into preexisting genes during eukaryotic evolution.
Abstract:
Exon shuffling has been characterized as one of the major evolutionary forces shaping both the genome and the proteome of eukaryotes. This mechanism was particularly important in the creation of multidomain proteins during animal evolution, bringing a number of functional genetic novelties. Here, genome information from a variety of eukaryotic species was used to address several issues related to the evolutionary history of exon shuffling. By comparing all protein sequences within each species, we were able to characterize exon shuffling signatures throughout metazoans. Intron phase (the position of the intron regarding the codon) and exon symmetry (the pattern of flanking introns for a given exon or block of adjacent exons) were features used to evaluate exon shuffling. We confirmed previous observations that exon shuffling mediated by phase 1 introns (1-1 exon shuffling) is the predominant kind in multicellular animals. Evidence is provided that this pattern was already in place at the early steps of animal evolution, supported by a detectable presence of 1-1 shuffling units in Trichoplax adhaerens and a considerable prevalence of them in Nematostella vectensis. In contrast, Monosiga brevicollis, one of the closest relatives of metazoans, and Arabidopsis thaliana showed no evidence of 1-1 exon or domain shuffling above what would be expected by chance. Instead, exon shuffling events are less abundant and predominantly mediated by phase 0 introns (0-0 exon shuffling) in those non-metazoan species. Moreover, an intermediate pattern of 1-1 and 0-0 exon shuffling was observed for the placozoan T. adhaerens, a primitive animal. Finally, characterization of flanking intron phases around domain borders allowed us to identify a common set of symmetric 1-1 domains that have been shuffled throughout the metazoan lineage.
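Intron phase and exon symmetry, as defined in this abstract, can be computed from coding exon lengths alone. The sketch below is illustrative and is not the authors' pipeline. It assumes the input lists only the coding portion of each exon, in transcript order, and the function names are hypothetical.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Phase of each intron: coding nucleotides upstream of it, modulo 3.

    Phase 0 falls between two codons, phase 1 after the first nucleotide
    of a codon, and phase 2 after the second.
    """
    # There is one intron after every exon except the last.
    return [total % 3 for total in accumulate(coding_exon_lengths[:-1])]


def exon_classes(coding_exon_lengths):
    """Label each internal exon by its flanking intron phases, e.g. '1-1'."""
    phases = intron_phases(coding_exon_lengths)
    return [f"{up}-{down}" for up, down in zip(phases, phases[1:])]


def is_symmetric(label: str) -> bool:
    """An exon is symmetric when both flanking introns share a phase."""
    upstream, downstream = label.split("-")
    return upstream == downstream


lengths = [100, 99, 50, 30]          # toy gene with three introns
print(intron_phases(lengths))        # phases of the three introns
print(exon_classes(lengths))         # classes of the two internal exons
```

A symmetric exon always has a length divisible by three, which is why inserting or removing it leaves the downstream reading frame intact.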
Abstract:
Cells of several major algal groups are evolutionary chimeras of two radically different eukaryotic cells. Most of these “cells within cells” lost the nucleus of the former algal endosymbiont. But after hundreds of millions of years cryptomonads still retain the nucleus of their former red algal endosymbiont as a tiny relict organelle, the nucleomorph, which has three minute linear chromosomes, but their function and the nature of their ends have been unclear. We report extensive cryptomonad nucleomorph sequences (68.5 kb), from one end of each of the three chromosomes of Guillardia theta. Telomeres of the nucleomorph chromosomes differ dramatically from those of other eukaryotes, being repeats of the 23-mer sequence (AG)7AAG6A, not a typical hexamer (commonly TTAGGG). The subterminal regions comprising the rRNA cistrons and one protein-coding gene are exactly repeated at all three chromosome ends. Gene density (one per 0.8 kb) is the highest for any cellular genome. None of the 38 protein-coding genes has spliceosomal introns, in marked contrast to the chlorarachniophyte nucleomorph. Most identified nucleomorph genes are for gene expression or protein degradation; histone, tubulin, and putatively centrosomal ranbpm genes are probably important for chromosome segregation. No genes for primary or secondary metabolism have been found. Two of the three tRNA genes have introns, one in a hitherto undescribed location. Intergenic regions are exceptionally short; three genes transcribed by two different RNA polymerases overlap their neighbors. The reported sequences encode two essential chloroplast proteins, FtsZ and rubredoxin, thus explaining why cryptomonad nucleomorphs persist.
Abstract:
Group II introns are widely believed to have been ancestors of spliceosomal introns, yet little is known about their own evolutionary history. In order to address the evolution of mobile group II introns, we have compiled 71 open reading frames (ORFs) related to group II intron reverse transcriptases and subjected their derived amino acid sequences to phylogenetic analysis. The phylogenetic tree was rooted with reverse transcriptases (RTs) of non-long terminal repeat retroelements, and the inferred phylogeny reveals two major clusters which we term the mitochondrial and chloroplast-like lineages. Bacterial ORFs are mainly positioned at the bases of the two lineages but with weak bootstrap support. The data give an overview of an apparently high degree of horizontal transfer of group II intron ORFs, mostly among related organisms but also between organelles and bacteria. The Zn domain (nuclease) and YADD motif (RT active site) were lost multiple times during evolution. Differences in domain structures suggest that the oldest ORFs were concise, while the ORF in the mitochondrial lineage subsequently expanded in three locations. The data are consistent with a bacterial origin for mobile group II introns.
Abstract:
Protein coding genes are comprised of protein-coding exons and non-protein-coding introns. The process of splicing involves removal of the introns and joining of the exons to form a mature messenger RNA, which subsequently undergoes translation into polypeptide. The spliceosome is a large, RNA/protein assembly of five small nuclear RNAs as well as over 300 proteins, which catalyzes intron removal and exon ligation. The selection of specific exons for inclusion in the mature messenger RNA is spatiotemporally regulated and results in production of an enormous diversity of polypeptides from a single gene locus. This phenomenon, known as alternative splicing, is regulated, in part, by protein splicing factors, which target the spliceosome to exon/intron boundaries. The first part of my dissertation (Chapters II and III) focuses on the discovery and characterization of the 45 kilodalton FK506 binding protein (FKBP45), which I discovered in the silk moth, Bombyx mori, as a U1 small nuclear RNA binding protein. This protein family binds the immunosuppressants FK506 and rapamycin and contains peptidyl-prolyl cis-trans isomerase activity, which converts polypeptides from cis to trans about a proline residue. This is the first time that an FKBP has been identified in the spliceosome. The second section of my dissertation (Chapters IV, V, VI and VII) is an investigation of the potential role of small nuclear RNA sequence variants in the control of splicing. I identified 46 copies of small nuclear RNAs in the 6X whole genome shotgun of the Bombyx mori p50T strain. These variants may play a role in differential binding of specific proteins that mediate alternative splicing. Along these lines, further investigation of U2 snRNA sequence variants in Bombyx mori demonstrated that some U2 snRNAs preferentially assemble into high molecular weight spliceosomal complexes over others. 
Expression of snRNA variants may represent another mechanism by which the cell is able to fine tune the splicing process.
Abstract:
Background: The majority of introns in gene transcripts are found within the coding sequences (CDSs). A small but significant fraction of introns are also found to reside within the untranslated regions (5′UTRs and 3′UTRs) of expressed sequences. Alignment of the whole genome and expressed sequence tags (ESTs) of the model plant Arabidopsis thaliana has identified introns residing in both coding and non-coding regions of the genome. Results: A bioinformatic analysis revealed the following: (1) the density of introns in 5′UTRs is similar to that in CDSs but much higher than that in 3′UTRs; (2) the 5′UTR introns are preferentially located close to the initiating ATG codon; (3) introns in the 5′UTRs are, on average, longer than introns in the CDSs and 3′UTRs; and (4) 5′UTR introns have a different nucleotide composition from that of CDS and 3′UTR introns. Furthermore, we show that the 5′UTR intron of the A. thaliana EF1α-A3 gene affects gene expression and that the size of the 5′UTR intron influences the level of gene expression. Conclusion: Introns within the 5′UTR show specific features that distinguish them from introns that reside within the coding sequence and the 3′UTR. In the EF1α-A3 gene, the presence of a long intron in the 5′UTR is sufficient to enhance gene expression in plants in a size-dependent manner.
Abstract:
The removal of non-coding sequences, introns, is an essential part of messenger RNA processing. In most metazoan organisms, the U12-type spliceosome processes a subset of introns containing highly conserved recognition sequences. U12-type introns constitute less than 0.5% of all introns and reside preferentially in genes related to information processing functions, as opposed to genes encoding metabolic enzymes. It has previously been shown that the excision of U12-type introns is inefficient compared to that of U2-type introns, supporting the model that these introns could provide a rate-limiting control for gene expression. The low efficiency of U12-type splicing is believed to have important consequences for gene expression by limiting the production of mature mRNAs from genes containing U12-type introns. The inefficiency of U12-type splicing has been attributed to the low abundance of the components of the U12-type spliceosome in cells, but this hypothesis has not been proven. The aim of the first part of this work was to study the effect of the abundance of the spliceosomal snRNA components on splicing. Cells with a low abundance of the U12-type spliceosome were found to inefficiently process U12-type introns encoded by a transfected construct, but the expression levels of endogenous genes were not found to be affected by the abundance of the U12-type spliceosome. However, significant levels of endogenous unspliced U12-type intron-containing pre-mRNAs were detected in cells. Together these results support the idea that U12-type splicing may limit gene expression in some situations. The inefficiency of U12-type splicing has also promoted the idea that the U12-type spliceosome may control gene expression, limiting the mRNA levels of some U12-type intron-containing genes.
While the identities of the primary target genes that contain U12-type introns are relatively well known, little has previously been known about the downstream genes and pathways potentially affected by the efficiency of U12-type intron processing. Here, the effects of U12-type splicing efficiency on a whole organism were studied in a Drosophila line with a mutation in an essential U12-type spliceosome component. Genes containing U12-type introns showed variable gene-specific responses to the splicing defect, which points to variation in the susceptibility of different genes to changes in splicing efficiency. Surprisingly, microarray screening revealed that metabolic genes were enriched among downstream effects, and that the phenotype could largely be attributed to one U12-type intron-containing mitochondrial gene. Gene expression control by the U12-type spliceosome could thus have widespread effects on metabolic functions in the organism. The subcellular localization of the U12-type spliceosome components was studied in response to a recent dispute on the localization of the U12-type spliceosome. All components studied were found to be nuclear, indicating that the processing of U12-type introns occurs within the nucleus, thus clarifying a question central to the field. The results suggest that the U12-type spliceosome can limit the expression of genes that contain U12-type introns in a gene-specific manner. Through its limiting role in pre-mRNA processing, the U12-type splicing activity can affect specific genetic pathways, which in the case of Drosophila are involved in metabolic functions.
Abstract:
An analysis of the secondary structures of 68 exon-intron-exon sequence segments and the corresponding exon-exon segments shows that about 90% of the 5' and 3' terminal G bases of introns (the splicing sites) are situated in the loops of secondary structures or at the ends of stems near the loops, and that most of the "G"s in loops are close to the ends of the loops. Approximately 92% of the connecting sites of adjoining exons show similar features. About 82% of the branch point "A"s are situated in loops or at the ends of stems near the loops. Splicing sites and branch points approach each other in space because of the folding.
Abstract:
A comparative analysis of oligonucleotide usage in the intron sequences of two sets of yeast genes, with higher and lower transcription frequencies respectively, has shown that the intron sequence structures of the two sets of genes are different. There are more potential binding sites for transcription factors in the introns of the genes with high transcription frequencies. It is therefore speculated that introns regulate the transcription of genes, but more evidence is needed to support this speculation. Detailed comparative analyses of the distribution (length and position) of introns and exons in the two sets of gene sequences also show that there is an obvious boundary between the lengths of the two sets of introns. There is no such boundary between the lengths of the two sets of exons, although their mean lengths differ. The same holds for gene lengths (introns plus exons) as for exon lengths. In terms of relative position, the introns in both sets of genes are biased toward the 5' ends of genes. In terms of actual position, however, more introns in highly transcribed genes tend to be located toward the 5' ends of genes, and some are even located in the 5'-UTR. These results suggest that gene transcription rates are related to the lengths of introns, but not to the lengths of exons or of gene sequences. The positions of introns may also influence transcription rates. Transcriptional regulation by introns may be correlated with transcriptional regulation by the upstream regions of genes, or may be a continuation of it.
Abstract:
Many experimental studies have shown that many introns of eukaryotic genes function as regulators of transcription. However, comprehensive studies of this problem have not yet been conducted. After checking the transcription frequencies of some Saccharomyces cerevisiae (yeast) genes and their introns, a remarkable phenomenon was discovered: in general, the introns of genes with higher transcription frequencies are longer, and the introns of genes with lower transcription frequencies are shorter. This suggests that the longer introns of genes with higher transcription frequencies may contain some characteristic sequence structures that could enhance the transcription of genes. Therefore, two sets of introns of yeast genes were chosen for further study. The transcription frequencies of the first set of genes are higher (>30), and those of the second set of genes are lower (≤10). Statistical comparative analyses of the occurrence frequencies of oligonucleotides (mainly tetranucleotides and pentanucleotides) detected some oligonucleotides whose occurrence frequencies in the first set of introns are significantly higher than those in the second set of introns, and are also significantly higher than those in the exons flanking the introns of the first set. Some of these extracted oligonucleotides are the same as the regulatory elements of transcription revealed by experimental analyses. In addition, the distributions of these extracted oligonucleotides in the two sets of introns and the exons show that the sequence structures of the first set of introns are favorable for the transcription of genes.
Abstract:
We conducted a comparative statistical analysis of tetra- through hexanucleotide frequencies in two sets of introns of yeast genes. The first set consisted of introns of genes that have transcription rates higher than 30 mRNAs/h, while the second set contained introns of genes whose transcription rates were lower than or equal to 10 mRNAs/h. Some oligonucleotides whose occurrence frequencies in the first set of introns are significantly higher than those in the second set of introns were detected. The frequencies of occurrence of most of these detected oligonucleotides are also significantly higher than those in the exons flanking the introns of the first set. Interestingly, some of these detected oligonucleotides are the same as well-known "signature" sequences of transcriptional regulatory elements. This could imply the existence of potential positive regulatory motifs of transcription in yeast introns. (C) 2003 Elsevier Ltd. All rights reserved.
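The oligonucleotide comparison between the two intron sets can be sketched roughly as follows. This is not the authors' statistical procedure: the abstract does not name the significance test used, so the sketch substitutes a simple smoothed frequency ratio as a stand-in. The function names, the `min_ratio` threshold, and the pseudocount are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_kmers(high_set, low_set, k=4, min_ratio=2.0, pseudocount=1.0):
    """Return k-mers whose smoothed frequency in high_set exceeds low_set.

    Assumption: enrichment is scored as a ratio of pseudocount-smoothed
    relative frequencies, not by a formal significance test.
    """
    high, low = kmer_counts(high_set, k), kmer_counts(low_set, k)
    high_total, low_total = sum(high.values()), sum(low.values())
    vocabulary = 4 ** k  # number of possible k-mers over A, C, G, T
    result = {}
    for kmer, count in high.items():
        f_high = (count + pseudocount) / (high_total + pseudocount * vocabulary)
        f_low = (low[kmer] + pseudocount) / (low_total + pseudocount * vocabulary)
        ratio = f_high / f_low
        if ratio >= min_ratio:
            result[kmer] = ratio
    return result


# Toy data standing in for introns of highly and weakly transcribed genes.
print(enriched_kmers(["TATATATA"], ["GCGCGCGC"], k=2))
```

Running the same function with the flanking exons of the first set as `low_set` mirrors the second comparison described in the abstract. A real analysis would replace the ratio threshold with a proper test and a correction for multiple comparisons.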