21 results for SPLICEOSOMAL INTRONS
in University of Queensland eSpace - Australia
Abstract:
Eukaryotic phenotypic diversity arises from multitasking of a core proteome of limited size. Multitasking is routine in computers, as well as in other sophisticated information systems, and requires multiple inputs and outputs to control and integrate network activity. Higher eukaryotes have a mosaic gene structure with a dual output, mRNA (protein-coding) sequences and introns, which are released from the pre-mRNA by posttranscriptional processing. Introns have been enormously successful as a class of sequences and comprise up to 95% of the primary transcripts of protein-coding genes in mammals. In addition, many other transcripts (perhaps more than half) do not encode proteins at all, but appear both to be developmentally regulated and to have genetic function. We suggest that these RNAs (eRNAs) have evolved to function as endogenous network control molecules which enable direct gene-gene communication and multitasking of eukaryotic genomes. Analysis of a range of complex genetic phenomena in which RNA is involved or implicated, including co-suppression, transgene silencing, RNA interference, imprinting, methylation, and transvection, suggests that a higher-order regulatory system based on RNA signals operates in the higher eukaryotes and involves chromatin remodeling as well as other RNA-DNA, RNA-RNA, and RNA-protein interactions. The evolution of densely connected gene networks would be expected to result in a relatively stable core proteome due to the multiple reuse of components, implying that cellular differentiation and phenotypic variation in the higher eukaryotes result primarily from variation in the control architecture.
Thus, network integration and multitasking using trans-acting RNA molecules produced in parallel with protein-coding sequences may underpin both the evolution of developmentally sophisticated multicellular organisms and the rapid expansion of phenotypic complexity into uncontested environments such as those initiated in the Cambrian radiation and those seen after major extinction events.
Abstract:
The current prediction of genes in the Plasmodium falciparum genome database relies upon a limited number of specially developed computer algorithms. We have re-annotated the sequence of chromosome 2 of P. falciparum by a computer-assisted manual analysis, which is described here. Of 161 newly predicted introns, we have experimentally confirmed 98. We regard 110 introns from the previously published analyses as probable, we delete 3, change 26 and add 135. We recognise 214 genes in chromosome 2. We have predicted introns in 121 genes. The increased complexity of gene structure on chromosome 2 is likely to be mirrored by the entire genome. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
By spliced alignment of human DNA and transcript sequence data we constructed a data set of transcript-confirmed exons and introns from 2793 genes, 796 of which (28%) were seen to have multiple isoforms. We find that over one-third of human exons can translate in more than one frame, and that this is highly correlated with G+C content. Introns containing adenosine at donor site position +3 (A3), rather than guanosine (G3), are more common in low G+C regions, while the converse is true in high G+C regions. These two classes of introns are shown to have distinct lengths, consensus sequences and correlations among splice signals, leading to the hypothesis that A3 donor sites are associated with exon definition, and G3 donor sites with intron definition. Minor classes of introns, including GC-AG, U12-type GT-AG, weak, and putative AG-dependent introns are identified and characterized. Cassette exons are more prevalent in low G+C regions, while exon isoforms are more prevalent in high G+C regions. Cassette exon events outnumber other alternative events, while exon isoform events involve truncation twice as often as extension, and occur at acceptor sites twice as often as at donor sites. Alternative splicing is usually associated with weak splice signals, and in a majority of cases, preserves the coding frame. The reported characteristics of constitutive and alternative splice signals, and the hypotheses offered regarding alternative splicing and genome organization, have important implications for experimental research into RNA processing. The 'AltExtron' data sets are available at http://www.bit.uq.edu.au/altExtron/ and http://www.ebi.ac.uk/~thanaraj/altExtron/.
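The intron classes named in this abstract (A3 versus G3 donor sites, and GT-AG versus GC-AG termini) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_intron` and the returned dictionary keys are hypothetical, and the input is assumed to be the intron sequence alone, so that its first base is donor position +1.

```python
def classify_intron(intron: str) -> dict:
    """Classify an intron by terminal dinucleotides and donor +3 base.

    Assumption: `intron` holds only the intron, written 5' to 3', so
    index 0 is donor position +1 and index 2 is donor position +3.
    """
    # Accept RNA or DNA input in either case.
    seq = intron.upper().replace("U", "T")
    if len(seq) < 6:
        raise ValueError("intron too short to classify")

    # First two and last two bases, e.g. "GT-AG" or "GC-AG".
    # Note: U12-type GT-AG introns cannot be told apart from U2-type
    # GT-AG introns by their termini alone.
    termini = f"{seq[:2]}-{seq[-2:]}"

    # Donor position +3 distinguishes the A3 and G3 classes.
    plus3 = seq[2]
    if plus3 == "A":
        donor_class = "A3"
    elif plus3 == "G":
        donor_class = "G3"
    else:
        donor_class = "other"

    return {"termini": termini, "donor_class": donor_class}


if __name__ == "__main__":
    print(classify_intron("GTAAGTccccAG"))
    print(classify_intron("gcgagtttttag"))
```

Tallying these labels against the local G+C content of each intron would be one way to examine the A3/G3 trend reported above; the sketch only performs the labelling step.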
Abstract:
Multiple Sclerosis (MS) is a chronic inflammatory demyelinating disease of the central nervous system (CNS) leading to various neurological disabilities. The disorder is more prevalent in women, with a female-to-male ratio of 3:2. Objectives: To investigate variation within the estrogen receptor 1 (ESR1) gene in an Australian MS case-control population using two intragenic restriction fragment length polymorphisms: the G594A located in exon 8, detected with the BtgI restriction enzyme, and T938C located in intron 1, detected with PvuII. One hundred and ten Australian MS patients were studied, with patients classified clinically as Relapsing Remitting MS (RR-MS), Secondary Progressive MS (SP-MS) or Primary Progressive MS (PP-MS). Also, 110 age, sex and ethnicity matched controls were investigated as a comparative group. No significant difference in the allelic distribution frequency was found between the case and control groups for the ESR1 PvuII (P = 0.50) and BtgI (P = 0.45) markers. Our results do not support a role for these two ESR1 markers in multiple sclerosis susceptibility; however, other markers within ESR1 should not be excluded for potential involvement in the disorder.
Abstract:
Sm and Sm-like proteins are key components of small ribonucleoproteins involved in many RNA and DNA processing pathways. In eukaryotes, these complexes contain seven unique Sm or Sm-like (Lsm) proteins assembled as hetero-heptameric rings, whereas in Archaea and bacteria six- or seven-membered rings are made from only a single polypeptide chain. Here we show that single Sm and Lsm proteins from yeast also have the capacity to assemble into homo-oligomeric rings. Formation of homo-oligomers by the spliceosomal small nuclear ribonucleoprotein components SmE and SmF precludes hetero-interactions vital to formation of functional small nuclear RNP complexes in vivo. To better understand these unusual complexes, we have determined the crystal structure of the homomeric assembly of the spliceosomal protein SmF. Like its archaeal/bacterial homologs, the SmF complex forms a homomeric ring but in an entirely novel arrangement whereby two heptameric rings form a co-axially stacked dimer via interactions mediated by the variable loops of the individual SmF protein chains. Furthermore, we demonstrate that the homomeric assemblies of yeast Sm and Lsm proteins are capable of binding not only to oligo(U) RNA but, in the case of SmF, also to oligo(dT) single-stranded DNA.
Abstract:
Leucine-rich repeats (LRRs) are 20-29-residue sequence motifs present in a number of proteins with diverse functions. The primary function of these motifs appears to be to provide a versatile structural framework for the formation of protein-protein interactions. The past two years have seen an explosion of new structural information on proteins with LRRs. The new structures represent different LRR subfamilies and proteins with diverse functions, including GTPase-activating protein rna1p from the ribonuclease-inhibitor-like subfamily; spliceosomal protein U2A', Rab geranylgeranyltransferase, internalin B, dynein light chain 1 and nuclear export protein TAP from the SDS22-like subfamily; Skp2 from the cysteine-containing subfamily; and YopM from the bacterial subfamily. The new structural information has increased our understanding of the structural determinants of LRR proteins and our ability to model such proteins with unknown structures, and has shed new light on how these proteins participate in protein-protein interactions.
Abstract:
It has been previously observed that the intrinsically weak variant GC donor sites, in order to be recognized by the U2-type spliceosome, possess strong consensus sequences maximized for base pair formation with U1 and U5/U6 snRNAs. However, variability in signal strength is a fundamental mechanism for splice site selection in alternative splicing. Here we report human alternative GC-AG introns (for the first time from any species), and show that while constitutive GC-AG introns do possess strong signals at their donor sites, a large subset of alternative GC-AG introns possess weak consensus sequences at their donor sites. Surprisingly, this subset of alternative isoforms shows strong consensus at acceptor exon positions 1 and 2. The improved consensus at the acceptor exon can facilitate a strong interaction with U5 snRNA, which tethers the two exons for ligation during the second step of splicing. Further, these isoforms nearly always possess alternative acceptor sites and exhibit particularly weak polypyrimidine tracts characteristic of AG-dependent introns. The acceptor exon nucleotides are part of the consensus required for the U2AF(35)-mediated recognition of AG in such introns. Such improved consensus at acceptor exons is not found in either normal or alternative GT-AG introns having weak donor sites or weak polypyrimidine tracts. The changes probably reflect mechanisms that allow GC-AG alternative intron isoforms to cope with two conflicting requirements, namely an apparent need for differential splice strength to direct the choice of alternative sites and a need for improved donor signals to compensate for the central mismatch base pair (C-A) in the RNA duplex of U1 snRNA and the pre-mRNA. The other important findings include (i) one in every twenty alternative introns is a GC-AG intron, and (ii) three of every five observed GC-AG introns are alternative isoforms.
Abstract:
Around 98% of all transcriptional output in humans is noncoding RNA. RNA-mediated gene regulation is widespread in higher eukaryotes and complex genetic phenomena like RNA interference, co-suppression, transgene silencing, imprinting, methylation, and possibly position-effect variegation and transvection, all involve intersecting pathways based on or connected to RNA signaling. I suggest that the central dogma is incomplete, and that intronic and other non-coding RNAs have evolved to comprise a second tier of gene expression in eukaryotes, which enables the integration and networking of complex suites of gene activity. Although proteins are the fundamental effectors of cellular function, the basis of eukaryotic complexity and phenotypic variation may lie primarily in a control architecture composed of a highly parallel system of trans-acting RNAs that relay state information required for the coordination and modulation of gene expression, via chromatin remodeling, RNA-DNA, RNA-RNA and RNA-protein interactions. This system has interesting and perhaps informative analogies with small world networks and dataflow computing.
Abstract:
Within a 199 866 base pair (bp) portion of a Plasmodium vivax chromosome we identified a conserved linkage group consisting of at least 41 genes homologous to Plasmodium falciparum genes located on chromosome 3. There were no P. vivax homologues of the P. falciparum cytoadherence-linked asexual genes clag 3.2, clag 3.1 and a var C pseudogene found on the P. vivax chromosome. Within the conserved linkage group, the gene order and structure are identical to those of P. falciparum chromosome 3. This conserved linkage group may extend to as many as 190 genes. The subtelomeric regions are different in size and the P. vivax segment contains genes for which no P. falciparum homologues have been identified to date. The size difference of at least 900 kb between the homologous P. vivax chromosome and P. falciparum chromosome 3 is presumably due to a translocation. There is substantial sequence divergence with a much higher guanine + cytosine (G + C) content in the DNA and a preference for amino acids using GC-rich codons in the deduced proteins of P. vivax. This structural conservation of homologous genes and their products combined with sequence divergence at the nucleotide level makes the P. vivax genome a powerful tool for comparative analyses of Plasmodium genomes. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
Using differential display-polymerase chain reaction, we identified a novel gene sequence, designated solid tumor-associated gene 1 (STAG1), that is upregulated in renal cell carcinoma (RCC). The full-length cDNA (4839 bp) encompassed the recently reported androgen-regulated prostatic cDNA PMEPA1 and so we refer to this gene as STAG1/PMEPA1. Two STAG1/PMEPA1 mRNA transcripts of approximately 2.7 and 5 kb, with identical coding regions but variant 3' untranslated regions, were predominantly expressed in normal prostate tissue and at lower levels in the ovary. The expression of this gene was upregulated in 87% of RCC samples and also was upregulated in stomach and rectal adenocarcinomas. In contrast, STAG1/PMEPA1 expression was barely detectable in leukemia and lymphoma samples. Analysis of expressed sequence tag databases showed that STAG1/PMEPA1 also was expressed in pancreatic, endometrial, and prostatic adenocarcinomas. The STAG1/PMEPA1 cDNA encodes a 287-amino-acid protein containing a putative transmembrane domain and motifs that suggest that it may bind src homology 3- and tryptophan tryptophan domain-containing proteins. This protein shows 67% identity to the protein encoded by the chromosome 18 open reading frame 1 gene. Translation of STAG1/PMEPA1 mRNA in vitro showed two products of 36 and 39 kDa, respectively, suggesting that translation may initiate at more than one site. Comparison to genomic clones showed that STAG1/PMEPA1 was located on chromosome 20q13 between microsatellite markers D20S183 and D20S173 and spanned four exons and three introns. The upregulation of this gene in several solid tumors indicated that it may play an important role in tumorigenesis. (C) 2001 Wiley-Liss, Inc.
Abstract:
A new algorithm, PfAGSS, for predicting 3' splice sites in Plasmodium falciparum genomic sequences is described. Application of this program to the published P. falciparum chromosome 2 and 3 data suggests that existing programs result in a high error rate in assigning 3' intron boundaries. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
Early pregnancy factor (EPF) is a secreted protein with growth regulatory and immunomodulatory properties. Human platelet-derived EPF shares amino acid sequence identity with chaperonin 10 (Cpn10), a mitochondrial matrix protein which functions as a molecular chaperone. The striking differences in cellular localization and function of the two proteins suggest differential regulation of production reflecting either alternative transcription of the same gene or transcription from different genes. In mammals and more distantly related genera, there is a large gene family with homology to Cpn10 cDNA, which includes intronless copies of the coding sequence. To determine whether this could represent the gene for EPF, we have screened a mouse genomic library and sequenced representative Cpn10 family members, looking for a functional gene distinct from that of Cpn10, which could encode EPF. Eight distinct genes were identified. Cpn10 contains introns, while other members are intronless. Six of these appear to be pseudogenes, and the remaining member, Cpn10-rs1, would encode a full-length protein. The 309-bp open reading frame (ORF) is identical to that of mouse Cpn10 cDNA with the exception of three single-base changes, two resulting in amino acid changes. Only one further single nucleotide difference between the Cpn10-rs1 and Cpn10 cDNAs is observed, located in the 3' UTR. Single nucleotide primer extension was applied to discriminate between Cpn10-rs1 and Cpn10 expression. Cpn10, which is ubiquitous, was detected in all tissue samples tested, whereas Cpn10-rs1 was expressed selectively. The pattern was completely coincident with known patterns of EPF activity, strongly suggesting that Cpn10-rs1 does encode EPF. The complete ORF of Cpn10-rs1 was expressed in E. coli. The purified recombinant protein was found to be equipotent with native human platelet-derived EPF in the bioassay for EPF, the rosette inhibition test.
Abstract:
The mouse hnRNP A2/B1/B0 gene has been cloned using a PCR-based strategy and sequenced. Analysis of this sequence showed that the gene organization closely follows that of the human orthologue with 12 exons and 11 introns. The hnRNP A2/B1/B0 gene gives rise to four splice variants through alternative splicing of exons 2 and 9. RT-PCR assays indicated that all splice variants were expressed in mouse brain, skin, and stomach tissues of varying ages, although their ratios to one another varied with age and tissue type. We also identified a small subset of all polyadenylated splice variants that included intron 11, which shows 94% sequence identity between human and mouse. Several processed pseudogenes were identified in the mouse genome. A search of the mouse genome databases located five pseudogenes, four of which are presumed to be non-functional because of the presence of premature stop codons, large deletions or rearrangements within the coding region. The fifth, which possesses putative promoter elements and has a coding sequence identical to that of the hnRNP A2 mRNA variant, may be functional. (C) 2002 Elsevier Science B.V. All rights reserved.
Abstract:
Alternative splicing is widespread in mammalian gene expression, and variant splice patterns are often specific to different stages of development, particular tissues or a disease state. There is a need to systematically collect data on alternatively spliced exons, introns and splice isoforms, and to annotate this data. The Alternative Splicing Database consortium has been addressing this need, and is committed to maintaining and developing a value-added database of alternative splice events, and of experimentally verified regulatory mechanisms that mediate splice variants. In this paper we present two of the products from this project: namely, a database of computationally delineated alternative splice events as seen in alignments of EST/cDNA sequences with genome sequences, and a database of alternatively spliced exons collected from literature. The reported splice events are from nine different organisms and are annotated for various biological features including expression states and cross-species conservation. The data are presented on our ASD web pages (http://www.ebi.ac.uk/asd).
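As a rough illustration of how one kind of alternative splice event might be delineated computationally, the following Python sketch flags cassette (skipped) exons by comparing the exon coordinates of two isoforms of the same gene. It does not reproduce the ASD pipeline: the function name `cassette_exons`, the `(start, end)` tuple representation, and the rule of requiring identical flanking exons are all assumptions made for this example.

```python
from typing import List, Tuple

# An exon is represented by its genomic (start, end) coordinates.
Exon = Tuple[int, int]


def cassette_exons(isoform_a: List[Exon], isoform_b: List[Exon]) -> List[Exon]:
    """Return internal exons of isoform_a that isoform_b skips.

    Simplifying assumption: an exon counts as skipped only when both of
    its neighbouring exons in isoform_a occur, with identical
    coordinates, as directly adjacent exons in isoform_b.
    """
    # Map each exon of isoform_b to its position in that isoform.
    b_index = {exon: i for i, exon in enumerate(isoform_b)}
    skipped = []
    # Only internal exons can be cassette exons.
    for i in range(1, len(isoform_a) - 1):
        upstream, exon, downstream = isoform_a[i - 1 : i + 2]
        if exon in b_index:
            continue  # present in both isoforms, so not skipped
        if (
            upstream in b_index
            and downstream in b_index
            and b_index[downstream] == b_index[upstream] + 1
        ):
            skipped.append(exon)
    return skipped


if __name__ == "__main__":
    long_form = [(100, 200), (300, 350), (500, 600)]
    short_form = [(100, 200), (500, 600)]
    print(cassette_exons(long_form, short_form))
```

A real analysis would work from spliced alignments of EST/cDNA sequences to the genome and would compare splice junctions rather than whole exons, since flanking exons can differ at their outer boundaries while still sharing the junctions that define the skipping event.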