58 results for Noncoding RNAs
Abstract:
In Mesoamerica, tropical dry forest is a highly threatened habitat, and species endemic to this environment are under extreme pressure. The tree species Lonchocarpus costaricensis is endemic to the dry northwest of Costa Rica and southwest Nicaragua. It is a locally important species but, as land has been cleared for agriculture, populations have experienced considerable reduction and fragmentation. To assess current levels and distribution of genetic diversity in the species, a combination of chloroplast-specific (cpDNA) and whole genome DNA markers (amplified fragment length polymorphism, AFLP) were used to fingerprint 121 individual trees in 6 populations. Two cpDNA haplotypes were identified, distributed among populations such that populations at the extremes of the distribution showed the lowest diversity. A large number (487) of AFLP markers were obtained and indicated that diversity levels were highest in the two coastal populations (Cobano, Matapalo, H = 0.23, 0.28 respectively). Population differentiation was low overall, F-ST = 0.12, although Matapalo was strongly differentiated from all other populations (F-ST = 0.16-0.22), apart from Cobano (F-ST = 0.11). Spatial genetic structure was present in both datasets at different scales: cpDNA was structured at a range-wide distribution scale, whilst AFLP data revealed genetic neighbourhoods on a population scale. In general, the habitat degradation of recent times appears not to have yet impacted diversity levels in mature populations. However, although no data on seed or saplings were collected, it seems likely that reproductive mechanisms in the species will have been affected by land clearance. It is recommended that efforts should be made to conserve the extant genetic resource base and further research undertaken to investigate diversity levels in the progeny generation.
Abstract:
Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we used similar methods to identify ultraconserved genomic regions between the insect species Drosophila melanogaster and Drosophila pseudoobscura, as well as the more distantly related Anopheles gambiae. As with vertebrates, ultraconserved sequences in insects appear to occur primarily in intergenic and intronic sequences, and at intron-exon junctions. The sequences are significantly associated with genes encoding developmental regulators and transcription factors, but are less frequent and are smaller in size than in vertebrates. The longest identical, nongapped orthologous match between the three genomes was found within the homothorax (hth) gene. This sequence spans an internal exon-intron junction, with the majority located within the intron, and is predicted to form a highly stable stem-loop RNA structure. Real-time quantitative PCR analysis of different hth splice isoforms and Northern blotting showed that the conserved element is associated with a high incidence of intron retention in hth pre-mRNA, suggesting that the conserved intronic element is critically important in the post-transcriptional regulation of hth expression in Diptera.
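The idea of a "longest identical, nongapped match" shared by three sequences can be illustrated with a small sketch. This is not the method used in the study, which is not described in the abstract; the function names and the toy sequences below are hypothetical, and the sketch ignores orthology and reverse-complement strands. It relies on the observation that if a shared run of length k exists, a shared run of length k-1 also exists, so the longest length can be found by binary search over k.

```python
def shared_kmers(seqs, k):
    """Return the set of length-k substrings present in every sequence."""
    common = None
    for s in seqs:
        kmers = {s[i:i + k] for i in range(len(s) - k + 1)}
        common = kmers if common is None else common & kmers
        if not common:
            return set()
    return common


def longest_shared_match(seqs):
    """Return one longest identical, ungapped run shared by all sequences."""
    lo, hi = 0, min(len(s) for s in seqs)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        hits = shared_kmers(seqs, mid)
        if hits:
            lo = mid
            best = min(hits)  # pick deterministically if several runs tie
        else:
            hi = mid - 1
    return best


# Toy sequences (hypothetical, for illustration only)
toy = ["ACGTTGCAAGGCT", "TTTGCAAGGAC", "GGTGCAAGGTT"]
match = longest_shared_match(toy)
```

On genome-scale input a suffix-array or seed-and-extend approach would be needed; the set-based version is only meant to make the definition concrete.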
Abstract:
In a first step toward understanding the molecular basis of pineapple fruit development, a sequencing project was initiated to survey a range of expressed sequences from green unripe and yellow ripe fruit tissue. A highly abundant metallothionein transcript was identified during library construction, and was estimated to account for up to 50% of all EST library clones. Library clones with metallothionein subtracted were sequenced, and 408 unripe green and 1140 ripe yellow edited EST clone sequences were retrieved. Clone redundancy was high, with the combined 1548 clone sequences clustering into just 634 contigs comprising 191 consensus sequences and 443 singletons. Half of the EST clone sequences clustered within 13.5% and 9.3% of contigs from green unripe and yellow ripe libraries, respectively, indicating that a small subset of genes dominate the majority of the transcriptome. Furthermore, sequence cluster analysis, northern analysis, and functional classification revealed major differences between genes expressed in the unripe green and ripe yellow fruit tissues. Abundant genes identified from the green fruit include a fruit bromelain and a bromelain inhibitor. Abundant genes identified in the yellow fruit library include a MADS box gene, and several genes normally associated with protein synthesis, including homologues of ribosomal L10 and the translation factors SUI1 and eIF5A. Both the green unripe and yellow ripe libraries contained high proportions of clones associated with oxidative stress responses and the detoxification of free radicals.
Abstract:
We completed the genome sequence of Lettuce necrotic yellows virus (LNYV) by determining the nucleotide sequences of the 4a (putative phosphoprotein), 4b, M (matrix protein), G (glycoprotein) and L (polymerase) genes. The genome consists of 12,807 nucleotides and encodes six genes in the order 3' leader-N-4a(P)-4b-M-G-L-5' trailer. Sequences were derived from clones of a cDNA library from LNYV genomic RNA and from fragments amplified using reverse transcription-polymerase chain reaction. The 4a protein has a low isoelectric point characteristic for rhabdovirus phosphoproteins. The 4b protein has significant sequence similarities with the movement proteins of capillo- and trichoviruses and may be involved in cell-to-cell movement. The putative G protein sequence contains a predicted 25 amino acids signal peptide and endopeptidase cleavage site, three predicted glycosylation sites and a putative transmembrane domain. The deduced L protein sequence shows similarities with the L proteins of other plant rhabdoviruses and contains polymerase module motifs characteristic for RNA-dependent RNA polymerases of negative-strand RNA viruses. Phylogenetic analysis of this motif among rhabdoviruses placed LNYV in a group with other sequenced cytorhabdoviruses, most closely related to Strawberry crinkle virus. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Transgenic tobacco plants carrying a Potato virus Y (PVY)-NIa hairpin sequence separated by a unique unrelated spacer sequence were specifically silenced and highly resistant to PVY infection. In such plants neither PVY-NIa nor spacer transgene transcripts were detectable by specific quantitative real time reverse transcriptase PCR (RT-qPCR) assays of similar relative efficiencies developed for direct comparative analysis. However, small interfering RNAs (siRNAs) specific for the PVY sequence of the transgene and none specific for the LNYV spacer sequence were detected. Following infection with Cucumber mosaic virus (CMV), which suppresses dsRNA-induced RNA silencing, transcript levels of PVY-NIa as well as spacer sequence increased manifold with the same time course. The cellular abundance of the single-stranded (ss) spacer sequence was consistently higher than that of PVY dsRNA in all cases. The results show that during RNA silencing and its suppression of a hairpin transcript in transgenic tobacco, the ssRNA spacer sequence is affected differently than the dsRNA. In PVY-silenced plants, the spacer is efficiently degraded by a mechanism not involving the accumulation of siRNAs, while following suppression of RNA silencing by CMV, the spacer appears protected from degradation. Crown Copyright (c) 2006 Published by Elsevier B.V. All rights reserved.
Abstract:
Although MYB overexpression in colorectal cancer (CRC) is known to be a prognostic indicator for poor survival, the basis for this overexpression is unclear. Among multiple levels of MYB regulation, the most dynamic is the control of transcriptional elongation by sequences within intron I. The authors have proposed that this regulatory sequence is transcribed into an RNA stem-loop and 19-residue polyuridine tract, and is subject to mutation in CRC. When this region was examined in colorectal and breast carcinoma cell lines and tissues, the authors found frequent mutations only in CRC. It was determined that these mutations allowed increased transcription compared with the wild type sequence. These data suggest that this MYB regulatory region within intron I is subject to mutations in CRC but not breast cancer, perhaps consistent with the mutagenic insult that occurs within the colon and not mammary tissue. In CRC, these mutations may contribute to MYB overexpression, highlighting the importance of noncoding sequences in the regulation of key cancer genes. (c) 2006 Wiley-Liss, Inc.
Abstract:
The gene content of a mitochondrial (mt) genome, i.e., 37 genes and a large noncoding region (LNR), is usually conserved in Metazoa. The arrangement of these genes and the LNR is generally conserved at low taxonomic levels but varies substantially at high levels. We report here a variation in mt gene content and gene arrangement among chigger mites of the genus Leptotrombidium. We found previously that the mt genome of Leptotrombidium pallidum has an extra gene for large-subunit rRNA (rrnL), a pseudo-gene for small-subunit rRNA (PrrnS), and three extra LNRs, additional to the 37 genes and an LNR typical of Metazoa. Further, the arrangement of mt genes of L. pallidum differs drastically from that of the hypothetical ancestor of the arthropods. To find to what extent the novel gene content and gene arrangement occurred in Leptotrombidium, we sequenced the entire or partial mt genomes of three other species, L. akamushi, L. deliense, and L. fletcheri. These three species share the arrangement of all genes with L. pallidum, except trnQ (for tRNA-glutamine). Unlike L. pallidum, however, these three species do not have extra rrnL or PrrnS and have only one extra LNR. By comparison between Leptotrombidium species and the ancestor of the arthropods, we propose that (1) the type of mt genome present in L. pallidum evolved from the type present in the other three Leptotrombidium species, and (2) three molecular mechanisms were involved in the evolution of mt gene content and gene arrangement in Leptotrombidium species.
Abstract:
The term non-coding RNA (ncRNA) is commonly employed for RNA that does not encode a protein, but this does not mean that such RNAs do not contain information nor have function. Although it has been generally assumed that most genetic information is transacted by proteins, recent evidence suggests that the majority of the genomes of mammals and other complex organisms is in fact transcribed into ncRNAs, many of which are alternatively spliced and/or processed into smaller products. These ncRNAs include microRNAs and snoRNAs (many if not most of which remain to be identified), as well as likely other classes of yet-to-be-discovered small regulatory RNAs, and tens of thousands of longer transcripts (including complex patterns of interlacing and overlapping sense and antisense transcripts), most of whose functions are unknown. These RNAs (including those derived from introns) appear to comprise a hidden layer of internal signals that control various levels of gene expression in physiology and development, including chromatin architecture/epigenetic memory, transcription, RNA splicing, editing, translation and turnover. RNA regulatory networks may determine most of our complex characteristics, play a significant role in disease and constitute an unexplored world of genetic variation both within and between species.
Abstract:
The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.
Abstract:
Eukaryotic genomes display segmental patterns of variation in various properties, including GC content and degree of evolutionary conservation. DNA segmentation algorithms are aimed at identifying statistically significant boundaries between such segments. Such algorithms may provide a means of discovering new classes of functional elements in eukaryotic genomes. This paper presents a model and an algorithm for Bayesian DNA segmentation and considers the feasibility of using it to segment whole eukaryotic genomes. The algorithm is tested on a range of simulated and real DNA sequences, and the following conclusions are drawn. Firstly, the algorithm correctly identifies non-segmented sequence, and can thus be used to reject the null hypothesis of uniformity in the property of interest. Secondly, estimates of the number and locations of change-points produced by the algorithm are robust to variations in algorithm parameters and initial starting conditions and correspond to real features in the data. Thirdly, the algorithm is successfully used to segment human chromosome 1 according to GC content, thus demonstrating the feasibility of Bayesian segmentation of eukaryotic genomes. The software described in this paper is available from the author's website (www.uq.edu.au/~uqjkeith/) or upon request to the author.
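To make the notion of a change-point in GC content concrete, here is a deliberately simplified sketch. It is not the Bayesian algorithm presented in the paper, whose details are not given in the abstract: it swaps in a plain maximum-likelihood search for a single split, treating each base as G/C or not. The function names and the test sequence are hypothetical.

```python
import math


def bernoulli_loglik(gc: int, n: int) -> float:
    """Maximised log-likelihood of n bases, gc of them G/C, under one GC fraction."""
    if n == 0 or gc == 0 or gc == n:
        return 0.0
    p = gc / n
    return gc * math.log(p) + (n - gc) * math.log(1 - p)


def best_change_point(seq: str):
    """Return (position, log-likelihood gain) for the best single split of seq.

    Position is the index of the first base of the right-hand segment;
    it is None when no split improves on the uniform (one-segment) model.
    """
    is_gc = [1 if base in "GCgc" else 0 for base in seq]
    n, total = len(is_gc), sum(is_gc)
    null = bernoulli_loglik(total, n)
    best_pos, best_gain = None, 0.0
    left = 0
    for pos in range(1, n):
        left += is_gc[pos - 1]
        gain = (bernoulli_loglik(left, pos)
                + bernoulli_loglik(total - left, n - pos)
                - null)
        if gain > best_gain:
            best_pos, best_gain = pos, gain
    return best_pos, best_gain


# Hypothetical sequence: 40 A/T bases followed by 40 G/C bases
pos, gain = best_change_point("AT" * 20 + "GC" * 20)
```

A uniform sequence returns `(None, 0.0)`, which mirrors the abstract's point that a segmentation method should be able to recognise non-segmented sequence. A real analysis would also need a significance criterion and support for multiple change-points.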
Abstract:
Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and nonsynonymous substitutions with the computer program CRITICA. This analysis confirms that there is no real discontinuity at length 100. Roughly 10% of mouse proteins are shorter than 100 aa, although the majority of these are variants of proteins longer than 100 aa. We identify many novel short proteins, including a "dark matter" subset containing ones that lack detectable homology to other known proteins. Translation assays confirm that some of these novel proteins can be translated and localised to the secretory pathway.
Abstract:
Our previous studies using trans-complementation analysis of Kunjin virus (KUN) full-length cDNA clones harboring in-frame deletions in the NS3 gene demonstrated the inability of these defective complemented RNAs to be packaged into virus particles (W. J. Liu, P. L. Sedlak, N. Kondratieva, and A. A. Khromykh, J. Virol. 76:10766-10775). In this study we aimed to establish whether this requirement for NS3 in RNA packaging is determined by the secondary RNA structure of the NS3 gene or by the essential role of the translated NS3 gene product. Multiple silent mutations of three computer-predicted stable RNA structures in the NS3 coding region of KUN replicon RNA aimed at disrupting RNA secondary structure without affecting amino acid sequence did not affect RNA replication and packaging into virus-like particles in the packaging cell line, thus demonstrating that the predicted conserved RNA structures in the NS3 gene do not play a role in RNA replication and/or packaging. In contrast, double frameshift mutations in the NS3 coding region of full-length KUN RNA, producing scrambled NS3 protein but retaining secondary RNA structure, resulted in the loss of ability of these defective RNAs to be packaged into virus particles in complementation experiments in KUN replicon-expressing cells. Furthermore, the more robust complementation-packaging system based on established stable cell lines producing large amounts of complemented replicating NS3-deficient replicon RNAs and infection with KUN virus to provide structural proteins also failed to detect any secreted virus-like particles containing packaged NS3-deficient replicon RNAs. These results have now firmly established the requirement of KUN NS3 protein translated in cis for genome packaging into virus particles.
Abstract:
Despite the presence of over 3 million transposons separated on average by ~500 bp, the human and mouse genomes each contain almost 1000 transposon-free regions (TFRs) over 10 kb in length. The majority of human TFRs correlate with orthologous TFRs in the mouse, despite the fact that most transposons are lineage specific. Many human TFRs also overlap with orthologous TFRs in the marsupial opossum, indicating that these regions have remained refractory to transposon insertion for long evolutionary periods. Over 90% of the bases covered by TFRs are noncoding, much of which is not highly conserved. Most TFRs are not associated with unusual nucleotide composition, but are significantly associated with genes encoding developmental regulators, suggesting that they represent extended regions of regulatory information that are largely unable to tolerate insertions, a conclusion difficult to reconcile with current conceptions of gene regulation.
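The definition of a transposon-free region used above (a stretch over 10 kb containing no annotated transposon) can be sketched as a gap scan over annotation intervals. This is an illustrative sketch only, not the authors' pipeline; the function name, the half-open coordinate convention, and the example coordinates are all assumptions.

```python
def transposon_free_regions(transposons, chrom_length, min_length=10_000):
    """Return half-open (start, end) gaps longer than min_length bases.

    transposons: iterable of half-open (start, end) intervals on one
    chromosome, possibly overlapping and in any order.
    """
    regions = []
    cursor = 0  # end of the furthest transposon seen so far
    for start, end in sorted(transposons):
        if start - cursor > min_length:
            regions.append((cursor, start))
        cursor = max(cursor, end)
    if chrom_length - cursor > min_length:
        regions.append((cursor, chrom_length))
    return regions


# Hypothetical annotation on a 35 kb toy chromosome
annotated = [(500, 800), (12_000, 12_300), (12_100, 12_900), (30_000, 30_400)]
tfrs = transposon_free_regions(annotated, chrom_length=35_000)
```

Here the gaps of 11,200 bp and 17,100 bp qualify, while the 500 bp and 4,600 bp gaps at the chromosome ends do not.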