912 results for INTERSPERSED REPETITIVE ELEMENTS
Abstract:
Short interspersed nuclear elements (SINEs) are widespread among eukaryotic genomes. They are repetitive DNA sequences that have been amplified by retrotransposition. In this study, a class of SINEs was isolated from the Opsariichthys bidens genome and named Opsar. Sequence analysis confirmed that Opsar is a new class of typical SINEs derived from tRNA molecules. Using the tRNA-derived region of Opsar in a BLASTN search, we further identified Zb-SINEs in the zebrafish genome, which comprise two groups: Zb-SINE-A and Zb-SINE-B. The Zb-SINE-A group comprises subfamilies A1 to A5, and the Zb-SINE-B group is a dimer of the tRNA(Ala)-derived region and shares a similar dimeric composition with Alu. Zb-SINEs are composed of three distinct regions: a 5' end tRNA-derived region, a tRNA-unrelated region, and a 3' end AT-rich region. The flanking regions are AT-rich. The average length of Zb-SINE elements is about 340 bp. Zb-SINEs account for as much as 0.1% of the whole zebrafish genome. About 70% of the Zb-SINEs are on chromosomes 11, 18, and 19. These Zb-SINEs were characterized by PCR and dot hybridization. The distribution pattern of Zb-SINEs in the genome strongly supports the master gene model. The tRNA-derived regions of Opsar and Zb-SINEs were compared with the tRNA(Ala) gene and showed 76% similarity, indicating that Opsar and Zb-SINEs originated from an inactive tRNA(Ala) sequence or a tRNA(Ala)-like sequence. In view of the evolutionary status of zebrafish in the Cyprinidae, we deduced that Zb-SINEs are a very old class of interspersed sequences.
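The length and genome-fraction figures in this abstract imply a rough copy number, which the abstract itself does not state. The sketch below shows that arithmetic only. The function name `estimate_copy_number` is hypothetical, and the genome size of about 1.4 Gb is an outside assumption (an approximate zebrafish assembly size), not a value from the abstract.

```python
def estimate_copy_number(genome_size_bp, genome_fraction, mean_length_bp):
    """Rough copy number = (bases occupied by the family) / (mean element length)."""
    if mean_length_bp <= 0:
        raise ValueError("mean_length_bp must be positive")
    return genome_size_bp * genome_fraction / mean_length_bp


# ASSUMPTION: ~1.4 Gb zebrafish genome; this value is not given in the abstract.
ASSUMED_GENOME_SIZE_BP = 1_400_000_000

# From the abstract: up to 0.1% of the genome, mean element length about 340 bp.
copies = estimate_copy_number(ASSUMED_GENOME_SIZE_BP, 0.001, 340)
print(f"Implied upper-bound copy number: about {copies:.0f}")
```

Because the abstract gives 0.1% as an upper figure ("as much as"), the result is best read as an order-of-magnitude upper bound under the assumed genome size.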
Abstract:
Chris L. Organ, Andrew M. Shedlock, Andrew Meade, Mark Pagel and Scott V. Edwards (2007). Origin of avian genome size and structure in non-avian dinosaurs. Nature, 446(7132), 180-184. RAE2008
Abstract:
Genome sequence varies in numerous ways among individuals, although the gross architecture is fixed for all humans. Retrotransposons create one of the most abundant classes of structural variants in the human genome and are divided into many families, with certain members of some families, e.g., L1, Alu, SVA, and HERV-K, remaining active for transposition. Along with other types of genomic variants, retrotransposon-derived variants contribute to the whole spectrum of genome variants in humans. With the advancement of sequencing techniques, many human genomes are being sequenced at the individual level, fueling comparative research on these variants among individuals. In this thesis, the evolution and functional impact of structural variations are examined, focusing primarily on retrotransposons in the context of human evolution. The thesis comprises three studies, presented in three data chapters. First, the recent evolution of all human-specific AluYb members, representing the second most active subfamily of Alus, was tracked to identify their source/master copy using a novel approach. All human-specific AluYb elements from the reference genome were extracted and aligned with one another to construct clusters of similar copies, and each cluster was analyzed to infer the evolutionary relationships among its members. The approach identified one major driver copy of all human-specific Yb8 elements and the source copy of the Yb9 lineage. Three new subfamilies within the AluYb family (Yb8a1, Yb10, and Yb11) were also identified, with Yb11 being the youngest and most polymorphic. Second, an attempt to establish a relationship between transposable elements (TEs) and tandem repeats (TRs) was made at a genome-wide scale for the first time. Upon sequence comparison, positional cross-checking, and other relevant analyses, it was observed that over 20% of all TRs are derived from TEs. This result established the first connection between these two types of repetitive elements and extends our appreciation of the impact of TEs on genomes. Furthermore, only 6% of these TE-derived TRs follow the already postulated initiation and expansion mechanisms, suggesting that the others are likely to follow a yet-unidentified mechanism. Third, by combining multiple computational approaches involving all types of genetic variation published so far, including transposable elements, the first whole-genome sequence of the most recent common ancestor of all modern human populations, which diverged into different populations around 125,000-100,000 years ago, was constructed. The study shows that the current reference genome sequence is 8.89 million base pairs larger than our common ancestor’s genome, a difference contributed by a whole spectrum of genetic mechanisms. The use of this ancestral reference genome to facilitate the analysis of personal genomes was demonstrated using an example genome and more insightful recent evolutionary analyses involving the Neanderthal genome. The three data chapters presented in this thesis conclude that tandem repeats and transposable elements are not entirely distinct classes of elements, as over 20% of TRs are derived from TEs. Certain subfamilies of TEs are themselves still evolving, generating newer subfamilies. The evolutionary analyses of all TEs, along with other genomic variants, helped to construct the genome sequence of the most recent common ancestor of all modern human populations, which provides a better alternative to the human reference genome and can be a useful resource for the study of personal genomics, population genetics, and human and primate evolution.
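The "positional cross-checking" of tandem repeats against transposable elements can be pictured as an interval comparison. The sketch below is a minimal illustration, not the thesis's actual pipeline. It assumes half-open `(chrom, start, end)` annotations and counts a TR as TE-associated only when it lies entirely inside one TE. That containment criterion, the function name `fraction_tr_within_te`, and the toy coordinates are all assumptions made here for illustration.

```python
from collections import defaultdict


def fraction_tr_within_te(trs, tes):
    """Fraction of TR intervals fully contained in at least one TE interval.

    trs, tes: iterables of (chrom, start, end) with half-open coordinates.
    Uses a simple linear scan per TR; a genome-scale analysis would want an
    interval tree or a sorted sweep instead.
    """
    tes_by_chrom = defaultdict(list)
    for chrom, start, end in tes:
        tes_by_chrom[chrom].append((start, end))

    trs = list(trs)
    if not trs:
        return 0.0

    contained = 0
    for chrom, start, end in trs:
        candidates = tes_by_chrom.get(chrom, ())
        if any(te_start <= start and end <= te_end
               for te_start, te_end in candidates):
            contained += 1
    return contained / len(trs)


# Hypothetical toy annotations, for illustration only.
toy_tes = [("chr1", 100, 400)]
toy_trs = [
    ("chr1", 150, 200),  # inside the TE
    ("chr1", 390, 450),  # overlaps the TE edge but is not contained
    ("chr2", 10, 20),    # no TE annotated on this chromosome
    ("chr1", 100, 400),  # exactly coincides with the TE
]
print(fraction_tr_within_te(toy_trs, toy_tes))
```

A looser criterion, such as a minimum overlap fraction, would change the count. Positional overlap alone also does not show derivation, which is presumably why the abstract pairs it with sequence comparison.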
Abstract:
Avian genomes are small and streamlined compared with those of other amniotes by virtue of having fewer repetitive elements and less non-coding DNA(1,2). This condition has been suggested to represent a key adaptation for flight in birds, by reducing the metabolic costs associated with having large genome and cell sizes(3,4). However, the evolution of genome architecture in birds, or any other lineage, is difficult to study because genomic information is often absent for long-extinct relatives. Here we use a novel Bayesian comparative method to show that bone-cell size correlates well with genome size in extant vertebrates, and hence use this relationship to estimate the genome sizes of 31 species of extinct dinosaur, including several species of extinct birds. Our results indicate that the small genomes typically associated with avian flight evolved in the saurischian dinosaur lineage between 230 and 250 million years ago, long before this lineage gave rise to the first birds. By comparison, ornithischian dinosaurs are inferred to have had much larger genomes, which were probably typical for ancestral Dinosauria. Using comparative genomic data, we estimate that genome-wide interspersed mobile elements, a class of repetitive DNA, comprised 5–12% of the total genome size in the saurischian dinosaur lineage but 7–19% of the total genome size in ornithischian dinosaurs, suggesting that repetitive elements became less active in the saurischian lineage. These genomic characteristics should be added to the list of attributes previously considered avian but now thought to have arisen in non-avian dinosaurs, such as feathers(5), pulmonary innovations(6), and parental care and nesting.
Abstract:
Some families of mammalian interspersed repetitive DNA, such as the Alu SINE sequence, appear to have evolved by the serial replacement of one active sequence with another, consistent with there being a single source of transposition: the "master gene." Alternative models, in which multiple source sequences are simultaneously active, have been called "transposon models." Transposon models differ in the proportion of elements that are active and in whether inactivation occurs at the moment of transposition or later. Here we examine the predictions of various types of transposon model regarding the patterns of sequence variation expected at an equilibrium between transposition, inactivation, and deletion. Under the master gene model, all bifurcations in the true tree of elements occur in a single lineage. We show that this property will also hold approximately for transposon models in which most elements are inactive and where at least some of the inactivation events occur after transposition. Such tree shapes are therefore not conclusive evidence for a single source of transposition.
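The contrast between a single source and multiple simultaneously active sources can be sketched with a toy genealogy simulation. The code below is an illustrative simplification, not the model analyzed in the abstract. It ignores deletion, inactivation after transposition, and equilibrium. The function name `simulate_sources` and the parameter `p_active` (the chance that a new copy can itself transpose) are hypothetical names introduced here.

```python
import random


def simulate_sources(n_elements, p_active, rng):
    """Grow a family of elements and record each element's parent.

    Element 0 is the founding source. Each new copy is made from a uniformly
    chosen active element and becomes active itself with probability p_active.
    With p_active = 0 only element 0 ever transposes (a master-gene-like
    case); with p_active > 0 several sources can be active at once.
    """
    active = [0]
    parent = {}
    for new in range(1, n_elements):
        parent[new] = rng.choice(active)
        if rng.random() < p_active:
            active.append(new)
    return parent


def count_sources(parent):
    """Number of distinct elements that produced at least one copy."""
    return len(set(parent.values()))


for p in (0.0, 0.05, 1.0):
    genealogy = simulate_sources(500, p, random.Random(42))
    print(f"p_active={p}: {count_sources(genealogy)} distinct source(s)")
```

With `p_active = 0` every copy descends directly from element 0, so all branching happens on one lineage. Small positive values give genealogies in which a few sources dominate. This loosely echoes the abstract's conclusion that a master-gene-like tree shape is not conclusive evidence of a single source, although the sketch omits the inactivation-after-transposition and deletion processes on which that result depends.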
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
BACKGROUND: Enterococcus faecalis has emerged as a major hospital pathogen. To explore its diversity, we sequenced E. faecalis strain OG1RF, which is commonly used for molecular manipulation and virulence studies. RESULTS: The 2,739,625 base pair chromosome of OG1RF was found to contain approximately 232 kilobases unique to this strain compared to V583, the only publicly available sequenced strain. Almost no mobile genetic elements were found in OG1RF. The 64 areas of divergence were classified into three categories. First, OG1RF carries 39 unique regions, including 2 CRISPR loci and a new WxL locus. Second, we found nine replacements where a sequence specific to V583 was substituted by a sequence specific to OG1RF. For example, the iol operon of OG1RF replaces a possible prophage and the vanB transposon in V583. Finally, we found 16 regions that were present in V583 but missing from OG1RF, including the proposed pathogenicity island, several probable prophages, and the cpsCDEFGHIJK capsular polysaccharide operon. OG1RF was more rapidly but less frequently lethal than V583 in the mouse peritonitis model and considerably outcompeted V583 in a murine model of urinary tract infections. CONCLUSION: E. faecalis OG1RF carries a number of unique loci compared to V583, but the almost complete lack of mobile genetic elements demonstrates that this is not a defining feature of the species. Additionally, OG1RF's effects in experimental models suggest that mediators of virulence may be diverse between different E. faecalis strains and that virulence is not dependent on the presence of mobile genetic elements.
Abstract:
In many organisms, polarity of the oocyte is established post-transcriptionally via subcellular RNA localization. Many RNAs are localized during oogenesis in Xenopus laevis, including Xlsirts (Xenopus laevis short interspersed repeat transcripts) [Kloc, 1993]. Xlsirts constitute a large family defined by highly homologous repeat units 79–81 nucleotides in length. Endogenous Xlsirt RNAs use the METRO (Message Transport Organizer) pathway of localization, in which RNAs are transported from the nucleus to the mitochondrial cloud in stage I oocytes. The RNAs then anchor at the vegetal pole in stage II oocytes. Exogenous Xlsirt RNAs can also utilize the Late pathway of localization, which involves localization to the vegetal cortex during stage III of oogenesis and results in RNAs anchored in the cortex of the entire vegetal hemisphere. The Xlsirt localization signal is contained within the repeat region. This study was designed to test the hypothesis that there are cis-acting localization elements in Xlsirts and that higher-order structure plays a role. Results of experiments on Xlsirt P11, a 1700 base pair (bp) family member, led to the conclusion that a 137-bp fragment of the repetitive region is necessary and sufficient for METRO and Late pathway localization. This analysis definitively demonstrates that the Xlsirt localization signal for the METRO and Late pathways resides within the repetitive region and not within the flanking regions. Analysis of Xlsirt linker-scanning mutations revealed two METRO-pathway-specific subelements and one Late-pathway-specific subelement. Functional, computational, and biochemical evidence relates the higher-order structure of this element to its ability to function as a localization element. Xlsirt 137 is 99% identical to the Xlsirt consensus sequence identified in this study, suggesting that it is the localization element for all localized Xlsirt family members. The repeat unit was reframed based on function, rather than arbitrarily based on sequence. This work supports the hypothesis presented in 1981 by George Spohr, who originally isolated the Xlsirts, that the highly conserved repetitive elements must be constrained from variability by some unknown function of the repeats themselves. These studies shed light on the mechanism of RNA localization, linking structure and function.
Abstract:
A set of oat–maize chromosome addition lines with individual maize (Zea mays L.) chromosomes present in plants with a complete oat (Avena sativa L.) chromosome complement provides a unique opportunity to analyze the organization of centromeric regions of each maize chromosome. A DNA sequence, MCS1a, described previously as a maize centromere-associated sequence, was used as a probe to isolate cosmid clones from a genomic library made of DNA purified from a maize chromosome 9 addition line. Analysis of six cosmid clones containing centromeric DNA segments revealed a complex organization. The MCS1a sequence was found to comprise a portion of the long terminal repeats of a retrotransposon-like repeated element, termed CentA. Two of the six cosmid clones contained regions composed of a newly identified family of tandem repeats, termed CentC. Copies of CentA and tandem arrays of CentC are interspersed with other repetitive elements, including the previously identified maize retroelements Huck and Prem2. Fluorescence in situ hybridization revealed that CentC and CentA elements are limited to the centromeric region of each maize chromosome. The retroelements Huck and Prem2 are dispersed along all maize chromosomes, although Huck elements are present in an increased concentration around centromeric regions. Significant variation in the size of the blocks of CentC and in the copy number of CentA elements, as well as restriction fragment length variations were detected within the centromeric region of each maize chromosome studied. The different proportions and arrangements of these elements and likely others provide each centromeric region with a unique overall structure.
Abstract:
We have shown previously by Southern blot analysis that Bov-B long interspersed nuclear elements (LINEs) are present in different Viperidae snake species. To address the question of whether Bov-B LINEs really have been transmitted horizontally between vertebrate classes, the analysis has been extended to a larger number of vertebrate, invertebrate, and plant species. In this paper, the evolutionary origin of Bov-B LINEs is shown unequivocally to be in Squamata. The previously proposed horizontal transfer of Bov-B LINEs in vertebrates has been confirmed by their discontinuous phylogenetic distribution in Squamata (Serpentes and two lizard infra-orders) as well as in Ruminantia, by the high level of nucleotide identity, and by their phylogenetic relationships. The horizontal transfer of Bov-B LINEs from Squamata to the ancestor of Ruminantia is evident from the genetic distances and discontinuous phylogenetic distribution. The ancestor of Colubroidea snakes is a possible donor of Bov-B LINEs to Ruminantia. The timing of horizontal transfer has been estimated from the distribution of Bov-B LINEs in Ruminantia and the fossil data of Ruminantia to be 40–50 My ago. The phylogenetic relationships of Bov-B LINEs from the various Squamata species agree with the species phylogeny, suggesting that Bov-B LINEs have been maintained stably by vertical transmission since the origin of Squamata in the Mesozoic era.
Abstract:
The worldwide threat of tuberculosis to human health emphasizes the need to develop novel approaches to a global epidemiological surveillance. The current standard for Mycobacterium tuberculosis typing based on IS6110 restriction fragment length polymorphism (RFLP) suffers from the difficulty of comparing data between independent laboratories. Here, we propose a high-resolution typing method based on variable number tandem repeats (VNTRs) of genetic elements named mycobacterial interspersed repetitive units (MIRUs) in 12 human minisatellite-like regions of the M. tuberculosis genome. MIRU-VNTR profiles of 72 different M. tuberculosis isolates were established by PCR analysis of all 12 loci. From 2 to 8 MIRU-VNTR alleles were identified in the 12 regions in these strains, which corresponds to a potential of over 16 million different combinations, yielding a resolution power close to that of IS6110-RFLP. All epidemiologically related isolates tested were perfectly clustered by MIRU-VNTR typing, indicating that the stability of these MIRU-VNTRs is adequate to track outbreak episodes. The correlation between genetic relationships inferred from MIRU-VNTR and IS6110-RFLP typing was highly significant. Compared with IS6110-RFLP, high-resolution MIRU-VNTR typing has the considerable advantages of being fast, appropriate for all M. tuberculosis isolates, including strains that have a few IS6110 copies, and permitting easy and rapid comparison of results from independent laboratories. This typing method opens the way to the construction of digital global databases for molecular epidemiology studies of M. tuberculosis.
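The "over 16 million different combinations" figure follows from multiplying the number of alleles observed at each of the 12 loci. The abstract gives only the range (2 to 8 alleles per locus), not the per-locus counts, so the sketch below shows the calculation and its bounds. The uniform count of 4 alleles per locus is a hypothetical stand-in chosen only because it lands near the stated figure.

```python
from math import prod


def profile_combinations(alleles_per_locus):
    """Number of distinct multi-locus profiles = product of per-locus allele counts."""
    return prod(alleles_per_locus)


N_LOCI = 12  # from the abstract

# Bounds implied by the stated range of 2 to 8 alleles per locus.
lower = profile_combinations([2] * N_LOCI)
upper = profile_combinations([8] * N_LOCI)

# HYPOTHETICAL: a uniform 4 alleles per locus, not the study's actual counts.
uniform_four = profile_combinations([4] * N_LOCI)

print(f"lower bound: {lower:,}")
print(f"upper bound: {upper:,}")
print(f"4 alleles at every locus: {uniform_four:,}")
```

The actual per-locus allele counts would have to come from the paper. The sketch only shows that a product in the tens of millions is consistent with 12 loci carrying between 2 and 8 alleles each.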
Abstract:
For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that insert between genes. These retroelements are less abundant in smaller genome plants, including rice and sorghum. Although 5- to 200-kb blocks of methylated, presumably heterochromatic, retrotransposons flank most maize genes, rice and sorghum genes are often adjacent. Similar genes are commonly found in the same relative chromosomal locations and orientations in each of these three species, although there are numerous exceptions to this collinearity (i.e., rearrangements) that can be detected at the levels of both the recombinational map and cloned DNA. Evolutionarily conserved sequences are largely confined to genes and their regulatory elements. Our results indicate that a knowledge of grass genome structure will be a useful tool for gene discovery and isolation, but the general rules and biological significance of grass genome organization remain to be determined. Moreover, the nature and frequency of exceptions to the general patterns of grass genome structure and collinearity are still largely unknown and will require extensive further investigation.
Abstract:
Several recent reports indicate that mobile elements are frequently found in and flanking many wild-type plant genes. To determine the extent of this association, we performed computer-based systematic searches to identify mobile elements in the genes of two "model" plants, Oryza sativa (domesticated rice) and Arabidopsis thaliana. Whereas 32 common sequences belonging to nine putative mobile element families were found in the noncoding regions of rice genes, none were found in Arabidopsis genes. Five of the nine families (Gaijin, Castaway, Ditto, Wanderer, and Explorer) are first described in this report, while the other four were described previously (Tourist, Stowaway, p-SINE1, and Amy/LTP). Sequence similarity, structural similarity, and documentation of past mobility strongly suggest that many of the rice common sequences are bona fide mobile elements. Members of four of the new rice mobile element families are similar in some respects to members of the previously identified inverted-repeat element families, Tourist and Stowaway. Together these elements are the most prevalent type of transposons found in the rice genes surveyed and form a unique collection of inverted-repeat transposons we refer to as miniature inverted-repeat transposable elements or MITEs. The sequence and structure of MITEs are clearly distinct from short or long interspersed nuclear elements (SINEs or LINEs), the most common transposable elements associated with mammalian nuclear genes. Mobile elements, therefore, are associated with both animal and plant genes, but the identity of these elements is strikingly different.
Abstract:
We have characterized a family of repetitive DNA elements with homology to the MgPa cellular adhesion operon of Mycoplasma genitalium, a bacterium that has the smallest known genome of any free-living organism. One element, 2272 bp in length and flanked by DNA with no homology to MgPa, was completely sequenced. At least four others were partially sequenced. The complete element is a composite of six regions. Five of these regions show sequence similarity with nonadjacent segments of genes of the MgPa operon. The sixth region, located near the center of the element, is an A+T-rich sequence that has only been found in this repeat family. Open reading frames are present within the five individual regions showing sequence homology to MgPa and the adjacent open reading frame 3 (ORF3) gene. However, termination codons are found between adjacent regions of homology to the MgPa operon and in the A+T-rich sequence. Thus, these repetitive elements do not appear to be directly expressible protein coding sequences. The sequence of one region from five different repetitive elements was compared with the homologous region of the MgPa gene from the type strain G37 and four newly isolated M. genitalium strains. Recombination between repetitive elements of strain G37 and the MgPa operon can explain the majority of polymorphisms within our partial sequences of the MgPa genes of the new isolates. Therefore, we propose that the repetitive elements of M. genitalium provide a reservoir of sequence that contributes to antigenic variation in proteins of the MgPa cellular adhesion operon.