989 results for SEQUENCE EVOLUTION
Abstract:
Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.
Abstract:
Background: Chemoreception is a widespread mechanism that is involved in critical biological processes, including individual and social behavior. The insect peripheral olfactory system comprises three major multigene families: the olfactory receptor (Or), the gustatory receptor (Gr), and the odorant-binding protein (OBP) families. Members of the last family establish the first contact with the odorants, and thus constitute the first step in the chemosensory transduction pathway. Results: Comparative analysis of the OBP family in 12 Drosophila genomes allowed the identification of 595 genes that encode putative functional and nonfunctional members in extant species, with 43 gene gains and 28 gene losses (15 deletions and 13 pseudogenization events). The evolution of this family shows tandem gene duplication events, progressive divergence in DNA and amino acid sequence, and prevalence of pseudogenization events in external branches of the phylogenetic tree. We observed that the OBP arrangement in clusters is maintained across the Drosophila species and that purifying selection governs the evolution of the family; nevertheless, OBP genes differ in their levels of functional constraint. Finally, we find that the OBP repertoire evolves more rapidly in the specialist lineages of the Drosophila melanogaster group (D. sechellia and D. erecta) than in their closest generalists. Conclusion: Overall, the evolution of the OBP multigene family is consistent with the birth-and-death model. We also found that members of this family exhibit different functional constraints, which is indicative of some functional divergence, and that they might be involved in some of the specialization processes that occurred through the diversification of the Drosophila genus.
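The gain and loss counts reported in this abstract can be tallied with a minimal sketch like the one below. The function and variable names are hypothetical, and treating a pseudogenization as the loss of a functional copy is an assumption made for illustration, not the authors' stated method.

```python
# Counts taken from the abstract; names and the bookkeeping are illustrative only.
GAINS = 43
LOSSES_BY_TYPE = {"deletion": 15, "pseudogenization": 13}


def total_losses(losses_by_type):
    """Sum loss events over all loss categories."""
    return sum(losses_by_type.values())


def net_change(gains, losses_by_type):
    """Gains minus losses, assuming each pseudogenization removes one functional copy."""
    return gains - total_losses(losses_by_type)


if __name__ == "__main__":
    print("total losses:", total_losses(LOSSES_BY_TYPE))
    print("net change:", net_change(GAINS, LOSSES_BY_TYPE))
```

The two loss categories sum to the 28 losses quoted in the abstract, so under the stated assumption gains exceed losses across the 12 genomes.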
Abstract:
BACKGROUND: The expansion of amino acid repeats is determined by a high mutation rate and can be increased or limited by selection. It has been suggested that recent expansions could be associated with the potential of adaptation to new environments. In this work, we quantify the strength of this association, as well as the contribution of potential confounding factors. RESULTS: Mammalian positively selected genes have accumulated more recent amino acid repeats than other mammalian genes. However, we found little support for an accelerated evolutionary rate as the main driver for the expansion of amino acid repeats. The most significant predictors of amino acid repeats are gene function and GC content. There is no correlation with expression level. CONCLUSIONS: Our analyses show that amino acid repeat expansions are causally independent from protein adaptive evolution in mammalian genomes. Relaxed purifying selection or positive selection do not associate with more or more recent amino acid repeats. Their occurrence is slightly favoured by the sequence context but mainly determined by the molecular function of the gene.
Abstract:
The Lateglacial evolution of the Ticino glacier and its tributaries is poorly known because of the lack of research by Quaternary geomorphologists in recent decades. Despite interest in the response of the cryosphere to Lateglacial climate warming, only a few scientific studies have addressed the history of the northern valleys of the Ticino Alps during the deglaciation (e.g. Seiffert 1953, Renner 1982, Hantke 1983). Within the framework of geomorphological investigations on the Lateglacial and Holocene glacier/permafrost evolution in the Ticino Alps, the history of the Brenno glacier (Blenio Valley, Eastern Ticino Alps) during the end of the Pleistocene has been studied. The deglaciation sequence of the Blenio Valley is still not complete (Scapozza et al. 2009). Only the first glacial stadial of the Brenno glacier and the last Lateglacial stadials of the Greina region (northern Blenio Valley, see Fontana et al. 2008) and of the upper Malvaglia Valley (eastern Blenio Valley, see Scapozza et al. 2008) have been unequivocally defined. For every stadial, the surface of the palaeoglacier and the depression of the Equilibrium Line Altitude (ELA) have been reconstructed on the basis of geomorphological mapping. The first individual glacial stadial of the Brenno glacier corresponds to the Biasca stadial of the Ticino glacier defined by Hantke (1983). The ELA depression of 1100-1200 meters and its morphological and glaciological characteristics allow us to correlate this stadial with the Weissbad stadial defined by Keller (1988). In the Greina region, three stadials corresponding to the end of the Lateglacial have been identified, with an ELA depression of 110, 210 and 310-350 meters (Fontana et al. 2008). In the upper Malvaglia Valley, three stadials corresponding to the end of the Oldest Dryas and the Younger Dryas have been identified for the Orino glacier, with an ELA depression of 290, 400-420 and 470-560 meters (Scapozza et al. 2008).
If we consider the other (fragmentary) glacial deposits of the Blenio Valley, it is possible to define a regression sequence of the Brenno glacier with 8 stadials, from the Biasca stadial to the end of the Younger Dryas. A tentative correlation with the "Gothard" model developed by Renner (1982) and Hantke (1983) and with the "Eastern Swiss Alps" model developed by Maisch (1982) is proposed in Table 1. The following chronological conclusions are, therefore, proposed: (1) the Biasca stadial is probably the first stadial after the Pleniglacial-Lateglacial transition; (2) the stadials BRE 7 to BRE 3 are positioned between the beginning of the Lateglacial and the Bølling-Allerød interstadial; (3) the stadials BRE 2 and BRE 1 are assumed to be related to the Younger Dryas event.
Abstract:
The complete mitochondrial DNA (mtDNA) control region was amplified and directly sequenced in two species of shrew, Crocidura russula and Sorex araneus (Insectivora, Mammalia). The general organization is similar to that found in other mammals: a central conserved region surrounded by two more variable domains. However, we have found in shrews the simultaneous presence of arrays of tandem repeats in potential locations where repeats tend to occur separately in other mammalian species. These locations correspond to regions which are associated with a possible interruption of the replication processes, either at the end of the three-stranded D-loop structure or toward the end of the heavy-strand replication. In the left domain the repeated sequences (R1 repeats) are 78 bp long, whereas in the right domain the repeats are 12 bp long in C. russula and 14 bp long in S. araneus (R2 repeats). Variation in the copy number of these repeated sequences results in mtDNA control region length differences. Southern blot analysis indicates that the level of heteroplasmy (more than one mtDNA form within an individual) differs between species. A comparative study of the R2 repeats in 12 additional species representing three shrew subfamilies provides useful indications for the understanding of the origin and the evolution of these homologous tandemly repeated sequences. An asymmetry in the distribution of variants within the arrays, as well as the constant occurrence of shorter repeated sequences flanking only one side of the R2 arrays, could be related to asymmetry in the replication of each strand of the mtDNA molecule. The pattern of sequence and length variation within and between species, together with the capability of the arrays to form stable secondary structures, suggests that the dominant mechanism involved in the evolution of these arrays is unidirectional replication slippage.
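As a rough illustration of how repeat copy number translates into control region length differences, the sketch below uses the repeat unit sizes given in this abstract. The function name and the copy-number differences in the example are hypothetical and do not come from the study.

```python
# Repeat unit lengths in base pairs, as reported in the abstract.
R1_BP = 78
R2_BP = {"C. russula": 12, "S. araneus": 14}


def length_difference(species, delta_r1, delta_r2):
    """Length difference (bp) between two control region variants that differ
    by delta_r1 copies of the R1 repeat and delta_r2 copies of the R2 repeat.
    Assumes all copies are full-length units, which is a simplification."""
    return delta_r1 * R1_BP + delta_r2 * R2_BP[species]


if __name__ == "__main__":
    # Hypothetical example: one extra R1 copy and two extra R2 copies in S. araneus.
    print(length_difference("S. araneus", 1, 2))
```

The abstract also mentions shorter repeated sequences flanking one side of the R2 arrays, so real length differences need not be exact multiples of the unit sizes.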
Abstract:
The shrews of the Sorex araneus group have undergone a spectacular chromosome evolution. The karyotype of Sorex granarius is generally considered ancestral to those of Sorex coronatus and S. araneus. However, a sequence of 777 base pairs of the cytochrome b gene of the mitochondrial DNA (mtDNA) produces a quite different picture: S. granarius is closely related to the populations of S. araneus from the Pyrenees and from the northwestern Alps, whereas S. coronatus and S. araneus from Italy and the southern Alps represent two well-separated lineages. It is suggested that mtDNA and chromosomal evolution are in this case largely independent processes. Whereas mtDNA haplotypes are closely linked to the geographical history of the populations, chromosomal mutations were probably transmitted from one population to another. Available data suggest that the impressive chromosome polymorphism of this group is quite a recent phenomenon.
Abstract:
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein-coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high-throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Abstract:
Only a very small fraction of long noncoding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but the absence of lncRNA annotations in non-model organisms has precluded comparative analyses. Here we present a large-scale evolutionary study of lncRNA repertoires and expression patterns, in 11 tetrapod species. We identify approximately 11,000 primate-specific lncRNAs and 2,500 highly conserved lncRNAs, including approximately 400 genes that are likely to have originated more than 300 million years ago. We find that lncRNAs, in particular ancient ones, are in general actively regulated and may function predominantly in embryonic development. Most lncRNAs evolve rapidly in terms of sequence and expression levels, but tissue specificities are often conserved. We compared expression patterns of homologous lncRNA and protein-coding families across tetrapods to reconstruct an evolutionarily conserved co-expression network. This network suggests potential functions for lncRNAs in fundamental processes such as spermatogenesis and synaptic transmission, but also in more specific mechanisms such as placenta development through microRNA production.
Abstract:
BACKGROUND: The P-type II ATPase gene family encodes proteins with an important role in adaptation of the cell to variation in external K+, Ca2+ and Na+ concentrations. The presence of P-type II gene subfamilies that are specific for certain kingdoms has been reported but was sometimes contradicted by discovery of previously unknown homologous sequences in newly sequenced genomes. Members of this gene family have been sampled in all of the fungal phyla except the arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota), which are known to play a key role in terrestrial ecosystems and to be genetically highly variable within populations. Here we used highly degenerate primers on AMF genomic DNA to increase the sampling of fungal P-type II ATPases and to test previous predictions about their evolution. In parallel, homologous sequences of the P-type II ATPases have been used to determine the nature and amount of polymorphism that is present at these loci among isolates of Glomus intraradices harvested from the same field. RESULTS: In this study, four P-type II ATPase subfamilies have been isolated from three AMF species. We show that, contrary to previous predictions, P-type IIC ATPases are present in all basal fungal taxa. Additionally, P-type IIE ATPases should no longer be considered as exclusive to the Ascomycota and the Basidiomycota, since we also demonstrate their presence in the Zygomycota. Finally, a comparison of homologous sequences encoding P-type IID ATPases showed unexpectedly that indel mutations among coding regions, as well as specific gene duplications, occur among AMF individuals within the same field. CONCLUSION: On the basis of these results we suggest that the diversification of P-type IIC and E ATPases followed the diversification of the extant fungal phyla with independent events of gene gains and losses.
Consistent with recent findings on the human genome, but at a much smaller geographic scale, we provided evidence that structural genomic changes, such as exonic indel mutations and gene duplications are less rare than previously thought and that these also occur within fungal populations.
Abstract:
Low-complexity regions (LCRs) in proteins are tracts that are highly enriched in one or a few amino acids. Given their high abundance, and their capacity to expand in relatively short periods of time through replication slippage, they can greatly contribute to increase protein sequence space and generate novel protein functions. However, little is known about the global impact of LCRs on protein evolution. We have traced back the evolutionary history of 2,802 LCRs from a large set of homologous protein families from H. sapiens, M. musculus, G. gallus, D. rerio and C. intestinalis. Transcription factors and other regulatory functions are overrepresented in proteins containing LCRs. We have found that the gain of novel LCRs is frequently associated with repeat expansion whereas the loss of LCRs is more often due to accumulation of amino acid substitutions as opposed to deletions. This dichotomy results in net protein sequence gain over time. We have detected a significant increase in the rate of accumulation of novel LCRs in the ancestral Amniota and mammalian branches, and a reduction in the chicken branch. Alanine and/or glycine-rich LCRs are overrepresented in recently emerged LCR sets from all branches, suggesting that their expansion is better tolerated than for other LCR types. LCRs enriched in positively charged amino acids show the contrary pattern, indicating an important effect of purifying selection in their maintenance. We have performed the first large-scale study on the evolutionary dynamics of LCRs in protein families. The study has shown that the composition of an LCR is an important determinant of its evolutionary pattern.
Abstract:
Thyroid hormones are involved in the regulation of growth and metabolism in all vertebrates. Transthyretin is one of the extracellular proteins with high affinity for thyroid hormones which determine the partitioning of these hormones between extracellular compartments and intracellular lipids. During vertebrate evolution, both the tissue pattern of expression and the structure of the gene for transthyretin underwent characteristic changes. The purpose of this study was to characterize the position of Insectivora in the evolution of transthyretin in eutherians, a subclass of Mammalia. Transthyretin was identified by thyroxine binding and Western analysis in the blood of adult shrews, hedgehogs, and moles. Transthyretin is synthesized in the liver and secreted into the bloodstream, similar to the situation for other adult eutherians, birds, and diprotodont marsupials, but different from that for adult fish, amphibians, reptiles, monotremes, and Australian polyprotodont marsupials. For the characterization of the structure of the gene and the processing of mRNA for transthyretin, cDNA libraries were prepared from RNA from hedgehog and shrew livers, and full-length cDNA clones were isolated and sequenced. Sections of genomic DNA in the regions coding for the splice sites between exons 1 and 2 were synthesized by polymerase chain reaction and sequenced. The location of splicing was deduced from comparison of genomic with cDNA nucleotide sequences. Changes in the nucleotide sequence of the transthyretin gene during evolution are most pronounced in the region coding for the N-terminal region of the protein. Both the derived overall amino acid sequences and the N-terminal regions of the transthyretins in Insectivora were found to be very similar to those in other eutherians but differed from those found in marsupials, birds, reptiles, amphibians, and fish.
Also, the pattern of transthyretin precursor mRNA splicing in Insectivora was more similar to that in other eutherians than to that in marsupials, reptiles, and birds. Thus, in contrast to the marsupials, with a different pattern of transthyretin gene expression in the evolutionarily "older" polyprotodonts compared with the evolutionarily "younger" diprotodonts, no separate lineages of transthyretin evolution could be identified in eutherians. We conclude that transthyretin gene expression in the liver of adult eutherians probably appeared before the branching of the lineages leading to modern eutherian species.
Abstract:
Ionotropic glutamate receptors (iGluRs) are a highly conserved family of ligand-gated ion channels present in animals, plants, and bacteria, which are best characterized for their roles in synaptic communication in vertebrate nervous systems. A variant subfamily of iGluRs, the Ionotropic Receptors (IRs), was recently identified as a new class of olfactory receptors in the fruit fly, Drosophila melanogaster, hinting at a broader function of this ion channel family in detection of environmental, as well as intercellular, chemical signals. Here, we investigate the origin and evolution of IRs by comprehensive evolutionary genomics and in situ expression analysis. In marked contrast to the insect-specific Odorant Receptor family, we show that IRs are expressed in olfactory organs across Protostomia--a major branch of the animal kingdom that encompasses arthropods, nematodes, and molluscs--indicating that they represent an ancestral protostome chemosensory receptor family. Two subfamilies of IRs are distinguished: conserved "antennal IRs," which likely define the first olfactory receptor family of insects, and species-specific "divergent IRs," which are expressed in peripheral and internal gustatory neurons, implicating this family in taste and food assessment. Comparative analysis of drosophilid IRs reveals the selective forces that have shaped the repertoires in flies with distinct chemosensory preferences. Examination of IR gene structure and genomic distribution suggests both non-allelic homologous recombination and retroposition contributed to the expansion of this multigene family. Together, these findings lay a foundation for functional analysis of these receptors in both neurobiological and evolutionary studies. Furthermore, this work identifies novel targets for manipulating chemosensory-driven behaviours of agricultural pests and disease vectors.
Abstract:
Background: It has been suggested that chromosomal rearrangements harbor the molecular footprint of the biological phenomena which they induce, in the form, for instance, of changes in the sequence divergence rates of linked genes. So far, all the studies of these potential associations have focused on the relationship between structural changes and the rates of evolution of single-copy DNA and have tried to exclude segmental duplications (SDs). This is paradoxical, since SDs are one of the primary forces driving the evolution of structure and function in our genomes and have been linked not only with novel genes acquiring new functions, but also with overall higher DNA sequence divergence and major chromosomal rearrangements. Results: Here we take the opposite view and focus on SDs. We analyze several of the features of SDs, including the rates of intraspecific divergence between paralogous copies of human SDs and of interspecific divergence between human SDs and chimpanzee DNA. We study how divergence measures relate to chromosomal rearrangements, while considering other factors that affect evolutionary rates in single-copy DNA. Conclusion: We find that interspecific SD divergence behaves similarly to divergence of single-copy DNA. In contrast, old and recent paralogous copies of SDs do present different patterns of intraspecific divergence. Also, we show that some relatively recent SDs accumulate in regions that carry inversions in sister lineages.
Abstract:
TWEAK (TNF homologue with weak apoptosis-inducing activity) and Fn14 (fibroblast growth factor-inducible protein 14) are members of the tumor necrosis factor (TNF) ligand and receptor super-families. Having observed that Xenopus Fn14 cross-reacts with human TWEAK, despite its relatively low sequence homology to human Fn14, we examined the conservation in tertiary fold and binding interfaces between the two species. Our results, combining NMR solution structure determination, binding assays, extensive site-directed mutagenesis and molecular modeling, reveal that, in addition to the known and previously characterized β-hairpin motif, the helix-loop-helix motif makes an essential contribution to the receptor/ligand binding interface. We further discuss the insight provided by the structural analyses regarding how the cysteine-rich domains of the TNF receptor super-family may have evolved over time. DATABASE: Structural data are available in the Protein Data Bank/BioMagResBank databases under the accession codes 2KMZ, 2KN0 and 2KN1 and 17237, 17247 and 17252. STRUCTURED DIGITAL ABSTRACT: TWEAK binds to hFn14 by surface plasmon resonance; xeFn14 binds to TWEAK by enzyme-linked immunosorbent assay; TWEAK binds to xeFn14 by surface plasmon resonance; hFn14 binds to TWEAK by enzyme-linked immunosorbent assay.
Abstract:
Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ~13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. 
macrospora pro44 deletion mutant, showing functional conservation of this developmental regulator.