72 results for Avian genomes
Abstract:
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT–PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of genes.
Abstract:
Selenocysteine (Sec) is co-translationally inserted into selenoproteins in response to codon UGA with the help of the selenocysteine insertion sequence (SECIS) element. The number of selenoproteins in animals varies, with humans having 25 and mice having 24 selenoproteins. To date, however, only one selenoprotein, thioredoxin reductase, has been detected in Caenorhabditis elegans, and this enzyme contains only one Sec. Here, we characterize the selenoproteomes of C.elegans and Caenorhabditis briggsae with three independent algorithms, one searching for pairs of homologous nematode SECIS elements, another searching for Cys- or Sec-containing homologs of potential nematode selenoprotein genes and the third identifying Sec-containing homologs of annotated nematode proteins. These methods suggest that thioredoxin reductase is the only Sec-containing protein in the C.elegans and C.briggsae genomes. In contrast, we identified additional selenoproteins in other nematodes. Assuming that Sec insertion mechanisms are conserved between nematodes and other eukaryotes, the data suggest that nematode selenoproteomes were reduced during evolution, and that in an extreme reduction case Sec insertion systems probably decode only a single UGA codon in C.elegans and C.briggsae genomes. In addition, all detected genes had a rare form of SECIS element containing a guanosine in place of a conserved adenosine present in most other SECIS structures, suggesting that in organisms with small selenoproteomes SECIS elements may change rapidly.
Abstract:
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost-effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
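The Venn-diagram step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each gene finder's output has already been reduced to a set of comparable gene identifiers; the identifiers `g1` to `g7` and the helper `venn_regions` are hypothetical and are not taken from the paper, which does not describe how its predictions were matched across programs.

```python
def venn_regions(sets):
    """Partition the union of the named sets into disjoint Venn regions.

    Each region is keyed by the frozenset of set names that contain its items.
    """
    regions = {}
    universe = set().union(*sets.values())
    for item in universe:
        key = frozenset(name for name, members in sets.items() if item in members)
        regions.setdefault(key, set()).add(item)
    return regions


# Hypothetical gene identifiers, for illustration only.
predictions = {
    "Ensembl": {"g1", "g2", "g3", "g4"},
    "SGP2": {"g3", "g4", "g5", "g6"},
    "TWINSCAN": {"g4", "g6", "g7"},
}

regions = venn_regions(predictions)

# Predictions made only by the de novo comparative finders (absent from Ensembl).
novel = set()
for names, genes in regions.items():
    if "Ensembl" not in names:
        novel |= genes

for names, genes in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(names), sorted(genes))
print("not in Ensembl:", sorted(novel))
```

Each region can then be sampled separately for experimental testing, which is one way to read "each component of it was evaluated".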
Abstract:
Poor understanding of the spliceosomal mechanisms to select intronic 3' ends (3'ss) is a major obstacle to deciphering eukaryotic genomes. Here, we discern the rules for global 3'ss selection in yeast. We show that, in contrast to the uniformity of yeast splicing, the spliceosome uses all available 3'ss within a distance window from the intronic branch site (BS), and that in 70% of all possible 3'ss this is likely to be mediated by pre-mRNA structures. Our results reveal that one of these RNA folds acts as an RNA thermosensor, modulating alternative splicing in response to heat shock by controlling alternate 3'ss availability. Thus, our data point to a deeper role for the pre-mRNA in the control of its own fate, and to a simple mechanism for some alternative splicing.
Abstract:
It is generally accepted that the extent of phenotypic change between human and great apes is dissonant with the rate of molecular change. Between these two groups, proteins are virtually identical, cytogenetically there are few rearrangements that distinguish ape-human chromosomes, and rates of single-base-pair change and retrotransposon activity have slowed particularly within hominid lineages when compared to rodents or monkeys. Studies of gene family evolution indicate that gene loss and gain are enriched within the primate lineage. Here, we perform a systematic analysis of duplication content of four primate genomes (macaque, orang-utan, chimpanzee and human) in an effort to understand the pattern and rates of genomic duplication during hominid evolution. We find that the ancestral branch leading to human and African great apes shows the most significant increase in duplication activity both in terms of base pairs and in terms of events. This duplication acceleration within the ancestral species is significant when compared to lineage-specific rate estimates even after accounting for copy-number polymorphism and homoplasy. We discover striking examples of recurrent and independent gene-containing duplications within the gorilla and chimpanzee that are absent in the human lineage. Our results suggest that the evolutionary properties of copy-number mutation differ significantly from other forms of genetic mutation and, in contrast to the hominid slowdown of single-base-pair mutations, there has been a genomic burst of duplication activity at this period during human evolution.
Abstract:
Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.
Abstract:
Background: Transposable elements (TEs) constitute a substantial fraction of all eukaryotic genomes. They induce an important proportion of deleterious mutations by insertion into genes or gene regulatory regions. However, their mutational capabilities are not always adverse but can contribute to the genetic diversity and evolution of organisms. Knowledge of their distribution and activity in the genomes of populations under different environmental and demographic regimes is important to understand their role in species evolution. In this work we study the chromosomal distribution of two TEs, gypsy and bilbo, in original and colonizing populations of Drosophila subobscura to reveal the putative effect of colonization on their insertion profile. Results: The chromosomal frequency distribution of the two TEs differs between the one original and the three colonizing populations of D. subobscura. Whereas the original population shows a low insertion frequency in most TE sites, colonizing populations have a mixture of high (frequency ≥ 10%) and low insertion sites for both TEs. Most highly occupied sites are coincident among colonizing populations and some of them are correlated to chromosomal arrangements. Comparisons of TE copy number between the X chromosome and autosomes show that gypsy occupancy seems to be controlled by negative selection, but that of bilbo does not. Conclusion: These results are consistent with TEs in Drosophila subobscura colonizing populations being subject to a founder effect followed by genetic drift as a consequence of colonization. This would explain the high insertion frequencies of bilbo and gypsy in coincident sites of colonizing populations. High occupancy sites would represent insertion events prior to colonization. Sites of low frequency would be insertions that occurred after colonization and/or copies from the original population whose frequency is decreasing in colonizing populations.
This work is a pioneering attempt to explain the chromosomal distribution of TEs in a colonizing species with high inversion polymorphism to reveal the putative effect of arrangements in TE insertion profiles. In general no associations between arrangements and TEs have been found, except in a few cases where the association is very strong. Alternatively, founder drift effects seem to play a leading role in TE genome distribution in colonizing populations.
Abstract:
Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.
Abstract:
Background: Annotations of completely sequenced genomes reveal that nearly half of the genes identified are of unknown function, and that some belong to uncharacterized gene families. To help resolve such issues, information can be obtained from the comparative analysis of homologous genes in model organisms. Results: While characterizing genes from the retinitis pigmentosa locus RP26 at 2q31-q33, we have identified a new gene, ORMDL1, that belongs to a novel gene family comprising three genes in humans (ORMDL1, ORMDL2 and ORMDL3), and homologs in yeast, microsporidia, plants, Drosophila, urochordates and vertebrates. The human genes are expressed ubiquitously in adult and fetal tissues. The Drosophila ORMDL homolog is also expressed throughout embryonic and larval stages, particularly in ectodermally derived tissues. The ORMDL genes encode transmembrane proteins anchored in the endoplasmic reticulum (ER). Double knockout of the two Saccharomyces cerevisiae homologs leads to decreased growth rate and greater sensitivity to tunicamycin and dithiothreitol. Yeast mutants can be rescued by human ORMDL homologs. Conclusions: From protein sequence comparisons we have defined a novel gene family, not previously recognized because of the absence of a characterized functional signature. The sequence conservation of this family from yeast to vertebrates, the maintenance of duplicate copies in different lineages, the ubiquitous pattern of expression in human and Drosophila, the partial functional redundancy of the yeast homologs and phenotypic rescue by the human homologs, strongly support functional conservation. Subcellular localization and the response of yeast mutants to specific agents point to the involvement of ORMDL in protein folding in the ER.
Abstract:
Background: Hox and ParaHox gene clusters are thought to have resulted from the duplication of a ProtoHox gene cluster early in metazoan evolution. However, the origin and evolution of the other genes belonging to the extended Hox group of homeobox-containing genes, that is, Mox and Evx, remain obscure. We constructed phylogenetic trees with mouse, amphioxus and Drosophila extended Hox and other related Antennapedia-type homeobox gene sequences and analyzed the linkage data available for such genes. Results: We claim that neither Mox nor Evx is a Hox or ParaHox gene. We propose a scenario that reconciles phylogeny with linkage data, in which an Evx/Mox ancestor gene linked to a ProtoHox cluster was involved in a segmental tandem duplication event that generated an array of all Hox-like genes, referred to as the 'coupled' cluster. A chromosomal breakage within this cluster explains the current composition of the extended Hox cluster (with Evx, Hox and Mox genes) and the ParaHox cluster. Conclusions: Most studies dealing with the origin and evolution of Hox and ParaHox clusters have not included the Hox-related genes Mox and Evx. Our phylogenetic analyses and the available linkage data in mammalian genomes support an evolutionary scenario in which an ancestor of Evx and Mox was linked to the ProtoHox cluster, and in which a tandem duplication of a large genomic region early in metazoan evolution generated the Hox and ParaHox clusters, plus the cluster-neighbors Evx and Mox. The large 'coupled' Hox-like cluster EvxHox/MoxParaHox was subsequently broken, thus grouping the Mox and Evx genes to the Hox clusters, and isolating the ParaHox cluster.
Abstract:
Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.
Abstract:
At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity.
Abstract:
Background: It has been shown in a variety of organisms, including mammals, that genes that appeared recently in evolution, for example orphan genes, evolve faster than older genes. Low functional constraints at the time of origin of novel genes may explain these results. However, this observation has been recently attributed to an artifact caused by the inability of Blast to detect the fastest genes in different eukaryotic genomes. Distinguishing between these two possible explanations would be of great importance for any studies dealing with the taxon distribution of proteins and the origin of novel genes. Results: Here we used simulations of protein sequences to examine the capacity of Blast to detect proteins of diverse evolutionary rates in the different species of a eukaryotic phylogenetic tree that included metazoans, fungi and plants. We simulated the evolution of protein genes with the same evolutionary rates as those observed in functional mammalian genes and with among-site rate heterogeneity. Under these conditions, we found that only a very small percentage of simulated ancestral eukaryotic proteins was affected by the Blast artifact. We show that the good detectability of Blast is due to the heterogeneity of protein evolutionary rates at different sites, since only a small conserved motif in a sequence suffices to detect its homologues. Our results indicate that Blast, at least when applied within eukaryotes, only misses homologues of extremely fast-evolving sequences, which are rare in the mammalian genome, as well as sequences evolving homogeneously or pseudogenes. Conclusion: Although great care should be exercised in the recognition of remote homologues, most functional mammalian genes can be detected in eukaryotic genomes by Blast. That is, the majority of functional mammalian genes do not evolve so fast that they would go undetected in other metazoans, fungi or plants, had they been present in these organisms.
Thus, the correlation previously found between age and rate seems not to be due to a pure Blast artifact, at least for mammals. This may have important implications for understanding the mechanisms by which novel genes originate.
Abstract:
Wolfram syndrome is a progressive neurodegenerative disorder transmitted in an autosomal recessive mode. We report two Wolfram syndrome families harboring multiple deletions of mitochondrial DNA. The deletions reached percentages as high as 85-90% in affected tissues such as the central nervous system of one patient, while in other tissues from the same patient and from other members of the family, the percentages of deleted mitochondrial DNA genomes were only 1-10%. Recently, a Wolfram syndrome gene has been linked to markers on 4p16. In both families linkage between the disease locus and 4p16 markers gave a maximum multipoint lod score of 3.79 at theta = 0 (P < 0.03) with respect to D4S431. In these families, the syndrome was caused by mutations in this nucleus-encoded gene which deleteriously interacts with the mitochondrial genome. This is the first evidence of the implication of both genomes in a recessive disease.