5 resultados para Genome Annotation Assessment
em Chinese Academy of Sciences Institutional Repositories Grid Portal
Resumo:
The human genome project has been recently complemented by whole-genome assessment sequence of 32 mammals and 24 nonmammalian vertebrate species suitable for comparative genomic analyses. Here we anticipate a precipitous drop in costs and increase in sequ
Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca)
Resumo:
A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finish sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs of a joint length 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. Besides, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using neighbor-join algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the most related species to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Resumo:
Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes.
Resumo:
Background: Giardia are a group of widespread intestinal protozoan parasites in a number of vertebrates. Much evidence from G. lamblia indicated they might be the most primitive extant eukaryotes. When and how such a group of the earliest branching unicellular eukaryotes developed the ability to successfully parasitize the latest branching higher eukaryotes (vertebrates) is an intriguing question. Gene duplication has long been thought to be the most common mechanism in the production of primary resources for the origin of evolutionary novelties. In order to parse the evolutionary trajectory of Giardia parasitic lifestyle, here we carried out a genome-wide analysis about gene duplication patterns in G. lamblia. Results: Although genomic comparison showed that in G. lamblia the contents of many fundamental biologic pathways are simplified and the whole genome is very compact, in our study 40% of its genes were identified as duplicated genes. Evolutionary distance analyses of these duplicated genes indicated two rounds of large scale duplication events had occurred in G. lamblia genome. Functional annotation of them further showed that the majority of recent duplicated genes are VSPs (Variant-specific Surface Proteins), which are essential for the successful parasitic life of Giardia in hosts. Based on evolutionary comparison with their hosts, it was found that the rapid expansion of VSPs in G. lamblia is consistent with the evolutionary radiation of placental mammals. Conclusions: Based on the genome-wide analysis of duplicated genes in G. lamblia, we found that gene duplication was essential for the origin and evolution of Giardia parasitic lifestyle. The recent expansion of VSPs uniquely occurring in G. lamblia is consistent with the increment of its hosts. Therefore we proposed a hypothesis that the increment of Giradia hosts might be the driving force for the rapid expansion of VSPs.
Resumo:
Full-length and partial genome sequences of four members of the genus Aquareovirus, family Reoviridae (Golden shiner reovirus, Grass carp reovirus, Striped bass reovirus and golden ide reovirus) were characterized. Based on sequence comparison, the unclassified Grass carp reovirus was shown to be a member of the species Aquareovirus C The status of golden ide reovirus, another unclassified aquareovirus, was also examined. Sequence analysis showed that it did not belong to the species Aquareovirus A or C, but assessment of its relationship to the species Aquareovirus B, D, E and F was hampered by the absence of genetic data from these species. In agreement with previous reports of ultrastructural resemblance between aquareoviruses and orthoreoviruses, genetic analysis revealed homology in the genes of the two groups. This homology concerned eight of the 11 segments of the aquareovirus genome (amino acid identity 17-42%), and similar genetic organization was observed in two other segments. The conserved terminal sequences in the genomes of members of the two groups were also similar. These data are undoubtedly an indication of the common evolutionary origin of these viruses. This clear genetic relatedness between members of distinct genera is unique within the family Reoviridae. Such a genetic relationship is usually observed between members of a single genus. However, the current taxonomic classification of aquareoviruses and orthoreoviruses in two different genera is supported by a number of characteristics, including their distinct G+C contents, unequal numbers of genome segments, absence of an antigenic relationship, different cytopathic effects and specific econiches.