959 results for GENOME SEQUENCING
Abstract:
The identification and annotation of protein-coding genes is one of the primary goals of whole-genome sequencing projects, and the accuracy of predicting the primary protein products of gene expression is vital to the interpretation of the available data and the design of downstream functional applications. Nevertheless, the comprehensive annotation of eukaryotic genomes remains a considerable challenge. Many genomes submitted to public databases, including those of major model organisms, contain significant numbers of wrong and incomplete gene predictions. We present a community-based reannotation of the Aspergillus nidulans genome with the primary goal of increasing the number and quality of protein functional assignments through the careful review of experts in the field of fungal biology. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
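The "sevenfold coverage" figure above follows the usual depth arithmetic: total sequenced bases divided by genome size. The sketch below is only an illustration of that arithmetic. The genome size (2.9 Gb), the read length (700 bp) and the read count are assumed values chosen for the example and are not taken from the abstract.

```python
import math


def fold_coverage(n_reads, mean_read_len, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def reads_needed(target_coverage, mean_read_len, genome_size):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_coverage * genome_size / mean_read_len)


def expected_fraction_covered(coverage):
    """Lander-Waterman style estimate: fraction of bases hit at least once,
    assuming reads land uniformly at random (an idealization)."""
    return 1.0 - math.exp(-coverage)


# Assumed, illustrative inputs (not from the abstract).
GENOME_SIZE = 2_900_000_000   # hypothetical genome size in bp
READ_LEN = 700                # hypothetical mean read length in bp

depth = fold_coverage(29_000_000, READ_LEN, GENOME_SIZE)
print(f"depth: {depth:.1f}x")
print(f"reads for 7x: {reads_needed(7, READ_LEN, GENOME_SIZE)}")
print(f"fraction covered at 7x: {expected_fraction_covered(7):.4f}")
```

Under the uniform-sampling idealization, sevenfold depth leaves well under 1% of bases unsampled, although repeats and cloning bias make real assemblies less complete than this estimate suggests.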
Abstract:
The main focus of the human genome sequencing project has been gene discovery, but a great additional benefit is that it offers the chance to examine the large proportion of the genome that does not contain human genes. The nature of this ‘noncoding’ DNA is poorly understood, both as an evolutionary question (how did it get there?) and in the functional sense (what is it doing now?). Much of the noncoding DNA is derived from retroviruses that have inserted their DNA into the genome. The availability of complete genomic sequences will revolutionize studies of the number and location of endogenous retroviruses, their role in genome evolution, and their contribution to human disease.
Abstract:
In recent years, analysis of the genomes of many organisms has received increasing international attention. The bulk of the effort to date has centred on the Human Genome Project and analysis of model organisms such as yeast, Drosophila and Caenorhabditis elegans. More recently, the revolution in genome sequencing and gene identification has begun to impact on infectious disease organisms. Initially, much of the effort was concentrated on prokaryotes, but small eukaryotic genomes, including the protozoan parasites Plasmodium, Toxoplasma and trypanosomatids (Leishmania, Trypanosoma brucei and T. cruzi), as well as some multicellular organisms, such as Brugia and Schistosoma, are benefiting from the technological advances of the genome era. These advances promise a radical new approach to the development of novel diagnostic tools, chemotherapeutic targets and vaccines for infectious disease organisms, as well as to the more detailed analysis of cell biology and function. Several networks or consortia linking laboratories around the world have been established to support these parasite genome projects[1] (for more information, see http://www.ebi.ac.uk/parasites/paratable.html). Five of these networks were supported by an initiative launched in 1994 by the Specific Programme for Research and Tropical Diseases (TDR) of the WHO[2, 3, 4, 5, 6]. The Leishmania Genome Network (LGN) is one of these[3]. Its activities are reported at http://www.ebi.ac.uk/parasites/leish.html, and its current aim is to map and sequence the genome of Leishmania by the year 2002. All the mapping, hybridization and sequence data are also publicly available from LeishDB, an AceDB-based genome database (http://www.ebi.ac.uk/parasites/LGN/leissssoft.html).
Abstract:
Background: Integrative and conjugative elements (ICE) form a diverse group of DNA elements that are integrated in the chromosome of the bacterial host, but can occasionally excise and horizontally transfer to a new host cell. ICE come in different families, typically with a conserved core for functions controlling the element's behavior and a variable region providing auxiliary functions to the host. The ICEclc element of Pseudomonas knackmussii strain B13 is representative of a large family of chromosomal islands detected by genome sequencing approaches. It provides the host with the capacity to degrade chloroaromatics and 2-aminophenol. Results: Here we study the transcriptional organization of the ICEclc core region. By northern hybridizations, reverse-transcriptase polymerase chain reaction (RT-PCR) and Rapid Amplification of cDNA Ends (5'-RACE), fifteen transcripts were mapped in the core region. The occurrence and location of those transcripts were further confirmed by hybridizing labeled cDNA to a semi-tiling micro-array probing both strands of the ICEclc core region. Dot blot and semi-tiling array hybridizations demonstrated that most of the core transcripts are upregulated during stationary phase on 3-chlorobenzoate, but not on succinate or glucose. Conclusions: The transcription analysis of the ICEclc core region provides detailed insights into the mode of regulatory organization and will help to further understand the complex mode of behavior of this class of mobile elements. We conclude that ICEclc core transcription is concerted at a global level, more reminiscent of a phage program than of plasmid conjugation.
Abstract:
Dengue virus (DENV) infections represent a significant concern for public health worldwide, being considered as the most prevalent arthropod-borne virus regarding the number of reported cases. In this study, we report the complete genome sequencing of a DENV serotype 4 isolate, genotype II, obtained in the city of Manaus, directly from the serum sample, applying Ion Torrent sequencing technology. The use of a massive sequencing technology allowed the detection of two variable sites, one in the coding region for the viral envelope protein and the other in the nonstructural 1 coding region within viral populations.
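Detecting variable sites within a viral population from deep-sequencing data usually comes down to counting bases per genome position and flagging positions where a minor base exceeds a frequency threshold. The sketch below illustrates that idea only. It is not the authors' pipeline, and the depth and frequency cut-offs, the positions and the counts are all hypothetical.

```python
from collections import Counter


def variable_sites(pileup, min_depth=100, min_minor_freq=0.05):
    """Return (position, major_base, minor_base, minor_frequency) for every
    position whose second most common base reaches min_minor_freq.

    pileup maps a genome position to a {base: read_count} mapping.
    Thresholds are illustrative assumptions, not values from the study.
    """
    sites = []
    for pos in sorted(pileup):
        counts = Counter(pileup[pos])
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to separate variants from errors
        ranked = counts.most_common(2)
        if len(ranked) < 2:
            continue  # only one base observed at this position
        (major, _), (minor, minor_n) = ranked
        freq = minor_n / depth
        if freq >= min_minor_freq:
            sites.append((pos, major, minor, round(freq, 3)))
    return sites


# Hypothetical per-position base counts.
example_pileup = {
    1000: {"A": 900, "G": 100},   # 10% minor base: reported
    2500: {"C": 995, "T": 5},     # 0.5% minor base: below threshold
    3000: {"G": 40, "A": 10},     # depth 50: skipped
}
print(variable_sites(example_pileup))
```

A fixed frequency threshold is the simplest possible filter. Real analyses typically also account for base quality, strand bias and platform-specific error profiles.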
Abstract:
Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Whole genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline.
Abstract:
Routine screening of lung transplant recipients and hospital patients for respiratory virus infections allowed the identification of human rhinovirus (HRV) in the upper and lower respiratory tracts, including in immunocompromised hosts chronically infected with the same strain over weeks or months. Phylogenetic analysis of 144 HRV-positive samples showed no apparent correlation between a given viral genotype or species and its ability to invade the lower respiratory tract or lead to protracted infection. By contrast, protracted infections were found almost exclusively in immunocompromised patients, thus suggesting that host factors, in particular the immune response, rather than the virus genotype modulate disease outcome. Complete genome sequencing of five chronic cases to study rhinovirus genome adaptation showed that the calculated mutation frequency was in the range observed during acute human infections. Analysis of mutation hot spot regions between specimens collected at different times or in different body sites revealed that non-synonymous changes were mostly concentrated in the viral capsid genes VP1, VP2 and VP3, independent of the HRV type. In an immunosuppressed lung transplant recipient infected with the same HRV strain for more than two years, both classical and ultra-deep sequencing of samples collected at different time points in the upper and lower respiratory tracts showed that these virus populations were phylogenetically indistinguishable over the course of infection, except for the last month. Specific signatures were found in the last two lower respiratory tract populations, including changes in the 5'UTR polypyrimidine tract and the VP2 immunogenic site 2. These results highlight for the first time the ability of a given rhinovirus to evolve in the course of a natural infection in immunocompromised patients and complement data obtained from previous experimental inoculation studies in immunocompetent volunteers.
Abstract:
Pneumocystis jirovecii is a fungus belonging to a basal lineage of the Ascomycotina, the Taphrinomycotina subphylum. It is a parasite specific to humans that dwells primarily in the lung and can cause severe pneumonia in individuals with a debilitated immune system. Despite its clinical importance, many aspects of its biology remain poorly understood, at least in part because of the lack of a continuous in vitro cultivation system. The present thesis consists of the genome reconstruction and comparative genomics of P. jirovecii. It comprises three parts: (i) the de novo sequencing of the P. jirovecii genome starting from a single broncho-alveolar lavage fluid of a single patient, (ii) the de novo sequencing of the genome of the plant pathogen Taphrina deformans, a fungus closely related to P. jirovecii, and (iii) the genome-scale comparison of P. jirovecii to other Taphrinomycotina members. Enrichment in P. jirovecii cells by immuno-precipitation, whole DNA random amplification, two complementary high-throughput DNA sequencing methods, and in silico sorting and assembly of sequences were used for the de novo reconstruction of the P. jirovecii genome from the microbiota of a single clinical specimen. An iterative ad hoc pipeline as well as numerical simulations were used to recover P. jirovecii sequences while purging contaminants and assembly or amplification chimeras. This strategy produced an 8.1 Mb assembly, which encodes 3,898 genes. Homology searches, mapping on biochemical pathway atlases, and manual validations revealed that this genome lacks (i) most of the enzymes dedicated to amino acid biosynthesis, and (ii) most virulence factors observed in other fungi, e.g. the glyoxylate shunt pathway and specific peptidases involved in the degradation of the host cell membrane.
The same analyses applied to the available genomic sequences from Pneumocystis carinii, the species infecting rats, and Pneumocystis murina, the species infecting mice, revealed the same deficiencies. The genome sequencing of T. deformans yielded a 13.3 Mb assembly, which encodes 5,735 genes. T. deformans possesses enzymes involved in plant cell wall degradation, secondary metabolism, the glyoxylate cycle, detoxification, sterol biosynthesis, as well as the biosynthesis of plant hormones such as abscisic acid or indole-3-acetic acid. T. deformans also harbors gene subsets that have counterparts in plant saprophytes or pathogens, which is consistent with its alternating saprophytic and pathogenic lifestyles. Mating genes were also identified. The homothallism of this fungus suggests a mating-type switching mechanism. Comparative analyses indicated that 81% of P. jirovecii genes are shared with eight other Taphrinomycotina members, including T. deformans, P. carinii and P. murina. These genes are mostly involved in housekeeping activities. The genes specific to the Pneumocystis genus represent 8%, and are involved in RNA metabolism and signaling. Signaling is known to be crucial for the interaction of Pneumocystis spp with their environment. Eleven percent are unique to P. jirovecii and encode mostly proteins of unknown function. These genes, in conjunction with others (e.g. the major surface glycoproteins), might govern the interaction of P. jirovecii with its human host cells, and potentially be responsible for the host specificity. P. jirovecii exhibits a genome of reduced size with a low GC content, and most probably scavenges vital compounds such as amino acids and cholesterol from human lungs. Consistently, its genome encodes a large set of transporters (ca. 22% of its genes), which may play a pivotal role in the acquisition of these compounds. All these features are generally observed in obligate parasites of various kingdoms (bacteria, protozoa, fungi).
Moreover, epidemiological studies failed to find evidence of a free-living form of the fungus, and Pneumocystis spp were shown to have co-evolved with their hosts. Given also the lack of virulence factors, our observations strongly suggest that P. jirovecii is an obligate parasite specialized in the colonization of human lungs, which causes disease only in individuals with a compromised immune system. The same conclusion is most likely true for all other Pneumocystis spp in their respective mammalian hosts.
Abstract:
Recent technological progress has greatly facilitated de novo genome sequencing. However, de novo assemblies consist of many pieces of contiguous sequence (contigs) arranged in thousands of scaffolds instead of small numbers of chromosomes. Confirming and improving the quality of such assemblies is critical for subsequent analysis. We present a method to evaluate genome scaffolding by aligning independently obtained transcriptome sequences to the genome and visually summarizing the alignments using the Cytoscape software. Applying this method to the genome of the red fire ant Solenopsis invicta allowed us to identify inconsistencies in 7%, confirm contig order in 20% and extend 16% of scaffolds. Scripts that generate tables for visualization in Cytoscape from FASTA sequence and scaffolding information files are publicly available at https://github.com/ksanao/TGNet.
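The core idea of the method above, using transcripts that span several contigs as independent evidence about scaffold structure, can be sketched in a few lines. This is a simplified illustration and not the TGNet implementation. The function names, the three labels, the input layout and the tab-separated output are assumptions made for the example, and contig orientation is ignored.

```python
import csv
import io


def classify_links(scaffolds, transcript_hits):
    """Classify each pair of contigs hit consecutively by one transcript.

    scaffolds: {scaffold_id: [contig ids in scaffold order]}
    transcript_hits: {transcript_id: [contig ids in order along the transcript]}
    Labels (hypothetical): 'confirms' for neighbours in the same scaffold,
    'inconsistent' for same-scaffold non-neighbours, 'extends' for contigs
    that sit in different scaffolds.
    """
    position = {
        contig: (scaffold_id, index)
        for scaffold_id, contigs in scaffolds.items()
        for index, contig in enumerate(contigs)
    }
    edges = []
    for transcript_id, contigs in transcript_hits.items():
        for a, b in zip(contigs, contigs[1:]):
            if a == b:
                continue  # consecutive exons on the same contig
            (scaf_a, idx_a), (scaf_b, idx_b) = position[a], position[b]
            if scaf_a != scaf_b:
                label = "extends"
            elif abs(idx_a - idx_b) == 1:
                label = "confirms"
            else:
                label = "inconsistent"
            edges.append((a, b, transcript_id, label))
    return edges


def edges_to_tsv(edges):
    """Render edges as a tab-separated edge table, a layout that network
    tools such as Cytoscape can import from a file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target", "transcript", "evidence"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical scaffolding and transcript alignments.
scaffolds = {"S1": ["c1", "c2", "c3"], "S2": ["c4"]}
hits = {"t1": ["c1", "c2"], "t2": ["c1", "c3"], "t3": ["c3", "c4"]}
print(edges_to_tsv(classify_links(scaffolds, hits)))
```

Treating every skipped contig as an inconsistency is deliberately crude: a short contig lying inside an intron would be skipped legitimately, so a real evaluation would need alignment coordinates and strand information as well.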
Abstract:
We summarize the progress in whole-genome sequencing and analyses of primate genomes. These emerging genome datasets have broadened our understanding of primate genome evolution revealing unexpected and complex patterns of evolutionary change. This includes the characterization of genome structural variation, episodic changes in the repeat landscape, differences in gene expression, new models regarding speciation, and the ephemeral nature of the recombination landscape. The functional characterization of genomic differences important in primate speciation and adaptation remains a significant challenge. Limited access to biological materials, the lack of detailed phenotypic data and the endangered status of many critical primate species have significantly attenuated research into the genetic basis of primate evolution. Next-generation sequencing technologies promise to greatly expand the number of available primate genome sequences; however, such draft genome sequences will likely miss critical genetic differences within complex genomic regions unless dedicated efforts are put forward to understand the full spectrum of genetic variation.
Abstract:
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
Abstract:
Genome sequencing efforts are providing us with complete genetic blueprints for hundreds of organisms. We are now faced with assigning, understanding, and modifying the functions of proteins encoded by these genomes. DBMODELING is a relational database of annotated comparative protein structure models and their metabolic pathway characterization, when identified. This procedure was applied to complete genomes such as Mycobacterium tuberculosis and Xylella fastidiosa. The main interest in the study of metabolic pathways is that some of these pathways are not present in humans, which makes them selective targets for drug design, decreasing the impact of drugs in humans. In the database, there are currently 1116 proteins from two genomes. It can be accessed by any researcher at http://www.biocristalografia.df.ibilce.unesp.br/tools/. This project confirms that homology modeling is a useful tool in structural bioinformatics and that it can be very valuable in annotating genome sequence information, contributing to structural and functional genomics, and analyzing protein-ligand docking.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)