928 results for complete genome
Abstract:
Schistosomes have a comparatively large genome, estimated for Schistosoma mansoni to be about 270 megabase pairs (haploid genome). Recent findings have shown that mobile genetic elements constitute significant proportions of the genomes of S. mansoni and S. japonicum. Much less information is available on the genome of the third major human schistosome, S. haematobium. In order to investigate the possible evolutionary origins of the S. mansoni long terminal repeat retrotransposons Boudicca and Sinbad, several genomes were searched by Southern blot for the presence of these retrotransposons. These included three species of schistosomes, S. mansoni, S. japonicum, and S. haematobium, and three related platyhelminth genomes, the liver flukes Fasciola hepatica and Fascioloides magna and the planarian, Dugesia dorotocephala. In addition, Homo sapiens and three snail host genomes, Biomphalaria glabrata, Oncomelania hupensis, and Bulinus truncatus, were examined for possible indications of a horizontal origin for these retrotransposons. Southern hybridization analysis indicated that both Boudicca and Sinbad were present in the genome of S. haematobium. Furthermore, low stringency Southern hybridization analyses suggested that a Boudicca-like retrotransposon was present in the genome of B. truncatus, the snail host of S. haematobium.
Abstract:
Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly polymorphic approximately 520-megabase genome of the Florida lancelet Branchiostoma floridae, and analyse it in the context of chordate evolution. Whole-genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets and vertebrates), and allow not only reconstruction of the gene complement of the last common chordate ancestor but also partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.
Abstract:
In Xenopus laevis four estrogen-responsive genes are expressed simultaneously to produce vitellogenin, the precursor of the yolk proteins. One of these four genes, the gene A2, was sequenced completely, as well as cDNAs representing 75% of the coding region of the gene. From these data the exon-intron structure of the gene was established, revealing 35 exons that give a transcript of 5,619 bp without the poly(A) tail. This A2 transcript encodes a vitellogenin of 1,807 amino acids, whose structure is discussed with respect to its function. At the nucleic acid as well as at the protein level no extensive homologies with any sequences other than vitellogenin were observed. Comparison of the amino acid sequence of the vitellogenin A2 molecule with biochemical data obtained from the different yolk proteins allowed us to localize the cleavage products on the vitellogenin precursor as follows: NH2 - lipovitellin I - phosvitin (or phosvette II - phosvette I) - lipovitellin II - COOH.
Abstract:
Differences between genomes can be due to single nucleotide variants, translocations, inversions, and copy number variants (CNVs, gain or loss of DNA). The latter can range from sub-microscopic events to complete chromosomal aneuploidies. Small CNVs are often benign but those larger than 500 kb are strongly associated with morbid consequences such as developmental disorders and cancer. Detecting CNVs within and between populations is essential to better understand the plasticity of our genome and to elucidate its possible contribution to disease. Hence there is a need for better-tailored and more robust tools for the detection and genome-wide analyses of CNVs. While a link between a given CNV and a disease may have often been established, the relative CNV contribution to disease progression and impact on drug response is not necessarily understood. In this review we discuss the progress, challenges, and limitations that occur at different stages of CNV analysis from the detection (using DNA microarrays and next-generation sequencing) and identification of recurrent CNVs to the association with phenotypes. We emphasize the importance of germline CNVs and propose strategies to aid clinicians to better interpret structural variations and assess their clinical implications.
Abstract:
In vertebrates, genome size has been shown to correlate with nuclear and cell sizes, and influences phenotypic features, such as brain complexity. In three different anuran families, advertisement calls of polyploids exhibit longer notes and intervals than those of diploids, and differences in cellular dimensions have been hypothesized to cause these modifications. We investigated this phenomenon in green toads (Bufo viridis subgroup) of three ploidy levels, in a different call type (release calls) that may evolve independently from advertisement calls, examining 1205 calls from ten species, subspecies, and hybrid forms. Significant differences between pulse rates of six diploid and four polyploid (3n, 4n) green toad forms across a range of temperatures from 7 to 27 °C were found. Laboratory data supported differences in pulse rates of triploids vs. tetraploids, but failed to reach significance when including field recordings. This study supports the idea that genome size, irrespective of call type, phylogenetic context, and geographical background, might affect call properties in anurans and suggests a common principle governing this relationship. The nuclear-cell size ratio, affected by genome size, seems the most plausible explanation. However, we cannot rule out hypotheses under which call-influencing genes from an unexamined diploid ancestral species might also affect call properties in the hybrid-origin polyploids.
Abstract:
The number of sequences generated by genome projects has increased exponentially, but gene characterization has not followed at the same rate. Sequencing and analysis of full-length cDNAs is an important step in gene characterization that is now used by several research groups. In this work, we selected Schistosoma mansoni clones for full-length sequencing, using an algorithm that investigates the presence of the initial methionine in the parasite sequence based on the positions of alignment start between two sequences. BLAST searches to produce such alignments were performed using parasite expressed sequence tags produced by the Minas Gerais Genome Network against sequences from the Eukaryotic Cluster of Orthologous Groups (KOG) database. This procedure allowed the selection of clones representing 398 proteins which have not been deposited as S. mansoni complete CDS in any public database. Dedicated sequencing of 96 of these clones with reads from both 5' and 3' ends was performed. These reads were assembled using PHRAP, resulting in the production of 33 full-length sequences that represent novel S. mansoni proteins. These results should contribute to a more complete view of the biology of this important parasite.
Abstract:
BACKGROUND: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. RESULTS: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. CONCLUSIONS: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
Abstract:
Two allelic genomic fragments containing ribosomal protein S4 encoding genes (rpS4) from Trypanosoma cruzi (CL-Brener strain) were isolated and characterized. One allele comprises two complete tandem repeats of a sequence encoding an rpS4 gene. In the other, only one rpS4 gene is found. Sequence comparison to the data available in the genome project database reveals that our two-copy allele corresponds to a variant haplotype. However, the deduced amino acid sequence of all the gene copies is identical. The rpS4 transcript processing sites were determined by comparison of genomic sequences with published cDNA data. The sequence data obtained demonstrate that rpS4 genes are expressed in epimastigotes, amastigotes, and trypomastigotes. A recombinant version of rpS4 was found to be antigenic: it was recognized by 62.5% of the individuals with positive serology for T. cruzi and by 93.3% of patients with proven chronic chagasic disease.
Abstract:
Genotyping and molecular characterization of drug resistance mechanisms in Mycobacterium leprae enables disease transmission and drug resistance trends to be monitored. In the present study, we performed genome-wide analysis of Airaku-3, a multidrug-resistant strain with an unknown mechanism of resistance to rifampicin. We identified 12 unique non-synonymous single-nucleotide polymorphisms (SNPs) including two in the transporter-encoding ctpC and ctpI genes. In addition, two SNPs were found that improve the resolution of SNP-based genotyping, particularly for Venezuelan and South East Asian strains of M. leprae.
Abstract:
Genome-wide association studies (GWAS) have been successful in identifying common genetic variation involved in susceptibility to etiologically complex disease. We conducted a GWAS to identify common genetic variation involved in susceptibility to upper aero-digestive tract (UADT) cancers. Genome-wide genotyping was carried out using the Illumina HumanHap300 beadchips in 2,091 UADT cancer cases and 3,513 controls from two large European multi-centre UADT cancer studies, as well as 4,821 generic controls. The 19 top-ranked variants were investigated further in an additional 6,514 UADT cancer cases and 7,892 controls of European descent from an additional 13 UADT cancer studies participating in the INHANCE consortium. Five common variants presented evidence for significant association in the combined analysis (p ≤ 5×10⁻⁷). Two novel variants were identified, a 4q21 variant (rs1494961, p = 1×10⁻⁸) located near DNA repair related genes HEL308 and FAM175A (or Abraxas) and a 12q24 variant (rs4767364, p = 2×10⁻⁸) located in an extended linkage disequilibrium region that contains multiple genes including the aldehyde dehydrogenase 2 (ALDH2) gene. Three remaining variants are located in the ADH gene cluster and were identified previously in a candidate gene study involving some of these samples. The association between these three variants and UADT cancers was independently replicated in 5,092 UADT cancer cases and 6,794 controls in the non-overlapping samples presented here (rs1573496-ADH7, p = 5×10⁻⁸; rs1229984-ADH1B, p = 7×10⁻⁹; and rs698-ADH1C, p = 0.02). These results implicate two variants at 4q21 and 12q24 and further highlight three ADH variants in UADT cancer susceptibility.
Analysis of a complete disjunctive table in which all the questions have the same set of categories.
Abstract:
The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D) and coronary heart disease (CHD). We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531) and sought validation of the lead single nucleotide polymorphisms (SNPs) in 5 additional cohorts (n = 6,202). Five SNPs were genome-wide significant in their relationship with adiponectin (P ≤ 5×10⁻⁸). We then tested whether these 5 SNPs were associated with risk of T2D and CHD using a Bonferroni-corrected threshold of P ≤ 0.011 to declare statistical significance for these disease associations. SNPs at the adiponectin-encoding ADIPOQ locus demonstrated the strongest associations with adiponectin levels (P-combined = 9.2×10⁻¹⁹ for lead SNP, rs266717, n = 14,733). A novel variant in the ARL15 (ADP-ribosylation factor-like 15) gene was associated with lower circulating levels of adiponectin (rs4311394-G, P-combined = 2.9×10⁻⁸, n = 14,733). This same risk allele at ARL15 was also associated with a higher risk of CHD (odds ratio [OR] = 1.12, P = 8.5×10⁻⁶, n = 22,421) and, more nominally, with an increased risk of T2D (OR = 1.11, P = 3.2×10⁻³, n = 10,128) and with several metabolic traits. Expression studies in humans indicated that ARL15 is well-expressed in skeletal muscle. These findings identify a novel protein, ARL15, which influences circulating adiponectin levels and may impact upon CHD risk.