4 results for Plant genome mapping
at Université de Lausanne, Switzerland
Abstract:
The mutualistic symbiosis involving Glomeromycota, a distinctive phylum of early diverging Fungi, is widely hypothesized to have promoted the evolution of land plants during the middle Paleozoic. These arbuscular mycorrhizal fungi (AMF) perform vital functions in the phosphorus cycle that are fundamental to sustainable crop plant productivity. The unusual biological features of AMF have long fascinated evolutionary biologists. The coenocytic hyphae host a community of hundreds of nuclei and reproduce clonally through large multinucleated spores. It has been suggested that the AMF maintain a stable assemblage of several different genomes during the life cycle, but this genomic organization has been questioned. Here we introduce the 153-Mb haploid genome of Rhizophagus irregularis and its repertoire of 28,232 genes. The observed low level of genome polymorphism (0.43 SNP per kb) is not consistent with the occurrence of multiple, highly diverged genomes. The expansion of mating-related genes suggests the existence of cryptic sex-related processes. A comparison of gene categories confirms that R. irregularis is close to the Mucoromycotina. The AMF obligate biotrophy is not explained by genome erosion or any related loss of metabolic complexity in central metabolism, but is marked by a lack of genes encoding plant cell wall-degrading enzymes and of genes involved in toxin and thiamine synthesis. A battery of mycorrhiza-induced secreted proteins is expressed in symbiotic tissues. The present comprehensive repertoire of R. irregularis genes provides a basis for future research on symbiosis-related mechanisms in Glomeromycota.
Abstract:
Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on ¹H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P < 5×10⁻⁸) and independent associations between single nucleotide polymorphisms (SNPs) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10⁻⁴⁴) and lysine (rs8101881, P = 1.2×10⁻³³), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.
Abstract:
Many disorders are associated with altered serum protein concentrations, including malnutrition, cancer, and cardiovascular, kidney, and inflammatory diseases. Although these protein concentrations are highly heritable, relatively little is known about their underlying genetic determinants. Through transethnic meta-analysis of European-ancestry and Japanese genome-wide association studies, we identified six loci at genome-wide significance (p < 5 × 10⁻⁸) for serum albumin (HPN-SCN1B, GCKR-FNDC4, SERPINF2-WDR81, TNFRSF11A-ZCCHC2, FRMD5-WDR76, and RPS11-FCGRT, in up to 53,190 European-ancestry and 9,380 Japanese individuals) and three loci for total protein (TNFRS13B, 6q21.3, and ELL2, in up to 25,539 European-ancestry and 10,168 Japanese individuals). We observed little evidence of heterogeneity in allelic effects at these loci between groups of European and Japanese ancestry but obtained substantial improvements in the resolution of fine mapping of potential causal variants by leveraging transethnic differences in the distribution of linkage disequilibrium. We demonstrated a functional role for the most strongly associated serum albumin locus, HPN, for which Hpn knockout mice manifest low plasma albumin concentrations. Other loci associated with serum albumin harbor genes related to ribosome function, protein translation, and proteasomal degradation, whereas those associated with serum total protein include genes related to immune function. Our results highlight the advantages of transethnic meta-analysis for the discovery and fine mapping of complex trait loci and have provided initial insights into the underlying genetic architecture of serum protein concentrations and their association with human disease.
Abstract:
Next-generation sequencing (NGS) technologies have become the standard for data generation in studies of population genomics, such as the 1000 Genomes Project (1000G). However, these techniques are known to be problematic when applied to highly polymorphic genomic regions, such as the human leukocyte antigen (HLA) genes. Because accurate genotype calls and allele frequency estimations are crucial to population genomics analyses, it is important to assess the reliability of NGS data. Here, we evaluate the reliability of genotype calls and allele frequency estimates of the single-nucleotide polymorphisms (SNPs) reported by 1000G (phase I) at five HLA genes (HLA-A, -B, -C, -DRB1, and -DQB1). We take advantage of the availability of HLA Sanger sequencing of 930 of the 1092 1000G samples and use this as a gold standard to benchmark the 1000G data. We document that 18.6% of SNP genotype calls in HLA genes are incorrect and that allele frequencies are estimated with an error greater than ±0.1 at approximately 25% of the SNPs in HLA genes. We found a bias toward overestimation of reference allele frequency for the 1000G data, indicating that mapping bias is an important cause of error in frequency estimation in this dataset. We provide a list of sites that have poor allele frequency estimates and discuss the outcomes of including those sites in different kinds of analyses. Because the HLA region is the most polymorphic in the human genome, our results provide insights into the challenges of using NGS data in other genomic regions of high diversity.
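The kind of benchmark described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes genotypes are encoded as alternate-allele counts (0, 1, or 2) per sample at biallelic sites, and the names `benchmark`, `alt_allele_frequency`, and `max_error` are hypothetical. The 0.1 default mirrors the ±0.1 error threshold quoted in the abstract.

```python
from typing import Dict, List

# Assumed encoding: site id -> alternate-allele count (0, 1 or 2) per sample,
# with samples in the same order in both datasets.
Genotypes = Dict[str, List[int]]


def alt_allele_frequency(genotypes: List[int]) -> float:
    """Alternate-allele frequency at one biallelic site in diploid samples."""
    return sum(genotypes) / (2 * len(genotypes))


def benchmark(ngs: Genotypes, gold: Genotypes, max_error: float = 0.1) -> dict:
    """Compare NGS calls against gold-standard calls, site by site."""
    mismatches = 0
    total = 0
    poor_sites = []        # sites whose frequency error exceeds max_error
    ref_freq_shifts = []   # NGS minus gold reference-allele frequency

    for site, gold_calls in gold.items():
        ngs_calls = ngs[site]
        total += len(gold_calls)
        mismatches += sum(1 for a, b in zip(ngs_calls, gold_calls) if a != b)

        diff = alt_allele_frequency(ngs_calls) - alt_allele_frequency(gold_calls)
        if abs(diff) > max_error:
            poor_sites.append(site)
        # Reference frequency is 1 - alternate frequency, so its shift is -diff.
        ref_freq_shifts.append(-diff)

    return {
        "discordance": mismatches / total,
        "poor_sites": poor_sites,
        "mean_ref_freq_shift": sum(ref_freq_shifts) / len(ref_freq_shifts),
    }
```

A positive `mean_ref_freq_shift` means the NGS data overestimate the reference allele frequency on average, which is the direction expected under mapping bias.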