846 results for Whole genome sequencing
Abstract:
Alcohol consumption is a moderately heritable trait, but the genetic basis in humans is largely unknown, despite its clinical and societal importance. We report a genome-wide association study meta-analysis of approximately 2.5 million directly genotyped or imputed SNPs with alcohol consumption (grams per day per kilogram of body weight) among 12 population-based samples of European ancestry, comprising 26,316 individuals, with replication genotyping in an additional 21,185 individuals. SNP rs6943555 in the autism susceptibility candidate 2 gene (AUTS2) was associated with alcohol consumption at genome-wide significance (P = 4 × 10^-8 to P = 4 × 10^-9). We found a genotype-specific expression of AUTS2 in 96 human prefrontal cortex samples (P = 0.026) and significant (P < 0.017) differences in expression of AUTS2 in whole-brain extracts of mice selected for differences in voluntary alcohol consumption. Down-regulation of an AUTS2 homolog caused reduced alcohol sensitivity in Drosophila (P < 0.001). Our finding of a regulator of alcohol consumption adds knowledge to our understanding of genetic mechanisms influencing alcohol drinking behavior.
Abstract:
BACKGROUND: The tendency to conceive dizygotic (DZ) twins is a complex trait influenced by genetic and environmental factors. To search for new candidate loci for twinning, we conducted a genome-wide linkage scan in 525 families using microsatellite and single nucleotide polymorphism marker panels. METHODS AND RESULTS: Non-parametric linkage analyses, including 523 families containing a total of 1115 mothers of DZ twins (MODZT) from Australia and New Zealand (ANZ) and The Netherlands (NL), produced four linkage peaks above the threshold for suggestive linkage, including a highly suggestive peak at the extreme telomeric end of chromosome 6 with an exponential logarithm of odds [(exp)LOD] score of 2.813 (P = 0.0002). Since the DZ twinning rate increases steeply with maternal age independent of genetic effects, we also investigated linkage including only families where at least one MODZT gave birth to her first set of twins before the age of 30. These analyses produced a maximum expLOD score of 2.718 (P = 0.0002), largely due to linkage signal from the ANZ cohort; however, ordered subset analyses indicated that this result is most likely a chance finding in the combined dataset. Linkage analyses were also performed for two large DZ twinning families from the USA, one of which produced a peak on chromosome 2 in the region of two potential candidate genes. Sequencing of FSHR and FIGLA, along with INHBB in MODZTs from two large NL families with family-specific linkage peaks directly over this gene, revealed a potentially functional variant in the 5' untranslated region of FSHR that segregated with the DZ twinning phenotype in the Utah family. CONCLUSION: Our data provide further evidence for complex inheritance of familial DZ twinning.
Abstract:
The studies presented in this thesis contribute to the understanding of the evolutionary ecology of three major viruses threatening cultivated sweetpotato (Ipomoea batatas Lam) in East Africa: Sweet potato feathery mottle virus (SPFMV; genus Potyvirus; Potyviridae), Sweet potato chlorotic stunt virus (SPCSV; genus Crinivirus; Closteroviridae) and Sweet potato mild mottle virus (SPMMV; genus Ipomovirus; Potyviridae). The viruses were serologically detected and the positive results confirmed by RT-PCR and sequencing. SPFMV was detected in 24 wild plant species of family Convolvulaceae (genera Ipomoea, Lepistemon and Hewittia), of which 19 species were new natural hosts for SPFMV. SPMMV and SPCSV were detected in wild plants belonging to 21 and 12 species (genera Ipomoea, Lepistemon and Hewittia), respectively, all of which were previously unknown to be natural hosts of these viruses. SPFMV was the most abundant virus, being detected in 17% of the plants, while SPMMV and SPCSV were detected in 9.8% and 5.4% of the assessed plants, respectively. Wild plants in Uganda were infected with the East African (EA), common (C), and the ordinary (O) strains, or co-infected with the EA and the C strain of SPFMV. The viruses and virus-like diseases were more frequent in the eastern agro-ecological zone than the western and central zones, which contrasted with known incidences of these viruses in sweetpotato crops, except for the northern zone, where incidences were lowest in wild plants as in sweetpotato. The NIb/CP junction in SPMMV was determined experimentally, which facilitated CP-based phylogenetic and evolutionary analyses of SPMMV. Isolates of all the three viruses from wild plants were genetically similar to those found in cultivated sweetpotatoes in East Africa. There was no evidence of host-driven population genetic structures, suggesting frequent transmission of these viruses between their wild and cultivated hosts.
The p22 RNA silencing suppressor-encoding sequence was absent in a few SPCSV isolates, but regardless of this, SPCSV isolates incited sweet potato virus disease (SPVD) in sweetpotato plants co-infected with SPFMV, indicating that p22 is redundant for synergism between SPCSV and SPFMV. Molecular evolutionary analysis revealed that isolates of strain EA of SPFMV, which is largely restricted geographically to East Africa, experience frequent recombination in comparison to isolates of strain C, which is globally distributed. Moreover, non-homologous recombination events between strains EA and C were rare, despite frequent co-infections of these strains in wild plants, suggesting purifying selection against non-homologous recombinants between these strains or that such recombinants are mostly not infectious. Recombination was also detected in the 5′- and 3′-proximal regions of the SPMMV genome, providing the first evidence of recombination in genus Ipomovirus, but no recombination events were detected in the characterized genomic regions of SPCSV. Strong purifying selection was implicated in the evolution of the majority of amino acids of the proteins encoded by the analyzed genomic regions of SPFMV, SPMMV and SPCSV. However, positive selection was predicted on 17 amino acids distributed over the whole coat protein (CP) in the globally distributed strain C, as compared to only 4 amino acids in the multifunctional CP N-terminus (CP-NT) of strain EA, which is largely restricted geographically to East Africa. A few amino acid sites in the N-terminus of SPMMV P1, the p7 protein and RNA silencing suppressor proteins p22 and RNase3 of SPCSV were also subjected to positive selection. Positively selected amino acids may constitute ligand-binding domains that determine interactions with plant host and/or insect vector factors.
The P1 proteinase of SPMMV (genus Ipomovirus) seems to respond to needs of adaptation, which was not observed with the helper component proteinase (HC-Pro) of SPMMV, although the HC-Pro is responsible for many important molecular interactions in genus Potyvirus. Because the centre of origin of cultivated sweetpotato is in the Americas, from where the crop was dispersed to other continents in recent history (except for the Australasia and South Pacific region), it would be expected that identical viruses and their strains occur worldwide, presuming virus dispersal with the host. Apparently, this seems not to be the case with SPMMV, the strain EA of SPFMV and the strain EA of SPCSV, which are largely geographically confined to East Africa, where they are predominant and occur both in natural and agro-ecosystems. The geographical distribution of plant viruses is constrained more by virus-vector relations than by virus-host interactions, which, together with the wide range of natural host species and the geographical confinement to East Africa, suggests that these viruses existed in East African wild plants before the introduction of sweetpotato. Subsequently, these studies provide compelling evidence that East Africa constitutes a cradle of SPFMV strain EA, SPCSV strain EA, and SPMMV. Therefore, sweet potato virus disease (SPVD) in East Africa may be one of the examples of damaging virus diseases resulting from exchange of viruses between introduced crops and indigenous wild plant species.
Keywords: Convolvulaceae, East Africa, epidemiology, evolution, genetic variability, Ipomoea, recombination, SPCSV, SPFMV, SPMMV, selection pressure, sweetpotato, wild plant species
Author's Address: Arthur K. Tugume, Department of Agricultural Sciences, Faculty of Agriculture and Forestry, University of Helsinki, Latokartanonkaari 7, P.O. Box 27, FIN-00014, Helsinki, Finland. Email: tugume.arthur@helsinki.fi
Author's Present Address: Arthur K. Tugume, Department of Botany, Faculty of Science, Makerere University, P.O. Box 7062, Kampala, Uganda. Email: aktugume@botany.mak.ac.ug, tugumeka@yahoo.com
Abstract:
The silver gemfish Rexea solandri is an important economic resource but vulnerable to overfishing in Australian waters. The complete mitochondrial genome sequence is described from 1.6 million reads obtained via next-generation sequencing. The total length of the mitogenome is 16,350 bp, comprising 2 rRNA, 13 protein-coding genes, 22 tRNA and 2 non-coding regions. The mitogenome sequence was validated against sequences of PCR fragments and BLAST queries of GenBank. Gene order was equivalent to that found in marine fishes.
Abstract:
Background Next-generation sequencing technology is an important tool for the rapid, genome-wide identification of genetic variations. However, it is difficult to resolve the ‘signal’ of variations of interest and the ‘noise’ of stochastic sequencing and bioinformatic errors in the large datasets that are generated. We report a simple approach to identify regional linkage to a trait that requires only two pools of DNA to be sequenced from progeny of a defined genetic cross (i.e. bulk segregant analysis) at low coverage (<10×) and without parentage assignment of individual SNPs. The analysis relies on regional averaging of pooled SNP frequencies to rapidly scan polymorphisms across the genome for differential regional homozygosity, which is then displayed graphically. Results Progeny from defined genetic crosses of Tribolium castaneum (F4 and F19) segregating for the phosphine resistance trait were exposed to phosphine to select for the resistance trait while the remainders were left unexposed. Next generation sequencing was then carried out on the genomic DNA from each pool of selected and unselected insects from each generation. The reads were mapped against the annotated T. castaneum genome from NCBI (v3.0) and analysed for SNP variations. Since it is difficult to accurately call individual SNP frequencies when the depth of sequence coverage is low, variant frequencies were averaged across larger regions. Results from regional SNP frequency averaging identified two loci, tc_rph1 on chromosome 8 and tc_rph2 on chromosome 9, which together are responsible for high level resistance. Identification of the two loci was possible with only 5-7× average coverage of the genome per dataset. These loci were subsequently confirmed by direct SNP marker analysis and fine-scale mapping. 
Individually, homozygosity of tc_rph1 or tc_rph2 results in only weak resistance to phosphine (estimated at up to 1.5-2.5× and 3-5× respectively), whereas in combination they interact synergistically to provide a high-level resistance >200×. The tc_rph2 resistance allele resulted in a significant fitness cost relative to the wild type allele in unselected beetles over eighteen generations. Conclusion We have validated the technique of linkage mapping by low-coverage sequencing of progeny from a simple genetic cross. The approach relied on regional averaging of SNP frequencies and was used to successfully identify candidate gene loci for phosphine resistance in T. castaneum. This is a relatively simple and rapid approach to identifying genomic regions associated with traits in defined genetic crosses that does not require any specialised statistical analysis.
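The regional averaging described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the fixed window size, the `(chromosome, position, frequency)` tuple layout, and the use of a simple difference in mean allele frequency between the selected and unselected pools are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def regional_average(snps, window=1_000_000):
    """Average per-SNP allele frequencies within fixed-size windows.

    snps: iterable of (chromosome, position, allele_frequency) tuples
          (an assumed layout, not taken from the abstract).
    Returns a dict mapping (chromosome, window_index) to the mean frequency.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for chrom, pos, freq in snps:
        key = (chrom, pos // window)
        sums[key] += freq
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def pool_contrast(selected, unselected, window=1_000_000):
    """Difference in regional mean frequency (selected minus unselected).

    Only windows that contain SNPs in both pools are reported.
    """
    sel = regional_average(selected, window)
    unsel = regional_average(unselected, window)
    shared = sel.keys() & unsel.keys()
    return {key: sel[key] - unsel[key] for key in sorted(shared)}
```

Windows with a large contrast would be candidates for linkage to the selected trait; averaging over many SNPs is what compensates for the noise in individual frequency estimates at low coverage.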
Abstract:
The first complete genome sequence of capsicum chlorosis virus (CaCV) from Australia was determined using a combination of Illumina HiSeq RNA and Sanger sequencing technologies. Australian CaCV had a tripartite genome structure like other CaCV isolates. The large (L) RNA was 8913 nucleotides (nt) in length and contained a single open reading frame (ORF) of 8634 nt encoding a predicted RNA-dependent RNA polymerase (RdRp) in the viral-complementary (vc) sense. The medium (M) and small (S) RNA segments were 4846 and 3944 nt in length, respectively, each containing two non-overlapping ORFs in ambisense orientation, separated by intergenic regions (IGR). The M segment contained ORFs encoding the predicted non-structural movement protein (NSm; 927 nt) and precursor of glycoproteins (GP; 3366 nt) in the viral sense (v) and vc strand, respectively, separated by a 449-nt IGR. The S segment coded for the predicted nucleocapsid (N) protein (828 nt) and non-structural suppressor of silencing protein (NSs; 1320 nt) in the vc and v strand, respectively. The S RNA contained an IGR of 1663 nt, being the largest IGR of all CaCV isolates sequenced so far. Comparison of the Australian CaCV genome with complete CaCV genome sequences from other geographic regions showed highest sequence identity with a Taiwanese isolate. Genome sequence comparisons and phylogeny of all available CaCV isolates provided evidence for at least two highly diverged groups of CaCV isolates that may warrant re-classification of AIT-Thailand and CP-China isolates as unique tospoviruses, separate from CaCV.
Abstract:
Brassica napus is one of the most important oil crops in the world, and stem rot caused by the fungus Sclerotinia sclerotiorum results in major losses in yield and quality. To elucidate resistance genes and pathogenesis-related genes, genome-wide association analysis of 347 accessions was performed using the Illumina 60K Brassica SNP (single nucleotide polymorphism) array. In addition, the detached stem inoculation assay was used to select five highly resistant (R) and susceptible (S) B. napus lines, 48 h postinoculation with S. sclerotiorum for transcriptome sequencing. We identified 17 significant associations for stem resistance on chromosomes A8 and C6, five of which were on A8 and 12 on C6. The SNPs identified on A8 were located in a 409-kb haplotype block, and those on C6 were consistent with previous QTL mapping efforts. Transcriptome analysis suggested that S. sclerotiorum infection activates the immune system, sulphur metabolism, especially glutathione (GSH) and glucosinolates in both R and S genotypes. Genes found to be specific to the R genotype related to the jasmonic acid pathway, lignin biosynthesis, defence response, signal transduction and encoding transcription factors. Twenty-four genes were identified in both the SNP-trait association and transcriptome sequencing analyses, including a tau class glutathione S-transferase (GSTU) gene cluster. This study provides useful insight into the molecular mechanisms underlying the plant's response to S. sclerotiorum.
Abstract:
Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. 
Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve real-world instances of it. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in a web-based software tool, ReMatch, intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
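The connectivity problems mentioned in this abstract can be made concrete with a small Python sketch. This is a generic forward-propagation check and not the thesis's gapless algorithm or the ReMatch implementation: it assumes that a reaction can operate only when all of its substrates are available, and the reaction dictionary layout and function name are hypothetical.

```python
def reachable_metabolites(reactions, seeds):
    """Propagate availability from seed metabolites through a network.

    reactions: dict mapping reaction name -> (substrates, products),
               each an iterable of metabolite names (assumed layout).
    seeds: metabolites assumed to be available from the start.
    Returns (available_metabolites, fired_reactions).
    """
    available = set(seeds)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            # A reaction fires once all of its substrates are available.
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired
```

Reactions that never fire are cut off from the seed metabolites, which is one simple way to flag candidate gaps in a reconstructed network before manual curation.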
Abstract:
Two complete mitochondrial genomes of the black marlin Istiompax indica were assembled from approximately 3.5 and 2.5 million reads produced by Ion Torrent next-generation sequencing. The complete genomes were 16,531 bp and 16,532 bp in length, consisting of 2 rRNA, 13 protein-coding genes, 22 tRNA and 2 non-coding regions. They demonstrated a similar A + T base content (52.6%) to other teleosts. Intraspecific sequence variation was 99.5% for three I. indica mitogenomes and 99.7% for X. gladius. A lower value (85%) was found for the I. platypterus mitogenomes from GenBank and attributed to inadvertent inclusion of gene regions from a con-familial species in one record, highlighting the need for cautious downstream use of GenBank data. © 2014 Informa UK Ltd.
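As a small illustration of the A + T figure quoted above, the following Python sketch computes the A + T fraction of a nucleotide sequence. The function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption, since the abstract does not say how they were handled.

```python
def at_content(sequence):
    """Return the fraction of A and T among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / total
```

Multiplying the result by 100 gives a percentage comparable to the value reported in the abstract.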
Abstract:
In the last decade, huge breakthroughs in genetics, driven by new technology and different statistical approaches, have resulted in a plethora of new disease genes identified for both common and rare diseases. Massively parallel sequencing, commonly known as next-generation sequencing, is the latest advance in genetics, and has already facilitated the discovery of the molecular cause of many monogenic disorders. This article describes this new technology and reviews how this approach has been used successfully in patients with skeletal dysplasias. Moreover, this article illustrates how the study of rare diseases can inform understanding and therapeutic developments for common diseases such as osteoporosis. © International Osteoporosis Foundation and National Osteoporosis Foundation 2013.
Abstract:
Purpose: Mutations in IDH3B, an enzyme participating in the Krebs cycle, have recently been found to cause autosomal recessive retinitis pigmentosa (arRP). The MDH1 gene maps within the RP28 arRP linkage interval and encodes cytoplasmic malate dehydrogenase, an enzyme functionally related to IDH3B. As a proof of concept for candidate gene screening to be routinely performed by ultra high throughput sequencing (UHTs), we analyzed MDH1 in a patient from each of the two families described so far to show linkage between arRP and RP28. Methods: With genomic long-range PCR, we amplified all introns and exons of the MDH1 gene (23.4 kb). PCR products were then sequenced by short-read UHTs with no further processing. Computer-based mapping of the reads and mutation detection were performed by three independent software packages. Results: Despite the intrinsic complexity of human genome sequences, reads were easily mapped and analyzed, and all algorithms used provided the same results. The two patients were homozygous for all DNA variants identified in the region, which confirms previous linkage and homozygosity mapping results, but had different haplotypes, indicating genetic or allelic heterogeneity. None of the DNA changes detected could be associated with the disease. Conclusions: The MDH1 gene is not the cause of RP28-linked arRP. Our experimental strategy shows that long-range genomic PCR followed by UHTs provides an excellent system to perform a thorough screening of candidate genes for hereditary retinal degeneration.
Abstract:
Horizontal gene transfer (HGT) is known to be a major force in genome evolution. The acquisition of genes from viruses by eukaryotic genomes is a well-studied example of HGT, including rare cases of non-retroviral RNA virus integration. The present study describes the integration of cucumber mosaic virus RNA-1 into the soybean genome. After an initial metatranscriptomic analysis of small RNAs derived from soybean, the de novo assembly resulted in a 3029-nt contig homologous to RNA-1. The integration of this sequence in the soybean genome was confirmed by DNA deep sequencing. The locus where the integration occurred harbors the full RNA-1 sequence followed by the partial sequence of an endogenous mRNA and another sequence of RNA-1 as an inverted repeat, allowing the formation of a hairpin structure. This region recombined into a retrotransposon located inside an exon of a soybean gene. The nucleotide similarity of the integrated sequence compared to other cucumber mosaic virus sequences indicates that the integration event occurred recently. We describe a rare event of non-retroviral RNA virus integration in soybean that leads to the production of a double-stranded RNA in a similar fashion to virus-resistant RNAi plants.
Abstract:
Background: Candida auris is a multidrug-resistant, emerging agent of fungemia in humans. Its actual global distribution remains obscure as the current commercial methods of clinical diagnosis misidentify it as C. haemulonii. Here we report the first draft genome of C. auris to explore the genomic basis of virulence and unique differences that could be employed for differential diagnosis. Results: More than 99.5% of the C. auris genomic reads did not align to the current whole (or draft) genome sequences of Candida albicans, Candida lusitaniae, Candida glabrata and Saccharomyces cerevisiae, thereby indicating its divergence from the active Candida clade. The genome spans around 12.49 Mb with 8527 predicted genes. Functional annotation revealed that among the sequenced Candida species, it is closest to the hemiascomycete species Clavispora lusitaniae. Comparison with the well-studied species Candida albicans showed that it shares significant virulence attributes with other pathogenic Candida species, such as oligopeptide transporters, mannosyl transferases, secreted proteases and genes involved in biofilm formation. We also identified a plethora of transporters belonging to the ABC and major facilitator superfamily along with known MDR transcription factors, which explained its high tolerance to antifungal drugs. Conclusions: Our study emphasizes an urgent need for accurate fungal screening methods such as PCR and electrophoretic karyotyping to ensure proper management of fungemia. Our work highlights the potential genetic mechanisms involved in virulence and pathogenicity of an important emerging human pathogen, namely C. auris. Owing to its diversity at the genomic scale, we expect the genome sequence to be a useful resource to map species-specific differences that will help develop accurate diagnostic markers and better drug targets.
Abstract:
The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We sequenced E. maximus and mapped the reads to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant-specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein-coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein-coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.