831 resultados para Klebsiella pneumoniae genome sequence
Resumo:
We report a genome-wide association study for open-angle glaucoma (OAG) blindness using a discovery cohort of 590 individuals with severe visual field loss (cases) and 3,956 controls. We identified associated loci at TMCO1 (rs4656461[G] odds ratio (OR) = 1.68, P = 6.1 × 10-10) and CDKN2B-AS1 (rs4977756[A] OR = 1.50, P = 4.7 × 10-9). We replicated these associations in an independent cohort of cases with advanced OAG (rs4656461 P = 0.010; rs4977756 P = 0.042) and two additional cohorts of less severe OAG (rs4656461 combined discovery and replication P = 6.00 × 10-14, OR = 1.51, 95% CI 1.35-1.68; rs4977756 combined P = 1.35 × 10-14, OR = 1.39, 95% CI 1.28-1.51). We show retinal expression of genes at both loci in human ocular tissues. We also show that CDKN2A and CDKN2B are upregulated in the retina of a rat model of glaucoma. © 2011 Nature America, Inc. All rights reserved.
Resumo:
Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed 19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated 50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with diseaseIRGM for Crohns disease, HLA for Crohns disease, rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetesalthough in each case the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases. © 2010 Macmillan Publishers Limited. All rights reserved.
Resumo:
Chlamydia pneumoniae is an obligate intracellular bacterium implicated in a wide range of human diseases including atherosclerosis and Alzheimer's disease. Efforts to understand the relationships between C. pneumoniae detected in these diseases have been hindered by the availability of sequence data for non-respiratory strains. In this study, we sequenced the whole genomes for C. pneumoniae isolates from atherosclerosis and Alzheimer's disease, and compared these to previously published C. pneumoniae genomes. Phylogenetic analyses of these new C. pneumoniae strains indicate two sub-groups within human C. pneumoniae, and suggest that both recombination and mutation events have driven the evolution of human C. pneumoniae. Further fine-detailed analyses of these new C. pneumoniae sequences show several genetically variable loci. This suggests that similar strains of C. pneumoniae are found in the brain, lungs and cardiovascular system and that only minor genetic differences may contribute to the adaptation of particular strains in human disease.
Resumo:
The restructuring of the crop agriculture industry over the past two decades has enabled patent holders to exclude, prevent and deter others from using certain research tools and delay or block further follow-on inventions
Resumo:
Background: Tuberculosis still remains one of the largest killer infectious diseases, warranting the identification of newer targets and drugs. Identification and validation of appropriate targets for designing drugs are critical steps in drug discovery, which are at present major bottle-necks. A majority of drugs in current clinical use for many diseases have been designed without the knowledge of the targets, perhaps because standard methodologies to identify such targets in a high-throughput fashion do not really exist. With different kinds of 'omics' data that are now available, computational approaches can be powerful means of obtaining short-lists of possible targets for further experimental validation. Results: We report a comprehensive in silico target identification pipeline, targetTB, for Mycobacterium tuberculosis. The pipeline incorporates a network analysis of the protein-protein interactome, a flux balance analysis of the reactome, experimentally derived phenotype essentiality data, sequence analyses and a structural assessment of targetability, using novel algorithms recently developed by us. Using flux balance analysis and network analysis, proteins critical for survival of M. tuberculosis are first identified, followed by comparative genomics with the host, finally incorporating a novel structural analysis of the binding sites to assess the feasibility of a protein as a target. Further analyses include correlation with expression data and non-similarity to gut flora proteins as well as 'anti-targets' in the host, leading to the identification of 451 high-confidence targets. Through phylogenetic profiling against 228 pathogen genomes, shortlisted targets have been further explored to identify broad-spectrum antibiotic targets, while also identifying those specific to tuberculosis. Targets that address mycobacterial persistence and drug resistance mechanisms are also analysed. Conclusion: The pipeline developed provides rational schema for drug target identification that are likely to have high rates of success, which is expected to save enormous amounts of money, resources and time in the drug discovery process. A thorough comparison with previously suggested targets in the literature demonstrates the usefulness of the integrated approach used in our study, highlighting the importance of systems-level analyses in particular. The method has the potential to be used as a general strategy for target identification and validation and hence significantly impact most drug discovery programmes.
Resumo:
The genomic sequences of several RNA plant viruses including cucumber mosaic virus, brome mosaic virus, alfalfa mosaic virus and tobacco mosaic virus have become available recently. The former two viruses are icosahedral while the latter two are bullet and rod shaped, respectively in particle morphology. The non-structural 3a proteins of cucumber mosaic virus and brome mosaic virus have an amino acid sequence homology of 35% and hence are evolutionarily related. In contrast, the coat proteins exhibit little homology, although the circular dichroism spectrum of these viruses are similar. The non-coding regions of the genome also exhibit variable but extensive homology. Comparison of the brome mosaic virus and alfalfa mosaic virus sequences reveals that they are probably related although with a much larger evolutionary distance. The polypeptide folds of the coat protein of three biologically distinct isometric plant viruses, tomato bushy stunt virus, southern bean mosaic virus and satellite tobacco necrosis virus have been shown to display a striking resemblance. All of them consist of a topologically similar 8-standard β-barrel. The implications of these studies to the understanding of the evolution of plant viruses will be discussed.
Resumo:
Blood cells participate in vital physiological processes, and their numbers are tightly regulated so that homeostasis is maintained. Disruption of key regulatory mechanisms underlies many blood-related Mendelian diseases but also contributes to more common disorders, including atherosclerosis. We searched for quantitative trait loci (QTL) for hematology traits through a whole-genome association study, because these could provide new insights into both hemopoeitic and disease mechanisms. We tested 1.8 million variants for association with 13 hematology traits measured in 6015 individuals from the Australian and Dutch populations. These traits included hemoglobin composition, platelet counts, and red blood cell and white blood cell indices. We identified three regions of strong association that, to our knowledge, have not been previously reported in the literature. The first was located in an intergenic region of chromosome 9q31 near LPAR1, explaining 1.5% of the variation in monocyte counts (best SNP rs7023923, p=8.9x10(-14)). The second locus was located on chromosome 6p21 and associated with mean cell erythrocyte volume (rs12661667, p=1.2x10(-9), 0.7% variance explained) in a region that spanned five genes, including CCND3, a member of the D-cyclin gene family that is involved in hematopoietic stem cell expansion. The third region was also associated with erythrocyte volume and was located in an intergenic region on chromosome 6q24 (rs592423, p=5.3x10(-9), 0.6% variance explained). All three loci replicated in an independent panel of 1543 individuals (p values=0.001, 9.9x10(-5), and 7x10(-5), respectively). The identification of these QTL provides new opportunities for furthering our understanding of the mechanisms regulating hemopoietic cell fate.
Resumo:
Background Next-generation sequencing technology is an important tool for the rapid, genome-wide identification of genetic variations. However, it is difficult to resolve the ‘signal’ of variations of interest and the ‘noise’ of stochastic sequencing and bioinformatic errors in the large datasets that are generated. We report a simple approach to identify regional linkage to a trait that requires only two pools of DNA to be sequenced from progeny of a defined genetic cross (i.e. bulk segregant analysis) at low coverage (<10×) and without parentage assignment of individual SNPs. The analysis relies on regional averaging of pooled SNP frequencies to rapidly scan polymorphisms across the genome for differential regional homozygosity, which is then displayed graphically. Results Progeny from defined genetic crosses of Tribolium castaneum (F4 and F19) segregating for the phosphine resistance trait were exposed to phosphine to select for the resistance trait while the remainders were left unexposed. Next generation sequencing was then carried out on the genomic DNA from each pool of selected and unselected insects from each generation. The reads were mapped against the annotated T. castaneum genome from NCBI (v3.0) and analysed for SNP variations. Since it is difficult to accurately call individual SNP frequencies when the depth of sequence coverage is low, variant frequencies were averaged across larger regions. Results from regional SNP frequency averaging identified two loci, tc_rph1 on chromosome 8 and tc_rph2 on chromosome 9, which together are responsible for high level resistance. Identification of the two loci was possible with only 5-7× average coverage of the genome per dataset. These loci were subsequently confirmed by direct SNP marker analysis and fine-scale mapping. Individually, homozygosity of tc_rph1 or tc_rph2 results in only weak resistance to phosphine (estimated at up to 1.5-2.5× and 3-5× respectively), whereas in combination they interact synergistically to provide a high-level resistance >200×. The tc_rph2 resistance allele resulted in a significant fitness cost relative to the wild type allele in unselected beetles over eighteen generations. Conclusion We have validated the technique of linkage mapping by low-coverage sequencing of progeny from a simple genetic cross. The approach relied on regional averaging of SNP frequencies and was used to successfully identify candidate gene loci for phosphine resistance in T. castaneum. This is a relatively simple and rapid approach to identifying genomic regions associated with traits in defined genetic crosses that does not require any specialised statistical analysis.
Resumo:
Sorghum (Sorghum bicolor) is one of the most important cereal crops globally and a potential energy plant for biofuel production. In order to explore genetic gain for a range of important quantitative traits, such as drought and heat tolerance, grain yield, stem sugar accumulation, and biomass production, via the use of molecular breeding and genomic selection strategies, knowledge of the available genetic variation and the underlying sequence polymorphisms, is required.
Resumo:
Two complete mitochondrial genomes of the black marlin Istiompax indica were assembled from approximately 3.5 and 2.5 million reads produced by Ion Torrent next generation sequencing. The complete genomes were 16,531 bp and 16,532 bp in length consisting of 2 rRNA, 13 protein-coding genes, 22tRNA and 2 coding regions. They demonstrated a similar A + T base (52.6%) to other teleosts. Intraspecific sequence variation was 99.5% for three I. indica mitogenomes and 99.7% for X. gladius. A lower value (85%) was found for the I. platypterus mitogenomes from genbank and accredited to inadvertent inclusion of gene regions from a con-familial species in one record, highlighting the need for cautious downstream use of genbank data. © 2014 Informa UK Ltd.
Resumo:
The mango industry in Australia is worth in excess of $150 million annually with the Kensington Pride (KP) cultivar capturing 60% of the domestic market. Valued by consumers for desirable taste and colour characteristics, KP has been used extensively as a parent in the Department of Agriculture and Fisheries’ (Queensland, Australia) mango breeding program with over 400 hybrid trees sharing KP as the male parent. In order to gain a better understanding of Australia’s most significant mango variety, Horticulture Innovation Australia had led an international collaboration between the Queensland Department of Agriculture and Fisheries (Australia), the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT, India) and the Beijing Genomics Institute (China) to sequence the KP genome. Preliminary de novo assembly of illumina short read sequence data suggests that the KP genome is highly heterozygous and has an estimated genome size of 407 Mb. As refinements and additional sequence data are added to the assembly, a more complete picture of the mango genome will be elucidated.
Resumo:
The complete genome of the baker's yeast S. cerevisiae was analyzed for the presence of polypurine/polypyrimidine (poly[pu/py]) repeats and their occurrences were classified on the basis of their location within and outside open reading frames (ORFs). The analysis reveals that such sequence motifs are present abundantly both in coding as well as noncoding regions. Clear positional preferences are seen when these tracts occur in noncoding regions. These motifs appear to occur predominantly at a unit nucleosomal length both upstream and downstream of ORFs. Moreover, there is a biased distribution of polypurines in the coding strands when these motifs occur within open reading frames. The significance of the biased distribution is discussed with reference to the occurrence of these motifs in other known mRNA sequences and expressed sequence tags. A model for cis regulation of gene expression is proposed based on the ability of these motifs to form an intermolecular triple helix structure when present within the coding region and/or to modulate nucleosome positioning via enhanced histone affinity when present outside coding regions.
Resumo:
Viral genomes are encapsidated within protective protein shells. This encapsidation can be achieved either by a co-condensation reaction of the nucleic acid and coat proteins, or by first forming empty viral particles which are subsequently packaged with nucleic acid, the latter mechanism being typical for many dsDNA bacteriophages. Bacteriophage PRD1 is an icosahedral, non-tailed dsDNA virus that has an internal lipid membrane, the hallmark of the Tectiviridae family. Although PRD1 has been known to assemble empty particles into which the genome is subsequently packaged, the mechanism for this has been unknown, and there has been no evidence for a separate packaging vertex, similar to the portal structures used for packaging in the tailed bacteriophages and herpesviruses. In this study, a unique DNA packaging vertex was identified for PRD1, containing the packaging ATPase P9, packaging factor P6 and two small membrane proteins, P20 and P22, extending the packaging vertex to the internal membrane. Lack of small membrane protein P20 was shown to totally abolish packaging, making it an essential part of the PRD1 packaging mechanism. The minor capsid proteins P6 was shown to be an important packaging factor, its absence leading to greatly reduced packaging efficiency. An in vitro DNA packaging mechanism consisting of recombinant packaging ATPase P9, empty procapsids and mutant PRD1 DNA with a LacZ-insert was developed for the analysis of PRD1 packaging, the first such system ever for a virus containing an internal membrane. A new tectiviral sequence, a linear plasmid called pBClin15, was identified in Bacillus cereus, providing material for sequence analysis of the tectiviruses. Analysis of PRD1 P9 and other putative tectiviral ATPase sequences revealed several conserved sequence motifs, among them a new tectiviral packaging ATPase motif. Mutagenesis studies on PRD1 P9 were used to confirm the significance of the motifs. P9-type putative ATPase sequences carrying a similar sequence motif were identified in several other membrane containing dsDNA viruses of bacterial, archaeal and eukaryotic hosts, suggesting that these viruses may have similar packaging mechanisms. Interestingly, almost the same set of viruses that were found to have similar putative packaging ATPases had earlier been found to share similar coat protein folds and capsid structures, and a common origin for these viruses had been suggested. The finding in this study of similar packaging proteins further supports the idea that these viruses are descendants of a common ancestor.
Resumo:
Evolutionary genetics incorporates traditional population genetics and studies of the origins of genetic variation by mutation and recombination, and the molecular evolution of genomes. Among the primary forces that have potential to affect the genetic variation within and among populations, including those that may lead to adaptation and speciation, are genetic drift, gene flow, mutations and natural selection. The main challenges in knowing the genetic basis of evolutionary changes is to distinguish the adaptive selection forces that cause existent DNA sequence variants and also to identify the nucleotide differences responsible for the observed phenotypic variation. To understand the effects of various forces, interpretation of gene sequence variation has been the principal basis of many evolutionary genetic studies. The main aim of this thesis was to assess different forms of teleost gene sequence polymorphisms in evolutionary genetic studies of Atlantic salmon (Salmo salar) and other species. Firstly, the level of Darwinian adaptive evolution affected coding regions of the growth hormone (GH) gene during the teleost evolution was investigated based on the sequence data existing in public databases. Secondly, a target gene approach was used to identify within population variation in the growth hormone 1 (GH1) gene in salmon. Then, a new strategy for single nucleotide polymorphisms (SNPs) discovery in salmonid fishes was introduced, and, finally, the usefulness of a limited number of SNP markers as molecular tools in several applications of population genetics in Atlantic salmon was assessed. This thesis showed that the gene sequences in databases can be utilized to perform comparative studies of molecular evolution, and some putative evidence of the existence of Darwinian selection during the teleost GH evolution was presented. In addition, existent sequence data was exploited to investigate GH1 gene variation within Atlantic salmon populations throughout its range. Purifying selection is suggested to be the predominant evolutionary force controlling the genetic variation of this gene in salmon, and some support for gene flow between continents was also observed. The novel approach to SNP discovery in species with duplicated genome fragments introduced here proved to be an effective method, and this may have several applications in evolutionary genetics with different species - e.g. when developing gene-targeted markers to investigate quantitative genetic variation. The thesis also demonstrated that only a few SNPs performed highly similar signals in some of the population genetic analyses when compared with the microsatellite markers. This may have useful applications when estimating genetic diversity in genes having a potential role in ecological and conservation issues, or when using hard biological samples in genetic studies as SNPs can be applied with relatively highly degraded DNA.
Resumo:
Filamentous fungi of the subphylum Pezizomycotina are well known as protein and secondary metabolite producers. Various industries take advantage of these capabilities. However, the molecular biology of yeasts, i.e. Saccharomycotina and especially that of Saccharomyces cerevisiae, the baker's yeast, is much better known. In an effort to explain fungal phenotypes through their genotypes we have compared protein coding gene contents of Pezizomycotina and Saccharomycotina. Only biomass degradation and secondary metabolism related protein families seem to have expanded recently in Pezizomycotina. Of the protein families clearly diverged between Pezizomycotina and Saccharomycotina, those related to mitochondrial functions emerge as the most prominent. However, the primary metabolism as described in S. cerevisiae is largely conserved in all fungi. Apart from the known secondary metabolism, Pezizomycotina have pathways that could link secondary metabolism to primary metabolism and a wealth of undescribed enzymes. Previous studies of individual Pezizomycotina genomes have shown that regardless of the difference in production efficiency and diversity of secreted proteins, the content of the known secretion machinery genes in Pezizomycotina and Saccharomycotina appears very similar. Genome wide analysis of gene products is therefore needed to better understand the efficient secretion of Pezizomycotina. We have developed methods applicable to transcriptome analysis of non-sequenced organisms. TRAC (Transcriptional profiling with the aid of affinity capture) has been previously developed at VTT for fast, focused transcription analysis. We introduce a version of TRAC that allows more powerful signal amplification and multiplexing. We also present computational optimisations of transcriptome analysis of non-sequenced organism and TRAC analysis in general. Trichoderma reesei is one of the most commonly used Pezizomycotina in the protein production industry. In order to understand its secretion system better and find clues for improvement of its industrial performance, we have analysed its transcriptomic response to protein secretion stress conditions. In comparison to S. cerevisiae, the response of T. reesei appears different, but still impacts on the same cellular functions. We also discovered in T. reesei interesting similarities to mammalian protein secretion stress response. Together these findings highlight targets for more detailed studies.