937 results for single-nucleotide polymorphism
Abstract:
Background Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer, shows that the relative contribution of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction.
Results Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to a biogeographic pattern found in many plant and animal species: separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia.
Conclusion We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer.
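The idea of restricting SNP discovery to orthologous regions shared by every genome can be illustrated with a minimal Python sketch. It assumes the genomes have already been aligned column by column; the function name `shared_snp_columns` and the toy sequences are hypothetical, and this is not the pipeline used in the study.

```python
def shared_snp_columns(aligned):
    """Return 0-based alignment columns that are variable and called (A/C/G/T) in every sequence.

    `aligned` maps a strain name to its aligned sequence; all sequences
    must have the same length. Columns containing a gap or an ambiguous
    base in any strain are skipped, so only shared positions are kept.
    """
    sequences = list(aligned.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    snps = []
    for i in range(length):
        column = {seq[i].upper() for seq in sequences}
        # Keep the column only if every strain has a real base
        # and at least two different bases are present.
        if column <= set("ACGT") and len(column) > 1:
            snps.append(i)
    return snps


# Hypothetical toy alignment (not data from the study).
toy_alignment = {
    "strain_a": "ACGTAC-TA",
    "strain_b": "ACGTACGTA",
    "strain_c": "ATGTACGCA",
}
snp_positions = shared_snp_columns(toy_alignment)
print(snp_positions)
```

Column 6 is variable only because one strain has a gap there, so it is excluded; the resulting SNP columns could then be concatenated into a matrix for phylogenetic analysis.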
Abstract:
Genetic research of complex diseases is a challenging, but exciting, area of research. Early progress was limited, however, until the completion of the Human Genome and HapMap projects, together with the reduction in the cost of genotyping, paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology, and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we first investigate the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we propose two different methods for reconciling phenotypes of different models, using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm to cope with large-scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.
Abstract:
Although numerous genetic and acquired factors are appreciated as risk factors for venous thromboembolism (VTE) [1,2], only recently have male gender [3,4], dyslipoproteinemia [5], and silent atherosclerotic vascular disease [6] been linked to VTE. We recently found that high-density lipoprotein (HDL) deficiency is a key feature of a pattern of dyslipoproteinemia that is associated with VTE in males, and we found that the common TaqI B1 variation in the cholesteryl ester transfer protein (CETP) gene is significantly linked to VTE [5]. However, the TaqI B1/B2 single nucleotide polymorphism (SNP) itself is unlikely to directly affect CETP activity; rather, it is linked to the nonsynonymous CETP SNPs Ala373Pro and Arg451Gln [7–9]. Here, we demonstrate that these two CETP variations are associated with VTE and low plasma HDL levels in males.
Abstract:
BACKGROUND Endometriosis is a polygenic disease with a complex and multifactorial aetiology that affects 8–10% of women of reproductive age. Epidemiological data support a link between endometriosis and cancers of the reproductive tract. Fibroblast growth factor receptor 2 (FGFR2) has recently been implicated in both endometrial and breast cancer. Our previous studies on endometriosis identified significant linkage to a novel susceptibility locus on chromosome 10q26, and the FGFR2 gene maps within this linkage region. We therefore hypothesized that variation in FGFR2 may contribute to the risk of endometriosis. METHODS We genotyped 13 single nucleotide polymorphisms (SNPs) densely covering a 27 kb region within intron 2 of FGFR2, including two SNPs (rs2981582 and rs1219648) significantly associated with breast cancer, and a total of 40 tagSNPs across 150 kb of the FGFR2 gene. SNPs were genotyped in 958 endometriosis cases and 959 unrelated controls. RESULTS We found no evidence for association between endometriosis and FGFR2 intron 2 SNPs or SNP haplotypes and no evidence for association between endometriosis and variation across the FGFR2 gene. CONCLUSIONS Common variation in the breast-cancer implicated intron 2 and other highly plausible causative candidate regions of FGFR2 does not appear to be a major contributor to endometriosis susceptibility in our large Australian sample.
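The abstract does not state which association test was used. As a rough illustration of how a single SNP can be tested in a case-control sample, here is a minimal Python sketch of a basic allelic test (odds ratio plus a 1-degree-of-freedom chi-square on a 2x2 table of allele counts). The function name and the allele counts are hypothetical; only the sample sizes (958 cases, 959 controls, hence 1916 and 1918 alleles) come from the text.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Allelic odds ratio, Pearson chi-square (1 df) and p-value for a 2x2 allele-count table."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts: 958 cases (1916 alleles) and 959 controls (1918 alleles)
# with near-identical minor-allele frequencies, i.e. no evidence of association.
odds_ratio, chi2, p_value = allelic_association(400, 1516, 400, 1518)
print(f"OR = {odds_ratio:.3f}, chi2 = {chi2:.4f}, p = {p_value:.3f}")
```

A real analysis would typically also check Hardy-Weinberg equilibrium, consider genotypic and haplotype tests, and correct for the number of SNPs tested.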
Abstract:
Several studies have demonstrated an association between polycystic ovary syndrome (PCOS) and the dinucleotide repeat microsatellite marker D19S884, which is located in intron 55 of the fibrillin-3 (FBN3) gene. Fibrillins, including FBN1 and 2, interact with latent transforming growth factor (TGF)-β-binding proteins (LTBP) and thereby control the bioactivity of TGFβs. TGFβs stimulate fibroblast replication and collagen production. The PCOS ovarian phenotype includes increased stromal collagen and expansion of the ovarian cortex, features feasibly influenced by abnormal fibrillin expression. To examine a possible role of fibrillins in PCOS, particularly FBN3, we undertook tagging and functional single nucleotide polymorphism (SNP) analysis (32 SNPs including 10 that generate non-synonymous amino acid changes) using DNA from 173 PCOS patients and 194 controls. No SNP showed a significant association with PCOS and alleles of most SNPs showed almost identical population frequencies between PCOS and control subjects. No significant differences were observed for microsatellite D19S884. In human PCO stroma/cortex (n = 4) and non-PCO ovarian stroma (n = 9), follicles (n = 3) and corpora lutea (n = 3) and in human ovarian cancer cell lines (KGN, SKOV-3, OVCAR-3, OVCAR-5), FBN1 mRNA levels were approximately 100 times greater than FBN2 and 200–1000-fold greater than FBN3. Expression of LTBP-1 mRNA was 3-fold greater than LTBP-2. We conclude that FBN3 appears to have little involvement in PCOS but cannot rule out that other markers in the region of chromosome 19p13.2 are associated with PCOS or that FBN3 expression occurs in other organs and that this may be influencing the PCOS phenotype.
Abstract:
Microbial pollution in water periodically affects human health in Australia, particularly in times of drought and flood. There is an increasing need for the control of waterborne microbial pathogens. Methods that allow the origin of faecal contamination in water to be determined are generally referred to as Microbial Source Tracking (MST). Various approaches have been evaluated as indicators of microbial pathogens in water samples, including detection of different microorganisms and various host-specific markers. However, to date there have been no universal MST methods that can reliably determine the source (human or animal) of faecal contamination. Therefore, the use of multiple approaches is frequently advised. MST is currently recognised as a research tool, rather than something to be included in routine practices. The main focus of this research was to develop novel and universally applicable methods to meet the demand for MST methods in routine testing of water samples. Escherichia coli was chosen initially as the object organism for our studies as, historically and globally, it is the standard indicator of microbial contamination in water. In this thesis, three approaches are described: single nucleotide polymorphism (SNP) genotyping, clustered regularly interspaced short palindromic repeats (CRISPR) screening using high resolution melt analysis (HRMA) methods, and phage detection development based on CRISPR types. The advantage of combining SNP genotyping and CRISPR genes is discussed in this study. For the first time, a highly discriminatory single nucleotide polymorphism interrogation of the E. coli population was applied to identify host-specific clusters. Six human-specific and one animal-specific SNP profiles were revealed. SNP genotyping was successfully applied in field investigations of the Coomera watershed, South-East Queensland, Australia.
Four human-specific profiles ([11], [29], [32] and [45]) and one animal-specific SNP profile ([7]) were detected in water. Two human-specific profiles, [29] and [11], were found to be prevalent in the samples over a period of years. Rainfall (24 and 72 hours), tide height and time, general land use (rural, suburban), season, distance from the river mouth and salinity showed a lack of relationship with the diversity of SNP profiles present in the Coomera watershed (p values > 0.05). Nevertheless, the SNP genotyping method is able to identify and distinguish between human- and non-human-specific E. coli isolates in water sources within one day. In some samples, only mixed profiles were detected. To further investigate host-specificity in these mixed profiles, a CRISPR screening protocol was developed for use on the set of E. coli previously analysed for SNP profiles. CRISPR loci, which record previous attacks by DNA coliphages, were considered to be a promising tool for detecting host-specific markers in E. coli. Spacers in CRISPR loci could also reveal the dynamics of virulence in E. coli as well as in other pathogens in water. Although host-specificity was not observed in the set of E. coli analysed, CRISPR alleles were shown to be useful in detecting the geographical site of sources. HRMA allows determination of ‘different’ and ‘same’ CRISPR alleles and can be introduced in water monitoring as a cost-effective and rapid method. Overall, we show that the identified human-specific SNP profiles [11], [29], [32] and [45] can be useful globally as marker genotypes for identifying human faecal contamination in water. The SNP typing approach developed in the current study can be used in water monitoring laboratories as an inexpensive, high-throughput and easily adapted protocol. A unique approach based on E. coli spacers was developed to search for unknown phage and to examine host-specificity in phage sequences.
Preliminary experiments on the recombinant plasmids showed that this method can be used to recover phage sequences. Future studies will determine the host-specificity of DNA phage genotyping as soon as the first reliable sequences can be acquired. Undoubtedly, only the application of multiple approaches in MST will allow the character of microbial contamination to be identified with higher confidence and reliability.
Abstract:
The major limitation of current typing methods for Streptococcus pyogenes, such as emm sequence typing and T typing, is that these are based on regions subject to considerable selective pressure. Multilocus sequence typing (MLST) is a better indicator of the genetic backbone of a strain but is not widely used due to high costs. The objective of this study was to develop a robust and cost-effective alternative to S. pyogenes MLST. A 10-member single nucleotide polymorphism (SNP) set that provides a Simpson’s Index of Diversity (D) of 0.99 with respect to the S. pyogenes MLST database was derived. A typing format involving high-resolution melting (HRM) analysis of small fragments nucleated by each of the resolution-optimized SNPs was developed. The fragments were 59–119 bp in size and, based on differences in G+C content, were predicted to generate three to six resolvable HRM curves. The combination of curves across each of the 10 fragments can be used to generate a melt type (MelT) for each sequence type (ST). The 525 STs currently in the S. pyogenes MLST database are predicted to resolve into 298 distinct MelTs and the method is calculated to provide a D of 0.996 against the MLST database. The MelTs are concordant with the S. pyogenes population structure. To validate the method we examined clinical isolates of S. pyogenes of 70 STs. Curves were generated as predicted by G+C content discriminating the 70 STs into 65 distinct MelTs.
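For readers unfamiliar with the diversity statistic quoted above, the following Python sketch computes Simpson's Index of Diversity, assuming the Hunter-Gaston form commonly used for typing methods, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)); the abstract does not say which form was used. It also shows how per-fragment curve calls could be combined into a melt type. The function name and the curve profiles are hypothetical, with three fragments rather than the ten used in the study.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Hunter-Gaston form of Simpson's index: the probability that two
    items drawn without replacement belong to different types."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two items")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Hypothetical HRM curve calls per sequence type (ST), one integer per fragment.
curve_profiles = {
    "ST1": (1, 2, 1),
    "ST2": (1, 2, 1),  # same curves as ST1, so the two STs share a melt type
    "ST3": (2, 2, 1),
    "ST4": (1, 3, 2),
}

# A melt type (MelT) is the combination of curves across all fragments.
melt_types = list(curve_profiles.values())
distinct_melt_types = len(set(melt_types))
diversity = simpsons_index_of_diversity(melt_types)
print(distinct_melt_types, round(diversity, 4))
```

In this toy case four STs collapse into three melt types, which lowers D below 1; the same calculation over all STs in a database gives the resolving power of the SNP set.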
Abstract:
This study used next generation sequencing technologies to investigate growth in a cultured crustacean. The objective was to identify and characterise specific gene loci that contribute important phenotypic variation to growth as well as to develop a large set of SNP markers in candidate genes for assessing correlations between specific mutations and individual growth performance. The genomic dataset generated provides a fundamental resource for application in future crustacean stock improvement programs. Ultimately, the data can be applied to development of culture lines with improved growth performance.
Abstract:
We employed a Hidden-Markov-Model (HMM) algorithm in loss of heterozygosity (LOH) analysis of high-density single nucleotide polymorphism (SNP) array data from the Non-Hodgkin’s lymphoma (NHL) entities follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL). This revealed a high frequency of LOH over the chromosomal region 11p11.2, containing the gene encoding the protein tyrosine phosphatase receptor type J (PTPRJ). Although PTPRJ regulates components of key survival pathways in B-cells (i.e., BCR, MAPK, and PI3K signaling), its role in B-cell development is poorly understood. LOH of PTPRJ has been described in several types of cancer but not in any hematological malignancy. Interestingly, FL cases with LOH exhibited down-regulation of PTPRJ; in contrast, no significant variation in expression was shown in DLBCLs. In addition, sequence screening in Exons 5 and 13 of PTPRJ identified the G973A (rs2270993), T1054C (rs2270992), A1182C (rs1566734), and G2971C (rs4752904) coding SNPs (cSNPs). The A1182 allele was significantly more frequent in FLs and in NHLs with LOH. Significant over-representation of the C1054 (rs2270992) and C2971 (rs4752904) alleles was also observed in LOH cases. A haplotype analysis also revealed a significantly lower frequency of haplotype GTCG in NHL cases, but it was only detected in cases with retention. Conversely, haplotype GCAC was over-represented in cases with LOH. Altogether, these results indicate that the inactivation of PTPRJ may be a common lymphomagenic mechanism in these NHL subtypes and that haplotypes in the PTPRJ gene may play a role in susceptibility to NHL by affecting activation of PTPRJ in these B-cell lymphomas.
Abstract:
Multiple sclerosis (MS) is a common chronic inflammatory disease of the central nervous system. Susceptibility to the disease is affected by both environmental and genetic factors. Genetic factors include haplotypes in the major histocompatibility complex (MHC) and over 50 non-MHC loci reported by genome-wide association studies. Amongst these, we previously reported polymorphisms in chromosome 12q13-14 with a protective effect in individuals of European descent. This locus spans 288 kb and contains 17 genes, including several candidate genes which have potentially significant pathogenic and therapeutic implications. In this study, we aimed to fine-map this locus. We implemented a two-phase study: a variant discovery phase, in which we used next-generation sequencing and two target-enrichment strategies [long-range polymerase chain reaction (PCR) and Nimblegen's solution phase hybridization capture] in pools of 25 samples; and a genotyping phase, in which we genotyped 712 variants in 3577 healthy controls and 3269 MS patients. This study confirmed the association (rs2069502, P = 9.9 × 10⁻¹¹, OR = 0.787) and narrowed down the locus of association to an 86.5 kb region. Although the study was unable to pinpoint the key associated variant, we identified a haplotype block of 42 single-nucleotide polymorphisms (genotyped and imputed) likely to harbour the causal variant. No evidence of association at previously reported low-frequency variants in CYP27B1 was observed. As part of the study we compared variant discovery performance using the two target-enrichment strategies. We concluded that our pools enriched with Nimblegen's solution phase hybridization capture had better sensitivity to detect true variants than the pools enriched with long-range PCR, whilst specificity was better in the long-range PCR-enriched pools than in the solution phase hybridization capture-enriched pools; this result has important implications for the design of future fine-mapping studies.
Abstract:
Many primary immunodeficiency disorders of differing etiologies have been well characterized, and much understanding of immunological processes has been gained by investigating the mechanisms of disease. Here, we have used a whole-genome approach, employing single-nucleotide polymorphism and gene expression microarrays, to provide insight into the molecular etiology of a novel immunodeficiency disorder. Using DNA copy number profiling, we define a hyperploid region on 14q11.2 in the immunodeficiency case associated with the interleukin (IL)-25 locus. This alteration was associated with significantly heightened expression of IL25 following T-cell activation. An associated dominant type 2 helper T cell bias in the immunodeficiency case provides a mechanistic explanation for recurrence of infections by pathogens met by Th1-driven responses. Furthermore, this highlights the capacity of IL25 to alter normal human immune responses.
Abstract:
Objective: To perform a 1-stage meta-analysis of genome-wide association studies (GWAS) of multiple sclerosis (MS) susceptibility and to explore functional consequences of new susceptibility loci. Methods: We synthesized 7 MS GWAS. Each data set was imputed using HapMap phase II, and a per single nucleotide polymorphism (SNP) meta-analysis was performed across the 7 data sets. We explored RNA expression data using a quantitative trait analysis in peripheral blood mononuclear cells (PBMCs) of 228 subjects with demyelinating disease. Results: We meta-analyzed 2,529,394 unique SNPs in 5,545 cases and 12,153 controls. We identified 3 novel susceptibility alleles: rs170934T at 3p24.1 (odds ratio [OR], 1.17; p = 1.6 × 10⁻⁸) near EOMES, rs2150702G in the second intron of MLANA on chromosome 9p24.1 (OR, 1.16; p = 3.3 × 10⁻⁸), and rs6718520A in an intergenic region on chromosome 2p21, with THADA as the nearest flanking gene (OR, 1.17; p = 3.4 × 10⁻⁸). The 3 new loci do not have a strong cis effect on RNA expression in PBMCs. Ten other susceptibility loci had a suggestive p < 1 × 10⁻⁶; some of these loci have evidence of association in other inflammatory diseases (i.e., IL12B, TAGAP, PLEK, and ZMIZ1). Interpretation: We have performed a meta-analysis of GWAS in MS that more than doubles the size of previous gene discovery efforts and highlights 3 novel MS susceptibility loci. These and additional loci with suggestive evidence of association are excellent candidates for further investigations to refine and validate their role in the genetic architecture of MS.
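The abstract does not describe the statistical model behind the per-SNP meta-analysis. One common way to combine a single SNP's effect across data sets is inverse-variance fixed-effect pooling of log odds ratios, sketched below in Python. This is a generic illustration, not necessarily the authors' method; the function name and the per-study estimates are hypothetical.

```python
import math


def fixed_effect_meta(log_odds_ratios, standard_errors):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios.

    Returns the pooled odds ratio, the standard error of the pooled
    log odds ratio, and a two-sided p-value from a normal approximation.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    z = pooled_log_or / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return math.exp(pooled_log_or), pooled_se, p_value


# Hypothetical per-study estimates for one SNP across 7 data sets.
study_log_ors = [math.log(x) for x in (1.20, 1.12, 1.18, 1.25, 1.10, 1.16, 1.19)]
study_ses = [0.08, 0.10, 0.07, 0.12, 0.09, 0.08, 0.11]

pooled_or, pooled_se, p_value = fixed_effect_meta(study_log_ors, study_ses)
print(f"pooled OR = {pooled_or:.3f}, SE = {pooled_se:.4f}, p = {p_value:.2e}")
```

Larger, more precise studies receive more weight, and the pooled standard error shrinks as studies are added, which is why pooling can push modest per-study signals past a genome-wide significance threshold.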
Abstract:
Recent association studies in multiple sclerosis (MS) have identified and replicated several single nucleotide polymorphism (SNP) susceptibility loci including CLEC16A, IL2RA, IL7R, RPL5, CD58, CD40 and chromosome 12q13–14 in addition to the well established allele HLA-DR15. There is potential that these genetic susceptibility factors could also modulate MS disease severity, as demonstrated previously for the MS risk allele HLA-DR15. We investigated this hypothesis in a cohort of 1006 well characterised MS patients from South-Eastern Australia. We tested the MS-associated SNPs for association with five measures of disease severity incorporating disability, age of onset, cognition and brain atrophy. We observed trends towards association between the RPL5 risk SNP and time between first demyelinating event and relapse, and between the CD40 risk SNP and symbol digit test score. No associations were significant after correction for multiple testing. We found no evidence for the hypothesis that these new MS disease risk-associated SNPs influence disease severity.
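The abstract does not say which multiple-testing correction was applied. As a minimal illustration of why nominal trends can disappear after correction, the Python sketch below applies a Bonferroni adjustment, chosen here only because it is the simplest such method. The function name and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: multiply each p-value by the number of tests (capped at 1)
    and flag those that remain below alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical nominal p-values for three SNP-by-severity tests.
nominal = [0.02, 0.04, 0.30]
results = bonferroni(nominal)
for p, (p_adj, significant) in zip(nominal, results):
    print(f"nominal p = {p:.2f}, adjusted p = {p_adj:.2f}, significant = {significant}")
```

With many SNPs tested against five severity measures, the number of tests grows quickly, so a nominal p of 0.02 is weak evidence on its own.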