147 results for Whole Genome Sequences
Abstract:
Background: Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms that make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and on data from a recently published genome-wide association study. Results: In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no-call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods that ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics. Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus, on the other hand, requires a larger number of samples to achieve comparable levels of accuracy, and its use in smaller studies (50 or fewer individuals) is not recommended.
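As a rough illustration of the step described above (turning allele A and allele B intensities into AA/AB/BB calls), the following is a minimal sketch of a naive threshold caller. It is not GenCall, Illuminus, GenoSNP or CRLMM, all of which fit statistical models; the function name, the thresholds and the no-call rule are hypothetical choices made only for this example.

```python
import math


def call_genotype(a, b, min_intensity=0.2, aa_max=0.25, bb_min=0.75):
    """Naive genotype call from non-negative allele A and B intensities.

    All thresholds are illustrative assumptions, not values from any
    of the four methods compared in the abstract.
    """
    # Total signal: very weak signals are left uncalled ("NC" = no call).
    if a + b < min_intensity:
        return "NC"
    # Map the intensity pair to an angle in [0, 1]:
    # 0 means pure A signal, 1 means pure B signal.
    theta = (2.0 / math.pi) * math.atan2(b, a)
    if theta <= aa_max:
        return "AA"
    if theta >= bb_min:
        return "BB"
    return "AB"


if __name__ == "__main__":
    # Hypothetical intensity pairs for one SNP across four samples.
    for a, b in [(1.0, 0.05), (1.0, 1.0), (0.05, 1.0), (0.05, 0.05)]:
        print(a, b, call_genotype(a, b))
```

Fixed thresholds like these ignore SNP-specific cluster positions and sample size, which is exactly where the model-based methods in the comparison differ from one another.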
Abstract:
Linkage disequilibrium (LD) mapping is commonly used as a fine mapping tool in human genome mapping and has been used with some success for initial disease gene isolation in certain isolated in-bred human populations. An understanding of the population history of domestic dog breeds suggests that LD mapping could be routinely utilized in this species for initial genome-wide scans. Such an approach offers significant advantages over traditional linkage analysis. Here, we demonstrate, using canine copper toxicosis in the Bedlington terrier as the model, that LD mapping could be reasonably expected to be a useful strategy in low-resolution, genome-wide scans in pure-bred dogs. Significant LD was demonstrated over distances up to 33.3 cM. It is very unlikely, for a number of reasons discussed, that this result could be extrapolated to the rest of the genome. It is, however, consistent with the expectation given the population structure of canine breeds and, in this breed at least, with the hypothesis that it may be possible to utilize LD in a genome-wide scan. In this study, LD mapping confirmed the location of the copper toxicosis in Bedlington terrier gene (CT-BT) and was able to do so in a population that was refractory to traditional linkage analysis.
Abstract:
Susceptibility to complex traits, by definition, involves aetiological polymorphisms at multiple genetic loci combined with variable contributions by environmental factors. However, the approaches taken to identifying genetic loci implicated in susceptibility to complex traits frequently overlook the compounding contribution of multiple loci in favour of highlighting a single gene solely responsible for predisposition. It is only in a small minority of cases that this has resulted in clear disease heritability associated with polymorphisms in a single gene. More often, this approach has led to an accumulation of single-gene associations with minor contributions to disease susceptibility. As the genomic era advances and genome-wide screens become higher in resolution and throughput, the need for simultaneous consideration of multiple loci is becoming more important. With special reference to non-Hodgkin’s lymphoma (NHL), this chapter gives an overview of the current progress made in elucidating genetic polymorphisms associated with disease susceptibility. We also present novel data from a high-resolution single nucleotide polymorphism (SNP) microarray screen for susceptibility loci that are involved in NHL. Using an ‘informed approach’, the findings are highlighted within the context of cellular pathways, and provide insight and new ideas for methods of analysis for genome-wide screens for susceptibility.
Abstract:
Melanoma has historically been refractory to traditional therapeutic approaches. As such, the development of novel drug strategies has been needed to improve rates of overall survival in patients with melanoma, particularly those with late stage or disseminated disease. Recent success with molecularly based targeted drugs, such as Vemurafenib in BRAF-mutant melanomas, has now made “personalized medicine” a reality within some oncology clinics. In this sense, tailored drugs can be administered to patients according to their tumor “mutation profiles.” The success of these drug strategies, in part, can be attributed to the identification of the genetic mechanisms responsible for the development and progression of metastatic melanoma. Recently, advances in sequencing technology have allowed for comprehensive mutation analysis of tumors and have led to the identification of a number of genes involved in the etiology of metastatic melanoma. As the methodology and costs associated with next-generation sequencing continue to improve, this technology will be rapidly adopted into routine clinical oncology practices and will significantly impact personalized therapy. This review summarizes current and emerging molecular targets in metastatic melanoma, discusses the potential application of next-generation sequencing within the paradigm of personalized medicine, and describes the current limitations for the adoption of this technology within the clinic.
Abstract:
Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognize in the hybrid F2 progeny, or dependent on parental specific traits. Here, in a screen for leaf hyponasty mutants, we have performed a single backcross of an ethyl methanesulfonate (EMS) generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphism (SNP) residing in HASTY, a previously characterized gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines: SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.
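The filtering logic behind this kind of bulked-mutant mapping can be sketched roughly as follows. This is not the NGM, SHOREmap, GATK or samtools pipeline; it only illustrates two generally known ideas, namely that EMS predominantly induces G/C to A/T transitions and that a recessive causal mutation should be close to fixed in a pool of homozygous mutant F2 plants. The record layout, thresholds and example values are hypothetical.

```python
# Canonical EMS-type changes as seen on the reference strand:
# G>A, and C>T (the same event read from the opposite strand).
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def candidate_ems_snps(snps, min_alt_fraction=0.9, min_depth=10):
    """Keep EMS-type SNPs whose alternate allele is near fixation in the pool.

    `snps` is a list of dicts with keys: chrom, pos, ref, alt,
    ref_reads, alt_reads (a hypothetical layout for this sketch).
    """
    candidates = []
    for snp in snps:
        depth = snp["ref_reads"] + snp["alt_reads"]
        if depth < min_depth:
            continue  # too few reads to estimate the allele fraction
        if (snp["ref"], snp["alt"]) not in EMS_TRANSITIONS:
            continue  # not a typical EMS-induced change
        if snp["alt_reads"] / depth >= min_alt_fraction:
            candidates.append(snp)
    return candidates


if __name__ == "__main__":
    # Made-up example records, not data from the study.
    pool = [
        {"chrom": "3", "pos": 1200, "ref": "G", "alt": "A", "ref_reads": 0, "alt_reads": 40},
        {"chrom": "3", "pos": 9000, "ref": "C", "alt": "T", "ref_reads": 20, "alt_reads": 20},
        {"chrom": "1", "pos": 500, "ref": "A", "alt": "C", "ref_reads": 1, "alt_reads": 39},
    ]
    for snp in candidate_ems_snps(pool):
        print(snp["chrom"], snp["pos"], snp["ref"], snp["alt"])
```

In practice the surviving candidates would still need annotation (for example, whether they change a coding sequence) before one of them is named as causal.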
Abstract:
Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world’s poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high coverage (16–45×) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.
Abstract:
Metastasis accounts for the poor prognosis of the majority of solid tumors. The phenotypic transition of nonmotile epithelial tumor cells to migratory and invasive “mesenchymal” cells (epithelial-to-mesenchymal transition [EMT]) enables the transit of cancer cells from the primary tumor to distant sites. There is no single marker of EMT; rather, multiple measures are required to define cell state. Thus, the multiparametric capability of high-content screening is ideally suited for the comprehensive analysis of EMT regulators. The aim of this study was to generate a platform to systematically identify functional modulators of tumor cell plasticity using the bladder cancer cell line TSU-Pr1-B1 as a model system. A platform enabling the quantification of key EMT characteristics, cell morphology and mesenchymal intermediate filament vimentin, was developed using the fluorescent whole-cell-tracking reagent CMFDA and a fluorescent promoter reporter construct, respectively. The functional effect of genome-wide modulation of protein-coding genes and miRNAs coupled with those of a collection of small-molecule kinase inhibitors on EMT was assessed using the Target Activation Bioapplication integrated in the Cellomics ArrayScan platform. Data from each of the three screens were integrated to identify a cohort of targets that were subsequently examined in a validation assay using siRNA duplexes. Identification of established regulators of EMT supports the utility of this screening approach and indicated capacity to identify novel regulators of this plasticity program. Pathway analysis coupled with interrogation of cancer-related expression profile databases and other EMT-related screens provided key evidence to prioritize further experimental investigation into the molecular regulators of EMT in cancer cells.
Abstract:
Ankylosing spondylitis (AS) is a common inflammatory arthritis predominantly affecting the axial skeleton. Susceptibility to the disease is thought to be oligogenic. To identify the genes involved, we have performed a genomewide scan in 185 families containing 255 affected sibling pairs. Two-point and multipoint nonparametric linkage analysis was performed. Regions were identified showing "suggestive" or stronger linkage with the disease on chromosomes 1p, 2q, 6p, 9q, 10q, 16q, and 19q. The MHC locus was identified as encoding the greatest component of susceptibility, with an overall LOD score of 15.6. The strongest non-MHC linkage lies on chromosome 16q (overall LOD score 4.7). These results strongly support the presence of non-MHC genetic-susceptibility factors in AS and point to their likely locations.
Abstract:
We report here the genome sequences of two alphabaculoviruses of Helicoverpa spp. from Australia: AC53, used in the biopesticides ViVUS and ViVUS Max, and H25EA1, used in in vitro production studies.
Abstract:
Objective. To undertake a systematic whole-genome screen to identify regions exhibiting genetic linkage to rheumatoid arthritis (RA). Methods. Two hundred fifty-two RA-affected sibling pairs from 182 UK families were genotyped using 365 highly informative microsatellite markers. Microsatellite genotyping was performed using fluorescent polymerase chain reaction primers and semiautomated DNA sequencing technology. Linkage analysis was undertaken using MAPMAKER/SIBS for single-point and multipoint analysis. Results. Significant linkage (maximum logarithm of odds score 4.7 [P = 0.000003] at marker D6S276, 1 cM from HLA-DRB1) was identified around the major histocompatibility complex (MHC) region on chromosome 6. Suggestive linkage (P < 7.4 × 10⁻⁴) was identified on chromosome 6q by single- and multipoint analysis. Ten other sites of nominal linkage (P < 0.05) were identified on chromosomes 3p, 4q, 7p, 2 regions of 10q, 2 regions of 14q, 16p, 21q, and Xq by single-point analysis and on 3 sites (1q, 14q, and 14q) by multipoint analysis. Conclusion. Linkage to the MHC region was confirmed. Eleven non-HLA regions demonstrated evidence of suggestive or nominal linkage, but none reached the genome-wide threshold for significant linkage (P = 2.2 × 10⁻⁵). Results of previous genome screens have suggested that 6 of these regions may be involved in RA susceptibility.
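To make the relationship between a LOD score and a pointwise P value more concrete, here is a minimal sketch using the common asymptotic approximation in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom. The abstract does not state how its P values were derived, so this approximation and the function names are assumptions for illustration only.

```python
import math


def lod_to_chi2(lod):
    """Convert a LOD score (log10 likelihood ratio) to a chi-square statistic."""
    return 2.0 * math.log(10) * lod


def lod_to_pointwise_p(lod, one_sided=True):
    """Approximate pointwise P value for a LOD score.

    Assumes a chi-square distribution with 1 degree of freedom, whose
    upper-tail probability is erfc(sqrt(x / 2)). Linkage tests are often
    treated as one-sided, which halves that tail probability.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative for this approximation")
    tail = math.erfc(math.sqrt(lod_to_chi2(lod) / 2.0))
    return 0.5 * tail if one_sided else tail


if __name__ == "__main__":
    for lod in (1.0, 3.0, 4.7):
        print(lod, lod_to_pointwise_p(lod), lod_to_pointwise_p(lod, one_sided=False))
```

Whether the one-sided or two-sided figure is appropriate depends on the test statistic used by the linkage software, so reported values should be taken from the analysis itself rather than recomputed this way.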
Abstract:
Debates on gene patents have necessitated the analysis of patents that disclose and reference human sequences. In this study, we built an automated classifier that assigns sequences to one of nine predefined categories according to their functional roles in patent claims by applying natural language processing and supervised learning techniques. To improve its correctness, we experimented with various feature mappings, resulting in the maximal accuracy of 79%.
Abstract:
Objective: Ankylosing spondylitis (AS) is a debilitating chronic inflammatory condition with a high degree of familiality (λs=82) and heritability (>90%) that primarily affects spinal and sacroiliac joints. Whole genome scans for linkage to AS phenotypes have been conducted, although results have been inconsistent between studies and all have had modest sample sizes. One potential solution to these issues is to combine data from multiple studies in a retrospective meta-analysis. Methods: The International Genetics of Ankylosing Spondylitis Consortium combined data from three whole genome linkage scans for AS (n=3744 subjects) to determine chromosomal markers that show evidence of linkage with disease. Linkage markers typed in different centres were integrated into a consensus map to facilitate effective data pooling. We performed a weighted meta-analysis to combine the linkage results, and compared them with the three individual scans and a combined pooled scan. Results: In addition to the expected region surrounding the HLA-B27 gene on chromosome 6, we determined that several marker regions showed significant evidence of linkage with disease status. Regions on chromosomes 10q and 16q achieved 'suggestive' evidence of linkage, and regions on chromosomes 1q, 3q, 5q, 6q, 9q, 17q and 19q showed at least nominal linkage in two or more scans and in the weighted meta-analysis. Regions previously associated with AS on chromosome 2q (the IL-1 gene cluster) and 22q (CYP2D6) exhibited nominal linkage in the meta-analysis, providing further statistical support for their involvement in susceptibility to AS. Conclusion: These findings provide a useful guide for future studies aiming to identify the genes involved in this highly heritable condition. Published by on behalf of the British Society for Rheumatology.
Abstract:
Genotyping in DNA pools reduces the cost and the time required to complete large genotyping projects. The aim of the present study was to evaluate pooling as part of a strategy for fine mapping in regions of significant linkage. Thirty-nine single nucleotide polymorphisms (SNPs) were analyzed in two genomic DNA pools of 384 individuals each, and the results were compared with data obtained by typing all individuals used in the pools. There were no significant differences between using data from 2 or 8 heterozygous individuals to correct frequency estimates for unequal allelic amplification. After correction, the mean difference between estimates from the genomic pool and individual allele frequencies was 0.033. A major limitation of the use of DNA pools is the time and effort required to carefully adjust the concentration of each individual DNA sample before mixing aliquots. Pools were also constructed by combining DNA after Multiple Displacement Amplification (MDA). The MDA pools gave similar results to pools constructed after careful DNA quantitation (mean difference from individual genotyping 0.040), and MDA provides a rapid method to generate pools suitable for some applications. Pools provide a rapid and cost-effective screen to eliminate SNPs that are not polymorphic in a test population and can detect minor allele frequencies as low as 1% in the pooled samples. With current levels of accuracy, pooling is best suited to an initial screen in the SNP validation process that can provide high-throughput comparisons between cases and controls to prioritize SNPs for subsequent individual genotyping.
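The correction for unequal allelic amplification mentioned above can be sketched as follows. This uses a commonly applied approach in which the mean allele A to allele B signal ratio in heterozygous individuals (where the true ratio is 1:1) gives a factor k, and the pooled frequency of allele A is estimated as A / (A + k·B). The abstract does not give the exact formula used in the study, so the formula, function names and example signals are assumptions for illustration.

```python
def amplification_factor(het_signals):
    """Mean A/B signal ratio across heterozygous individuals.

    `het_signals` is a list of (signal_a, signal_b) pairs. In a true
    heterozygote the alleles are present 1:1, so any departure of the
    ratio from 1 reflects unequal amplification or detection.
    """
    ratios = [a / b for a, b in het_signals]
    return sum(ratios) / len(ratios)


def corrected_pool_frequency(pool_a, pool_b, k):
    """Estimate the frequency of allele A in a pool, corrected by factor k."""
    return pool_a / (pool_a + k * pool_b)


if __name__ == "__main__":
    # Hypothetical signals: allele A amplifies about twice as well as allele B.
    k = amplification_factor([(2.0, 1.0), (3.0, 1.5)])
    raw = 2.0 / (2.0 + 1.0)
    corrected = corrected_pool_frequency(2.0, 1.0, k)
    print("k =", k, "raw =", raw, "corrected =", corrected)
```

With few heterozygotes the estimate of k is noisy, which is why the study's comparison of 2 versus 8 heterozygous individuals is informative.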
Abstract:
The extent to which low-frequency (minor allele frequency (MAF) between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (ntotal = 53,236) and fracture (ntotal = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., Pmeta = 2 × 10⁻¹⁴), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10⁻¹¹; ncases = 98,742 and ncontrols = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., Pmeta = 1 × 10⁻¹¹). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.