987 results for Genomic selection


Relevance: 40.00%

Abstract:

Background: The increasing number of bacterial genome sequences makes it possible to select SNPs that are unique to a particular strain or species at the whole-genome level and to design specific primers based on those SNPs. Because genomic sequences of phylogenetically related bacteria are highly similar, the few loci in the genome that can serve as unique markers for strain differentiation must be identified. PrimerSNP attempts to identify reliable strain-specific markers, on which specific primers are designed for pathogen detection. Results: PrimerSNP is an online tool for designing primers based on strain-specific SNPs for multiple strains/species of microorganisms at the whole-genome level. The allele-specific primers can distinguish query sequences of one strain from other homologous sequences by a standard PCR reaction. Additionally, PrimerSNP provides a feature for designing common primers that can amplify all the homologous sequences of multiple strains/species of microorganisms. PrimerSNP is freely available at http://cropdisease.ars.usda.gov/~primer. Conclusion: PrimerSNP is a high-throughput specific primer generation tool for the differentiation of phylogenetically related strains/species. Experimental validation showed that this software had a successful prediction rate of 80.4-100% for strain-specific primer design.
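The idea of a strain-specific SNP can be illustrated with a minimal Python sketch. It is not PrimerSNP's algorithm, which the abstract does not describe; the function name and the toy sequences are hypothetical, and the sequences are assumed to be pre-aligned, of equal length and free of gaps.

```python
from typing import List, Sequence


def strain_specific_snps(target: str, others: Sequence[str]) -> List[int]:
    """Return 0-based alignment positions where the target strain's base
    differs from the base of every other homologous sequence.

    Assumes all sequences are already aligned and have equal length.
    """
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("all sequences must have the same aligned length")
    positions = []
    for i, base in enumerate(target):
        # A position qualifies only if no other strain shares the target base.
        if all(seq[i] != base for seq in others):
            positions.append(i)
    return positions


# Hypothetical toy alignment: only position 4 is unique to the target.
target_strain = "ACGTAC"
other_strains = ["ACGTTC", "ACCTTC"]
unique_positions = strain_specific_snps(target_strain, other_strains)
```

An allele-specific primer would then typically be placed so that its 3' end falls on one of the returned positions; primer design constraints such as melting temperature are outside this sketch.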

Relevance: 40.00%

Abstract:

The bovine species have witnessed and played a major role in the drastic socio-economic changes that shaped our culture over the last 10,000 years. During this journey, cattle hitchhiked on human development and colonized the world, facing strong selective pressures such as dramatic environmental changes and disease challenge. Consequently, hundreds of specialized cattle breeds emerged and spread around the globe, making up a rich spectrum of genomic resources. Their DNA still carries the scars left from adapting to this wide range of conditions, and we are now empowered with data and analytical tools to track the milestones of past selection in their genomes. In this review paper, we provide a summary of the reconstructed demographic events that shaped cattle diversity, offer a critical synthesis of popular methodologies applied to the search for signatures of selection (SS) in genomic data, and give examples of recent SS studies in cattle. We then outline the potential and challenges of applying SS analysis in cattle, and discuss future directions in this field.

Relevance: 40.00%

Abstract:

High-throughput gene expression technologies such as microarrays have been utilized in a variety of scientific applications. Most of the work has been on assessing univariate associations between gene expression and clinical outcome (variable selection) or on developing classification procedures with gene expression data (supervised learning). We consider a hybrid variable selection/classification approach that is based on linear combinations of the gene expression profiles that maximize an accuracy measure summarized using the receiver operating characteristic curve. Under a specific probability model, this leads to consideration of linear discriminant functions. We incorporate an automated variable selection approach using LASSO. An equivalence between LASSO estimation and support vector machines allows for model fitting using standard software. We apply the proposed method to simulated data as well as data from a recently published prostate cancer study.
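The accuracy measure mentioned above, the area under the ROC curve (AUC) of a linear combination of expression values, can be sketched as follows. This is an illustration only, not the authors' fitting procedure: the weight vector is simply assumed (with zeros standing in for genes that a LASSO penalty might drop), and the expression profiles are hypothetical toy data.

```python
from typing import Sequence


def linear_score(weights: Sequence[float], profile: Sequence[float]) -> float:
    """Linear combination of one sample's gene expression values."""
    return sum(w * x for w, x in zip(weights, profile))


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Empirical AUC (Mann-Whitney form): the fraction of case/control
    pairs in which the case scores higher, counting ties as one half."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression profiles (three genes per sample).
cases = [[2.0, 1.0, 0.5], [1.5, 0.8, 0.1], [1.8, 1.2, 0.9]]
controls = [[0.5, 1.1, 0.4], [0.7, 0.9, 0.6], [1.6, 1.0, 0.2]]

# Assumed sparse weights: only the first gene contributes to the score.
weights = [1.0, 0.0, 0.0]

auc_value = auc(
    [linear_score(weights, s) for s in cases],
    [linear_score(weights, s) for s in controls],
)
```

With these toy values, eight of the nine case/control pairs are ordered correctly, so the AUC is 8/9. A method of the kind described in the abstract would search for the weights that make this quantity as large as possible.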

Relevance: 40.00%

Abstract:

Background: Speciation reversal, the erosion of species differentiation via an increase in introgressive hybridization due to the weakening of previously divergent selection regimes, is thought to be an important, yet poorly understood, driver of biodiversity loss. Our study system, the Alpine whitefish (Coregonus spp.) species complex, is a classic example of a recent postglacial adaptive radiation, forming an array of endemic lake flocks with the independent origination of similar ecotypes among flocks. However, many of the lakes of the Alpine radiation have been seriously impacted by anthropogenic nutrient enrichment, resulting in a collapse in neutral genetic and phenotypic differentiation within the most polluted lakes. Here we investigate the effects of eutrophication on the selective forces that have shaped this radiation, using population genomics. We studied eight sympatric species assemblages belonging to five independent parallel adaptive radiations, and one species pair in secondary contact. We used AFLP markers, and applied FST outlier (BAYESCAN, DFDIST) and logistic regression analyses (MATSAM), to identify candidate regions for disruptive selection in the genome and their associations with adaptive traits within each lake flock. The numbers of outlier and adaptive-trait-associated loci identified per lake were then regressed against two variables (historical phosphorus concentration and contemporary oxygen concentration) representing the strength of eutrophication. Results: Whilst we identify disruptive selection candidate regions in all lake flocks, we find similar trends, across analysis methods, towards fewer disruptive selection candidate regions and fewer adaptive trait/candidate loci associations in the more polluted lakes. Conclusions: Weakened disruptive selection and a concomitant breakdown in reproductive isolating mechanisms in more polluted lakes have led to increased gene flow between coexisting Alpine whitefish species.
We hypothesize that the resulting higher rates of interspecific recombination reduce either the number or extent of genomic islands of divergence surrounding loci evolving under disruptive natural selection. This produces the negative trend seen in the number of selection candidate loci recovered during genome scans of whitefish species flocks with increasing levels of anthropogenic eutrophication, because the likelihood decreases that AFLP restriction sites will fall within regions of heightened genomic divergence and therefore be classified as FST outlier loci. This study explores for the first time the potential effects of human-mediated relaxation of disruptive selection on heterogeneous genomic divergence between coexisting species.
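The FST statistic underlying the outlier scans can be illustrated with a minimal Python sketch of Wright's definition, FST = (HT - HS) / HT, for a single biallelic locus. This is not what BAYESCAN or DFDIST compute internally (both use model-based approaches, and AFLP markers are dominant, so allele frequencies must themselves be estimated); equal population sizes are assumed, and the allele frequencies below are hypothetical.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(allele_freqs: Sequence[float]) -> float:
    """Wright's FST for one biallelic locus, given the frequency of one
    allele in each subpopulation (equal subpopulation sizes assumed)."""
    p_bar = sum(allele_freqs) / len(allele_freqs)
    h_total = expected_heterozygosity(p_bar)
    if h_total == 0.0:
        # Locus is fixed for the same allele everywhere: no differentiation.
        return 0.0
    h_within = sum(expected_heterozygosity(p) for p in allele_freqs) / len(allele_freqs)
    return (h_total - h_within) / h_total


# Hypothetical frequencies in two coexisting species.
weakly_differentiated = fst([0.45, 0.55])
strongly_differentiated = fst([0.2, 0.8])
```

An outlier scan flags loci whose FST is unusually high relative to the genome-wide distribution; under the hypothesis above, fewer loci would reach that threshold in the more polluted lakes.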

Relevance: 40.00%

Abstract:

Acknowledgements: This study was funded by a BBSRC studentship (MAW) and NERC grants NE/H00775X/1 and NE/D000602/1 (SBP). The authors are grateful to Mario Röder and Keliya Bai for fieldwork assistance, and all estate owners, factors and keepers for access to field sites, most particularly MJ Taylor and Mike Nisbet (Airlie), Neil Brown (Allargue), RR Gledson and David Scrimgeour (Delnadamph), Andrew Salvesen and John Hay (Dinnet), Stuart Young and Derek Calder (Edinglassie), Kirsty Donald and David Busfield (Glen Dye), Neil Hogbin and Ab Taylor (Glen Muick), Alistair Mitchell (Glenlivet), Simon Blackett, Jim Davidson and Liam Donald (Invercauld), Richard Cooke and Fred Taylor† (Invermark), Shaila Rao and Christopher Murphy (Mar Lodge), and Ralph Peters and Philip Astor (Tillypronie). Data accessibility: Genotype data (DataDryad: doi:10.5061/dryad.4t7jk); metadata (information on sampling sites, phenotypes and medication regimen) (DataDryad: doi:10.5061/dryad.4t7jk).

Relevance: 30.00%

Abstract:

Background: The malaria parasite Plasmodium falciparum exhibits abundant genetic diversity, and this diversity is key to its success as a pathogen. Previous efforts to study genetic diversity in P. falciparum have begun to elucidate the demographic history of the species, as well as patterns of population structure and patterns of linkage disequilibrium within its genome. Such studies will be greatly enhanced by new genomic tools and recent large-scale efforts to map genomic variation. To that end, we have developed a high-throughput single nucleotide polymorphism (SNP) genotyping platform for P. falciparum. Results: Using an Affymetrix 3,000 SNP assay array, we found that roughly half the assays (1,638) yielded high-quality, 100% accurate genotyping calls for both major and minor SNP alleles. Genotype data from 76 global isolates confirm significant genetic differentiation among continental populations and varying levels of SNP diversity and linkage disequilibrium according to geographic location and local epidemiological factors. We further discovered that nonsynonymous and silent (synonymous or noncoding) SNPs differ with respect to within-population diversity, interpopulation differentiation, and the degree to which allele frequencies are correlated between populations. Conclusions: The distinct population profile of nonsynonymous variants indicates that natural selection has a significant influence on genomic diversity in P. falciparum, and that many of these changes may reflect functional variants deserving of follow-up study. Our analysis demonstrates the potential for new high-throughput genotyping technologies to enhance studies of population structure and natural selection, and ultimately to enable genome-wide association studies in P. falciparum to find genes underlying key phenotypic traits.

Relevance: 30.00%

Abstract:

This paper identifies research priorities in evaluating the ways in which "genomic medicine" (the use of genetic information to prevent and treat disease) may reduce tobacco-related harm by: (1) assisting more smokers to quit; (2) preventing non-smokers from beginning to smoke tobacco; and (3) reducing the harm caused by tobacco smoking. The method proposed to achieve the first aim is "pharmacogenetics", the use of genetic information to optimise the selection of smoking-cessation programmes by screening smokers for polymorphisms that predict responses to different methods of smoking cessation. This method competes with the development of more effective forms of smoking cessation that involve vaccinating smokers against the effects of nicotine and using new pharmaceuticals (such as cannabinoid antagonists and nicotine agonists). The second and third aims are more speculative. They include: screening the population for genetic susceptibility to nicotine dependence and intervening (e.g., by vaccinating children and adolescents against the effects of nicotine) to prevent smoking uptake, and screening the population for genetic susceptibility to tobacco-related diseases. A framework is described for future research on these policy options. This includes: epidemiological modelling and economic evaluation to specify the conditions under which these strategies are cost-effective; and social psychological research into the effect of providing genetic information on smokers' preparedness to quit, and the general views of the public on tobacco smoking.

Relevance: 30.00%

Abstract:

The limited ability of common variants to account for the genetic contribution to complex disease has prompted searches for rare variants of large effect, to partly explain the 'missing heritability'. Analyses of genome-wide genotyping data have identified genomic structural variants (GSVs) as a source of such rare causal variants. Recent studies have reported multiple GSV loci associated with risk of obesity. We attempted to replicate these associations by similar analysis of two familial-obesity case-control cohorts and a population cohort, and detected GSVs at 11 out of 18 loci, at frequencies similar to those previously reported. Based on their reported frequencies and effect sizes (OR ≥ 25), we had sufficient statistical power to detect the large majority (80%) of genuine associations at these loci. However, only one obesity association was replicated. Deletion of a 220 kb region on chromosome 16p11.2 has a carrier population frequency of 2×10^-4 (95% confidence interval [9.6×10^-5 to 3.1×10^-4]); accounts overall for 0.5% [0.19%-0.82%] of severe childhood obesity cases (P = 3.8×10^-10; odds ratio = 25.0 [9.9-60.6]); and results in a mean body mass index (BMI) increase of 5.8 kg/m² [1.8-10.3] in adults from the general population. We also attempted replication using BMI as a quantitative trait in our population cohort; associations with BMI at or near nominal significance were detected at two further loci near KIF2B and within FOXP2, but these did not survive correction for multiple testing. These findings emphasise several issues of importance when conducting rare GSV association studies, including the need for careful cohort selection and replication strategy, accurate GSV identification, and appropriate correction for multiple testing and/or control of false discovery rate. Moreover, they highlight the potential difficulty in replicating rare CNV associations across different populations.
Nevertheless, we show that such studies are potentially valuable for the identification of variants making an appreciable contribution to complex disease.
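An odds ratio with a confidence interval, as reported above for the 16p11.2 deletion, can be computed from a 2×2 table of carrier counts. The sketch below uses the standard log-odds (Woolf) approximation; the abstract does not say which method the authors used, and the counts are hypothetical, not taken from the study.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    carrier_cases: int,
    noncarrier_cases: int,
    carrier_controls: int,
    noncarrier_controls: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Odds ratio and approximate 95% CI (Woolf's log-odds method).

    Requires all four cells to be non-zero; with very rare variants an
    exact method or a continuity correction is usually preferable.
    """
    a, b, c, d = carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
or_value, ci_low, ci_high = odds_ratio_ci(10, 90, 5, 95)
```

When several loci are tested, as in the replication attempt described above, each P value would additionally need a multiple-testing correction, for example dividing the significance threshold by the number of loci (Bonferroni).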

Relevance: 30.00%

Abstract:

Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.

Relevance: 30.00%

Abstract:

Three species of flatworms from the genus Echinococcus (E. granulosus, E. multilocularis and E. vogeli) and four strains of E. granulosus (cattle, horse, pig and sheep strains) were analysed by the PCR-SSCP method followed by sequencing, using as targets two non-coding and two coding (one nuclear and one mitochondrial) genomic regions. The sequencing data were used to evaluate hypotheses about the parasite breeding system and the causes of genetic diversification. The calculated recombination parameters suggested that cross-fertilisation was rare in the history of the group. However, the relative rates of substitution in the coding sequences showed that positive selection (instead of purifying selection) drove the evolution of an elastase and neutrophil chemotaxis inhibitor gene (AgB/1). The phylogenetic analyses revealed several ambiguities, indicating that the taxonomic status of the E. granulosus horse strain should be revised.

Relevance: 30.00%

Abstract:

An essential step of the life cycle of retroviruses is the stable insertion of a copy of their DNA genome into the host cell genome, and lentiviruses are no exception. This integration step, catalyzed by the virally encoded integrase, ensures long-term expression of the viral genes, thus allowing productive viral replication and also making retroviral vectors attractive for gene therapy. At the same time, this ability to integrate into the host genome raises safety concerns regarding the use of retroviral-based gene therapy vectors, due to the genomic locations of integration sites. The availability of the human genome sequence made possible the analysis of integration site preferences, which proved to be nonrandom and retrovirus-specific: all lentiviruses studied so far favor integration in active transcription units, while other retroviruses have a different integration site distribution. Several mechanisms have been proposed that may influence integration targeting, including (i) chromatin accessibility, (ii) cell cycle effects, and (iii) tethering proteins. Recent data provide evidence that integration site selection can occur via a tethering mechanism, through the recruitment of the lentiviral integrase by the cellular LEDGF/p75 protein, these two proteins being the major players in lentiviral integration targeting.

Relevance: 30.00%

Abstract:

Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila.

Relevance: 30.00%

Abstract:

The stable insertion of a copy of their genome into the host cell genome is an essential step of the life cycle of retroviruses. The site of viral DNA integration, mediated by the viral-encoded integrase enzyme, has important consequences for both the virus and the host cell. The analysis of retroviral integration site distribution was facilitated by the availability of the human genome sequence, revealing the non-random feature of integration site selection and identifying different favored and disfavored genomic locations for individual retroviruses. This review will summarize the current knowledge about retroviral differences in their integration site preferences as well as the mechanisms involved in this process.

Relevance: 30.00%

Abstract:

Aleppo pine (Pinus halepensis Mill.) is a relevant conifer species for studying adaptive responses to drought and fire regimes in the Mediterranean region. In this study, we performed Illumina next-generation sequencing of two phenotypically divergent Aleppo pine accessions with the aims of (i) characterizing the transcriptome through Illumina RNA-Seq on trees phenotypically divergent for adaptive traits linked to fire adaptation and drought, (ii) performing a functional annotation of the assembled transcriptome, (iii) identifying genes with accelerated evolutionary rates, (iv) studying the expression levels of the annotated genes and (v) developing gene-based markers for population genomic and association genetic studies. The assembled transcriptome consisted of 48,629 contigs and covered about 54.6 Mbp. The comparison of Aleppo pine transcripts to Picea sitchensis protein-coding sequences resulted in the detection of 34,014 SNPs across species, with an average Ka/Ks value of 0.216, suggesting that the majority of the assembled genes are under negative selection. Several genes were differentially expressed across the two pine accessions with contrasting phenotypes, including a glutathione S-transferase, a cellulose synthase and a cobra-like protein. A large number of new markers (3,334 amplifiable SSRs and 28,236 SNPs) have been identified, which should facilitate future population genomics and association genetics in this species. A 384-SNP Oligo Pool Assay for genotyping with the Illumina VeraCode technology has been designed, which showed a high overall SNP conversion rate (76.6%). Our results showed that Illumina next-generation sequencing is a valuable technology for obtaining an extensive overview of whole transcriptomes of nonmodel species with large genomes.
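The reading of the Ka/Ks ratio used above can be sketched in a few lines of Python. Ka/Ks below 1 is conventionally taken to indicate negative (purifying) selection, a value near 1 neutral evolution, and a value above 1 positive selection. The function name and the tolerance band around 1 are assumptions made for illustration; in practice the call would rest on a statistical test rather than a fixed cutoff.

```python
from typing import Tuple


def classify_ka_ks(ka: float, ks: float, tolerance: float = 0.05) -> Tuple[float, str]:
    """Return the Ka/Ks ratio and a conventional interpretation.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    tolerance: assumed half-width of the band around 1 treated as neutral
    """
    if ks <= 0.0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1.0 - tolerance:
        return ratio, "negative (purifying) selection"
    if ratio > 1.0 + tolerance:
        return ratio, "positive selection"
    return ratio, "approximately neutral"


# The average ratio reported in the abstract, expressed with Ks = 1.
average_ratio, interpretation = classify_ka_ks(0.216, 1.0)
```

Genes with accelerated evolutionary rates, aim (iii) above, would be those whose individual ratios are well above this average.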