961 results for GENOME-WIDE DETECTION
Abstract:
A recently emerging bleeding canker disease, caused by Pseudomonas syringae pathovar aesculi (Pae), is threatening European horse chestnut in northwest Europe. Very little is known about the origin and biology of this new disease. We used the nucleotide sequences of seven commonly used marker genes to investigate the phylogeny of three strains isolated recently from bleeding stem cankers on European horse chestnut in Britain (E-Pae). On the basis of these sequences alone, the E-Pae strains were identical to the Pae type-strain (I-Pae), isolated from leaf spots on Indian horse chestnut in India in 1969. The phylogenetic analyses also showed that Pae belongs to a distinct clade of P. syringae pathovars adapted to woody hosts. We generated genome-wide Illumina sequence data from the three E-Pae strains and one strain of I-Pae. Comparative genomic analyses revealed pathovar-specific genomic regions in Pae potentially implicated in virulence on a tree host, including genes for the catabolism of plant-derived aromatic compounds and enterobactin synthesis. Several gene clusters displayed intra-pathovar variation, including those encoding type IV secretion, a novel fatty acid biosynthesis pathway and a sucrose uptake pathway. Rates of single nucleotide polymorphisms in the four Pae genomes indicate that the three E-Pae strains diverged from each other much more recently than they diverged from I-Pae. The very low genetic diversity among the three geographically distinct E-Pae strains suggests that they originate from a single, recent introduction into Britain, thus highlighting the serious environmental risks posed by the spread of an exotic plant pathogenic bacterium to a new geographic location. The genomic regions in Pae that are absent from other P. syringae pathovars that infect herbaceous hosts may represent candidate genetic adaptations to infection of the woody parts of the tree.
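The comparison of SNP rates among strains can be illustrated with a minimal sketch. It assumes the genomes have already been reduced to equal-length aligned sequences; the function name `pairwise_snp_counts` and the toy sequences are hypothetical and are not taken from the study, which does not describe its pipeline at this level of detail.

```python
from itertools import combinations

def pairwise_snp_counts(aligned):
    """Count differing positions for every pair of equal-length aligned sequences."""
    counts = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        if len(seq_a) != len(seq_b):
            raise ValueError("sequences must be aligned to the same length")
        # Only compare columns where both sequences carry an unambiguous base.
        counts[(name_a, name_b)] = sum(
            1
            for a, b in zip(seq_a, seq_b)
            if a != b and a in "ACGT" and b in "ACGT"
        )
    return counts

# Toy alignment, invented for illustration only (not data from the study).
toy = {
    "E-Pae-1": "ACGTACGTAC",
    "E-Pae-2": "ACGTACGTAC",
    "E-Pae-3": "ACGTACGTAT",
    "I-Pae":   "ACCTAGGTTT",
}
counts = pairwise_snp_counts(toy)
for pair, n in counts.items():
    print(pair, n)
```

In this toy alignment the three E-Pae sequences differ from each other at no more than one position, while each differs from I-Pae at several, which is the kind of pattern the abstract interprets as a recent common origin.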
Abstract:
The accurate prediction of the biochemical function of a protein is becoming increasingly important, given the unprecedented growth of both structural and sequence databanks. Consequently, computational methods are required to analyse such data in an automated manner to ensure genomes are annotated accurately. Protein structure prediction methods, for example, are capable of generating approximate structural models on a genome-wide scale. However, the detection of functionally important regions in such crude models, as well as structural genomics targets, remains an extremely important problem. The method described in the current study, MetSite, represents a fully automatic approach for the detection of metal-binding residue clusters applicable to protein models of moderate quality. The method involves using sequence profile information in combination with approximate structural data. Several neural network classifiers are shown to be able to distinguish metal sites from non-sites with a mean accuracy of 94.5%. The method was demonstrated to identify metal-binding sites correctly in LiveBench targets where no obvious metal-binding sequence motifs were detectable using InterPro. Accurate detection of metal sites was shown to be feasible for low-resolution predicted structures generated using mGenTHREADER where no side-chain information was available. High-scoring predictions were observed for a recently solved hypothetical protein from Haemophilus influenzae, indicating a putative metal-binding site.
Abstract:
Background: The amount and structure of genetic diversity in dessert apple germplasm conserved at a European level is mostly unknown, since all diversity studies conducted in Europe until now have been performed on regional or national collections. Here, we applied a common set of 16 SSR markers to genotype more than 2,400 accessions across 14 collections representing three broad European geographic regions (North+East, West and South) with the aim of analyzing the extent, distribution and structure of variation in the apple genetic resources in Europe. Results: A Bayesian model-based clustering approach showed that diversity was organized in three groups, although these were only moderately differentiated (FST=0.031). A nested Bayesian clustering approach allowed identification of subgroups which revealed internal patterns of substructure within the groups, allowing a finer delineation of the variation into eight subgroups (FST=0.044). The first level of stratification revealed an asymmetric division of the germplasm among the three groups, and a clear association was found with the geographical regions of origin of the cultivars. The substructure revealed clear partitioning of genetic groups among countries, but also interesting associations between subgroups and breeding purposes of recent cultivars or particular usage such as cider production. Additional parentage analyses allowed us to identify both putative parents of more than 40 old and/or local cultivars, giving interesting insights into the pedigree of some emblematic cultivars. Conclusions: The variation found at group and sub-group levels may reflect a combination of historical processes of migration/selection and adaptive factors to diverse agricultural environments that, together with genetic drift, have resulted in extensive genetic variation but limited population structure. 
The European dessert apple germplasm represents an important source of genetic diversity with a strong historical and patrimonial value. The present work thus constitutes a decisive step in the field of conservation genetics. Moreover, the obtained data can be used for defining a European apple core collection useful for further identification of genomic regions associated with commercially important horticultural traits in apple through genome-wide association studies.
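To give a feel for what an FST value such as 0.031 measures, here is a minimal sketch of Wright's fixation index for a single biallelic locus, computed as (H_T - H_S) / H_T with equally weighted subpopulations. This is a deliberate simplification: the study used multi-allelic SSR markers and Bayesian clustering, and the function name `fst_biallelic` and the example frequencies are assumptions made for illustration.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each subpopulation,
    with all subpopulations weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation frequency")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic everywhere, so no differentiation
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t

# Hypothetical frequencies in three groups, not values from the study.
print(round(fst_biallelic([0.45, 0.50, 0.55]), 4))
```

Values near 0 mean the groups share almost the same allele frequencies, while values near 1 mean they are fixed for different alleles, so an FST of a few percent corresponds to the "moderately differentiated" groups described above.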
Abstract:
Background: Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature. Methods: We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight. Results: Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL. Conclusions: Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. 
Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.
Abstract:
Phenotypically discordant monozygotic twins offer the possibility of gene discovery through delineation of molecular abnormalities in one member of the twin pair. One proposed mechanism of discordance is postzygotically occurring genomic alterations resulting from mitotic recombination and other somatic changes. Detection of altered genomic fragments can reveal candidate gene loci that can be verified through additional analyses. We investigated this hypothesis using array comparative genomic hybridization, the 50K and 250K Affymetrix GeneChip (R) SNP arrays and an Illumina custom array consisting of 1,536 SNPs to scan for genomic alterations in a sample of monozygotic twin pairs with discordant cleft lip and/or palate phenotypes. Paired analysis for deletions, amplifications and loss of heterozygosity, along with sequence verification of SNPs with discordant genotype calls, did not reveal any genomic discordance between twin pairs in lymphocyte DNA samples. Our results demonstrate that postzygotic genomic alterations are not a common cause of monozygotic twin discordance for isolated cleft lip and/or palate. However, rare or balanced genomic alterations, tissue-specific events and small aberrations beyond the detection level of our experimental approach cannot be ruled out. The stability of genomes we observed in our study samples also suggests that detection of discordant events in other monozygotic twin pairs would be remarkable and of potential disease significance.
Abstract:
Background: New challenges are arising in the animal protein market, and one of the main global challenges is to produce more in a shorter time, with better quality and in a sustainable way. Brazil is the largest beef exporter by volume; hence the factors affecting the beef production chain are of major concern to the country's economy. An emerging class of biotechnological approaches, molecular markers, is bringing new perspectives for facing these challenges, particularly after the publication of the first complete livestock genome (bovine), which has triggered a massive initiative to put into practice the benefits of the so-called Post-Genomic Era. Review: This article aims to show the directions and insights in the application of molecular markers to livestock genetic improvement and reproduction, as well as to organize the progress so far, pointing out some perspectives for these emerging technologies in the context of Brazilian ruminant production. An overview of the nature of the main molecular markers explored in ruminant production is provided, which describes the molecular bases and detection approaches available for microsatellites (STR) and single nucleotide polymorphisms (SNP). One section reviews the history of association studies between markers and important trait variation in livestock, showing the timeline that starts with quantitative trait loci (QTL) identification using STR markers and ends with high-resolution SNP panels used to perform whole-genome scans for phenotype/genotype association. The article also organizes this information to show how QTL prospecting using STR opened the way to marker-assisted selection and why this approach is quickly being replaced by studies applying genome-wide association with SNPs in a new concept called genomic selection. 
Conclusion: The world's scientific community is dedicating effort and resources to apply SNP information in livestock selection through the development of high density panels for genomic association studies, connecting molecular genetic data with phenotypes of economic interest. Once generated, this information can be used to take decisions in genetic improvement programs by selecting animals with the assistance of molecular markers.
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The publication of the human genome sequence in 2001 was a major step forward in knowledge necessary to understand the variations between individuals. For farmed species, genomic sequence information will facilitate the selection of animals optimised to live, and be productive, in particular environments. The availability of cattle genome sequence has allowed the breeding industry to take the first steps towards predicting phenotypes from genotypes by estimating a genomic breeding value (gEBV) for bulls using genome-wide DNA markers. The sequencing of the buffalo genome and creation of a panel of DNA markers has created the opportunity to apply molecular selection approaches for this species. The genomes of several buffalo of different breeds were sequenced and aligned with the bovine genome, which facilitated the identification of millions of sequence variants in the buffalo genomes. Based on frequencies of variants within and among buffalo breeds, and their distribution across the genome compared with the bovine genome, 90,000 putative single nucleotide polymorphisms (SNP) were selected to create an Axiom (R) Buffalo Genotyping Array 90K. This SNP Chip was tested in buffalo populations from Italy and Brazil and found to have at least 75% high quality and polymorphic markers in these populations. The 90K SNP chip was then used to investigate the structure of buffalo populations, and to localise the variations having a major effect on milk production.
Abstract:
To better understand agronomic and end-use quality in wheat (Triticum aestivum L.) we developed a population containing 154 F6:8 recombinant inbred lines (RILs) from the cross TAM107-R7/Arlin. The parental lines and RILs were phenotyped at six environments in Nebraska and differed for resistance to Wheat soilborne mosaic virus (WSBMV), morphological, agronomic, and end-use quality traits. Additionally, a 2300 cM genome-wide linkage map was created for quantitative trait loci (QTL) analysis. Based on our results across multiple environments, the best RILs could be used for cultivar improvement. The population and marker data are publicly available for interested researchers for future research. The population was used to determine the effect of WSBMV on agronomic and end-use quality and for the mapping of a resistance locus. Results from two infected environments showed that all but two agronomic traits were significantly affected by the disease. Specifically, the disease reduced the grain yield of susceptible RILs by 30%, and these lines flowered 5 d later and were 11 cm shorter. End-use quality traits were not negatively affected but flour protein content was increased in susceptible RILs. The resistance locus SbmTmr1 mapped to 27.1 cM near marker wPt-5870 on chromosome 5DL using ELISA data. Finally, we investigated how WSBMV affected QTL detection in the population. QTLs were mapped at two WSBMV infected environments, four uninfected environments, and in the resistant and susceptible RIL subpopulations in the infected environments. Fifty-two significant (LOD≥3) QTLs were mapped in RILs at uninfected environments. Many of the QTLs were pleiotropic or closely linked at 6 chromosomal regions. Forty-seven QTLs were mapped in RILs at WSBMV infected environments. Comparisons between uninfected and infected environments identified 20 common QTLs and 21 environmentally specific QTLs. 
Finally, 24 QTLs were determined to be affected by WSBMV by comparing the subpopulations in QTL analyses within the same environment. The comparisons were statistically validated using marker by disease interactions. These results showed that QTLs can be affected by WSBMV and careful interpretation of QTL results is needed where biotic stresses are present. Finally, beneficial QTLs not affected by WSBMV or the environment are candidates for marker-assisted selection.
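The LOD≥3 significance criterion mentioned above can be made concrete with a short sketch. A LOD score is the base-10 logarithm of the likelihood ratio between a model with a QTL at a position and a model without one. The function names, the dictionary layout and the toy log-likelihoods are assumptions for illustration; the abstract does not say which software or scan model was used.

```python
import math

def lod_score(loglik_qtl, loglik_null):
    """LOD = log10(L_qtl / L_null), computed from natural-log likelihoods."""
    return (loglik_qtl - loglik_null) / math.log(10)

def significant_positions(scan, threshold=3.0):
    """Return positions whose LOD score meets the threshold.

    scan maps a position label to (loglik_qtl, loglik_null).
    """
    return [
        pos
        for pos, (ll_qtl, ll_null) in scan.items()
        if lod_score(ll_qtl, ll_null) >= threshold
    ]

# Hypothetical scan results, not values from the study.
toy_scan = {
    "chrA:10.0": (-100.0, -110.0),  # LOD of about 4.34
    "chrB:55.5": (-100.0, -102.0),  # LOD of about 0.87
}
hits = significant_positions(toy_scan)
print(hits)
```

A threshold of 3 means the data are 1000 times more likely under the QTL model than under the null model at that position.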
Abstract:
Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of coding and noncoding RNAs (transcriptome) being transcribed. These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on human and model organism phenotypes, we will be able to find candidate genes and new clues to disease etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.
Abstract:
The 9p21 locus encodes important proteins involved in cell cycle regulation and apoptosis; it contains the p16/CDKN2A (cyclin-dependent kinase inhibitor 2a) tumor suppressor gene and two other related genes, p14/ARF and p15/CDKN2B. This locus is a major target of inactivation in the pathogenesis of a number of human tumors, both solid and haematologic, and is also a frequent site of loss or deletion in acute lymphoblastic leukemia (ALL), with reported frequencies ranging from 18% to 45% [1]. In order to explore, at high resolution, the frequency and size of alterations affecting this locus in adult BCR-ABL1-positive ALL and to investigate their prognostic value, 112 patients (101 de novo and 11 relapse cases) were analyzed by genome-wide single nucleotide polymorphism arrays and candidate-gene deep exon sequencing. Paired diagnosis-relapse samples were also available and analyzed for 19 (19%) cases. CDKN2A/ARF and CDKN2B genomic alterations were identified in 29% and 25% of newly diagnosed patients, respectively. Deletions were monoallelic in 72% of cases, and in 43% the minimal overlapping region of the lost area spanned only the CDKN2A/2B gene locus. The analysis at the time of relapse showed a near-significant increase in the detection rate of CDKN2A/ARF loss (47%) compared to diagnosis (p = 0.06). Point mutations within the 9p21 locus were found at a very low frequency, with only a non-synonymous substitution in exon 2 of CDKN2A. Finally, correlation with clinical outcome showed that deletions of CDKN2A/B are significantly associated with poor outcome in terms of overall survival (p = 0.0206), disease-free survival (p = 0.0010) and cumulative incidence of relapse (p = 0.0014). The inactivation of the 9p21 locus by genomic deletions is a frequent event in BCR-ABL1-positive ALL. Deletions are frequently acquired at leukemia progression and act as a poor prognostic marker.
Abstract:
The aim of this work was to identify markers associated with production traits in the pig genome using different approaches. We focused on the Italian Large White pig breed, using Genome Wide Association Studies (GWAS) and applying a selective genotyping approach to increase the power of the analyses. Furthermore, we searched the pig genome using Next Generation Sequencing (NGS) Ion Torrent Technology to combine the selective genotyping approach with deep sequencing for SNP discovery. Two other studies were carried out with a different approach. Allele frequency changes for SNPs affecting candidate genes and at the genome-wide level were analysed to identify selection signatures driven by the selection program during the last 20 years. This approach confirmed that a great number of markers may affect production traits and that they are captured by the classical selection programs. GWAS revealed 123 significant or suggestively significant SNPs associated with Back Fat Thickness and 229 associated with Average Daily Gain. Sixteen Copy Number Variant Regions were more frequent in lean or fat pigs and showed that different copies of those regions could have a limited impact on fat. These often appear to be involved in food intake and behavior, besides affecting genes involved in metabolic pathways and their expression. By combining NGS sequencing with the selective genotyping approach, new variants were discovered, and at least 54 are worth analysing in association studies. The study of groups of pigs that had undergone stringent selection showed that the allele frequency of some loci can change drastically if they are close to traits that are interesting for selection schemes. These approaches could, in future, be integrated into genomic selection plans.
Abstract:
The domestic dog offers a unique opportunity to explore the genetic basis of disease, morphology and behaviour. Humans share many diseases with our canine companions, making dogs an ideal model organism for comparative disease genetics. Using newly developed resources, genome-wide association studies in dog breeds are proving to be exceptionally powerful. Towards this aim, veterinarians and geneticists from 12 European countries are collaborating to collect and analyse the DNA from large cohorts of dogs suffering from a range of carefully defined diseases of relevance to human health. This project, named LUPA, has already delivered considerable results. The consortium has collaborated to develop a new high density single nucleotide polymorphism (SNP) array. Mutations for four monogenic diseases have been identified and the information has been utilised to find mutations in human patients. Several complex diseases have been mapped and fine mapping is underway. These findings should ultimately lead to a better understanding of the molecular mechanisms underlying complex diseases in both humans and their best friend.
Abstract:
Submicroscopic changes in chromosomal DNA copy number dosage are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of basepairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects as the logistics of preparing DNA and processing thousands of arrays often involves multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions that are associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without the requirement of training data. We illustrate these ideas using a study of bipolar disease and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets where as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. 
The software is open source and implemented in the R package CRLMM available at Bioconductor (http://www.bioconductor.org).
Abstract:
Recurrent airway obstruction (RAO), or heaves, is a naturally occurring asthma-like disease that is related to sensitisation and exposure to mouldy hay and has a familial basis with a complex mode of inheritance. A genome-wide scanning approach using two half-sibling families was taken in order to locate the chromosome regions that contribute to the inherited component of this condition in these families. Initially, a panel of 250 microsatellite markers, which were chosen as a well-spaced, polymorphic selection covering the 31 equine autosomes, was used to genotype the two half-sibling families, which comprised in total 239 Warmblood horses. Subsequently, supplementary markers were added for a total of 315 genotyped markers. Each half-sibling family is focused around a severely RAO-affected stallion, and the phenotype of each individual was assessed for RAO and related signs, namely, breathing effort at rest, breathing effort at work, coughing, and nasal discharge, using an owner-based questionnaire. Analysis using a regression method for half-sibling family structures was performed using RAO and each of the composite clinical signs separately; two chromosome regions (on ECA13 and ECA15) showed a genome-wide significant association with RAO at P < 0.05. An additional 11 chromosome regions showed a more modest association. This is the first publication that describes the mapping of genetic loci involved in RAO. Several candidate genes are located in these regions, a number of which are interleukins. These are important signalling molecules that are intricately involved in the control of the immune response and are therefore good positional candidates.