982 results for SNP array
Abstract:
Despite elevated incidence and recurrence rates for Primary Spontaneous Pneumothorax (PSP), little is known about its etiology, and the genetics of idiopathic PSP remains unexplored. To identify genetic variants contributing to sporadic PSP risk, we conducted the first PSP genome-wide association study. Two replicate pools of 92 Portuguese PSP cases and of 129 age- and sex-matched controls were allelotyped in triplicate on the Affymetrix Human SNP Array 6.0 arrays. Markers passing quality control were ranked by relative allele score difference between cases and controls (|RASdiff|), by a novel cluster method and by a combined Z-test. 101 single nucleotide polymorphisms (SNPs) were selected using these three approaches for technical validation by individual genotyping in the discovery dataset. 87 out of 94 successfully tested SNPs were nominally associated in the discovery dataset. Replication of the 87 technically validated SNPs was then carried out in an independent replication dataset of 100 Portuguese cases and 425 controls. The intergenic rs4733649 SNP in chromosome 8 (between LINC00824 and LINC00977) was associated with PSP in the discovery (P = 4.07E-03, ORC[95% CI] = 1.88[1.22-2.89]), replication (P = 1.50E-02, ORC[95% CI] = 1.50[1.08-2.09]) and combined datasets (P = 8.61E-05, ORC[95% CI] = 1.65[1.29-2.13]). This study identified for the first time one genetic risk factor for sporadic PSP, but future studies are warranted to further confirm this finding in other populations and uncover its functional role in PSP pathogenesis.
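As a rough consistency check on the three odds ratios reported above, the sketch below pools the discovery and replication estimates with an inverse-variance fixed-effect model on the log-odds scale. This is an assumption made for illustration: the abstract does not state how the combined estimate was obtained, and the helper names (`se_from_ci`, `pool_fixed_effect`) are hypothetical. Standard errors are back-calculated from the rounded 95% confidence intervals, so the result can only approximate the published combined value.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def se_from_ci(lower, upper):
    """Back-calculate the standard error of ln(OR) from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of (OR, lower, upper) tuples."""
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    log_ors = [math.log(or_) for or_, _, _ in estimates]
    pooled_log = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z95 * pooled_se),
        math.exp(pooled_log + Z95 * pooled_se),
    )


# Values for rs4733649 as reported in the abstract.
discovery = (1.88, 1.22, 2.89)
replication = (1.50, 1.08, 2.09)

pooled_or, pooled_lo, pooled_hi = pool_fixed_effect([discovery, replication])
print(f"pooled OR = {pooled_or:.2f} [{pooled_lo:.2f}-{pooled_hi:.2f}]")
```

Under these assumptions the pooled estimate falls between the two study-level odds ratios and near the reported combined value of 1.65 [1.29-2.13]. Small differences are expected, since the published figure may have been computed from individual-level genotypes rather than from summary statistics.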
Abstract:
Whole Exome Sequencing (WES) is rapidly becoming the first-tier test in clinics, thanks both to its declining costs and to the development of new platforms that help clinicians analyze and interpret SNVs and InDels. However, we still know very little about how CNV detection could increase WES diagnostic yield. A plethora of exome CNV callers have been published over the years, all showing good performance for specific CNV classes and sizes, suggesting that a combination of multiple tools is needed to obtain good overall detection performance. Here we present TrainX, an ML-based method for calling heterozygous CNVs in WES data using EXCAVATOR2 Normalized Read Counts. We select male and female non-pseudo-autosomal chromosome X alignments to construct our dataset and train our model, make predictions on autosomal target regions, and use an HMM to call CNVs. We compared TrainX against a set of CNV tools that differ in detection method (GATK4 gCNV, ExomeDepth, DECoN, CNVkit and EXCAVATOR2) and found that our algorithm outperformed them in terms of stability, as we identified both deletions and duplications with good scores (F1-scores of 0.87 and 0.82, respectively) and for sizes down to the minimum resolution of 2 target regions. We also evaluated the method's robustness using a set of WES and SNP array data (n=251), part of the Italian cohort of the Epi25 collaborative, and were able to retrieve all clinical CNVs previously identified by the SNP array. TrainX showed good accuracy in detecting heterozygous CNVs of different sizes, making it a promising tool for use in a diagnostic setting.
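The F1-scores quoted above combine precision and recall into a single number. A minimal sketch of that calculation follows. The true-positive, false-positive and false-negative counts are hypothetical and chosen only to illustrate the formula; they are not taken from the TrainX evaluation, and `f1_score` is an illustrative helper, not part of any of the named tools.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for a set of CNV calls."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT taken from the TrainX evaluation.
deletions = {"tp": 87, "fp": 13, "fn": 13}
duplications = {"tp": 82, "fp": 18, "fn": 18}

f1_del = f1_score(**deletions)
f1_dup = f1_score(**duplications)
print(f"deletions F1 = {f1_del:.2f}, duplications F1 = {f1_dup:.2f}")
```

Because F1 is a harmonic mean, a caller that is strong on precision but weak on recall (or the reverse) scores lower than one that is balanced, which is one reason it is a common summary when comparing CNV callers.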
Abstract:
Background: A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high-throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Results: Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the ‘Golden Delicious’ genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the ‘Golden Delicious’ pseudo-chromosomes, indicating the presence of paralogous genomic regions or misassignments of genome sequence contigs during the assembly and anchoring of the genome sequence. Conclusions: We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the ‘Golden Delicious’ reference sequence will assist in the continued improvement of the genome sequence assembly for that variety.
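The percentages and the final marker density in this abstract can be recomputed directly from the counts it gives. The sketch below does so. It assumes that the S-locus counts as a single marker and that the 13.7% figure is taken relative to the 2,272 mapped SNPs; the variable names are illustrative.

```python
# Figures reported in the abstract.
malus_snps_on_array = 7867
het_one_parent = 1823
het_both_parents = 1007
mapped_snps = 2272
mapped_ssrs = 306
s_locus = 1  # assumption: the S-locus counts as one marker
map_length_cm = 1282.2
conflicting_markers = 311


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_markers = mapped_snps + mapped_ssrs + s_locus
density = map_length_cm / total_markers  # cM per marker

pct_one = pct(het_one_parent, malus_snps_on_array)     # heterozygous in one parent
pct_both = pct(het_both_parents, malus_snps_on_array)  # heterozygous in both parents
pct_conflict = pct(conflicting_markers, mapped_snps)   # mapped SNPs with conflicting positions

print(pct_one, pct_both, pct_conflict, round(density, 1))
```

The recomputed values agree with the reported 23.2%, 12.8%, 13.7% and 0.5 cM/marker. The earlier density of 3.8 cM/marker cannot be rechecked this way, because the abstract does not give the number of markers on the map before the SNPs were added.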
Abstract:
Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large-scale SNP developments have been reported for several mainstream crops. There is now growing interest in extending the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue is exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high-quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity >= 2%. The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher-density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus.
Abstract:
Background: The malaria parasite Plasmodium falciparum exhibits abundant genetic diversity, and this diversity is key to its success as a pathogen. Previous efforts to study genetic diversity in P. falciparum have begun to elucidate the demographic history of the species, as well as patterns of population structure and patterns of linkage disequilibrium within its genome. Such studies will be greatly enhanced by new genomic tools and recent large-scale efforts to map genomic variation. To that end, we have developed a high throughput single nucleotide polymorphism (SNP) genotyping platform for P. falciparum. Results: Using an Affymetrix 3,000 SNP assay array, we found roughly half the assays (1,638) yielded high quality, 100% accurate genotyping calls for both major and minor SNP alleles. Genotype data from 76 global isolates confirm significant genetic differentiation among continental populations and varying levels of SNP diversity and linkage disequilibrium according to geographic location and local epidemiological factors. We further discovered that nonsynonymous and silent (synonymous or noncoding) SNPs differ with respect to within-population diversity, interpopulation differentiation, and the degree to which allele frequencies are correlated between populations. Conclusions: The distinct population profile of nonsynonymous variants indicates that natural selection has a significant influence on genomic diversity in P. falciparum, and that many of these changes may reflect functional variants deserving of follow-up study. Our analysis demonstrates the potential for new high-throughput genotyping technologies to enhance studies of population structure, natural selection, and ultimately enable genome-wide association studies in P. falciparum to find genes underlying key phenotypic traits.
Abstract:
Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10⁻⁷ study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r² = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10⁻⁷ at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies.
Abstract:
This dissertation was written as part of a multicentre, EU-funded project on the use of single nucleotide polymorphisms (SNPs) for individualizing persons, in the context of attributing biological crime-scene traces or identifying unknown deceased persons. The overarching aim of the project was to establish and validate high-resolution genotyping methods that can analyse SNPs simultaneously in multiplex format with high accuracy but little effort. First, 29 Y-chromosomal and 52 autosomal SNPs were selected under the requirement that, as a multiplex, they offer the highest possible chance of individualization. This was followed by validation of both multiplex systems and of the SNaPshot™ minisequencing method in systematic studies involving all working groups of the project. The validated reference method, based on minisequencing, served on the one hand for controlled collaboration between different laboratories and on the other hand as the basis for developing, in this work, an assay for SNP genotyping using electronic microarray technology. The independent main part of this dissertation describes, using the previously validated autosomal SNPs, the new development and validation of a hybridization assay for the electronic microarray platform from Nanogen. To this end, three different assays that differ in their operating principle on the microarray were first established. Of these, the Capture down assay was selected for further development on the basis of performance. After numerous optimization measures concerning PCR product treatment, instrument-specific procedures and analysis-specific oligonucleotide designs, the Capture down assay was ready for the simultaneous typing of three individuals with 32 SNPs each on one microarray. This procedure was then validated using 40 DNA samples with known genotypes for the 32 SNPs, and its accuracy was determined by parallel SNaPshot™ typing. The result not only demonstrates the suitability of the validated assay and of electronic microarray technology for certain questions, but also shows their advantages in terms of speed, flexibility and efficiency. The automation, which allows the fragments under investigation to be spatially arranged immediately before analysis, reduces unnecessary work steps and thus the error rate and the risk of contamination while improving time efficiency. With a maximum achieved accuracy of 94%, however, the reliability of the STR systems currently used in forensic genetics cannot yet be matched. The role of the new method will therefore lie not in replacing the established methods but in complementing them to solve special problems such as the analysis of highly degraded DNA traces.
Abstract:
Submicroscopic changes in chromosomal DNA copy number dosage are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of basepairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects as the logistics of preparing DNA and processing thousands of arrays often involves multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions that are associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without the requirement of training data. We illustrate these ideas using a study of bipolar disease and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets where as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. 
The software is open source and implemented in the R package CRLMM available at Bioconductor (http://www.bioconductor.org).
Abstract:
Recurrent airway obstruction is one of the most common airway diseases affecting mature horses. Increased bronchoalveolar mucus, neutrophil accumulation in airways, and airway obstruction are the main features of this disease. Mucociliary clearance is a key component of pulmonary defense mechanisms. Cilia are the motile part of this system and a complex linear array of dynein motors is responsible for their motility by moving along the microtubules in the axonemes of cilia and flagella. We previously detected a QTL for RAO on ECA 13 in a half-sib family of European Warmblood horses. The gene encoding DNAH3 is located in the peak of the detected QTL and encodes a dynein subunit. Therefore, we analysed this gene as a positional and functional candidate gene for RAO. In a mutation analysis of all 62 exons we detected 53 new polymorphisms including 7 non-synonymous variants. We performed an association study using 38 polymorphisms in a cohort of 422 animals. However, after correction for multiple testing we did not detect a significant association of any of these polymorphisms with RAO (P>0.05). Therefore, it seems unlikely that variants at the DNAH3 gene are responsible for the RAO QTL in European Warmblood horses.
Abstract:
Multiple myeloma is characterized by genomic alterations frequently involving gains and losses of chromosomes. Single nucleotide polymorphism (SNP)-based mapping arrays allow the identification of copy number changes at the sub-megabase level and the identification of loss of heterozygosity (LOH) due to monosomy and uniparental disomy (UPD). We have found that SNP-based mapping array data and fluorescence in situ hybridization (FISH) copy number data correlated well, making the technique robust as a tool to investigate myeloma genomics. The most frequently identified alterations are located at 1p, 1q, 6q, 8p, 13, and 16q. LOH is found in these large regions and also in smaller regions throughout the genome with a median size of 1 Mb. We have identified that UPD is prevalent in myeloma and occurs through a number of mechanisms including mitotic nondisjunction and mitotic recombination. For the first time in myeloma, integration of mapping and expression data has allowed us to reduce the complexity of standard gene expression data and identify candidate genes important in both the transition from normal to monoclonal gammopathy of unknown significance (MGUS) to myeloma and in different subgroups within myeloma. We have documented these genes, providing a focus for further studies to identify and characterize those that are key in the pathogenesis of myeloma.
Abstract:
Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg., which is native to the Amazon rainforest, is the primary source of natural rubber. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.
Abstract:
In about 50% of first-trimester spontaneous abortions, the cause remains undetermined after standard cytogenetic investigation. We evaluated the usefulness of array-CGH in diagnosing chromosome abnormalities in products of conception from first-trimester spontaneous abortions. Cell culture was carried out in short- and long-term cultures of 54 specimens, and cytogenetic analysis was successful in 49 of them. Cytogenetic abnormalities (numerical and structural) were detected in 22 (44.89%) specimens. Subsequently, array-CGH based on large insert clones spaced at ~1 Mb intervals over the whole genome was used in 17 cases with a normal G-banding karyotype. This revealed chromosome aneuploidies in three additional cases, giving a final total of 51% of cases in which an abnormal karyotype was detected. In keeping with other recently published works, this study shows that array-CGH detects abnormalities in a further ~10% of spontaneous abortion specimens considered to be normal using standard cytogenetic methods. As such, the array-CGH technique may be a suitable complementary test to cytogenetic analysis in cases with a normal karyotype.
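The detection rates in this abstract follow from three counts, and recomputing them makes clear which denominator each percentage uses. The sketch below assumes, as the abstract implies, that both rates are taken over the 49 successfully karyotyped specimens; the variable names are illustrative.

```python
specimens_karyotyped = 49   # successful cytogenetic analyses (of 54 specimens)
abnormal_by_karyotype = 22  # numerical and structural abnormalities
extra_by_array_cgh = 3      # aneuploidies found among 17 normal-karyotype cases

yield_karyotype = 100 * abnormal_by_karyotype / specimens_karyotyped
yield_combined = (
    100 * (abnormal_by_karyotype + extra_by_array_cgh) / specimens_karyotyped
)
added_yield = yield_combined - yield_karyotype

print(f"{yield_karyotype:.1f}% -> {yield_combined:.1f}% (+{added_yield:.1f} points)")
```

The gain in this cohort comes from 3 of 49 specimens. Only 17 of the 27 specimens with a normal karyotype were tested by array-CGH, so the added yield here may understate what testing all of them would have found.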
Abstract:
Collagen XVIII can generate two fragments: NC11-728, which contains a frizzled motif that possibly acts in Wnt signaling, and Endostatin, which is cleaved from the NC1 domain and is a potent inhibitor of angiogenesis. Collagen XVIII and Wnt signaling have recently been associated with adipogenic differentiation and obesity in some animal models, but not in humans. In the present report, we show that COL18A1 expression increases during human adipogenic differentiation. We also tested whether polymorphisms in the Frizzled (c.1136C>T; Thr379Met) and Endostatin (c.4349G>A; Asp1437Asn) regions contribute towards susceptibility to obesity in patients with type 2 diabetes (113 obese, BMI ≥ 30; 232 non-obese, BMI < 30) of European ancestry. No evidence of association was observed between the allele c.4349G>A and obesity, but we observed a significantly higher frequency of c.1136TT homozygotes in obese (19.5%) than in non-obese individuals (10.9%) [P = 0.02; OR = 2.0 (95% CI: 1.07-3.73)], suggesting that the allele c.1136T is associated with obesity in a recessive model. This genotype, after controlling for cholesterol, LDL cholesterol, and triglycerides, was independently associated with obesity (P = 0.048) and increases the chance of obesity 2.8-fold. Therefore, our data suggest the involvement of collagen XVIII in human adipogenesis and susceptibility to obesity.
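The recessive-model odds ratio quoted above can be approximately reproduced from a 2x2 table of TT homozygotes versus all other genotypes. In the sketch below the genotype counts are assumptions, back-calculated from the rounded percentages and group sizes in the abstract, and the confidence interval uses the Woolf (log-scale) method, which the abstract does not name. `odds_ratio_ci` is an illustrative helper.

```python
import math

# Genotype counts back-calculated from the rounded percentages in the abstract
# (19.5% of 113 obese, 10.9% of 232 non-obese); these are assumptions.
obese_tt, obese_other = 22, 91
nonobese_tt, nonobese_other = 25, 207


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Woolf (log-scale) 95% CI for a 2x2 table [[a, b], [c, d]]."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


or_tt, ci_lo, ci_hi = odds_ratio_ci(
    obese_tt, obese_other, nonobese_tt, nonobese_other
)
print(f"OR = {or_tt:.2f} (95% CI: {ci_lo:.2f}-{ci_hi:.2f})")
```

With these assumed counts the result is close to the reported OR = 2.0 (95% CI: 1.07-3.73). The adjusted 2.8-fold estimate cannot be reproduced this way, since it depends on covariates (cholesterol, LDL cholesterol, triglycerides) that are not available in the abstract.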