961 results for GENOME-WIDE DETECTION


Relevance:

80.00%

Publisher:

Abstract:

Simulation-based assessment is a popular and frequently necessary approach to evaluation of statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are available only by simulation. We consider evaluating and comparing a variety of ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.

Relevance:

80.00%

Publisher:

Abstract:

The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade’s worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show RNA-seq data demonstrates unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find GC-content has a strong sample specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here we describe statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization (CQN) algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content, and quantile normalization to correct for global distortions.
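
The quantile normalization step mentioned above can be illustrated with a minimal sketch. It shows only plain quantile normalization across samples, not the CQN algorithm itself (the regression step on GC-content is omitted), and the function name and toy values are hypothetical. Ties are handled naively by sort order.

```python
def quantile_normalize(samples):
    """Plain quantile normalization (illustrative sketch, not CQN).

    `samples` is a list of equally long lists, one list per sample.
    Each value is replaced by the mean, across samples, of the values
    that share its rank, so all samples end up with the same distribution.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of features")

    # Mean of the k-th smallest value across samples, for every rank k.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        # Indices of this sample's features, ordered from smallest to largest.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: two samples, three features each.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(toy))
```

After normalization both toy samples contain the same set of values, only arranged according to each sample's own ranking.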

Relevance:

80.00%

Publisher:

Abstract:

Amplifications and deletions of chromosomal DNA, as well as copy-neutral loss of heterozygosity, have been associated with disease processes. High-throughput single nucleotide polymorphism (SNP) arrays are useful for making genome-wide estimates of copy number and genotype calls. Because neighboring SNPs in high-throughput SNP arrays are likely to have dependent copy number and genotype due to the underlying haplotype structure and linkage disequilibrium, hidden Markov models (HMMs) may be useful for improving genotype calls and copy number estimates that do not incorporate information from nearby SNPs. We improve previous approaches that utilize an HMM framework for inference in high-throughput SNP arrays by integrating copy number, genotype calls, and the corresponding confidence scores when available. Using simulated data, we demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package ICE.

Relevance:

80.00%

Publisher:

Abstract:

OBJECTIVE: To report the study of a multigenerational Swiss family with dopa-responsive dystonia (DRD). METHODS: Clinical investigation was made of available family members, including historical and chart reviews. Subject examinations were video recorded. Genetic analysis included a genome-wide linkage study with microsatellite markers (STR), GTP cyclohydrolase I (GCH1) gene sequencing, and dosage analysis. RESULTS: We evaluated 32 individuals, of whom 6 were clinically diagnosed with DRD, with childhood-onset progressive foot dystonia, later generalizing, followed by parkinsonism in the two older patients. The response to levodopa was very good. Two additional patients had late onset dopa-responsive parkinsonism. Three other subjects had DRD symptoms on historical grounds. We found suggestive linkage to the previously reported DYT14 locus, which excluded GCH1. However, further study with more stringent criteria for disease status attribution showed linkage to a larger region, which included GCH1. No mutation was found in GCH1 by gene sequencing but dosage methods identified a novel heterozygous deletion of exons 3 to 6 of GCH1. The mutation was found in seven subjects. One of the patients with dystonia represented a phenocopy. CONCLUSIONS: This study rules out the previously reported DYT14 locus as a cause of disease, as a novel multiexonic deletion was identified in GCH1. This work highlights the necessity of an accurate clinical diagnosis in linkage studies as well as the need for appropriate allele frequencies, penetrance, and phenocopy estimates. Comprehensive sequencing and dosage analysis of known genes is recommended prior to genome-wide linkage analysis.

Relevance:

80.00%

Publisher:

Abstract:

Little is known about the genes and proteins involved in the process of human memory. To identify genetic factors related to human episodic memory performance, we conducted an ultra-high-density genome-wide screen at > 500 000 single nucleotide polymorphisms (SNPs) in a sample of normal young adults stratified for performance on an episodic recall memory test. Analysis of these data identified SNPs within the calmodulin-binding transcription activator 1 (CAMTA1) gene that were significantly associated with memory performance. A follow-up study, focused on the CAMTA1 locus in an independent cohort consisting of cognitively normal young adults, singled out SNP rs4908449 with a P-value of 0.0002 as the most significantly associated SNP in the region. These validated genetic findings were further supported by the identification of CAMTA1 transcript enrichment in memory-related human brain regions and through a functional magnetic resonance imaging experiment on individuals matched for memory performance that identified CAMTA1 allele-specific upregulation of medial temporal lobe brain activity in those individuals harboring the 'at-risk' allele for poorer memory performance. The CAMTA1 locus encodes a purported transcription factor that interfaces with the calcium-calmodulin system of the cell to alter gene expression patterns. Our validated genomic and functional biological findings described herein suggest a role for CAMTA1 in human episodic memory.

Relevance:

80.00%

Publisher:

Abstract:

To identify components of the copper homeostatic mechanism of Lactococcus lactis, we employed two-dimensional gel electrophoresis to detect changes in the proteome in response to copper. Three proteins upregulated by copper were identified: glyoxylase I (YaiA), a nitroreductase (YtjD), and lactate oxidase (LctO). The promoter regions of these genes feature cop boxes of consensus TACAnnTGTA, which are the binding site of CopY-type copper-responsive repressors. A genome-wide search for cop boxes revealed 28 such sequence motifs. They were tested by electrophoretic mobility shift assays for the interaction with purified CopR, the CopY-type repressor of L. lactis. Seven of the cop boxes interacted with CopR in a copper-sensitive manner. They were present in the promoter region of five genes, lctO, ytjD, copB, ydiD, and yahC; and two polycistronic operons, yahCD-yaiAB and copRZA. Induction of these genes by copper was confirmed by real-time quantitative PCR. The copRZA operon encodes the CopR repressor of the regulon; a copper chaperone, CopZ; and a putative copper ATPase, CopA. When expressed in Escherichia coli, the copRZA operon conferred copper resistance, suggesting that it functions in copper export from the cytoplasm. Other member genes of the CopR regulon may similarly be involved in copper metabolism.
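
A genome-wide search for the cop box consensus TACAnnTGTA can be sketched with a regular expression. This is only an illustration of the motif scan, not the authors' pipeline; the function name and the toy sequence are hypothetical. Because the consensus is its own reverse complement, scanning one strand should be enough to find sites on both.

```python
import re

# Consensus from the abstract: TACAnnTGTA, where n is any nucleotide.
# The lookahead lets overlapping occurrences be reported as well.
COP_BOX = re.compile(r"(?=(TACA[ACGT]{2}TGTA))")


def find_cop_boxes(sequence):
    """Return (0-based start, matched motif) for every cop box in `sequence`."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in COP_BOX.finditer(sequence)]


# Hypothetical toy sequence containing a single cop box.
toy_sequence = "GGTACAAATGTACC"
for start, motif in find_cop_boxes(toy_sequence):
    print(start, motif)
```

On a real genome the same function would be applied to each replicon, and hits would then be filtered by their position relative to annotated promoters.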

Relevance:

80.00%

Publisher:

Abstract:

With the development of genotyping and next-generation sequencing technologies, multi-marker testing in genome-wide association studies and rare variant association studies has become an active research area in statistical genetics. This dissertation presents three methodologies for association studies that exploit different features of genetic data, and demonstrates how to use those methods to test genetic association hypotheses. The methods fall into three scenarios: 1) multi-marker testing for regions of strong linkage disequilibrium, 2) multi-marker testing for family-based association studies, and 3) multi-marker testing for rare variant association studies. I also discuss the advantages of these methods and demonstrate their power through simulation studies and applications to real genetic data.

Relevance:

80.00%

Publisher:

Abstract:

Hardwoods comprise about half of the biomass of forestlands in North America and present many uses including economic, ecological and aesthetic functions. Forest trees rely on the genetic variation within tree populations to overcome the many biotic, abiotic and anthropogenic factors, further worsened by climate change, that threaten their continued survival and functionality. To harness this inherent genetic variation of tree populations, informed knowledge of genomic resources and techniques, which are currently lacking or very limited, is imperative for forest managers. The current study therefore aimed to develop genomic microsatellite markers for the leguminous tree species honey locust, Gleditsia triacanthos L., and test their applicability in assessing genetic variation, estimating gene flow patterns and identifying a full-sib mapping population. We also aimed to test the usefulness of already developed nuclear and gene-based microsatellite markers in delineation of species and taxonomic relationships between four of the taxonomically difficult Section Lobatae species (Quercus coccinea, Q. ellipsoidalis, Q. rubra and Q. velutina). We recorded 100% amplification of G. triacanthos genomic microsatellites developed using Illumina sequencing techniques in a panel of seven unrelated individuals, with 14 of these showing high polymorphism and reproducibility. When characterized in 36 natural population samples, we recorded 20 alleles per locus with no indication of null alleles at 13 of the 14 microsatellites. This is the first report of genomic microsatellites for this species. Honey locust trees occur in fragmented populations of abandoned farmlands and pastures and are described as essentially dioecious. Pollen dispersal is the main source of gene flow within and between populations, with the ability to offset the effects of random genetic drift.
Factors known to influence gene flow include fragmentation and degree of isolation, which makes understanding the patterns of gene flow in fragmented populations of honey locust a necessity for their sustainable management. In this follow-up study, we used a subset of nine of the 14 developed gSSRs to estimate gene flow and identify a full-sib mapping population in two isolated fragments of honey locust. Our analyses indicated that the majority of the seedlings (65-100%, at both strict and relaxed assignment thresholds) were sired by pollen from outside the two fragment populations. Only one selfing event was recorded, confirming the functional dioeciousness of honey locust and that the seed parents are almost completely outcrossed. From the Butternut Valley, TN population, pollen donor genotypes were reconstructed and used in paternity assignment analyses to identify a relatively large full-sib family comprised of 149 individuals, proving the usefulness of isolated forest fragments in identification of full-sib families. In the Ames Plantation stand, contemporary pollen dispersal followed a fat-tailed exponential-power distribution, an indication of effective gene flow. Our estimate of δ was 4,282.28 m, suggesting that insect pollinators of honey locust disperse pollen over very long distances. The high proportion of pollen influx into our sampled population implies that our fragment population forms part of a large effectively reproducing population. The high tendency of oak species to hybridize while still maintaining their species identity makes it difficult to resolve their taxonomic relationships. Oaks of the section Lobatae are famous in this regard and remain unresolved at both morphological and genetic markers. We applied 28 microsatellite markers, including outlier loci with potential roles in reproductive isolation and adaptive divergence between species, to natural populations of four known interfertile red oaks, Q. coccinea, Q. ellipsoidalis, Q. rubra and Q. velutina.
To better resolve the taxonomic relationships in this difficult clade, we assigned individual samples to species, identified hybrids and introgressive forms and reconstructed phylogenetic relationships among the four species after exclusion of genetically intermediate individuals. Genetic assignment analyses identified four distinct species clusters, with Q. rubra most differentiated from the three other species, but also with a comparatively large number of misclassified individuals (7.14%), hybrids (7.14%) and introgressive forms (18.83%) between Q. ellipsoidalis and Q. velutina. After the exclusion of genetically intermediate individuals, Q. ellipsoidalis grouped as sister species to the largely parapatric Q. coccinea with high bootstrap support (91 %). Genetically intermediate forms in a mixed species stand were located proximate to both potential parental species, which supports recent hybridization of Q. velutina with both Q. ellipsoidalis and Q. rubra. Analyses of genome-wide patterns of interspecific differentiation can provide a better understanding of speciation processes and taxonomic relationships in this taxonomically difficult group of red oak species.

Relevance:

80.00%

Publisher:

Abstract:

OBJECTIVES: Recently, a genome-wide association study showed that single-nucleotide polymorphisms (SNPs) in the chromosome 4q27 region containing IL2 and IL21 are associated with celiac disease. Given the increased prevalence of inflammatory bowel disease (IBD) among celiac disease patients, we investigated the possible involvement of these SNPs in IBD. METHODS: Five SNPs strongly associated with celiac disease within the KIAA1109/TENR/IL2/IL21 linkage disequilibrium block on chromosome 4q27 and one coding SNP within the IL21 gene were analyzed in a large German IBD cohort. The study population comprised a total of 2,948 Caucasian individuals, including 1,461 IBD patients (ulcerative colitis (UC): n=514, Crohn's disease (CD): n=947) and 1,487 healthy unrelated controls. RESULTS: Three of the five celiac disease risk markers had a protective effect on UC susceptibility, and this effect remained significant after correcting for multiple testing: rs6840978: P=0.0082, P(corr)=0.049, odds ratio (OR) 0.77, 95% confidence interval (CI) 0.63-0.93; rs6822844: P=0.0028, P(corr)=0.017, OR 0.73, 95% CI 0.59-0.90; rs13119723: P=0.0058, P(corr)=0.035, OR 0.75, 95% CI 0.61-0.92. A haplotype consisting of the six SNPs tested was markedly associated with UC susceptibility (P=0.0025, P(corr)=0.015, OR 0.72, 95% CI 0.58-0.89). Moreover, in UC, epistasis was observed between the IL23R SNP rs1004819 and three SNPs in the KIAA1109/TENR/IL2/IL21 block (rs13151961, rs13119723, and rs6822844). CONCLUSIONS: Similar to other autoimmune diseases such as celiac disease, rheumatoid arthritis, type 1 diabetes, Graves' disease, and psoriatic arthritis, genetic variation in the chromosome 4q27 region predisposes to UC, suggesting a common genetic background for these diseases.
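
The corrected P-values quoted above are numerically consistent with multiplying each raw P-value by six, the number of SNPs analyzed. The abstract does not name the correction method, so the sketch below treats it as a Bonferroni adjustment with six tests purely as an assumption; the function name is hypothetical.

```python
def bonferroni(p_raw, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_raw * n_tests)


# Assumption: six tests, matching the six SNPs analyzed in the study.
N_TESTS = 6

# (raw P, corrected P) pairs as reported in the abstract.
reported = {
    "rs6840978": (0.0082, 0.049),
    "rs6822844": (0.0028, 0.017),
    "rs13119723": (0.0058, 0.035),
    "six-SNP haplotype": (0.0025, 0.015),
}

for label, (p_raw, p_corr_reported) in reported.items():
    p_corr = bonferroni(p_raw, N_TESTS)
    print(f"{label}: computed {p_corr:.3f}, reported {p_corr_reported}")
```

Under this assumption the computed values agree with the reported ones to three decimal places, which is why a factor of six is a plausible reading of the correction.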

Relevance:

80.00%

Publisher:

Abstract:

Acute infection with the hepatitis C virus (HCV) induces a wide range of innate and adaptive immune responses. A total of 20-50% of acutely HCV-infected individuals permanently control the virus, referred to as 'spontaneous hepatitis C clearance', while the infection progresses to chronic hepatitis C in the majority of cases. Numerous studies have examined host genetic determinants of hepatitis C infection outcome and revealed the influence of genetic polymorphisms of human leukocyte antigens, killer immunoglobulin-like receptors, chemokines, interleukins and interferon-stimulated genes on spontaneous hepatitis C clearance. However, most genetic associations were not confirmed in independent cohorts, revealed opposing results in diverse populations or were limited by varying definitions of hepatitis C outcomes or small sample size. Coordinated efforts are needed in the search for key genetic determinants of spontaneous hepatitis C clearance that include well-conducted candidate genetic and genome-wide association studies, direct sequencing and follow-up functional studies.

Relevance:

80.00%

Publisher:

Abstract:

The development of a completely annotated sheep genome sequence is a key need for understanding the phylogenetic relationships and genetic diversity among the many different sheep breeds worldwide and for identifying genes controlling economically and physiologically important traits. The ovine genome sequence assembly will be crucial for developing optimized breeding programs based on highly productive, healthy sheep phenotypes that are adapted to modern breeding and production conditions. Scientists and breeders around the globe have been contributing to this goal by generating genomic and cDNA libraries, performing genome-wide and trait-associated analyses of polymorphism, expression analysis, genome sequencing, and by developing virtual and physical comparative maps. The International Sheep Genomics Consortium (ISGC), an informal network of sheep genomics researchers, is playing a major role in coordinating many of these activities. In addition to serving as an essential tool for monitoring chromosome abnormalities in specific sheep populations, ovine molecular cytogenetics provides physical anchors which link and order genome regions, such as sequence contigs, genes and polymorphic DNA markers to ovine chromosomes. Likewise, molecular cytogenetics can contribute to the process of defining evolutionary breakpoints between related species. The selective expansion of the sheep cytogenetic map, using loci to connect maps and identify chromosome bands, can substantially contribute to improving the quality of the annotated sheep genome sequence and will also accelerate its assembly. Furthermore, identifying major morphological chromosome anomalies and micro-rearrangements, such as gene duplications or deletions, that might occur between different sheep breeds and other Ovis species will also be important to understand the diversity of sheep chromosome structure and its implications for cross-breeding. 
To date, 566 loci have been assigned to specific chromosome regions in sheep and the new cytogenetic map is presented as part of this review. This review will also summarize the current cytogenomic status of the sheep genome, describe current activities in the sheep cytogenomics research sector, and will discuss the cytogenomics data in context with other major sheep genomics projects.

Relevance:

80.00%

Publisher:

Abstract:

Attention deficit/hyperactivity disorder (ADHD) is a highly heritable neurodevelopmental disorder of childhood onset. Clinical and biological evidence points to shared common central nervous system (CNS) pathology of ADHD and restless legs syndrome (RLS). It was hypothesized that variants previously found to be associated with RLS in two large genome-wide association studies (GWA) will also be associated with ADHD. SNPs located in MEIS1 (rs2300478), BTBD9 (rs9296249, rs3923809, rs6923737), and MAP2K5 (rs12593813, rs4489954) as well as three SNPs tagging the identified haplotype in MEIS1 (rs6710341, rs12469063, rs4544423) were genotyped in a well-characterized German sample of 224 families comprising one or more affected sibs (386 children) and both parents. We found no evidence for preferential transmission of the hypothesized variants to ADHD. Subsequent analyses elicited a nominally significant association with haplotypes consisting of the three SNPs in BTBD9 (χ² = 14.8, df = 7, nominal p = 0.039). According to exploratory post hoc analyses, the major contribution to this finding came from the A-A-A haplotype, with a haplotype-wise nominal p-value of 0.009. However, this result did not withstand correction for multiple testing. In view of our results, RLS risk alleles may have a lower effect on ADHD than on RLS or may not be involved in ADHD. The negative findings may additionally result from genetic heterogeneity of ADHD, i.e. risk alleles for RLS may only be relevant for certain subtypes of ADHD. Genes relevant to RLS remain interesting candidates for ADHD; particularly BTBD9 needs further study, as it has been related to iron storage, a potential pathophysiological link between RLS and certain subtypes of ADHD.
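
The nominal p-value quoted for the BTBD9 haplotype test can be checked from the chi-square statistic and its degrees of freedom. The sketch below uses the standard closed-form upper-tail probability for odd degrees of freedom, so it needs only the Python standard library; the function name is hypothetical and this is not the authors' software.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable.

    Valid for odd `df` only, using the closed form
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * (1 + x/3 + x^2/15 + ...),
    with (df - 1) / 2 terms in the series.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this sketch handles odd degrees of freedom only")
    tail = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    series = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)  # next term: multiply by x / (2r + 1)
    return tail + series


# Values reported in the abstract: chi-square = 14.8 with 7 degrees of freedom.
p_value = chi2_sf_odd_df(14.8, 7)
print(f"{p_value:.3f}")
```

The result rounds to the reported nominal p of 0.039. Multiplying it by the number of haplotype tests performed would give a multiplicity-adjusted value, but the abstract does not state how many tests were corrected for.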

Relevance:

80.00%

Publisher:

Abstract:

Lung function measures are heritable, predict mortality and are relevant in diagnosis of chronic obstructive pulmonary disease (COPD). COPD and asthma are diseases of the airways with major public health impacts and each has a heritable component. Genome-wide association studies of SNPs have revealed novel genetic associations with both diseases but only account for a small proportion of the heritability. Complex copy number variation may account for some of the missing heritability. A well-characterised genomic region of complex copy number variation contains beta-defensin genes (DEFB103, DEFB104 and DEFB4), which have a role in the innate immune response. Previous studies have implicated these and related genes as being associated with asthma or COPD. We hypothesised that copy number variation of these genes may play a role in lung function in the general population and in COPD and asthma risk. We undertook copy number typing of this locus in 1149 adults and 689 children using a paralogue ratio test and investigated association with COPD, asthma and lung function. Replication of findings was assessed in a larger independent sample of COPD cases and smoking controls. We found evidence for an association of beta-defensin copy number with COPD in the adult cohort (OR = 1.4, 95% CI: 1.02-1.92, P = 0.039) but this finding, and findings from a previous study, were not replicated in a larger follow-up sample (OR = 0.89, 95% CI: 0.72-1.07, P = 0.217). No robust evidence of association with asthma in children was observed. We found no evidence for association between beta-defensin copy number and lung function in the general populations. Our findings suggest that previous reports of association of beta-defensin copy number with COPD should be viewed with caution. Suboptimal measurement of copy number can lead to spurious associations. Further beta-defensin copy number measurement in larger samples of COPD cases and children with asthma is needed.

Relevance:

80.00%

Publisher:

Abstract:

Coat color and pattern variations in domestic animals are frequently inherited as simple monogenic traits, but a number are known to have a complex genetic basis. While the analysis of complex trait data remains a challenge in all species, we can use the reduced haplotypic diversity in domestic animal populations to gain insight into the genomic interactions underlying complex phenotypes. White face and leg markings are examples of complex traits in horses where little is known of the underlying genetics. In this study, Franches-Montagnes (FM) horses were scored for the occurrence of white facial and leg markings using a standardized scoring system. A genome-wide association study (GWAS) was performed for several white patterning traits in 1,077 FM horses. Seven quantitative trait loci (QTL) affecting the white marking score with p-values ≤ 10⁻⁴ were identified. Three loci, MC1R and the known white spotting genes, KIT and MITF, were identified as the major loci underlying the extent of white patterning in this breed. Together, the seven loci explain 54% of the genetic variance in total white marking score, while MITF and KIT alone account for 26%. Although MITF and KIT are the major loci controlling white patterning, their influence varies according to the basic coat color of the horse and the specific body location of the white patterning. Fine mapping across the MITF and KIT loci was used to characterize haplotypes present. Phylogenetic relationships among haplotypes were calculated to assess their selective and evolutionary influences on the extent of white patterning. This novel approach shows that KIT and MITF act in an additive manner and that accumulating mutations at these loci progressively increase the extent of white markings.

Relevance:

80.00%

Publisher:

Abstract:

Hereditary nasal parakeratosis (HNPK), an inherited monogenic autosomal recessive skin disorder, leads to crusts and fissures on the nasal planum of Labrador Retrievers. We performed a genome-wide association study (GWAS) using 13 HNPK cases and 23 controls. We obtained a single strong association signal on chromosome 2 (p(raw) = 4.4×10⁻¹⁴). The analysis of shared haplotypes among the 13 cases defined a critical interval of 1.6 Mb with 25 predicted genes. We re-sequenced the genome of one case at 38× coverage and detected 3 non-synonymous variants in the critical interval with respect to the reference genome assembly. We genotyped these variants in larger cohorts of dogs and only one was perfectly associated with the HNPK phenotype in a cohort of more than 500 dogs. This candidate causative variant is a missense variant in the SUV39H2 gene encoding a histone 3 lysine 9 (H3K9) methyltransferase, which mediates chromatin silencing. The variant c.972T>G is predicted to change an evolutionarily conserved asparagine into a lysine in the catalytically active domain of the enzyme (p.N324K). We further studied the histopathological alterations in the epidermis in vivo. Our data suggest that the HNPK phenotype is not caused by hyperproliferation, but rather by delayed terminal differentiation of keratinocytes. Thus, our data provide evidence that SUV39H2 is involved in the epigenetic regulation of keratinocyte differentiation, ensuring proper stratification and tight sealing of the mammalian epidermis.
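
The relationship between the coding-sequence change c.972T>G and the protein change p.N324K can be worked through with a little codon arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the reference codon AAT is an inference (asparagine is encoded by AAT or AAC, and the reference base at position 972 is T), not something stated in the abstract.

```python
# Subset of the standard genetic code relevant to this variant.
CODONS = {"AAT": "N", "AAC": "N", "AAA": "K", "AAG": "K"}


def codon_of(cds_position):
    """Map a 1-based coding-sequence position to (codon number, 0-based offset)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3


def substitute(codon, offset, ref, alt):
    """Apply a single-base substitution to a codon, checking the reference base."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


codon_number, offset = codon_of(972)   # third base of codon 324
ref_codon = "AAT"                      # inferred reference codon (assumption)
alt_codon = substitute(ref_codon, offset, "T", "G")

print(f"p.{CODONS[ref_codon]}{codon_number}{CODONS[alt_codon]}")
```

Position 972 falls on the third base of codon 324, and changing AAT to AAG swaps asparagine for lysine, which matches the p.N324K notation in the abstract.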