896 results for Whole genome mapping
Abstract:
Recent genomic analyses suggest the importance of combinatorial regulation by broadly expressed transcription factors rather than expression domains characterized by highly specific factors.
Abstract:
The recent emergence of human connectome imaging has led to high demand for angular and spatial resolution in diffusion magnetic resonance imaging (MRI). While there has been significant growth in high angular resolution diffusion imaging, the improvement in spatial resolution is still limited by a number of technical challenges, such as the low signal-to-noise ratio and high motion artifacts. As a result, the benefit of a high spatial resolution in whole-brain connectome imaging has not been fully evaluated in vivo. In this brief report, the impact of spatial resolution was assessed in a newly acquired whole-brain three-dimensional diffusion tensor imaging data set with an isotropic spatial resolution of 0.85 mm. It was found that the delineation of short cortical association fibers is drastically improved, as is the definition of fiber pathway endings into the gray/white matter boundary, both of which will help construct a more accurate structural map of the human brain connectome.
Abstract:
Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.
We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.
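The idea of maximizing a correlation-based objective with a Markov chain Monte Carlo search can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not the dissertation's thermodynamic model: the "model" is a single rectangular footprint of fixed width, the data are simulated, and the names `predicted_profile`, `objective`, and `metropolis_search` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def predicted_profile(pos, width, length):
    """Toy model (assumption): occupancy 1 inside [pos, pos + width), 0 elsewhere."""
    return [1.0 if pos <= i < pos + width else 0.0 for i in range(length)]


def objective(pos, observed, width):
    """Correlation between the predicted occupancy and the observed signal."""
    return pearson(predicted_profile(pos, width, len(observed)), observed)


def metropolis_search(observed, width, n_iter=2000, temperature=0.05, seed=0):
    """Metropolis-style random walk over footprint positions, tracking the best one seen."""
    rng = random.Random(seed)
    max_pos = len(observed) - width
    pos = rng.randint(0, max_pos)
    score = objective(pos, observed, width)
    start_score = score
    best_pos, best_score = pos, score
    for _ in range(n_iter):
        # Propose a small shift, clamped to the valid range of positions.
        cand = min(max_pos, max(0, pos + rng.choice([-3, -2, -1, 1, 2, 3])))
        cand_score = objective(cand, observed, width)
        # Always accept improvements; accept worse moves with a temperature-scaled probability.
        if cand_score >= score or rng.random() < math.exp((cand_score - score) / temperature):
            pos, score = cand, cand_score
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos, best_score, start_score


# Simulated signal (assumption): one footprint at TRUE_POS plus Gaussian noise.
LENGTH, WIDTH, TRUE_POS = 80, 12, 30
_noise = random.Random(1)
observed = [
    (1.0 if TRUE_POS <= i < TRUE_POS + WIDTH else 0.0) + _noise.gauss(0, 0.2)
    for i in range(LENGTH)
]

best_pos, best_score, start_score = metropolis_search(observed, WIDTH)
print("best position:", best_pos, "correlation:", round(best_score, 3))
```

The temperature controls how readily the chain accepts a worse position; a real application would replace the rectangular footprint with the model's predicted occupancy and would search over many parameters rather than a single position.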
We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.
Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover the underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). By incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of the protein binding landscape and that the most accurate inference comes from modeling all available datasets.
This dissertation provides a foundation for future research by taking a step toward the genome-wide inference of the protein-DNA interaction landscape through data integration.
Abstract:
We conducted a genome-wide association study testing single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) for association with early-onset myocardial infarction in 2,967 cases and 3,075 controls. We carried out replication in an independent sample with an effective sample size of up to 19,492. SNPs at nine loci reached genome-wide significance: three are newly identified (21q22 near MRPS6-SLC5A3-KCNE2, 6p24 in PHACTR1 and 2q33 in WDR12) and six replicated prior observations (refs. 1-4) (9p21, 1p13 near CELSR2-PSRC1-SORT1, 10q11 near CXCL12, 1q41 in MIA3, 19p13 near LDLR and 1p32 near PCSK9). We tested 554 common copy number polymorphisms (>1% allele frequency) and none met the pre-specified threshold for replication (P < 10(-3)). We identified 8,065 rare CNVs but did not detect a greater CNV burden in cases compared to controls, in genes compared to the genome as a whole, or at any individual locus. SNPs at nine loci were reproducibly associated with myocardial infarction, but tests of common and rare CNVs failed to identify additional associations with myocardial infarction risk.
Abstract:
Psychotic symptoms occur in ~40% of subjects with Alzheimer's disease (AD) and are associated with more rapid cognitive decline and increased functional deficits. They show heritability up to 61% and have been proposed as a marker for a disease subtype suitable for gene mapping efforts. We undertook a combined analysis of three genome-wide association studies (GWASs) to identify loci that (1) increase susceptibility to AD and subsequent psychotic symptoms; or (2) modify risk of psychotic symptoms in the presence of neurodegeneration caused by AD. In all, 1299 AD cases with psychosis (AD+P), 735 AD cases without psychosis (AD-P) and 5659 controls were drawn from Genetic and Environmental Risk in AD Consortium 1 (GERAD1), the National Institute on Aging Late-Onset Alzheimer's Disease (NIA-LOAD) family study and the University of Pittsburgh Alzheimer Disease Research Center (ADRC) GWASs. Unobserved genotypes were imputed to provide data on >1.8 million single-nucleotide polymorphisms (SNPs). Analyses in each data set were completed comparing (1) AD+P to AD-P cases, and (2) AD+P cases with controls (GERAD1, ADRC only). Aside from the apolipoprotein E (APOE) locus, the strongest evidence for association was observed in an intergenic region on chromosome 4 (rs753129; 'AD+PvAD-P' P=2.85 × 10(-7); 'AD+PvControls' P=1.11 × 10(-4)). SNPs upstream of SLC2A9 (rs6834555, P=3.0 × 10(-7)) and within VSNL1 (rs4038131, P=5.9 × 10(-7)) showed the strongest evidence for association with AD+P when compared with controls. These findings warrant further investigation in larger, appropriately powered samples in which the presence of psychotic symptoms in AD has been well characterized. Molecular Psychiatry advance online publication, 18 October 2011; doi:10.1038/mp.2011.125.
Abstract:
From our linkage study of Irish families with a high density of schizophrenia, we have previously reported evidence for susceptibility genes in regions 5q21-31, 6p24-21, 8p22-21, and 10p15-p11. In this report, we describe the cumulative results from independent genome scans of three a priori random subsets of 90 families each, and from multipoint analysis of all 270 families in ten regions. Of these ten regions, three (13q32, 18p11-q11, and 18q22-23) did not generate scores above the empirical baseline pairwise scan results, and one (6q13-26) generated a weak signal. Six other regions produced more positive pairwise and multipoint results. They showed the following maximum multipoint H-LOD (heterogeneity LOD) and NPL scores: 2p14-13: 0.89 (P = 0.06) and 2.08 (P = 0.02); 4q24-32: 1.84 (P = 0.007) and 1.67 (P = 0.03); 5q21-31: 2.88 (P = 0.0007) and 2.65 (P = 0.002); 6p25-24: 2.13 (P = 0.005) and 3.59 (P = 0.0005); 6p23: 2.42 (P = 0.001) and 3.07 (P = 0.001); 8p22-21: 1.57 (P = 0.01) and 2.56 (P = 0.005); 10p15-11: 2.04 (P = 0.005) and 1.78 (P = 0.03). The degree of 'internal replication' across subsets differed, with 5q, 6p, and 8p being most consistent and 2p and 10p being least consistent. On 6p, the data suggested the presence of two susceptibility genes, in 6p25-24 and 6p23-22. Very few families were positive on more than one region, and little correlation between regions was evident, suggesting substantial locus heterogeneity. The levels of statistical significance were modest, as expected from loci contributing to complex traits. However, our internal replications, when considered along with the positive results obtained in multiple other samples, suggest that most of these six regions are likely to contain genes that influence liability to schizophrenia.
Abstract:
Tuberculosis (TB) caused by Mycobacterium bovis is a re-emerging disease of livestock that is of major economic importance worldwide, as well as being a zoonotic risk. There is significant heritability for host resistance to bovine TB (bTB) in dairy cattle. To identify resistance loci for bTB, we undertook a genome-wide association study in female Holstein-Friesian cattle with 592 cases and 559 age-matched controls from case herds. Cases and controls were categorised into distinct phenotypes: skin test and lesion positive vs skin test negative on multiple occasions, respectively. These animals were genotyped with the Illumina BovineHD 700K BeadChip. Genome-wide rapid association using linear and logistic mixed models and regression (GRAMMAR), regional heritability mapping (RHM) and haplotype-sharing analysis identified two novel resistance loci that attained chromosome-wise significance, protein tyrosine phosphatase receptor T (PTPRT; P=4.8 × 10(-7)) and myosin IIIB (MYO3B; P=5.4 × 10(-6)). We estimated that 21% of the phenotypic variance in TB resistance could be explained by all of the informative single-nucleotide polymorphisms, of which the region encompassing the PTPRT gene accounted for 6.2% of the variance and a further 3.6% was associated with a putative copy number variant in MYO3B. The results from this study add to our understanding of variation in host control of infection and suggest that genetic marker-based selection for resistance to bTB has the potential to make a significant contribution to bTB control.
Abstract:
Marginal zone B-cell lymphomas (MZLs) have been divided into 3 distinct subtypes (extranodal MZLs of mucosa-associated lymphoid tissue [MALT] type, nodal MZLs, and splenic MZLs). Nevertheless, the relationship between the subtypes is still unclear. We performed a comprehensive analysis of genomic DNA copy number changes in a very large series of MZL cases with the aim of addressing this question. Samples from 218 MZL patients (25 nodal, 57 MALT, 134 splenic, and 2 not better specified MZLs) were analyzed with the Affymetrix Human Mapping 250K SNP arrays, and the data were combined with matched gene expression in 33 of 218 cases. MALT lymphoma significantly more frequently presented gains at 3p, 6p, and 18p and del(6q23) (TNFAIP3/A20), whereas splenic MZL was associated with del(7q31) and del(8p). Nodal MZLs did not show statistically significant differences compared with MALT lymphoma, while lacking the splenic MZL-related 7q losses. Gains of 3q and 18q were common to all 3 subtypes. del(8p) was often present together with del(17p) (TP53). Although del(17p) did not determine a worse outcome and del(8p) was only of borderline significance, the presence of both deletions had a highly significant negative impact on the outcome of splenic MZLs.
Abstract:
Despite years of investigation into triclabendazole (TCBZ) resistance in Fasciola hepatica, the genetic mechanisms responsible remain unknown. Extensive analysis of multiple triclabendazole-susceptible and -resistant isolates using a combination of experimental in vivo and in vitro approaches has been carried out, yet few, if any, genes have been demonstrated experimentally to be associated with resistance phenotypes in the field. In this review we summarize the current understanding of TCBZ resistance from the approaches employed to date. We report the current genomic and genetic resources for F. hepatica that are available to facilitate novel functional genomics and genetic experiments for this parasite in the future. Finally, we describe our own non-biased approach to mapping the major genetic loci involved in conferring TCBZ resistance in F. hepatica.
Abstract:
BACKGROUND: Evolution equipped Bdellovibrio bacteriovorus predatory bacteria to invade other bacteria, digesting them and replicating while sealed within them, thus preventing nutrient-sharing with organisms in the surrounding environment. Bdellovibrio were previously described as "obligate predators" because only by mutations, often in gene bd0108, are 1 in ~1x10(7) of predatory lab strains of Bdellovibrio converted to prey-independent growth. A previous genomic analysis of B. bacteriovorus strain HD100 suggested that predatory consumption of prey DNA by lytic enzymes made Bdellovibrio less likely than other bacteria to acquire DNA by lateral gene transfer (LGT). However, the Doolittle and Pan groups predicted, in silico, both ancient and recent lateral gene transfer into the B. bacteriovorus HD100 genome.
RESULTS: To test these predictions, we isolated a predatory bacterium from the River Tiber (a good potential source of LGT, as it is rich in diverse bacteria and organic pollutants) by enrichment culturing with E. coli prey cells. The isolate was identified as B. bacteriovorus and named strain Tiberius. Unusually, this Tiberius strain showed simultaneous prey-independent growth on organic nutrients and predatory growth on live prey. Despite the prey-independent growth, the homolog of bd0108 did not have typical prey-independent-type mutations. The dual growth mode may reflect the high carbon content of the river, and gives B. bacteriovorus Tiberius extended non-predatory contact with the other bacteria present. The HD100 and Tiberius genomes were extensively syntenic despite their different cultured-terrestrial/freshly-isolated aquatic histories, but there were significant differences in gene content indicative of genomic flux and LGT. Gene content comparisons support previously published in silico predictions for LGT in strain HD100, with substantial conservation of genes predicted to have ancient LGT origins but little conservation of AT-rich genes predicted to be recently acquired.
CONCLUSIONS: The natural niche and dual predatory and prey-independent growth of the B. bacteriovorus Tiberius strain afforded it extensive non-predatory contact with other marine and freshwater bacteria, from which LGT is evident in its genome. Thus, despite their arsenal of DNA-lytic enzymes, Bdellovibrio are not always predatory in natural niches and their genomes are shaped by acquiring whole genes from other bacteria.
Abstract:
Selection among broilers for performance traits is resulting in locomotion problems and bone disorders, since the skeletal structure is not strong enough to support body weight in broilers with high growth rates. In this study, genetic parameters were estimated for body weight at 42 days of age (BW42), and tibia traits (length, width, and weight) in a population of broiler chickens. Quantitative trait loci (QTL) were identified for tibia traits to expand our knowledge of the genetic architecture of the broiler population. Genetic correlations ranged from 0.56 +/- 0.18 (between tibia length and BW42) to 0.89 +/- 0.06 (between tibia width and weight), suggesting that these traits are controlled either by pleiotropic genes or by genes that are in linkage disequilibrium. For QTL mapping, the genome was scanned with 127 microsatellites, representing a coverage of 2630 cM. Eight QTL were mapped on Gallus gallus chromosomes (GGA): GGA1, GGA4, GGA6, GGA13, and GGA24. The QTL regions for tibia length and weight were mapped on GGA1, between LEI0079 and MCW145 markers. The gene DACH1 is located in this region; this gene acts to form the apical ectodermal ridge, responsible for limb development. Body weight at 42 days of age was included in the model as a covariate for selection effect of bone traits. Two QTL were found for tibia weight on GGA2 and GGA4, and one for tibia width on GGA3. Information originating from these QTL will assist in the search for candidate genes for these bone traits in future studies.
Abstract:
Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on (1)H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P<5×10(-8)) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10(-44)) and lysine (rs8101881, P = 1.2×10(-33)), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.
Abstract:
Many disorders are associated with altered serum protein concentrations, including malnutrition, cancer, and cardiovascular, kidney, and inflammatory diseases. Although these protein concentrations are highly heritable, relatively little is known about their underlying genetic determinants. Through transethnic meta-analysis of European-ancestry and Japanese genome-wide association studies, we identified six loci at genome-wide significance (p < 5 × 10(-8)) for serum albumin (HPN-SCN1B, GCKR-FNDC4, SERPINF2-WDR81, TNFRSF11A-ZCCHC2, FRMD5-WDR76, and RPS11-FCGRT, in up to 53,190 European-ancestry and 9,380 Japanese individuals) and three loci for total protein (TNFRS13B, 6q21.3, and ELL2, in up to 25,539 European-ancestry and 10,168 Japanese individuals). We observed little evidence of heterogeneity in allelic effects at these loci between groups of European and Japanese ancestry but obtained substantial improvements in the resolution of fine mapping of potential causal variants by leveraging transethnic differences in the distribution of linkage disequilibrium. We demonstrated a functional role for the most strongly associated serum albumin locus, HPN, for which Hpn knockout mice manifest low plasma albumin concentrations. Other loci associated with serum albumin harbor genes related to ribosome function, protein translation, and proteasomal degradation, whereas those associated with serum total protein include genes related to immune function. Our results highlight the advantages of transethnic meta-analysis for the discovery and fine mapping of complex trait loci and have provided initial insights into the underlying genetic architecture of serum protein concentrations and their association with human disease.
Abstract:
Whole-grain foods are touted for multiple health benefits, including enhancing insulin sensitivity and reducing type 2 diabetes risk. Recent genome-wide association studies (GWAS) have identified several single nucleotide polymorphisms (SNPs) associated with fasting glucose and insulin concentrations in individuals free of diabetes. We tested the hypothesis that whole-grain food intake and genetic variation interact to influence concentrations of fasting glucose and insulin. Via meta-analysis of data from 14 cohorts comprising ~48,000 participants of European descent, we studied interactions of whole-grain intake with loci previously associated in GWAS with fasting glucose (16 loci) and/or insulin (2 loci) concentrations. For tests of interaction, we considered a P value <0.0028 (0.05 divided by 18 tests) as statistically significant. Greater whole-grain food intake was associated with lower fasting glucose and insulin concentrations independent of demographics, other dietary and lifestyle factors, and BMI (β [95% CI] per 1-serving-greater whole-grain intake: -0.009 mmol/l glucose [-0.013 to -0.005], P < 0.0001 and -0.011 pmol/l [ln] insulin [-0.015 to -0.007], P = 0.0003). No interactions met our multiple testing-adjusted statistical significance threshold. The strongest SNP interaction with whole-grain intake was rs780094 (GCKR) for fasting insulin (P = 0.006), where greater whole-grain intake was associated with a smaller reduction in fasting insulin concentrations in those with the insulin-raising allele. Our results support the favorable association of whole-grain intake with fasting glucose and insulin and suggest a potential interaction between variation in GCKR and whole-grain intake in influencing fasting insulin concentrations.
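The significance threshold quoted in this abstract follows from a Bonferroni-style correction across the 18 loci tested (16 glucose, 2 insulin). A minimal sketch of that arithmetic is given here; the function name `bonferroni_threshold` is a hypothetical helper, and the only inputs are numbers stated in the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold: family-wise alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


n_tests = 16 + 2                      # fasting glucose loci + fasting insulin loci
threshold = bonferroni_threshold(0.05, n_tests)
print(f"{threshold:.4f}")             # prints 0.0028

p_gckr = 0.006                        # strongest interaction reported (rs780094, GCKR)
print(p_gckr < threshold)             # prints False: it does not meet the threshold
```

This matches the abstract's statement that no interaction met the multiple testing-adjusted threshold, even though the GCKR interaction was the strongest observed.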