23 results for Human genome - Theses
at Duke University
Abstract:
Photographs from the February 1997 Bermuda meeting. Courtesy of Gert-Jan van Ommen.
Abstract:
Lipoprotein-associated phospholipase A(2) (Lp-PLA(2)) is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA(2) activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA(2) activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA(2) activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6 x 10(-24)); CELSR2/PSRC1 on chromosome 1 (p = 3 x 10(-15)); SCARB1 on chromosome 12 (p = 1 x 10(-8)); and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4 x 10(-8)). All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA(2) mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA(2) activity and mass.
Abstract:
The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function.
Abstract:
Improvements in genomic technology, both in the increased speed and reduced cost of sequencing, have expanded the appreciation of the abundance of human genetic variation. However, the sheer amount of variation, as well as the varying type and genomic content of variation, poses a challenge in understanding the clinical consequence of a single mutation. This work uses several methodologies to interpret the observed variation in the human genome, and presents novel strategies for the prediction of allele pathogenicity.
Using the zebrafish model system as an in vivo assay of allele function, we identified a novel driver of Bardet-Biedl Syndrome (BBS) in CEP76. A combination of targeted sequencing of 785 cilia-associated genes in a cohort of BBS patients and subsequent in vivo functional assays recapitulating the human phenotype gave strong evidence for the role of CEP76 mutations in the pathology of an affected family. This portion of the work demonstrated the necessity of functional testing in validating disease-associated mutations, and added to the catalogue of known BBS disease genes.
Further study into the role of copy-number variations (CNVs) in a cohort of BBS patients showed the significant contribution of CNVs to disease pathology. Using high-density array comparative genomic hybridization (aCGH) we were able to identify pathogenic CNVs as small as several hundred bp. Dissection of constituent genes and in vivo experiments investigating epistatic interactions between affected genes allowed for an appreciation of several paradigms by which CNVs can contribute to disease. This study revealed that the contribution of CNVs to disease in BBS patients is much higher than previously expected, and demonstrated the necessity of considering CNV contribution in future (and retrospective) investigations of human genetic disease.
Finally, we used a combination of comparative genomics and in vivo complementation assays to identify second-site compensatory modification of pathogenic alleles. These pathogenic alleles, which are found compensated in other species (termed compensated pathogenic deviations [CPDs]), represent a significant fraction (from 3 to 10%) of human disease-associated alleles. In silico pathogenicity prediction algorithms, a valuable method of allele prioritization, often misrepresent these alleles as benign, leading to omission of possibly informative variants in studies of human genetic disease. We created a mathematical model that was able to predict CPDs and putative compensatory sites, and functionally showed in vivo that second-site mutation can mitigate the pathogenicity of disease alleles. Additionally, we made publicly available an in silico module for the prediction of CPDs and modifier sites.
These studies have advanced the ability to interpret the pathogenicity of multiple types of human variation and have made tools available for others to do the same.
Abstract:
BACKGROUND: Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study's analysis plan. RESULTS: We developed a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. 
Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations. CONCLUSIONS: We show how diverse functional annotations can be efficiently combined to create 'functional signatures' that predict the a priori odds of a variant's association to a trait and how these signatures can be integrated into a standard genome-wide-scale association analysis, resulting in improved power to detect truly associated variants.
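The idea of an annotation-based prior can be illustrated with a minimal sketch. It assumes a logistic form for the binary regression and assumes the prior is combined with the association evidence through a Bayes factor; the abstract does not specify either detail, and the function names, annotation names and coefficient values here are hypothetical.

```python
import math


def prior_probability(annotations, coefficients, intercept):
    """Prior probability of association from a logistic regression on annotations.

    `annotations` maps a (hypothetical) annotation name to its value for one SNP;
    `coefficients` maps annotation names to fitted weights. Annotations without
    a coefficient contribute nothing.
    """
    linear = intercept + sum(
        coefficients.get(name, 0.0) * value for name, value in annotations.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def posterior_probability(prior, bayes_factor):
    """Update a prior probability with a Bayes factor (assumed evidence summary)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# Illustrative values only; not taken from the study.
snp_annotations = {"conserved": 1.0, "in_regulatory_element": 0.0}
weights = {"conserved": 0.8, "in_regulatory_element": 1.2}
prior = prior_probability(snp_annotations, weights, intercept=-9.0)
posterior = posterior_probability(prior, bayes_factor=50.0)
```

Under these assumptions, a SNP with favourable annotations starts from higher prior odds, so the same association evidence yields a higher posterior probability.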
Abstract:
Pharmacogenomics (PGx) offers the promise of utilizing genetic fingerprints to predict individual responses to drugs in terms of safety, efficacy and pharmacokinetics. Early-phase clinical trial PGx applications can identify human genome variations that are meaningful to study design, selection of participants, allocation of resources and clinical research ethics. Results can inform later-phase study design and pipeline developmental decisions. Nevertheless, our review of the clinicaltrials.gov database demonstrates that PGx is rarely used by drug developers. Of the total 323 trials that included PGx as an outcome, 80% have been conducted by academic institutions after initial regulatory approval. Barriers for the application of PGx are discussed. We propose a framework for the role of PGx in early-phase drug development and recommend PGx be universally considered in study design, result interpretation and hypothesis generation for later-phase studies, but PGx results from underpowered studies should not be used by themselves to terminate drug-development programs.
Abstract:
CD8+ T cells are associated with long-term control of virus replication to low or undetectable levels in a population of HIV+ therapy-naïve individuals known as virus controllers (VCs; <5000 RNA copies/ml and CD4+ lymphocyte counts >400 cells/µl). These subjects' ability to control viremia in the absence of therapy makes them the gold standard for the type of CD8+ T-cell response that should be induced with a vaccine. Studying the regulation of CD8+ T cell responses in these VCs provides the opportunity to discover mechanisms of durable control of HIV-1. Previous research has shown that the CD8+ T cell population in VCs is heterogeneous in its ability to inhibit virus replication and distinct T cells are responsible for virus inhibition. Further defining both the functional properties and regulation of the specific features of the select CD8+ T cells responsible for potent control of viremia in the VCs would enable better evaluation of T cell-directed vaccine strategies and may inform the design of new therapies.
Here we discuss the progress made in elucidating the features and regulation of CD8+ T cell response in virus controllers. We first detail the development of assays to quantify CD8+ T cells' ability to inhibit virus replication. This includes the use of a multi-clade HIV-1 panel which can subsequently be used as a tool for evaluation of T cell directed vaccines. We used these assays to evaluate the CD8+ response among cohorts of HIV-1 seronegative, HIV-1 acutely infected, and HIV-1 chronically infected (both VC and chronic viremic) patients. Contact and soluble CD8+ T cell virus inhibition assays (VIAs) are able to distinguish these patient groups based on the presence and magnitude of the responses. When employed in conjunction with peptide stimulation, the soluble assay reveals peptide stimulation induces CD8+ T cell responses with a prevalence of Gag p24 and Nef specificity among the virus controllers tested. Given this prevalence, we aimed to determine the gene expression profile of Gag p24-, Nef-, and unstimulated CD8+ T cells. RNA was isolated from CD8+ T-cells from two virus controllers with strong virus inhibition and one seronegative donor after a 5.5 hour stimulation period then analyzed using the Illumina Human BeadChip platform (Duke Center for Human Genome Variation). Analysis revealed that 565 (242 Nef and 323 Gag) genes were differentially expressed in CD8+ T-cells that were able to inhibit virus replication compared to those that could not. We compared the differentially expressed genes to published data sets from other CD8+ T-cell effector function experiments focusing our analysis on the most recurring genes with immunological, gene regulatory, apoptotic or unknown functions. The most commonly identified gene in these studies was TNFRSF9. Using PCR in a larger cohort of virus controllers we confirmed the up-regulation of TNFRSF9 in Gag p24 and Nef-specific CD8+ T cell mediated virus inhibition. 
We also observed an increase in the mRNA encoding antiviral cytokines macrophage inflammatory proteins (MIP-1α, MIP-1αP, MIP-1β), interferon gamma (IFN-γ), granulocyte-macrophage colony-stimulating factor (GM-CSF), and recently identified lymphotactin (XCL1).
Our previous work suggests the CD8+ T-cell response to HIV-1 can be regulated at the level of gene regulation. Because RNA abundance is modulated by transcription of new mRNAs and decay of new and existing RNA, we aimed to evaluate the net rate of transcription and mRNA decay for the cytokines we identified as differentially regulated. To estimate the rate of mRNA synthesis and decay, we stimulated isolated CD8+ T-cells with Gag p24 and Nef peptides, adding 4-thiouridine (4SU) during the final hour of stimulation, allowing for separation of RNA made during the final hour of stimulation. Subsequent PCR of RNA isolated from these cells allowed us to determine how much mRNA was made for our genes of interest during the final hour, which we used to calculate the rate of transcription. To assess if stimulation caused a change in RNA stability, we calculated the decay rates of these mRNAs over time. In Gag p24 and Nef stimulated T cells, the abundance of the mRNA of many of the cytokines examined was dependent on changes in both transcription and mRNA decay, with evidence for potential differences in the regulation of mRNA between Nef and Gag specific CD8+ T cells. The results were highly reproducible: in one subject measured in three independent experiments, the results were concordant.
These data suggest that mRNA stability, in addition to transcription, is key in regulating the direct anti-HIV-1 function of antigen-specific memory CD8+ T cells by enabling rapid recall of anti-HIV-1 effector functions, namely the production and increased stability of antiviral cytokines. We have started to uncover the mechanisms employed by CD8+ T cell subsets with antigen-specific anti-HIV-1 activity, in turn enhancing our ability to inhibit virus replication by informing both cure strategies and HIV-1 vaccine designs that aim to reduce transmission and can aid in blocking HIV-1 acquisition.
Abstract:
cERMIT is a computationally efficient motif discovery tool based on analyzing genome-wide quantitative regulatory evidence. Instead of pre-selecting promising candidate sequences, it utilizes information across all sequence regions to search for high-scoring motifs. We apply cERMIT to a range of direct binding and overexpression datasets; it substantially outperforms state-of-the-art approaches on curated ChIP-chip datasets, and easily scales to current mammalian ChIP-seq experiments with data on thousands of non-coding regions.
Abstract:
There is growing evidence that the complexity of higher organisms does not correlate with the ‘complexity’ of the genome (the human genome contains fewer protein coding genes than corn, and many genes are preserved across species). Rather, complexity is associated with the complexity of the pathways and processes whereby the cell utilises the deoxyribonucleic acid molecule, and much else, in the process of phenotype formation. These processes include the activity of the epigenome, noncoding ribonucleic acids, alternative splicing and post-translational modifications. Not accidentally, all of these processes appear to be of particular importance for the human brain, the most complex organ in nature. Because these processes can be highly environmentally reactive, they are a key to understanding behavioural plasticity and highlight the importance of the developmental process in explaining behavioural outcomes.
Abstract:
Email exchange in 2013 between Kathryn Maxson (Duke) and Kris Wetterstrand (NHGRI), regarding country funding and other data for the HGP sequencing centers. Also includes the email request for such information, from NHGRI to the centers, in 2000, and the aggregate data collected.
Abstract:
Jean Weissenbach, telephone interview by Kathryn Maxson and Robert Cook-Deegan, conducted from Durham, NC, 09 February 2012. Jean Weissenbach, a leader in French genetic mapping, directed the French national sequencing center, Généthon, during the HGP and was instrumental in helping to build agreement to the Bermuda Principles in France.
Abstract:
Mark Guyer and Jane Peterson, in-person interview with Kathryn Maxson and Robert Cook-Deegan, conducted in Rockville, MD (NIH campus), 18 August 2011. Mark Guyer and Jane Peterson were grants program officers at the NIH during the HGP, and were some of the longest-standing employees in the HGP administrative structure. Both witnessed the transformation of the Office of Genome Research into the National Center for Human Genome Research and, finally, the National Human Genome Research Institute. They were close participants in the history of the Bermuda Principles within the NIH.
Abstract:
BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost-effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
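The comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes that the whole-genome calls are treated as the reference set and that each variant is represented by a (chromosome, position, ref, alt) tuple. The function name and the example variants are hypothetical.

```python
def compare_calls(wgs_variants, rnaseq_variants):
    """Compare RNA-Seq variant calls against whole-genome calls.

    Assumption: the whole-genome calls are treated as the truth set.
    Returns (sensitivity, false_positive_fraction), where the second value is
    the fraction of RNA-Seq calls that are absent from the whole-genome set.
    """
    wgs = set(wgs_variants)
    rna = set(rnaseq_variants)
    shared = wgs & rna
    sensitivity = len(shared) / len(wgs) if wgs else float("nan")
    false_positive_fraction = len(rna - wgs) / len(rna) if rna else float("nan")
    return sensitivity, false_positive_fraction


# Made-up variants for illustration only.
wgs_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 3000, "G", "A"),
    ("chr2", 4000, "T", "C"),
    ("chr3", 5000, "A", "C"),
]
rnaseq_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 3000, "G", "A"),
    ("chr7", 9000, "C", "G"),  # not in the whole-genome set
]
sensitivity, fp_fraction = compare_calls(wgs_calls, rnaseq_calls)
```

Restricting both input sets to variants in well-expressed genes before calling the function would mirror the stratified comparison the abstract reports.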
Abstract:
BACKGROUND: The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. RESULTS: We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. CONCLUSION: The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible.
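How base-calling quality scores can feed a probabilistic error model is sketched below. This is a deliberately simplified illustration and not the trace assembler described in the abstract: it assumes Phred-scaled qualities (error probability 10^(-Q/10)), a uniform prior over the four bases, and errors spread equally over the three other bases. All function names are hypothetical.

```python
import math


def phred_to_error_prob(quality):
    """Convert a Phred-scaled quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def base_posteriors(observations):
    """Posterior over A/C/G/T at one position from (called_base, quality) pairs.

    Assumptions (simplifications for illustration): uniform prior over bases,
    independent reads, and miscalls equally likely to be any of the other
    three bases. Error probabilities are capped at 0.75, the value at which
    a call carries no information.
    """
    bases = "ACGT"
    log_like = {b: 0.0 for b in bases}
    for called, quality in observations:
        err = min(phred_to_error_prob(quality), 0.75)
        for b in bases:
            log_like[b] += math.log(1.0 - err) if b == called else math.log(err / 3.0)
    # Normalise in log space to avoid underflow with many reads.
    top = max(log_like.values())
    weights = {b: math.exp(v - top) for b, v in log_like.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


# Two higher-quality reads calling A and one low-quality read calling C.
posteriors = base_posteriors([("A", 30), ("A", 20), ("C", 10)])
best_base = max(posteriors, key=posteriors.get)
```

Weighting each read by its quality lets a few confident calls outweigh low-quality disagreement, which matters most when coverage is shallow.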
Abstract:
Genome rearrangement often produces chromosomes with two centromeres (dicentrics) that are inherently unstable because of bridge formation and breakage during cell division. However, mammalian dicentrics, and particularly those in humans, can be quite stable, usually because one centromere is functionally silenced. Molecular mechanisms of centromere inactivation are poorly understood since there are few systems to experimentally create dicentric human chromosomes. Here, we describe a human cell culture model that enriches for de novo dicentrics. We demonstrate that transient disruption of human telomere structure non-randomly produces dicentric fusions involving acrocentric chromosomes. The induced dicentrics vary in structure near fusion breakpoints and, like naturally occurring dicentrics, exhibit various inter-centromeric distances. Many functional dicentrics persist for months after formation. Even those with distantly spaced centromeres remain functionally dicentric for 20 cell generations. Other dicentrics within the population reflect centromere inactivation. In some cases, centromere inactivation occurs by an apparently epigenetic mechanism. In other dicentrics, the size of the alpha-satellite DNA array associated with CENP-A is reduced compared to the same array before dicentric formation. Extra-chromosomal fragments that contained CENP-A often appear in the same cells as dicentrics. Some of these fragments are derived from the same alpha-satellite DNA array as inactivated centromeres. Our results indicate that dicentric human chromosomes undergo alternative fates after formation. Many retain two active centromeres and are stable through multiple cell divisions. Others undergo centromere inactivation. This event occurs within a broad temporal window and can involve deletion of chromatin that marks the locus as a site for CENP-A maintenance/replenishment.