957 results for human genome variation
Abstract:
BACKGROUND: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation and insight into the functional significance of genomic regions. Ongoing genome projects will radically improve our ability to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. We have also incorporated a graphical user interface. The main features of this software are: i) exhaustive population-genetic analyses, including those based on coalescent theory; ii) analysis adapted to the shallow data generated by high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for exhaustive exploratory analysis of genome-wide DNA polymorphism data.
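The sliding-window approach mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not VariScan's implementation: the function names, the choice of nucleotide diversity (average pairwise differences per site) as the statistic, and the assumption of gap-free aligned sequences of equal length are all illustrative assumptions.

```python
from typing import List, Tuple


def pairwise_differences(a: str, b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average number of pairwise differences per site.

    Assumes (hypothetically) that all sequences are aligned,
    gap-free, and of equal length.
    """
    n = len(seqs)
    if n < 2 or len(seqs[0]) == 0:
        return 0.0
    length = len(seqs[0])
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += pairwise_differences(seqs[i], seqs[j])
    pairs = n * (n - 1) // 2
    return total / (pairs * length)


def sliding_window_pi(
    seqs: List[str], window: int, step: int
) -> List[Tuple[int, float]]:
    """Return (window start, diversity) for each full window."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


# Toy alignment: variation is concentrated in the last two sites.
toy_alignment = ["AAAA", "AAAT", "AATT"]
windows = sliding_window_pi(toy_alignment, window=2, step=2)
```

On the toy alignment, the first window is invariant and the second carries all the variation, which is the kind of local contrast a sliding-window scan is meant to expose.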
Abstract:
Genome-wide association studies (GWAS) are conducted with the promise to discover novel genetic variants associated with diverse traits. For most traits, associated markers individually explain just a modest fraction of the phenotypic variation, but their number can well be in the hundreds. We developed a maximum likelihood method that allows us to infer the distribution of associated variants even when many of them were missed by chance. Compared to previous approaches, the novelty of our method is that it (a) does not require having an independent (unbiased) estimate of the effect sizes; (b) makes use of the complete distribution of P-values while allowing for the false discovery rate; (c) takes into account allelic heterogeneity and the SNP pruning strategy. We applied our method to the latest GWAS meta-analysis results of the GIANT consortium. It revealed that while the explained variance of genome-wide (GW) significant SNPs is around 1% for waist-hip ratio (WHR), the observed P-values provide evidence for the existence of variants explaining 10% (CI=[8.5-11.5%]) of the phenotypic variance in total. Similarly, the total explained variance likely to exist for height is estimated to be 29% (CI=[28-30%]), three times higher than what the observed GW significant SNPs give rise to. This methodology also enables us to predict the benefit of future GWA studies that aim to reveal more associated genetic markers via increased sample size.
Abstract:
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Abstract:
Discussion on improving the power of genome-wide association studies to identify candidate variants and genes is generally centered on issues of maximizing sample size; less attention is given to the role of phenotype definition and ascertainment. The authors used genome-wide data from patients infected with human immunodeficiency virus type 1 (HIV-1) to assess whether differences in type of population (622 seroconverters vs. 636 seroprevalent subjects) or the number of measurements available for defining the phenotype resulted in differences in the effect sizes of associations between single nucleotide polymorphisms and the phenotype, HIV-1 viral load at set point. The effect estimate for the top 100 single nucleotide polymorphisms was 0.092 (95% confidence interval: 0.074, 0.110) log(10) viral load (log(10) copies of HIV-1 per mL of blood) greater in seroconverters than in seroprevalent subjects. The difference was even larger when the authors focused on chromosome 6 variants (0.153 log(10) viral load) or on variants that achieved genome-wide significance (0.232 log(10) viral load). The estimates of the genetic effects tended to be slightly larger when more viral load measurements were available, particularly among seroconverters and for variants that achieved genome-wide significance. Differences in phenotype definition and ascertainment may affect the estimated magnitude of genetic effects and should be considered in optimizing power for discovering new associations.
Abstract:
Genetic variants influence the risk of developing certain diseases or give rise to differences in drug response. Recent progress in cost-effective, high-throughput genome-wide techniques, such as microarrays measuring Single Nucleotide Polymorphisms (SNPs), has facilitated genotyping of large clinical and population cohorts. Combining the massive genotypic data with measurements of phenotypic traits allows for the determination of genetic differences that explain, at least in part, the phenotypic variations within a population. So far, models combining the most significant variants can only explain a small fraction of the variance, indicating the limitations of current models. In particular, researchers have only begun to address the possibility of interactions between genotypes and the environment. Elucidating the contributions of such interactions is a difficult task because of the large number of genetic as well as possible environmental factors. In this thesis, I worked on several projects within this context. My first and main project was the identification of possible SNP-environment interactions, where the phenotypes were serum lipid levels of patients from the Swiss HIV Cohort Study (SHCS) treated with antiretroviral therapy. Here the genotypes consisted of a limited set of SNPs in candidate genes relevant for lipid transport and metabolism. The environmental variables were the specific combinations of drugs given to each patient over the treatment period. My work explored bioinformatic and statistical approaches to relate patients' lipid responses to these SNPs, drugs and, importantly, their interactions. The goal of this project was to improve our understanding and to explore the possibility of predicting dyslipidemia, a well-known adverse drug reaction of antiretroviral therapy. 
Specifically, I quantified how much of the variance in lipid profiles could be explained by the host genetic variants, the administered drugs and SNP-drug interactions, and assessed the predictive power of these features on lipid responses. Using cross-validation stratified by patients, we could not validate our hypothesis that models that select a subset of SNP-drug interactions in a principled way have better predictive power than the control models using "random" subsets. Nevertheless, all models tested containing SNP and/or drug terms exhibited significant predictive power (as compared to a random predictor) and explained a sizable proportion of variance in the patient-stratified cross-validation context. Importantly, the model containing stepwise-selected SNP terms showed higher capacity to predict triglyceride levels than a model containing randomly selected SNPs. Dyslipidemia is a complex trait for which many factors remain to be discovered and are thus missing from the data, which may explain the limitations of our analysis. In particular, the interactions of drugs with SNPs selected from the set of candidate genes likely have small effect sizes, which we were unable to detect in a sample of the present size (<800 patients). In the second part of my thesis, I performed genome-wide association studies within the Cohorte Lausannoise (CoLaus). I have been involved in several international projects to identify SNPs that are associated with various traits, such as serum calcium, body mass index, two-hour glucose levels, as well as metabolic syndrome and its components. These phenotypes are all related to major human health issues, such as cardiovascular disease. I applied statistical methods to detect new variants associated with these phenotypes, contributing to the identification of new genetic loci that may lead to new insights into the genetic basis of these traits. 
This kind of research will lead to a better understanding of the mechanisms underlying these pathologies, a better evaluation of disease risk, the identification of new therapeutic leads and may ultimately lead to the realization of "personalized" medicine.
Abstract:
Background: Recent studies in pigs have detected copy number variants (CNVs) using the Comparative Genomic Hybridization technique in arrays designed to cover specific porcine chromosomes. The goal of this study was to identify CNV regions (CNVRs) in swine species based on whole genome SNP genotyping chips. Results: We used predictions from three different programs (cnvPartition, PennCNV and GADA) to analyze data from the Porcine SNP60 BeadChip. A total of 49 CNVRs were identified in 55 animals from an Iberian x Landrace cross (IBMAP) according to three criteria: detected in at least two animals, containing three or more consecutive SNPs, and recalled by at least two programs. Mendelian inheritance of CNVRs was confirmed in animals belonging to several generations of the IBMAP cross. Subsequently, a segregation analysis of these CNVRs was performed in 372 additional animals from the IBMAP cross, and their distribution was studied in 133 unrelated pig samples from different geographical origins. Five out of seven analyzed CNVRs were validated by real-time quantitative PCR, some of which coincide with well-known examples of CNVs conserved across mammalian species. Conclusions: Our results illustrate the usefulness of the Porcine SNP60 BeadChip to detect CNVRs and show that structural variants cannot be neglected when studying the genetic variability in this species.
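The three inclusion criteria stated in the abstract above (at least two animals, three or more consecutive SNPs, at least two programs) can be expressed as a simple filter. The sketch below is only an illustration: the `CandidateRegion` record, its field names, and the example regions are hypothetical and do not come from the study or from any of the three programs.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class CandidateRegion:
    """Hypothetical summary of one candidate CNV region."""
    name: str
    animals: FrozenSet[str]      # animals in which the region was detected
    n_consecutive_snps: int      # consecutive SNPs covered by the region
    programs: FrozenSet[str]     # programs that called the region


def passes_criteria(
    region: CandidateRegion,
    min_animals: int = 2,
    min_snps: int = 3,
    min_programs: int = 2,
) -> bool:
    """Apply the three thresholds given in the abstract."""
    return (
        len(region.animals) >= min_animals
        and region.n_consecutive_snps >= min_snps
        and len(region.programs) >= min_programs
    )


def filter_cnvrs(regions: List[CandidateRegion]) -> List[CandidateRegion]:
    """Keep only the regions that satisfy all three criteria."""
    return [r for r in regions if passes_criteria(r)]


# Made-up example regions, for illustration only.
candidates = [
    CandidateRegion("region_a", frozenset({"pig1", "pig2"}), 5,
                    frozenset({"PennCNV", "GADA"})),
    CandidateRegion("region_b", frozenset({"pig1"}), 8,
                    frozenset({"PennCNV", "GADA", "cnvPartition"})),
    CandidateRegion("region_c", frozenset({"pig1", "pig3"}), 2,
                    frozenset({"PennCNV", "GADA"})),
    CandidateRegion("region_d", frozenset({"pig2", "pig3"}), 4,
                    frozenset({"cnvPartition"})),
]
kept = [r.name for r in filter_cnvrs(candidates)]
```

Requiring agreement between programs trades sensitivity for specificity: a region seen by a single caller is dropped even if it spans many SNPs.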
Abstract:
Numerous links between genetic variants and phenotypes are known, and genome-wide association studies have dramatically increased the number of genetic variants associated with traits during the last decade. However, how changes in the DNA perturb molecular mechanisms and affect the phenotype of an organism remains elusive. Studies suggest that many trait-associated variants are in the non-coding region of the genome and probably act through regulation of gene expression. During my thesis I investigated how genetic variants affect gene expression through gene regulatory mechanisms. The first chapter was a collaborative project with a pharmaceutical company, where we investigated genome-wide copy number variants (CNVs) among Cynomolgus monkeys (Macaca fascicularis) used in pharmaceutical studies, and associated them with changes in gene expression. We found substantial copy number variation and identified CNVs linked to tissue-specific expression changes of proximal genes. The second and third chapters focus on genetic variation in humans and its effects on gene regulatory mechanisms and gene expression. The second chapter studies two human trios, where the allelic effects of genetic variation on genome-wide gene expression, protein-DNA binding and chromatin modifications were investigated. We found abundant allele-specific activity across all measured molecular phenotypes and show extended coordinated behavior among them. In the third chapter, we investigated the impact of genetic variation on these phenotypes in 47 unrelated individuals. We found that chromatin phenotypes are organized into local variable modules, often linked to genetic variation and gene expression. Our results suggest that chromatin variation emerges as a result of perturbations of cis-regulatory elements by genetic variants, leading to gene expression changes. The work of this thesis provides novel insights into how genetic variation impacts gene expression by perturbing regulatory mechanisms. 
-- Numerous links between genetic variants and phenotypes are known. Over the last decade, genome-wide association studies have considerably increased the number of genetic variants associated with phenotypes. However, how these changes perturb molecular mechanisms and affect the phenotype of an organism still eludes us. Studies suggest that many phenotype-associated variants lie in non-coding regions of the genome and are likely to act by modifying the regulation of gene expression. During my thesis, I studied how genetic variants affect gene expression levels by perturbing the mechanisms that regulate their expression. The work presented in the first chapter is a collaborative project with a pharmaceutical company. We studied the copy number variants (CNVs) present in the cynomolgus monkey (Macaca fascicularis), which is used in pharmaceutical studies, and associated them with changes in gene expression. We found substantial copy number variability and identified CNVs linked to expression changes of nearby genes. These associations are present or absent in a tissue-specific manner. The second and third chapters focus on genetic variation in human populations and its effects on gene regulatory mechanisms and gene expression. The former examines two human trios (father, mother, child), in which we studied the allelic effects of genetic variation on gene expression, protein-DNA binding and chromatin modifications. 
We found that allele-specific activity is abundant across all of these molecular phenotypes and showed that they behave in a coordinated manner. In the latter, we examined the impact of genetic variation on these molecular phenotypes in 47 unrelated individuals. We observed that chromatin phenotypes are organized into local modules, which are linked to genetic variation and gene expression. Our results suggest that chromatin variability is due to genetic variants that perturb cis-regulatory elements, and can lead to changes in gene expression. The work presented in this thesis provides new leads for understanding the impact of different genetic variants on gene expression through regulatory mechanisms.
Abstract:
Several human genetic syndromes have long been recognized to involve defects in DNA repair mechanisms. This was first discovered by Cleaver (1968), who showed that cells from patients with xeroderma pigmentosum (XP) were defective in their ability to remove ultraviolet (UV)-induced lesions from their genome. Since then, new discoveries have made DNA repair one of the most exciting areas of molecular biology. The present work gives a brief summary of the main known human genetic diseases related to DNA repair and how they may be linked to acquired diseases such as cancer.
Coping with genetic diversity: the contribution of pathogen and human genomics to modern vaccinology
Abstract:
Vaccine development faces major difficulties partly because of genetic variation in both infectious organisms and humans. This causes antigenic variation in infectious agents and a high interindividual variability in the human response to the vaccine. The exponential growth of genome sequence information has induced a shift from conventional culture-based to genome-based vaccinology, and allows the tackling of challenges in vaccine development due to pathogen genetic variability. Additionally, recent advances in immunogenetics and genomics should help in the understanding of the influence of genetic factors on the interindividual and interpopulation variations in immune responses to vaccines, and could be useful for developing new vaccine strategies. Accumulating results provide evidence for the existence of a number of genes involved in protective immune responses that are induced either by natural infections or vaccines. Variation in immune responses could be viewed as the result of a perturbation of gene networks; this should help in understanding how a particular polymorphism or a combination thereof could affect protective immune responses. Here we will present: i) the first genome-based vaccines that served as proof of concept, and that provided new critical insights into vaccine development strategies; ii) an overview of genetic predisposition in infectious diseases and genetic control in responses to vaccines; iii) population genetic differences that are a rationale behind group-targeted vaccines; iv) an outlook for genetic control in infectious diseases, with special emphasis on the concept of molecular networks that will provide a structure to the huge amount of genomic data.
Abstract:
DNA methylation is essential in X chromosome inactivation and genomic imprinting, maintaining repression of XIST in the active X chromosome and monoallelic repression of imprinted genes. Disruption of the DNA methyltransferase genes DNMT1 and DNMT3B in the HCT116 cell line (DKO cells) leads to global DNA hypomethylation and biallelic expression of the imprinted gene IGF2 but does not lead to reactivation of XIST expression, suggesting that XIST repression is due to a more stable epigenetic mark than imprinting. To test this hypothesis, we induced acute hypomethylation in HCT116 cells by 5-aza-2′-deoxycytidine (5-aza-CdR) treatment (HCT116-5-aza-CdR) and compared that to DKO cells, evaluating DNA methylation by microarray and monitoring the expression of XIST and imprinted genes IGF2, H19, and PEG10. Whereas imprinted genes showed biallelic expression in HCT116-5-aza-CdR and DKO cells, the XIST locus was hypomethylated and weakly expressed only under acute hypomethylation conditions, indicating the importance of XIST repression in the active X to cell survival. Given that DNMT3A is the only active DNMT in DKO cells, it may be responsible for ensuring the repression of XIST in those cells. Taken together, our data suggest that XIST repression is more tightly controlled than genomic imprinting and, at least in part, is due to DNMT3A.
Abstract:
Picornaviruses are the most common human viruses, and their identification is nowadays based on molecular techniques, for example, reverse transcriptase polymerase chain reaction (RT-PCR). One aim of this thesis was to improve the identification of picornaviruses, especially rhino- and enteroviruses, with a real-time assay format and to improve the differentiation of the viruses with genus-specific locked nucleic acid (LNA) probes. Another aim was to identify and study the causative agent of the enterovirus epidemics that appeared in Finland during seasons 2008-2010. In this thesis, the first version of picornavirus qRT-PCR with a melting curve analysis was used in a study of rhinovirus transmission within families with a rhinovirus-positive index child, where rhinovirus infection was monitored in all family members. In conclusion, rhinoviruses spread effectively within families, causing mostly symptomatic infections in children and asymptomatic infections in adults. To improve the differentiation between rhino- and enteroviruses, the picornavirus qRT-PCR was modified with LNA-incorporated probes. The LNA probes were validated with picornavirus prototypes and different clinical specimen types. The LNA probe-based picornavirus qRT-PCR was able to differentiate all rhino- and enteroviruses correctly, which makes it suitable for diagnostic use. Moreover, in this thesis enterovirus outbreaks were studied with a well-observed method to create a strain-specific qRT-PCR from the typing region VP1 protein. In a hand-foot-and-mouth-disease (HFMD) outbreak in 2008, the causative agent was identified as CV-A6, and when the molecular evolution of the new HFMD CV-A6 strain was studied, it was found that CV-A6 was the emerging agent for HFMD and onychomadesis. Furthermore, unusual E-30 meningitis epidemics that appeared during seasons 2009 and 2010 were studied with strain-specific qRT-PCR. 
The E-30 affected mostly adolescents and was probably spread in sports teams.
Abstract:
We investigated the ability of a selection of human influenza A viruses, including recent clinical isolates, to induce IFN-beta production in cultured cell lines. In contrast to the well-characterized laboratory strain A/PR/8/34, several, but not all, recent isolates of H3N2 viruses resulted in moderate IFN-beta stimulation. Through the generation of recombinant viruses, we were able to show that this is not due to a loss of the ability of the NS1 genes to suppress IFN-beta induction; indeed, the NS1 genes behaved similarly with respect to their abilities to block dsRNA signaling. Interestingly, replication of A/Sydney/5/97 virus was less susceptible to pre-treatment with IFN-alpha than the other viruses. In contrast to the universal effect on dsRNA signaling, we noted differences in the effect of NS1 proteins on expression of interferon-stimulated genes and also genes induced by a distinct pathway. The majority of NS1 proteins blocked expression from both IFN-dependent and TNF-dependent promoters by an apparent post-transcriptional mechanism. The NS1 gene of A/PR/8/34 did not confer these blocks. We noted striking differences in the cellular localization of different influenza A virus NS1 proteins during infection, which might explain differences in biological activity. (C) 2005 Elsevier Inc. All rights reserved.
Abstract:
A cellular receptor for the haemagglutinating enteroviruses (HEV), and the protein that mediates haemagglutination, is the membrane complement regulatory protein decay accelerating factor (DAF; CD55). Although primate DAF is highly conserved, significant differences exist to enable cell lines derived from primates to be utilized for the characterization of the DAF binding phenotype of human enteroviruses. Thus, several distinct DAF-binding phenotypes of a selection of HEVs (viz. coxsackievirus A21 and echoviruses 6, 7, 11-13, 29) were identified from binding and infection assays using a panel of primate cells derived from human, orang-utan, African Green monkey and baboon tissues. These studies complement our recent determination of the crystal structure of SCR(34) of human DAF [Williams, P., Chaudhry, Y., Goodfellow, I. G., Billington, J., Powell, R., Spiller, O. B., Evans, D. J. & Lea, S. (2003). J Biol Chem 278, 10691-10696] and have enabled us to better map the regions of DAF with which enteroviruses interact and, in certain cases, predict specific virus-receptor contacts.
Abstract:
As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes fewer than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.
Abstract:
BACKGROUND: Humans from an early age look longer at preferred stimuli, and also typically look longer at facial expressions of emotion, particularly happy faces. Atypical gaze patterns towards social stimuli are common in Autism Spectrum Conditions (ASC). However, it is unknown whether gaze fixation patterns have any genetic basis. In this study, we tested whether variations in the cannabinoid receptor 1 (CNR1) gene are associated with gaze duration towards happy faces. This gene was selected because CNR1 is a key component of the endocannabinoid system, involved in processing reward, and in our previous fMRI study we found that variations in CNR1 modulate the striatal response to happy (but not disgust) faces. The striatum is involved in guiding gaze to rewarding aspects of a visual scene. We aimed to validate and extend this result in another sample using a different technique (gaze tracking). METHODS: Thirty volunteers (13 males, 17 females) from the general population observed dynamic emotion expressions on a screen while their eye movements were recorded. They were genotyped for the same four SNPs in the CNR1 gene tested in our earlier fMRI study. RESULTS: Two SNPs (rs806377 and rs806380) were associated with differential gaze duration for happy (but not disgust) faces. Importantly, the allelic groups associated with greater striatal response to happy faces in the fMRI study were associated with longer gaze duration for happy faces. CONCLUSIONS: These results suggest that CNR1 variations modulate striatal function that underlies the perception of signals of social reward such as happy faces. This suggests that CNR1 is a key element in the molecular architecture of perception of certain basic emotions. This may have implications for understanding neurodevelopmental conditions marked by atypical eye contact and facial emotion processing, such as ASC.