957 results for human genome variation
Abstract:
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
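The "index the query set" strategy described in this abstract can be illustrated with a minimal sketch. This is not the fetchGWI or tagger implementation; the function names `index_queries` and `scan_database` are hypothetical, and the sketch assumes exact matches only and probes that all share one length.

```python
from collections import defaultdict


def index_queries(probes):
    """Build a hash index from probe sequence to the ids of probes with that sequence.

    `probes` is a dict of probe id -> sequence (hypothetical input layout).
    """
    index = defaultdict(list)
    for probe_id, seq in probes.items():
        index[seq.upper()].append(probe_id)
    return index


def scan_database(genome, index, word_length):
    """Slide a window of `word_length` over the genome and report exact hits.

    Returns a list of (probe id, 0-based start position) tuples.
    """
    genome = genome.upper()
    hits = []
    for pos in range(len(genome) - word_length + 1):
        word = genome[pos:pos + word_length]
        # .get avoids inserting empty entries into the defaultdict
        for probe_id in index.get(word, ()):
            hits.append((probe_id, pos))
    return hits
```

With this layout the database is read once from start to end, so the cost of a search grows with genome size rather than with the number of probes. Indexing the database instead would swap the roles: the genome words become the dictionary keys and each probe is a single lookup.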
Abstract:
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores, however, tended to lack specificity, suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses.
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
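As an illustration of what an allelic score is, the sketch below computes a weighted sum of risk-allele counts for one individual. The abstract does not give the scoring formula, so this common weighted-sum form is an assumption, and the SNP names and weights are invented for the example.

```python
def allelic_score(genotypes, weights):
    """Weighted allelic score for one individual.

    genotypes: dict of SNP id -> count of the scored allele (0, 1 or 2)
    weights:   dict of SNP id -> per-allele effect size on the intermediate
    SNPs missing from `genotypes` are skipped (an assumption of this sketch).
    """
    return sum(weights[snp] * genotypes[snp] for snp in weights if snp in genotypes)


# Hypothetical SNP ids and effect sizes, for illustration only.
example_weights = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}
example_genotypes = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_score = allelic_score(example_genotypes, example_weights)
```

A score built from a handful of known variants and one built from hundreds of thousands of markers differ only in the size of the `weights` dictionary, which is why the genome-wide version can explain more variance while being less specific.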
Genetic variation in GIPR influences the glucose and insulin responses to an oral glucose challenge.
Abstract:
Glucose levels 2 h after an oral glucose challenge are a clinical measure of glucose tolerance used in the diagnosis of type 2 diabetes. We report a meta-analysis of nine genome-wide association studies (n = 15,234 nondiabetic individuals) and a follow-up of 29 independent loci (n = 6,958-30,620). We identify variants at the GIPR locus associated with 2-h glucose level (rs10423928, beta (s.e.m.) = 0.09 (0.01) mmol/l per A allele, P = 2.0 × 10^-15). The GIPR A-allele carriers also showed decreased insulin secretion (n = 22,492; insulinogenic index, P = 1.0 × 10^-17; ratio of insulin to glucose area under the curve, P = 1.3 × 10^-16) and diminished incretin effect (n = 804; P = 4.3 × 10^-4). We also identified variants at ADCY5 (rs2877716, P = 4.2 × 10^-16), VPS13C (rs17271305, P = 4.1 × 10^-8), GCKR (rs1260326, P = 7.1 × 10^-11) and TCF7L2 (rs7903146, P = 4.2 × 10^-10) associated with 2-h glucose. Of the three newly implicated loci (GIPR, ADCY5 and VPS13C), only ADCY5 was found to be associated with type 2 diabetes in collaborating studies (n = 35,869 cases, 89,798 controls, OR = 1.12, 95% CI 1.09-1.15, P = 4.8 × 10^-18).
Abstract:
There is increasing evidence that the microcirculation plays an important role in the pathogenesis of cardiovascular diseases. Changes in retinal vascular caliber reflect early microvascular disease and predict incident cardiovascular events. We performed a genome-wide association study to identify genetic variants associated with retinal vascular caliber. We analyzed data from four population-based discovery cohorts with 15,358 unrelated Caucasian individuals, who are members of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium, and replicated findings in four independent Caucasian cohorts (n = 6,652). All participants underwent retinal photography, and retinal arteriolar and venular caliber were measured with computer software. In the discovery cohorts, 179 single nucleotide polymorphisms (SNPs) spread across five loci were significantly associated (p < 5.0 × 10^-8) with retinal venular caliber, but none showed association with arteriolar caliber. Collectively, these five loci explain 1.0%-3.2% of the variation in retinal venular caliber. Four out of these five loci were confirmed in independent replication samples. In the combined analyses, the top SNPs at each locus were: rs2287921 (19q13; p = 1.61 × 10^-25, within the RASIP1 locus), rs225717 (6q24; p = 1.25 × 10^-16, adjacent to the VTA1 and NMBR loci), rs10774625 (12q24; p = 2.15 × 10^-13, in the region of the ATXN2, SH2B3 and PTPN11 loci), and rs17421627 (5q14; p = 7.32 × 10^-16, adjacent to the MEF2C locus). In two independent samples, locus 12q24 was also associated with coronary heart disease and hypertension. Our population-based genome-wide association study demonstrates four novel loci associated with retinal venular caliber, an endophenotype of the microcirculation associated with clinical cardiovascular disease. These data provide further insights into the contribution and biological mechanisms of microcirculatory changes that underlie cardiovascular disease.
Abstract:
We have shown that indels in gp120 V4 are associated with the presence of duplicated and palindromic sequences, suggesting that they may be produced by a strand-slippage misalignment mechanism. Indels in V4 involved region-specific duplications 9 to 15 bp long, and repeats of various lengths, associated with AAT trinucleotides. No duplications were found in V3 and C3. The frequency of palindromic sequences in individual genes was found to be significantly higher in gp120 (p ≤ 3.00E-7), and significantly lower in Tat (p ≤ 9.00E-7), than the average frequency calculated over the full genome. The finding of elements of misalignment in association with indels in V4 suggests that these mutations may occur in proviral DNA after integration of HIV into the host genome. It also implies that the occurrence of large indels in gp120 is not random but is directed by the presence and distribution of elements of misalignment in the HIV genome.
Abstract:
Recent genome-wide association (GWA) studies described 95 loci controlling serum lipid levels. These common variants explain ∼25% of the heritability of the phenotypes. To date, no unbiased screen for gene-environment interactions for circulating lipids has been reported. We screened for variants that modify the relationship between known epidemiological risk factors and circulating lipid levels in a meta-analysis of GWA data from 18 population-based cohorts with European ancestry (maximum N = 32,225). We collected 8 further cohorts (N = 17,102) for replication, and rs6448771 on 4p15 demonstrated genome-wide significant interaction with waist-to-hip ratio (WHR) on total cholesterol (TC) with a combined P-value of 4.79 × 10^-9. There were two potential candidate genes in the region, PCDH7 and CCKAR, with differential expression levels for rs6448771 genotypes in adipose tissue. The effect of WHR on TC was strongest for individuals carrying two copies of the G allele, for whom a one standard deviation (sd) difference in WHR corresponds to a 0.19 sd difference in TC concentration, while for A-allele homozygotes the difference was 0.12 sd. Our findings may open up possibilities for targeted intervention strategies for people characterized by specific genomic profiles. However, more refined measures of both body-fat distribution and metabolic measures are needed to understand how their joint dynamics are modified by the newly found locus.
Abstract:
The ability of a population to adapt to changing environments depends critically on the amount and kind of genetic variability it possesses. Mutations are an important source of new genetic variability and may lead to new adaptations, especially if the population size is large. Mutation rates are extremely variable between and within species, and males usually have higher mutation rates as a result of elevated rates of male germ cell division. This male bias affects the overall mutation rate. We examined the factors that influence male mutation bias, and focused on the effects of classical life-history parameters, such as the average age at reproduction and elevated rates of sperm production in response to sexual selection and sperm competition. We argue that human-induced changes in age at reproduction or in sexual selection will affect male mutation biases and hence overall mutation rates. Depending on the effective population size, these changes are likely to influence the long-term persistence of a population.
Abstract:
A gas chromatography-mass spectrometry method is presented which allows the simultaneous determination of the plasma concentrations of the selective serotonin reuptake inhibitors citalopram, paroxetine, sertraline, and their pharmacologically active N-demethylated metabolites (desmethylcitalopram, didesmethylcitalopram, and desmethylsertraline) after derivatization with the reagent N-methyl-bis(trifluoroacetamide). No interferences from endogenous compounds are observed following the extraction of plasma samples from six different human subjects. The standard curves are linear over a working range of 10-500 ng/mL for citalopram, 10-300 ng/mL for desmethylcitalopram, 5-60 ng/mL for didesmethylcitalopram, 20-400 ng/mL for sertraline and desmethylsertraline, and 10-200 ng/mL for paroxetine. Recoveries measured at three concentrations range from 81 to 118% for the tertiary amines (citalopram and the internal standard methylmaprotiline), 73 to 95% for the secondary amines (desmethylcitalopram, paroxetine and sertraline), and 39 to 66% for the primary amines (didesmethylcitalopram and desmethylsertraline). Intra- and interday coefficients of variation determined at three concentrations range from 3 to 11% for citalopram and its metabolites, 4 to 15% for paroxetine, and 5 to 13% for sertraline and desmethylsertraline. The limits of quantitation of the method are 2 ng/mL for citalopram and paroxetine, 1 ng/mL for sertraline, and 0.5 ng/mL for desmethylcitalopram, didesmethylcitalopram, and desmethylsertraline. No interferences are noted from 20 other psychotropic drugs. This sensitive and specific method can be used for single-dose pharmacokinetics. It is also useful for therapeutic drug monitoring of these three drugs and could possibly be adapted for the quantitation of the two other selective serotonin reuptake inhibitors on the market, namely fluoxetine and fluvoxamine.
Abstract:
Down syndrome (DS) is characterized by extensive phenotypic variability, with most traits occurring in only a fraction of affected individuals. Substantial gene-expression variation is present among unaffected individuals, and this variation has a strong genetic component. Since DS is caused by genomic-dosage imbalance, we hypothesize that gene-expression variation of human chromosome 21 (HSA21) genes in individuals with DS has an impact on the phenotypic variability among affected individuals. We studied gene-expression variation in 14 lymphoblastoid and 17 fibroblast cell lines from individuals with DS and an equal number of controls. Gene expression was assayed using quantitative real-time polymerase chain reaction on 100 and 106 HSA21 genes and 23 and 26 non-HSA21 genes in lymphoblastoid and fibroblast cell lines, respectively. Surprisingly, only 39% and 62% of HSA21 genes in lymphoblastoid and fibroblast cells, respectively, showed a statistically significant difference between DS and normal samples, although the average up-regulation of HSA21 genes was close to the expected 1.5-fold in both cell types. Gene-expression variation in DS and normal samples was evaluated using the Kolmogorov-Smirnov test. According to the degree of overlap in expression levels, we classified all genes into 3 groups: (A) nonoverlapping, (B) partially overlapping, and (C) extensively overlapping expression distributions between normal and DS samples. We hypothesize that, in each cell type, group A genes are the most dosage sensitive and are most likely involved in the constant DS traits, group B genes might be involved in variable DS traits, and group C genes are not dosage sensitive and are least likely to participate in DS pathological phenotypes. This study provides the first extensive data set on HSA21 gene-expression variation in DS and underscores its role in modulating the outcome of gene-dosage imbalance.
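The three-way grouping by overlap of expression distributions can be sketched with a two-sample Kolmogorov-Smirnov statistic. The abstract names the test but not the cut-offs, so the rule below is an assumption: D = 1 (no overlap between the two samples) maps to group A, and the 0.5 boundary between groups B and C is a hypothetical threshold chosen only for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def classify_gene(control_expr, ds_expr, partial_threshold=0.5):
    """Assign a gene to group A, B or C from its expression values.

    partial_threshold is a hypothetical cut-off, not taken from the study.
    """
    d = ks_statistic(control_expr, ds_expr)
    if d == 1.0:
        return "A"  # nonoverlapping distributions
    if d >= partial_threshold:
        return "B"  # partially overlapping
    return "C"      # extensively overlapping
```

For example, a gene whose DS expression values all sit above every control value falls in group A, whereas a gene with identical expression values in both sample sets has D = 0 and falls in group C.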
Abstract:
Background: To determine whether misalignment structures such as duplications, repeats, and palindromes are associated with insertions/deletions (indels) in gp120, indicating that indels are indeed frameshift mutations generated by a DNA misalignment mechanism. Methods: Cloning and sequencing of a fragment of HIV-1 gp120 spanning C2-C4 derived from plasma RNA in 12 patients with early chronic disease and naïve to antiretroviral therapy. Results: Indels in V4 always involved insertion and deletion of duplicated nucleotide segments and AAT repeats, and were associated with the presence of palindromic sequences. No duplications were detected in V3 and C3. Palindromic sequences occurred with similar frequencies in V3, C3 and V4; the frequency of palindromes in individual genes was found to be significantly higher in structural (gp120, p ≤ 3.00E-7) and significantly lower in regulatory (Tat, p ≤ 9.00E-7) genes, as compared to the average frequency calculated over the full genome. Discussion: Indels in V4 are associated with misalignment structures (i.e. duplications, repeats and palindromes), indicating DNA misalignment as the mechanism underlying length variation in V4. The finding that indels in V4 are caused by DNA misalignment has some very important implications: 1) indels in V4 are likely to occur in proviral DNA (and not in RNA), after integration of HIV into the host genome; 2) they are likely to occur as progressive modifications of the early founder virus during chronic infection, as more and more cells get infected; 3) frameshift mutations involving any number of base pairs are likely to occur evenly across gp120; however, only those mutants carrying a functional gp120 (indels as multiples of three base pairs) will be able to perpetuate the virus cycle and to keep spreading through the population.
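Two of the ideas in this abstract lend themselves to a short sketch: an indel keeps the reading frame only when its length is a multiple of three, and a DNA palindrome can be detected by comparing a sequence with its reverse complement. The abstract does not define "palindromic", so reading it as a reverse-complement palindrome is an assumption, and both function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def preserves_reading_frame(indel_length):
    """True when an indel of this length (in base pairs) is a multiple of
    three and so leaves the downstream reading frame intact."""
    return indel_length % 3 == 0


def is_reverse_palindrome(seq):
    """True when the sequence equals its own reverse complement.

    Assumes an upper- or lower-case string containing only A, C, G and T.
    """
    seq = seq.upper()
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(seq))
    return seq == reverse_complement
```

The 9 to 15 bp duplications reported for V4 pass the frame check only at 9, 12 and 15 bp, which matches the point that frame-preserving mutants are the ones able to propagate.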
Abstract:
T helper cell (Th) functions are crucial for proper immune defence against various intra- and extracellular pathogens. According to the specific immune responses, Th cells can be classified into subtypes, Th1 and Th2 cells being the most frequently characterized classes. Th1 and Th2 cells interact with other immune cells by regulating their functions with specific cytokine production. IFN-γ, IL-2 and TNF- are the cytokines predominantly produced by Th1 cells whereas Th2 cells produce Th2-type cytokines, such as IL-4, IL-5 and IL-13. Upon TCR activation and in the presence of polarizing cytokines, Th cells differentiate into effector subtypes from a common precursor cell. IFN-γ and IL-12 are the predominant Th1 polarizing cytokines whereas IL-4 directs Th2 polarization. The cytokines mediate their effects through specific receptor signalling. The differentiation process is complex, involving various signalling molecules and routes, as well as functions of the specific transcription factors. The functions of the Th1/Th2 cells are tightly regulated; however, knowledge of human Th cell differentiation is, as yet, fairly poor. The susceptibility to many immune-mediated disorders often originates from disturbed Th cell responses. Thus, research is needed to define the molecular mechanisms involved in the differentiation and balanced functions of the Th cells. Importantly, the new information obtained will be crucial for a better understanding of the pathogenesis of immune-mediated disorders, such as asthma or autoimmune diseases. In the first subproject of this thesis, the role of genetic polymorphisms in the human STAT6, GATA3 and STAT4 genes was investigated for asthma or atopy susceptibility in Finnish asthma families by association analysis. These genes code for key transcription factors regulating Th cell differentiation. The study resulted in the identification of a GATA3 haplotype that associated with asthma and related traits (high serum IgE level).
In the second subproject, an optimized method for human primary T cell transfection and enrichment was established. The method can be utilized for functional studies of selected genes of interest. The method was also utilized in the third subproject, which aimed at the identification of novel genes involved in early human Th cell polarization (0-48 h) using genome-wide oligonucleotide arrays. As a result, numerous genes and ESTs with known or unknown functions were identified in the study. Using an shRNA knockdown approach, a panel of novel IL-4/STAT6-regulated genes was identified in the functional studies of the genes. Moreover, one of the genes, NDFIP2, with a previously uncharacterized role in human Th cell differentiation, was observed to promote IFN-γ production by differentiated Th1 cells. Taken together, the results obtained have revealed potential new relevant candidate genes serving as a basis for further studies characterizing the detailed networks involved in human Th cell differentiation as well as in the genetic susceptibility to Th-mediated immune disorders.
Abstract:
Sugar beet (Beta vulgaris ssp. vulgaris) is an important crop of temperate climates which provides nearly 30% of the world's annual sugar production and is a source for bioethanol and animal feed. The species belongs to the order of Caryophyllales, is diploid with 2n = 18 chromosomes, has an estimated genome size of 714-758 megabases and shares an ancient genome triplication with other eudicot plants. Leafy beets have been cultivated since Roman times, but sugar beet is one of the most recently domesticated crops. It arose in the late eighteenth century when lines accumulating sugar in the storage root were selected from crosses made with chard and fodder beet. Here we present a reference genome sequence for sugar beet as the first non-rosid, non-asterid eudicot genome, advancing comparative genomics and phylogenetic reconstructions. The genome sequence comprises 567 megabases, of which 85% could be assigned to chromosomes. The assembly covers a large proportion of the repetitive sequence content that was estimated to be 63%. We predicted 27,421 protein-coding genes supported by transcript data and annotated them on the basis of sequence homology. Phylogenetic analyses provided evidence for the separation of Caryophyllales before the split of asterids and rosids, and revealed lineage-specific gene family expansions and losses. We sequenced spinach (Spinacia oleracea), another Caryophyllales species, and validated features that separate this clade from rosids and asterids. Intraspecific genomic variation was analysed based on the genome sequences of sea beet (Beta vulgaris ssp. maritima; progenitor of all beet crops) and four additional sugar beet accessions. We identified seven million variant positions in the reference genome, and also large regions of low variability, indicating artificial selection.
The sugar beet genome sequence enables the identification of genes affecting agronomically relevant traits, supports molecular breeding and maximizes the plant's potential in energy biotechnology.
Abstract:
A preliminary understanding of the phenotypic effect of DNA segment copy number variation (CNV) is emerging. These rearrangements were demonstrated to influence, in a somewhat dose-dependent manner, the expression of genes that map within them. They were also shown to modify the expression of genes located on their flanks and sometimes those at a great distance from their boundary. Here we demonstrate, by monitoring these effects at multiple life stages, that these controls over expression are effective throughout mouse development. Similarly, we observe that the more specific spatial expression patterns of CNV genes are maintained through life. However, we find that some brain-expressed genes mapping within CNVs appear to be under compensatory loops only at specific time points, indicating that the effect of CNVs on these genes is modulated during development. Notably, we also observe that CNV genes are significantly enriched within transcripts that show variable time courses of expression between strains. Thus, modifying the copy number of a gene may potentially alter not only its expression level, but also the timing of its expression.
Abstract:
Chromosomes whose Ag staining varies from one metaphase to another can be distinguished from those whose Ag staining is the same in all metaphases. The intercellular variation of an Ag-NOR can be attributed to many different factors. Whatever the importance of technical factors, they do not seem to account for the large variations in Ag staining observed for each acrocentric chromosome (ac). This suggests the existence of a natural intercellular variability in NOR activity. The variation in the Ag-stainability of a given NOR, the diversity of Ag staining observed on the ten ac of one individual, and the differences between individuals raise the question of whether activity is compensated between nucleolar organizers. The study, for each individual, of the mean sum of staining per metaphase reveals that this value is not absolutely constant from one individual to another; in carriers of Robertsonian fusions it is smaller than in chromosomally normal individuals. The analysis of transmission shows that inactive NORs remain inactive and that active NORs vary in activity from one generation to the next.
Abstract:
Arbuscular mycorrhizal fungi (AMF) form root symbioses with about 80% of all known species of vascular land plants. AMF are ecologically important because of the benefits that they confer on plants. Molecular studies on AMF showed that rDNA sequences were highly variable between species and within species. Because AMF are coenocytic and multinucleate, there are several ways in which this intraspecific genetic variation could be organized. Therefore, the organization and evolution of this variation in AMF were investigated in the present work. Based on fossil records, the AMF symbiosis has existed for at least 450 million years and is therefore considered ancient. The first molecular data indicated no evident sexual reproduction and gave rise to the hypothesis that AMF might be ancient asexuals. The first part of this thesis (Chapter 2) shows evidence for recombination in different AMF but also indicates that it has not been frequent enough to purge accumulated mutations. Given asexual reproduction, it has been predicted that the many nuclei in AMF should diverge, leading to genetically different nuclei.
This hypothesis was confirmed by an experiment of M. Hijri, which is also included in Chapter 2 because the results were published together. In Chapter 3 I then investigated whether intraspecific genetic variation also exists in a field population of the AMF Glomus intraradices. By comparing genetic fingerprints of individuals derived from single spores, I could clearly show that large genetic differences exist. A similar result, based on quantitative genetic traits, was found for the same population by A. Koch. The two studies taken together show that the genetic variation observed in the population is high enough to be of ecological relevance. Lastly, in Chapter 4, I investigated within-individual genetic variation among BiP gene sequences. It is the first study to have analyzed genetic diversity in the AMF genome in a region of DNA other than rDNA. I found 31 sequence variants of the BiP gene in one G. intraradices isolate that originated from one spore. Genetic variation was not restricted to selectively neutral parts of BiP. A high number of predicted non-functional variants, compared with a likely low number of copies per nucleus, indicated that functional genetic information might even be partitioned among nuclei. The results of this work contribute to our understanding of potential evolutionary strategies of ancient asexuals; they also suggest that genetic differences in a population might be ecologically relevant, and they show that this variation even occurs in functional regions of the AMF genome.