947 results for Allele frequency data
Abstract:
The evolution of a quantitative phenotype is often envisioned as a trait substitution sequence where mutant alleles repeatedly replace resident ones. In infinite populations, the invasion fitness of a mutant in this two-allele representation of the evolutionary process is used to characterize features about long-term phenotypic evolution, such as singular points, convergence stability (established from first-order effects of selection), branching points, and evolutionary stability (established from second-order effects of selection). Here, we try to characterize long-term phenotypic evolution in finite populations from this two-allele representation of the evolutionary process. We construct a stochastic model describing evolutionary dynamics at non-rare mutant allele frequency. We then derive stability conditions based on stationary average mutant frequencies in the presence of vanishing mutation rates. We find that the second-order stability condition obtained from second-order effects of selection is identical to convergence stability. Thus, in two-allele systems in finite populations, convergence stability is enough to characterize long-term evolution under the trait substitution sequence assumption. We perform individual-based simulations to confirm our analytic results.
Abstract:
Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches evaluating loss-of-function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and duplicated. To create a ranking of severity that would be applicable to innate immunity genes we evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and 1,000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against Online Mendelian Inheritance in Man (OMIM) disease stop-gains and frameshifts. The classification scheme was applied in the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores. Specifically, the sequence-based score improves measurement of functional gene impairment, discriminates across different variants in a given gene and appears particularly useful for analysis of less conserved genes.
Abstract:
The male-to-female sex ratio at birth is constant across world populations with an average of 1.06 (106 male to 100 female live births) for populations of European descent. The sex ratio is considered to be affected by numerous biological and environmental factors and to have a heritable component. The aim of this study was to investigate the presence of common allele modest effects at autosomal and chromosome X variants that could explain the observed sex ratio at birth. We conducted a large-scale genome-wide association scan (GWAS) meta-analysis across 51 studies, comprising overall 114 863 individuals (61 094 women and 53 769 men) of European ancestry and 2 623 828 common (minor allele frequency >0.05) single-nucleotide polymorphisms (SNPs). Allele frequencies were compared between men and women for directly-typed and imputed variants within each study. Forward-time simulations for unlinked, neutral, autosomal, common loci were performed under the demographic model for European populations with a fixed sex ratio and a random mating scheme to assess the probability of detecting significant allele frequency differences. We do not detect any genome-wide significant (P < 5 × 10⁻⁸) common SNP differences between men and women in this well-powered meta-analysis. The simulated data provided results entirely consistent with these findings. This large-scale investigation across ~115 000 individuals shows no detectable contribution from common genetic variants to the observed skew in the sex ratio. The absence of sex-specific differences is useful in guiding genetic association study design, for example when using mixed controls for sex-biased traits.
Abstract:
The Hardy-Weinberg law, formulated about 100 years ago, states that under certain assumptions, the three genotypes AA, AB and BB at a bi-allelic locus are expected to occur in the proportions p², 2pq, and q² respectively, where p is the allele frequency of A, and q = 1-p. There are many statistical tests being used to check whether empirical marker data obey the Hardy-Weinberg principle. Among these are the classical chi-square test (with or without continuity correction), the likelihood ratio test, Fisher's exact test, and exact tests in combination with Monte Carlo and Markov chain algorithms. Tests for Hardy-Weinberg equilibrium (HWE) are numerical in nature, requiring the computation of a test statistic and a p-value. There is, however, ample space for the use of graphics in HWE tests, in particular for the ternary plot. Nowadays, many genetic studies use genetic markers known as single nucleotide polymorphisms (SNPs). SNP data come in the form of counts, but from the counts one typically computes genotype frequencies and allele frequencies. These frequencies satisfy the unit-sum constraint, and their analysis therefore falls within the realm of compositional data analysis (Aitchison, 1986). SNPs are usually bi-allelic, which implies that the genotype frequencies can be adequately represented in a ternary plot. Compositions that are in exact HWE describe a parabola in the ternary plot. Compositions for which HWE cannot be rejected in a statistical test are typically "close" to the parabola, whereas compositions that differ significantly from HWE are "far". By rewriting the statistics used to test for HWE in terms of heterozygote frequencies, acceptance regions for HWE can be obtained that can be depicted in the ternary plot. This way, compositions can be tested for HWE purely on the basis of their position in the ternary plot (Graffelman & Morales, 2008). This leads to clear graphical representations where large numbers of SNPs can be tested for HWE in a single graph. Several examples of graphical tests for HWE (implemented in R software) will be shown, using SNP data from different human populations.
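As an illustration of the numerical side of HWE testing described above, the following minimal Python sketch computes the classical chi-square statistic (without continuity correction) from genotype counts. It is not the R implementation the abstract refers to; the function name `hwe_chi_square` and the example counts are hypothetical. The p-value uses the identity that, for one degree of freedom, the chi-square tail probability equals erfc(sqrt(x/2)).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a bi-allelic locus.

    Returns (p, chi2, p_value), where p is the estimated frequency of allele A.
    Hypothetical helper written for illustration only.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency of A from counts
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic locus: HWE test is undefined")
    # Expected genotype counts under HWE: n*p^2, 2*n*p*q, n*q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # One degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, chi2, p_value


# Made-up counts: a sample in exact HWE, and one with a heterozygote deficit.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(30, 40, 30))
```

In the ternary-plot view, the first sample lies exactly on the HWE parabola (the heterozygote frequency equals 2pq), while the second lies below it.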
Abstract:
Placental malaria is a special form of malaria that causes up to 200,000 maternal and infant deaths every year. Previous studies show that two receptor molecules, hyaluronic acid and chondroitin sulphate A, are mediating the adhesion of parasite-infected erythrocytes in the placenta of patients, which is believed to be a key step in the pathogenesis of the disease. In this study, we aimed at identifying sites of malaria-induced adaptation by scanning for signatures of natural selection in 24 genes in the complete biosynthesis pathway of these two receptor molecules. We analyzed a total of 24 Mb of publicly available polymorphism data from the International HapMap project for three human populations with European, Asian and African ancestry, with the African population from a region of presently and historically high malaria prevalence. Using the methods based on allele frequency distributions, genetic differentiation between populations, and on long-range haplotype structure, we found only limited evidence for malaria-induced genetic adaptation in this set of genes in the African population; however, we identified one candidate gene with clear evidence of selection in the Asian population. Although historical exposure to malaria in this population cannot be ruled out, we speculate that it might be caused by other pathogens, as there is growing evidence that these molecules are important receptors in a variety of host-pathogen interactions. We propose to use the present methods in a systematic way to help identify candidate regions under positive selection as a consequence of malaria.
Abstract:
OBJECTIVE: Studies of major depression in twins and families have shown moderate to high heritability, but extensive molecular studies have failed to identify susceptibility genes convincingly. To detect genetic variants contributing to major depression, the authors performed a genome-wide association study using 1,636 cases of depression ascertained in the U.K. and 1,594 comparison subjects screened negative for psychiatric disorders. METHOD: Cases were collected from 1) a case-control study of recurrent depression (the Depression Case Control [DeCC] study; N=1346), 2) an affected sibling pair linkage study of recurrent depression (probands from the Depression Network [DeNT] study; N=332), and 3) a pharmacogenetic study (the Genome-Based Therapeutic Drugs for Depression [GENDEP] study; N=88). Depression cases and comparison subjects were genotyped at Centre National de Génotypage on the Illumina Human610-Quad BeadChip. After applying stringent quality control criteria for missing genotypes, departure from Hardy-Weinberg equilibrium, and low minor allele frequency, the authors tested for association to depression using logistic regression, correcting for population ancestry. RESULTS: Single nucleotide polymorphisms (SNPs) in BICC1 achieved suggestive evidence for association, which strengthened after imputation of ungenotyped markers, and in analysis of female depression cases. A meta-analysis of U.K. data with previously published results from studies in Munich and Lausanne showed some evidence for association near neuroligin 1 (NLGN1) on chromosome 3, but did not support findings at BICC1. CONCLUSIONS: This study identifies several signals for association worthy of further investigation but, as in previous genome-wide studies, suggests that individual gene contributions to depression are likely to have only minor effects, and very large pooled analyses will be required to identify them.
Abstract:
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
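The distributional equivalence property mentioned above can be checked numerically. The sketch below is a minimal illustration, assuming that the row and column weights are the table's marginal proportions and that total variance is the weighted sum of squares of the double-centred log matrix; the function name and the example table are hypothetical. Merging two proportional columns should then leave the total weighted log-ratio variance unchanged.

```python
import math


def weighted_logratio_variance(table):
    """Total variance of a positive two-way table under weighted
    log-ratio analysis, with weights taken as the table margins
    (an assumption of this sketch)."""
    n_rows, n_cols = len(table), len(table[0])
    total = sum(sum(row) for row in table)
    r = [sum(row) / total for row in table]  # row masses
    c = [sum(table[i][j] for i in range(n_rows)) / total
         for j in range(n_cols)]             # column masses
    logs = [[math.log(x) for x in row] for row in table]
    # Weighted double-centring of the log matrix
    row_mean = [sum(c[j] * logs[i][j] for j in range(n_cols))
                for i in range(n_rows)]
    col_mean = [sum(r[i] * logs[i][j] for i in range(n_rows))
                for j in range(n_cols)]
    grand = sum(r[i] * row_mean[i] for i in range(n_rows))
    return sum(
        r[i] * c[j] * (logs[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )


# Columns 0 and 1 are proportional (column 1 = 2 * column 0).
split = [[10, 20, 5], [4, 8, 9], [6, 12, 7]]
# The same table with those two columns added together.
merged = [[30, 5], [12, 9], [18, 7]]

print(weighted_logratio_variance(split))
print(weighted_logratio_variance(merged))
```

The two printed values agree up to floating-point error, which is what distributional equivalence requires; an unweighted version (equal masses for all rows and columns) would not in general give the same value for both tables.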
Abstract:
This report describes the results of the research project investigating the use of advanced field data acquisition technologies for Iowa transportation agencies. The objectives of the research project were to (1) research and evaluate current data acquisition technologies for field data collection, manipulation, and reporting; (2) identify the current field data collection approach and the interest level in applying current technologies within Iowa transportation agencies; and (3) summarize findings, prioritize technology needs, and provide recommendations regarding suitable applications for future development. A steering committee consisting of state, city, and county transportation officials provided guidance during this project. Technologies considered in this study included (1) data storage (bar coding, radio frequency identification, touch buttons, magnetic stripes, and video logging); (2) data recognition (voice recognition and optical character recognition); (3) field referencing systems (global positioning systems [GPS] and geographic information systems [GIS]); (4) data transmission (radio frequency data communications and electronic data interchange); and (5) portable computers (pen-based computers). The literature review revealed that many of these technologies could have useful applications in the transportation industry. A survey was developed to explain current data collection methods and identify the interest in using advanced field data collection technologies. Surveys were sent out to county and city engineers and state representatives responsible for certain programs (e.g., maintenance management and construction management). Results showed that almost all field data are collected using manual approaches and are hand-carried to the office where they are either entered into a computer or manually stored. 
A lack of standardization was apparent for the type of software applications used by each agency; even the types of forms used to manually collect data differed by agency. Furthermore, interest in using advanced field data collection technologies depended upon the technology, program (e.g., pavement or sign management), and agency type (e.g., state, city, or county). The state and larger cities and counties seemed to be interested in using several of the technologies, whereas smaller agencies appeared to have very little interest in using advanced techniques to capture data. A more thorough analysis of the survey results is provided in the report. Recommendations are made to enhance the use of advanced field data acquisition technologies in Iowa transportation agencies: (1) Appoint a statewide task group to coordinate the effort to automate field data collection and reporting within the Iowa transportation agencies. Subgroups representing the cities, counties, and state should be formed with oversight provided by the statewide task group. (2) Educate employees so that they become familiar with the various field data acquisition technologies.
Abstract:
BACKGROUND: LDL cholesterol has a causal role in the development of cardiovascular disease. Improved understanding of the biological mechanisms that underlie the metabolism and regulation of LDL cholesterol might help to identify novel therapeutic targets. We therefore did a genome-wide association study of LDL-cholesterol concentrations. METHODS: We used genome-wide association data from up to 11,685 participants with measures of circulating LDL-cholesterol concentrations across five studies, including data for 293,461 autosomal single nucleotide polymorphisms (SNPs) with a minor allele frequency of 5% or more that passed our quality control criteria. We also used data from a second genome-wide array in up to 4337 participants from three of these five studies, with data for 290,140 SNPs. We did replication studies in two independent populations consisting of up to 4979 participants. Statistical approaches, including meta-analysis and linkage disequilibrium plots, were used to refine association signals; we analysed pooled data from all seven populations to determine the effect of each SNP on variations in circulating LDL-cholesterol concentrations. FINDINGS: In our initial scan, we found two SNPs (rs599839 [p=1.7×10⁻¹⁵] and rs4970834 [p=3.0×10⁻¹¹]) that showed genome-wide statistical association with LDL cholesterol at chromosomal locus 1p13.3. The second genome screen found a third statistically associated SNP at the same locus (rs646776 [p=4.3×10⁻⁹]). Meta-analysis of data from all studies showed an association of SNPs rs599839 (combined p=1.2×10⁻³³) and rs646776 (p=4.8×10⁻²⁰) with LDL-cholesterol concentrations. SNPs rs599839 and rs646776 both explained around 1% of the variation in circulating LDL-cholesterol concentrations and were associated with about 15% of an SD change in LDL cholesterol per allele, assuming an SD of 1 mmol/L. INTERPRETATION: We found evidence for a novel locus for LDL cholesterol on chromosome 1p13.3. 
These results potentially provide insight into the biological mechanisms that underlie the regulation of LDL cholesterol and might help in the discovery of novel therapeutic targets for cardiovascular disease.
Abstract:
Different signatures of natural selection persist over varying time scales in our genome, revealing possible episodes of adaptive evolution during human history. Here, we identify genes showing signatures of ancestral positive selection in the human lineage and investigate whether some of those genes have been evolving adaptively in extant human populations. Specifically, we compared more than 11,000 human genes with their orthologs in chimpanzee, mouse, rat and dog and applied a branch-site likelihood method to test for positive selection on the human lineage. Among the significant cases, a robust set of 11 genes was then further explored for signatures of recent positive selection using SNP data. We genotyped 223 SNPs in 39 worldwide populations from the HGDP Diversity panel and supplemented this information with available genotypes for up to 4,814 SNPs distributed along 2 Mb centered on each gene. After exploring the allele frequency spectrum, population differentiation and the maintenance of long unbroken haplotypes, we found signals of recent adaptive phenomena in only one of the 11 candidate gene regions. However, the signal of recent selection in this region may come from a different, neighbouring gene (CD5) rather than from the candidate gene itself (VPS37C). For this set of positively selected genes in the human lineage, we find no indication that these genes maintained their rapid evolutionary pace among human populations. Based on these data, it therefore appears that adaptation for human-specific and for population-specific traits may have involved different genes.
Abstract:
In this paper we introduce a highly efficient reversible data hiding system. It is based on dividing the image into tiles and shifting the histograms of each image tile between its minimum and maximum frequency. Data are then inserted at the pixel level with the largest frequency to maximize data hiding capacity. It exploits the special properties of medical images, where the histograms of their nonoverlapping image tiles mostly peak around some gray values and the rest of the spectrum is mainly empty. The zeros (or minima) and peaks (maxima) of the histograms of the image tiles are then relocated to embed the data. The gray values of some pixels are therefore modified. High capacity, high fidelity, reversibility and multiple data insertions are the key requirements of data hiding in medical images. We show how histograms of image tiles of medical images can be exploited to achieve these requirements. Compared with a data hiding method applied to the whole image, our scheme can yield a 30%-200% capacity improvement while still giving better image quality, depending on the medical image content. Additional advantages of the proposed method include hiding data in the regions of non-interest and better exploitation of spatial masking.
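The peak-and-zero mechanism described above can be sketched for a single tile. The Python example below implements the classic histogram-shifting idea (embed at the peak gray value, shift the values between the peak and an empty bin by one) on a flat list of 8-bit gray values; it is an assumption-laden illustration, not the paper's actual tiling scheme, and the function names `embed` and `extract` are hypothetical. It assumes an empty histogram bin exists above the peak and that the payload length is known to the extractor.

```python
from collections import Counter


def embed(pixels, bits):
    """Embed bits into one tile by histogram shifting (illustrative sketch)."""
    hist = Counter(pixels)
    peak = max(hist, key=hist.get)  # most frequent gray value
    # First empty bin above the peak (assumed to exist in this sketch)
    zero = next((v for v in range(peak + 1, 256) if hist[v] == 0), None)
    if zero is None:
        raise ValueError("no empty bin above the peak")
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds tile capacity")
    payload = iter(bits)
    marked = []
    for v in pixels:
        if peak < v < zero:
            marked.append(v + 1)                 # shift to free bin peak+1
        elif v == peak:
            marked.append(v + next(payload, 0))  # bit 1 -> peak+1, bit 0 -> peak
        else:
            marked.append(v)
    return marked, peak, zero


def extract(marked, peak, zero, n_bits):
    """Recover the payload and restore the original tile exactly."""
    bits, restored = [], []
    for v in marked:
        if v == peak:
            bits.append(0)
            restored.append(peak)
        elif v == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < v <= zero:
            restored.append(v - 1)               # undo the shift
        else:
            restored.append(v)
    return restored, bits[:n_bits]


tile = [5, 5, 5, 6, 7, 5, 9, 3]                  # made-up gray values
marked, peak, zero = embed(tile, [1, 0, 1])
restored, bits = extract(marked, peak, zero, 3)
print(marked, restored, bits)
```

The capacity of a tile equals the height of its histogram peak, which is why tiles with sharply peaked histograms, as the abstract notes for medical images, can carry more data than a single whole-image histogram.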
Abstract:
BACKGROUND AND PURPOSE: Transgenic mice overexpressing Notch2 in the uvea exhibit a hyperplastic ciliary body leading to increased IOP and glaucoma. The aim of this study was to investigate the possible presence of NOTCH2 variants in patients with primary open-angle glaucoma (POAG). METHODS: We screened DNA samples from 130 patients with POAG for NOTCH2 variants by denaturing high-performance liquid chromatography after PCR amplification and validated our data by direct Sanger sequencing. RESULTS: No mutations were observed in the coding regions of NOTCH2 or in the splice sites. 19 known SNPs (single nucleotide polymorphisms) were detected. An SNP located in intron 24, c.[4005+45A>G], was seen in 28.5% of the patients (37/130 patients). As this SNP is reported to have a minor allele frequency of 7% in the 1000 genomes database, it could be associated with POAG. However, we evaluated its frequency in an ethnic-matched control group of 96 subjects unaffected by POAG and observed a frequency of 29%, indicating that it was not related to POAG. CONCLUSION: NOTCH2 seemed to be a good candidate for POAG as it is expressed in the anterior segment in the human eye. However, mutational analysis did not show any causative mutation. This study also shows that proper ethnic-matched control groups are essential in association studies and that values given in databases are sometimes misleading.
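The comparison with the ethnically matched controls can be made concrete with a two-proportion z-test. The sketch below is illustrative only: the patient count (37 of 130) is from the abstract, but the control count of 28 of 96 is an assumption inferred from the reported 29%, and the function name is hypothetical. The abstract does not state which test, if any, the authors used.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Patients: 37/130 (from the abstract).
# Controls: 28/96 is an ASSUMED count consistent with the reported 29%.
z, p_value = two_proportion_z(37, 130, 28, 96)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

Under these assumed counts the two frequencies are statistically indistinguishable, in line with the abstract's conclusion that the SNP is not related to POAG.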
Abstract:
We examined the genetic population structure of the European hake (Merluccius merluccius) using electrophoretically detectable population markers in 35 protein loci. Samples were collected from 7 locations in the Atlantic Ocean and Mediterranean Sea. Six loci were polymorphic using the 0.05 criterion of polymorphism. Sample heterozygosities ranged from 0.052 to 0.072 and averaged 0.0625. In this study, significant allele frequency differences were detected between Atlantic and Mediterranean populations in three polymorphic loci: GAPDH-1*, GPI-2* and SOD-1*. Two major genetic groups were considered: a North-Atlantic stock and the Mediterranean stock. The Nei genetic distance, D, (based on 33 loci) between samples from these two groups ranged from 0.002 to 0.006. Genetic differentiation between these areas appears to reflect the barrier effect of the Strait of Gibraltar. On average over loci, 96.92 % of the total gene diversity was contained within samples, 0.23 % expressed differences among locations within areas, and 2.64 % differences between regions. A review of morphological variation together with the genetic data presented here suggests that the populations of hake from these areas are subdivided into two different stocks: the North-Atlantic stock and the Mediterranean stock. The most conservative approach to the management of these stocks is to consider the Atlantic and Mediterranean stocks independently from one another.
Abstract:
HLA class II genes are strongly associated with susceptibility and resistance to insulin-dependent diabetes mellitus (IDDM). The present study reports the HLA-DRB1 genotyping of 41 IDDM patients and 99 healthy subjects from the Southeast of Brazil (Campinas region). Both groups consisted of an ethnic mixture of Caucasian, African Negro and Amerindian origin. HLA-DRB1*03 and *04 alleles were found at significantly higher frequencies among IDDM patients compared to the controls (DRB1*03: 48.8% vs 18.2%, P<0.005, RR = 4.27; DRB1*04: 43.9% vs 15.1%, P<0.008, RR = 4.37) and were associated with a susceptibility to the disease. DRB1*03/*04 heterozygosity conferred a strong IDDM risk (RR = 5.44). In contrast, the HLA-DRB1*11 allele frequency was lower among IDDM patients (7.3% vs 26.3% in controls), but the difference was not significant. These data agree with those described for other populations and allow genetic characterization of IDDM in Brazil.
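The reported RR values can be approximately reproduced as cross-product (odds) ratios, a common way of expressing relative risk in HLA association studies. The sketch below is a worked example under stated assumptions: the carrier counts (20 and 18 of 41 patients; 18 and 15 of 99 controls) are inferred from the reported percentages, not given in the abstract, and it is an assumption that the abstract's RR was computed this way. The function name is hypothetical.

```python
def odds_ratio(case_pos, case_total, control_pos, control_total):
    """Cross-product ratio from a 2x2 table of allele carriers vs non-carriers."""
    case_neg = case_total - case_pos
    control_neg = control_total - control_pos
    return (case_pos * control_neg) / (case_neg * control_pos)


# Counts below are ASSUMED, back-calculated from the reported percentages:
# DRB1*03: 20/41 = 48.8% of patients, 18/99 = 18.2% of controls
# DRB1*04: 18/41 = 43.9% of patients, 15/99 = 15.2% of controls
or_drb1_03 = odds_ratio(20, 41, 18, 99)
or_drb1_04 = odds_ratio(18, 41, 15, 99)
print(f"DRB1*03: {or_drb1_03:.2f}")
print(f"DRB1*04: {or_drb1_04:.2f}")
```

The resulting values (about 4.29 and 4.38) are close to, though not identical with, the reported 4.27 and 4.37; the small gap could reflect rounding or a correction term, which the abstract does not specify.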
Abstract:
A 3-bp insertion/deletion polymorphism in intron 6 of GSTM3 (rs1799735, GSTM3*A/*B) affects the activity of the phase 2 xenobiotic metabolizing enzyme GSTM3 and has been associated with increased cancer risk. The GSTM3*B allele is rare or absent in Southeast Asians, occurs in 5-20% of Europeans but was detected in 80% of Bantu from South Africa. The wide genetic diversity among Africans led us to investigate whether the high frequency of GSTM3*B prevailed in other sub-Saharan African populations. In 168 healthy individuals from Angola, Mozambique and the São Tomé e Príncipe islands, the GSTM3*B allele was three times more frequent (0.74-0.78) than the GSTM3*A allele (0.22-0.26), with no significant differences in allele frequency across the three groups. We combined these data with previously published results to carry out a multidimensional scaling analysis, which provided a visualization of the worldwide population affinities based on the GSTM3 *A/*B polymorphism.