29 resultados para Genetic population data
Resumo:
Stephens and Donnelly have introduced a simple yet powerful importance sampling scheme for computing the likelihood in population genetic models. Fundamental to the method is an approximation to the conditional probability of the allelic type of an additional gene, given those currently in the sample. As noted by Li and Stephens, the product of these conditional probabilities for a sequence of draws that gives the frequency of allelic types in a sample is an approximation to the likelihood, and can be used directly in inference. The aim of this note is to demonstrate the high level of accuracy of "product of approximate conditionals" (PAC) likelihood when used with microsatellite data. Results obtained on simulated microsatellite data show that this strategy leads to a negligible bias over a wide range of the scaled mutation parameter theta. Furthermore, the sampling variance of likelihood estimates as well as the computation time are lower than that obtained with importance sampling on the whole range of theta. It follows that this approach represents an efficient substitute to IS algorithms in computer intensive (e.g. MCMC) inference methods in population genetics. (c) 2006 Elsevier Inc. All rights reserved.
Resumo:
An example of the evolution of the interacting behaviours of parents and progeny is studied using iterative equations linking the frequencies of the gametes produced by the progeny to the frequencies of the gametes in the parental generation. This population genetics approach shows that a model in which both behaviours are determined by a single locus can lead to a stable equilibrium in which the two behaviours continue to segregate. A model in which the behaviours are determined by genes at two separate loci leads eventually to fixation of the alleles at both loci but this can take many generations of selection. Models of the type described in this paper will be needed to understand the evolution of complex behaviour when genomic or experimental information is available about the genetic determinants of behaviour and the selective values of different genomes. (c) 2007 Elsevier Inc. All rights reserved.
Resumo:
Long distance dispersal (LDD) plays an important role in many population processes like colonization, range expansion, and epidemics. LDD of small particles like fungal spores is often a result of turbulent wind dispersal and is best described by functions with power-law behavior in the tails ("fat tailed"). The influence of fat-tailed LDD on population genetic structure is reported in this article. In computer simulations, the population structure generated by power-law dispersal with exponents in the range of -2 to -1, in distinct contrast to that generated by exponential dispersal, has a fractal structure. As the power-law exponent becomes smaller, the distribution of individual genotypes becomes more self-similar at different scales. Common statistics like G(ST) are not well suited to summarizing differences between the population genetic structures. Instead, fractal and self-similarity statistics demonstrated differences in structure arising from fat-tailed and exponential dispersal. When dispersal is fat tailed, a log-log plot of the Simpson index against distance between subpopulations has an approximately constant gradient over a large range of spatial scales. The fractal dimension D-2 is linearly inversely related to the power-law exponent, with a slope of similar to -2. In a large simulation arena, fat-tailed LDD allows colonization of the entire space by all genotypes whereas exponentially bounded dispersal eventually confines all descendants of a single clonal lineage to a relatively small area.
Resumo:
The identification of signatures of natural selection in genomic surveys has become an area of intense research, stimulated by the increasing ease with which genetic markers can be typed. Loci identified as subject to selection may be functionally important, and hence (weak) candidates for involvement in disease causation. They can also be useful in determining the adaptive differentiation of populations, and exploring hypotheses about speciation. Adaptive differentiation has traditionally been identified from differences in allele frequencies among different populations, summarised by an estimate of F-ST. Low outliers relative to an appropriate neutral population-genetics model indicate loci subject to balancing selection, whereas high outliers suggest adaptive (directional) selection. However, the problem of identifying statistically significant departures from neutrality is complicated by confounding effects on the distribution of F-ST estimates, and current methods have not yet been tested in large-scale simulation experiments. Here, we simulate data from a structured population at many unlinked, diallelic loci that are predominantly neutral but with some loci subject to adaptive or balancing selection. We develop a hierarchical-Bayesian method, implemented via Markov chain Monte Carlo (MCMC), and assess its performance in distinguishing the loci simulated under selection from the neutral loci. We also compare this performance with that of a frequentist method, based on moment-based estimates of F-ST. We find that both methods can identify loci subject to adaptive selection when the selection coefficient is at least five times the migration rate. Neither method could reliably distinguish loci under balancing selection in our simulations, even when the selection coefficient is twenty times the migration rate.
Resumo:
Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm. to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker. demographic, and historical features. The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
Resumo:
We describe and evaluate a new estimator of the effective population size (N-e), a critical parameter in evolutionary and conservation biology. This new "SummStat" N-e. estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer N-e. Simulations of a Wright-Fisher population with known N-e show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and N-e values. We also address the paucity of information about the relative performance of N-e estimators by comparing the SUMMStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated rising initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and Ne less than or equal to 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than the likelihood-based estimators and more likely to include true N-e. The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any, potentially informative summary statistic from Population genetic data.
Resumo:
Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to Mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Theta to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Theta, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.
Resumo:
Reliable and sufficiently discriminative methods are needed for differentiating individual strains of Salmonella enterica serotype Enteritidis beyond the phenotypic level; however, a consensus has not been reached as to which molecular method is best suited for this purpose. In addition, data are lacking on the molecular fingerprinting of serotype Enteritidis from poultry environments in the United Kingdom. This study evaluated the combined use of classical methods (phage typing) with three well-established molecular methods (ribotyping, macrorestriction analysis of genomic DNA, and plasmid profiling) in the assessment of diversity within 104 isolates of serotype Enteritidis from eight unaffiliated poultry farms in England. The most sensitive technique for identifying polymorphism was PstI-SphII ribotyping, distinguishing a total of 22 patterns, 10 of which were found among phage type 4 isolates. Pulsed-field gel electrophoresis of XhaI-digested genomic DNA segregated the isolates into only six types with minor differences between them. In addition, 14 plasmid profiles were found among this population. When all of the typing methods were combined, 54 types of strains were differentiated, and most of the poultry farms presented a variety of strains, which suggests that serotype Enteritidis organisms representing different genomic groups are circulating in England. In conclusion, geographical and animal origins of Salmonella serotype Enteritidis isolates may have a considerable influence on selecting the best typing strategy for individual programs, and a single method cannot be relied on for discriminating between strains.
Resumo:
Background: Exposure to solar ultraviolet-B (UV-B) radiation is a major source of vitamin D3. Chemistry climate models project decreases in ground-level solar erythemal UV over the current century. It is unclear what impact this will have on vitamin D status at the population level. The purpose of this study was to measure the association between ground-level solar UV-B and serum concentrations of 25-hydroxyvitamin D (25(OH)D) using a secondary analysis of the 2007 to 2009 Canadian Health Measures Survey (CHMS). Methods: Blood samples collected from individuals aged 12 to 79 years sampled across Canada were analyzed for 25(OH)D (n=4,398). Solar UV-B irradiance was calculated for the 15 CHMS collection sites using the Tropospheric Ultraviolet and Visible Radiation Model. Multivariable linear regression was used to evaluate the association between 25(OH)D and solar UV-B adjusted for other predictors and to explore effect modification. Results: Cumulative solar UV-B irradiance averaged over 91 days (91-day UV-B) prior to blood draw correlated significantly with 25(OH)D. Independent of other predictors, a 1 kJ/m 2 increase in 91-day UV-B was associated with a significant 0.5 nmol/L (95% CI 0.3-0.8) increase in mean 25(OH)D (P =0.0001). The relationship was stronger among younger individuals and those spending more time outdoors. Based on current projections of decreases in ground-level solar UV-B, we predict less than a 1 nmol/L decrease in mean 25(OH)D for the population. Conclusions: In Canada, cumulative exposure to ambient solar UV-B has a small but significant association with 25(OH)D concentrations. Public health messages to improve vitamin D status should target safe sun exposure with sunscreen use, and also enhanced dietary and supplemental intake and maintenance of a healthy body weight.
Resumo:
Brain activity can be measured with several non-invasive neuroimaging modalities, but each modality has inherent limitations with respect to resolution, contrast and interpretability. It is hoped that multimodal integration will address these limitations by using the complementary features of already available data. However, purely statistical integration can prove problematic owing to the disparate signal sources. As an alternative, we propose here an advanced neural population model implemented on an anatomically sound cortical mesh with freely adjustable connectivity, which features proper signal expression through a realistic head model for the electroencephalogram (EEG), as well as a haemodynamic model for functional magnetic resonance imaging based on blood oxygen level dependent contrast (fMRI BOLD). It hence allows simultaneous and realistic predictions of EEG and fMRI BOLD from the same underlying model of neural activity. As proof of principle, we investigate here the influence on simulated brain activity of strengthening visual connectivity. In the future we plan to fit multimodal data with this neural population model. This promises novel, model-based insights into the brain's activity in sleep, rest and task conditions.
Resumo:
MAGIC populations represent one of a new generation of crop genetic mapping resources combining high genetic recombination and diversity. We describe the creation and validation of an eight-parent MAGIC population consisting of 1091 F7 lines of winter-sown wheat (Triticum aestivum L.). Analyses based on genotypes from a 90,000-single nucleotide polymorphism (SNP) array find the population to be well-suited as a platform for fine-mapping quantitative trait loci (QTL) and gene isolation. Patterns of linkage disequilibrium (LD) show the population to be highly recombined; genetic marker diversity among the founders was 74% of that captured in a larger set of 64 wheat varieties, and 54% of SNPs segregating among the 64 lines also segregated among the eight founder lines. In contrast, a commonly used reference bi-parental population had only 54% of the diversity of the 64 varieties with 27% of SNPs segregating. We demonstrate the potential of this MAGIC resource by identifying a highly diagnostic marker for the morphological character "awn presence/absence" and independently validate it in an association-mapping panel. These analyses show this large, diverse, and highly recombined MAGIC population to be a powerful resource for the genetic dissection of target traits in wheat, and it is well-placed to efficiently exploit ongoing advances in phenomics and genomics. Genetic marker and trait data, together with instructions for access to seed, are available at http://www.niab.com/MAGIC/.
Resumo:
How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we found that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (ka) and after no more than an 8000-year isolation period in Beringia. After their arrival to the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 ka, one that is now dispersed across North and South America and the other restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative “Paleoamerican” relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.
Resumo:
Abstract Background: The amount and structure of genetic diversity in dessert apple germplasm conserved at a European level is mostly unknown, since all diversity studies conducted in Europe until now have been performed on regional or national collections. Here, we applied a common set of 16 SSR markers to genotype more than 2,400 accessions across 14 collections representing three broad European geographic regions (North+East, West and South) with the aim to analyze the extent, distribution and structure of variation in the apple genetic resources in Europe. Results: A Bayesian model-based clustering approach showed that diversity was organized in three groups, although these were only moderately differentiated (FST=0.031). A nested Bayesian clustering approach allowed identification of subgroups which revealed internal patterns of substructure within the groups, allowing a finer delineation of the variation into eight subgroups (FST=0.044). The first level of stratification revealed an asymmetric division of the germplasm among the three groups, and a clear association was found with the geographical regions of origin of the cultivars. The substructure revealed clear partitioning of genetic groups among countries, but also interesting associations between subgroups and breeding purposes of recent cultivars or particular usage such as cider production. Additional parentage analyses allowed us to identify both putative parents of more than 40 old and/or local cultivars giving interesting insights in the pedigree of some emblematic cultivars. Conclusions: The variation found at group and sub-group levels may reflect a combination of historical processes of migration/selection and adaptive factors to diverse agricultural environments that, together with genetic drift, have resulted in extensive genetic variation but limited population structure. The European dessert apple germplasm represents an important source of genetic diversity with a strong historical and patrimonial value. The present work thus constitutes a decisive step in the field of conservation genetics. Moreover, the obtained data can be used for defining a European apple core collection useful for further identification of genomic regions associated with commercially important horticultural traits in apple through genome-wide association studies.