964 results for Bayesian Population Modelling


Abstract:

The purpose of this paper is to develop a Bayesian analysis for right-censored survival data when immune or cured individuals may be present in the population from which the data are taken. In our approach, the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution, which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Model selection is also discussed, and the approach is illustrated with a real data set.
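
The cure-rate construction in this abstract can be sketched numerically. If the number of competing causes M follows a Conway-Maxwell-Poisson distribution with parameters eta and phi, the cured fraction is P(M = 0) = 1 / Z(eta, phi), and the population survival function is Z(eta * S(t), phi) / Z(eta, phi), where Z is the normalising constant and S(t) is the survival function of a single latent cause. The sketch below is only an illustration: the function names, the truncation of the series at 200 terms and the exponential form of S(t) are assumptions, not details taken from the paper.

```python
import math


def com_poisson_z(eta, phi, terms=200):
    """Truncated normalising constant Z(eta, phi) = sum_j eta**j / (j!)**phi."""
    if eta <= 0.0:
        return 1.0  # only the j = 0 term survives
    log_eta = math.log(eta)
    # Work on the log scale to avoid overflowing eta**j and j!.
    return sum(math.exp(j * log_eta - phi * math.lgamma(j + 1))
               for j in range(terms))


def cured_fraction(eta, phi):
    """P(M = 0): probability that no competing cause is present."""
    return 1.0 / com_poisson_z(eta, phi)


def population_survival(t, eta, phi, rate):
    """S_pop(t) = Z(eta * S(t), phi) / Z(eta, phi).

    S(t) = exp(-rate * t) is an assumed exponential latent survival function.
    """
    s = math.exp(-rate * t)
    return com_poisson_z(eta * s, phi) / com_poisson_z(eta, phi)
```

With phi = 1 the distribution reduces to the Poisson case, so `cured_fraction(eta, 1.0)` equals exp(-eta), and for large t the population survival levels off at the cured fraction rather than dropping to zero.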

Abstract:

Background: Human respiratory syncytial virus (HRSV) is one of the major etiologic agents of respiratory tract infections among children worldwide. Methodology/Principal Findings: Here, through a comprehensive analysis of the two major HRSV groups A and B (n = 1983), which comprise several genotypes, we present a complex pattern of population dynamics of HRSV over a period of 50 years (1956-2006). The circulation pattern of HRSV revealed a series of expansions and fluctuations of co-circulating lineages with a predominance of HRSVA. Positively selected amino acid substitutions of the G glycoprotein occurred upon population growth of GB3 with a 60-nucleotide insertion (GB3 Insert), while other genotypes acquired substitutions upon both population growth and decrease, possibly reflecting a role for immune-selected epitopes linked to the traced substitution sites, which may be relevant for vaccine design. The analysis showed the co-circulation and predominance of distinct HRSV genotypes in Brazil and suggested a year-round presence of the virus. In Brazil, GA2 and GA5 were the main causes of HRSV outbreaks until recently, when the GB3 Insert became highly prevalent. Using Bayesian methods, we determined the dispersal patterns of genotypes through several inferred migratory routes. Conclusions/Significance: Genotypes spread across continents and between neighboring areas. Crucially, genotypes also remained in any given region for extended periods, independent of seasonal outbreaks, possibly maintained by re-infection of the general population.

Abstract:

The objective of this paper is to model variations in test-day milk yields of first lactations of Holstein cows by random regression (RR) using B-spline functions and Bayesian inference, in order to fit adequate and parsimonious models for the estimation of genetic parameters. The authors used 152,145 test-day milk yield records from 7317 first lactations of Holstein cows. The model included additive genetic, permanent environmental and residual random effects. In addition, contemporary group and linear and quadratic effects of the age of cow at calving were included as fixed effects. The average lactation curve of the population was modeled with a fourth-order orthogonal Legendre polynomial. The authors concluded that a cubic B-spline with seven random regression coefficients for both the additive genetic and permanent environmental effects was the best according to residual mean square and residual variance estimates. Moreover, they suggested that a lower-order model (quadratic B-spline with seven random regression coefficients for both random effects) could be adopted because it yielded practically the same genetic parameter estimates with greater parsimony. (C) 2012 Elsevier B.V. All rights reserved.

Abstract:

Background: Banana cultivars are mostly derived from hybridization between wild diploid subspecies of Musa acuminata (A genome) and M. balbisiana (B genome), and they exhibit various levels of ploidy and genomic constitution. The Embrapa ex situ Musa collection contains over 220 accessions, of which only a few have been genetically characterized. Knowledge of the genetic relationships and diversity between modern cultivars and wild relatives would assist in conservation and breeding strategies. Our objectives were to determine the genomic constitution based on Internal Transcribed Spacer (ITS) region polymorphism, to determine the ploidy of all accessions by flow cytometry, and to investigate the population structure of the collection using Simple Sequence Repeat (SSR) loci as co-dominant markers with the Structure software, an analysis not previously performed in Musa. Results: Of the 221 accessions analyzed by flow cytometry, the correct ploidy was confirmed or established for 212 (95.9%), whereas digestion of the ITS region confirmed the genomic constitution of 209 (94.6%). Neighbor-joining clustering analysis derived from SSR binary data allowed the detection of two major groups, essentially distinguished by the presence or absence of the B genome, while subgroups were formed according to genomic composition and commercial classification. The co-dominant nature of SSRs was exploited to analyze the structure of the population with a Bayesian approach, detecting 21 subpopulations. Most of the subpopulations were in agreement with the clustering analysis. Conclusions: The data generated by flow cytometry, ITS and SSR supported the hypothesis of homeologous recombination between the A and B genomes, leading to discrepancies in the number of sets or portions from each parental genome. These phenomena have been largely disregarded in the evolution of banana, as the “single-step domestication” hypothesis had long predominated.
These findings will have an impact on future breeding approaches. Structure analysis enabled the efficient detection of the ancestry of tetraploid hybrids recently developed by breeding programs, and of some triploids. However, for the main commercial subgroups, Structure appeared to be less efficient at detecting the ancestry in diploid groups, possibly due to sampling restrictions. The possibility of inferring membership among accessions to correct for the effects of genetic structure opens possibilities for its use in marker-assisted selection by association mapping.

Abstract:

This thesis presents Bayesian solutions to inference problems for three types of social network data structures: a single observation of a social network, repeated observations of the same social network, and repeated observations of a social network developing through time. A social network is conceived as a structure consisting of actors and their social interactions with each other. A common conceptualisation of social networks is to let the actors be represented by nodes in a graph, with edges between pairs of nodes that are relationally tied to each other according to some definition. Statistical analysis of social networks is to a large extent concerned with modelling these relational ties, which lends itself to empirical evaluation. The first paper deals with a family of statistical models for social networks called exponential random graphs, which take various structural features of the network into account. In general, the likelihood functions of exponential random graphs are only known up to a constant of proportionality. A procedure for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods is presented. The algorithm consists of two basic steps: one in which an ordinary Metropolis-Hastings updating step is used, and another in which an importance sampling scheme is used to calculate the acceptance probability of the Metropolis-Hastings step. In Paper 2, a method for modelling reports given by actors (or other informants) on their social interaction with others is investigated in a Bayesian framework. The model contains two basic ingredients: the unknown network structure and functions that link this unknown network structure to the reports given by the actors. These functions take the form of probit link functions.
An intrinsic problem is that the model is not identified, meaning that there are combinations of values of the unknown structure and the parameters in the probit link functions that are observationally equivalent. Instead of using restrictions to achieve identification, it is proposed that the different observationally equivalent combinations of parameters and unknown structure be investigated a posteriori. Estimation of parameters is carried out using Gibbs sampling with a switching device that enables transitions between posterior modal regions. The main goal of the procedures is to provide tools for comparing different model specifications. Papers 3 and 4 propose Bayesian methods for longitudinal social networks. The premise of the models investigated is that overall change in social networks occurs as a consequence of sequences of incremental changes. Models for the evolution of social networks using continuous-time Markov chains are meant to capture these dynamics. Paper 3 presents an MCMC algorithm for exploring the posteriors of parameters for such Markov chains. More specifically, the unobserved evolution of the network in between observations is explicitly modelled, thereby avoiding the need to deal with explicit formulas for the transition probabilities. This enables likelihood-based parameter inference in a wider class of network evolution models than has been available before. Paper 4 builds on the inference procedure proposed in Paper 3 and demonstrates how to perform model selection for a class of network evolution models.
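
The two-step idea described for the first paper (a Metropolis-Hastings update whose acceptance probability is computed with importance sampling) can be illustrated on a toy case. The sketch below is not the thesis's algorithm: it assumes the simplest exponential random graph, with the edge count as the only statistic, a normal prior on the parameter and a random-walk proposal. For that model the ratio of normalising constants Z(theta') / Z(theta) equals the expectation of exp((theta' - theta) * s(Y)) under the current parameter, which is estimated here from simulated edge counts. All function names and default values are hypothetical.

```python
import math
import random


def sample_edge_counts(theta, n_dyads, k, rng):
    """Draw k edge counts from the edge-only model (independent dyads)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return [sum(rng.random() < p for _ in range(n_dyads)) for _ in range(k)]


def log_z_ratio(theta_new, theta_old, n_dyads, k, rng):
    """Importance-sampling estimate of log(Z(theta_new) / Z(theta_old))."""
    counts = sample_edge_counts(theta_old, n_dyads, k, rng)
    logs = [(theta_new - theta_old) * s for s in counts]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in logs) / k)


def mh_edge_model(s_obs, n_dyads, n_iter=1000, k=100, step=0.3,
                  prior_sd=10.0, seed=1):
    """Random-walk Metropolis-Hastings for theta given s_obs observed edges."""
    rng = random.Random(seed)
    theta, draws = 0.0, []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        # Log prior ratio for a N(0, prior_sd^2) prior (an assumption).
        log_prior = (theta ** 2 - proposal ** 2) / (2.0 * prior_sd ** 2)
        log_accept = (log_prior + (proposal - theta) * s_obs
                      - log_z_ratio(proposal, theta, n_dyads, k, rng))
        if rng.random() < math.exp(min(0.0, log_accept)):
            theta = proposal
        draws.append(theta)
    return draws
```

Because the normalising-constant ratio is re-estimated at every step, the chain only approximately targets the posterior; in this toy model the exact constant is available in closed form, which makes it a convenient check on the estimate.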

Abstract:

In this work we propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMRs) collected in many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies that avoid the bias introduced by aggregated analyses. Starting from observed disease counts and expected disease counts calculated from reference population disease rates, an SMR is derived in each area as the MLE under the Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the small population underlying the area or because of the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates across the map, aiming to detect anomalies and possible high-risk areas, have been proposed in the literature under both the classical and the Bayesian paradigm. We approach this issue with a decision-oriented method that focuses on multiple testing control, without leaving the preliminary-study perspective that an analysis of SMR indicators is expected to take. We implement control of the false discovery rate (FDR), a quantity widely used to address multiple comparison problems in the field of microarray data analysis but not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses. The small-areas issue creates difficulties in applying traditional methods for FDR estimation, which are usually based only on knowledge of the p-values (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value provide weak power in small areas, where the expected number of disease cases is small. Moreover, tests cannot be assumed to be independent when spatial correlation between SMRs is expected, nor are they identically distributed when the population underlying the map is heterogeneous.
The Bayesian paradigm offers a way to overcome the inappropriateness of p-value-based methods. Another peculiarity of the present work is that it proposes a fully Bayesian hierarchical model for FDR estimation when testing many null hypotheses of absence of risk. We use concepts from Bayesian models for disease mapping, referring in particular to the Besag, York and Mollié model (1991), often used in practice for its flexible prior assumption on the distribution of risks across regions. The borrowing of strength between prior and likelihood typical of a hierarchical Bayesian model has the advantage of evaluating a single test (i.e. a test in a single area) by means of all observations in the map under study, rather than just by means of the single observation. This improves the power of the test in small areas and addresses more appropriately the spatial correlation issue, which suggests that relative risks are closer in spatially contiguous regions. The proposed model aims to estimate the FDR by means of the MCMC-estimated posterior probabilities $b_i$ of the null hypothesis (absence of risk) for each area. An estimate of the expected FDR conditional on the data ($\widehat{\mathrm{FDR}}$) can be calculated for any set of $b_i$'s relative to areas declared high-risk (where the null hypothesis is rejected) by averaging the $b_i$'s themselves. The $\widehat{\mathrm{FDR}}$ can be used to provide an easy decision rule for selecting high-risk areas, i.e. selecting as many areas as possible such that the $\widehat{\mathrm{FDR}}$ does not exceed a prefixed value; we call these $\widehat{\mathrm{FDR}}$-based decision (or selection) rules. The sensitivity and specificity of such rules depend on the accuracy of the FDR estimate: over-estimation of the FDR causes a loss of power, and under-estimation of the FDR produces a loss of specificity. Moreover, our model has the interesting feature of still being able to provide an estimate of relative risk values, as in the Besag, York and Mollié model (1991).
A simulation study was set up to evaluate the model performance in terms of FDR estimation accuracy, sensitivity and specificity of the decision rule, and goodness of estimation of relative risks. We chose a real map from which we generated several spatial scenarios whose disease counts vary according to the degree of spatial correlation, the size of the areas, the number of areas where the null hypothesis is true and the risk level in the latter areas. In summarizing simulation results we always consider the FDR estimation in sets constituted by all $b_i$'s lower than a threshold $t$. We show graphs of the $\widehat{\mathrm{FDR}}$ and the true FDR (known by simulation) plotted against the threshold $t$ to assess the FDR estimation. By varying the threshold we can learn which FDR values can be accurately estimated by a practitioner willing to apply the model (by the closeness between $\widehat{\mathrm{FDR}}$ and the true FDR). By plotting the calculated sensitivity and specificity (both known by simulation) against the $\widehat{\mathrm{FDR}}$ we can check the sensitivity and specificity of the corresponding $\widehat{\mathrm{FDR}}$-based decision rules. To investigate the degree of over-smoothing of relative risk estimates we compare box-plots of such estimates in high-risk areas (known by simulation), obtained by both our model and the classic Besag, York and Mollié model. All the summary tools are worked out for all simulated scenarios (54 scenarios in total). Results show that the FDR is well estimated (in the worst case we get an over-estimation, hence a conservative FDR control) in scenarios with small areas, low risk levels and spatially correlated risks, which are our primary aims. In such scenarios we have good estimates of the FDR for all values less than or equal to 0.10. The sensitivity of $\widehat{\mathrm{FDR}}$-based decision rules is generally low but specificity is high. In such scenarios the use of an $\widehat{\mathrm{FDR}} = 0.05$ or $\widehat{\mathrm{FDR}} = 0.10$ based selection rule can be suggested.
In cases where the number of true alternative hypotheses (number of true high-risk areas) is small, FDR values of 0.15 are also well estimated, and an $\widehat{\mathrm{FDR}} = 0.15$ based decision rule gains power while maintaining a high specificity. On the other hand, in scenarios with non-small areas and non-small risk levels the FDR is under-estimated except for very small values of it (much lower than 0.05), resulting in a loss of specificity of an $\widehat{\mathrm{FDR}} = 0.05$ based decision rule. In such scenarios $\widehat{\mathrm{FDR}} = 0.05$ or, even worse, $\widehat{\mathrm{FDR}} = 0.1$ based decision rules cannot be suggested because the true FDR is actually much higher. As regards relative risk estimation, our model achieves almost the same results as the classic Besag, York and Mollié model. For this reason, our model is interesting for its ability to perform both the estimation of relative risk values and FDR control, except in scenarios with non-small areas and large risk levels. Finally, a case study is presented to show how the method can be used in epidemiology.
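
The selection rule described in this abstract (average the posterior null probabilities of the areas declared high-risk, and select as many areas as possible while that average stays within a prefixed value) can be sketched in a few lines. The sketch assumes the posterior null probabilities have already been obtained, for example from MCMC output; the function names are hypothetical, and the crude SMR helper simply divides observed by expected counts, which is the Poisson MLE mentioned in the text.

```python
def smr(observed, expected):
    """Crude SMR per area: observed count divided by expected count."""
    return [o / e for o, e in zip(observed, expected)]


def fdr_selection(null_probs, alpha):
    """Select the largest set of areas whose estimated FDR is at most alpha.

    null_probs[i] is the posterior probability of the null hypothesis
    (absence of risk) in area i. The estimated FDR of a selected set is
    the mean of its posterior null probabilities.
    """
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    selected, running_sum, fdr_hat = [], 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        running_sum += null_probs[i]
        # The running mean of sorted values never decreases, so the
        # first violation marks the largest admissible set.
        if running_sum / rank > alpha:
            break
        selected.append(i)
        fdr_hat = running_sum / rank
    return selected, fdr_hat
```

For example, with posterior null probabilities 0.5, 0.01, 0.9, 0.02 and 0.2 and a prefixed value of 0.10, the rule selects the areas with probabilities 0.01, 0.02 and 0.2, whose mean is about 0.077; adding the area with 0.5 would raise the estimate above 0.10.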

Abstract:

The aim of this work is to put forward a statistical mechanics theory of social interaction, generalizing econometric discrete choice models. After showing the formal equivalence linking econometric multinomial logit models to equilibrium statistical mechanics, a multi-population generalization of the Curie-Weiss model for ferromagnets is considered as a starting point in developing a model capable of describing sudden shifts in aggregate human behaviour. Existence of the thermodynamic limit for the model is shown by an asymptotic sub-additivity method, and factorization of correlation functions is proved almost everywhere. The exact solution for the model is provided in the thermodynamic limit by finding converging upper and lower bounds for the system's pressure, and the solution is used to prove an analytic result regarding the number of possible equilibrium states of a two-population system. The work stresses the importance of linking regimes predicted by the model to real phenomena, and to this end it proposes two possible procedures to estimate the model's parameters starting from micro-level data. These are applied to three case studies based on census-type data: though these studies are found to be ultimately inconclusive on an empirical level, considerations are drawn that encourage further refinements of the chosen modelling approach, to be considered in future work.
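
Two ingredients named in this abstract can be illustrated with a short sketch. The multinomial logit choice probabilities have the same form as a Boltzmann distribution in which utilities play the role of negative energies, and the standard single-population Curie-Weiss model has the mean-field self-consistency equation m = tanh(beta * (J * m + h)). The code below covers only that single-population textbook case, not the multi-population generalization developed in the thesis; the function names and the fixed-point iteration are assumptions made for illustration.

```python
import math


def logit_choice_probabilities(utilities, beta=1.0):
    """Multinomial logit probabilities, i.e. Boltzmann weights exp(beta * u)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def curie_weiss_magnetisation(beta, coupling, field, m0=0.5,
                              tol=1e-12, max_iter=10000):
    """Iterate m <- tanh(beta * (coupling * m + field)) to a fixed point."""
    m = m0
    for _ in range(max_iter):
        m_next = math.tanh(beta * (coupling * m + field))
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m
```

With zero field, the iteration settles at m = 0 when beta * coupling is below 1 and at a non-zero value when it is above 1. That change in the set of equilibria is the kind of sudden shift in aggregate behaviour the abstract refers to.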

Abstract:

Two Amerindian populations, one from the Peruvian Amazon (Yanesha) and one from the rural lowlands of the Argentinean Gran Chaco (Wichi), were analyzed. They represent two case studies of South American genetic variability. The Yanesha represent a model of a population isolated for a long time in the Amazon rainforest, characterized by environmental and altitudinal stratifications. The Wichi represent a model of a population living in an area recently colonized by European populations (the Criollos are the population of admixed descendants); the aim here is to depict the native ancestral gene pool and the degree of admixture, in relation to the very high prevalence of Chagas disease. The genotyping methods used are common ones, covering Y chromosome markers (male lineage) and mitochondrial markers (maternal lineage). The phylogeographic diagnostic polymorphisms were determined by the classical techniques of PCR, restriction enzymes, sequencing and specific mini-sequencing. A new method for the detection of the protozoan Trypanosoma cruzi was developed by means of nested PCR. The main results show patterns of genetic stratification in Yanesha forest communities, attributable to different migrations at different times, estimated by Bayesian analyses. In particular, the Yanesha were considered as a transitional population between the Amazon basin and the Andean Cordillera, evaluating the potential migration routes and the separation of community clusters in relation to different genetic bio-ancestry. As for the Wichi, the gene pool analyzed appears clearly differentiated from that of the admixed sympatric Criollos, due to strict social practices (analyzed in depth with the support of cultural anthropological tools) that have preserved the native identity at a diachronic level.
Seropositivity shows no pattern of distribution in relation to the different phylogenetic lineages (adaptation in evolutionary terms), whether Amerindian or European; rather, it is related to the environmental and living conditions of the two distinct subpopulations.

Abstract:

This doctoral thesis is devoted to the study of the causal effects of maternal smoking on delivery cost. The economic consequences of smoking in pregnancy have been studied fairly extensively in the USA, but very little is known in the European context. To identify the causal relation between different maternal smoking statuses and delivery cost in the Emilia-Romagna region, two distinct methods were used. The first, geometric multidimensional, is mainly based on the multivariate approach and involves computing and testing the global imbalance, classifying cases in order to generate well-matched comparison groups, and then computing treatment effects. The second, structural modelling, refers to a general methodological account of model-building and model-testing. The main idea of this approach is to decompose the global mechanism into sub-mechanisms through a recursive decomposition of a multivariate distribution.

Abstract:

Introgression of domestic cat genes into European wildcat (Felis silvestris silvestris) populations and reduction of the wildcat's range in Europe, driven by habitat loss and fragmentation, are considered two of the main conservation problems for this endangered feline. This thesis addressed questions related to artificial hybridization and population fragmentation from a conservation genetics perspective. We combined the use of highly polymorphic loci, Bayesian statistical inference and landscape analysis tools to investigate the origin of the geographic-genetic substructure of European wildcats (Felis silvestris silvestris) in Italy and Europe. The genetic variability of microsatellites showed that the European wildcat populations currently distributed in Italy differentiated in, and expanded from, two distinct glacial refugia during the Last Glacial Maximum. The genetic and geographic substructure detected between the eastern and western sides of the Apennine ridge resulted from adaptation to specific ecological conditions of the Mediterranean habitats. European wildcat populations in Europe are strongly structured into 5 geographic-genetic macro clusters corresponding to: the Italian peninsula & Sicily; Balkans & north-eastern Italy; eastern Germany; central Europe; and the Iberian Peninsula. The central European population might have differentiated in the extra-Mediterranean Würm ice age refuge areas (Northern Alps, Carpathians, and the Bulgarian mountain systems), while the divergence among and within the southern European populations might have resulted from the Pleistocene biogeographical framework of Europe, with three southern refugia located in the Balkans, the Italian Peninsula and the Iberian Peninsula. We further combined the use of the most informative autosomal SNPs with uniparental markers (mtDNA and Y-linked) to accurately detect parental genotypes and levels of introgressive hybridization between European wild and domestic cats.
A total of 11 hybrids were identified. The presence of domestic mitochondrial haplotypes shared with some wild individuals led us to hypothesize that ancient introgressive events might have occurred, and further investigation is recommended.

Abstract:

1. There are a number of models describing population structure, many of which have the capacity to incorporate spatial habitat effects. One such model is the source-sink model, which describes a system where some habitats have a natality that is higher than mortality (source) and others have a mortality that exceeds natality (sink). A source can be maintained in the absence of migration, whereas a sink will go extinct. 2. However, the interaction between population dynamics and habitat quality is complex, and concerns have been raised about the validity of published empirical studies addressing source-sink dynamics. In particular, some of these studies fail to provide data on survival, a significant component in disentangling a sink from a low-quality source. Moreover, failing to account for a density-dependent increase in mortality, or decrease in fecundity, can result in a territory being falsely classified as a sink when, in fact, this density-dependent suppression only decreases the population size to a lower level, hence indicating a 'pseudo-sink'. 3. In this study, we investigate a long-term data set for key components of territory-specific demography (mortality and reproduction) and their relationship to habitat characteristics in the territorial, group-living Siberian jay (Perisoreus infaustus). We also assess territory-specific population growth rates (r), to test whether spatial population dynamics are consistent with the ideas of source-sink dynamics. 4. Although average mortality did not differ between sexes, habitat-specific mortality did. Female mortality was higher in older forests, a pattern not observed in males. Male mortality only increased with an increasing amount of open areas. Moreover, reproductive success was higher further away from human settlement, indicating a strong effect of human-associated nest predators. 5. Averaged over all years, 76% of the territories were sources.
These territories generally contained fewer open areas and were located further away from human settlement. 6. The source-sink model provides a tool for modelling demography in distinct habitat patches of different quality, which can aid in identifying key habitats within the landscape and thus reduce the risk of implementing unsound management decisions.

Abstract:

Background: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms, which allow one to obtain parameter posterior distributions from simulations without requiring likelihood computations. Results: Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computation (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis: parameter sampling from prior distributions, data simulation, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
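
The simplest of the algorithms listed in this abstract, ABC rejection sampling, can be sketched on a toy problem. The sketch below does not use ABCtoolbox or reproduce its interface. It assumes a normal model with known unit variance, a uniform prior on the mean and the sample mean as the only summary statistic; all names and default values are hypothetical.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws=20000, tolerance=0.1,
                  prior=(0.0, 10.0), seed=1):
    """Approximate posterior draws of a normal mean by ABC rejection.

    Each iteration samples a parameter from the prior, simulates a data
    set of n_obs values, and keeps the parameter only if the simulated
    summary statistic lies within `tolerance` of the observed one.
    No likelihood is evaluated.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        summary = sum(simulated) / n_obs
        if abs(summary - observed_mean) < tolerance:
            accepted.append(theta)
    return accepted
```

A call such as `abc_rejection(3.0, 20)` returns the accepted parameter values, whose spread approximates the posterior. Tightening `tolerance` improves the approximation but lowers the acceptance rate, which is the trade-off that motivates the MCMC and particle-based variants.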

Abstract:

Background: Depressive and anxiety symptoms often co-occur, resulting in a debate about common and distinct features of depression and anxiety. Methods: An exploratory factor analysis (EFA) and a bifactor modelling approach were used to separate a general distress continuum from more specific sub-domains of depression and anxiety in an adolescent community sample (n = 1159, age 14). The Mood and Feelings Questionnaire and the Revised Children's Manifest Anxiety Scale were used. Results: A three-factor confirmatory factor analysis is reported which identified a) mood and social-cognitive symptoms of depression, b) worrying symptoms, and c) somatic and information-processing symptoms as distinct yet closely related constructs. Subsequent bifactor modelling supported a general distress factor which accounted for the communality of the depression and anxiety items. Specific factors for hopelessness-suicidal thoughts and restlessness-fatigue indicated distinct psychopathological constructs which account for unique information over and above the general distress factor. The general distress factor and the hopelessness-suicidal factor were more severe in females, but the restlessness-fatigue factor was worse in males. Measurement precision of the general distress factor was higher and spanned a wider range of the population than any of the three first-order factors. Conclusions: The general distress factor provides the most reliable target for epidemiological analysis, but specific factors may help to refine valid phenotype dimensions for aetiological research and assist in prognostic modelling of future psychiatric episodes.

Abstract:

Knowledge of the relative importance of alternative sources of human campylobacteriosis is important in order to implement effective disease prevention measures. The objective of this study was to assess the relative importance of three key exposure pathways (travelling abroad, poultry meat, pet contact) for different patient age groups in Switzerland. Using a stochastic exposure model, data on Campylobacter incidence for the years 2002-2007 were linked with data for the three exposure pathways and the results of a case-control study. Mean values for the population attributable fractions (PAF) over all age groups and years were 27% (95% CI 17-39) for poultry consumption, 27% (95% CI 22-32) for travelling abroad, 8% (95% CI 6-9) for pet contact and 39% (95% CI 25-50) for other risk factors. This model provided robust results when using data available for Switzerland, but the uncertainties remained high. The output of the model could be improved if more accurate input data were available to estimate the infection rate per exposure. In particular, the relatively high proportion of cases attributed to 'other risk factors' requires further attention.