933 results for Bayesian mixture model


Relevance:

30.00%

Publisher:

Abstract:

In this work we compared the estimates of the parameters of ARCH models obtained using a complete Bayesian method, which adopts a non-informative prior distribution, and an empirical Bayesian method, which adopts an informative prior distribution. We also considered a reparameterization of those models that maps the parameter space into the real space; this allows normal prior distributions to be chosen for the transformed parameters. The posterior summaries were obtained using Markov chain Monte Carlo (MCMC) methods. The methodology was evaluated on the Telebras series from the Brazilian financial market. The results show that both methods are able to fit ARCH models with different numbers of parameters. The empirical Bayesian method provided a more parsimonious model and a better fit to the data than the complete Bayesian method.
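The reparameterization idea can be illustrated with a short sketch (not the authors' code; the ARCH(1) form, the priors and the proposal scale below are assumptions chosen for illustration): ω > 0 and 0 < α < 1 are mapped to the real line by log and logit transforms, normal priors are placed on the transformed parameters, and the posterior is explored by random-walk Metropolis.

```python
import numpy as np

def arch1_loglik(y, omega, alpha):
    """Conditional log-likelihood of an ARCH(1) model y_t = sigma_t * eps_t."""
    sigma2 = omega + alpha * y[:-1] ** 2                     # sigma_t^2 for t = 2..T
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + y[1:] ** 2 / sigma2)

def log_posterior(y, phi, prior_sd=10.0):
    """Posterior of phi = (log omega, logit alpha) with independent N(0, prior_sd^2) priors."""
    omega, alpha = np.exp(phi[0]), 1.0 / (1.0 + np.exp(-phi[1]))
    log_prior = -0.5 * np.sum(phi ** 2) / prior_sd ** 2
    return arch1_loglik(y, omega, alpha) + log_prior

def rw_metropolis(y, n_iter=20000, step=0.05, seed=0):
    """Random-walk Metropolis on the transformed (unconstrained) parameters."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(2)                                        # start at omega = 1, alpha = 0.5
    lp = log_posterior(y, phi)
    draws = np.empty((n_iter, 2))
    for i in range(n_iter):
        prop = phi + step * rng.standard_normal(2)
        lp_prop = log_posterior(y, prop)
        if np.log(rng.uniform()) < lp_prop - lp:             # Metropolis accept/reject
            phi, lp = prop, lp_prop
        draws[i] = phi
    # map back to the original parameterisation for posterior summaries
    return np.column_stack([np.exp(draws[:, 0]), 1.0 / (1.0 + np.exp(-draws[:, 1]))])

# toy usage with a simulated return series standing in for the Telebras data
rng = np.random.default_rng(1)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = np.sqrt(0.5 + 0.3 * y[t - 1] ** 2) * rng.standard_normal()
omega_draws, alpha_draws = rw_metropolis(y)[5000:].T          # drop burn-in
print(omega_draws.mean(), alpha_draws.mean())
```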

Relevance:

30.00%

Publisher:

Abstract:

A detailed numerical simulation of ethanol turbulent spray combustion in a round jet flame is presented in this article. The focus is to propose a robust mathematical model with relatively low-complexity sub-models to reproduce the main characteristics of the coupling between both phases, such as turbulence modulation, turbulent droplet dissipation, and the evaporative cooling effect. A RANS turbulence model is implemented. Special features of the model include an Eulerian–Lagrangian procedure under fully two-way coupling and a modified flame sheet model with a joint mixture fraction–enthalpy β-PDF. Reasonable agreement between measured and computed mean gas-phase temperature profiles and droplet size distributions is achieved. Deviations found between measured and predicted mean velocity profiles are attributed to the turbulent combustion modelling adopted.

Relevance:

30.00%

Publisher:

Abstract:

This thesis presents Bayesian solutions to inference problems for three types of social network data structures: a single observation of a social network, repeated observations on the same social network, and repeated observations on a social network developing through time. A social network is conceived as a structure consisting of actors and their social interaction with each other. A common conceptualisation of social networks is to let the actors be represented by nodes in a graph, with edges between pairs of nodes that are relationally tied to each other according to some definition. Statistical analysis of social networks is to a large extent concerned with modelling these relational ties, which lends itself to empirical evaluation. The first paper deals with a family of statistical models for social networks called exponential random graphs, which take various structural features of the network into account. In general, the likelihood functions of exponential random graphs are only known up to a constant of proportionality. A procedure for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods is presented. The algorithm consists of two basic steps, one in which an ordinary Metropolis-Hastings updating step is used, and another in which an importance sampling scheme is used to calculate the acceptance probability of the Metropolis-Hastings step. In the second paper, a method for modelling reports given by actors (or other informants) on their social interaction with others is investigated in a Bayesian framework. The model contains two basic ingredients: the unknown network structure and functions that link this unknown network structure to the reports given by the actors. These functions take the form of probit link functions. An intrinsic problem is that the model is not identified, meaning that there are combinations of values of the unknown structure and the parameters in the probit link functions that are observationally equivalent. Instead of using restrictions to achieve identification, it is proposed that the different observationally equivalent combinations of parameters and unknown structure be investigated a posteriori. Estimation of parameters is carried out using Gibbs sampling with a switching device that enables transitions between posterior modal regions. The main goal of the procedures is to provide tools for comparisons of different model specifications. Papers 3 and 4 propose Bayesian methods for longitudinal social networks. The premise of the models investigated is that overall change in social networks occurs as a consequence of sequences of incremental changes. Models for the evolution of social networks using continuous-time Markov chains are meant to capture these dynamics. Paper 3 presents an MCMC algorithm for exploring the posteriors of parameters for such Markov chains. More specifically, the unobserved evolution of the network in between observations is explicitly modelled, thereby avoiding the need to deal with explicit formulas for the transition probabilities. This enables likelihood-based parameter inference in a wider class of network evolution models than has been available before. Paper 4 builds on the proposed inference procedure of Paper 3 and demonstrates how to perform model selection for a class of network evolution models.
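The first paper's key computational idea — a Metropolis-Hastings update for θ whose intractable ratio of normalising constants is replaced by an importance-sampling estimate — can be sketched in a deliberately simplified form. Everything below (only two statistics, a plain Gibbs dyad sampler as the auxiliary graph simulator, the prior and tuning constants) is an assumption for illustration, not the thesis' algorithm.

```python
import numpy as np

def stats(adj):
    """Sufficient statistics: number of edges and number of triangles."""
    return np.array([adj.sum() / 2, np.trace(adj @ adj @ adj) / 6])

def simulate_graph(theta, n, sweeps=20, rng=None):
    """Gibbs sampler over dyads for the ERGM p(y|theta) ∝ exp(theta . s(y))."""
    rng = rng or np.random.default_rng()
    adj = np.zeros((n, n), dtype=int)
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                common = int(np.sum(adj[i] * adj[j]))        # triangles created by edge (i, j)
                delta = np.array([1.0, float(common)])       # change statistics of the dyad
                p_on = 1.0 / (1.0 + np.exp(-theta @ delta))
                adj[i, j] = adj[j, i] = int(rng.uniform() < p_on)
    return adj

def log_ratio_Z(theta_cur, theta_prop, n, m=25, rng=None):
    """Importance-sampling estimate of log[Z(theta_cur) / Z(theta_prop)],
    based on graphs simulated from the model at theta_prop."""
    rng = rng or np.random.default_rng()
    w = np.array([(theta_cur - theta_prop) @ stats(simulate_graph(theta_prop, n, rng=rng))
                  for _ in range(m)])
    return w.max() + np.log(np.mean(np.exp(w - w.max())))

def bayes_ergm(adj_obs, n_iter=100, step=0.1, prior_sd=5.0, seed=0):
    """Metropolis-Hastings for theta with the estimated normalising-constant ratio."""
    rng = np.random.default_rng(seed)
    n, s_obs = adj_obs.shape[0], stats(adj_obs)
    theta, draws = np.array([-1.0, 0.0]), []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(2)
        log_alpha = ((prop - theta) @ s_obs                        # unnormalised likelihood ratio
                     + log_ratio_Z(theta, prop, n, rng=rng)        # estimated Z(theta)/Z(prop)
                     - 0.5 * (prop @ prop - theta @ theta) / prior_sd ** 2)   # N(0, sd^2) prior
        if np.log(rng.uniform()) < log_alpha:
            theta = prop
        draws.append(theta.copy())
    return np.array(draws)

# toy usage on a small simulated "observed" network (slow: every iteration simulates graphs)
adj_obs = simulate_graph(np.array([-1.0, 0.2]), n=16, rng=np.random.default_rng(4))
posterior_draws = bayes_ergm(adj_obs, n_iter=100)
print(posterior_draws[50:].mean(axis=0))
```

In practice one would use far more auxiliary simulations per step and longer Gibbs runs than in this toy setting; the point of the sketch is only the structure of the acceptance probability.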

Relevance:

30.00%

Publisher:

Abstract:

Wave breaking is an important coastal process, influencing hydro-morphodynamic processes such as turbulence generation and wave energy dissipation, run-up on the beach and overtopping of coastal defence structures. During breaking, waves are complex mixtures of air and water (“white water”) whose properties affect the velocity and pressure fields in the vicinity of the free surface and, depending on the breaker characteristics, different mechanisms for air entrainment are usually observed. Several laboratory experiments have investigated the role of air bubbles in the wave breaking process (Chanson & Cummings, 1994, among others) and in wave loading on vertical walls (Oumeraci et al., 2001; Peregrine et al., 2006, among others), showing that the air phase is not negligible, since the turbulent energy dissipation involves the air-water mixture. Recent advances in numerical modelling have given valuable insights into wave transformation and interaction with coastal structures. Among these models, some solve the RANS equations coupled with a free-surface tracking algorithm and describe the velocity, pressure, turbulence and vorticity fields (Lara et al., 2006a-b; Clementi et al., 2007). A single-phase numerical model, in which the constitutive equations are solved only for the liquid phase, neglects the effects induced by air movement and by air bubbles trapped in the water. Numerical approximations at the free surface may induce errors in predicting the breaking point and the wave height; moreover, entrapped air bubbles and water splashing in air are not properly represented. The aim of the present thesis is to develop a new two-phase model called COBRAS2 (Cornell Breaking waves And Structures, 2 phases), an enhancement of the single-phase code COBRAS0 originally developed at Cornell University (Lin & Liu, 1998). In the first part of the work both fluids are considered incompressible, while the second part treats the modelling of air compressibility. The mathematical formulation and the numerical solution of the governing equations of COBRAS2 are derived and several model-experiment comparisons are shown. In particular, validation tests are performed in order to prove the stability and accuracy of the model. The simulation of a large air bubble rising in an otherwise quiescent water pool shows the capability of the model to reproduce the physics of the process in a realistic way. Analytical solutions for stationary and internal waves are compared with the corresponding numerical results, in order to test processes involving a wide range of density differences. Waves induced by dam-break in different scenarios (on dry and wet beds, as well as on a ramp) are studied, focusing on the role of air as the medium in which the water wave propagates and on the numerical representation of bubble dynamics. Simulations of solitary and regular waves, characterized by both spilling and plunging breakers, are analysed and compared with experimental data and other numerical models, in order to investigate the influence of air on the wave breaking mechanisms and to highlight the capability and accuracy of the model. Finally, the modelling of air compressibility is included in the newly developed model and validated, showing an accurate reproduction of the processes.
Some preliminary tests on wave impact on vertical walls are performed: since modelling the air flow allows a more realistic reproduction of breaking wave propagation, the dependence of the impact pressure values on the breaker shape and aeration characteristics is studied and, on the basis of a qualitative comparison with experimental observations, the numerical simulations achieve good results.

Relevance:

30.00%

Publisher:

Abstract:

In this work we propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMRs) collected over many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies, which avoid the bias carried by aggregated analyses. Starting from the observed disease counts and the expected disease counts computed from reference population disease rates, an SMR is derived for each area as the MLE under a Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the small population underlying the area or because of the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates across the map, aiming to detect anomalies and possible high-risk areas, have been proposed in the literature under both the classical and the Bayesian paradigm. Our proposal approaches this issue with a decision-oriented method focused on multiple testing control, without leaving the preliminary-study perspective that an analysis of SMR indicators is required to keep. We implement control of the false discovery rate (FDR), a quantity widely used to address multiple comparison problems in microarray data analysis but not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses. The small-areas issue raises difficulties in applying traditional methods for FDR estimation, which are usually based only on knowledge of the p-values (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value have weak power in small areas, where the expected number of disease cases is small. Moreover, the tests cannot be assumed independent when spatial correlation between SMRs is expected, nor are they identically distributed when the population underlying the map is heterogeneous. The Bayesian paradigm offers a way to overcome the inappropriateness of p-value based methods. Another peculiarity of the present work is to propose a hierarchical full Bayesian model for FDR estimation when testing many null hypotheses of absence of risk. We use concepts from Bayesian disease mapping models, referring in particular to the Besag, York and Mollié model (1991), often used in practice for its flexible prior assumption on the distribution of risks across regions. The borrowing of strength between prior and likelihood that is typical of a hierarchical Bayesian model has the advantage of evaluating a single test (i.e. a test in a single area) by means of all the observations in the map under study, rather than by means of that single observation alone. This improves the power of the test in small areas and addresses more appropriately the spatial correlation issue, which suggests that relative risks are closer in spatially contiguous regions. The proposed model estimates the FDR by means of the MCMC-estimated posterior probabilities b_i of the null hypothesis (absence of risk) for each area. An estimate of the expected FDR conditional on the data can be calculated for any set of areas declared at high risk (where the null hypothesis is rejected) by averaging the corresponding b_i's. This estimated FDR provides an easy decision rule for selecting high-risk areas, i.e. selecting as many areas as possible such that the estimated FDR does not exceed a prefixed value; we call these estimated-FDR based decision (or selection) rules.
The sensitivity and specificity of such a rule depend on the accuracy of the FDR estimate: over-estimation of the FDR causes a loss of power, while under-estimation produces a loss of specificity. Moreover, our model retains the interesting feature of providing an estimate of the relative risk values, as in the Besag, York and Mollié model (1991). A simulation study was set up to evaluate the accuracy of the FDR estimation, the sensitivity and specificity of the decision rule, and the goodness of the relative risk estimates. We chose a real map from which we generated several spatial scenarios whose disease counts vary according to the degree of spatial correlation, the size of the areas, the number of areas where the null hypothesis is true and the risk level in the remaining areas. In summarizing the simulation results we always consider the FDR estimated in the sets formed by all areas whose b_i falls below a threshold t. We show graphs of the estimated FDR and the true FDR (known by simulation) plotted against the threshold t to assess the FDR estimation. Varying the threshold, we can learn which FDR values can be accurately estimated by a practitioner willing to apply the model (from the closeness between the estimated and the true FDR). By plotting the calculated sensitivity and specificity (both known by simulation) against the estimated FDR we can check the sensitivity and specificity of the corresponding decision rules. To investigate the over-smoothing of the relative risk estimates we compare box-plots of such estimates in high-risk areas (known by simulation), obtained both with our model and with the classical Besag, York and Mollié model. All the summary tools are worked out for all simulated scenarios (54 scenarios in total). Results show that the FDR is well estimated (in the worst case we get an over-estimation, hence a conservative FDR control) in the scenarios with small areas, low risk levels and spatially correlated risks, which are our primary aim. In such scenarios we obtain good estimates of the FDR for all values less than or equal to 0.10. The sensitivity of the estimated-FDR based decision rules is generally low, but the specificity is high; in these scenarios selection rules based on an estimated FDR of 0.05 or 0.10 can be suggested. In cases where the number of true alternative hypotheses (the number of truly high-risk areas) is small, FDR values up to 0.15 are also well estimated, and decision rules based on an estimated FDR of 0.15 gain power while maintaining high specificity. On the other hand, in the scenarios with non-small areas and non-small risk levels the FDR is under-estimated except for very small values (much lower than 0.05), resulting in a loss of specificity of a decision rule based on an estimated FDR of 0.05. In such scenarios, decision rules based on an estimated FDR of 0.05 or, even worse, 0.10 cannot be suggested, because the true FDR is actually much higher. As regards relative risk estimation, our model achieves almost the same results as the classical Besag, York and Mollié model. For these reasons, our model is interesting for its ability to provide both relative risk estimates and FDR control, except in the scenarios with non-small areas and large risk levels. A case study is finally presented to show how the method can be used in epidemiology.
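The estimated-FDR selection rule itself is simple enough to sketch. The snippet below assumes the posterior null probabilities b_i are already available from the MCMC output; the values and the level are purely illustrative.

```python
import numpy as np

def fdr_selection(b, level=0.05):
    """Indices of areas declared high-risk so that the estimated FDR stays <= level."""
    b = np.asarray(b, dtype=float)
    order = np.argsort(b)                                    # smallest null probabilities first
    running = np.cumsum(b[order]) / np.arange(1, b.size + 1)  # estimated FDR of the first k areas
    k = int(np.sum(running <= level))                        # largest prefix keeping FDR-hat <= level
    selected = order[:k]
    fdr_hat = running[k - 1] if k > 0 else 0.0
    return selected, fdr_hat

# toy usage with hypothetical posterior null probabilities from the MCMC output
b = [0.01, 0.03, 0.04, 0.20, 0.45, 0.80, 0.95]
areas, fdr_hat = fdr_selection(b, level=0.10)
print(areas, fdr_hat)   # first four areas, estimated FDR = 0.07
```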

Relevance:

30.00%

Publisher:

Abstract:

Despite intensive research during the last decades, the theoretical understanding of supercooled liquids and the glass transition is still far from being complete. Besides analytical investigations, the so-called energy-landscape approach has turned out to be very fruitful. In the literature, many numerical studies have demonstrated that, at sufficiently low temperatures, all thermodynamic quantities can be predicted with the help of the properties of local minima in the potential-energy landscape (PEL). The main purpose of this thesis is to strive for an understanding of dynamics in terms of the potential energy landscape. In contrast to the study of static quantities, this requires the knowledge of barriers separating the minima. Up to now, it has been the general viewpoint that thermally activated processes ('hopping') determine the dynamics only below Tc (the critical temperature of mode-coupling theory), in the sense that relaxation rates follow from local energy barriers. As we show here, this viewpoint should be revised, since the temperature dependence of the dynamics is governed by hopping processes already below 1.5 Tc. Using the example of a binary mixture of Lennard-Jones particles (BMLJ), we establish a quantitative link from the diffusion coefficient, D(T), to the PEL topology. This is achieved in three steps: First, we show that it is essential to consider whole superstructures of many PEL minima, called metabasins, rather than single minima. This is a consequence of strong correlations within groups of PEL minima. Second, we show that D(T) is inversely proportional to the average residence time in these metabasins. Third, the temperature dependence of the residence times is related to the depths of the metabasins, as given by the surrounding energy barriers. We further discuss that the study of small (but not too small) systems is essential, in that one deals with a less complex energy landscape than in large systems. In a detailed analysis of different system sizes, we show that the small BMLJ system considered throughout the thesis is free of major finite-size-related artifacts.

Relevance:

30.00%

Publisher:

Abstract:

Forest models are tools for explaining and predicting the dynamics of forest ecosystems. They simulate forest behavior by integrating information on the underlying processes in trees, soil and atmosphere. Bayesian calibration is the application of probability theory to parameter estimation. It is a method, applicable to all models, that quantifies output uncertainty and identifies key parameters and variables. This study aims at testing the Bayesian calibration procedure on different types of forest models, to evaluate their performance and the uncertainties associated with them. In particular, we aimed at 1) applying a Bayesian framework to calibrate forest models and testing their performance in different biomes and different environmental conditions, 2) identifying and solving structure-related issues in simple models, and 3) identifying the advantages of the additional information made available when calibrating forest models with a Bayesian approach. We applied the Bayesian framework to calibrate the Prelued model on eight Italian eddy-covariance sites in Chapter 2. The ability of Prelued to reproduce the estimated Gross Primary Productivity (GPP) was tested over contrasting natural vegetation types that represented a wide range of climatic and environmental conditions. The issues related to Prelued's multiplicative structure were the main topic of Chapter 3: several different MCMC-based procedures were applied within a Bayesian framework to calibrate the model, and their performances were compared. A more complex model was applied in Chapter 4, focusing on the application of the physiology-based model HYDRALL to the forest ecosystem of Lavarone (IT) to evaluate the importance of additional information in the calibration procedure and its impact on model performance, model uncertainties, and parameter estimation. Overall, the Bayesian technique proved to be an excellent and versatile tool to successfully calibrate forest models of different structure and complexity, on different kinds and numbers of variables and with different numbers of parameters involved.
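As an illustration of the calibration procedure only, the sketch below fits an invented toy light-use-efficiency model (a stand-in that does not reproduce Prelued or HYDRALL) to synthetic GPP data with a Gaussian likelihood and a random-walk Metropolis sampler; the priors, data and tuning constants are all assumptions.

```python
import numpy as np

def toy_gpp_model(par, apar):
    """Toy light-use-efficiency model: GPP = epsilon * APAR with a simple saturation."""
    epsilon, k = par
    return epsilon * apar / (1.0 + apar / k)

def log_post(par, apar, gpp_obs, sd_obs=1.0):
    """Gaussian likelihood of observed GPP plus flat priors on positive parameters."""
    if np.any(np.asarray(par) <= 0):
        return -np.inf
    resid = gpp_obs - toy_gpp_model(par, apar)
    return -0.5 * np.sum((resid / sd_obs) ** 2)

def calibrate(apar, gpp_obs, n_iter=20000, seed=0):
    """Random-walk Metropolis calibration of the toy model."""
    rng = np.random.default_rng(seed)
    par = np.array([1.0, 10.0])
    lp = log_post(par, apar, gpp_obs)
    draws = np.empty((n_iter, 2))
    for i in range(n_iter):
        prop = par + np.array([0.02, 0.5]) * rng.standard_normal(2)   # per-parameter step sizes
        lp_prop = log_post(prop, apar, gpp_obs)
        if np.log(rng.uniform()) < lp_prop - lp:
            par, lp = prop, lp_prop
        draws[i] = par
    return draws[n_iter // 2:]                                        # discard burn-in

# synthetic "eddy-covariance" data in place of the Italian sites
rng = np.random.default_rng(3)
apar = rng.uniform(1, 30, size=365)
gpp_obs = toy_gpp_model([1.8, 15.0], apar) + rng.normal(0, 1.0, size=365)
posterior = calibrate(apar, gpp_obs)
print(posterior.mean(axis=0), posterior.std(axis=0))   # parameter estimates and their uncertainty
```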

Relevance:

30.00%

Publisher:

Abstract:

Background: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms, which allow one to obtain parameter posterior distributions based on simulations, without requiring likelihood computations. Results: Here we present ABCtoolbox, a series of open-source programs to perform Approximate Bayesian Computation (ABC). It implements various ABC algorithms, including rejection sampling, MCMC without likelihood, a particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can interact with most simulation and summary statistics computation programs. The usability of ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates, and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from sampling parameters from prior distributions, through data simulation, computation of summary statistics, estimation of posterior distributions, model choice and validation of the estimation procedure, to visualization of the results.
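The simplest algorithm in this family, ABC rejection sampling, can be sketched generically (this is not ABCtoolbox's interface; the toy prior, simulator and summary statistic below are assumptions): draw parameters from the prior, simulate data, and keep the draws whose summary statistics fall close enough to the observed summaries.

```python
import numpy as np

def abc_rejection(observed_summary, prior_sampler, simulator, summary,
                  n_draws=20000, tolerance=0.1, rng=None):
    """Generic ABC rejection sampler returning an approximate posterior sample."""
    rng = rng or np.random.default_rng()
    obs = np.asarray(observed_summary, dtype=float)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                         # draw from the prior
        sim = summary(simulator(theta, rng))               # simulate data, summarise it
        if np.linalg.norm(sim - obs) <= tolerance:         # keep draws close to the observation
            accepted.append(theta)
    return np.array(accepted)

# toy usage: infer the mean of a normal population from its sample mean
data = np.random.default_rng(2).normal(1.5, 1.0, size=100)
posterior = abc_rejection(
    observed_summary=[data.mean()],
    prior_sampler=lambda rng: rng.uniform(-5, 5),          # uniform prior on the mean
    simulator=lambda mu, rng: rng.normal(mu, 1.0, size=100),
    summary=lambda x: np.array([x.mean()]),
    tolerance=0.05,
)
print(posterior.mean(), posterior.std())
```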

Relevance:

30.00%

Publisher:

Abstract:

In order to achieve host cell entry, the apicomplexan parasite Neospora caninum relies on the contents of distinct organelles, named micronemes, rhoptries and dense granules, which are secreted at defined time points during and after host cell entry. It was shown previously that a vaccine composed of a mixture of three recombinant antigens, corresponding to the two microneme antigens NcMIC1 and NcMIC3 and the rhoptry protein NcROP2, prevented disease and limited cerebral infection and transplacental transmission in mice. In this study, we selected predicted immunogenic domains of each of these proteins and created four different chimeric antigens, with the respective domains incorporated into these chimeras in different orders. Following vaccination, mice were challenged intraperitoneally with 2 × 10^6 N. caninum tachyzoites and were then carefully monitored for clinical symptoms during 4 weeks post-infection. Of the four chimeric antigens, only recNcMIC3-1-R provided complete protection against disease, with 100% survivors, compared to 40-80% survivors in the other groups. Serology did not show any clear differences in total IgG, IgG1 and IgG2a levels between the different treatment groups. Vaccination with all four chimeric variants generated an IL-4 biased cytokine expression, which then shifted to an IFN-γ-dominated response following experimental infection. Sera of recNcMIC3-1-R vaccinated mice reacted with each individual recombinant antigen, as well as with three distinct bands in Neospora extracts with Mr similar to those of NcMIC1, NcMIC3 and NcROP2, and exhibited distinct apical labeling in tachyzoites. These results suggest that recNcMIC3-1-R is an interesting chimeric vaccine candidate and should be followed up in subsequent studies in a fetal infection model.

Relevance:

30.00%

Publisher:

Abstract:

Indoor radon is regularly measured in Switzerland. However, a nationwide model to predict residential radon levels has not been developed. The aim of this study was to develop a prediction model to assess indoor radon concentrations in Switzerland. The model was based on 44,631 measurements from the nationwide Swiss radon database collected between 1994 and 2004. Of these, a randomly selected 80% were used for model development and the remaining 20% for an independent model validation. A multivariable log-linear regression model was fitted, and relevant predictors were selected according to evidence from the literature, the adjusted R², Akaike's information criterion (AIC), and the Bayesian information criterion (BIC). The prediction model was evaluated by calculating the Spearman rank correlation between measured and predicted values. Additionally, the predicted values were categorised into three categories (below the 50th, 50th-90th, and above the 90th percentile) and compared with the measured categories using a weighted kappa statistic. The most relevant predictors of indoor radon levels were tectonic units and year of construction of the building, followed by soil texture, degree of urbanisation, floor of the building where the measurement was taken and housing type (P-values <0.001 for all). Mean predicted radon values (geometric mean) were 66 Bq/m³ (interquartile range 40-111 Bq/m³) in the lowest exposure category, 126 Bq/m³ (69-215 Bq/m³) in the medium category, and 219 Bq/m³ (108-427 Bq/m³) in the highest category. The Spearman correlation between predictions and measurements was 0.45 (95% CI: 0.44; 0.46) for the development dataset and 0.44 (95% CI: 0.42; 0.46) for the validation dataset. Kappa coefficients were 0.31 for the development dataset and 0.30 for the validation dataset. The model explained 20% of the overall variability (adjusted R²). In conclusion, this residential radon prediction model, based on a large number of measurements, was demonstrated to be robust through validation with an independent dataset. The model is appropriate for predicting radon exposure of the Swiss population in epidemiological research. Nevertheless, some exposure misclassification and regression to the mean are unavoidable and should be taken into account in future applications of the model.
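A minimal sketch of the fit-and-validate strategy follows, with invented stand-in predictors (the real model uses tectonic unit, construction year, soil texture, urbanisation, floor and housing type) and synthetic data: radon is modelled on the log scale, the model is fitted on 80% of the records, and predictions are compared with measurements on the held-out 20% by Spearman rank correlation.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n),                      # intercept
                     rng.integers(0, 2, n),           # e.g. building built before a cut-off year
                     rng.integers(0, 3, n)])          # e.g. a three-level geological class
log_radon = X @ np.array([4.0, 0.4, 0.3]) + rng.normal(0, 0.8, n)   # synthetic outcome

train = rng.uniform(size=n) < 0.8                     # 80/20 split: development / validation
beta, *_ = np.linalg.lstsq(X[train], log_radon[train], rcond=None)  # log-linear fit
pred = X[~train] @ beta

rho, _ = spearmanr(np.exp(pred), np.exp(log_radon[~train]))
print(f"validation Spearman correlation: {rho:.2f}")
```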

Relevance:

30.00%

Publisher:

Abstract:

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.
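A stripped-down sketch of the hidden-Markov idea is given below: a three-state (loss / neutral / gain) Gaussian HMM on the log2 intensity ratios, with posterior state probabilities at each clone obtained from the forward-backward recursions. In the article the HMM parameters themselves carry priors and are sampled by Metropolis-within-Gibbs; here they are fixed at illustrative values.

```python
import numpy as np

MEANS, SD = np.array([-0.5, 0.0, 0.5]), 0.2            # loss, neutral, gain on the log2 scale
TRANS = np.array([[0.90, 0.09, 0.01],
                  [0.05, 0.90, 0.05],
                  [0.01, 0.09, 0.90]])                  # sticky transitions along the genome
INIT = np.array([0.1, 0.8, 0.1])

def posterior_states(log_ratios):
    """Posterior P(state_t | all data) at each position via scaled forward-backward."""
    y = np.asarray(log_ratios, dtype=float)
    emis = np.exp(-0.5 * ((y[:, None] - MEANS) / SD) ** 2) / (SD * np.sqrt(2 * np.pi))
    T = len(y)
    alpha, beta = np.zeros((T, 3)), np.zeros((T, 3))
    alpha[0] = INIT * emis[0]; alpha[0] /= alpha[0].sum()
    for t in range(1, T):                               # forward pass (scaled)
        alpha[t] = emis[t] * (alpha[t - 1] @ TRANS)
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):                      # backward pass (scaled)
        beta[t] = TRANS @ (emis[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)       # columns: P(loss), P(neutral), P(gain)

# toy usage: a simulated profile with a gained segment in the middle
ratios = np.concatenate([np.random.normal(0.0, 0.2, 40),
                         np.random.normal(0.5, 0.2, 20),
                         np.random.normal(0.0, 0.2, 40)])
print(posterior_states(ratios)[:, 2].round(2))          # posterior probability of a gain
```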

Relevance:

30.00%

Publisher:

Abstract:

A number of authors have studied the mixture survival model to analyze survival data with non-negligible cure fractions. A key assumption made by these authors is independence between the survival time and the censoring time. To our knowledge, no one has studied the mixture cure model in the presence of dependent censoring. To account for such dependence, we propose a more general cure model which allows for dependent censoring. In particular, we derive the cure model from the perspective of competing risks and model the dependence between the censoring time and the survival time using a class of Archimedean copula models. Within this framework, we consider parameter estimation, cure detection, and the two-sample comparison of latency distributions in the presence of dependent censoring when a proportion of patients is deemed cured. Large-sample results using martingale theory are obtained. We applied the proposed methodologies to the SEER prostate cancer data.
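The data-generating mechanism considered above can be sketched with a small simulation (illustrative only, not the authors' estimation procedure): a cured fraction never experiences the event, and the latency and censoring times of the susceptible subjects are made dependent through a Clayton copula, a member of the Archimedean class; the rates and the dependence parameter below are assumptions.

```python
import numpy as np

def clayton_pair(n, theta, rng):
    """Draw n pairs (u, v) from a Clayton copula with dependence parameter theta > 0."""
    u, w = rng.uniform(size=n), rng.uniform(size=n)
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

def simulate_cure_data(n=1000, pi_cure=0.3, rate_event=0.5, rate_cens=0.2,
                       theta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    u, v = clayton_pair(n, theta, rng)
    event_time = -np.log(u) / rate_event                  # latency time of susceptible subjects
    event_time[rng.uniform(size=n) < pi_cure] = np.inf    # cured subjects never fail
    cens_time = -np.log(v) / rate_cens                    # censoring time, dependent on latency
    time = np.minimum(event_time, cens_time)
    status = (event_time <= cens_time).astype(int)        # 1 = event observed, 0 = censored
    return time, status

time, status = simulate_cure_data()
print(f"observed event rate: {status.mean():.2f}")        # the rest are censored (incl. cured)
```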

Relevance:

30.00%

Publisher:

Abstract:

This paper describes the use of model-based geostatistics for choosing the optimal set of sampling locations, collectively called the design, for a geostatistical analysis. Two types of design situations are considered. These are retrospective design, which concerns the addition of sampling locations to, or deletion of locations from, an existing design, and prospective design, which consists of choosing optimal positions for a new set of sampling locations. We propose a Bayesian design criterion which focuses on the goal of efficient spatial prediction whilst allowing for the fact that model parameter values are unknown. The results show that in this situation a wide range of inter-point distances should be included in the design, and the widely used regular design is therefore not the optimal choice.
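The flavour of such a design criterion can be sketched as follows (a simple-kriging stand-in with an assumed exponential covariance model and illustrative priors, not the paper's exact criterion): for each candidate design, average the spatial prediction variance over a grid of target locations and over prior draws of the covariance parameters, and prefer the design with the smallest average.

```python
import numpy as np

def exp_cov(d, sigma2, phi):
    """Exponential covariance function of distance d."""
    return sigma2 * np.exp(-d / phi)

def avg_prediction_variance(design, grid, sigma2, phi, nugget=0.1):
    """Mean simple-kriging variance of the latent surface over the prediction grid."""
    dd = np.linalg.norm(design[:, None, :] - design[None, :, :], axis=-1)
    dg = np.linalg.norm(grid[:, None, :] - design[None, :, :], axis=-1)
    C = exp_cov(dd, sigma2, phi) + nugget * np.eye(len(design))   # data covariance (with nugget)
    c0 = exp_cov(dg, sigma2, phi)                                  # grid-to-design covariances
    solved = np.linalg.solve(C, c0.T)                              # C^{-1} c0 for every grid point
    return np.mean(sigma2 - np.sum(c0 * solved.T, axis=1))

def bayesian_design_criterion(design, grid, n_prior_draws=200, seed=0):
    """Average the prediction variance over prior draws of the covariance parameters."""
    rng = np.random.default_rng(seed)
    sigma2 = rng.gamma(2.0, 0.5, n_prior_draws)                    # illustrative priors,
    phi = rng.uniform(0.05, 0.5, n_prior_draws)                    # not the paper's choices
    return np.mean([avg_prediction_variance(design, grid, s, p)
                    for s, p in zip(sigma2, phi)])

# toy usage: score a regular 5 x 5 design and a random 25-point design on the unit square
grid = np.array([[x, y] for x in np.linspace(0, 1, 20) for y in np.linspace(0, 1, 20)])
regular = np.array([[x, y] for x in np.linspace(0.1, 0.9, 5) for y in np.linspace(0.1, 0.9, 5)])
random_design = np.random.default_rng(1).uniform(0, 1, size=(25, 2))
print(bayesian_design_criterion(regular, grid), bayesian_design_criterion(random_design, grid))
```

A lower score indicates better expected predictive performance once parameter uncertainty is averaged over; retrospective design can be mimicked by adding or deleting rows of the design matrix and re-scoring.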