882 results for Bayesian model selection
Selection bias and unobservable heterogeneity applied at the wage equation of European married women
Abstract:
This paper utilizes a panel data sample selection model to correct the selection in the analysis of longitudinal labor market data for married women in European countries. We estimate the female wage equation in a framework of unbalanced panel data models with sample selection. The wage equations of females have several potential sources of.
Abstract:
This paper studies optimal monetary policy in a framework that explicitly accounts for policymakers' uncertainty about the channels of transmission of oil prices into the economy. More specifically, I examine the robust response to the real price of oil that US monetary authorities would have been advised to implement in the period 1970-2009, had they used the approach proposed by Cogley and Sargent (2005b) to incorporate model uncertainty and learning into policy decisions. In this context, I investigate the extent to which regulators' changing beliefs over different models of the economy play a role in the policy selection process. The main conclusion of this work is that, in the specific environment under analysis, one of the underlying models dominates the optimal interest rate response to oil prices. This result persists even when alternative assumptions on the model's priors change the pattern of the relative posterior probabilities, and can thus be attributed to the presence of model uncertainty itself.
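The role of priors in the relative posterior probabilities mentioned above can be illustrated with a generic Bayes' rule calculation. The sketch below is not the Cogley and Sargent procedure or the paper's model set; the function name `posterior_model_probs` and all numbers are hypothetical, chosen only to show how changing the prior shifts posterior model weights.

```python
import math

def posterior_model_probs(log_marginal_likelihoods, priors):
    """Posterior model probabilities from log marginal likelihoods and priors.

    Works in log space and subtracts the maximum before exponentiating,
    so very negative log likelihoods do not underflow to zero.
    """
    log_post = [ll + math.log(p) for ll, p in zip(log_marginal_likelihoods, priors)]
    shift = max(log_post)
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for two competing models.
log_ml = [-120.0, -123.0]

# Equal prior weights versus a prior that favours the second model.
probs_flat = posterior_model_probs(log_ml, [0.5, 0.5])
probs_skewed = posterior_model_probs(log_ml, [0.2, 0.8])

print(probs_flat)
print(probs_skewed)
```

In this toy case the first model keeps the larger posterior weight under both priors, but its weight shrinks when the prior leans towards the second model.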
Abstract:
The availability of rich firm-level data sets has recently led researchers to uncover new evidence on the effects of trade liberalization. First, trade openness forces the least productive firms to exit the market. Secondly, it induces surviving firms to increase their innovation efforts and thirdly, it increases the degree of product market competition. In this paper we propose a model aimed at providing a coherent interpretation of these findings. We introduce firm heterogeneity into an innovation-driven growth model, where incumbent firms operating in oligopolistic industries perform cost-reducing innovations. In this framework, trade liberalization leads to higher product market competition, lower markups and higher quantity produced. These changes in markups and quantities, in turn, promote innovation and productivity growth through a direct competition effect, based on the increase in the size of the market, and a selection effect, produced by the reallocation of resources towards more productive firms. Calibrated to match US aggregate and firm-level statistics, the model predicts that a 10 percent reduction in variable trade costs reduces markups by 1.15 percent, firm surviving probabilities by 1 percent, and induces an increase in productivity growth of about 13 percent. More than 90 percent of the trade-induced growth increase can be attributed to the selection effect.
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Abstract:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model dependent (geostatistics) and data driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.
Abstract:
We study a dynamic model where growth requires both long-term investment and the selection of talented managers. When ability is not ex-ante observable and contracts are incomplete, managerial selection imposes a cost, as managers facing the risk of being replaced tend to choose a sub-optimally low level of long-term investment. This generates a trade-off between selection and investment that has implications for the choice of contractual relationships. Our analysis shows that rigid long-term contracts sacrificing managerial selection may be optimal at early stages of economic development and when access to information is limited. As the economy grows, however, knowledge accumulation increases the return to talent and makes it optimal to adopt flexible contractual relationships, where managerial selection is implemented even at the cost of lower investment. Better institutions, in the form of a richer contracting environment and less severe informational frictions, speed up the transition to short-term relationships.
Abstract:
Summary (in English) Computer simulations provide a practical way to address scientific questions that would be otherwise intractable. In evolutionary biology, and in population genetics in particular, the investigation of evolutionary processes frequently involves the implementation of complex models, making simulations a particularly valuable tool in the area. In this thesis work, I explored three questions involving the geographical range expansion of populations, taking advantage of spatially explicit simulations coupled with approximate Bayesian computation. First, the neutral evolutionary history of the human spread around the world was investigated, leading to a surprisingly simple model: A straightforward diffusion process of migrations from east Africa throughout a world map with homogeneous landmasses replicated to a very large extent the complex patterns observed in real human populations, suggesting a more continuous (as opposed to structured) view of the distribution of modern human genetic diversity, which may play a better role as a base model for further studies. Second, the postglacial evolution of the European barn owl, with the formation of a remarkable coat-color cline, was inspected with two rounds of simulations: (i) determine the demographic background history and (ii) test the probability of a phenotypic cline, like the one observed in the natural populations, to appear without natural selection. We verified that the modern barn owl population originated from a single Iberian refugium and that they formed their color cline, not due to neutral evolution, but with the necessary participation of selection. The third and last part of this thesis refers to a simulation-only study inspired by the barn owl case above. In this chapter, we showed that selection is, indeed, effective during range expansions and that it leaves a distinctive signature, which can then be used to detect and measure natural selection in range-expanding populations.
Summary (translated from French) Simulations provide a practical way to address scientific questions that would otherwise be intractable. In population genetics, the study of evolutionary processes often involves implementing complex models, and simulations are a particularly valuable tool in this field. In this thesis, I explored three questions using spatially explicit simulations within an approximate Bayesian computation (ABC) framework. First, the history of worldwide human colonization and the evolution of neutral parts of the genome was studied with a surprisingly simple model. A diffusion process of migrants from East Africa across a world with homogeneous landmasses reproduced, to a very large extent, the complex genetic signatures observed in real human populations. Such a continuous model (as opposed to a model structured into populations) could be very useful as a base model for future studies of human genetics. Second, the postglacial evolution of a color gradient in the European barn owl (Tyto alba) was examined with two series of simulations to (i) determine the underlying demographic history and (ii) test the probability that a phenotypic gradient, such as the one observed in natural populations, could arise without natural selection. We showed that the present-day owl population emerged from a single Iberian refugium and that the color gradient cannot have formed neutrally (without the action of natural selection). The third part of this thesis is a simulation study inspired by the barn owl study.
In this last chapter, we showed that selection is indeed also effective during range expansions and that it leaves a unique signature, which can be used to detect it and estimate its strength.
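For readers unfamiliar with approximate Bayesian computation, a minimal rejection-sampling sketch follows. It is an illustration only: a toy Gaussian simulator stands in for the spatially explicit simulations of the thesis, and the names `abc_rejection`, `prior_sample` and `simulate`, the uniform prior, the tolerance and the observed value are all assumptions made for this example.

```python
import random
import statistics

def abc_rejection(observed_summary, simulate, prior_sample, n_draws, tolerance, rng):
    """Keep the parameter draws whose simulated summary lies close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)        # draw a candidate parameter from the prior
        summary = simulate(theta, rng)   # simulate data and reduce it to a summary statistic
        if abs(summary - observed_summary) <= tolerance:
            accepted.append((theta, summary))
    return accepted

def prior_sample(rng):
    # Hypothetical flat prior on the unknown mean.
    return rng.uniform(-5.0, 5.0)

def simulate(theta, rng):
    # Toy simulator: the summary statistic is the mean of 50 Gaussian draws.
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(50))

rng = random.Random(0)
accepted = abc_rejection(1.0, simulate, prior_sample, 5000, 0.2, rng)
posterior_mean = statistics.fmean(theta for theta, _ in accepted)
print(len(accepted), posterior_mean)
```

The accepted parameter values approximate the posterior distribution; a smaller tolerance gives a closer approximation at the cost of fewer accepted draws.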
Abstract:
Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003-2006) and 1771 individuals at baseline. Five techniques for missing data are compared: Last value carried forward (LVCF) was used as a single, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, "real" observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the most unbiased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered. Copyright (C) 2011 John Wiley & Sons, Ltd.
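The simplest of the compared techniques, last value carried forward, can be sketched in a few lines. This is a generic illustration rather than the study's code: the function name and the example values are hypothetical, and missing measurements are assumed to be encoded as `None`.

```python
def last_value_carried_forward(waves):
    """Fill each missing wave (None) with the most recent observed value.

    Waves before the first observation stay None, because there is
    nothing to carry forward yet.
    """
    filled = []
    last = None
    for value in waves:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical consumption values over four waves; the respondent drops out after wave 2.
example = [12.0, 9.5, None, None]
filled = last_value_carried_forward(example)
print(filled)
```

Because the method freezes the last observed level, it yields a single completed dataset, unlike the multiple imputation methods listed above.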
Abstract:
Background: The imatinib trough plasma concentration (C(min)) correlates with clinical response in cancer patients. Therapeutic drug monitoring (TDM) of plasma C(min) is therefore suggested. In practice, however, blood sampling for TDM is often not performed at trough. The corresponding measurement is thus only remotely informative about C(min) exposure. Objectives: The objectives of this study were to improve the interpretation of randomly measured concentrations by using a Bayesian approach for the prediction of C(min), incorporating correlation between pharmacokinetic parameters, and to compare the predictive performance of this method with alternative approaches, by comparing predictions with actual measured trough levels, and with predictions obtained by a reference method, respectively. Methods: A Bayesian maximum a posteriori (MAP) estimation method accounting for correlation (MAP-ρ) between pharmacokinetic parameters was developed on the basis of a population pharmacokinetic model, which was validated on external data. Thirty-one paired random and trough levels, observed in gastrointestinal stromal tumour patients, were then used for the evaluation of the Bayesian MAP-ρ method: individual C(min) predictions, derived from single random observations, were compared with actual measured trough levels for assessment of predictive performance (accuracy and precision). The method was also compared with alternative approaches: classical Bayesian MAP estimation assuming uncorrelated pharmacokinetic parameters, linear extrapolation along the typical elimination constant of imatinib, and non-linear mixed-effects modelling (NONMEM) first-order conditional estimation (FOCE) with interaction. Predictions of all methods were finally compared with 'best-possible' predictions obtained by a reference method (NONMEM FOCE, using both random and trough observations for individual C(min) prediction). 
Results: The developed Bayesian MAP-ρ method accounting for correlation between pharmacokinetic parameters allowed non-biased prediction of imatinib C(min) with a precision of ±30.7%. This predictive performance was similar for the alternative methods that were applied. The range of relative prediction errors was, however, smallest for the Bayesian MAP-ρ method and largest for the linear extrapolation method. When compared with the reference method, predictive performance was comparable for all methods. The time interval between random and trough sampling did not influence the precision of Bayesian MAP-ρ predictions. Conclusion: Clinical interpretation of randomly measured imatinib plasma concentrations can be assisted by Bayesian TDM. Classical Bayesian MAP estimation can be applied even without consideration of the correlation between pharmacokinetic parameters. Individual C(min) predictions are expected to vary less through Bayesian TDM than linear extrapolation. Bayesian TDM could be developed in the future for other targeted anticancer drugs and for the prediction of other pharmacokinetic parameters that have been correlated with clinical outcomes.
Abstract:
Natural selection is typically exerted at some specific life stages. If natural selection takes place before a trait can be measured, using conventional models can cause wrong inference about population parameters. When the missing data process relates to the trait of interest, a valid inference requires explicit modeling of the missing process. We propose a joint modeling approach, a shared parameter model, to account for nonrandom missing data. It consists of an animal model for the phenotypic data and a logistic model for the missing process, linked by the additive genetic effects. A Bayesian approach is taken and inference is made using integrated nested Laplace approximations. From a simulation study we find that wrongly assuming that missing data are missing at random can result in severely biased estimates of additive genetic variance. Using real data from a wild population of Swiss barn owls Tyto alba, our model indicates that the missing individuals would display large black spots; and we conclude that genes affecting this trait are already under selection before it is expressed. Our model is a tool to correctly estimate the magnitude of both natural selection and additive genetic variance.
Abstract:
The thymus is the site of T cell development. Several stromal and hematopoietic cell types are necessary for the proper function of thymic selection and eventually peripheral immunity. Thymic epithelial cells (TECs) are essential for T cell lineage commitment, expansion, and maturation in the thymus. We were interested in developing an in vivo model in which exogenous gene expression could be transiently induced in embryonic TEC (Tet-On system). To this end, we have generated a bacterial artificial chromosome (BAC) transgenic mouse line in which the reverse tetracycline-dependent transactivator (rtTA) is expressed under the control of the Foxn1 promoter, a transcriptional factor indispensable for TEC development. To analyze the expression pattern and efficiency of this novel mouse model, we crossed the Foxn1-rtTA founder with a Tet-Responsive Element (TRE)-LacZ GFP mouse reporter to obtain a double transgenic mouse. In the presence of doxycycline, rtTA can interact with TRE and induce the expression of GFP and LacZ. In this double transgenic mouse, we observed that GFP expression was high, inducible and limited to TEC in fetal thymus. In contrast, in adult thymus, when TEC development and maturation is completed, GFP was barely detectable. Therefore, Foxn1-rtTA represents a new and efficient transgenic mouse model to induce genes of interest specifically in fetal thymic epithelium. genesis 51:717-724. © 2013 Wiley Periodicals, Inc.
Abstract:
Every year, debris flows cause huge damage in mountainous areas. Due to population pressure in hazardous zones, the socio-economic impact is much higher than in the past. Therefore, the development of indicative susceptibility hazard maps is of primary importance, particularly in developing countries. However, the complexity of the phenomenon and the variability of local controlling factors limit the use of process-based models for a first assessment. A debris flow model has been developed for regional susceptibility assessments using a digital elevation model (DEM) with a GIS-based approach. The automatic identification of source areas and the estimation of debris flow spreading, based on GIS tools, provide a substantial basis for a preliminary susceptibility assessment at a regional scale. One of the main advantages of this model is its workability. In fact, everything is open to the user, from the data choice to the selection of the algorithms and their parameters. The Flow-R model was tested in three different contexts: two in Switzerland and one in Pakistan, for indicative susceptibility hazard mapping. It was shown that the quality of the DEM is the most important parameter to obtain reliable results for propagation, but also to identify the potential debris flow sources.
Abstract:
The weak selection approximation of population genetics has made possible the analysis of social evolution under a considerable variety of biological scenarios. Despite its extensive usage, the accuracy of weak selection in predicting the emergence of altruism under limited dispersal when selection intensity increases remains unclear. Here, we derive the condition for the spread of an altruistic mutant in the infinite island model of dispersal under a Moran reproductive process and arbitrary strength of selection. The simplicity of the model allows us to compare weak and strong selection regimes analytically. Our results demonstrate that the weak selection approximation is robust to moderate increases in selection intensity and therefore provides a good approximation to understand the invasion of altruism in spatially structured populations. In particular, we find that the weak selection approximation is excellent even if selection is very strong, when either migration is much stronger than selection or when patches are large. Importantly, we emphasize that the weak selection approximation provides the ideal condition for the invasion of altruism, and increasing selection intensity will impede the emergence of altruism. We argue that this should also hold for more complicated life cycles and for culturally transmitted altruism. Using the weak selection approximation is therefore unlikely to miss out on any demographic scenario that leads to the evolution of altruism under limited dispersal.
Abstract:
Background The 'database search problem', that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions. 
The method's graphical environment, along with its computational and probabilistic architectures, represents a rich package that offers analysts and discussants additional modes of interaction, concise representation, and coherent communication.
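The formulaic side of the database search problem can be illustrated with a direct Bayes' rule calculation. The sketch below is not taken from the paper and does not build a Bayesian network. It assumes a uniform prior over a population of potential sources, a database that is a subset of that population, exactly one matching profile in the database, independent profiles and no laboratory error; the function name and numbers are hypothetical.

```python
def posterior_source_probability(population_size, database_size, match_probability):
    """Posterior probability that the single matching database member is the source.

    Under the stated assumptions, every database member who does not match
    is excluded as the source, so only the population_size - database_size
    people outside the database remain as alternative sources, each of whom
    would make the observed match a coincidence with probability
    match_probability.
    """
    if not 1 <= database_size <= population_size:
        raise ValueError("database_size must be between 1 and population_size")
    outside = population_size - database_size
    return 1.0 / (1.0 + outside * match_probability)

# Hypothetical case: one million potential sources, match probability of one in a million.
single_suspect = posterior_source_probability(1_000_000, 1, 1e-6)
large_database = posterior_source_probability(1_000_000, 500_000, 1e-6)
print(single_suspect, large_database)
```

Under these assumptions a larger database excludes more alternative sources, so the posterior probability for the matching individual rises with database size, which is the counter-intuitive direction discussed in the abstract.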
Abstract:
Compositional random vectors are fundamental tools in the Bayesian analysis of categorical data. Many of the issues that are discussed with reference to the statistical analysis of compositional data have a natural counterpart in the construction of a Bayesian statistical model for categorical data. This note builds on the idea of cross-fertilization of the two areas recommended by Aitchison (1986) in his seminal book on compositional data. Particular emphasis is put on the problem of what parameterization to use
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach to model the spatial distribution of petrophysical properties in complex reservoirs as an alternative to geostatistics. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in the form of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data driven algorithm is designed to integrate various kinds of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate the posterior probability distribution. Uncertainty of the model is described by the posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.