878 resultados para Hierarchical Bayesian Metaanalysis
Resumo:
Rare species have restricted geographic ranges, habitat specialization, and/or small population sizes. Datasets on rare species distribution usually have few observations, limited spatial accuracy and lack of valid absences; conversely they provide comprehensive views of species distributions allowing to realistically capture most of their realized environmental niche. Rare species are the most in need of predictive distribution modelling but also the most difficult to model. We refer to this contrast as the "rare species modelling paradox" and propose as a solution developing modelling approaches that deal with a sufficiently large set of predictors, ensuring that statistical models aren't overfitted. Our novel approach fulfils this condition by fitting a large number of bivariate models and averaging them with a weighted ensemble approach. We further propose that this ensemble forecasting is conducted within a hierarchic multi-scale framework. We present two ensemble models for a test species, one at regional and one at local scale, each based on the combination of 630 models. In both cases, we obtained excellent spatial projections, unusual when modelling rare species. Model results highlight, from a statistically sound approach, the effects of multiple drivers in a same modelling framework and at two distinct scales. From this added information, regional models can support accurate forecasts of range dynamics under climate change scenarios, whereas local models allow the assessment of isolated or synergistic impacts of changes in multiple predictors. This novel framework provides a baseline for adaptive conservation, management and monitoring of rare species at distinct spatial and temporal scales.
Resumo:
This paper presents and discusses the use of Bayesian procedures - introduced through the use of Bayesian networks in Part I of this series of papers - for 'learning' probabilities from data. The discussion will relate to a set of real data on characteristics of black toners commonly used in printing and copying devices. Particular attention is drawn to the incorporation of the proposed procedures as an integral part in probabilistic inference schemes (notably in the form of Bayesian networks) that are intended to address uncertainties related to particular propositions of interest (e.g., whether or not a sample originates from a particular source). The conceptual tenets of the proposed methodologies are presented along with aspects of their practical implementation using currently available Bayesian network software.
Resumo:
Calculating explicit closed form solutions of Cournot models where firms have private information about their costs is, in general, very cumbersome. Most authors consider therefore linear demands and constant marginal costs. However, within this framework, the nonnegativity constraint on prices (and quantities) has been ignored or not properly dealt with and the correct calculation of all Bayesian Nash equilibria is more complicated than expected. Moreover, multiple symmetric and interior Bayesianf equilibria may exist for an open set of parameters. The reason for this is that linear demand is not really linear, since there is a kink at zero price: the general ''linear'' inverse demand function is P (Q) = max{a - bQ, 0} rather than P (Q) = a - bQ.
Resumo:
Ground-penetrating radar (GPR) has the potential to provide valuable information on hydrological properties of the vadose zone because of their strong sensitivity to soil water content. In particular, recent evidence has suggested that the stochastic inversion of crosshole GPR data within a coupled geophysical-hydrological framework may allow for effective estimation of subsurface van-Genuchten-Mualem (VGM) parameters and their corresponding uncertainties. An important and still unresolved issue, however, is how to best integrate GPR data into a stochastic inversion in order to estimate the VGM parameters and their uncertainties, thus improving hydrological predictions. Recognizing the importance of this issue, the aim of the research presented in this thesis was to first introduce a fully Bayesian inversion called Markov-chain-Monte-carlo (MCMC) strategy to perform the stochastic inversion of steady-state GPR data to estimate the VGM parameters and their uncertainties. Within this study, the choice of the prior parameter probability distributions from which potential model configurations are drawn and tested against observed data was also investigated. Analysis of both synthetic and field data collected at the Eggborough (UK) site indicates that the geophysical data alone contain valuable information regarding the VGM parameters. However, significantly better results are obtained when these data are combined with a realistic, informative prior. A subsequent study explore in detail the dynamic infiltration case, specifically to what extent time-lapse ZOP GPR data, collected during a forced infiltration experiment at the Arrenaes field site (Denmark), can help to quantify VGM parameters and their uncertainties using the MCMC inversion strategy. The findings indicate that the stochastic inversion of time-lapse GPR data does indeed allow for a substantial refinement in the inferred posterior VGM parameter distributions. In turn, this significantly improves knowledge of the hydraulic properties, which are required to predict hydraulic behaviour. Finally, another aspect that needed to be addressed involved the comparison of time-lapse GPR data collected under different infiltration conditions (i.e., natural loading and forced infiltration conditions) to estimate the VGM parameters using the MCMC inversion strategy. The results show that for the synthetic example, considering data collected during a forced infiltration test helps to better refine soil hydraulic properties compared to data collected under natural infiltration conditions. When investigating data collected at the Arrenaes field site, further complications arised due to model error and showed the importance of also including a rigorous analysis of the propagation of model error with time and depth when considering time-lapse data. Although the efforts in this thesis were focused on GPR data, the corresponding findings are likely to have general applicability to other types of geophysical data and field environments. Moreover, the obtained results allow to have confidence for future developments in integration of geophysical data with stochastic inversions to improve the characterization of the unsaturated zone but also reveal important issues linked with stochastic inversions, namely model errors, that should definitely be addressed in future research.
Resumo:
Microsatellites are used to unravel the fine-scale genetic structure of a hybrid zone between chromosome races Valais and Cordon of the common shrew (Sorex araneus) located in the French Alps. A total of 269 individuals collected between 1992 and 1995 was typed for seven microsatellite loci. A modified version of the classical multiple correspondence analysis is carried out. This analysis clearly shows the dichotomy between the two races. Several approaches are used to study genetic structuring. Gene flow is clearly reduced between these chromosome races and is estimated at one migrant every two generations using X-statistics and one migrant per generation using F-statistics. Hierarchical F- and R-statistics are compared and their efficiency to detect inter- and intraracial patterns of divergence is discussed. Within-race genetic structuring is significant, but remains weak. F-ST displays similar values on both sides of the hybrid zone, although no environmental barriers are found on the Cordon side, whereas the Valais side is divided by several mountain rivers. We introduce the exact G-test to microsatellite data which proved to be a powerful test to detect genetic differentiation within as well as among races. The genetic background of karyotypic hybrids was compared with the genetic background of pure parental forms using a CRT-MCA. Our results indicate that, without knowledge of the karyotypes, we would not have been able to distinguish these hybrids from karyotypically pure samples.
Resumo:
In the context of Systems Biology, computer simulations of gene regulatory networks provide a powerful tool to validate hypotheses and to explore possible system behaviors. Nevertheless, modeling a system poses some challenges of its own: especially the step of model calibration is often difficult due to insufficient data. For example when considering developmental systems, mostly qualitative data describing the developmental trajectory is available while common calibration techniques rely on high-resolution quantitative data. Focusing on the calibration of differential equation models for developmental systems, this study investigates different approaches to utilize the available data to overcome these difficulties. More specifically, the fact that developmental processes are hierarchically organized is exploited to increase convergence rates of the calibration process as well as to save computation time. Using a gene regulatory network model for stem cell homeostasis in Arabidopsis thaliana the performance of the different investigated approaches is evaluated, documenting considerable gains provided by the proposed hierarchical approach.
Resumo:
In the scenario of social bookmarking, a user browsing the Web bookmarks web pages and assigns free-text labels (i.e., tags) to them according to their personal preferences. In this technical report, we approach one of the practical aspects when it comes to represent users' interests from their tagging activity, namely the categorization of tags into high-level categories of interest. The reason is that the representation of user profiles on the basis of the myriad of tags available on the Web is certainly unfeasible from various practical perspectives; mainly concerning the unavailability of data to reliably, accurately measure interests across such fine-grained categorisation, and, should the data be available, its overwhelming computational intractability. Motivated by this, our study presents the results of a categorization process whereby a collection of tags posted at Delicious #http://delicious.com# are classified into 200 subcategories of interest.
Resumo:
Individuals sampled in hybrid zones are usually analysed according to their sampling locality, morphology, behaviour or karyotype. But the increasing availability of genetic information more and more favours its use for individual sorting purposes and numerous assignment methods based on the genetic composition of individuals have been developed. The shrews of the Sorex araneus group offer good opportunities to test the genetic assignment on individuals identified by their karyotype. Here we explored the potential and efficiency of a Bayesian assignment method combined or not with a reference dataset to study admixture and individual assignment in the difficult context of two hybrid zones between karyotypic species of the Sorex araneus group. As a whole, we assigned more than 80% of the individuals to their respective karyotypic categories (i.e. 'pure' species or hybrids). This assignment level is comparable to what was obtained for the same species away from hybrid zones. Additionally, we showed that the assignment result for several individuals was strongly affected by the inclusion or not of a reference dataset. This highlights the importance of such comparisons when analysing hybrid zones. Finally, differences between the admixture levels detected in both hybrid zones support the hypothesis of an impact of chromosomal rearrangements on gene flow.
Resumo:
Part I of this series of articles focused on the construction of graphical probabilistic inference procedures, at various levels of detail, for assessing the evidential value of gunshot residue (GSR) particle evidence. The proposed models - in the form of Bayesian networks - address the issues of background presence of GSR particles, analytical performance (i.e., the efficiency of evidence searching and analysis procedures) and contamination. The use and practical implementation of Bayesian networks for case pre-assessment is also discussed. This paper, Part II, concentrates on Bayesian parameter estimation. This topic complements Part I in that it offers means for producing estimates useable for the numerical specification of the proposed probabilistic graphical models. Bayesian estimation procedures are given a primary focus of attention because they allow the scientist to combine (his/her) prior knowledge about the problem of interest with newly acquired experimental data. The present paper also considers further topics such as the sensitivity of the likelihood ratio due to uncertainty in parameters and the study of likelihood ratio values obtained for members of particular populations (e.g., individuals with or without exposure to GSR).
Resumo:
Almost 30 years ago, Bayesian networks (BNs) were developed in the field of artificial intelligence as a framework that should assist researchers and practitioners in applying the theory of probability to inference problems of more substantive size and, thus, to more realistic and practical problems. Since the late 1980s, Bayesian networks have also attracted researchers in forensic science and this tendency has considerably intensified throughout the last decade. This review article provides an overview of the scientific literature that describes research on Bayesian networks as a tool that can be used to study, develop and implement probabilistic procedures for evaluating the probative value of particular items of scientific evidence in forensic science. Primary attention is drawn here to evaluative issues that pertain to forensic DNA profiling evidence because this is one of the main categories of evidence whose assessment has been studied through Bayesian networks. The scope of topics is large and includes almost any aspect that relates to forensic DNA profiling. Typical examples are inference of source (or, 'criminal identification'), relatedness testing, database searching and special trace evidence evaluation (such as mixed DNA stains or stains with low quantities of DNA). The perspective of the review presented here is not exclusively restricted to DNA evidence, but also includes relevant references and discussion on both, the concept of Bayesian networks as well as its general usage in legal sciences as one among several different graphical approaches to evidence evaluation.
Resumo:
Testosterone abuse is conventionally assessed by the urinary testosterone/epitestosterone (T/E) ratio, levels above 4.0 being considered suspicious. A deletion polymorphism in the gene coding for UGT2B17 is strongly associated with reduced testosterone glucuronide (TG) levels in urine. Many of the individuals devoid of the gene would not reach a T/E ratio of 4.0 after testosterone intake. Future test programs will most likely shift from population based- to individual-based T/E cut-off ratios using Bayesian inference. A longitudinal analysis is dependent on an individual's true negative baseline T/E ratio. The aim was to investigate whether it is possible to increase the sensitivity and specificity of the T/E test by addition of UGT2B17 genotype information in a Bayesian framework. A single intramuscular dose of 500mg testosterone enanthate was given to 55 healthy male volunteers with either two, one or no allele (ins/ins, ins/del or del/del) of the UGT2B17 gene. Urinary excretion of TG and the T/E ratio was measured during 15 days. The Bayesian analysis was conducted to calculate the individual T/E cut-off ratio. When adding the genotype information, the program returned lower individual cut-off ratios in all del/del subjects increasing the sensitivity of the test considerably. It will be difficult, if not impossible, to discriminate between a true negative baseline T/E value and a false negative one without knowledge of the UGT2B17 genotype. UGT2B17 genotype information is crucial, both to decide which initial cut-off ratio to use for an individual, and for increasing the sensitivity of the Bayesian analysis.
Resumo:
The CbrA/B system in pseudomonads is involved in the utilization of carbon sources and carbon catabolite repression (CCR) through the activation of the small RNAs crcZ in Pseudomonas aeruginosa, and crcZ and crcY in Pseudomonas putida. Interestingly, previous works reported that the CbrA/B system activity in P. aeruginosa PAO1 and P. putida KT2442 responded differently to the presence of different carbon sources, thus raising the question of the exact nature of the signal(s) detected by CbrA. Here, we demonstrated that the CbrA/B/CrcZ(Y) signal transduction pathway is similarly activated in the two Pseudomonas species. We show that the CbrA sensor kinase is fully interchangeable between the two species and, moreover, responds similarly to the presence of different carbon sources. In addition, a metabolomics analysis supported the hypothesis that CCR responds to the internal energy status of the cell, as the internal carbon/nitrogen ratio seems to determine CCR and non-CCR conditions. The strong difference found in the 2-oxoglutarate/glutamine ratio between CCR and non-CCR conditions points to the close relationship between carbon and nitrogen availability, or the relationship between the CbrA/B and NtrB/C systems, suggesting that both regulatory systems sense the same sort or interrelated signal.
Resumo:
Evidence of a sport-specific hierarchy of protective factors against doping would thus be a powerful aid in adapting information and prevention campaigns to target the characteristics of specific athlete groups, and especially those athletes most vulnerable for doping control. The contents of phone calls to a free and anonymous national anti-doping service called 'ecoute dopage' were analysed (192 bodybuilders, 124 cyclists and 44 footballers). The results showed that the protective factors that emerged from analysis could be categorised into two groups. The first comprised 'Health concerns', 'Respect for the law' and 'Doping controls from the environment' and the second comprised 'Doubts about the effectiveness of illicit products, 'Thinking skills' and 'Doubts about doctors'. The ranking of the factors for the cyclists differed from that of the other athletes. The ordering of factors was 1) respect for the law, 2) doping controls from the environment, 3) health concerns 4) doubts about doctors, and 5) doubts about the effectiveness illicit products. The results are analysed in terms of the ranking in each athlete group and the consequences on the athletes' experience and relationship to doping. Specific prevention campaigns are proposed to limit doping behaviour in general and for each sport.
Resumo:
We analyze crash data collected by the Iowa Department of Transportation using Bayesian methods. The data set includes monthly crash numbers, estimated monthly traffic volumes, site length and other information collected at 30 paired sites in Iowa over more than 20 years during which an intervention experiment was set up. The intervention consisted in transforming 15 undivided road segments from four-lane to three lanes, while an additional 15 segments, thought to be comparable in terms of traffic safety-related characteristics were not converted. The main objective of this work is to find out whether the intervention reduces the number of crashes and the crash rates at the treated sites. We fitted a hierarchical Poisson regression model with a change-point to the number of monthly crashes per mile at each of the sites. Explanatory variables in the model included estimated monthly traffic volume, time, an indicator for intervention reflecting whether the site was a “treatment” or a “control” site, and various interactions. We accounted for seasonal effects in the number of crashes at a site by including smooth trigonometric functions with three different periods to reflect the four seasons of the year. A change-point at the month and year in which the intervention was completed for treated sites was also included. The number of crashes at a site can be thought to follow a Poisson distribution. To estimate the association between crashes and the explanatory variables, we used a log link function and added a random effect to account for overdispersion and for autocorrelation among observations obtained at the same site. We used proper but non-informative priors for all parameters in the model, and carried out all calculations using Markov chain Monte Carlo methods implemented in WinBUGS. We evaluated the effect of the four to three-lane conversion by comparing the expected number of crashes per year per mile during the years preceding the conversion and following the conversion for treatment and control sites. We estimated this difference using the observed traffic volumes at each site and also on a per 100,000,000 vehicles. We also conducted a prospective analysis to forecast the expected number of crashes per mile at each site in the study one year, three years and five years following the four to three-lane conversion. Posterior predictive distributions of the number of crashes, the crash rate and the percent reduction in crashes per mile were obtained for each site for the months of January and June one, three and five years after completion of the intervention. The model appears to fit the data well. We found that in most sites, the intervention was effective and reduced the number of crashes. Overall, and for the observed traffic volumes, the reduction in the expected number of crashes per year and mile at converted sites was 32.3% (31.4% to 33.5% with 95% probability) while at the control sites, the reduction was estimated to be 7.1% (5.7% to 8.2% with 95% probability). When the reduction in the expected number of crashes per year, mile and 100,000,000 AADT was computed, the estimates were 44.3% (43.9% to 44.6%) and 25.5% (24.6% to 26.0%) for converted and control sites, respectively. In both cases, the difference in the percent reduction in the expected number of crashes during the years following the conversion was significantly larger at converted sites than at control sites, even though the number of crashes appears to decline over time at all sites. Results indicate that the reduction in the expected number of sites per mile has a steeper negative slope at converted than at control sites. Consistent with this, the forecasted reduction in the number of crashes per year and mile during the years after completion of the conversion at converted sites is more pronounced than at control sites. Seasonal effects on the number of crashes have been well-documented. In this dataset, we found that, as expected, the expected number of monthly crashes per mile tends to be higher during winter months than during the rest of the year. Perhaps more interestingly, we found that there is an interaction between the four to three-lane conversion and season; the reduction in the number of crashes appears to be more pronounced during months, when the weather is nice than during other times of the year, even though a reduction was estimated for the entire year. Thus, it appears that the four to three-lane conversion, while effective year-round, is particularly effective in reducing the expected number of crashes in nice weather.