910 resultados para Probabilistic Inference
Resumo:
Continuing developments in science and technology mean that the amounts of information forensic scientists are able to provide for criminal investigations is ever increasing. The commensurate increase in complexity creates difficulties for scientists and lawyers with regard to evaluation and interpretation, notably with respect to issues of inference and decision. Probability theory, implemented through graphical methods, and specifically Bayesian networks, provides powerful methods to deal with this complexity. Extensions of these methods to elements of decision theory provide further support and assistance to the judicial system. Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science provides a unique and comprehensive introduction to the use of Bayesian decision networks for the evaluation and interpretation of scientific findings in forensic science, and for the support of decision-makers in their scientific and legal tasks. Includes self-contained introductions to probability and decision theory. Develops the characteristics of Bayesian networks, object-oriented Bayesian networks and their extension to decision models. Features implementation of the methodology with reference to commercial and academically available software. Presents standard networks and their extensions that can be easily implemented and that can assist in the reader's own analysis of real cases. Provides a technique for structuring problems and organizing data based on methods and principles of scientific reasoning. Contains a method for the construction of coherent and defensible arguments for the analysis and evaluation of scientific findings and for decisions based on them. Is written in a lucid style, suitable for forensic scientists and lawyers with minimal mathematical background. Includes a foreword by Ian Evett. The clear and accessible style of this second edition makes this book ideal for all forensic scientists, applied statisticians and graduate students wishing to evaluate forensic findings from the perspective of probability and decision analysis. It will also appeal to lawyers and other scientists and professionals interested in the evaluation and interpretation of forensic findings, including decision making based on scientific information.
Resumo:
Part I of this series of articles focused on the construction of graphical probabilistic inference procedures, at various levels of detail, for assessing the evidential value of gunshot residue (GSR) particle evidence. The proposed models - in the form of Bayesian networks - address the issues of background presence of GSR particles, analytical performance (i.e., the efficiency of evidence searching and analysis procedures) and contamination. The use and practical implementation of Bayesian networks for case pre-assessment is also discussed. This paper, Part II, concentrates on Bayesian parameter estimation. This topic complements Part I in that it offers means for producing estimates useable for the numerical specification of the proposed probabilistic graphical models. Bayesian estimation procedures are given a primary focus of attention because they allow the scientist to combine (his/her) prior knowledge about the problem of interest with newly acquired experimental data. The present paper also considers further topics such as the sensitivity of the likelihood ratio due to uncertainty in parameters and the study of likelihood ratio values obtained for members of particular populations (e.g., individuals with or without exposure to GSR).
Resumo:
We devise a message passing algorithm for probabilistic inference in composite systems, consisting of a large number of variables, that exhibit weak random interactions among all variables and strong interactions with a small subset of randomly chosen variables; the relative strength of the two interactions is controlled by a free parameter. We examine the performance of the algorithm numerically on a number of systems of this type for varying mixing parameter values.
Resumo:
This paper concerns the problem of agent trust in an electronic market place. We maintain that agent trust involves making decisions under uncertainty and therefore the phenomenon should be modelled probabilistically. We therefore propose a probabilistic framework that models agent interactions as a Hidden Markov Model (HMM). The observations of the HMM are the interaction outcomes and the hidden state is the underlying probability of a good outcome. The task of deciding whether to interact with another agent reduces to probabilistic inference of the current state of that agent given all previous interaction outcomes. The model is extended to include a probabilistic reputation system which involves agents gathering opinions about other agents and fusing them with their own beliefs. Our system is fully probabilistic and hence delivers the following improvements with respect to previous work: (a) the model assumptions are faithfully translated into algorithms; our system is optimal under those assumptions, (b) It can account for agents whose behaviour is not static with time (c) it can estimate the rate with which an agent's behaviour changes. The system is shown to significantly outperform previous state-of-the-art methods in several numerical experiments. Copyright © 2010, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
Resumo:
The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering, solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicabiity of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
Resumo:
PURPOSE: All kinds of blood manipulations aim to increase the total hemoglobin mass (tHb-mass). To establish tHb-mass as an effective screening parameter for detecting blood doping, the knowledge of its normal variation over time is necessary. The aim of the present study, therefore, was to determine the intraindividual variance of tHb-mass in elite athletes during a training year emphasizing off, training, and race seasons at sea level. METHODS: tHb-mass and hemoglobin concentration ([Hb]) were determined in 24 endurance athletes five times during a year and were compared with a control group (n = 6). An analysis of covariance was used to test the effects of training phases, age, gender, competition level, body mass, and training volume. Three error models, based on 1) a total percentage error of measurement, 2) the combination of a typical percentage error (TE) of analytical origin with an absolute SD of biological origin, and 3) between-subject and within-subject variance components as obtained by an analysis of variance, were tested. RESULTS: In addition to the expected influence of performance status, the main results were that the effects of training volume (P = 0.20) and training phases (P = 0.81) on tHb-mass were not significant. We found that within-subject variations mainly have an analytical origin (TE approximately 1.4%) and a very small SD (7.5 g) of biological origin. CONCLUSION: tHb-mass shows very low individual oscillations during a training year (<6%), and these oscillations are below the expected changes in tHb-mass due to Herythropoetin (EPO) application or blood infusion (approximately 10%). The high stability of tHb-mass over a period of 1 year suggests that it should be included in an athlete's biological passport and analyzed by recently developed probabilistic inference techniques that define subject-based reference ranges.
Resumo:
This paper presents and discusses the use of Bayesian procedures - introduced through the use of Bayesian networks in Part I of this series of papers - for 'learning' probabilities from data. The discussion will relate to a set of real data on characteristics of black toners commonly used in printing and copying devices. Particular attention is drawn to the incorporation of the proposed procedures as an integral part in probabilistic inference schemes (notably in the form of Bayesian networks) that are intended to address uncertainties related to particular propositions of interest (e.g., whether or not a sample originates from a particular source). The conceptual tenets of the proposed methodologies are presented along with aspects of their practical implementation using currently available Bayesian network software.
The hematology laboratory in blood doping (bd): 2014 update on the athlete biological passport (APB)
Resumo:
Introduction: Blood doping (BD) is the use of Erythropoietic Stimulating Agents (ESAs) and/or transfusion to increase aerobic performance in athletes. Direct toxicologic techniques are insufficient to unmask sophisticated doping protocols. The Hematological module of the ABP (World Anti-Doping Agency), associates decision support technology and expert assessment to indirectly detect BD hematological effects. Methods: The ABP module is based on blood parameters, under strict pre-analytical and analytical rules for collection, storage and transport at 2-12°C, internal and external QC. Accuracy, reproducibility and interlaboratory harmonization fulfill forensic standard. Blood samples are collected in competition and out-ofcompetition. Primary parameters for longitudinal monitoring are: - hemoglobin (HGB); - reticulocyte percentage (RET); - OFF score, indicator of suppressed erythropoiesis, calculated as [HGB(g/L) * 60-√RET%]. Statistical calculation predicts individual expected limits by probabilistic inference. Secondary parameters are RBC, HCT, MCHC-MCH-MCV-RDW-IFR. ABP profiles flagged as atypical are review by experts in hematology, pharmacology, sports medicine or physiology, and classified as: - normal - suspect (to target) - likely due to BD - likely due to pathology. Results: Thousands of athletes worldwide are currently monitored. Since 2010, at least 35 athletes have been sanctioned and others are prosecuted on the sole basis of abnormal ABP, with a 240% increase of positivity to direct tests for ESA, thanks to improved targeting of suspicious athletes (WADA data). Specific doping scenarios have been identified by the Experts (Table and Figure). Figure. Typical HGB and RET profiles in two highly suspicious athletes. A. Sample 2: simultaneous increases in HGB and RET (likely ESA stimulation) in a male. B. Samples 3, 6 and 7: "OFF" picture, with high HGB and low RET in a female. Sample 10: normal HGB and increased RET (ESA or blood withdrawal). Conclusions: ABP is a powerful tool for indirect doping detection, based on the recognition of specific, unphysiological changes triggered by blood doping. The effect of factors of heterogeneity, such as sex and altitude, must also be considered. Schumacher YO, et al. Drug Test Anal 2012, 4:846-853. Sottas PE, et al. Clin Chem 2011, 57:969-976.
Resumo:
In a series of three experiments, participants made inferences about which one of a pair of two objects scored higher on a criterion. The first experiment was designed to contrast the prediction of Probabilistic Mental Model theory (Gigerenzer, Hoffrage, & Kleinbölting, 1991) concerning sampling procedure with the hard-easy effect. The experiment failed to support the theory's prediction that a particular pair of randomly sampled item sets would differ in percentage correct; but the observation that German participants performed practically as well on comparisons between U.S. cities (many of which they did not even recognize) than on comparisons between German cities (about which they knew much more) ultimately led to the formulation of the recognition heuristic. Experiment 2 was a second, this time successful, attempt to unconfound item difficulty and sampling procedure. In Experiment 3, participants' knowledge and recognition of each city was elicited, and how often this could be used to make an inference was manipulated. Choices were consistent with the recognition heuristic in about 80% of the cases when it discriminated and people had no additional knowledge about the recognized city (and in about 90% when they had such knowledge). The frequency with which the heuristic could be used affected the percentage correct, mean confidence, and overconfidence as predicted. The size of the reference class, which was also manipulated, modified these effects in meaningful and theoretically important ways.
Resumo:
This article extends existing discussion in literature on probabilistic inference and decision making with respect to continuous hypotheses that are prevalent in forensic toxicology. As a main aim, this research investigates the properties of a widely followed approach for quantifying the level of toxic substances in blood samples, and to compare this procedure with a Bayesian probabilistic approach. As an example, attention is confined to the presence of toxic substances, such as THC, in blood from car drivers. In this context, the interpretation of results from laboratory analyses needs to take into account legal requirements for establishing the 'presence' of target substances in blood. In a first part, the performance of the proposed Bayesian model for the estimation of an unknown parameter (here, the amount of a toxic substance) is illustrated and compared with the currently used method. The model is then used in a second part to approach-in a rational way-the decision component of the problem, that is judicial questions of the kind 'Is the quantity of THC measured in the blood over the legal threshold of 1.5 μg/l?'. This is pointed out through a practical example.
Resumo:
Both, Bayesian networks and probabilistic evaluation are gaining more and more widespread use within many professional branches, including forensic science. Notwithstanding, they constitute subtle topics with definitional details that require careful study. While many sophisticated developments of probabilistic approaches to evaluation of forensic findings may readily be found in published literature, there remains a gap with respect to writings that focus on foundational aspects and on how these may be acquired by interested scientists new to these topics. This paper takes this as a starting point to report on the learning about Bayesian networks for likelihood ratio based, probabilistic inference procedures in a class of master students in forensic science. The presentation uses an example that relies on a casework scenario drawn from published literature, involving a questioned signature. A complicating aspect of that case study - proposed to students in a teaching scenario - is due to the need of considering multiple competing propositions, which is an outset that may not readily be approached within a likelihood ratio based framework without drawing attention to some additional technical details. Using generic Bayesian networks fragments from existing literature on the topic, course participants were able to track the probabilistic underpinnings of the proposed scenario correctly both in terms of likelihood ratios and of posterior probabilities. In addition, further study of the example by students allowed them to derive an alternative Bayesian network structure with a computational output that is equivalent to existing probabilistic solutions. This practical experience underlines the potential of Bayesian networks to support and clarify foundational principles of probabilistic procedures for forensic evaluation.
Resumo:
Individual differences in cognitive style can be characterized along two dimensions: ‘systemizing’ (S, the drive to analyze or build ‘rule-based’ systems) and ‘empathizing’ (E, the drive to identify another's mental state and respond to this with an appropriate emotion). Discrepancies between these two dimensions in one direction (S > E) or the other (E > S) are associated with sex differences in cognition: on average more males show an S > E cognitive style, while on average more females show an E > S profile. The neurobiological basis of these different profiles remains unknown. Since individuals may be typical or atypical for their sex, it is important to move away from the study of sex differences and towards the study of differences in cognitive style. Using structural magnetic resonance imaging we examined how neuroanatomy varies as a function of the discrepancy between E and S in 88 adult males from the general population. Selecting just males allows us to study discrepant E-S profiles in a pure way, unconfounded by other factors related to sex and gender. An increasing S > E profile was associated with increased gray matter volume in cingulate and dorsal medial prefrontal areas which have been implicated in processes related to cognitive control, monitoring, error detection, and probabilistic inference. An increasing E > S profile was associated with larger hypothalamic and ventral basal ganglia regions which have been implicated in neuroendocrine control, motivation and reward. These results suggest an underlying neuroanatomical basis linked to the discrepancy between these two important dimensions of individual differences in cognitive style.
Resumo:
Point pattern matching in Euclidean Spaces is one of the fundamental problems in Pattern Recognition, having applications ranging from Computer Vision to Computational Chemistry. Whenever two complex patterns are encoded by two sets of points identifying their key features, their comparison can be seen as a point pattern matching problem. This work proposes a single approach to both exact and inexact point set matching in Euclidean Spaces of arbitrary dimension. In the case of exact matching, it is assured to find an optimal solution. For inexact matching (when noise is involved), experimental results confirm the validity of the approach. We start by regarding point pattern matching as a weighted graph matching problem. We then formulate the weighted graph matching problem as one of Bayesian inference in a probabilistic graphical model. By exploiting the existence of fundamental constraints in patterns embedded in Euclidean Spaces, we prove that for exact point set matching a simple graphical model is equivalent to the full model. It is possible to show that exact probabilistic inference in this simple model has polynomial time complexity with respect to the number of elements in the patterns to be matched. This gives rise to a technique that for exact matching provably finds a global optimum in polynomial time for any dimensionality of the underlying Euclidean Space. Computational experiments comparing this technique with well-known probabilistic relaxation labeling show significant performance improvement for inexact matching. The proposed approach is significantly more robust under augmentation of the sizes of the involved patterns. In the absence of noise, the results are always perfect.
Resumo:
Amongst all the objectives in the study of time series, uncovering the dynamic law of its generation is probably the most important. When the underlying dynamics are not available, time series modelling consists of developing a model which best explains a sequence of observations. In this thesis, we consider hidden space models for analysing and describing time series. We first provide an introduction to the principal concepts of hidden state models and draw an analogy between hidden Markov models and state space models. Central ideas such as hidden state inference or parameter estimation are reviewed in detail. A key part of multivariate time series analysis is identifying the delay between different variables. We present a novel approach for time delay estimating in a non-stationary environment. The technique makes use of hidden Markov models and we demonstrate its application for estimating a crucial parameter in the oil industry. We then focus on hybrid models that we call dynamical local models. These models combine and generalise hidden Markov models and state space models. Probabilistic inference is unfortunately computationally intractable and we show how to make use of variational techniques for approximating the posterior distribution over the hidden state variables. Experimental simulations on synthetic and real-world data demonstrate the application of dynamical local models for segmenting a time series into regimes and providing predictive distributions.
Resumo:
Networking encompasses a variety of tasks related to the communication of information on networks; it has a substantial economic and societal impact on a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly more complex, the ever increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with nonlinear large-scale systems. This review aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications. © 2013 IOP Publishing Ltd.