968 results for Randomization-based Inference
Abstract:
This paper presents the unique collection of additional features of Qu-Prolog, a variant of the AI programming language Prolog, and illustrates how they can be used for implementing DAI applications. By this we mean applications comprising communicating information servers, expert systems, or agents, with sophisticated reasoning capabilities and internal concurrency. Such an application exploits the key features of Qu-Prolog: support for the programming of sound non-clausal inference systems, multi-threading, and high-level inter-thread message communication between Qu-Prolog query threads anywhere on the internet. The inter-thread communication uses email-style symbolic names for threads, allowing easy construction of distributed applications using public names for threads. How threads react to received messages is specified by a disjunction of reaction rules which the thread periodically executes. A communications API allows smooth integration of components written in C, which, to Qu-Prolog, look like remote query threads.
Abstract:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation proceeds by maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.
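The mixture formulation described above (a logistic model for the probability of each failure type, a proportional hazards model for the hazard conditional on that type) can be sketched as a data-generating process. This is a hedged illustration, not the paper's estimator: the exponential baseline hazard and every coefficient value below are assumptions made purely for the example.

```python
import math
import random

random.seed(0)

def simulate_competing_risks(x, beta_logit=1.0, beta_ph=(0.5, -0.5), base=(0.2, 0.1)):
    """Draw one (time, cause) pair from the mixture model sketched above."""
    # Logistic model for the probability of failing from cause 1
    p1 = 1.0 / (1.0 + math.exp(-beta_logit * x))
    cause = 1 if random.random() < p1 else 2
    # Proportional hazards given the cause, with an assumed exponential baseline
    rate = base[cause - 1] * math.exp(beta_ph[cause - 1] * x)
    time = random.expovariate(rate)
    return time, cause

sample = [simulate_competing_risks(random.gauss(0, 1)) for _ in range(5)]
print(sample)
```

The paper works in the reverse direction: given observed (time, cause, covariate) triples, the ECM algorithm recovers the logistic and regression coefficients while leaving the component-baseline hazards unspecified.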
Abstract:
Work presented in the scope of the Master's programme in Computer Engineering, as a partial requirement for obtaining the degree of Master in Computer Engineering.
Abstract:
The assessment of existing timber structures is often limited to information obtained from non- or semi-destructive testing, as mechanical testing is in many cases not possible due to its destructive nature. Therefore, the available data provide only an indirect measurement of the reference mechanical properties of timber elements, often obtained through empirically based correlations. Moreover, the data must result from the combination of different tests, so as to provide a reliable source of information for a structural analysis. Even if general guidelines are available for each typology of testing, there is still a need for a global methodology that allows information from different sources to be combined and inferences to be drawn from it in a decision process. In this scope, the present work presents the implementation of a probability-based framework for the safety assessment of existing timber elements. This methodology combines information gathered at different scales and follows a probabilistic framework allowing for the structural assessment of existing timber elements, with the possibility of inference and updating of their mechanical properties through Bayesian methods. The probability-based framework rests on four main steps: (i) scale of information; (ii) measurement data; (iii) probability assignment; and (iv) structural analysis. In this work, the proposed methodology is implemented in a case study. Data were obtained through a multi-scale experimental campaign carried out on old chestnut timber beams, accounting for correlations of non- and semi-destructive tests with mechanical properties. Finally, different inference scenarios are discussed, aiming at the characterization of the safety level of the elements.
Abstract:
A novel framework for probability-based structural assessment of existing structures, which combines model identification and reliability assessment procedures and considers different sources of uncertainty in an objective way, is presented in this paper. A short description of structural assessment applications reported in the literature is initially given. Then, the developed model identification procedure, supported by a robust optimization algorithm, is presented. Special attention is given to both experimental and numerical errors, which are considered in this algorithm's convergence criterion. An updated numerical model is obtained from this process. The reliability assessment procedure, which considers a probabilistic model for the structure under analysis, is then introduced, incorporating the results of the model identification procedure. The developed model is then updated, as new data are acquired, through a Bayesian inference algorithm that explicitly addresses statistical uncertainty. Finally, the developed framework is validated with a set of reinforced concrete beams which were loaded up to failure in the laboratory.
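The Bayesian updating step described above can be illustrated with a minimal conjugate sketch. The normal-normal model and all numbers below are assumptions made for the example; the paper's actual probabilistic model is not specified here.

```python
# Hedged sketch: updating a structural parameter (e.g. a material strength,
# here in assumed MPa units) as new measurements arrive, using a conjugate
# normal prior and normal measurement error.
def bayes_update(prior_mean, prior_var, obs, obs_var):
    """One Bayesian update of a normal prior with a normal observation."""
    # Posterior precision is the sum of prior and data precisions
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

mean, var = 30.0, 25.0          # illustrative prior belief
for y in (33.0, 31.5, 34.2):    # hypothetical measurements
    mean, var = bayes_update(mean, var, y, obs_var=4.0)
print(round(mean, 2), round(var, 2))
```

Each update shrinks the posterior variance, which is the sense in which statistical uncertainty is explicitly reduced as data accumulate.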
Abstract:
Introduction: Non-invasive brain imaging techniques often contrast experimental conditions across a cohort of participants, obfuscating distinctions in individual performance and brain mechanisms that are better characterised by the inter-trial variability. To overcome such limitations, we developed topographic analysis methods for single-trial EEG data [1]. So far, such analysis has typically been based on time-frequency analysis of single-electrode data or single independent components. The method's efficacy is demonstrated for event-related responses to environmental sounds, hitherto studied at the average event-related potential (ERP) level. Methods: Nine healthy subjects participated in the experiment. Auditory meaningful sounds of common objects were used for a target detection task [2]. In each block, subjects were asked to discriminate target sounds, which were living or man-made auditory objects. Continuous 64-channel EEG was acquired during the task. Two datasets were considered for each subject, comprising single trials of the two conditions, living and man-made. The analysis comprised two steps. In the first step, a mixture of Gaussians analysis [3] provided representative topographies for each subject. In the second step, conditional probabilities for each Gaussian provided statistical inference on the structure of these topographies across trials, time, and experimental conditions. A similar analysis was conducted at the group level. Results: The occurrence of each map is structured in time and consistent across trials at both the single-subject and the group level. Conducting separate analyses of ERPs at the single-subject and group levels, we could quantify the consistency of identified topographies and their time course of activation within and across participants as well as experimental conditions. A general agreement was found with previous analyses at the average ERP level.
Conclusions: This novel approach to single-trial analysis promises to have impact on several domains. In clinical research, it gives the possibility to statistically evaluate single-subject data, an essential tool for analysing patients with specific deficits and impairments and their deviation from normative standards. In cognitive neuroscience, it provides a novel tool for understanding behaviour and brain activity interdependencies at both single-subject and at group levels. In basic neurophysiology, it provides a new representation of ERPs and promises to cast light on the mechanisms of its generation and inter-individual variability.
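The mixture-of-Gaussians step above can be sketched in a deliberately reduced form: EM fits two Gaussian components to one-dimensional "single-trial" values, and the per-trial responsibilities play the role of the conditional probabilities used for inference across trials. Real EEG topographies are multichannel maps; the data and all parameters here are illustrative assumptions.

```python
import math
import random

random.seed(4)

# Toy single-trial values drawn from two well-separated "topographies"
data = [random.gauss(0, 1) for _ in range(200)] + \
       [random.gauss(5, 1) for _ in range(200)]

def pdf(x, mu, sd):
    """Normal density."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

mu = [min(data), max(data)]; sd = [1.0, 1.0]; w = [0.5, 0.5]
for _ in range(30):                         # EM iterations
    # E-step: responsibility of each component for each trial
    resp = []
    for x in data:
        p = [w[k] * pdf(x, mu[k], sd[k]) for k in (0, 1)]
        s = sum(p)
        resp.append([pi / s for pi in p])
    # M-step: re-estimate weights, means, and standard deviations
    for k in (0, 1):
        rk = [r[k] for r in resp]
        nk = sum(rk)
        w[k] = nk / len(data)
        mu[k] = sum(r * x for r, x in zip(rk, data)) / nk
        sd[k] = math.sqrt(sum(r * (x - mu[k]) ** 2 for r, x in zip(rk, data)) / nk)

print([round(m, 1) for m in sorted(mu)])
```

The responsibilities in `resp` are exactly the per-trial conditional probabilities on which the statistical inference across trials, time, and conditions is built.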
Abstract:
Continuing developments in science and technology mean that the amounts of information forensic scientists are able to provide for criminal investigations is ever increasing. The commensurate increase in complexity creates difficulties for scientists and lawyers with regard to evaluation and interpretation, notably with respect to issues of inference and decision. Probability theory, implemented through graphical methods, and specifically Bayesian networks, provides powerful methods to deal with this complexity. Extensions of these methods to elements of decision theory provide further support and assistance to the judicial system. Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science provides a unique and comprehensive introduction to the use of Bayesian decision networks for the evaluation and interpretation of scientific findings in forensic science, and for the support of decision-makers in their scientific and legal tasks. Includes self-contained introductions to probability and decision theory. Develops the characteristics of Bayesian networks, object-oriented Bayesian networks and their extension to decision models. Features implementation of the methodology with reference to commercial and academically available software. Presents standard networks and their extensions that can be easily implemented and that can assist in the reader's own analysis of real cases. Provides a technique for structuring problems and organizing data based on methods and principles of scientific reasoning. Contains a method for the construction of coherent and defensible arguments for the analysis and evaluation of scientific findings and for decisions based on them. Is written in a lucid style, suitable for forensic scientists and lawyers with minimal mathematical background. Includes a foreword by Ian Evett. 
The clear and accessible style of this second edition makes this book ideal for all forensic scientists, applied statisticians and graduate students wishing to evaluate forensic findings from the perspective of probability and decision analysis. It will also appeal to lawyers and other scientists and professionals interested in the evaluation and interpretation of forensic findings, including decision making based on scientific information.
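The core computation that Bayesian networks automate in this forensic setting is Bayes' theorem in odds form: posterior odds on a proposition (e.g. "the suspect is the source of the trace") equal the prior odds multiplied by the likelihood ratio of the findings. A minimal sketch, with purely illustrative numbers:

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' theorem in odds form: posterior odds = prior odds * LR."""
    return prior_odds * likelihood_ratio

prior = 1 / 1000            # hypothetical prior odds on the proposition
lr = 10000                  # hypothetical likelihood ratio of the findings
post = posterior_odds(prior, lr)
prob = post / (1 + post)    # convert odds back to a probability
print(round(post, 1), round(prob, 3))
```

A Bayesian network generalises this single multiplication to many interdependent propositions and findings, propagating the same update through the whole graph.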
Abstract:
Given a sample from a fully specified parametric model, let Zn be a given finite-dimensional statistic - for example, an initial estimator or a set of sample moments. We propose to (re-)estimate the parameters of the model by maximizing the likelihood of Zn. We call this the maximum indirect likelihood (MIL) estimator. We also propose a computationally tractable Bayesian version of the estimator which we refer to as a Bayesian Indirect Likelihood (BIL) estimator. In most cases, the density of the statistic will be of unknown form, and we develop simulated versions of the MIL and BIL estimators. We show that the indirect likelihood estimators are consistent and asymptotically normally distributed, with the same asymptotic variance as that of the corresponding efficient two-step GMM estimator based on the same statistic. However, our likelihood-based estimators, by taking into account the full finite-sample distribution of the statistic, are higher order efficient relative to GMM-type estimators. Furthermore, in many cases they enjoy a bias reduction property similar to that of the indirect inference estimator. Monte Carlo results for a number of applications including dynamic and nonlinear panel data models, a structural auction model and two DSGE models show that the proposed estimators indeed have attractive finite sample properties.
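The simulated MIL idea above can be sketched in a deliberately simple setting: the model is N(theta, 1), the statistic Zn is the sample mean, and its density under each candidate theta is approximated by simulation with a normal fit. All sample sizes, grids, and the true parameter value are assumptions made for the example.

```python
import math
import random

random.seed(3)

def simulate_stat(theta, n=100):
    """Simulate the statistic Zn (the sample mean) under parameter theta."""
    return sum(random.gauss(theta, 1) for _ in range(n)) / n

# "Observed" statistic, generated at an assumed true parameter of 1.0
z_obs = simulate_stat(1.0)

def indirect_loglik(theta, sims=400):
    """Log-density of z_obs under theta, via a normal fit to simulated draws."""
    draws = [simulate_stat(theta) for _ in range(sims)]
    m = sum(draws) / sims
    v = sum((d - m) ** 2 for d in draws) / sims
    return -0.5 * math.log(2 * math.pi * v) - (z_obs - m) ** 2 / (2 * v)

grid = [i / 20 for i in range(0, 41)]        # candidate thetas in [0, 2]
theta_hat = max(grid, key=indirect_loglik)   # simulated MIL estimate
print(theta_hat)
```

In this toy case the statistic is the efficient estimator itself, so the payoff is purely illustrative; the paper's point is that the same construction applies when Zn is an arbitrary initial estimator or moment vector with an intractable density.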
Abstract:
A parts-based model is a parametrization of an object class using a collection of landmarks following the object structure. The matching of parts-based models is one of the problems where pairwise Conditional Random Fields have been successfully applied. The main reason for their effectiveness is tractable inference and learning, due to the simplicity of the involved graphs, usually trees. However, these models do not consider possible patterns of statistics among sets of landmarks, and thus they suffer from using overly myopic information. To overcome this limitation, we propose a novel structure based on hierarchical Conditional Random Fields, which we explain in the first part of this thesis. We build a hierarchy of combinations of landmarks, where matching is performed taking into account the whole hierarchy. To preserve tractable inference, we effectively sample the label set. We test our method on facial feature selection and human pose estimation on two challenging datasets: Buffy and MultiPIE. In the second part of this thesis, we present a novel approach to multiple kernel combination that relies on stacked classification. This method can be used to evaluate the landmarks of the parts-based model approach. Our method is based on combining the responses of a set of independent classifiers for each individual kernel. Unlike earlier approaches that linearly combine kernel responses, our approach uses them as inputs to another set of classifiers. We show that we outperform state-of-the-art methods on most of the standard benchmark datasets.
Abstract:
Mendelian randomization refers to the random allocation of alleles at the time of gamete formation. In observational epidemiology, this refers to the use of genetic variants to estimate a causal effect between a modifiable risk factor and an outcome of interest. In this review, we recall the principles of a "Mendelian randomization" approach in observational epidemiology, which is based on the technique of instrumental variables; we provide simulations and an example based on real data to demonstrate its implications; we present the results of a systematic search on original articles having used this approach; and we discuss some limitations of this approach in view of what has been found so far.
Abstract:
Although the relationship between serum uric acid (SUA) and adiposity is well established, the direction of the causality is still unclear in the presence of conflicting evidence. We used a bidirectional Mendelian randomization approach to explore the nature and direction of causality between SUA and adiposity in a population-based study of Caucasians aged 35 to 75 years. We used, as instrumental variables, rs6855911 within the SUA gene SLC2A9 in one direction, and combinations of SNPs within the adiposity genes FTO, MC4R and TMEM18 in the other direction. Adiposity markers included weight, body mass index, waist circumference and fat mass. We applied a two-stage least squares regression: a regression of SUA/adiposity markers on our instruments in the first stage, and a regression of the response of interest on the fitted values from the first-stage regression in the second stage. SUA explained by the SLC2A9 instrument was not associated with fat mass (regression coefficient [95% confidence interval]: 0.05 [-0.10, 0.19]), contrasting with the ordinary least squares estimate (0.37 [0.34, 0.40]). By contrast, fat mass explained by genetic variants of the FTO, MC4R and TMEM18 genes was positively and significantly associated with SUA (0.31 [0.01, 0.62]), similar to the ordinary least squares estimate (0.27 [0.25, 0.29]). Results were similar for the other adiposity markers. Using a bidirectional Mendelian randomization approach in adult Caucasians, our findings suggest that elevated SUA is a consequence rather than a cause of adiposity.
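The two-stage least squares procedure used in these Mendelian randomization studies can be sketched on simulated data. Everything below is an illustrative assumption: a genotype G affects exposure X with no direct path to outcome Y, an unmeasured confounder U biases the naive regression, and the true causal effect is set to 0.3.

```python
import random

random.seed(1)

def ols_slope(x, y):
    """Ordinary least squares slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

n = 10000
g = [random.choice((0, 1, 2)) for _ in range(n)]   # genotype, the instrument
u = [random.gauss(0, 1) for _ in range(n)]         # unmeasured confounder
x = [0.5 * gi + ui + random.gauss(0, 1) for gi, ui in zip(g, u)]   # exposure
y = [0.3 * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]   # outcome

naive = ols_slope(x, y)                 # confounded ordinary least squares
b1 = ols_slope(g, x)                    # stage 1: exposure on instrument
x_hat = [b1 * gi for gi in g]           # fitted exposure values
causal = ols_slope(x_hat, y)            # stage 2: outcome on fitted exposure
print(round(naive, 2), round(causal, 2))
```

The naive slope absorbs the confounder and overstates the effect, while the two-stage estimate lands near the true coefficient 0.3, mirroring the contrast between the ordinary least squares and instrumented estimates reported above.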
Abstract:
In the forensic examination of DNA mixtures, the question of how to set the total number of contributors (N) presents a topic of ongoing interest. Part of the discussion gravitates around issues of bias, in particular when assessments of the number of contributors are not made prior to considering the genotypic configuration of potential donors. Further complication may stem from the observation that, in some cases, there may be numbers of contributors that are incompatible with the set of alleles seen in the profile of a mixed crime stain, given the genotype of a potential contributor. In such situations, procedures that output a single, fixed number of contributors can lead to inferential impasses. Assessing the number of contributors within a probabilistic framework can help avoid such complications. Using elements of decision theory, this paper analyses two strategies for inference on the number of contributors. One procedure is deterministic and focuses on the minimum number of contributors required to 'explain' an observed set of alleles. The other procedure is probabilistic, using Bayes' theorem, and provides a probability distribution over a set of numbers of contributors, based on the set of observed alleles as well as their respective rates of occurrence. The discussion concentrates on mixed stains of varying quality (i.e., different numbers of loci for which genotyping information is available). A so-called qualitative interpretation is pursued, since quantitative information such as peak area and height data is not taken into account. The competing procedures are compared using a standard scoring rule that penalizes the degree of divergence between a given agreed value for N, that is, the number of contributors, and the actual value taken by N.
Using only modest assumptions and a discussion with reference to a casework example, this paper reports on analyses using simulation techniques and graphical models (i.e., Bayesian networks) to point out that setting the number of contributors to a mixed crime stain in probabilistic terms is, for the conditions assumed in this study, preferable to a decision policy that relies on categorical assumptions about N.
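The probabilistic procedure can be sketched at a single locus: Bayes' theorem turns a prior over N and a likelihood of the observed allele set (here estimated by Monte Carlo) into a posterior distribution over the number of contributors. The allele frequencies, the flat prior, and the single-locus setting are all illustrative assumptions; casework would combine loci and use population data.

```python
import random

random.seed(2)

freqs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}   # hypothetical locus
observed = {"a", "b", "c"}                          # alleles seen in the stain

def likelihood(n_contrib, trials=20000):
    """Monte Carlo estimate of P(observed allele set | N contributors)."""
    alleles, weights = zip(*freqs.items())
    hits = 0
    for _ in range(trials):
        # Each contributor carries 2 alleles drawn from population frequencies
        draw = set(random.choices(alleles, weights=weights, k=2 * n_contrib))
        hits += draw == observed
    return hits / trials

prior = {n: 1 / 4 for n in (1, 2, 3, 4)}            # flat prior over N
post = {n: prior[n] * likelihood(n) for n in prior}
z = sum(post.values())
post = {n: p / z for n, p in post.items()}          # normalize via Bayes' rule
print({n: round(p, 3) for n, p in post.items()})
```

Note how N = 1 receives posterior probability zero automatically, since two alleles cannot produce three distinct types; this is exactly the kind of incompatibility that leads a fixed-N procedure into an inferential impasse.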
Abstract:
Doping with natural steroids can be detected by evaluating the urinary concentrations and ratios of several endogenous steroids. Since these biomarkers of steroid doping are known to present large inter-individual variations, monitoring of individual steroid profiles over time allows switching from population-based towards subject-based reference ranges for improved detection. In an Athlete Biological Passport (ABP), biomarkers data are collated throughout the athlete's sporting career and individual thresholds defined adaptively. For now, this approach has been validated on a limited number of markers of steroid doping, such as the testosterone (T) over epitestosterone (E) ratio to detect T misuse in athletes. Additional markers are required for other endogenous steroids like dihydrotestosterone (DHT) and dehydroepiandrosterone (DHEA). By combining comprehensive steroid profiles composed of 24 steroid concentrations with Bayesian inference techniques for longitudinal profiling, a selection was made for the detection of DHT and DHEA misuse. The biomarkers found were rated according to relative response, parameter stability, discriminative power, and maximal detection time. This analysis revealed DHT/E, DHT/5β-androstane-3α,17β-diol and 5α-androstane-3α,17β-diol/5β-androstane-3α,17β-diol as best biomarkers for DHT administration and DHEA/E, 16α-hydroxydehydroepiandrosterone/E, 7β-hydroxydehydroepiandrosterone/E and 5β-androstane-3α,17β-diol/5α-androstane-3α,17β-diol for DHEA. The selected biomarkers were found suitable for individual referencing. A drastic overall increase in sensitivity was obtained. The use of multiple markers as formalized in an Athlete Steroidal Passport (ASP) can provide firm evidence of doping with endogenous steroids.
Abstract:
Small sample properties are of fundamental interest when only limited data is available. Exact inference is limited by constraints imposed by specific nonrandomized tests and, of course, also by lack of more data. These effects can be separated, as we propose to evaluate a test by comparing its type II error to the minimal type II error among all tests for the given sample. Game theory is used to establish this minimal type II error; the associated randomized test is characterized as part of a Nash equilibrium of a fictitious game against nature. We use this method to investigate sequential tests for the difference between two means when outcomes are constrained to belong to a given bounded set. Tests of inequality and of noninferiority are included. We find that inference in terms of type II error based on a balanced sample cannot be improved by sequential sampling, or even by observing counterfactual evidence, provided there is a reasonable gap between the hypotheses.