952 resultados para prior probabilities
Resumo:
Prior probabilities represent a core element of the Bayesian probabilistic approach to relatedness testing. This letter opinions on the commentary 'Use of prior odds for missing persons identifications' by Budowle et al. (2011), published recently in this journal. Contrary to Budowle et al. (2011), we argue that the concept of prior probabilities (i) is not endowed with the notion of objectivity, (ii) is not a case for computation and (iii) does not require new guidelines edited by the forensic DNA community - as long as probability is properly considered as an expression of personal belief. Please see related article: http://www.investigativegenetics.com/content/3/1/3
Resumo:
Background: Genome-wide association studies (GWAS) require large sample sizes to obtain adequate statistical power, but it may be possible to increase the power by incorporating complementary data. In this study we investigated the feasibility of automatically retrieving information from the medical literature and leveraging this information in GWAS. Methods: We developed a method that searches through PubMed abstracts for pre-assigned keywords and key concepts, and uses this information to assign prior probabilities of association for each single nucleotide polymorphism (SNP) with the phenotype of interest - the Adjusting Association Priors with Text (AdAPT) method. Association results from a GWAS can subsequently be ranked in the context of these priors using the Bayes False Discovery Probability (BFDP) framework. We initially tested AdAPT by comparing rankings of known susceptibility alleles in a previous lung cancer GWAS, and subsequently applied it in a two-phase GWAS of oral cancer. Results: Known lung cancer susceptibility SNPs were consistently ranked higher by AdAPT BFDPs than by p-values. In the oral cancer GWAS, we sought to replicate the top five SNPs as ranked by AdAPT BFDPs, of which rs991316, located in the ADH gene region of 4q23, displayed a statistically significant association with oral cancer risk in the replication phase (per-rare-allele log additive p-value [p(trend)] = 2.5 x 10(-3)). The combined OR for having one additional rare allele was 0.83 (95% CI: 0.76-0.90), and this association was independent of previously identified susceptibility SNPs that are associated with overall UADT cancer in this gene region. We also investigated if rs991316 was associated with other cancers of the upper aerodigestive tract (UADT), but no additional association signal was found. Conclusion: This study highlights the potential utility of systematically incorporating prior knowledge from the medical literature in genome-wide analyses using the AdAPT methodology. AdAPT is available online (url: http://services.gate.ac.uk/lld/gwas/service/config).
Resumo:
This report outlines the derivation and application of a non-zero mean, polynomial-exponential covariance function based Gaussian process which forms the prior wind field model used in 'autonomous' disambiguation. It is principally used since the non-zero mean permits the computation of realistic local wind vector prior probabilities which are required when applying the scaled-likelihood trick, as the marginals of the full wind field prior. As the full prior is multi-variate normal, these marginals are very simple to compute.
Resumo:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
Resumo:
A comparative performance analysis of four geolocation methods in terms of their theoretical root mean square positioning errors is provided. Comparison is established in two different ways: strict and average. In the strict type, methods are examined for a particular geometric configuration of base stations(BSs) with respect to mobile position, which determines a givennoise profile affecting the respective time-of-arrival (TOA) or timedifference-of-arrival (TDOA) estimates. In the average type, methodsare evaluated in terms of the expected covariance matrix ofthe position error over an ensemble of random geometries, so thatcomparison is geometry independent. Exact semianalytical equationsand associated lower bounds (depending solely on the noiseprofile) are obtained for the average covariance matrix of the positionerror in terms of the so-called information matrix specific toeach geolocation method. Statistical channel models inferred fromfield trials are used to define realistic prior probabilities for therandom geometries. A final evaluation provides extensive resultsrelating the expected position error to channel model parametersand the number of base stations.
Resumo:
L’objectif principal de cette thèse est d’identifier les étoiles de faible masse et naines brunes membres d’associations cinématiques jeunes du voisinage solaire. Ces associations sont typiquement âgées de moins de 200 millions d’années et regroupent chacune un ensemble d’étoiles s’étant formées au même moment et dans un même environnement. La majorité de leurs membres d'environ plus de 0.3 fois la masse du Soleil sont déjà connus, cependant les membres moins massifs (et moins brillants) nous échappent encore. Leur identification permettra de lever le voile sur plusieurs questions fondamentales en astrophysique. En particulier, le fait de cibler des objets jeunes, encore chauds et lumineux par leur formation récente, permettra d’atteindre un régime de masses encore peu exploré, jusqu'à seulement quelques fois la masse de Jupiter. Elles nous permettront entre autres de contraindre la fonction de masse initiale et d'explorer la connection entre naines brunes et exoplanètes, étant donné que les moins massives des naines brunes jeunes auront des propriétés physiques très semblables aux exoplanètes géantes gazeuses. Pour mener à bien ce projet, nous avons adapté l'outil statistique BANYAN I pour qu'il soit applicable aux objets de très faibles masses en plus de lui apporter plusieurs améliorations. Nous avons entre autres inclus l'utilisation de deux diagrammes couleur-magnitude permettant de différencier les étoiles de faible masse et naines brunes jeunes à celles plus vieilles, ajouté l'utilisation de probabilités a priori pour rendre les résultats plus réalistes, adapté les modèles spatiaux et cinématiques des associations jeunes en utilisant des ellipsoïdes gaussiennes tridimensionnelles dont l'alignement des axes est libre, effectué une analyse Monte Carlo pour caractériser le taux de faux-positifs et faux-négatifs, puis revu la structure du code informatique pour le rendre plus efficace. Dans un premier temps, nous avons utilisé ce nouvel algorithme, BANYAN II, pour identifier 25 nouvelles candidates membres d'associations jeunes parmi un échantillon de 158 étoiles de faible masse (de types spectraux > M4) et naines brunes jeunes déjà connues. Nous avons ensuite effectué la corrélation croisée de deux catalogues couvrant tout le ciel en lumière proche-infrarouge et contenant ~ 500 millions d’objets célestes pour identifier environ 100 000 candidates naines brunes et étoiles de faible masse du voisinage solaire. À l'aide de l'outil BANYAN II, nous avons alors identifié quelques centaines d'objets appartenant fort probablement à une association jeune parmi cet échantillon et effectué un suivi spectroscopique en lumière proche-infrarouge pour les caractériser. Les travaux présentés ici ont mené à l'identification de 79 candidates naines brunes jeunes ainsi que 150 candidates étoiles de faible masse jeunes, puis un suivi spectroscopique nous a permis de confirmer le jeune âge de 49 de ces naines brunes et 62 de ces étoiles de faible masse. Nous avons ainsi approximativement doublé le nombre de naines brunes jeunes connues, ce qui a ouvert la porte à une caractérisation statistique de leur population. Ces nouvelles naines brunes jeunes représentent un laboratoire idéal pour mieux comprendre l'atmosphère des exoplanètes géantes gazeuses. Nous avons identifié les premiers signes d’une remontée dans la fonction de masse initiale des naines brunes aux très faibles masses dans l'association jeune Tucana-Horologium, ce qui pourrait indiquer que l’éjection d’exoplanètes joue un rôle important dans la composition de leur population. Les résultats du suivi spectroscopique nous ont permis de construire une séquence empirique complète pour les types spectraux M5-L5 à l'âge du champ, à faible (β) et très faible (γ) gravité de surface. Nous avons effectué une comparaison de ces données aux modèles d'évolution et d'atmosphère, puis nous avons construit un ensemble de séquences empiriques de couleur-magnitude et types spectraux-magnitude pour les naines brunes jeunes. Finalement, nous avons découvert deux nouvelles exoplanètes par un suivi en imagerie directe des étoiles jeunes de faible masse identifiées dans ce projet. La future mission GAIA et le suivi spectroscopique complet des candidates présentées dans cette thèse permettront de confirmer leur appartenance aux associations jeunes et de contraindre la fonction de masse initiale dans le régime sous-stellaire.
Resumo:
We had previously shown that regularization principles lead to approximation schemes, as Radial Basis Functions, which are equivalent to networks with one layer of hidden units, called Regularization Networks. In this paper we show that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models, Breiman's hinge functions and some forms of Projection Pursuit Regression. In the probabilistic interpretation of regularization, the different classes of basis functions correspond to different classes of prior probabilities on the approximating function spaces, and therefore to different types of smoothness assumptions. In the final part of the paper, we also show a relation between activation functions of the Gaussian and sigmoidal type.
Resumo:
A recent article in this journal (Ioannidis JP (2005) Why most published research findings are false. PLoS Med 2: e124) argued that more than half of published research findings in the medical literature are false. In this commentary, we examine the structure of that argument, and show that it has three basic components: 1)An assumption that the prior probability of most hypotheses explored in medical research is below 50%. 2)Dichotomization of P-values at the 0.05 level and introduction of a “bias” factor (produced by significance-seeking), the combination of which severely weakens the evidence provided by every design. 3)Use of Bayes theorem to show that, in the face of weak evidence, hypotheses with low prior probabilities cannot have posterior probabilities over 50%. Thus, the claim is based on a priori assumptions that most tested hypotheses are likely to be false, and then the inferential model used makes it impossible for evidence from any study to overcome this handicap. We focus largely on step (2), explaining how the combination of dichotomization and “bias” dilutes experimental evidence, and showing how this dilution leads inevitably to the stated conclusion. We also demonstrate a fallacy in another important component of the argument –that papers in “hot” fields are more likely to produce false findings. We agree with the paper’s conclusions and recommendations that many medical research findings are less definitive than readers suspect, that P-values are widely misinterpreted, that bias of various forms is widespread, that multiple approaches are needed to prevent the literature from being systematically biased and the need for more data on the prevalence of false claims. But calculating the unreliability of the medical research literature, in whole or in part, requires more empirical evidence and different inferential models than were used. The claim that “most research findings are false for most research designs and for most fields” must be considered as yet unproven.
Resumo:
In this paper we propose a new method for the automatic detection and tracking of road traffic signs using an on-board single camera. This method aims to increase the reliability of the detections such that it can boost the performance of any traffic sign recognition scheme. The proposed approach exploits a combination of different features, such as color, appearance, and tracking information. This information is introduced into a recursive Bayesian decision framework, in which prior probabilities are dynamically adapted to tracking results. This decision scheme obtains a number of candidate regions in the image, according to their HS (Hue-Saturation). Finally, a Kalman filter with an adaptive noise tuning provides the required time and spatial coherence to the estimates. Results have shown that the proposed method achieves high detection rates in challenging scenarios, including illumination changes, rapid motion and significant perspective distortion
Resumo:
Frequentist statistical methods continue to predominate in many areas of science despite prominent calls for "statistical reform." They do so in part because their main rivals, Bayesian methods, appeal to prior probability distributions that arguably lack an objective justification in typical cases. Some methodologists find a third approach called likelihoodism attractive because it avoids important objections to frequentism without appealing to prior probabilities. However, likelihoodist methods do not provide guidance for belief or action, but only assessments of data as evidence. I argue that there is no good way to use those assessments to guide beliefs or actions without appealing to prior probabilities, and that as a result likelihoodism is not a viable alternative to frequentism and Bayesianism for statistical reform efforts in science.
Resumo:
This paper, addresses the problem of novelty detection in the case that the observed data is a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the 'probability density function', 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the 'prior' 'background' 'probability), the 'pdf' and the 'prior' probabilities of all other components are all assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, and afterwards data are Bayes optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'oker' data set. We propose this method as a reliable means of novelty detection in the emergency situation which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007.
Resumo:
International audience
Resumo:
To explore the projection efficiency of a design, Tsai, et al [2000. Projective three-level main effects designs robust to model uncertainty. Biometrika 87, 467-475] introduced the Q criterion to compare three-level main-effects designs for quantitative factors that allow the consideration of interactions in addition to main effects. In this paper, we extend their method and focus on the case in which experimenters have some prior knowledge, in advance of running the experiment, about the probabilities of effects being non-negligible. A criterion which incorporates experimenters' prior beliefs about the importance of each effect is introduced to compare orthogonal, or nearly orthogonal, main effects designs with robustness to interactions as a secondary consideration. We show that this criterion, exploiting prior information about model uncertainty, can lead to more appropriate designs reflecting experimenters' prior beliefs. (c) 2006 Elsevier B.V. All rights reserved.