951 resultados para Statistical hypothesis testing
Resumo:
Many people regard the concept of hypothesis testing as fundamental to inferential statistics. Various schools of thought, in particular frequentist and Bayesian, have promoted radically different solutions for taking a decision about the plausibility of competing hypotheses. Comprehensive philosophical comparisons about their advantages and drawbacks are widely available and continue to span over large debates in the literature. More recently, controversial discussion was initiated by an editorial decision of a scientific journal [1] to refuse any paper submitted for publication containing null hypothesis testing procedures. Since the large majority of papers published in forensic journals propose the evaluation of statistical evidence based on the so called p-values, it is of interest to expose the discussion of this journal's decision within the forensic science community. This paper aims to provide forensic science researchers with a primer on the main concepts and their implications for making informed methodological choices.
Resumo:
A procedure for calculating critical level and power of likelihood ratio test, based on a Monte-Carlo simulation method is proposed. General principles of software building for its realization are given. Some examples of its application are shown.
Resumo:
Mode of access: Internet.
Resumo:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et at. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow over a related graph which can be solved using a Ford-Fulkerson algorithm in polynomial time on that number. We apply the test to 10 randomly chosen protein domain families from the seed of Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
Resumo:
Mode of access: Internet.
Resumo:
In the 1920s, Ronald Fisher developed the theory behind the p value and Jerzy Neyman and Egon Pearson developed the theory of hypothesis testing. These distinct theories have provided researchers important quantitative tools to confirm or refute their hypotheses. The p value is the probability to obtain an effect equal to or more extreme than the one observed presuming the null hypothesis of no effect is true; it gives researchers a measure of the strength of evidence against the null hypothesis. As commonly used, investigators will select a threshold p value below which they will reject the null hypothesis. The theory of hypothesis testing allows researchers to reject a null hypothesis in favor of an alternative hypothesis of some effect. As commonly used, investigators choose Type I error (rejecting the null hypothesis when it is true) and Type II error (accepting the null hypothesis when it is false) levels and determine some critical region. If the test statistic falls into that critical region, the null hypothesis is rejected in favor of the alternative hypothesis. Despite similarities between the two, the p value and the theory of hypothesis testing are different theories that often are misunderstood and confused, leading researchers to improper conclusions. Perhaps the most common misconception is to consider the p value as the probability that the null hypothesis is true rather than the probability of obtaining the difference observed, or one that is more extreme, considering the null is true. Another concern is the risk that an important proportion of statistically significant results are falsely significant. Researchers should have a minimum understanding of these two theories so that they are better able to plan, conduct, interpret, and report scientific experiments.
Resumo:
The purpose of this master thesis was to perform simulations that involve use of random number while testing hypotheses especially on two samples populations being compared weather by their means, variances or Sharpe ratios. Specifically, we simulated some well known distributions by Matlab and check out the accuracy of an hypothesis testing. Furthermore, we went deeper and check what could happen once the bootstrapping method as described by Effrons is applied on the simulated data. In addition to that, one well known RobustSharpe hypothesis testing stated in the paper of Ledoit and Wolf was applied to measure the statistical significance performance between two investment founds basing on testing weather there is a statistically significant difference between their Sharpe Ratios or not. We collected many literatures about our topic and perform by Matlab many simulated random numbers as possible to put out our purpose; As results we come out with a good understanding that testing are not always accurate; for instance while testing weather two normal distributed random vectors come from the same normal distribution. The Jacque-Berra test for normality showed that for the normal random vector r1 and r2, only 94,7% and 95,7% respectively are coming from normal distribution in contrast 5,3% and 4,3% failed to shown the truth already known; but when we introduce the bootstrapping methods by Effrons while estimating pvalues where the hypothesis decision is based, the accuracy of the test was 100% successful. From the above results the reports showed that bootstrapping methods while testing or estimating some statistics should always considered because at most cases the outcome are accurate and errors are minimized in the computation. Also the RobustSharpe test which is known to use one of the bootstrapping methods, studentised one, were applied first on different simulated data including distribution of many kind and different shape secondly, on real data, Hedge and Mutual funds. The test performed quite well to agree with the existence of statistical significance difference between their Sharpe ratios as described in the paper of Ledoit andWolf.
Resumo:
Belief propagation (BP) is a technique for distributed inference in wireless networks and is often used even when the underlying graphical model contains cycles. In this paper, we propose a uniformly reweighted BP scheme that reduces the impact of cycles by weighting messages by a constant ?edge appearance probability? rho ? 1. We apply this algorithm to distributed binary hypothesis testing problems (e.g., distributed detection) in wireless networks with Markov random field models. We demonstrate that in the considered setting the proposed method outperforms standard BP, while maintaining similar complexity. We then show that the optimal ? can be approximated as a simple function of the average node degree, and can hence be computed in a distributed fashion through a consensus algorithm.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Dissertação apresentada para obtenção do Grau de Doutor em Matemática, Estatística, pela Universidade Nova de Lisboa, faculdade de Ciências e Tecnologia
Resumo:
Small sample properties are of fundamental interest when only limited data is avail-able. Exact inference is limited by constraints imposed by speci.c nonrandomizedtests and of course also by lack of more data. These e¤ects can be separated as we propose to evaluate a test by comparing its type II error to the minimal type II error among all tests for the given sample. Game theory is used to establish this minimal type II error, the associated randomized test is characterized as part of a Nash equilibrium of a .ctitious game against nature.We use this method to investigate sequential tests for the di¤erence between twomeans when outcomes are constrained to belong to a given bounded set. Tests ofinequality and of noninferiority are included. We .nd that inference in terms oftype II error based on a balanced sample cannot be improved by sequential sampling or even by observing counter factual evidence providing there is a reasonable gap between the hypotheses.
Resumo:
Consider the problem of testing k hypotheses simultaneously. In this paper,we discuss finite and large sample theory of stepdown methods that providecontrol of the familywise error rate (FWE). In order to improve upon theBonferroni method or Holm's (1979) stepdown method, Westfall and Young(1993) make eective use of resampling to construct stepdown methods thatimplicitly estimate the dependence structure of the test statistics. However,their methods depend on an assumption called subset pivotality. The goalof this paper is to construct general stepdown methods that do not requiresuch an assumption. In order to accomplish this, we take a close look atwhat makes stepdown procedures work, and a key component is a monotonicityrequirement of critical values. By imposing such monotonicity on estimatedcritical values (which is not an assumption on the model but an assumptionon the method), it is demonstrated that the problem of constructing a validmultiple test procedure which controls the FWE can be reduced to the problemof contructing a single test which controls the usual probability of a Type 1error. This reduction allows us to draw upon an enormous resamplingliterature as a general means of test contruction.
Resumo:
In the first part of the study, nine estimators of the first-order autoregressive parameter are reviewed and a new estimator is proposed. The relationships and discrepancies between the estimators are discussed in order to achieve a clear differentiation. In the second part of the study, the precision in the estimation of autocorrelation is studied. The performance of the ten lag-one autocorrelation estimators is compared in terms of Mean Square Error (combining bias and variance) using data series generated by Monte Carlo simulation. The results show that there is not a single optimal estimator for all conditions, suggesting that the estimator ought to be chosen according to sample size and to the information available of the possible direction of the serial dependence. Additionally, the probability of labelling an actually existing autocorrelation as statistically significant is explored using Monte Carlo sampling. The power estimates obtained are quite similar among the tests associated with the different estimators. These estimates evidence the small probability of detecting autocorrelation in series with less than 20 measurement times.
Resumo:
PowerPoint slides for Hypothesis Testing. Examples are taken from the Medical Literature