980 results for Statistical testing
Abstract:
In the 1920s, Ronald Fisher developed the theory behind the p value and Jerzy Neyman and Egon Pearson developed the theory of hypothesis testing. These distinct theories have provided researchers with important quantitative tools to confirm or refute their hypotheses. The p value is the probability of obtaining an effect equal to or more extreme than the one observed, assuming that the null hypothesis of no effect is true; it gives researchers a measure of the strength of evidence against the null hypothesis. As commonly used, investigators will select a threshold p value below which they will reject the null hypothesis. The theory of hypothesis testing allows researchers to reject a null hypothesis in favor of an alternative hypothesis of some effect. As commonly used, investigators choose Type I error (rejecting the null hypothesis when it is true) and Type II error (accepting the null hypothesis when it is false) levels and determine some critical region. If the test statistic falls into that critical region, the null hypothesis is rejected in favor of the alternative hypothesis. Despite similarities between the two, the p value and the theory of hypothesis testing are different theories that are often misunderstood and confused, leading researchers to improper conclusions. Perhaps the most common misconception is to consider the p value as the probability that the null hypothesis is true rather than the probability of obtaining the difference observed, or one that is more extreme, given that the null is true. Another concern is the risk that a substantial proportion of statistically significant results are falsely significant. Researchers should have a minimum understanding of these two theories so that they are better able to plan, conduct, interpret, and report scientific experiments.
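The two readings described in this abstract can be put side by side in a few lines. The following is a minimal sketch, assuming a two-sided one-sample z-test with a known standard deviation; the function name `z_test` and the sample values are illustrative and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test with known sigma (illustrative assumption)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()

    # Fisher-style reading: the p value is P(|Z| >= |z|) computed under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Neyman-Pearson-style reading: fix alpha in advance, derive the critical
    # region |Z| > z_crit, and only report whether the statistic falls in it.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    reject = abs(z) > z_crit
    return z, p_value, reject


# Hypothetical measurements, tested against a null mean of 5.0.
z, p_value, reject = z_test([5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3, 5.7], mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0 at alpha = 0.05: {reject}")
```

For this test the two decisions coincide (`reject` is true exactly when `p_value < alpha`), but the p value is still not the probability that the null hypothesis is true.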
Abstract:
When researchers introduce a new test they have to demonstrate that it is valid, using unbiased designs and suitable statistical procedures. In this article we use Monte Carlo analyses to highlight how incorrect statistical procedures (i.e., stepwise regression, extreme scores analyses) or ignoring regression assumptions (e.g., heteroscedasticity) contribute to wrong validity estimates. Beyond these demonstrations, and as an example, we re-examined the results reported by Warwick, Nettelbeck, and Ward (2010) concerning the validity of the Ability Emotional Intelligence Measure (AEIM). Warwick et al. used the wrong statistical procedures to conclude that the AEIM was incrementally valid beyond intelligence and personality traits in predicting various outcomes. In our re-analysis, we found that the reliability-corrected multiple correlation of their measures with personality and intelligence was up to .69. Using robust statistical procedures and appropriate controls, we also found that the AEIM did not predict incremental variance in GPA, stress, loneliness, or well-being, demonstrating the importance of testing validity instead of looking for it.
Abstract:
The present work focuses on the skew-symmetry index as a measure of social reciprocity. This index is based on the correspondence between the amount of behaviour that each individual addresses to its partners and what it receives from them in return. Although the skew-symmetry index enables researchers to describe social groups, statistical inferential tests are required. The main aim of the present study is to propose an overall statistical technique for testing symmetry in experimental conditions, calculating the skew-symmetry statistic (Φ) at group level. Sampling distributions for the skew-symmetry statistic have been estimated by means of a Monte Carlo simulation in order to allow researchers to make statistical decisions. Furthermore, this study will allow researchers to choose the optimal experimental conditions for carrying out their research, as the power of the statistical test has been estimated. This statistical test could be used in experimental social psychology studies in which researchers may control the group size and the number of interactions within dyads.
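The Monte Carlo idea in this abstract can be illustrated with a short simulation. This is a sketch under stated assumptions, not the authors' procedure: it takes Φ as tr(K'K) / tr(X'X), where K = (X − X') / 2 is the skew-symmetric part of the interaction matrix X, and it models the null hypothesis of reciprocity by redistributing each dyad's interactions with probability 0.5 in each direction. The function names and the example matrix are hypothetical.

```python
import random


def skew_symmetry_phi(X):
    """Phi = tr(K'K) / tr(X'X), with K = (X - X') / 2 (assumed definition)."""
    n = len(X)
    num = sum(((X[i][j] - X[j][i]) / 2) ** 2 for i in range(n) for j in range(n))
    den = sum(X[i][j] ** 2 for i in range(n) for j in range(n))
    return num / den


def simulate_null(X, n_sim=2000, seed=0):
    """Simulate Phi under an assumed null: within each dyad, every interaction
    goes in either direction with probability 0.5 (dyad totals are kept)."""
    rng = random.Random(seed)
    n = len(X)
    phis = []
    for _ in range(n_sim):
        Y = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                total = X[i][j] + X[j][i]
                k = sum(rng.random() < 0.5 for _ in range(total))
                Y[i][j], Y[j][i] = k, total - k
        phis.append(skew_symmetry_phi(Y))
    return phis


def mc_p_value(X, n_sim=2000, seed=0):
    """Share of simulated Phi values at least as large as the observed one."""
    observed = skew_symmetry_phi(X)
    phis = simulate_null(X, n_sim, seed)
    return observed, sum(p >= observed for p in phis) / n_sim


# Hypothetical 3-individual sociomatrix (rows: actors, columns: receivers).
X = [[0, 12, 9], [3, 0, 10], [4, 2, 0]]
print(mc_p_value(X))
```

Under this definition Φ is 0 for a fully symmetric matrix and 0.5 when every dyad is completely one-directional. The sketch assumes a zero diagonal and at least one interaction per dyad overall.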
Abstract:
Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data.
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
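The memory argument can be illustrated with a small permutation sketch. It is not the MBMDR-3.0.3 algorithm: it uses a single-step maxT adjustment, a toy test statistic (absolute difference in mean trait between carriers and non-carriers), and hypothetical names throughout. Markers are streamed one at a time, and only one running maximum per permutation and the `top` best observed statistics are kept, so the memory used does not grow with the number of tests.

```python
import heapq
import random
from statistics import mean


def abs_mean_diff(trait, genotype):
    """Toy statistic: |mean trait of carriers - mean trait of non-carriers|."""
    carriers = [t for t, g in zip(trait, genotype) if g]
    others = [t for t, g in zip(trait, genotype) if not g]
    if not carriers or not others:
        return 0.0
    return abs(mean(carriers) - mean(others))


def maxt_top_hits(trait, markers, n_perm=199, top=3, seed=0):
    """Single-step maxT adjusted p-values for the `top` strongest markers.

    `markers` may be any iterable (for example a generator), so the full set
    of tests never has to be held in memory at once.
    """
    rng = random.Random(seed)
    perms = [rng.sample(trait, len(trait)) for _ in range(n_perm)]  # permuted traits
    perm_max = [0.0] * n_perm   # running maximum statistic for each permutation
    best = []                   # min-heap of (statistic, marker index), size <= top

    for idx, genotype in enumerate(markers):
        stat = abs_mean_diff(trait, genotype)
        if len(best) < top:
            heapq.heappush(best, (stat, idx))
        else:
            heapq.heappushpop(best, (stat, idx))  # drop the weakest candidate
        for b, permuted in enumerate(perms):
            perm_max[b] = max(perm_max[b], abs_mean_diff(permuted, genotype))

    results = []
    for stat, idx in sorted(best, reverse=True):
        exceed = sum(m >= stat for m in perm_max)
        results.append((idx, stat, (exceed + 1) / (n_perm + 1)))
    return results


# Hypothetical data: 20 individuals, 4 binary markers.
trait = [10.0] * 10 + [0.0] * 10
markers = [[1] * 10 + [0] * 10, [1, 0] * 10, [1] * 3 + [0] * 17, [0] * 17 + [1] * 3]
for idx, stat, p_adj in maxt_top_hits(trait, markers, n_perm=99, top=2):
    print(idx, round(stat, 3), p_adj)
```

Each adjusted p-value compares an observed statistic with the distribution of the per-permutation maximum over all markers, which is what controls the family-wise error rate in the single-step variant.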
Abstract:
The objective of this work was to evaluate the effects of temperature (10, 20, 30, 20/10 and 30/10°C) and period of storage on electrical conductivity (EC) in four seed lots of corn (Zea mays L.), as well as the mineral composition of the soaking solution. The EC test indirectly determines the integrity of seed membrane systems and is used for the assessment of seed vigor, because it detects the seed deterioration process from its early phase. The research comprised determinations of water content, germination, accelerated aging (AA), cold (CT) and EC vigor tests, and determinations of Ca2+, Mg2+ and K+ release to the solution, after seed soaking of four corn seed lots. The evaluations were performed every four months during a period of 16 months. For statistical analysis, a completely randomized split plot design was used with eight replications. Except for seed lots stored at 10°C, all vigor evaluations revealed a decline in vigor, but AA and CT were more sensitive to declines in seed physiological quality than EC. Potassium was the main leached ion regardless of the storage temperature.
Abstract:
To enable a mathematically and physically sound execution of the fatigue test and a correct interpretation of its results, statistical evaluation methods are used to assist in the analysis of fatigue testing data. The main objective of this work is to develop step-by-step instructions for statistical analysis of laboratory fatigue data. The scope of this project is to provide practical cases that answer the questions raised in the treatment of test data, applying the methods and formulae in the document IIW-XIII-2138-06 (Best Practice Guide on the Statistical Analysis of Fatigue Data). Generally, the questions in the data sheets involve three aspects: estimation of the necessary sample size, verification of the statistical equivalence of the collated sets of data, and determination of characteristic curves in different cases. The series of comprehensive examples given in this thesis serves as a demonstration of the various statistical methods, with the aim of developing a sound procedure for creating reliable calculation rules for fatigue analysis.
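The determination of a characteristic curve can be sketched as a regression followed by a downward shift. The sketch below assumes the common S-N form log N = log C − m·log S, fitted by least squares with log N as the dependent variable, and a characteristic curve obtained by shifting the mean curve by k standard deviations. The factor `k` is treated as a user-supplied input because its value depends on sample size and confidence level; neither the formula nor the numbers below are taken from the thesis or from IIW-XIII-2138-06.

```python
from math import log10, sqrt


def fit_sn_curve(stress_ranges, cycles, k):
    """Least-squares fit of log10(N) = log10(C) - m * log10(S).

    Returns the slope m, the mean intercept, the residual standard deviation
    and a characteristic intercept shifted down by k standard deviations
    (k is a hypothetical input here).
    """
    x = [log10(s) for s in stress_ranges]
    y = [log10(n) for n in cycles]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Residual standard deviation with n - 2 degrees of freedom.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    stdv = sqrt(sum(r * r for r in residuals) / (n - 2))

    return {
        "m": -slope,
        "log_C_mean": intercept,
        "stdv": stdv,
        "log_C_char": intercept - k * stdv,
    }


# Hypothetical test series: stress ranges in MPa and cycles to failure.
stress = [100, 120, 150, 180, 220, 260]
cycles = [2.1e6, 1.1e6, 6.2e5, 3.3e5, 1.9e5, 1.1e5]
print(fit_sn_curve(stress, cycles, k=2.0))
```

With real data, the value of `k` would be taken from the guide for the actual number of specimens rather than fixed at 2.0.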
Abstract:
The aim of this thesis is to examine the efficiency of the Chinese stock markets and the validity of the random walk hypothesis. A further aim is to determine whether a day-of-the-week anomaly occurs in the Chinese stock markets. The data consist of daily logarithmic returns of the Shanghai Stock Exchange A-share, B-share and composite indices and of the Shenzhen composite index for the period 21 February 1992 to 30 December 2005, together with daily logarithmic returns of the Shenzhen Stock Exchange A-share and B-share indices for the period 5 October 1992 to 30 December 2005. Four statistical methods are used: the autocorrelation test, the nonparametric runs test, the variance ratio test and the Augmented Dickey-Fuller unit root test. The presence of the day-of-the-week anomaly is examined using ordinary least squares (OLS). The tests are carried out both on the full sample and on three separate subperiods. The empirical results of this thesis support earlier studies on the inefficiency of the Chinese stock markets. With the exception of the unit root test results, the random walk hypothesis was rejected for both Chinese stock markets on the basis of the autocorrelation, runs and variance ratio tests. The results show that on both exchanges the behaviour of the B-share indices has been considerably more inconsistent with the random walk hypothesis than that of the A-share indices. Except for the B-share markets, the efficiency of both Chinese stock markets also appeared to improve after the market boom of 2001. The results also show that the day-of-the-week anomaly is present on the Shanghai Stock Exchange but not on the Shenzhen Stock Exchange over the whole period examined.
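Of the four methods listed, the runs test is the simplest to write out. The following is a minimal sketch of the Wald-Wolfowitz runs test with its usual normal approximation, assuming returns are classified by sign and zero returns are dropped. The classification rule, the function name and the example series are assumptions, not details taken from the thesis.

```python
from math import sqrt
from statistics import NormalDist


def runs_test(returns):
    """Wald-Wolfowitz runs test on the signs of a return series.

    Too few runs suggests persistence, too many suggests reversal; either
    is evidence against a random walk in the underlying index.
    """
    signs = [r > 0 for r in returns if r != 0]   # zero returns are dropped
    n1 = sum(signs)                # number of positive returns
    n2 = len(signs) - n1           # number of negative returns
    n = n1 + n2
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))

    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    z = (runs - expected) / sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return runs, z, p_value


# Hypothetical daily log returns with long same-sign stretches.
returns = [0.01] * 15 + [-0.02] * 15 + [0.005] * 15 + [-0.01] * 15
print(runs_test(returns))
```

The normal approximation is reasonable for long daily series such as those described above; for very short series an exact runs distribution would be preferable.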
Abstract:
Although usability evaluations have focused on assessing different contexts of use, no proper specifications have addressed the particular environment of academic websites in the Spanish-speaking context of use. Considering that this context involves hundreds of millions of potential users, the AIPO Association is running the UsabAIPO Project. The ultimate goal is to promote an adequate translation of international standards, methods and ideal values related to usability in order to adapt them to diverse Spanish-related contexts of use. This article presents the main statistical results from the Second and Third Stages of the UsabAIPO Project, in which the UsabAIPO Heuristic method (based on Heuristic Evaluation techniques) and seven Cognitive Walkthroughs were performed on 69 university websites. The planning and execution of the UsabAIPO Heuristic method and the Cognitive Walkthroughs, the definition of two usability metrics, and the outline of the UsabAIPO Heuristic Management System prototype are also sketched.
Abstract:
In this research we propose a new, simple trade-off model for calculating how much debt and, by implication, how much equity a company should have. The model uses readily available information and calculates the cost of debt dynamically on the basis of the effect that the company's capital structure has on the risk of bankruptcy. The proposed model has been applied to the companies that made up the Dow Jones Industrial Average (DJIA) in 2007. We have used consolidated financial data from 1996 to 2006, published by Bloomberg. We have used the simplex optimization method to find the debt level that maximizes firm value. We then compare the estimated debt with the real debt of the companies using the nonparametric Mann-Whitney test. The results indicate that 63% of companies do not show a statistically significant difference between the real and the estimated debt.
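The final comparison step can be illustrated with a hand-written Mann-Whitney U test. This is a sketch only: it uses midranks for ties and the normal approximation without tie or continuity correction, and the debt figures are hypothetical rather than the Bloomberg data used in the study.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, midranks for ties)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2          # average of ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for _, group in pooled[i:j] if group == 0)
        i = j

    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2     # U statistic for sample a
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value


# Hypothetical debt ratios for one company over several years.
real_debt = [0.31, 0.29, 0.35, 0.33, 0.30, 0.34, 0.32]
estimated_debt = [0.28, 0.36, 0.33, 0.31, 0.37, 0.30, 0.35]
u, p_value = mann_whitney_u(real_debt, estimated_debt)
print(f"U = {u}, p = {p_value:.3f}")
```

A p-value above the chosen significance level would count the company among those showing no statistically significant difference between real and estimated debt.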
Abstract:
In the present work we focus on two indices that quantify directionality and skew-symmetrical patterns in social interactions as measures of social reciprocity: the Directional consistency (DC) and Skew symmetry indices. Although both indices enable researchers to describe social groups, most studies require statistical inferential tests. The main aims of the present study are: firstly, to propose an overall statistical technique for testing null hypotheses regarding social reciprocity in behavioral studies, using the DC and Skew symmetry statistics (Φ) at group level; and secondly, to compare both statistics in order to allow researchers to choose the optimal measure depending on the conditions. In order to allow researchers to make statistical decisions, statistical significance for both statistics has been estimated by means of a Monte Carlo simulation. Furthermore, this study will enable researchers to choose the optimal observational conditions for carrying out their research, as the power of the statistical tests has been estimated.
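As a concrete illustration of the first index, the sketch below computes the directional consistency index in its usual form, DC = (H − L) / (H + L), where H sums the larger and L the smaller of the two interaction counts in each dyad. That formula is assumed here rather than quoted from the study, and the function name and example matrix are hypothetical.

```python
def directional_consistency(X):
    """DC = (H - L) / (H + L) over all dyads of a square interaction matrix.

    H accumulates the higher count in each dyad and L the lower one, so DC
    is 0 for fully reciprocal groups and 1 for fully one-directional ones.
    """
    n = len(X)
    high = 0
    low = 0
    for i in range(n):
        for j in range(i + 1, n):
            high += max(X[i][j], X[j][i])
            low += min(X[i][j], X[j][i])
    return (high - low) / (high + low)


# Hypothetical sociomatrix: rows are actors, columns are receivers.
X = [[0, 12, 9], [3, 0, 10], [4, 2, 0]]
print(round(directional_consistency(X), 3))
```

A significance level for an observed DC value can be estimated by simulation, by recomputing the index on matrices generated under the null hypothesis and counting how often the simulated value is at least as large as the observed one.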
Abstract:
The purpose of this master's thesis was to perform simulations that use random numbers to test hypotheses, in particular comparisons of two sample populations by their means, variances or Sharpe ratios. Specifically, we simulated several well-known distributions in Matlab and checked the accuracy of the hypothesis tests. We then examined what happens when the bootstrap method described by Efron is applied to the simulated data. In addition, the RobustSharpe hypothesis test presented in the paper by Ledoit and Wolf was applied to assess the performance of two investment funds, by testing whether there is a statistically significant difference between their Sharpe ratios. We reviewed the literature on the topic and generated as many simulated random numbers in Matlab as our purpose required. The results show that tests are not always accurate, for instance when testing whether two normally distributed random vectors come from the same normal distribution. The Jarque-Bera test for normality classified the normal random vectors r1 and r2 as normal in only 94.7% and 95.7% of cases respectively; in the remaining 5.3% and 4.3% of cases it failed to confirm what was already known to be true. When Efron's bootstrap method was introduced to estimate the p-values on which the test decision is based, the test was 100% accurate. These results suggest that bootstrap methods should always be considered when testing hypotheses or estimating statistics, because in most cases the outcomes are accurate and computational errors are minimized.
The RobustSharpe test, which is known to use one bootstrap method, the studentized one, was also applied, first to different simulated data including distributions of many kinds and shapes, and second to real data on hedge funds and mutual funds. The test performed quite well and agreed with the existence of a statistically significant difference between their Sharpe ratios, as described in the paper by Ledoit and Wolf.
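The bootstrap idea can be shown on the Sharpe ratio comparison in a few lines of Python. The sketch below is deliberately simpler than the RobustSharpe procedure: it uses a plain i.i.d. paired bootstrap with a percentile interval, not the studentized bootstrap of Ledoit and Wolf, and it ignores the risk-free rate. The function names and simulated returns are hypothetical.

```python
import random
from statistics import mean, stdev


def sharpe(returns):
    """Sample Sharpe ratio of excess returns (risk-free rate assumed zero)."""
    return mean(returns) / stdev(returns)


def bootstrap_sharpe_diff(r1, r2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the difference in Sharpe ratios.

    Periods are resampled jointly for both funds so that their
    contemporaneous dependence is preserved.
    """
    rng = random.Random(seed)
    n = len(r1)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a = [r1[i] for i in idx]
        b = [r2[i] for i in idx]
        diffs.append(sharpe(a) - sharpe(b))
    diffs.sort()
    lower = diffs[int(n_boot * alpha / 2)]
    upper = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    observed = sharpe(r1) - sharpe(r2)
    # Declare a significant difference when the interval excludes zero.
    return observed, (lower, upper), not (lower <= 0 <= upper)


# Hypothetical monthly returns for two funds.
gen = random.Random(42)
fund_a = [gen.gauss(0.010, 0.03) for _ in range(120)]
fund_b = [gen.gauss(0.004, 0.05) for _ in range(120)]
print(bootstrap_sharpe_diff(fund_a, fund_b))
```

Financial returns are often autocorrelated and heavy-tailed, which is the motivation for block and studentized variants; the i.i.d. resampling used here does not account for either.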
Abstract:
This paper studies seemingly unrelated linear models with integrated regressors and stationary errors. By adding leads and lags of the first differences of the regressors and estimating this augmented dynamic regression model by feasible generalized least squares using the long-run covariance matrix, we obtain an efficient estimator of the cointegrating vector that has a limiting mixed normal distribution. Simulation results suggest that this new estimator compares favorably with others already proposed in the literature. We apply these new estimators to the testing of purchasing power parity (PPP) among the G-7 countries. The test based on the efficient estimates rejects the PPP hypothesis for most countries.
Abstract:
In this paper we propose exact likelihood-based mean-variance efficiency tests of the market portfolio in the context of the Capital Asset Pricing Model (CAPM), allowing for a wide class of error distributions that includes normality as a special case. These tests are developed in the framework of multivariate linear regressions (MLR). It is well known, however, that despite their simple statistical structure, standard asymptotically justified MLR-based tests are unreliable. In financial econometrics, exact tests have been proposed for a few specific hypotheses [Jobson and Korkie (Journal of Financial Economics, 1982), MacKinlay (Journal of Financial Economics, 1987), Gibbons, Ross and Shanken (Econometrica, 1989), Zhou (Journal of Finance 1993)], most of which depend on normality. For the Gaussian model, our tests correspond to Gibbons, Ross and Shanken’s mean-variance efficiency tests. In non-Gaussian contexts, we reconsider mean-variance efficiency tests allowing for multivariate Student-t and Gaussian mixture errors. Our framework makes it possible to shed more light on whether the normality assumption is too restrictive when testing the CAPM. We also propose exact multivariate diagnostic checks (including tests for multivariate GARCH and a multivariate generalization of the well-known variance ratio tests) and goodness-of-fit tests, as well as a set estimate for the intervening nuisance parameters. Our results [over five-year subperiods] show the following: (i) multivariate normality is rejected in most subperiods, (ii) residual checks reveal no significant departures from the multivariate i.i.d. assumption, and (iii) mean-variance efficiency of the market portfolio is not rejected as frequently once the possibility of non-normal errors is allowed for.
Abstract:
It is well known that standard asymptotic theory is not valid or is extremely unreliable in models with identification problems or weak instruments [Dufour (1997, Econometrica), Staiger and Stock (1997, Econometrica), Wang and Zivot (1998, Econometrica), Stock and Wright (2000, Econometrica), Dufour and Jasiak (2001, International Economic Review)]. One possible way out is to use a variant of the Anderson-Rubin (1949, Ann. Math. Stat.) procedure. The latter, however, allows one to build exact tests and confidence sets only for the full vector of the coefficients of the endogenous explanatory variables in a structural equation, and in general does not allow inference on individual coefficients. This problem may in principle be overcome by using projection techniques [Dufour (1997, Econometrica), Dufour and Jasiak (2001, International Economic Review)]. AR-type procedures are emphasized because they are robust to both weak instruments and instrument exclusion. However, these techniques can be implemented only by using costly numerical techniques. In this paper, we provide a complete analytic solution to the problem of building projection-based confidence sets from Anderson-Rubin-type confidence sets. The solution involves the geometric properties of “quadrics” and can be viewed as an extension of usual confidence intervals and ellipsoids. Only least squares techniques are required for building the confidence intervals. We also study by simulation how “conservative” projection-based confidence sets are. Finally, we illustrate the methods proposed by applying them to three different examples: the relationship between trade and growth in a cross-section of countries, returns to education, and a study of production functions in the U.S. economy.
Abstract:
We discuss statistical inference problems associated with identification and testability in econometrics, and we emphasize the common nature of the two issues. After reviewing the relevant statistical notions, we consider in turn inference in nonparametric models and recent developments on weakly identified models (or weak instruments). We point out that many hypotheses, for which test procedures are commonly proposed, are not testable at all, while some frequently used econometric methods are fundamentally inappropriate for the models considered. Such situations lead to ill-defined statistical problems and are often associated with a misguided use of asymptotic distributional results. Concerning nonparametric hypotheses, we discuss three basic problems for which such difficulties occur: (1) testing a mean (or a moment) under (too) weak distributional assumptions; (2) inference under heteroskedasticity of unknown form; (3) inference in dynamic models with an unlimited number of parameters. Concerning weakly identified models, we stress that valid inference should be based on proper pivotal functions —a condition not satisfied by standard Wald-type methods based on standard errors — and we discuss recent developments in this field, mainly from the viewpoint of building valid tests and confidence sets. The techniques discussed include alternative proposed statistics, bounds, projection, split-sampling, conditioning, Monte Carlo tests. The possibility of deriving a finite-sample distributional theory, robustness to the presence of weak instruments, and robustness to the specification of a model for endogenous explanatory variables are stressed as important criteria assessing alternative procedures.