103 results for HYPOTHESIS TESTS
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
This paper analyzes whether standard covariance matrix tests work when dimensionality is large, and in particular larger than sample size. In the latter case, the singularity of the sample covariance matrix makes likelihood ratio tests degenerate, but other tests based on quadratic forms of sample covariance matrix eigenvalues remain well-defined. We study the consistency property and limiting distribution of these tests as dimensionality and sample size go to infinity together, with their ratio converging to a finite non-zero limit. We find that the existing test for sphericity is robust against high dimensionality, but not the test for equality of the covariance matrix to a given matrix. For the latter test, we develop a new correction to the existing test statistic that makes it robust against high dimensionality.
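As a rough illustration of a test statistic built from quadratic forms of the sample covariance eigenvalues, the sketch below computes a sphericity statistic of the form U = (1/p) tr[(S / ((1/p) tr S) - I)^2]. That this is the "existing test for sphericity" the abstract refers to is an assumption, and the function name `sphericity_statistic` is hypothetical. The point of the sketch is only that U stays well-defined when the dimension p exceeds the sample size n, where the sample covariance matrix S is singular.

```python
import numpy as np

def sphericity_statistic(X):
    """Return U = (1/p) * tr[(S / (tr(S)/p) - I)^2] for an (n, p) data matrix X.

    Assumption: this is an illustrative sphericity statistic, not necessarily
    the exact statistic studied in the paper.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    S = np.cov(X, rowvar=False)       # sample covariance, shape (p, p)
    scaled = S / (np.trace(S) / p)    # rescale so the average eigenvalue is 1
    D = scaled - np.eye(p)            # deviation from the identity matrix
    return float(np.trace(D @ D) / p)

# Usage: p = 20 variables but only n = 5 observations, so S is singular,
# yet the statistic is still a finite number.
rng = np.random.default_rng(0)
print(sphericity_statistic(rng.normal(size=(5, 20))))
```

U is zero exactly when S is proportional to the identity and grows as the eigenvalues of S spread out.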
Abstract:
It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. In this paper, we suggest a stepwise multiple testing procedure which asymptotically controls the familywise error rate at a desired level. Compared to related single-step methods, our procedure is more powerful in the sense that it often will reject more false hypotheses. In addition, we advocate the use of studentization when it is feasible. Unlike some stepwise methods, our method implicitly captures the joint dependence structure of the test statistics, which results in increased ability to detect alternative hypotheses. We prove our method asymptotically controls the familywise error rate under minimal assumptions. We present our methodology in the context of comparing several strategies to a common benchmark and deciding which strategies actually beat the benchmark. However, our ideas can easily be extended and/or modified to other contexts, such as making inference for the individual regression coefficients in a multiple regression framework. Some simulation studies show the improvements of our methods over previous proposals. We also provide an application to a set of real data.
Abstract:
Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data.
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
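To make the multiple-testing idea concrete, here is a minimal sketch of single-step maxT adjusted p-values computed by permuting the trait. It is not the MB-MDR implementation: the test statistic (absolute Pearson correlation between each feature and the trait) is a stand-in chosen for brevity, and the function name `maxt_adjusted_pvalues` is hypothetical. The sketch keeps only one maximum statistic per permutation rather than a full tests-by-permutations table, which is one simple way to stop the permutation storage from growing with the number of tests.

```python
import numpy as np

def maxt_adjusted_pvalues(X, y, n_perm=999, seed=0):
    """Single-step maxT adjusted p-values for the columns of X against trait y.

    Assumptions: X is (n, m) with non-constant columns, y is (n,), and the
    statistic is |Pearson correlation| (an illustrative stand-in).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature

    def stats(v):
        vc = (v - v.mean()) / v.std()           # standardize the trait
        return np.abs(Xc.T @ vc) / len(v)       # |correlation| per feature

    observed = stats(y)
    # Store only the largest statistic from each permutation of the trait.
    max_null = np.array([stats(rng.permutation(y)).max() for _ in range(n_perm)])
    max_null.sort()
    # Count permutation maxima that are >= each observed statistic.
    n_ge = n_perm - np.searchsorted(max_null, observed, side="left")
    return (1 + n_ge) / (n_perm + 1)

# Usage: feature 0 is built to track the trait; the others are pure noise.
rng = np.random.default_rng(1)
y = rng.normal(size=100)
X = rng.normal(size=(100, 5))
X[:, 0] = y + 0.1 * rng.normal(size=100)
print(maxt_adjusted_pvalues(X, y, n_perm=199))
```

Comparing every observed statistic with the permutation distribution of the maximum is what controls the familywise error rate across all tests at once.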
Abstract:
We prove the algebraic equality of Jennrich's (1970) asymptotic $X^2$ test for equality of correlation matrices and a Wald test statistic derived from Neudecker and Wesselman's (1990) expression of the asymptotic variance matrix of the sample correlation matrix.
Abstract:
We present a new method for constructing exact distribution-free tests (and confidence intervals) for variables that can generate more than two possible outcomes. This method separates the search for an exact test from the goal to create a non-randomized test. Randomization is used to extend any exact test relating to means of variables with finitely many outcomes to variables with outcomes belonging to a given bounded set. Tests in terms of variance and covariance are reduced to tests relating to means. Randomness is then eliminated in a separate step. This method is used to create confidence intervals for the difference between two means (or variances) and tests of stochastic inequality and correlation.
Abstract:
Small sample properties are of fundamental interest when only limited data is available. Exact inference is limited by constraints imposed by specific nonrandomized tests and of course also by lack of more data. These effects can be separated as we propose to evaluate a test by comparing its type II error to the minimal type II error among all tests for the given sample. Game theory is used to establish this minimal type II error; the associated randomized test is characterized as part of a Nash equilibrium of a fictitious game against nature. We use this method to investigate sequential tests for the difference between two means when outcomes are constrained to belong to a given bounded set. Tests of inequality and of noninferiority are included. We find that inference in terms of type II error based on a balanced sample cannot be improved by sequential sampling or even by observing counterfactual evidence, provided there is a reasonable gap between the hypotheses.
Abstract:
We present an exact test for whether two random variables that have known bounds on their support are negatively correlated. The alternative hypothesis is that they are not negatively correlated. No assumptions are made on the underlying distributions. We show by example that the Spearman rank correlation test, as the competing exact test of correlation in nonparametric settings, rests on an additional assumption on the data generating process without which it is not valid as a test for correlation. We then show how to test for the significance of the slope in a linear regression analysis that involves a single independent variable and where outcomes of the dependent variable belong to a known bounded set.
Abstract:
This paper tests hysteresis effects in unemployment using panel data for 19 OECD countries covering the period 1956-2001. The tests exploit the cross-section variations of the series and, additionally, allow for a different number of endogenous breakpoints in the unemployment series. The critical values are simulated based on our specific panel sizes and time periods. The findings stress the importance of accounting for exogenous shocks in the series and give support to the natural-rate hypothesis of unemployment for the majority of the countries analyzed.
Abstract:
In the first part of the study, nine estimators of the first-order autoregressive parameter are reviewed and a new estimator is proposed. The relationships and discrepancies between the estimators are discussed in order to achieve a clear differentiation. In the second part of the study, the precision in the estimation of autocorrelation is studied. The performance of the ten lag-one autocorrelation estimators is compared in terms of Mean Square Error (combining bias and variance) using data series generated by Monte Carlo simulation. The results show that there is not a single optimal estimator for all conditions, suggesting that the estimator ought to be chosen according to sample size and to the information available about the possible direction of the serial dependence. Additionally, the probability of labelling an actually existing autocorrelation as statistically significant is explored using Monte Carlo sampling. The power estimates obtained are quite similar among the tests associated with the different estimators. These estimates show the low probability of detecting autocorrelation in series with fewer than 20 measurement times.
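The Monte Carlo comparison described above can be sketched for a single estimator. The code below uses the conventional lag-one autocorrelation estimator (sum of lagged cross-products of deviations divided by the sum of squared deviations); whether it matches any of the ten estimators in the study is an assumption, and the function names, the Gaussian AR(1) generator, and the default settings are all hypothetical choices made for illustration.

```python
import numpy as np

def lag_one_autocorrelation(x):
    """Conventional r1: lagged cross-products over the total sum of squares."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

def simulate_ar1(phi, n, rng):
    """Generate a stationary Gaussian AR(1) series x_t = phi * x_{t-1} + e_t."""
    x = np.empty(n)
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - phi ** 2))  # stationary start
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

def monte_carlo_mse(phi, n, n_rep=2000, seed=0):
    """Return (MSE, bias) of r1 over n_rep simulated series of length n."""
    rng = np.random.default_rng(seed)
    est = np.array([lag_one_autocorrelation(simulate_ar1(phi, n, rng))
                    for _ in range(n_rep)])
    bias = est.mean() - phi
    return bias ** 2 + est.var(), bias   # MSE = squared bias + variance

# Usage: a short series of 10 measurement times with true phi = 0.5.
mse, bias = monte_carlo_mse(phi=0.5, n=10)
print(mse, bias)
```

Repeating the call over a grid of `phi` and `n` values, and swapping in other estimators, would reproduce the general shape of such a comparison.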
Abstract:
Monte Carlo simulations were used to generate data for ABAB designs of different lengths. The points of change in phase are randomly determined before gathering behaviour measurements, which allows the use of a randomization test as an analytic technique. Data simulation and analysis can be based either on data-division-specific or on common distributions. Following one method or another affects the results obtained after the randomization test has been applied. Therefore, the goal of the study was to examine these effects in more detail. The discrepancies in these approaches are obvious when data with zero treatment effect are considered and such approaches have implications for statistical power studies. Data-division-specific distributions provide more detailed information about the performance of the statistical technique.
Abstract:
While a growing literature examining the relationship between income and health expenditure suggests that health care is a luxury good (income elasticity greater than one), this conclusion is continually debated given the heterogeneity of the results. This article tests the hypothesis that health care is a luxury good using meta-regression analysis, in particular examining the existence of publication selection bias, precision bias, and aggregation bias. The results point to the existence of publication bias, which is robust regardless of the controls analyzed. Precision and aggregation biases appear to play a role in generating the income elasticity estimates. Our results suggest that the income elasticity of health care, once corrected for these biases, ranges between 0.26 and 0.84, although we cannot reject that the income elasticity equals one in some of the corrected estimates.
Abstract:
The Hausman (1978) test is based on the vector of differences of two estimators. It is usually assumed that one of the estimators is fully efficient, since this simplifies calculation of the test statistic. However, this assumption limits the applicability of the test, since widely used estimators such as the generalized method of moments (GMM) or quasi maximum likelihood (QML) are often not fully efficient. This paper shows that the test may easily be implemented, using well-known methods, when neither estimator is efficient. To illustrate, we present both simulation results and empirical results for utilization of health care services.
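For orientation, the general form of the statistic can be written as follows. The notation is introduced here for illustration and is not taken from the paper: $\hat\theta_1$ and $\hat\theta_2$ are the two estimators, $V_1$ and $V_2$ their variance matrices, and $C$ their covariance matrix.

$$H = (\hat\theta_1 - \hat\theta_2)' \, \big[\widehat{\mathrm{Var}}(\hat\theta_1 - \hat\theta_2)\big]^{-1} (\hat\theta_1 - \hat\theta_2)$$

$$\mathrm{Var}(\hat\theta_1 - \hat\theta_2) = V_1 + V_2 - C - C'$$

When $\hat\theta_2$ is fully efficient, $C = V_2$ asymptotically and the variance reduces to $V_1 - V_2$, which is the simplification the efficiency assumption buys. Without it, the covariance term $C$ has to be estimated as well.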