953 resultados para Two-sample tests
Resumo:
Sizes and power of selected two-sample tests of the equality of survival distributions are compared by simulation for small samples from unequally, randomly-censored exponential distributions. The tests investigated include parametric tests (F, Score, Likelihood, Asymptotic), logrank tests (Mantel, Peto-Peto), and Wilcoxon-Type tests (Gehan, Prentice). Equal sized samples, n = 18, 16, 32 with 1000 (size) and 500 (power) simulation trials, are compared for 16 combinations of the censoring proportions 0%, 20%, 40%, and 60%. For n = 8 and 16, the Asymptotic, Peto-Peto, and Wilcoxon tests perform at nominal 5% size expectations, but the F, Score and Mantel tests exceeded 5% size confidence limits for 1/3 of the censoring combinations. For n = 32, all tests showed proper size, with the Peto-Peto test most conservative in the presence of unequal censoring. Powers of all tests are compared for exponential hazard ratios of 1.4 and 2.0. There is little difference in power characteristics of the tests within the classes of tests considered. The Mantel test showed 90% to 95% power efficiency relative to parametric tests. Wilcoxon-type tests have the lowest relative power but are robust to differential censoring patterns. A modified Peto-Peto test shows power comparable to the Mantel test. For n = 32, a specific Weibull-exponential comparison of crossing survival curves suggests that the relative powers of logrank and Wilcoxon-type tests are dependent on the scale parameter of the Weibull distribution. Wilcoxon-type tests appear more powerful than logrank tests in the case of late-crossing and less powerful for early-crossing survival curves. Guidelines for the appropriate selection of two-sample tests are given. ^
Resumo:
In this paper, we study several tests for the equality of two unknown distributions. Two are based on empirical distribution functions, three others on nonparametric probability density estimates, and the last ones on differences between sample moments. We suggest controlling the size of such tests (under nonparametric assumptions) by using permutational versions of the tests jointly with the method of Monte Carlo tests properly adjusted to deal with discrete distributions. We also propose a combined test procedure, whose level is again perfectly controlled through the Monte Carlo test technique and has better power properties than the individual tests that are combined. Finally, in a simulation experiment, we show that the technique suggested provides perfect control of test size and that the new tests proposed can yield sizeable power improvements.
Resumo:
The diagnosis of leprosy continues to be based on clinical symptoms and early diagnosis and treatment are critical to preventing disability and transmission. Sensitive and specific laboratory tests are not available for diagnosing leprosy. Despite the limited applicability of anti-phenolic glycolipid-I (PGL-I) serology for diagnosis, it has been suggested as an additional tool to classify leprosy patients (LPs) for treatment purposes. Two formats of rapid tests to detect anti-PGL-I antibodies [ML immunochromatography assay (ICA) and ML Flow] were compared in different groups, multibacillary patients, paucibacillary patients, household contacts and healthy controls in Brazil and Nepal. High ML Flow intra-test concordance was observed and low to moderate agreement between the results of ML ICA and ML Flow tests on the serum of LPs was observed. LPs were "seroclassified" according to the results of these tests and the seroclassification was compared to other currently used classification systems: the World Health Organization operational classification, the bacilloscopic index and the Ridley-Jopling classification. When analysing the usefulness of these tests in the operational classification of PB and MB leprosy for treatment and follow-up purposes, the ML Flow test was the best point-of-care test for subjects in Nepal and despite the need for sample dilution, the ML ICA test yielded better performance among Brazilian subjects. Our results identified possible ways to improve the performance of both tests.
Resumo:
It is proved the algebraic equality between Jennrich's (1970) asymptotic$X^2$ test for equality of correlation matrices, and a Wald test statisticderived from Neudecker and Wesselman's (1990) expression of theasymptoticvariance matrix of the sample correlation matrix.
Resumo:
In the context of multivariate linear regression (MLR) models, it is well known that commonly employed asymptotic test criteria are seriously biased towards overrejection. In this paper, we propose a general method for constructing exact tests of possibly nonlinear hypotheses on the coefficients of MLR systems. For the case of uniform linear hypotheses, we present exact distributional invariance results concerning several standard test criteria. These include Wilks' likelihood ratio (LR) criterion as well as trace and maximum root criteria. The normality assumption is not necessary for most of the results to hold. Implications for inference are two-fold. First, invariance to nuisance parameters entails that the technique of Monte Carlo tests can be applied on all these statistics to obtain exact tests of uniform linear hypotheses. Second, the invariance property of the latter statistic is exploited to derive general nuisance-parameter-free bounds on the distribution of the LR statistic for arbitrary hypotheses. Even though it may be difficult to compute these bounds analytically, they can easily be simulated, hence yielding exact bounds Monte Carlo tests. Illustrative simulation experiments show that the bounds are sufficiently tight to provide conclusive results with a high probability. Our findings illustrate the value of the bounds as a tool to be used in conjunction with more traditional simulation-based test methods (e.g., the parametric bootstrap) which may be applied when the bounds are not conclusive.
Resumo:
This paper considers two-sided tests for the parameter of an endogenous variable in an instrumental variable (IV) model with heteroskedastic and autocorrelated errors. We develop the nite-sample theory of weighted-average power (WAP) tests with normal errors and a known long-run variance. We introduce two weights which are invariant to orthogonal transformations of the instruments; e.g., changing the order in which the instruments appear. While tests using the MM1 weight can be severely biased, optimal tests based on the MM2 weight are naturally two-sided when errors are homoskedastic. We propose two boundary conditions that yield two-sided tests whether errors are homoskedastic or not. The locally unbiased (LU) condition is related to the power around the null hypothesis and is a weaker requirement than unbiasedness. The strongly unbiased (SU) condition is more restrictive than LU, but the associated WAP tests are easier to implement. Several tests are SU in nite samples or asymptotically, including tests robust to weak IV (such as the Anderson-Rubin, score, conditional quasi-likelihood ratio, and I. Andrews' (2015) PI-CLC tests) and two-sided tests which are optimal when the sample size is large and instruments are strong. We refer to the WAP-SU tests based on our weights as MM1-SU and MM2-SU tests. Dropping the restrictive assumptions of normality and known variance, the theory is shown to remain valid at the cost of asymptotic approximations. The MM2-SU test is optimal under the strong IV asymptotics, and outperforms other existing tests under the weak IV asymptotics.
Resumo:
In this note, we show that an extension of a test for perfect ranking in a balanced ranked set sample given by Li and Balakrishnan (2008) to the multi-cycle case turns out to be equivalent to the test statistic proposed by Frey et al. (2007). This provides an alternative interpretation and motivation for their test statistic.
Resumo:
Testing for two-sample differences is challenging when the differences are local and only involve a small portion of the data. To solve this problem, we apply a multi- resolution scanning framework that performs dependent local tests on subsets of the sample space. We use a nested dyadic partition of the sample space to get a collection of windows and test for sample differences within each window. We put a joint prior on the states of local hypotheses that allows both vertical and horizontal message passing among the partition tree to reflect the spatial dependency features among windows. This information passing framework is critical to detect local sample differences. We use both the loopy belief propagation algorithm and MCMC to get the posterior null probability on each window. These probabilities are then used to report sample differences based on decision procedures. Simulation studies are conducted to illustrate the performance. Multiple testing adjustment and convergence of the algorithms are also discussed.
Resumo:
Abstract
Resumo:
A wide range of tests for heteroskedasticity have been proposed in the econometric and statistics literature. Although a few exact homoskedasticity tests are available, the commonly employed procedures are quite generally based on asymptotic approximations which may not provide good size control in finite samples. There has been a number of recent studies that seek to improve the reliability of common heteroskedasticity tests using Edgeworth, Bartlett, jackknife and bootstrap methods. Yet the latter remain approximate. In this paper, we describe a solution to the problem of controlling the size of homoskedasticity tests in linear regression contexts. We study procedures based on the standard test statistics [e.g., the Goldfeld-Quandt, Glejser, Bartlett, Cochran, Hartley, Breusch-Pagan-Godfrey, White and Szroeter criteria] as well as tests for autoregressive conditional heteroskedasticity (ARCH-type models). We also suggest several extensions of the existing procedures (sup-type of combined test statistics) to allow for unknown breakpoints in the error variance. We exploit the technique of Monte Carlo tests to obtain provably exact p-values, for both the standard and the new tests suggested. We show that the MC test procedure conveniently solves the intractable null distribution problem, in particular those raised by the sup-type and combined test statistics as well as (when relevant) unidentified nuisance parameter problems under the null hypothesis. The method proposed works in exactly the same way with both Gaussian and non-Gaussian disturbance distributions [such as heavy-tailed or stable distributions]. The performance of the procedures is examined by simulation. The Monte Carlo experiments conducted focus on : (1) ARCH, GARCH, and ARCH-in-mean alternatives; (2) the case where the variance increases monotonically with : (i) one exogenous variable, and (ii) the mean of the dependent variable; (3) grouped heteroskedasticity; (4) breaks in variance at unknown points. We find that the proposed tests achieve perfect size control and have good power.
Resumo:
This paper presents a comparison of two tests designed to predict which hearing impaired patients may benefit from high frequency amplification.
Resumo:
The article considers screening human populations with two screening tests. If any of the two tests is positive, then full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, then disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 × 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture–recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow associations of the screening test, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture–recapture estimator and easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status in which the superior behavior of the new estimators compared to the previous conventional ones can be shown. Finally, the performance of the new estimators is compared with maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate the loss of efficiency as minor.
Resumo:
The article considers screening human populations with two screening tests. If any of the two tests is positive, then full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, then disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 x 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture-recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow associations of the screening test, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture-recapture estimator and easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status in which the superior behavior of the new estimators compared to the previous conventional ones can be shown. Finally, the performance of the new estimators is compared with maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate the loss of efficiency as minor.
Resumo:
Sensitivity and specificity are measures that allow us to evaluate the performance of a diagnostic test. In practice, it is common to have situations where a proportion of selected individuals cannot have the real state of the disease verified, since the verification could be an invasive procedure, as occurs with biopsy. This happens, as a special case, in the diagnosis of prostate cancer, or in any other situation related to risks, that is, not practicable, nor ethical, or in situations with high cost. For this case, it is common to use diagnostic tests based only on the information of verified individuals. This procedure can lead to biased results or workup bias. In this paper, we introduce a Bayesian approach to estimate the sensitivity and the specificity for two diagnostic tests considering verified and unverified individuals, a result that generalizes the usual situation based on only one diagnostic test.