965 results for chi-square distance
Abstract:
Swain corrects the chi-square overidentification test (i.e., the likelihood ratio test of fit) for structural equation models with or without latent variables. The chi-square statistic is asymptotically correct; however, it does not behave as expected in small samples and/or when the model is complex (cf. Herzog, Boomsma, & Reinecke, 2007). Thus, particularly when the ratio of sample size (n) to the number of parameters estimated (p) is relatively small (i.e., the p-to-n ratio is large), the chi-square test will tend to over-reject correctly specified models. To obtain a closer approximation to the chi-square distribution, Swain (1975) developed a correction: a scaling factor, which converges to 1 asymptotically, is multiplied by the chi-square statistic. The corrected statistic better approximates the chi-square distribution, resulting in more appropriate Type I error rates (see Herzog & Boomsma, 2009; Herzog et al., 2007).
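The mechanics of such a correction can be sketched generically: the scaled statistic is referred to the same chi-square reference distribution, and because the scaling factor is below 1 in small samples, the corrected p-value is larger and over-rejection is reduced. A minimal Python sketch, in which the factor `c` is left as an input (Swain's exact formula, a function of sample size and the number of estimated parameters, is not reproduced here):

```python
from scipy.stats import chi2

def corrected_chi_square_pvalue(t, df, c):
    """p-value of a scaled chi-square fit statistic.

    t  : uncorrected likelihood-ratio chi-square statistic
    df : model degrees of freedom
    c  : small-sample scaling factor in (0, 1]; converges to 1
         as the sample size grows (Swain's formula not reproduced)
    """
    # scale the statistic, then use the usual chi-square reference
    return chi2.sf(c * t, df)
```

With c < 1 the scaled statistic is smaller, so the corrected p-value is larger than the uncorrected one and fewer correctly specified models are rejected.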
Abstract:
A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler's (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say ${\cal M}_0$, implies on a less restricted one, ${\cal M}_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with ${\cal M}_0$ and ${\cal M}_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models ${\cal M}_0$ and ${\cal M}_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, i.e., nonasymptotic and nonnormal, applications. In a recent paper, Satorra (1999) shows that the difference between two Satorra-Bentler scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models ${\cal M}_0$ and ${\cal M}_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics.
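The computation described, recovering each model's scaling factor $c_i = T_i/\bar{T}_i$ from its reported scaled statistic and combining them, can be sketched in a few lines of Python (variable names are illustrative):

```python
def sb_scaled_difference(t0, t1, d0, d1, t0_scaled, t1_scaled):
    """Scaled difference chi-square test from two SB-scaled fit statistics.

    t0, t1               : unscaled chi-square statistics of M0 (null) and M1
    d0, d1               : their degrees of freedom (d0 > d1)
    t0_scaled, t1_scaled : the corresponding SB-scaled statistics
    Returns (scaled difference statistic, difference in degrees of freedom).
    """
    c0 = t0 / t0_scaled                      # scaling factor of M0
    c1 = t1 / t1_scaled                      # scaling factor of M1
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)     # scaling factor of the difference
    return (t0 - t1) / cd, d0 - d1
```

When the data are normal, both scaling factors equal 1 and the statistic reduces to the naive difference $T_0 - T_1$.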
Abstract:
This study explored the reduction of adenosine triphosphate (ATP) levels in L-02 hepatocytes by hexavalent chromium (Cr(VI)) using chi-square analysis. Cells were treated with 2, 4, 8, 16, or 32 μM Cr(VI) for 12, 24, or 36 h. Methyl thiazolyl tetrazolium (MTT) experiments and measurements of intracellular ATP levels were performed by spectrophotometry or bioluminescence assays following Cr(VI) treatment. The chi-square test was used to determine the difference between cell survival rate and ATP levels. For the chi-square analysis, the results of the MTT or ATP experiments were transformed into a relative ratio with respect to the control (%). The relative ATP levels increased at 12 h, decreased at 24 h, and increased slightly again at 36 h following 4, 8, 16, 32 μM Cr(VI) treatment, corresponding to a "V-shaped" curve. Furthermore, the results of the chi-square analysis demonstrated a significant difference of the ATP level in the 32-μM Cr(VI) group (P < 0.05). The results suggest that the chi-square test can be applied to analyze the interference effects of Cr(VI) on ATP levels in L-02 hepatocytes. The decreased ATP levels at 24 h indicated disruption of mitochondrial energy metabolism and the slight increase of ATP levels at 36 h indicated partial recovery of mitochondrial function or activated glycolysis in L-02 hepatocytes.
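The analysis strategy described, a chi-square comparison of treated relative levels against the 100% control baseline, can be sketched as follows; the numbers are hypothetical, not the study's data:

```python
from scipy.stats import chi2

# relative ATP levels (% of control) at 12, 24 and 36 h for one Cr(VI)
# dose -- hypothetical illustrative values, not the study's measurements
observed = [110.0, 70.0, 85.0]
expected = [100.0, 100.0, 100.0]   # control baseline (100% by definition)

# Pearson chi-square statistic computed directly on the relative ratios
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2.sf(stat, df=len(observed) - 1)
```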
Abstract:
In this paper, we consider the non-central chi-square chart with two-stage sampling. During the first stage, one item of the sample is inspected and, depending on the result, the sampling is either interrupted or it goes on to the second stage, where the remaining sample items are inspected and the non-central chi-square statistic is computed. The proposed chart is not only more sensitive than the joint X̄ and R charts, but operationally simpler too, particularly when appropriate devices, such as go/no-go gauges, can be used to decide whether the sampling should go on to the second stage. (c) 2004 Elsevier B.V. All rights reserved.
Abstract:
Traditionally, an X̄-chart is used to control the process mean and an R-chart to control the process variance. However, these charts are not sensitive to small changes in process parameters. A good alternative to these charts is the exponentially weighted moving average (EWMA) control chart for controlling the process mean and variability, which is very effective in detecting small process disturbances. In this paper, we propose a single chart based on the non-central chi-square statistic, which is more effective than the joint X̄ and R charts in detecting assignable cause(s) that change the process mean and/or increase variability. It is also shown that the EWMA control chart based on the non-central chi-square statistic is more effective in detecting both increases and decreases in the mean and/or variability.
Abstract:
Throughout this article, it is assumed that the non-central chi-square chart with two-stage sampling (TSS chi-square chart) is employed to monitor a process in which the observations of the quality characteristic of interest X are independent and identically normally distributed with mean $\mu$ and variance $\sigma^2$. The process starts with the mean and the variance on target ($\mu = \mu_0$; $\sigma^2 = \sigma_0^2$), but at some random time in the future an assignable cause shifts the mean from $\mu_0$ to $\mu_1 = \mu_0 \pm \delta\sigma_0$, $\delta > 0$, and/or increases the variance from $\sigma_0^2$ to $\sigma_1^2 = \gamma^2\sigma_0^2$, $\gamma > 1$. Before the assignable cause occurs, the process is considered to be in a state of statistical control (the in-control state). As with the Shewhart charts, samples of size $n_0 + 1$ are taken from the process at regular time intervals. The sampling is performed in two stages. At the first stage, the first item of the $i$-th sample is inspected. If its X value, say $X_{i1}$, is close to the target value ($|X_{i1} - \mu_0| < w_0\sigma_0$, $w_0 > 0$), the sampling is interrupted. Otherwise, at the second stage, the remaining $n_0$ items are inspected and the following statistic is computed:

$$W_i = \sum_{j=2}^{n_0+1} (X_{ij} - \mu_0 + \xi_i\sigma_0)^2, \quad i = 1, 2, \ldots$$

Let $d$ be a positive constant; then $\xi_i = d$ if $X_{i1} > \mu_0$ and $\xi_i = -d$ otherwise. A signal is given at sample $i$ if $|X_{i1} - \mu_0| > w_0\sigma_0$ and $W_i > k_{\chi}\sigma_0^2$, where $k_{\chi}$ is the factor used in determining the upper control limit of the non-central chi-square chart. If devices such as go/no-go gauges can be used, measurements are required only when the sampling goes on to the second stage. Let $P$ be the probability of deciding that the process is in control and $P_i$, $i = 1, 2$, the probability of deciding that the process is in control at stage $i$ of the sampling procedure.
Thus $P = P_1 + P_2 - P_1P_2$, with

$$P_1 = \Pr[\mu_0 - w_0\sigma_0 \le X \le \mu_0 + w_0\sigma_0], \qquad P_2 = \Pr[W \le k_{\chi}\sigma_0^2].$$

During the in-control period, $W/\sigma_0^2$ follows a non-central chi-square distribution with $n_0$ degrees of freedom and non-centrality parameter $\lambda_0 = n_0 d^2$, i.e., $W/\sigma_0^2 \sim \chi^2_{n_0}(\lambda_0)$. During the out-of-control period, $W/\sigma_1^2$ follows a non-central chi-square distribution with $n_0$ degrees of freedom and non-centrality parameter $\lambda_1 = n_0(\delta + \xi)^2/\gamma^2$. The effectiveness of a control chart in detecting a process change can be measured by the average run length (ARL), the expected number of samples until the chart signals a process shift. The ARL of the proposed chart is easily determined because the number of samples before a signal is a geometrically distributed random variable with parameter $1 - P$, that is, $\mathrm{ARL} = 1/(1 - P)$. It is shown that the performance of the proposed chart is better than that of the joint X̄ and R charts. Furthermore, if the TSS chi-square chart is used to monitor diameters, volumes, weights, etc., appropriate devices, such as go/no-go gauges, can be used to decide whether the sampling should go on to the second stage. When the process is stable and the joint X̄ and R charts are in use, the monitoring becomes monotonous because an X̄ or R value rarely falls outside the control limits. The natural consequence is that the user pays less and less attention to the steps required to obtain the X̄ and R values; in some cases, this lack of attention can result in serious mistakes. The TSS chi-square chart has the advantage that most samplings are interrupted at the first stage; consequently, most of the time the user works with attributes. Our experience shows that inspecting one item by attribute is much less monotonous than measuring four or five items at each sampling.
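Under the definitions above, $P$ and the ARL can be computed directly with SciPy. A minimal sketch (exact in the in-control case, where $\xi_i = \pm d$ is symmetric; for a shifted mean it simplifies by fixing $\xi_i = d$, whereas the exact calculation mixes two non-centrality parameters according to the sign of the first item's deviation):

```python
from scipy.stats import norm, ncx2

def tss_chisquare_arl(n0, d, w0, k, delta=0.0, gamma=1.0):
    """Average run length of the TSS chi-square chart (sketch).

    n0, d, w0, k : chart parameters as defined in the abstract
    delta, gamma : standardized mean shift and variance inflation
                   (delta = 0, gamma = 1 gives the in-control ARL)
    Simplification: xi_i is fixed at +d, exact in control by symmetry.
    """
    # P1: first item inside the warning limits, sampling interrupted;
    # in sigma_0 units, X ~ N(delta, gamma^2)
    p1 = norm.cdf((w0 - delta) / gamma) - norm.cdf((-w0 - delta) / gamma)
    # non-centrality parameter lambda = n0 * (delta + d)^2 / gamma^2
    lam = n0 * (delta + d) ** 2 / gamma ** 2
    # P2: Pr[W <= k * sigma_0^2], with W / sigma_1^2 ~ ncx2(n0, lam)
    p2 = ncx2.cdf(k / gamma ** 2, n0, lam)
    # probability of an in-control decision, then geometric run length
    p = p1 + p2 - p1 * p2
    return 1.0 / (1.0 - p)
```

As expected, inflating the variance (gamma > 1) shortens the run length, i.e. the chart signals sooner.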
Abstract:
The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R × C contingency table of discrete data that explains the dependence of observations. The statistical power of the method will be determined empirically by computer simulation to judge its efficiency relative to existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (total sample size over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying inter-population DNA variation within the racial groups and, where such variation is significant, will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.
Abstract:
"NAVWEPS report 7770. NOTS TP 2749."
Abstract:
When the data are counts or frequencies of particular events and can be expressed as a contingency table, they can be analysed using the chi-square distribution. When applied to a 2 × 2 table, the test is approximate, and care needs to be taken when the expected frequencies are small, either by applying Yates's correction or by using Fisher's exact test. Larger contingency tables can also be analysed using this method. Note that it is a serious statistical error to use any of these tests on measurement data!
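A minimal SciPy illustration of these tests on a hypothetical 2 × 2 table (the counts are invented for illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# hypothetical 2 x 2 table of counts (e.g. outcome by treatment group)
table = np.array([[12, 5],
                  [8, 15]])

# chi-square test with Yates's continuity correction (the default for 2 x 2)
stat, p_chi2, dof, expected = chi2_contingency(table, correction=True)

# when expected frequencies are small, Fisher's exact test is preferred
odds_ratio, p_fisher = fisher_exact(table)
```

The `expected` array returned by `chi2_contingency` is what should be checked for small values before trusting the approximate test.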
Abstract:
Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has frequently been criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low-contributing rare objects.
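The chi-square distance referred to here, between row profiles with coordinates weighted inversely by the column masses, can be computed directly from a table of counts; a short NumPy sketch:

```python
import numpy as np

def chi_square_distances(N):
    """Pairwise chi-square distances between the rows of a count table N."""
    P = N / N.sum()                 # correspondence matrix (relative counts)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    profiles = P / r[:, None]       # row profiles (rows summing to 1)
    # d(i, k) = sqrt( sum_j (p_ij - p_kj)^2 / c_j )
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / c).sum(axis=2))
```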
Abstract:
Power transformations of positive data tables, prior to applying the correspondence analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power the original data and perform a correspondence analysis; this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to the contingency ratios, that is, the values in the table relative to expected values based on the marginals; this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional; and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.
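The convergence to log-ratios rests on the Box-Cox-type limit $(x^\alpha - 1)/\alpha \to \log x$ as $\alpha \to 0$; a small NumPy check of this limit (the function name is illustrative):

```python
import numpy as np

def power_transform(x, alpha):
    # Box-Cox-type transform underlying power CA; tends to log(x) as alpha -> 0
    return (x ** alpha - 1.0) / alpha

x = np.array([0.5, 1.0, 2.0, 4.0])
# for small alpha, power_transform(x, alpha) is numerically close to np.log(x)
```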
Abstract:
This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables as well as preserving all the good properties of principal axis methods such as principal components and correspondence analysis, based on the singular-value decomposition, including the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.
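The extra weight-estimation step can be sketched as a direct stress minimization over per-variable weights; a minimal SciPy sketch (function name and optimizer choice are illustrative, not the paper's algorithm):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import pdist

def wmds_weights(X, target_d):
    """Estimate per-variable weights so that weighted Euclidean distances
    between the rows of X best match target_d (condensed distance vector).
    A sketch of the weight-estimation step only."""
    p = X.shape[1]

    def stress(log_w):
        w = np.exp(log_w)              # optimize in log space: weights > 0
        d = pdist(X * np.sqrt(w))      # sqrt(sum_j w_j (x_ij - x_kj)^2)
        return np.sum((d - target_d) ** 2)

    res = minimize(stress, np.zeros(p), method="Nelder-Mead")
    return np.exp(res.x)
```

After the weights are estimated, the weighted data matrix can be passed to any classical SVD-based scaling routine.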
Abstract:
Subcompositional coherence is a fundamental property of Aitchison's approach to compositional data analysis, and is the principal justification for using ratios of components. We maintain, however, that lack of subcompositional coherence, that is, incoherence, can be measured in an attempt to evaluate whether any given technique is close enough, for all practical purposes, to being subcompositionally coherent. This opens up the field to alternative methods, which might be better suited to cope with problems such as data zeros and outliers while being only slightly incoherent. The measure that we propose is based on the distance measure between components. We show that two-part subcompositions, which appear to be the most sensitive to subcompositional incoherence, can be used to establish a distance matrix that can be directly compared with the pairwise distances in the full composition. The closeness of these two matrices can be quantified using a stress measure that is common in multidimensional scaling, providing a measure of subcompositional incoherence. The approach is illustrated using power-transformed correspondence analysis, which has already been shown to converge to log-ratio analysis as the power transform tends to zero.
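The comparison of the two distance matrices can be written compactly; a NumPy sketch using a standard normalized stress from multidimensional scaling (the exact normalization in the paper may differ):

```python
import numpy as np

def incoherence_stress(d_full, d_sub):
    """Normalized stress between pairwise component distances computed from
    the full composition (d_full) and from two-part subcompositions (d_sub).
    Zero means perfect subcompositional coherence."""
    d_full = np.asarray(d_full, dtype=float)
    d_sub = np.asarray(d_sub, dtype=float)
    return np.sqrt(np.sum((d_full - d_sub) ** 2) / np.sum(d_full ** 2))
```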