29 results for Inimizes the chi-square
Abstract:
A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say ${\cal M}_0$, implies on a less restricted one, ${\cal M}_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with ${\cal M}_0$ and ${\cal M}_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models ${\cal M}_0$ and ${\cal M}_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, i.e., nonasymptotic and nonnormal, applications. In a recent paper, Satorra (1999) shows that the difference between two Satorra-Bentler scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models ${\cal M}_0$ and ${\cal M}_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics.
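The abstract above does not spell out the computation, so the following is only a minimal sketch of the formula usually attributed to this paper: each model's scaling correction is recovered as the ratio of its unscaled to its scaled statistic, the two corrections are combined with the degrees of freedom as weights, and the unscaled difference is divided by the result. The function name, argument names, and the numbers in the usage lines are illustrative assumptions, not taken from the paper.

```python
def scaled_difference_test(t0, t1, df0, df1, t0_scaled, t1_scaled):
    """Scaled chi-square difference test for nested models M0 (restricted) and M1.

    t0, t1               : unscaled goodness-of-fit statistics of M0 and M1
    df0, df1             : their degrees of freedom (df0 > df1)
    t0_scaled, t1_scaled : the corresponding scaled statistics
    Returns (scaled difference statistic, difference in df, scaling factor cd).
    """
    if df0 <= df1:
        raise ValueError("M0 must have more degrees of freedom than M1")
    # Scaling corrections implied by each pair of unscaled/scaled statistics.
    c0 = t0 / t0_scaled
    c1 = t1 / t1_scaled
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    if cd <= 0:
        # Can happen in finite samples; the statistic is then not usable as is.
        raise ValueError("non-positive scaling correction for the difference")
    return (t0 - t1) / cd, df0 - df1, cd


# Hypothetical inputs, for illustration only.
td, df, cd = scaled_difference_test(60.0, 30.0, 12, 10, 50.0, 30.0)
naive = 50.0 - 30.0  # plain difference of the scaled statistics, for comparison
print(td, df, cd, naive)
```

With these made-up inputs the plain difference of the two scaled statistics and the scaled difference statistic disagree, which is the situation the abstract warns about.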
Abstract:
Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low contributing rare objects.
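The point about low weight offsetting a distant position can be illustrated with a small sketch. It computes, for each row of a count table, its mass, its squared chi-square distance to the average profile, and their product, which is the row's contribution to total inertia. The table and the function name are invented for illustration; contributions to individual principal axes, which the report also examines, would additionally need the singular value decomposition and are not shown.

```python
def row_inertia_contributions(table):
    """For each row: (mass, squared chi-square distance to centroid, contribution).

    table is a list of rows of non-negative counts; every row and column
    total is assumed to be positive.
    """
    n = float(sum(sum(row) for row in table))
    n_cols = len(table[0])
    # Column masses: the average row profile (centroid).
    col_mass = [sum(row[j] for row in table) / n for j in range(n_cols)]
    result = []
    for row in table:
        total = float(sum(row))
        mass = total / n
        profile = [x / total for x in row]
        # Squared chi-square distance between the row profile and the centroid.
        dist2 = sum((p - c) ** 2 / c for p, c in zip(profile, col_mass))
        result.append((mass, dist2, mass * dist2))
    return result


# Invented abundance table: rows are species, columns are samples.
# The third species is rare (a single individual in one sample).
abundance = [[40, 35, 45],
             [30, 50, 20],
             [0, 1, 0]]
for mass, dist2, contribution in row_inertia_contributions(abundance):
    print(round(mass, 4), round(dist2, 4), round(contribution, 4))
```

In this toy table the rare species lies farthest from the centroid yet contributes least to the inertia, because its mass is so small.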
Abstract:
The aim of this article is to detail the extent of social capital in cultural events held in Catalonia and to analyse its influence on their capacity to attract tourists. It also seeks to determine the impact that three elements of social capital involved in the organisation of events (motivational elements, creation of internal networks, and leadership) have on the local tourism sector. The study is based on a sample of 263 events, which were sent a survey to determine the presence and weight of social capital factors. This information was cross-referenced with data on tourism impacts and attraction, also obtained from the same survey, and the chi-square test was applied to check whether the differences between the various social capital factors are statistically significant. The main conclusions indicate that events with social capital elements that reinforce their social cohesion understand and justify the celebration as a socialising act, regardless of its tourism reach. In addition, the creation of relationship networks is found to strengthen the internal cohesion, representativeness, and sense of identity of the community. Finally, the presence of leadership elements that give visibility and link the event with external networks explains the differences in the events' capacity for tourist attraction and in their tourism impacts. The main contribution of the work is to highlight the role of social capital as a factor that affects the social and tourism repercussions of Catalan events. The diagnosis carried out supports recommending the incorporation of social capital as a strategic asset for management and for the creation of new tourism products and policies centred on cultural events.
Abstract:
We compare two methods for visualising contingency tables and develop a method called the ratio map which combines the good properties of both. The first is a biplot based on the logratio approach to compositional data analysis. This approach is founded on the principle of subcompositional coherence, which assures that results are invariant to considering subsets of the composition. The second approach, correspondence analysis, is based on the chi-square approach to contingency table analysis. A cornerstone of correspondence analysis is the principle of distributional equivalence, which assures invariance in the results when rows or columns with identical conditional proportions are merged. Both methods may be described as singular value decompositions of appropriately transformed matrices. Correspondence analysis includes a weighting of the rows and columns proportional to the margins of the table. If this idea of row and column weights is introduced into the logratio biplot, we obtain a method which obeys both principles of subcompositional coherence and distributional equivalence.
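As a rough sketch of how margin weights can enter the logratio approach, the code below log-transforms a strictly positive table, double-centres it using the row and column masses as weights, and sums the weighted squared entries to get a total variance. The function names and the toy tables are assumptions for illustration, and the singular value decomposition that would produce the actual map is omitted. The usage lines check distributional equivalence numerically: merging two rows with identical conditional proportions should leave the total unchanged.

```python
import math


def weighted_logratio_matrix(table):
    """Return (row masses, column masses, weighted double-centred log matrix).

    All entries of table are assumed to be strictly positive.
    """
    n = float(sum(sum(row) for row in table))
    n_cols = len(table[0])
    r = [sum(row) / n for row in table]
    c = [sum(row[j] for row in table) / n for j in range(n_cols)]
    logs = [[math.log(x) for x in row] for row in table]
    # Weighted means of the logs across columns, across rows, and overall.
    row_mean = [sum(cj * v for cj, v in zip(c, row)) for row in logs]
    col_mean = [sum(r[i] * logs[i][j] for i in range(len(r))) for j in range(n_cols)]
    grand = sum(ri * m for ri, m in zip(r, row_mean))
    y = [[logs[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n_cols)]
         for i in range(len(r))]
    return r, c, y


def total_logratio_variance(table):
    """Sum of r_i * c_j * y_ij**2 over all cells."""
    r, c, y = weighted_logratio_matrix(table)
    return sum(r[i] * c[j] * y[i][j] ** 2
               for i in range(len(r)) for j in range(len(c)))


# Rows 1 and 2 have identical conditional proportions; `merged` adds them up.
original = [[10, 20, 30], [20, 40, 60], [5, 9, 2]]
merged = [[30, 60, 90], [5, 9, 2]]
print(total_logratio_variance(original), total_logratio_variance(merged))
```

The two printed totals should agree up to floating-point rounding, which is the invariance the abstract attributes to the weighted version.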
Abstract:
This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables as well as preserving all the good properties of principal axis methods such as principal components and correspondence analysis, based on the singular-value decomposition, including the decomposition of variance into components along principal axes which provide the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.
Abstract:
A continuous random variable is expanded as a sum of a sequence of uncorrelated random variables. These variables are principal dimensions in continuous scaling on a distance function, as an extension of classic scaling on a distance matrix. For a particular distance, these dimensions are principal components. Then some properties are studied and an inequality is obtained. Diagonal expansions are considered from the same continuous scaling point of view, by means of the chi-square distance. The geometric dimension of a bivariate distribution is defined and illustrated with copulas. It is shown that the dimension can have the power of continuum.
Abstract:
Structural equation models are widely used in economic, social and behavioral studies to analyze linear interrelationships among variables, some of which may be unobservable or subject to measurement error. Alternative estimation methods that exploit different distributional assumptions are now available. The present paper deals with issues of asymptotic statistical inferences, such as the evaluation of standard errors of estimates and chi-square goodness-of-fit statistics, in the general context of mean and covariance structures. The emphasis is on drawing correct statistical inferences regardless of the distribution of the data and the method of estimation employed. A (distribution-free) consistent estimate of $\Gamma$, the matrix of asymptotic variances of the vector of sample second-order moments, will be used to compute robust standard errors and a robust chi-square goodness-of-fit statistic. Simple modifications of the usual estimate of $\Gamma$ will also permit correct inferences in the case of multi-stage complex samples. We will also discuss the conditions under which, regardless of the distribution of the data, one can rely on the usual (non-robust) inferential statistics. Finally, a multivariate regression model with errors-in-variables will be used to illustrate, by means of simulated data, various theoretical aspects of the paper.
Abstract:
Although correspondence analysis is now widely available in statistical software packages and applied in a variety of contexts, notably the social and environmental sciences, there are still some misconceptions about this method as well as unresolved issues which remain controversial to this day. In this paper we hope to settle these matters, namely (i) the way CA measures variance in a two-way table and how to compare variances between tables of different sizes, (ii) the influence, or rather lack of influence, of outliers in the usual CA maps, (iii) the scaling issue and the biplot interpretation of maps, (iv) whether or not to rotate a solution, and (v) statistical significance of results.
Abstract:
In this paper we deal with the identification of dependencies between time series of equity returns. Marginal distribution functions are assumed to be known, and a bivariate chi-square test of fit is applied in a fully parametric copula approach. Several families of copulas are fitted and compared with Spanish stock market data. The results show that the t-copula generally outperforms other dependence structures, and highlight the difficulty of fitting a significant number of bivariate data series.
Abstract:
Background: Development of three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for the calculation of the probability of hospital mortality; comparison of the results with the APACHE II, SAPS II and MPM II-24 scores, and with a model based on multiple logistic regression (LR). Methods: Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Discrimination is compared using the ROC curve (AUC, 95% CI) and the percent of correct classification (PCC, 95% CI); calibration is compared using the calibration curve and the Standardized Mortality Ratio (SMR, 95% CI). Results: CTs are produced with a different selection of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In VS: all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75(0.71-0.81)), CHAID (0.76(0.72-0.79)) and C4.5 (0.76(0.73-0.80)). PCC: CART (72(69-75)), CHAID (72(69-75)) and C4.5 (76(73-79)). Calibration (SMR) was better in the CT: CART (1.04(0.95-1.31)), CHAID (1.06(0.97-1.15)) and C4.5 (1.08(0.98-1.16)). Conclusion: Different CT methodologies generate trees with different selections of variables and decision rules. The CTs are easy to interpret, and they stratify the risk of hospital mortality. The CTs should be taken into account for the classification of the prognosis of critically ill patients.
Abstract:
The objective of this research was to analyse the correspondence between the results of a land evaluation and the actual distribution of crops. To this end, the biophysical suitability of the land was compared with different typologies of crop and rotation occurrence frequency derived from multitemporal crop maps. The research was carried out in the Flumen irrigation district (33,000 ha), located in the Ebro valley (NE Spain). The land evaluation was based on a 1:100,000 soil map, following the FAO framework, for the main crops present in the study area (alfalfa, winter cereals, maize, rice and sunflower). Three crop frequency maps and one rotation map, derived from a time series of Landsat TM and ETM+ images from the period 1993-2000, were used and compared with the land suitability maps for the different crops. The relationship between the two types of variables was analysed statistically (Pearson χ2, Cramer V, Gamma and Somers D). The results show a significant relationship (P=0.001) between crop location and land suitability, except for opportunistic crops such as sunflower, which was strongly influenced by subsidies in the period studied. Alfalfa-based rotations show the highest percentages (52%) of occupation on the land most suitable for agriculture in the study area. The present multitemporal approach to analysing the information gives a more realistic view than comparing a land evaluation map with a crop map from a single date when assessing the degree of agreement between land suitability recommendations and the crops actually grown by farmers.
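Two of the association measures named in this abstract, the Pearson chi-square statistic and Cramer's V, can be computed from a cross-tabulation in a few lines. The sketch below uses the textbook definitions; the function names and the suitability-by-frequency table are invented for illustration and are not data from the study.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table.

    Every row and column total is assumed to be positive.
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(len(table[0]))]
    n = float(sum(row_tot))
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # count expected under independence
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_tot) - 1) * (len(col_tot) - 1)
    return chi2, dof


def cramers_v(table):
    """Cramer's V: chi-square rescaled to the range [0, 1]."""
    chi2, _ = pearson_chi2(table)
    n = float(sum(sum(row) for row in table))
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Invented cross-tabulation: land suitability class (rows) by
# crop-frequency class (columns), in number of map units.
cross_tab = [[30, 15, 5],
             [20, 25, 15],
             [5, 15, 30]]
print(pearson_chi2(cross_tab), cramers_v(cross_tab))
```

The statistic would then be compared with a chi-square distribution on the returned degrees of freedom to obtain a P value; that step is left out here to keep the sketch free of external libraries.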
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There is an intense and long-lasting debate around its authorship stemming from its first edition, where its introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last fourth of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author.
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile, and the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
Abstract:
The general objective of the study was to empirically test a reciprocal model of job satisfaction and life satisfaction while controlling for some sociodemographic variables. A total of 827 employees working in 34 car dealerships in Northern Quebec (56% response rate) were surveyed. The multiple-item questionnaires were analysed using correlation analysis, chi-square tests and ANOVAs. Results show interesting patterns in the relationships between job and life satisfaction: 49.2% of all individuals show a spillover relationship, 43.5% compensation, and 7.3% segmentation. The results are nonetheless far richer, and the model becomes much more refined, when sociodemographic indicators are taken into account. Overall, sociodemographic variables show some effects on each type of satisfaction individually, but also on the nature of the relationship between life and work satisfaction.