238 results for document analysis
Abstract:
We apply a multilevel hierarchical model to explore whether an aggregation fallacy exists in estimating the income elasticity of health expenditure by ignoring the regional composition of national health expenditure figures. We use data for 110 regions in eight OECD countries in 1997: Australia, Canada, France, Germany, Italy, Spain, Sweden and the United Kingdom. In doing this we have tried to identify two sources of random variation: within countries and between countries. Our results show that: 1) variability between countries amounts to (SD) 0.5433, and just 13% of that can be attributed to income elasticity, the remaining 87% to autonomous health expenditure; 2) within countries, variability amounts to (SD) 1.0249; and 3) the intra-class correlation is 0.5300. We conclude that the degree of fiscal decentralisation within countries must be taken into account when estimating the income elasticity of health expenditure. Two reasons lie behind this: a) where there is decentralisation to the regions, policies aimed at emulating diversity tend to increase national health care expenditure; and b) without fiscal decentralisation, central monitoring of finance tends to reduce regional diversity and therefore decrease national health expenditure.
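The intra-class correlation reported above comes from a two-level variance decomposition. A minimal sketch of the standard formula, using hypothetical variance components rather than the paper's estimates (how the reported standard deviations map onto these components depends on the paper's model specification):

```python
# Intraclass correlation from a two-level variance decomposition:
# the share of total variance attributable to the between-group level.
# The variance components below are hypothetical, for illustration only.
def icc(sigma_b2: float, sigma_w2: float) -> float:
    """sigma_b2: between-group variance; sigma_w2: within-group variance."""
    return sigma_b2 / (sigma_b2 + sigma_w2)

print(icc(0.25, 0.75))  # -> 0.25
```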
Abstract:
The application of correspondence analysis to square asymmetric tables is often unsuccessful because of the strong role played by the diagonal entries of the matrix, obscuring the data off the diagonal. A simple modification of the centering of the matrix, coupled with the corresponding change in row and column masses and row and column metrics, allows the table to be decomposed into symmetric and skew-symmetric components, which can then be analyzed separately. The symmetric and skew-symmetric analyses can be performed using a simple correspondence analysis program if the data are set up in a special block format.
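The decomposition referred to here is the usual additive split of a square matrix into symmetric and skew-symmetric parts; a minimal sketch (the special block format for feeding these into a CA program is not reproduced):

```python
import numpy as np

# Split a square (asymmetric) table N into its symmetric and
# skew-symmetric parts: N = S + K, with S = (N + N')/2, K = (N - N')/2.
def sym_skew_split(N: np.ndarray):
    S = (N + N.T) / 2.0   # symmetric component
    K = (N - N.T) / 2.0   # skew-symmetric component (K' = -K)
    return S, K

N = np.array([[5.0, 2.0],
              [8.0, 3.0]])
S, K = sym_skew_split(N)
assert np.allclose(S + K, N)   # the table is recovered exactly
assert np.allclose(K.T, -K)    # skew-symmetry
```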
Abstract:
Correspondence analysis has found extensive use in ecology, archeology, linguistics and the social sciences as a method for visualizing the patterns of association in a table of frequencies or nonnegative ratio-scale data. Inherent to the method is the expression of the data in each row or each column relative to their respective totals, and it is these sets of relative values (called profiles) that are visualized. This relativization of the data makes perfect sense when the margins of the table represent samples from sub-populations of inherently different sizes. But in some ecological applications sampling is performed on equal areas or equal volumes, so that the absolute levels of the observed occurrences may be of relevance, in which case relativization may not be required. In this paper we define the correspondence analysis of the raw unrelativized data and discuss its properties, comparing this new method to regular correspondence analysis and to a related variant of non-symmetric correspondence analysis.
Abstract:
The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward, both from a mathematical and a computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis, where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
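As a rough illustration of the core computation (correspondence analysis of an indicator matrix via the SVD of standardized residuals; the inertia adjustments, supplementary points, subset analysis and the joint-CA algorithm detailed in the paper are omitted):

```python
import numpy as np

# Minimal MCA sketch: CA applied to a dummy-coded indicator matrix Z.
# This is an illustrative recipe, not the authors' R code.
def mca(Z: np.ndarray, ndim: int = 2):
    P = Z / Z.sum()                                   # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)               # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U[:, :ndim] * sv[:ndim]) / np.sqrt(r)[:, None]  # principal row coords
    return rows, sv[:ndim] ** 2                       # coordinates, principal inertias

# Two binary variables for four respondents, dummy-coded:
Z = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
coords, inertias = mca(Z)
```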
Abstract:
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
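A rough sketch of the weighted log-ratio computation (log-transform, weighted double centering, weighted SVD), under the common convention that the row and column weights are the table margins; an illustration of the method's structure, not the authors' implementation:

```python
import numpy as np

# Weighted log-ratio analysis sketch: log-transform the table, apply
# weighted double centering with margin weights, then take an SVD of
# the weighted, centered matrix.
def wlra(N: np.ndarray):
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)   # row and column weights
    L = np.log(P)
    # weighted double centering removes row and column "size" effects
    L = L - (L @ c)[:, None] - (r @ L)[None, :] + r @ L @ c
    S = np.sqrt(r)[:, None] * L * np.sqrt(c)[None, :]
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    return S, sv, r, c

N = np.array([[10.0, 20.0, 30.0],
              [40.0, 10.0, 50.0]])
S, sv, r, c = wlra(N)
assert np.allclose(S @ np.sqrt(c), 0)   # weighted row centering holds
assert np.allclose(np.sqrt(r) @ S, 0)   # weighted column centering holds
```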
Abstract:
In the analysis of multivariate categorical data, typically questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or as a row and column of the Burt matrix, it is not correct simply to analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.
Abstract:
Gazelle companies are relevant because they generate much more employment than other companies and deliver high returns to their shareholders. This paper analyzes their behavior in the years of high growth and their evolution in the following years. The main factors that explain their success are competitive advantages based on human resources, innovation, internationalization, excellence in processes and a conservative financial policy. Nevertheless, as time goes by they split into two groups: one that continues to grow, though most of its members at lower rates, and the rest, which face serious problems or even disappear. The present study identifies several key factors that explain this divergent evolution.
Abstract:
This paper provides updated empirical evidence about the real and nominal effects of monetary policy in Italy, using structural VAR analysis. We discuss different empirical approaches that have been used to identify exogenous monetary policy shocks. We argue that the data support the view that the Bank of Italy, at least in the recent past, has been targeting the rate on overnight interbank loans. Therefore, we interpret shocks to the overnight rate as purely exogenous monetary policy shocks and study how different macroeconomic variables react to such shocks.
Abstract:
A method is offered that makes it possible to apply generalized canonical correlation analysis (CANCOR) to two or more matrices of different row and column order. The new method optimizes the generalized canonical correlation analysis objective by considering only the observed values. This is achieved by employing selection matrices. We present and discuss fit measures to assess the quality of the solutions. In a simulation study we assess the performance of our new method and compare it to an existing procedure called GENCOM, proposed by Green and Carroll. We find that our new method outperforms the GENCOM algorithm both with respect to model fit and recovery of the true structure. Moreover, as our new method does not require any type of iteration, it is easier to implement and requires less computation. We illustrate the method by means of an example concerning the relative positions of the political parties in the Netherlands, based on provincial data.
Abstract:
This paper studies the effect of providing relative performance feedback information on individual performance and on individual affective response, when agents are rewarded according to their absolute performance. In a laboratory set-up, agents perform a real effort task and, when receiving feedback, are asked to rate their happiness, arousal and feeling of dominance. Control subjects learn only their absolute performance, while the treated subjects additionally learn the average performance in the session. Performance is 17 percent higher when relative performance feedback is provided. Furthermore, although feedback increases performance independent of its content (i.e., performing above or below the average), the content is decisive for the affective response. When subjects are treated, the inequality in happiness and in the feeling of dominance between those performing above and below the average increases by 8 and 6 percentage points, respectively.
Abstract:
Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made in which the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.
Abstract:
This paper investigates what has caused output and inflation volatility to fall in the US, using a small-scale structural model estimated with Bayesian techniques and rolling samples. There are instabilities in the posteriors of the parameters describing the private sector, the policy rule and the standard deviations of the shocks. Results are robust to the specification of the policy rule. Changes in the parameters describing the private sector are the largest, but those of the policy rule and the covariance matrix of the shocks explain the changes most.
Abstract:
This paper introduces the approach of using Total Unduplicated Reach and Frequency (TURF) analysis to design a product line through a binary linear programming model. This improves the efficiency of the search for a solution compared to the algorithms that have been used to date. The results obtained with our exact algorithm are presented, and the method proves extremely efficient both in obtaining optimal solutions and in computing time for very large instances of the problem at hand. Furthermore, the proposed technique enables the model to be improved in order to overcome the main drawbacks that TURF analysis presents in practice.
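To make the TURF objective concrete, here is a brute-force sketch of reach maximization; the paper solves this exactly and far more efficiently with a binary linear program, and the respondent data below are invented:

```python
from itertools import combinations

# TURF: choose k products maximizing "reach", i.e. the number of
# respondents covered by at least one chosen product.
# Brute force, purely to illustrate the objective being optimized.
def turf_reach(reach_matrix, k):
    n_prod = len(reach_matrix[0])
    best = (-1, ())
    for combo in combinations(range(n_prod), k):
        reach = sum(any(row[j] for j in combo) for row in reach_matrix)
        best = max(best, (reach, combo))
    return best

# rows = respondents, columns = products (1 = would buy)
R = [[1, 0, 0],
     [1, 1, 0],
     [0, 0, 1],
     [0, 1, 0]]
reach, products = turf_reach(R, 2)
```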
Abstract:
A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler's (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say ${\cal M}_0$, implies on a less restricted one, ${\cal M}_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with ${\cal M}_0$ and ${\cal M}_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models ${\cal M}_0$ and ${\cal M}_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, i.e., nonasymptotic and nonnormal, applications. In a recent paper, Satorra (1999) shows that the difference between two Satorra-Bentler scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models ${\cal M}_0$ and ${\cal M}_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics.
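The easy computation proposed here reduces, in its familiar published form, to rescaling the raw difference $T_0 - T_1$ by a scaling constant built from the two models' degrees of freedom and scaling constants. A sketch with hypothetical numbers (the scaling constants $c_0$, $c_1$ can be recovered as the ratio of each raw statistic to its SB-scaled version):

```python
# Scaled difference chi-square: rescale the raw difference T0 - T1 by a
# scaling constant c_d built from the models' degrees of freedom (d0, d1)
# and the scaling constants (c0, c1) of the two goodness-of-fit statistics.
# All numbers below are hypothetical.
def scaled_difference(T0, T1, d0, d1, c0, c1):
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # scaling constant for the difference
    return (T0 - T1) / cd

print(scaled_difference(T0=120.0, T1=80.0, d0=30, d1=25, c0=1.2, c1=1.1))
```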
Abstract:
We consider the joint visualization of two matrices which have common rows and columns, for example multivariate data observed at two time points or split according to a dichotomous variable. Methods of interest include principal components analysis for interval-scaled data, or correspondence analysis for frequency data or ratio-scaled variables on commensurate scales. A simple result in matrix algebra shows that by setting up the matrices in a particular block format, matrix sum and difference components can be visualized. The case where we have more than two matrices is also discussed, and the methodology is applied to data from the International Social Survey Program.
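The matrix-algebra result mentioned can be illustrated directly: arranging two commensurate matrices A and B in a block format exposes their sum and difference components. A minimal sketch with invented data:

```python
import numpy as np

# Two matrices with common rows and columns (e.g. two time points),
# arranged in the block format [[A, B], [B, A]]. The sum and difference
# components (A+B)/2 and (A-B)/2 together recover the block exactly.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[2.0, 1.0],
              [5.0, 3.0]])

block = np.block([[A, B], [B, A]])
S, D = (A + B) / 2.0, (A - B) / 2.0   # sum and difference components
assert np.allclose(np.block([[S, S], [S, S]]) + np.block([[D, -D], [-D, D]]), block)
```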