23 results for Multiple-trait analysis
Abstract:
When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using so-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix; the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories which have the highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix.
In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.
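As an illustration of the coding step described in this abstract, the following is a minimal sketch of fuzzy coding into three categories. It assumes triangular membership functions defined by three hinge points; the abstract does not say which membership functions the paper uses, and the names `fuzzy_code`, `defuzzify`, `crisp_code` and `hinges` are hypothetical.

```python
import numpy as np

def fuzzy_code(x, hinges):
    """Fuzzy-code a continuous variable into 3 categories.

    Assumes triangular membership functions with hinge points
    (h1, h2, h3), e.g. minimum, median and maximum. Each row of the
    result holds degrees of membership in [0, 1] that sum to 1.
    """
    h1, h2, h3 = hinges
    x = np.clip(np.asarray(x, dtype=float), h1, h3)
    F = np.zeros((x.size, 3))
    low = x <= h2
    # Values between h1 and h2 share membership between categories 1 and 2.
    t = (x[low] - h1) / (h2 - h1)
    F[low, 0] = 1 - t
    F[low, 1] = t
    # Values between h2 and h3 share membership between categories 2 and 3.
    t = (x[~low] - h2) / (h3 - h2)
    F[~low, 1] = 1 - t
    F[~low, 2] = t
    return F

def defuzzify(F, hinges):
    """Map memberships back to the original scale (weighted hinge average)."""
    return F @ np.asarray(hinges, dtype=float)

def crisp_code(F):
    """Crisp (0/1) coding: put each case in its highest-membership category."""
    C = np.zeros_like(F)
    C[np.arange(F.shape[0]), F.argmax(axis=1)] = 1.0
    return C
```

With triangular functions of this kind, defuzzifying the coded data returns the original values exactly, which is the property that makes a variance-based measure of fit possible after the analysis.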
Abstract:
The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward, from either a mathematical or a computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
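The adjustment of principal inertias mentioned in this abstract can be sketched as follows. This is not the paper's R code: it is a Python sketch that assumes one commonly cited form of the adjustment (the abstract does not state the formula), namely rescaling each indicator-matrix inertia above 1/Q, where Q is the number of variables and J the total number of categories. All function names are hypothetical.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot code each column of an integer-coded (n x Q) array."""
    blocks = []
    for q in range(data.shape[1]):
        cats = np.unique(data[:, q])
        blocks.append((data[:, [q]] == cats[None, :]).astype(float))
    return np.hstack(blocks)

def ca_inertias(table):
    """Principal inertias of a simple CA: squared singular values
    of the matrix of standardized residuals."""
    P = table / table.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    return np.linalg.svd(S, compute_uv=False) ** 2

def adjusted_inertias(data):
    """Adjusted principal inertias and adjusted total inertia of an MCA.

    Assumed formulas (not taken from the abstract):
      adjusted_k = (Q/(Q-1))^2 * (lambda_k - 1/Q)^2  for lambda_k > 1/Q
      total      = Q/(Q-1) * (sum(lambda_k^2) - (J-Q)/Q^2)
    where lambda_k are the principal inertias of the indicator matrix.
    """
    Q = data.shape[1]
    Z = indicator_matrix(data)
    J = Z.shape[1]
    lam = ca_inertias(Z)
    keep = lam > 1.0 / Q + 1e-12   # small tolerance for rounding error
    adj = (Q / (Q - 1)) ** 2 * (lam[keep] - 1.0 / Q) ** 2
    total = Q / (Q - 1) * ((lam ** 2).sum() - (J - Q) / Q ** 2)
    return adj, total
```

Dividing `adj` by `total` gives the corrected percentages of inertia. Under these assumed formulas, with only two variables the adjusted inertias coincide with those of the simple correspondence analysis of their cross-tabulation.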
Abstract:
In the analysis of multivariate categorical data, typically the analysis of questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or as a row and column of the Burt matrix, it is not correct to simply analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.
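One way to read the "simple modification" described in this abstract is sketched below: the masses and centring are computed from the complete table, and only the selected columns of the standardized residual matrix are decomposed. This is an interpretation of the abstract, not the authors' code, and the function names are hypothetical.

```python
import numpy as np

def standardized_residuals(table):
    """Standardized residuals of the complete table, using the
    row and column masses of the complete table."""
    P = table / table.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    return (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

def subset_ca_inertias(table, cols):
    """Principal inertias for a subset of columns.

    The submatrix is taken AFTER centring and weighting by the full
    table's masses, so the geometry of the complete data is retained.
    """
    S = standardized_residuals(table)
    return np.linalg.svd(S[:, cols], compute_uv=False) ** 2
```

A consequence of this construction is that the inertias of a subset and of its complement add up to the total inertia of the complete table, which would not hold if the submatrix were re-analysed on its own.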
Abstract:
This paper presents findings from a study investigating a firm's ethical practices along the value chain. In so doing we attempt to better understand potential relationships between a firm's ethical stance with its customers and those of its suppliers within a supply chain and identify particular sectoral and cultural influences that might impinge on this. Drawing upon a database comprising 667 industrial firms from 27 different countries, we found that ethical practices begin with the firm's relationship with its customers, the characteristics of which then influence the ethical stance with the firm's suppliers within the supply chain. Importantly, market structure along with some key cultural characteristics were also found to exert significant influence on the implementation of ethical policies in these firms.
Abstract:
The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not the case for the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.
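The two-way relationships that joint correspondence analysis visualizes can be collected in a single matrix of all pairwise cross-tabulations, usually called the Burt matrix. The sketch below builds it from integer-coded data; the function name `burt_matrix` and the returned block sizes are illustrative choices, not part of the paper.

```python
import numpy as np

def burt_matrix(data):
    """Burt matrix B = Z'Z for an integer-coded (n x Q) array.

    Off-diagonal blocks are the two-way cross-tabulations of each
    pair of variables; diagonal blocks are diagonal matrices holding
    the category counts of a single variable.
    """
    blocks = []
    for q in range(data.shape[1]):
        cats = np.unique(data[:, q])
        blocks.append((data[:, [q]] == cats[None, :]).astype(float))
    Z = np.hstack(blocks)                    # indicator matrix
    sizes = [b.shape[1] for b in blocks]     # categories per variable
    return Z.T @ Z, sizes
```

The diagonal blocks carry no information about associations between variables, which is the motivation for fitting only the off-diagonal blocks.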
Abstract:
The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or "stacked") tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.
Abstract:
We show the equivalence between the use of correspondence analysis (CA) of concatenated tables and the application of a particular version of conjoint analysis called categorical conjoint measurement (CCM). The connection is established using canonical correlation (CC). The second part introduces the interaction effects in all three variants of the analysis and shows how to pass between the results of each analysis.
Abstract:
Do mediterranean genera not included in Tachet et al. 2002 have mediterranean trait characteristics? Multiple-trait databases are increasingly used in community ecology in different regions of the world. In Europe, Tachet et al. (2002) compiled an aquatic macroinvertebrate database for 473 taxa using information on 11 biological traits described by 63 categories. However, regions that were less studied at the time the database was compiled, such as the Mediterranean Basin, can harbour exclusive genera, which were not included in Tachet's database. In a large-scale study across the Mediterranean Basin, we found 44 genera that were not included in Tachet's database (NEW genera). Our main aim was to compile trait information for these NEW genera and assess whether these genera had specific traits that could explain their exclusivity to the Mediterranean region. We compared the trait characteristics of NEW genera to those of genera only found in Mediterranean or temperate regions that were included in Tachet's database (MED and TEM genera, respectively). We found that NEW genera had more mediterranean characteristics than TEM genera and that some trait categories of NEW genera were even more mediterranean-like than the traits of MED genera (e.g., diapause). Therefore, our results suggest that the specific biological traits of these NEW genera allow them to cope successfully and exclusively with the harsh environmental conditions of mediterranean-climate rivers, which could partially explain their absence from Tachet's database. Other explanations, such as the limited dispersal ability of these NEW genera to reach and colonize temperate Europe or the rarity of these NEW genera, should also be considered. We provide biological traits of the NEW genera to be used in future studies on mediterranean river ecology.
Abstract:
Report for the scientific sojourn at the Université de Bourgogne, France, from July until October 2007. Sur lie ageing after second fermentation is a fundamental operation in the production of quality sparkling wine like Cava and Champagne. Recently, the importance of the interaction between wine and lees cell surface has been reported. Cell surface properties depending on wall biochemical composition are major determinants in microbial interactions, having important repercussions in several technological aspects. Sorption and flocculation are especially important in sparkling wine production, and are governed by distinct cell surface properties.
The aim of the present research, carried out during the four months of the stay, was to determine the role of lees surface modifications occurring during sur lie ageing in sparkling wine quality and production. To study the relationship between physico-chemical properties, such as hydrophobicity, charge and electron-donor characteristics, and the yeast surface sorption capacities, we determined these factors in a model system. Then, real industrial lees samples were investigated. The surface properties of sparkling wine lees from the same strain of Saccharomyces cerevisiae were characterized according to the time of sur lie ageing, and their possible influence on lees sorption and flocculation capacity was evaluated.
Abstract:
Using data from the Spanish household budget survey, we investigate life-cycle effects on several product expenditures. A latent-variable model approach is adopted to evaluate the impact of income on expenditures, controlling for the number of members in the family. Two latent factors underlying repeated measures of monetary and non-monetary income are used as explanatory variables in the expenditure regression equations, thus avoiding possible bias associated with the measurement error in income. The proposed methodology also takes care of the case in which product expenditures exhibit a pattern of infrequent purchases. Multiple-group analysis is used to assess the variation of key parameters of the model across various household life-cycle typologies. The analysis discloses significant life-cycle effects on the mean levels of expenditures; it also detects significant life-cycle effects on the way expenditures are affected by income and family size. Asymptotic robust methods are used to account for possible non-normality of the data.
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
Abstract:
The purpose of this paper is to examine the relation between government measures, volunteer participation, climate variables and forest fires. A number of studies have related forest fires to causes of ignition, to fire history in one area, to the type of vegetation and weather characteristics or to community institutions, but there is little research on the relation between fire production and government prevention and extinction measures from a policy evaluation perspective. An observational approach is first applied to select forest fires in the north east of Spain. Taking a selection of fires of a certain size, a multiple regression analysis is conducted to find significant relations between policy instruments under the control of the government and the number of hectares burned in each case, controlling at the same time for the effect of weather conditions and other context variables. The paper provides evidence on the effects of simultaneity and the relevance of resorting to army soldiers on specific days with extraordinarily high simultaneity. The analysis also sheds light on the effectiveness of two preventive policies and of helicopters for extinction tasks.
Abstract:
Structural equation models (SEM) are commonly used to analyze the relationship between variables, some of which may be latent, such as individual "attitude" to and "behavior" concerning specific issues. A number of difficulties arise when we want to compare a large number of groups, each with large sample size, and the manifest variables are distinctly non-normally distributed. Using a specific data set, we evaluate the appropriateness of the following alternative SEM approaches: multiple group versus MIMIC models, continuous versus ordinal variables estimation methods, and normal theory versus non-normal estimation methods. The approaches are applied to the ISSP-1993 Environmental data set, with the purpose of exploring variation in the mean level of variables of "attitude" to and "behavior" concerning environmental issues and their mutual relationship across countries. Issues of both theoretical and practical relevance arise in the course of this application.
Abstract:
We study the dynamics of reaction-diffusion fronts under the influence of multiplicative noise. An approximate theoretical scheme is introduced to compute the velocity of the front and its diffusive wandering due to the presence of noise. The theoretical approach is based on a multiple scale analysis rather than on a small noise expansion and is confirmed with numerical simulations for a wide range of the noise intensity. We report on the possibility of noise sustained solutions with a continuum of possible velocities, in situations where only a single velocity is allowed without noise.