Digital Library

948 results for Correspondence analysis (Statistics)

Arrow plot and CA maps on microarray preprocessing methods

Relevance:

90.00%

Abstract:

Microarrays allow thousands of genes to be monitored simultaneously, so that the abundance of the transcripts under the same experimental condition at the same time can be quantified. Among the various available array technologies, double-channel cDNA microarray experiments, which are the focus of this work, have arisen in numerous technical protocols associated with genomic studies. Microarray experiments involve many steps, and each one can affect the quality of the raw data. Background correction and normalization are preprocessing techniques used to clean and correct the raw data when undesirable fluctuations arise from technical factors. Several recent studies showed that no preprocessing strategy outperforms the others in all circumstances, and thus it seems difficult to provide general recommendations. In this work, we propose using exploratory techniques to visualize the effects of preprocessing methods on the statistical analysis of two-channel cancer microarray data sets in which the cancer types (classes) are known. The arrow plot was used to select differentially expressed genes, and the graph of profiles resulting from the correspondence analysis was used to visualize the results. Six background correction methods and six normalization methods were used, giving 36 preprocessing methods, which were applied to a published cDNA microarray database (Liver) available at http://genome-www5.stanford.edu/, in which the microarrays were already classified by cancer type. All statistical analyses were performed using the R statistical software.

Victimization in Childhood of Male Sex Offenders: Relationship between Violence Experienced and Subsequent Offenses through Discourse Analysis

Relevance:

90.00%

Abstract:

This study aims at better understanding how the form of childhood violence experienced and the type of offense subsequently committed affect how sex offenders recall punishments and difficult events. Fifty-four male perpetrators convicted of sexual offenses against children (SOCs) or against adults (SOAs) were interviewed in France, Belgium, and Switzerland using the Lausanne Clinical Interview (Entretien Clinique de Lausanne or LCI). Almost three-quarters of the sex offenders reported having been victimized during childhood. The correspondence analysis identified several factors that differentiated them. Their appraisal of the distressing event, method of coping with and distancing themselves from it, and how they dealt with emotions varied markedly depending on whether they recognized having experienced various forms of violence during childhood and on what type of offense they subsequently committed. Victimization can be identified as much by the events experienced as by their effect on the sex offender's discourse. Identification of these discursive indicators may lead to an improved therapeutic approach for potentially traumatic childhood experiences.

Dynamic graphics of parametrically linked multivariate methods used in compositional data analysis

Relevance:

90.00%

Abstract:

Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.

Hierarchical analyses of genetic differentiation in a hybrid zone of Sorex araneus (Insectivora : Soricidae)

Relevance:

90.00%

Abstract:

Microsatellites are used to unravel the fine-scale genetic structure of a hybrid zone between chromosome races Valais and Cordon of the common shrew (Sorex araneus) located in the French Alps. A total of 269 individuals collected between 1992 and 1995 was typed for seven microsatellite loci. A modified version of the classical multiple correspondence analysis is carried out. This analysis clearly shows the dichotomy between the two races. Several approaches are used to study genetic structuring. Gene flow is clearly reduced between these chromosome races and is estimated at one migrant every two generations using R-statistics and one migrant per generation using F-statistics. Hierarchical F- and R-statistics are compared and their efficiency to detect inter- and intraracial patterns of divergence is discussed. Within-race genetic structuring is significant, but remains weak. F_ST displays similar values on both sides of the hybrid zone, although no environmental barriers are found on the Cordon side, whereas the Valais side is divided by several mountain rivers. We introduce the exact G-test to microsatellite data, where it proved to be a powerful test to detect genetic differentiation within as well as among races. The genetic background of karyotypic hybrids was compared with the genetic background of pure parental forms using a CRT-MCA. Our results indicate that, without knowledge of the karyotypes, we would not have been able to distinguish these hybrids from karyotypically pure samples.

Fuzzy coding in constrained ordinations

Relevance:

90.00%

Abstract:

Canonical correspondence analysis and redundancy analysis are two methods of constrained ordination regularly used in the analysis of ecological data when several response variables (for example, species abundances) are related linearly to several explanatory variables (for example, environmental variables, spatial positions of samples). In this report I demonstrate the advantages of the fuzzy coding of explanatory variables: first, nonlinear relationships can be diagnosed; second, more variance in the responses can be explained; and third, in the presence of categorical explanatory variables (for example, years, regions) the interpretation of the resulting triplot ordination is unified because all explanatory variables are measured at a categorical level.

Contribution biplots

Relevance:

90.00%

Abstract:

In order to interpret the biplot it is necessary to know which points (usually variables) are the important contributors to the solution, and this information is available separately as part of the biplot's numerical results. We propose a new scaling of the display, called the contribution biplot, which incorporates this diagnostic directly into the graphical display, showing visually the important contributors and thus facilitating the biplot interpretation and often simplifying the graphical representation considerably. The contribution biplot can be applied to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. In the contribution biplot one set of points, usually the rows of the data matrix, optimally represents the spatial positions of the cases or sample units, according to some distance measure that usually incorporates some form of standardization unless all data are comparable in scale. The other set of points, usually the columns, is represented by vectors that are related to their contributions to the low-dimensional solution. A fringe benefit is that usually only one common scale for row and column points is needed on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot legible. Furthermore, this version of the biplot also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important, when they are in fact contributing minimally to the solution.
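As an illustration of the scaling described in this abstract, the following is a minimal sketch for the correspondence analysis case, assuming the usual SVD of the matrix of standardized residuals. It is not the authors' code: the function name `ca_contribution_biplot` and the example table are hypothetical. Rows are returned in principal coordinates, and columns are returned as the right singular vectors, whose squared entries on each axis are the column contributions to that axis.

```python
import numpy as np

def ca_contribution_biplot(table):
    """Correspondence analysis coordinates in a contribution-biplot scaling.

    Assumes a table of nonnegative counts with no empty rows or columns.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: (p_ij - r_i c_j) / sqrt(r_i c_j)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Rows in principal coordinates: D_r^{-1/2} U diag(sv)
    row_principal = (U * sv) / np.sqrt(r)[:, None]
    # Columns in contribution coordinates: squared entries of each
    # singular vector sum to 1, so they are contributions to that axis
    col_contribution = Vt.T
    return row_principal, col_contribution, sv

# Hypothetical 4x3 contingency table, for illustration only
table = np.array([[20, 10, 5],
                  [10, 15, 10],
                  [5, 10, 25],
                  [8, 8, 8]])
rows, cols, sv = ca_contribution_biplot(table)
total_inertia = (sv ** 2).sum()
```

Because the column coordinates are bounded by 1 in absolute value on every axis, low-frequency categories cannot drift to the periphery of the map merely because their masses are small.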

Dynamic perceptual mapping

Relevance:

90.00%

Abstract:

Perceptual maps have been used for decades by market researchers to illuminate them about the similarity between brands in terms of a set of attributes, to position consumers relative to brands in terms of their preferences, or to study how demographic and psychometric variables relate to consumer choice. Invariably these maps are two-dimensional and static. As we enter the era of electronic publishing, the possibilities for dynamic graphics are opening up. We demonstrate the usefulness of introducing motion into perceptual maps through four examples. The first example shows how a perceptual map can be viewed in three dimensions, and the second one moves between two analyses of the data that were collected according to different protocols. In a third example we move from the best view of the data at the individual level to one which focuses on between-group differences in aggregated data. A final example considers the case when several demographic variables or market segments are available for each respondent, showing an animation with increasingly detailed demographic comparisons. These examples of dynamic maps use several data sets from marketing and social science research.

Are Americans' musical preferences more omnivorous today?

Relevance:

90.00%

Abstract:

Although we found a general trend favouring the omnivorousness thesis, as soon as we adjusted it for a set of structural factors and consumers' tastes it was clear that this trend was caused by elitist inclusive omnivores who had increased the scope of their tastes. In general, younger cohorts were becoming less omnivorous; nevertheless, they were also becoming more educated and had higher levels of income, making the youth more omnivorous. As expected, upscale consumers set limits on their popular taste: musical genres whose audiences had educational levels below the mean profile were less preferred by upscale respondents. In spite of this, as time passed, some popular brows gained social status.

Weighted metric multidimensional scaling

Relevance:

90.00%

Abstract:

This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables while preserving all the good properties of principal axis methods such as principal components and correspondence analysis, based on the singular-value decomposition, including the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.
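The margin-based weighting that this abstract takes as its starting point can be illustrated with the standard chi-square distance between row profiles, in which each squared coordinate difference is divided by the corresponding column mass. The sketch below shows only that textbook definition, not the weight-estimation step of WMDS; the function name `chi_square_distances` and the example table are illustrative assumptions.

```python
import numpy as np

def chi_square_distances(table):
    """Chi-square distances between the row profiles of a count table.

    Assumes nonnegative counts with no empty rows or columns.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses (the margin weights)
    profiles = P / r[:, None]          # each row rescaled to sum to 1
    diff = profiles[:, None, :] - profiles[None, :, :]
    # Weight each squared coordinate difference by 1 / c_j
    return np.sqrt((diff ** 2 / c).sum(axis=2))

# Hypothetical table: rows 0 and 1 are proportional, row 2 differs
table = [[10, 20, 30],
         [20, 40, 60],
         [30, 10, 5]]
D = chi_square_distances(table)
```

In this example the first two rows have identical profiles, so their distance is zero even though their totals differ. WMDS, as described in the abstract, would replace the fixed weights `1 / c` with weights estimated from the data.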

Biplots of fuzzy coded data

Relevance:

90.00%

Abstract:

A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
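To make the coding and defuzzification steps concrete, here is a minimal sketch that assumes triangular membership functions defined by a set of hinge points, which is one common way to fuzzy-code a continuous variable. The abstract does not state which membership functions or hinge points were used, so the names `fuzzy_code` and `defuzzify` and the hinge values are hypothetical.

```python
import numpy as np

def fuzzy_code(x, hinges):
    """Fuzzy-code a 1-D variable using triangular membership functions.

    `hinges` must be strictly increasing. Values outside the hinge range
    are clipped to the nearest end hinge (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(hinges, dtype=float)
    x = np.clip(x, h[0], h[-1])
    Z = np.zeros((x.size, h.size))
    for k in range(h.size - 1):
        inside = (x >= h[k]) & (x <= h[k + 1])
        t = (x[inside] - h[k]) / (h[k + 1] - h[k])
        Z[inside, k] = 1.0 - t         # membership in the lower category
        Z[inside, k + 1] = t           # membership in the upper category
    return Z

def defuzzify(Z, hinges):
    """Recover values as the membership-weighted average of the hinges."""
    return Z @ np.asarray(hinges, dtype=float)

x = np.array([0.0, 2.5, 5.0, 8.0, 10.0])
hinges = [0.0, 5.0, 10.0]              # e.g. minimum, median, maximum
Z = fuzzy_code(x, hinges)
x_back = defuzzify(Z, hinges)
```

Each row of `Z` sums to 1, and for values inside the hinge range the defuzzified values reproduce the original data exactly. In the setting of the abstract, defuzzification would be applied to fuzzy values estimated from the low-dimensional solution rather than to the exact coding.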

Measuring subcompositional incoherence

Relevance:

90.00%

Abstract:

Subcompositional coherence is a fundamental property of Aitchison's approach to compositional data analysis, and is the principal justification for using ratios of components. We maintain, however, that lack of subcompositional coherence, that is, incoherence, can be measured in an attempt to evaluate whether any given technique is close enough, for all practical purposes, to being subcompositionally coherent. This opens up the field to alternative methods, which might be better suited to cope with problems such as data zeros and outliers, while being only slightly incoherent. The measure that we propose is based on the distance measure between components. We show that the two-part subcompositions, which appear to be the most sensitive to subcompositional incoherence, can be used to establish a distance matrix which can be directly compared with the pairwise distances in the full composition. The closeness of these two matrices can be quantified using a stress measure that is common in multidimensional scaling, providing a measure of subcompositional incoherence. The approach is illustrated using power-transformed correspondence analysis, which has already been shown to converge to log-ratio analysis as the power transform tends to zero.
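The comparison of two distance matrices by a stress measure can be sketched as follows. The abstract does not give the formula, so this example assumes a normalized stress of the Kruskal stress-1 type computed over the pairs above the diagonal; the paper's exact definition may differ (for example, by including a scaling factor). The function name `stress` and the small matrix are hypothetical.

```python
import numpy as np

def stress(d_full, d_sub):
    """Normalized stress between two symmetric distance matrices.

    Assumed form: sqrt( sum (a - b)^2 / sum a^2 ) over pairs i < j,
    where a are the full-composition distances and b the distances
    derived from the two-part subcompositions.
    """
    d_full = np.asarray(d_full, dtype=float)
    d_sub = np.asarray(d_sub, dtype=float)
    iu = np.triu_indices_from(d_full, k=1)   # each pair counted once
    a, b = d_full[iu], d_sub[iu]
    return np.sqrt(((a - b) ** 2).sum() / (a ** 2).sum())

# Hypothetical distances between three components
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])
same = stress(D, D)
doubled = stress(D, 2 * D)
```

A value of zero means the two matrices agree exactly, which under this reading corresponds to a subcompositionally coherent method; larger values indicate greater incoherence.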

European research funding and regional technological capabilities: Network composition analysis

Relevance:

90.00%

Abstract:

We use network and correspondence analysis to describe the composition of the research networks in the European BRITE-EURAM program. Our main finding is that 27% of the participants in this program fall into one of two sets of highly "interconnected" institutions: one centered around large firms (with smaller firms and research centers providing specialized services), and the other around universities. Moreover, these "hubs" are composed largely of institutions coming from the technologically most advanced regions of Europe. This is suggestive of the difficulties of attaining European "cohesion", as technically advanced institutions naturally link with partners of similar technological capabilities.

Weighted Euclidean biplots

Relevance:

90.00%

Abstract:

We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.

Where and how family members spend time at home? A quantitative analysis of the observational tracking on the everyday lives of Italian families

Relevance:

90.00%

Abstract:

This paper examines a dataset derived from observational tracking in order to analyze where and how middle-class working families spend time at home. We use an ethnographic approach to study the everyday lives of Italian dual-income middle-class families, with the aim of analyzing quantitatively the use of home spaces and the types of activities of family members on weekday afternoons and evenings. The different analyses (multiple correspondence analysis, agglomerative hierarchical clustering, discriminant analysis) show how particular spaces and activities in these spaces are dominated by certain family members. We suggest a combination of qualitative and quantitative methodologies as useful tools to explore in detail the everyday lives of families, and to understand how family members use the domestic spaces. In particular, we consider the use of quantitative analyses relevant for examining ethnographic data, especially in connection with methodological reflexivity among researchers.

Multivariate analysis of pollen frequency of the native species Escallonia pulverulenta (Saxifragaceae) in Chilean honeys

Relevance:

90.00%

Abstract:

The aim of this work was the identification of geographic zones suitable for the production of honeys in which pollen grains of Escallonia pulverulenta (Ruiz & Pav.) Pers. (Saxifragaceae) can be detected. The analysis of the botanical origin of 240 honey samples produced between La Serena and Puerto Montt (the IV and X Administrative Regions of Chile) allowed the detection of pollen grains of E. pulverulenta in 46 Chilean honeys. The geographic distribution of the honeys studied is presented together with their affinities, through factor analysis and frequency tables. The study was based on the presence of E. pulverulenta pollen. Escallonia pulverulenta pollen percentages ranged between 0.24% and 78.5%. Seventeen of the studied samples were designated as unifloral, i.e. samples showing more than 45% pollen of a given plant species. Two of these corresponded to E. pulverulenta (corontillo, madroño or barraco) honeys. The remaining unifloral honeys correspond to 8 samples of Lotus uliginosus Schkuhr (birdsfoot trefoil), 2 samples of Aristotelia chilensis (Molina) Stuntz (maqui) and 1 sample each of Escallonia rubra (Ruiz & Pav.) Pers. (siete camisas), Eucryphia cordifolia Cav. (ulmo or muemo), Weinmannia trichosperma Cav. (tineo), Rubus ulmifolius Schott (blackberry) and Brassica rapa L. (turnip). Honeys with different percentages of E. pulverulenta pollen, statistically analyzed through correspondence analysis, could be associated and assigned to one of three geographic types defined on the basis of this analysis. The geographical type areas defined were the Northern Mediterranean Zone (samples from the IV Region), Central Mediterranean Zone (samples from the V to the VIII regions, including two samples of unifloral Escallonia pulverulenta honey), and Southern Mediterranean Zone (samples from the IX Region).
