974 resultados para Data Interpretation, Statistical
Resumo:
In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of endmembers and evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used and the cosine-theta coefficient as similarity measure. During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check if the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions . In this paper we follow the lines of a paper of R. Tolosana- Delgado et al. (2005) starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: The compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Kew words: compositional data analysis, biplot, deep sea sediments
Resumo:
Factor analysis as frequent technique for multivariate data inspection is widely used also for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1) with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛT + ψ (2) where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimation of Cov(y). Given observed clr transformed data Y as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely effect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2), see Pison et al. (2003). Well known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g. Maronna et al., 2006), rely on a full-rank data matrix Y which is not the case for clr transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y ) = V C(Z)V T where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. Our special interest is on comparing the results with those of Reimann et al. (2002) for the Kola project data
Resumo:
The proportional odds model provides a powerful tool for analysing ordered categorical data and setting sample size, although for many clinical trials its validity is questionable. The purpose of this paper is to present a new class of constrained odds models which includes the proportional odds model. The efficient score and Fisher's information are derived from the profile likelihood for the constrained odds model. These results are new even for the special case of proportional odds where the resulting statistics define the Mann-Whitney test. A strategy is described involving selecting one of these models in advance, requiring assumptions as strong as those underlying proportional odds, but allowing a choice of such models. The accuracy of the new procedure and its power are evaluated.
Resumo:
In conventional phylogeographic studies, historical demographic processes are elucidated from the geographical distribution of individuals represented on an inferred gene tree. However, the interpretation of gene trees in this context can be difficult as the same demographic/geographical process can randomly lead to multiple different genealogies. Likewise, the same gene trees can arise under different demographic models. This problem has led to the emergence of many statistical methods for making phylogeographic inferences. A popular phylogeographic approach based on nested clade analysis is challenged by the fact that a certain amount of the interpretation of the data is left to the subjective choices of the user, and it has been argued that the method performs poorly in simulation studies. More rigorous statistical methods based on coalescence theory have been developed. However, these methods may also be challenged by computational problems or poor model choice. In this review, we will describe the development of statistical methods in phylogeographic analysis, and discuss some of the challenges facing these methods.
Resumo:
The aims of the study were to test the hypotheses that some symptoms of starvation/severe dietary restraint are interpreted by patients with eating disorders in terms or control. Sixty-nine women satisfying the Diagnostic and Statistical Manual of Mental Disorders - IV edition (DSM-IV) criteria for a clinical eating disorder and 107 controls participated in the Study. All the participants completed an ambiguous scenarios paradigm, the Eating Disorder Lamination Questionnaire (EDE-Q) and the Beck Depression Inventory (BDI). Significantly more eating disorder patients than non clinical participants interpreted the starvation/dietary restraint symptoms of hunger, heightened satiety, and dizziness in terms of control. The data give further Support to the recent cognitive-behavioural theory of eating disorders suggesting that eating disorder patients interpret some starvation/dietary restraint symptoms in terms of control.
Resumo:
In this paper, we address issues in segmentation Of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by airborne laser scanner, contain 2.5 dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field)from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first and second order of statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.