930 results for Compositional data analysis
Abstract:
In standard multivariate statistical analysis, common hypotheses of interest concern changes in mean vectors and subvectors. In compositional data analysis, it is now well established that compositional change is most readily described in terms of the simplicial operation of perturbation and that subcompositions replace the marginal concept of subvectors. To motivate the statistical developments of this paper, we present two challenging compositional problems from food production processes. Against this background the relevance of perturbations and subcompositions can be clearly seen. Moreover, we can identify a number of hypotheses of interest involving the specification of particular perturbations or differences between perturbations, and also hypotheses of subcompositional stability. We identify the two problems as being the counterpart of the analysis of paired comparison or split plot experiments and of separate sample comparative experiments in the jargon of standard multivariate analysis. We then develop appropriate estimation and testing procedures for a complete lattice of relevant compositional hypotheses.
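The two operations named in this abstract can be illustrated with a minimal sketch. It assumes NumPy and strictly positive parts; the helper names (`closure`, `perturb`, `perturbation_difference`, `subcomposition`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so that they sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def perturb(x, p):
    """Perturbation: componentwise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(p, dtype=float))

def perturbation_difference(x, y):
    """The perturbation that carries composition x to composition y."""
    return closure(np.asarray(y, dtype=float) / np.asarray(x, dtype=float))

def subcomposition(x, parts):
    """Keep only the selected parts and re-close them."""
    return closure(np.asarray(x, dtype=float)[..., parts])

before = closure([0.2, 0.3, 0.5])
after = closure([0.1, 0.6, 0.3])
change = perturbation_difference(before, after)
```

Here `change` plays the role that a difference of mean vectors plays in standard multivariate analysis: applying `perturb(before, change)` recovers `after`. Forming a subcomposition commutes with perturbation, which is one reason subcompositions can stand in for subvectors.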
Abstract:
Simpson's paradox, also known as amalgamation or aggregation paradox, appears when dealing with proportions. Proportions are by construction parts of a whole, which can be interpreted as compositions assuming they only carry relative information. The Aitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinear operation within that structure. Here we propose to use balances, which are specific elements of this structure, to analyse situations where the paradox might appear. With the proposed approach we obtain that the centre of the tables analysed is a natural way to compare them, which avoids by construction the possibility of a paradox. Key words: Aitchison geometry, geometric mean, orthogonal projection
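A rough numerical sketch of the idea follows. It is not the authors' procedure: the counts are illustrative and chosen only to exhibit the paradox, and the names `balance` and `centre` are hypothetical. For a two-part composition (success, failure) the balance reduces to a scaled log-ratio, and the centre of several tables is taken here as the closed componentwise geometric mean.

```python
import numpy as np

# counts[treatment] has one row per stratum: (successes, failures).
# Illustrative counts, chosen only to exhibit the paradox.
counts = {
    "A": np.array([[81.0, 6.0], [192.0, 71.0]]),
    "B": np.array([[234.0, 36.0], [55.0, 25.0]]),
}

def balance(parts):
    """Balance between success and failure: sqrt(1/2) * ln(s / f)."""
    parts = np.asarray(parts, dtype=float)
    return np.sqrt(0.5) * np.log(parts[..., 0] / parts[..., 1])

def centre(table):
    """Closed componentwise geometric mean of the row compositions."""
    rows = table / table.sum(axis=1, keepdims=True)
    g = np.exp(np.log(rows).mean(axis=0))
    return g / g.sum()

per_stratum = {k: balance(v) for k, v in counts.items()}
amalgamated = {k: balance(v.sum(axis=0)) for k, v in counts.items()}
centred = {k: balance(centre(v)) for k, v in counts.items()}
```

With these counts, treatment A has the larger balance in every stratum, the amalgamated (summed) tables reverse the ordering, and the centre preserves it. The balance of the centre is the arithmetic mean of the per-stratum balances, so an ordering that holds in every stratum cannot flip at the centre.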
Abstract:
Pounamu (NZ jade), or nephrite, is a protected mineral in its natural form following the transfer of ownership back to Ngai Tahu under the Ngai Tahu (Pounamu Vesting) Act 1997. Any theft of nephrite is prosecutable under the Crimes Act 1961. Scientific evidence is essential in cases where origin is disputed. A robust method for discrimination of this material through the use of elemental analysis and compositional data analysis is required. Initial studies have characterised the variability within a given nephrite source. This has included investigation of both in situ outcrops and alluvial material. Methods for the discrimination of two geographically close nephrite sources are being developed. Key Words: forensic, jade, nephrite, laser ablation, inductively coupled plasma mass spectrometry, multivariate analysis, elemental analysis, compositional data analysis
Abstract:
Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known, single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
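The transform and back-transform described here can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes the centred log-ratio (clr) transform, strictly positive death densities, and a rank-one singular value decomposition in the spirit of Lee and Carter. The function names and the `alpha`, `kappa`, `beta` labels are assumptions made for the example.

```python
import numpy as np

def closure(x):
    """Rescale positive values so that each row sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform: simplex -> real space."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def clr_inv(z):
    """Back-transform: exponentiate and re-close to unit sum."""
    return closure(np.exp(np.asarray(z, dtype=float)))

def rank_one_fit(d):
    """Rank-one fit in clr coordinates.

    d has one row per year and one column per age (or age-cause cell);
    rows are strictly positive and sum to 1.
    """
    z = clr(d)
    alpha = z.mean(axis=0)                   # average age pattern
    u, s, vt = np.linalg.svd(z - alpha, full_matrices=False)
    kappa = u[:, 0] * s[0]                   # time index
    beta = vt[0]                             # age response
    return clr_inv(alpha + np.outer(kappa, beta))
```

Because the fit is back-transformed with `clr_inv`, every fitted row is positive and sums to 1 by construction. Zero death counts would need separate treatment before taking logarithms.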
Abstract:
The self-organizing map (Kohonen 1997) is a type of artificial neural network developed to explore patterns in high-dimensional multivariate data. The conventional version of the algorithm uses the Euclidean metric when adapting the model vectors, thus rendering the whole methodology, in theory, incompatible with non-Euclidean geometries. In this contribution we explore the two main aspects of the problem: 1. Whether the conventional approach using the Euclidean metric can yield valid results with compositional data. 2. Whether a modification of the conventional approach, replacing vector sum and scalar multiplication by the canonical operators in the simplex (i.e. perturbation and powering), can converge to an adequate solution. Preliminary tests showed that both methodologies can be used on compositional data. However, the modified version of the algorithm performs worse than the conventional version, in particular when the data is pathological. Moreover, the conventional approach converges faster to a solution when the data is "well-behaved". Key words: Self Organizing Map; Artificial Neural networks; Compositional data
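The modification mentioned in point 2 can be sketched for a single model vector. The conventional update is m + rate * (x - m); the version below replaces the sum by perturbation and the scalar product by powering. This is an assumed reading of the abstract rather than the authors' code, and `rate` stands for the product of learning rate and neighbourhood weight.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so that they sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform, used here only to check the update."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def som_update_simplex(m, x, rate):
    """Move model vector m towards sample x along the simplex geometry."""
    m = np.asarray(m, dtype=float)
    x = np.asarray(x, dtype=float)
    difference = closure(x / m)            # x "minus" m (inverse perturbation)
    step = closure(difference ** rate)     # powering by the scalar rate
    return closure(m * step)               # m perturbed by the step

m = closure([0.2, 0.3, 0.5])
x = closure([0.6, 0.3, 0.1])
updated = som_update_simplex(m, x, 0.25)
```

In clr coordinates this update coincides with the ordinary Euclidean one, so an alternative with the same effect is to clr-transform the data, run a standard self-organizing map, and back-transform the model vectors.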
Abstract:
In this theme you will work through a series of texts and activities and reflect on your view of research and the process of analysis of data and information. Most activities are supported by textual or audio material and are there to stimulate your thinking in a given area. The purpose of this theme is to help you gain a general overview of the main approaches to research design. Although the theme comprises two main sections, one on quantitative research and the other on qualitative research, this is purely to guide your study. The two approaches may be viewed as being part of a continuum with many research studies now incorporating elements of both styles. Eventually you will need to choose a research approach or methodology that will be practical, relevant, appropriate, ethical, of good quality and effective for the research idea or question that you have in mind.
Abstract:
Source files for theme 7
Abstract:
data analysis table
Abstract:
Abstract based on that of the publication
Abstract:
This article reflects on key methodological issues emerging from children and young people's involvement in data analysis processes. We outline a pragmatic framework illustrating different approaches to engaging children, using two case studies of children's experiences of participating in data analysis. The article highlights methods of engagement and important issues such as the balance of power between adults and children, training, support, ethical considerations, time and resources. We argue that involving children in data analysis processes can have several benefits, including enabling a greater understanding of children's perspectives and helping to prioritise children's agendas in policy and practice. (C) 2007 The Author(s). Journal compilation (C) 2007 National Children's Bureau.
Recent developments in genetic data analysis: what can they tell us about human demographic history?
Abstract:
Over the last decade, a number of new methods of population genetic analysis based on likelihood have been introduced. This review describes and explains the general statistical techniques that have recently been used, and discusses the underlying population genetic models. Experimental papers that use these methods to infer human demographic and phylogeographic history are reviewed. It appears that the use of likelihood has hitherto had little impact in the field of human population genetics, which is still primarily driven by more traditional approaches. However, with the current uncertainty about the effects of natural selection, population structure and ascertainment of single-nucleotide polymorphism markers, it is suggested that likelihood-based methods may have a greater impact in the future.
Abstract:
In this paper, we address issues in segmentation of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by an airborne laser scanner, contain 2.5-dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field) from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first- and second-order statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, the Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.
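The pipeline outlined in this abstract (grey-level height image, Gabor filtering, windowed first- and second-order statistics) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the real-valued kernel, the FFT-based circular convolution, the non-overlapping windows and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a Gabor filter on a size x size grid (size odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)       # rotated coordinate
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)

def filter_image(image, kernel):
    """Circular convolution through the FFT (image edges wrap around)."""
    image = np.asarray(image, dtype=float)
    kh, kw = kernel.shape
    padded = np.zeros(image.shape, dtype=float)
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to the origin so the output is not displaced.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def window_stats(response, win):
    """Mean and variance of the response in non-overlapping win x win blocks."""
    h, w = response.shape
    blocks = response[: h - h % win, : w - w % win]
    blocks = blocks.reshape(h // win, win, w // win, win)
    return blocks.mean(axis=(1, 3)), blocks.var(axis=(1, 3))

height_map = np.ones((32, 32))                # hypothetical flat terrain
kernel = gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0)
response = filter_image(height_map, kernel)
means, variances = window_stats(response, win=8)
```

On perfectly flat input the windowed variance is zero, whereas buildings and vegetation would be expected to raise it. A classifier or threshold on such window statistics is one possible way to separate ground from non-ground.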
Abstract:
The principal aim of this research is to elucidate the factors driving the total rate of return of non-listed funds using a panel data analytical framework. In line with previous results, we find that core funds exhibit lower yet more stable returns than value-added and, in particular, opportunistic funds, both cross-sectionally and over time. After taking into account overall market exposure, as measured by weighted market returns, the excess returns of value-added and opportunity funds are likely to stem from: high leverage, high exposure to development, active asset management and investment in specialized property sectors. A random effects estimation of the panel data model largely confirms the findings obtained from the fixed effects model. Again, the country and sector property effect shows the strongest significance in explaining total returns. The stock market variable is negative, which hints at switching effects between competing asset classes. For opportunity funds, on average, the returns attributable to gearing are three times higher than those for value-added funds and over five times higher than for core funds. Overall, there is relatively strong evidence indicating that country and sector allocation, style, gearing and fund size combinations impact on the performance of unlisted real estate funds.