10 resultados para mathematical regression

em Universitat de Girona, Spain


Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variable are components we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram facilitating the interpretation of the analysis in terms of components. An exemple with time-budgets illustrates the method and the graphical features

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This new proposed methodology provides an alternative, more acceptable approach for geologists as their focus is on mapping the unknown tephra with relevant eruptive events rather than estimating the age of unknown tephra. Kew words: Tephrochronology; Segmented regression

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Time series regression models are especially suitable in epidemiology for evaluating short-term effects of time-varying exposures on health. The problem is that potential for confounding in time series regression is very high. Thus, it is important that trend and seasonality are properly accounted for. Our paper reviews the statistical models commonly used in time-series regression methods, specially allowing for serial correlation, make them potentially useful for selected epidemiological purposes. In particular, we discuss the use of time-series regression for counts using a wide range Generalised Linear Models as well as Generalised Additive Models. In addition, recently critical points in using statistical software for GAM were stressed, and reanalyses of time series data on air pollution and health were performed in order to update already published. Applications are offered through an example on the relationship between asthma emergency admissions and photochemical air pollutants

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notations of statistics, such as Information or Likelihood, can be identified in the algebraical structure of A2(P) and their corresponding notions in compositional data analysis, such as Aitchison distance or centered log ratio transform. In this way very elaborated aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. E.g. combination of statistical information such as Bayesian updating, combination of likelihood and robust M-estimation functions are simple additions/ perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood based statistics for general exponential families turns out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite dimensional linear subspaces of A2(P) and they correspond to finite dimensional subspaces formed by their posterior in the dual information space A2(Pprior). The Aitchison norm can identified with mean Fisher information. The closing constant itself is identified with a generalization of the cummulant function and shown to be Kullback Leiblers directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback Leibler information and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P) valued random variables, such as estimation functions or likelihoods, give a further interpretation of Fisher information as the expected squared norm of evidence and a scale free understanding of unbiased reasoning

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The application of Discriminant function analysis (DFA) is not a new idea in the study of tephrochrology. In this paper, DFA is applied to compositional datasets of two different types of tephras from Mountain Ruapehu in New Zealand and Mountain Rainier in USA. The canonical variables from the analysis are further investigated with a statistical methodology of change-point problems in order to gain a better understanding of the change in compositional pattern over time. Finally, a special case of segmented regression has been proposed to model both the time of change and the change in pattern. This model can be used to estimate the age for the unknown tephras using Bayesian statistical calibration

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There is almost not a case in exploration geology, where the studied data doesn’t includes below detection limits and/or zero values, and since most of the geological data responds to lognormal distributions, these “zero data” represent a mathematical challenge for the interpretation. We need to start by recognizing that there are zero values in geology. For example the amount of quartz in a foyaite (nepheline syenite) is zero, since quartz cannot co-exists with nepheline. Another common essential zero is a North azimuth, however we can always change that zero for the value of 360°. These are known as “Essential zeros”, but what can we do with “Rounded zeros” that are the result of below the detection limit of the equipment? Amalgamation, e.g. adding Na2O and K2O, as total alkalis is a solution, but sometimes we need to differentiate between a sodic and a potassic alteration. Pre-classification into groups requires a good knowledge of the distribution of the data and the geochemical characteristics of the groups which is not always available. Considering the zero values equal to the limit of detection of the used equipment will generate spurious distributions, especially in ternary diagrams. Same situation will occur if we replace the zero values by a small amount using non-parametric or parametric techniques (imputation). The method that we are proposing takes into consideration the well known relationships between some elements. For example, in copper porphyry deposits, there is always a good direct correlation between the copper values and the molybdenum ones, but while copper will always be above the limit of detection, many of the molybdenum values will be “rounded zeros”. So, we will take the lower quartile of the real molybdenum values and establish a regression equation with copper, and then we will estimate the “rounded” zero values of molybdenum by their corresponding copper values. The method could be applied to any type of data, provided we establish first their correlation dependency. One of the main advantages of this method is that we do not obtain a fixed value for the “rounded zeros”, but one that depends on the value of the other variable. Key words: compositional data analysis, treatment of zeros, essential zeros, rounded zeros, correlation dependency

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sediment composition is mainly controlled by the nature of the source rock(s), and chemical (weathering) and physical processes (mechanical crushing, abrasion, hydrodynamic sorting) during alteration and transport. Although the factors controlling these processes are conceptually well understood, detailed quantification of compositional changes induced by a single process are rare, as are examples where the effects of several processes can be distinguished. The present study was designed to characterize the role of mechanical crushing and sorting in the absence of chemical weathering. Twenty sediment samples were taken from Alpine glaciers that erode almost pure granitoid lithologies. For each sample, 11 grain-size fractions from granules to clay (ø grades <-1 to >9) were separated, and each fraction was analysed for its chemical composition. The presence of clear steps in the box-plots of all parts (in adequate ilr and clr scales) against ø is assumed to be explained by typical crystal size ranges for the relevant mineral phases. These scatter plots and the biplot suggest a splitting of the full grain size range into three groups: coarser than ø=4 (comparatively rich in SiO2, Na2O, K2O, Al2O3, and dominated by “felsic” minerals like quartz and feldspar), finer than ø=8 (comparatively rich in TiO2, MnO, MgO, Fe2O3, mostly related to “mafic” sheet silicates like biotite and chlorite), and intermediate grains sizes (4≤ø <8; comparatively rich in P2O5 and CaO, related to apatite, some feldspar). To further test the absence of chemical weathering, the observed compositions were regressed against three explanatory variables: a trend on grain size in ø scale, a step function for ø≥4, and another for ø≥8. The original hypothesis was that the trend could be identified with weathering effects, whereas each step function would highlight those minerals with biggest characteristic size at its lower end. Results suggest that this assumption is reasonable for the step function, but that besides weathering some other factors (different mechanical behavior of minerals) have also an important contribution to the trend. Key words: sediment, geochemistry, grain size, regression, step function

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and D´ıaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure of the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a Hermite-polynomials-based basis. To get the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k- order polynomials are used as predictand variables and weights are proportional to the reference density. Finally, for the case of 2-order Hermite polinomials (normal reference) and 1-order Laguerre polinomials (exponential), one can also derive the coordinates from their relationships to the classical mean and variance. Apart of these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain size distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, like their composition

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Optimum experimental designs depend on the design criterion, the model and the design region. The talk will consider the design of experiments for regression models in which there is a single response with the explanatory variables lying in a simplex. One example is experiments on various compositions of glass such as those considered by Martin, Bursnall, and Stillman (2001). Because of the highly symmetric nature of the simplex, the class of models that are of interest, typically Scheff´e polynomials (Scheff´e 1958) are rather different from those of standard regression analysis. The optimum designs are also rather different, inheriting a high degree of symmetry from the models. In the talk I will hope to discuss a variety of modes for such experiments. Then I will discuss constrained mixture experiments, when not all the simplex is available for experimentation. Other important aspects include mixture experiments with extra non-mixture factors and the blocking of mixture experiments. Much of the material is in Chapter 16 of Atkinson, Donev, and Tobias (2007). If time and my research allows, I would hope to finish with a few comments on design when the responses, rather than the explanatory variables, lie in a simplex. References Atkinson, A. C., A. N. Donev, and R. D. Tobias (2007). Optimum Experimental Designs, with SAS. Oxford: Oxford University Press. Martin, R. J., M. C. Bursnall, and E. C. Stillman (2001). Further results on optimal and efficient designs for constrained mixture experiments. In A. C. Atkinson, B. Bogacka, and A. Zhigljavsky (Eds.), Optimal Design 2000, pp. 225–239. Dordrecht: Kluwer. Scheff´e, H. (1958). Experiments with mixtures. Journal of the Royal Statistical Society, Ser. B 20, 344–360. 1

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Based on Rijt-Plooij and Plooij’s (1992) research on emergence of regression periods in the first two years of life, the presence of such periods in a group of 18 babies (10 boys and 8 girls, aged between 3 weeks and 14 months) from a Catalonian population was analyzed. The measurements were a questionnaire filled in by the infants’ mothers, a semi-structured weekly tape-recorded interview, and observations in their homes. The procedure and the instruments used in the project follow those proposed by Rijt-Plooij and Plooij. Our results confirm the existence of the regression periods in the first year of children’s life. Inter-coder agreement for trained coders was 78.2% and within-coder agreement was 90.1 %. In the discussion, the possible meaning and relevance of regression periods in order to understand development from a psychobiological and social framework is commented upon