80 results for Multivariate statistics
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
During the four years of the fellowship (2006–2009), a database of osteological measurements of the appendicular skeleton of numerous species of the order Carnivora was consolidated. Specifically, 364 individuals of 126 species were measured. The specimens belonged to the collections of the Phyletisches Museum (Jena, Germany), the Museum für Naturkunde (Berlin, Germany), the Museu de Ciències Naturals de la Ciutadella (Barcelona, Spain), the Muséum National d'Histoire Naturelle (Paris, France), and the Museo Nacional de Ciencias Naturales (Madrid, Spain). In addition, these data have been used to prepare three articles on the morphology of certain elements of the appendicular skeleton in carnivores, two of which are currently under review for publication. Two of them, "Scapula, habitat and locomotion in Carnivora" and "Size and shape in the carnivore scapula", relate scapular morphology to factors such as the animal's size, its type of locomotion, and its habitat; the first uses multivariate methodology (functional analysis) and the second the new techniques of geometric morphometrics. The third article, "Scaling and mechanics in the carnivore calcaneus: A comparison of natural and artificial selection", evaluates the effect of different types of selection, natural versus artificial, on the morphology of the calcaneus and its influence on the biomechanics of this bone. Finally, an experimental study was also carried out on the search for stability during arboreal locomotion; its results led to the article "The search for stability on narrow supports: An experimental study in cats and dogs", which is also currently under review.
Abstract:
Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
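The transform and back-transform step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which log-ratio transform the authors use; the centred log-ratio (clr) below is one of the standard choices in compositional data analysis and is an assumption here, as are the function names and the example death density.

```python
import math

def clr(composition):
    """Centred log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log makes the coordinates sum to zero.
    return [v - mean_log for v in logs]

def clr_inverse(coords):
    """Map clr coordinates back to positive parts that sum to one."""
    exps = [math.exp(v) for v in coords]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical life-table death density over four age groups (sums to 1).
deaths = [0.05, 0.10, 0.25, 0.60]
coords = clr(deaths)            # unconstrained real-valued coordinates
restored = clr_inverse(coords)  # unit sum constraint honoured again
```

Any forecasting model would operate on `coords`; applying `clr_inverse` afterwards guarantees that the forecast parts are positive and sum to one.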
Abstract:
Theory of compositional data analysis is often focused on the composition only. However, in practical applications we often treat a composition together with covariables measured on some other scale. This contribution systematically gathers and develops statistical tools for this situation. For instance, for the graphical display of the dependence of a composition on a categorical variable, a colored set of ternary diagrams might be a good idea for a first look at the data, but it will quickly hide important aspects if the composition has many parts or takes extreme values. On the other hand, colored scatterplots of ilr components may not be very instructive for the analyst if the conventional, black-box ilr is used. Thinking in terms of the Euclidean structure of the simplex, we suggest setting up appropriate projections, which on the one hand show the compositional geometry and on the other hand are still comprehensible to a non-expert analyst and readable for all locations and scales of the data. This is done, for example, by defining special balance displays with carefully selected axes. Following this idea, we need to ask systematically how to display, explore, describe, and test the relation to complementary or explanatory data of categorical, real, ratio or again compositional scales. This contribution shows that it is sufficient to use some basic concepts and very few advanced tools from multivariate statistics (principal covariances, multivariate linear models, trellis or parallel plots, etc.) to build appropriate procedures for all these combinations of scales. This has some fundamental implications for their software implementation, and for how they might be taught to analysts who are not already experts in multivariate analysis.
Abstract:
We consider the application of normal theory methods to the estimation and testing of a general type of multivariate regression models with errors-in-variables, in the case where various data sets are merged into a single analysis and the observable variables possibly deviate from normality. The various samples to be merged can differ in the set of observable variables available. We show that there is a convenient way to parameterize the model so that, despite the possible non-normality of the data, normal-theory methods yield correct inferences for the parameters of interest and for the goodness-of-fit test. The theory described encompasses both the functional and structural model cases, and can be implemented using standard software for structural equation models, such as LISREL, EQS, LISCOMP, among others. An illustration with Monte Carlo data is presented.
Abstract:
Standard methods for the analysis of linear latent variable models often rely on the assumption that the vector of observed variables is normally distributed. This normality assumption (NA) plays a crucial role in assessing optimality of estimates, in computing standard errors, and in designing an asymptotic chi-square goodness-of-fit test. The asymptotic validity of NA inferences when the data deviate from normality has been called asymptotic robustness. In the present paper we extend previous work on asymptotic robustness to a general context of multi-sample analysis of linear latent variable models, with a latent component of the model allowed to be fixed across (hypothetical) sample replications, and with the asymptotic covariance matrix of the sample moments not necessarily finite. We will show that, under certain conditions, the matrix $\Gamma$ of asymptotic variances of the analyzed sample moments can be substituted by a matrix $\Omega$ that is a function only of the cross-product moments of the observed variables. The main advantage of this is that inferences based on $\Omega$ are readily available in standard software for covariance structure analysis, and do not require computing sample fourth-order moments. An illustration with simulated data in the context of regression with errors in variables will be presented.
Abstract:
Connections between Statistics and Archaeology have always appeared very fruitful. The objective of this paper is to offer an overview of some statistical techniques that have been developed in recent years and that can be of interest to archaeologists in the short run.
Abstract:
Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.
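As a rough illustration of how a power transformation can link a method on raw data to one on log-transformed data, the sketch below uses the Box-Cox form of the power transformation, which tends to the natural logarithm as the power tends to zero. The abstract does not state the exact parametrization used; the Box-Cox form, the function name, and the example value are assumptions made for illustration only.

```python
import math

def box_cox(x, alpha):
    """Box-Cox power transform; its limit as alpha -> 0 is log(x)."""
    if alpha == 0:
        return math.log(x)
    return (x ** alpha - 1.0) / alpha

# One "frame" per value of the linking parameter, from a linear
# transformation (alpha = 1) towards the logarithm (alpha -> 0).
x = 2.5  # hypothetical positive data value
frames = [(alpha, box_cox(x, alpha)) for alpha in (1.0, 0.5, 0.1, 0.01, 0.001)]
```

Stepping `alpha` in small decrements and recomputing the analysis at each step is the "frame by frame" idea: each frame differs only slightly from the previous one.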
Abstract:
The problem of prediction is considered in a multidimensional setting. Extending an idea presented by Barndorff-Nielsen and Cox, a predictive density for a multivariate random variable of interest is proposed. This density has the form of an estimative density plus a correction term. It gives simultaneous prediction regions with coverage error of smaller asymptotic order than the estimative density. A simulation study is also presented showing the magnitude of the improvement with respect to the estimative method.
Abstract:
Panel data can be arranged into a matrix in two ways, called 'long' and 'wide' formats (LF and WF). The two formats suggest two alternative model approaches for analyzing panel data: (i) univariate regression with varying intercept; and (ii) multivariate regression with latent variables (a particular case of structural equation model, SEM). The present paper compares the two approaches, showing in which circumstances they yield equivalent (in some cases, even numerically equal) results. We show that the univariate approach gives results equivalent to the multivariate approach when restrictions of time invariance (in the paper, the TI assumption) are imposed on the parameters of the multivariate model. It is shown that the restrictions implicit in the univariate approach can be assessed by chi-square difference testing of two nested multivariate models. In addition, common tests encountered in the econometric analysis of panel data, such as the Hausman test, are shown to have an equivalent representation as chi-square difference tests. Commonalities and differences between the univariate and multivariate approaches are illustrated using an empirical panel data set of firms' profitability as well as a simulated panel data set.
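The difference between the two arrangements can be shown with a small sketch. The firm labels, years, and profitability values are invented for illustration, and the helper names are hypothetical; the abstract itself gives no data layout.

```python
from collections import defaultdict

# Hypothetical long-format panel: one row per (firm, year) observation.
long_rows = [
    ("A", 2001, 0.12), ("A", 2002, 0.15),
    ("B", 2001, 0.08), ("B", 2002, 0.07),
]

def long_to_wide(rows):
    """Return {unit: {period: value}}: one record per unit, one entry per period."""
    wide = defaultdict(dict)
    for unit, period, value in rows:
        wide[unit][period] = value
    return dict(wide)

def wide_to_long(wide):
    """Flatten the wide records back into (unit, period, value) rows."""
    return [(u, t, v) for u in sorted(wide) for t, v in sorted(wide[u].items())]

wide = long_to_wide(long_rows)
```

In the long format each row is a single observation, which suits a univariate regression; in the wide format each unit carries one variable per period, which suits a multivariate model.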
Abstract:
This paper presents an initial attempt to tackle the ever so "tricky" points encountered when dealing with energy accounting, and then illustrates how such a system of accounting can be used when assessing metabolic changes in societies. The paper is divided into four main sections. The first three present a general discussion of the main issues encountered when conducting energy analyses. The last section then combines this heuristic approach with its formalization, in quantitative terms, for the analysis of possible energy scenarios. Section one covers the broader issue of how to account for the relevant categories used when accounting for Joules of energy, emphasizing the clear distinction between Primary Energy Sources (PES), which are the physical entities exploited to derive usable energy forms (energy carriers), and Energy Carriers (EC), the actual useful energy that is transmitted for the appropriate end uses within a society. Section two sheds light on the concept of Energy Return on Investment (EROI). Here, it is emphasized that a certain amount of energy carriers must already be available in order to extract or exploit Primary Energy Sources and thereafter generate a net supply of energy carriers. It is pointed out that the current trend of intense energy supply has only been possible because of the heavy use of, and dependence on, fossil energy. Section three follows up on the discussion of EROI, indicating that a single numeric indicator such as an output/input ratio is not sufficient for assessing the performance of energetic systems. Rather, it underlines the need for an integrated approach that incorporates (i) how big the net supply of Joules of EC can be, given an amount of extracted PES (the external constraints); (ii) how much EC needs to be invested to extract an amount of PES; and (iii) the power level that it takes for both processes to succeed.
Section four, finally, puts the theoretical concepts into play, assessing how the metabolic performance of societies can be accounted for within this analytical framework.
Abstract:
When actuaries face the problem of pricing an insurance contract that contains different types of coverage, such as a motor insurance or homeowner's insurance policy, they usually assume that types of claim are independent. However, this assumption may not be realistic: several studies have shown that there is a positive correlation between types of claim. Here we introduce different regression models in order to relax the independence assumption, including zero-inflated models to account for excess of zeros and overdispersion. These models have been largely ignored for multivariate Poisson data, mainly because of their computational difficulties. Bayesian inference based on MCMC helps to solve this problem (and also lets us derive, for several quantities of interest, posterior summaries to account for uncertainty). Finally, these models are applied to an automobile insurance claims database with three different types of claims. We analyse the consequences for pure and loaded premiums when the independence assumption is relaxed by using different multivariate Poisson regression models and their zero-inflated versions.
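To make the idea of zero inflation concrete, here is a minimal univariate sketch of a zero-inflated Poisson probability mass function. It is not the multivariate regression model of the abstract: it only shows how an extra point mass at zero is mixed with a Poisson count. The function names and the parameter values `lam` and `pi` are illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of observing k claims with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a
    structural zero, otherwise it follows a Poisson(lam) law."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

lam, pi = 0.3, 0.4  # hypothetical claim rate and zero-inflation weight
p_zero_plain = poisson_pmf(0, lam)
p_zero_inflated = zip_pmf(0, lam, pi)
mean_inflated = sum(k * zip_pmf(k, lam, pi) for k in range(60))
```

The inflated model puts more mass on zero than the plain Poisson and has mean `(1 - pi) * lam` with variance larger than its mean, which is the overdispersion mentioned in the abstract.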
Abstract:
This paper proposes a contemporaneous-threshold multivariate smooth transition autoregressive (C-MSTAR) model in which the regime weights depend on the ex ante probabilities that latent regime-specific variables exceed certain threshold values. A key feature of the model is that the transition function depends on all the parameters of the model as well as on the data. Since the mixing weights are also a function of the regime-specific innovation covariance matrix, the model can account for contemporaneous regime-specific co-movements of the variables. The stability and distributional properties of the proposed model are discussed, as well as issues of estimation, testing and forecasting. The practical usefulness of the C-MSTAR model is illustrated by examining the relationship between US stock prices and interest rates.
Abstract:
Tropical cyclones are affected by a large number of climatic factors, which translates into complex patterns of occurrence. The variability of annual metrics of tropical-cyclone activity has been intensively studied, in particular since the sudden activation of the North Atlantic in the mid-1990s. We first provide a brief overview of previous work by diverse authors on these annual metrics for the North Atlantic basin, where the natural variability of the phenomenon, the existence of trends, the drawbacks of the records, and the influence of global warming have been the subject of interesting debates. Next, we present an alternative approach that does not focus on seasonal features but on the characteristics of single events [Corral et al., Nature Phys. 6, 693 (2010)]. It is argued that the individual-storm power dissipation index (PDI) constitutes a natural way to describe each event and, further, that the PDI statistics yield a robust law for the occurrence of tropical cyclones in terms of a power law. In this context, methods of fitting these distributions are discussed. As an important extension to this work we introduce a distribution function that models the whole range of the PDI density (excluding incompleteness effects at the smallest values): the gamma distribution, consisting of a power law with an exponential decay at the tail. The characteristic scale of this decay, represented by the cutoff parameter, provides very valuable information on the finite size of the basin, via the largest values of the PDIs that the basin can sustain. We use the gamma fit to evaluate the influence of sea surface temperature (SST) on the occurrence of extreme PDI values, for which we find an increase of around 50% in the values of these basin-wide events for a 0.49 °C average SST difference. Similar findings are observed for the effects of the positive phase of the Atlantic Multidecadal Oscillation and the number of hurricanes in a season on the PDI distribution.
In the case of the El Niño Southern Oscillation (ENSO), positive and negative values of the multivariate ENSO index do not have a significant effect on the PDI distribution; however, when only extreme values of the index are used, it is found that the presence of El Niño decreases the PDI of the most extreme hurricanes.
Abstract:
In order to obtain a high-resolution Pleistocene stratigraphy, eleven continuously cored boreholes, 100 to 220 m deep, were drilled in the northern part of the Po Plain by Regione Lombardia in the last five years. Quantitative provenance analysis (QPA, Weltje and von Eynatten, 2004) of Pleistocene sands was carried out by using multivariate statistical analysis (principal component analysis, PCA, and similarity analysis) on an integrated data set, including high-resolution bulk petrography and heavy-mineral analyses of Pleistocene sands and of 250 major and minor modern rivers draining the southern flank of the Alps from West to East (Garzanti et al., 2004; 2006). Prior to the onset of major Alpine glaciations, metamorphic and quartzofeldspathic detritus from the Western and Central Alps was carried from the axial belt to the Po basin longitudinally, parallel to the South Alpine belt, by a trunk river (Vezzoli and Garzanti, 2008). This scenario rapidly changed during marine isotope stage 22 (0.87 Ma), with the onset of the first major Pleistocene glaciation in the Alps (Muttoni et al., 2003). PCA and similarity analysis of core samples show that the longitudinal trunk river at this time was shifted southward by the rapid southward and westward progradation of transverse alluvial river systems fed from the Central and Southern Alps. Sediments were transported southward by braided river systems, and glacial sediments transported by Alpine valley glaciers invaded the alluvial plain.
Key words: Detrital modes; Modern sands; Provenance; Principal Components Analysis; Similarity, Canberra Distance; palaeodrainage
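The key words mention the Canberra distance as the similarity measure. A minimal sketch of that distance follows, using its standard definition as the sum of |a - b| / (|a| + |b|) over components, with terms where both values are zero counted as zero. The two sample vectors are hypothetical percentages, not data from the study, and how the authors applied the distance to their data set is not described in the abstract.

```python
def canberra(x, y):
    """Canberra distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:  # a 0/0 term contributes nothing
            total += abs(a - b) / denom
    return total

# Hypothetical detrital modes (percent) for a core sample and a modern river.
core_sample = [60.0, 30.0, 10.0]
modern_river = [50.0, 30.0, 20.0]
distance = canberra(core_sample, modern_river)
```

Because each term is scaled by the size of the two values being compared, rare components such as heavy minerals weigh as much as abundant ones, which is one reason this distance is used for compositional comparisons.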