8 results for conditional
in Universitat de Girona, Spain
Abstract:
The biplot has proved to be a powerful descriptive and analytical tool in many areas of application of statistics. For compositional data the necessary theoretical adaptation has been provided, with illustrative applications, by Aitchison (1990) and Aitchison and Greenacre (2002). These papers were restricted to the interpretation of simple compositional data sets. In many situations, however, the problem has to be described in some form of conditional modelling. For example, in a clinical trial studying how patients' steroid metabolite compositions may change as a result of different treatment regimes, interest lies in relating the compositions after treatment to the compositions before treatment and to the nature of the treatments applied. To study this through a biplot technique requires the development of some form of conditional compositional biplot, which is the purpose of this paper. We choose as a motivating application an analysis of the 1992 US Presidential Election, where interest may be in how the three-part composition (the percentage division of the presidential vote in each state among the three candidates, Bush, Clinton and Perot) depends on the ethnic composition and on the urban-rural composition of the state. The methodology of conditional compositional biplots is first developed, and a detailed interpretation of the 1992 US Presidential Election is then provided. We use a second application, involving the conditional variability of tektite mineral compositions with respect to major oxide compositions, to demonstrate some hazards of simplistic interpretation of biplots. Finally, we conjecture on further possible applications of conditional compositional biplots.
Abstract:
One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which some components are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies and ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models: an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
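The two-stage idea described in this abstract can be illustrated with a minimal simulation sketch. It is not the authors' model: the function name `sample_two_stage`, the independent Bernoulli incidence probabilities, the independent normal log-weights and the rule for handling an all-zero incidence draw are all assumptions made for illustration.

```python
import math
import random


def sample_two_stage(presence_probs, mu, sigma, rng):
    """Draw one composition: stage 1 decides the zeros, stage 2 shares the unit."""
    # Stage 1: independent Bernoulli incidence for each part.
    present = [rng.random() < p for p in presence_probs]
    if not any(present):
        # Assumption of this sketch: at least one part must be present,
        # so one is switched on at random when every draw comes up zero.
        present[rng.randrange(len(present))] = True
    # Stage 2: independent normal log-weights on the present parts,
    # exponentiated and closed to sum to 1 (a logistic-normal draw
    # on the non-zero subcomposition).
    weights = [
        math.exp(rng.gauss(m, s)) if on else 0.0
        for m, s, on in zip(mu, sigma, present)
    ]
    total = sum(weights)
    return present, [w / total for w in weights]
```

Repeated calls yield the two objects the abstract mentions: the `present` vectors stack into an incidence matrix, and the returned compositions into a conditional compositional matrix.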
Abstract:
This analysis was stimulated by a real data analysis problem involving household expenditure data. The full dataset contains expenditure data for a sample of 1224 households. The expenditure is broken down at 2 hierarchical levels: 9 major levels (e.g. housing, food, utilities etc.) and 92 minor levels. There are also 5 factors and 5 covariates at the household level. Not surprisingly, there are a small number of zeros at the major level, but many zeros at the minor level. The question is how best to model the zeros. Clearly, models that try to add a small amount to the zero terms are not appropriate in general, as at least some of the zeros are clearly structural, e.g. alcohol/tobacco for households that are teetotal. The key question then is how to build suitable conditional models. For example, is the sub-composition of spending excluding alcohol/tobacco similar for teetotal and non-teetotal households? In other words, we are looking for sub-compositional independence. Also, what determines whether a household is teetotal? Can we assume that it is independent of the composition? In general, whether a household is teetotal will clearly depend on the household-level variables, so we need to be able to model this dependence. The other tricky question is that, with zeros on more than one component, we need to be able to model dependence and independence of zeros on the different components. Lastly, while some zeros are structural, others may not be. For example, for expenditure on durables, it may be a matter of chance whether a particular household spends money on durables within the sample period. This would clearly be distinguishable if we had longitudinal data, but may still be distinguishable by looking at the distribution, on the assumption that random zeros will usually occur in situations where any non-zero expenditure is not small.
While this analysis is based on economic data, the ideas carry over to many other situations, including geological data, where minerals may be missing for structural reasons (similar to alcohol), or missing because they occur only in random regions which may be missed in a sample (similar to the durables).
Abstract:
This paper examines a dataset which is modelled well by the Poisson-Log Normal process and by this process mixed with Log Normal data, both of which are turned into compositions. This generates compositional data that have zeros without any need for conditional models or for assuming that there are missing or censored data that need adjustment. It also enables us to model dependence on covariates and within the composition.
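How count data of this kind produce zeros once closed to a composition can be seen in a small simulation sketch. The names `poisson` and `pln_composition`, the independent log-normal intensities and the parameter values are illustrative assumptions; the paper's actual model, covariates and dependence structure are not reproduced here.

```python
import math
import random


def poisson(lam, rng):
    """Poisson(lam) draw: count unit-rate exponential arrivals up to time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def pln_composition(mu, sigma, rng):
    """Poisson-log-normal counts closed to a composition (None if all counts are 0)."""
    # Log-normal intensities, one per part (independent here for simplicity).
    rates = [math.exp(rng.gauss(m, s)) for m, s in zip(mu, sigma)]
    counts = [poisson(r, rng) for r in rates]
    total = sum(counts)
    if total == 0:
        return None  # no composition can be formed from an all-zero count vector
    return [c / total for c in counts]
```

With a low mean log-intensity for one part, for example `pln_composition([-1.0, 1.0, 2.0], [0.5, 0.5, 0.5], random.Random(0))`, that part is frequently an exact zero in the resulting composition, with no replacement or censoring step involved.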
Abstract:
The Dirichlet family owes its privileged status within simplex distributions to ease of interpretation and good mathematical properties. In particular, we recall fundamental properties for the analysis of compositional data such as closure under amalgamation and subcomposition. From a probabilistic point of view, it is characterised (uniquely) by a variety of independence relationships which make it indisputably the reference model for expressing the nontrivial idea of substantial independence for compositions. Indeed, its well-known inadequacy as a general model for compositional data stems from such an independence structure together with the poverty of its parametrisation. In this paper a new class of distributions (called Flexible Dirichlet), capable of handling various dependence structures and containing the Dirichlet as a special case, is presented. The new model exhibits a considerably richer parametrisation which, for example, allows the means and (part of) the variance-covariance matrix to be modelled separately. Moreover, such a model preserves some good mathematical properties of the Dirichlet, i.e. closure under amalgamation and subcomposition, with new parameters simply related to the parent composition parameters. Furthermore, the joint and conditional distributions of subcompositions and relative totals can be expressed as simple mixtures of two Flexible Dirichlet distributions. The basis generating the Flexible Dirichlet, though keeping compositional invariance, shows a dependence structure which allows various forms of partitional dependence to be contemplated by the model (e.g. non-neutrality, subcompositional dependence and subcompositional non-invariance), independence cases being identified by suitable parameter configurations. In particular, within this model substantial independence among subsets of components of the composition naturally occurs when the subsets have a Dirichlet distribution.
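The closure of the ordinary Dirichlet under amalgamation, which the abstract recalls, can be checked numerically with a short sketch. It covers only the standard Dirichlet, not the Flexible Dirichlet; the helper names `dirichlet`, `amalgamate` and `mean_first_part` and the parameter values in the usage note are illustrative assumptions.

```python
import random


def dirichlet(alpha, rng):
    """Dirichlet draw via independent Gamma(alpha_i, 1) variates closed to sum 1."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


def amalgamate(x, groups):
    """Sum the parts of composition x within each group of indices."""
    return [sum(x[i] for i in grp) for grp in groups]


def mean_first_part(samples):
    """Sample mean of the first part across a list of compositions."""
    return sum(s[0] for s in samples) / len(samples)
```

For instance, amalgamating draws from a Dirichlet with parameters (1, 2, 3, 4) over the groups `[[0, 1], [2, 3]]` should behave like direct draws from a Dirichlet with parameters (3, 7), so the sample mean of the first amalgamated part should be close to 3/10 in both cases.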
Abstract:
This article attempts to provide a descriptive view of the behaviour of haber, ser and estar in 'periphrastic' constructions of medieval Castilian. As for the 'AUX + infinitive' periphrases, it is observed that they do not fit an analysis of the type 'VP that selects a VP'. In addition, evidence is provided in favour of a view of the derivation of analytic futures and conditionals in which the [-finite] verb excorporates from the functional auxiliary and moves to C⁰. Finally, it is established that the 'haber/ser/estar + participle' constructions can be analysed as lexical verbs that subcategorize for a small clause whose predicate is the participle. This makes it possible to relate participle fronting to the fronting of APs and of internal arguments. The basic aspects of the syntactic change leading from medieval and preclassical Castilian to present-day Spanish are: (a) the lexical category VP of the verbs in question is reanalysed as a functional category AspP, and these verbs become auxiliaries; (b) there is a change in subcategorization, since these verbs cease to subcategorize for a small clause and come to subcategorize for a VPmax; and (c) the loss of the maximal projection SCONC1 entails the disappearance of the effects of the Tobler-Mussafia law, of the possibility of fronting the participle, and also of participle agreement in the compound perfects.
Abstract:
(INFINITIVE + CLITIC + AUX) is an evidential configuration in Old Spanish and Old Catalan, whereas (PARTICIPLE + CLITIC + AUX) is an instance of weak or unmarked focus fronting. The evidentiality of mesoclitic structures can be put forward on the basis of three main arguments: a) mesoclisis is not compulsory (i.e., whenever you have a clitic, you can have either mesoclisis or proclisis/enclisis); b) mesoclitic futures and conditionals are attested in interrogative sentences (with wh- elements); and c) they are not found in derived adverbial clauses (which is what you expect if they have an evidential value, since they bring about intervention effects corresponding to the derivational account of conditional and temporal sentences, for example; see Haegeman 2007 and ff.), and are related to high modal expressions (thus interfering with MoodPIrrealis).
Abstract:
Survival studies are concerned with the time that elapses from the start of the study (diagnosis of the disease, start of treatment, ...) until the event of interest occurs (death, cure, improvement, ...). However, this event is often observed more than once in the same individual during the follow-up period (multivariate survival data). In this case, a methodology different from that used in standard survival analysis is needed. The main problem in studying this type of data is that the observations may not be independent. Until now, this problem has been handled in two different ways depending on the dependent variable. If this variable follows a distribution from the exponential family, generalized linear mixed models (GLMM) are used; if this variable is time, whose probability distribution does not belong to this family, multivariate survival analysis is used. The aim of this thesis is to unify these two approaches, that is, to use time as the dependent variable, with groupings of individuals or observations, within a GLMM, in order to introduce new methods for the treatment of this type of data.