6 results for Bacon
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
This analysis was stimulated by a real data analysis problem involving household expenditure data. The full dataset contains expenditure data for a sample of 1224 households. The expenditure is broken down at 2 hierarchical levels: 9 major levels (e.g. housing, food, utilities etc.) and 92 minor levels. There are also 5 factors and 5 covariates at the household level. Not surprisingly, there are a small number of zeros at the major level, but many zeros at the minor level. The question is how best to model the zeros. Clearly, models that try to add a small amount to the zero terms are not appropriate in general, as at least some of the zeros are clearly structural, e.g. alcohol/tobacco for households that are teetotal. The key question then is how to build suitable conditional models. For example, is the sub-composition of spending excluding alcohol/tobacco similar for teetotal and non-teetotal households? In other words, we are looking for sub-compositional independence. Also, what determines whether a household is teetotal? Can we assume that it is independent of the composition? In general, whether a household is teetotal will clearly depend on the household level variables, so we need to be able to model this dependence. The other tricky question is that with zeros on more than one component, we need to be able to model dependence and independence of zeros on the different components. Lastly, while some zeros are structural, others may not be; for example, for expenditure on durables, it may be chance as to whether a particular household spends money on durables within the sample period.
This would clearly be distinguishable if we had longitudinal data, but may still be distinguishable by looking at the distribution, on the assumption that random zeros will usually be for situations where any non-zero expenditure is not small. While this analysis is based on economic data, the ideas carry over to many other situations, including geological data, where minerals may be missing for structural reasons (similar to alcohol), or missing because they occur only in random regions which may be missed in a sample (similar to the durables).
Abstract:
Aitchison and Bacon-Shone (1999) considered convex linear combinations of compositions. In other words, they investigated compositions of compositions, where the mixing composition follows a logistic Normal distribution (or a perturbation process) and the compositions being mixed follow a logistic Normal distribution. In this paper, I investigate the extension to situations where the mixing composition varies with a number of dimensions. Examples would be where the mixing proportions vary with time or distance or a combination of the two. Practical situations include a river where the mixing proportions vary along the river, or across a lake and possibly with a time trend. This is illustrated with a dataset similar to that used in the Aitchison and Bacon-Shone paper, which looked at how pollution in a loch depended on the pollution in the three rivers that feed the loch. Here, I explicitly model the variation in the linear combination across the loch, assuming that the mean of the logistic Normal distribution depends on the river flows and relative distance from the source origins.
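As a rough illustration of the idea in this abstract, the sketch below forms a convex linear combination of three source compositions, with a logistic Normal mixing composition whose mean depends on distance. The linear form of the mean, the additive log-ratio parameterisation, and all names and numbers (`mixing_weights`, `intercept`, `slope`, `sd`, the river compositions) are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def alr_inverse(y):
    """Map a vector in R^(D-1) to a D-part composition (additive logistic transform)."""
    expd = [math.exp(v) for v in y] + [1.0]
    total = sum(expd)
    return [e / total for e in expd]


def mixing_weights(distance, intercept, slope, sd, rng):
    """Draw a logistic Normal mixing composition.

    Assumption: the mean of each log-ratio is linear in distance.
    """
    y = [rng.gauss(a + b * distance, sd) for a, b in zip(intercept, slope)]
    return alr_inverse(y)


def mix(weights, sources):
    """Convex linear combination of compositions: sum_i weights[i] * sources[i]."""
    n_parts = len(sources[0])
    return [sum(w * s[j] for w, s in zip(weights, sources)) for j in range(n_parts)]


# Hypothetical pollutant compositions for three rivers feeding a loch.
rivers = [
    [0.70, 0.20, 0.10],
    [0.30, 0.50, 0.20],
    [0.10, 0.30, 0.60],
]

rng = random.Random(1)
# Hypothetical coefficients: the first river's share declines with distance.
weights = mixing_weights(distance=2.0, intercept=[1.0, 0.0],
                         slope=[-0.5, 0.2], sd=0.3, rng=rng)
sample = mix(weights, rivers)
```

Because the weights are non-negative and sum to one, the mixed vector `sample` is itself a composition, which is why the combination is called convex.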
Abstract:
This paper examines a dataset which is modeled well by the Poisson-Log Normal process and by this process mixed with Log Normal data, which are both turned into compositions. This generates compositional data that has zeros without any need for conditional models or assuming that there is missing or censored data that needs adjustment. It also enables us to model dependence on covariates and within the composition.
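A minimal sketch of how such zeros can arise: counts are drawn from a Poisson distribution whose rate is itself Log Normal, and the counts are then closed to proportions. For simplicity the components are drawn independently here, which is an assumption of this sketch (the abstract refers to modelling dependence within the composition); the function names and parameter values are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson variate by Knuth's multiplication method (suits small rates)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def pln_composition(mu, sigma, rng):
    """Draw Poisson-Log Normal counts and close them to a composition.

    Assumption: components are independent. Returns None when every
    count is zero, since no composition is defined in that case.
    """
    rates = [math.exp(rng.gauss(m, s)) for m, s in zip(mu, sigma)]
    counts = [poisson(r, rng) for r in rates]
    total = sum(counts)
    if total == 0:
        return None
    return [c / total for c in counts]


rng = random.Random(0)
# Hypothetical log-scale means: the third part has a low rate, so it is often zero.
draws = [pln_composition([2.0, 0.5, -1.5], [0.5, 0.5, 0.5], rng) for _ in range(200)]
compositions = [d for d in draws if d is not None]
zero_share = sum(1 for c in compositions if 0.0 in c) / len(compositions)
```

The zeros appear directly from the count process, so no zero replacement or conditional model is needed in this construction.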
Abstract:
The application of discriminant function analysis (DFA) is not a new idea in the study of tephrochronology. In this paper, DFA is applied to compositional datasets of two different types of tephras from Mount Ruapehu in New Zealand and Mount Rainier in the USA. The canonical variables from the analysis are further investigated with a statistical methodology for change-point problems in order to gain a better understanding of the change in compositional pattern over time. Finally, a special case of segmented regression is proposed to model both the time of change and the change in pattern. This model can be used to estimate the age of unknown tephras using Bayesian statistical calibration.
Abstract:
In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This newly proposed methodology provides an alternative, more acceptable approach for geologists, as their focus is on mapping the unknown tephra to relevant eruptive events rather than estimating the age of the unknown tephra. Key words: Tephrochronology; Segmented regression
Abstract:
Objectives: We undertook a systematic literature review as a background to the European League Against Rheumatism (EULAR) recommendations for conducting clinical trials in anti-neutrophil cytoplasm antibody associated vasculitis (AAV), and to assess the quality of evidence for outcome measures in AAV. Methods: Using a systematic Medline search, we categorised the identified studies according to diagnoses. Factors affecting remission, relapse, renal function and overall survival were identified. Results: A total of 44 papers were reviewed from 502 identified by our search criteria. There was considerable inconsistency in definitions of end points. Remission rates varied from 30% to 93% in Wegener granulomatosis (WG), 75% to 89% in microscopic polyangiitis (MPA) and 81% to 91% in Churg-Strauss syndrome (CSS). The 5-year survival for WG, MPA and CSS was 74–91%, 45–76% and 60–97%, respectively. Relapse (variably defined) was common in the first 2 years but the frequency varied: 18% to 60% in WG, 8% in MPA, and 35% in CSS. The rate of renal survival in WG varied from 23% at 15 months to 23% at 120 months. Methods used to assess morbidity varied between studies. Ignoring the variations in definitions of the stage of disease, factors influencing remission, relapse, renal and overall survival included immunosuppressive therapy used, type of organ involvement, presence of ANCA, older age and male gender. Conclusions: Factors influencing remission, relapse, renal and overall survival include the type of immunosuppressive therapy used, pattern of organ involvement, presence of ANCA, older age and male gender. Methodological variations between studies highlight the need for a consensus on terminology and definitions for future conduct of clinical studies in AAV.