948 resultados para conditional autoregressive models


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Pós-graduação em Matematica Aplicada e Computacional - FCT

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the timevarying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed, acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-invariables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term by term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first-stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival. Methods Multilevel discrete-time and Bayesian spatial survival models were used to study geographical inequalities in all-cause survival for a population-based colorectal cancer cohort of 22,727 cases aged 20–84 years diagnosed during 1997–2007 from Queensland, Australia. Results Both approaches were viable on this large dataset, and produced similar estimates of the fixed effects. After adding area-level covariates, the between-area variability in survival using multilevel discrete-time models was no longer significant. Spatial inequalities in survival were also markedly reduced after adjusting for aggregated area-level covariates. Only the multilevel approach however, provided an estimation of the contribution of geographical variation to the total variation in survival between individual patients. Conclusions With little difference observed between the two approaches in the estimation of fixed effects, multilevel models should be favored if there is a clear hierarchical data structure and measuring the independent impact of individual- and area-level effects on survival differences is of primary interest. Bayesian spatial analyses may be preferred if spatial correlation between areas is important and if the priority is to assess small-area variations in survival and map spatial patterns. Both approaches can be readily fitted to geographically enabled survival data from international settings

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well known time series models such as GARCH, ACD and CARR models. They are able to capture many well established features of financial time series including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both conditional and unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model as well as an expression for the autocorrelation function are obtained by writing the MCARR model as a first order autoregressive process with random coefficients. The chapter also introduces inverse gamma (IG) distribution to CARR models. The advantages of CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with asymmetric Gumbel copula is found to provide substantial benefits over the existing VMEM models based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables thereby improving the performance of the model considerably. The economic significance of the results obtained is established when the information content of the volatility forecasts derived is examined.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Discrete Conditional Phase-type (DC-Ph) models consist of a process component (survival distribution) preceded by a set of related conditional discrete variables. This paper introduces a DC-Ph model where the conditional component is a classification tree. The approach is utilised for modelling health service capacities by better predicting service times, as captured by Coxian Phase-type distributions, interfaced with results from a classification tree algorithm. To illustrate the approach, a case-study within the healthcare delivery domain is given, namely that of maternity services. The classification analysis is shown to give good predictors for complications during childbirth. Based on the classification tree predictions, the duration of childbirth on the labour ward is then modelled as either a two or three-phase Coxian distribution. The resulting DC-Ph model is used to calculate the number of patients and associated bed occupancies, patient turnover, and to model the consequences of changes to risk status.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a new version of the hglm package for fittinghierarchical generalized linear models (HGLM) with spatially correlated random effects. A CAR family for conditional autoregressive random effects was implemented. Eigen decomposition of the matrix describing the spatial structure (e.g. the neighborhood matrix) was used to transform the CAR random effectsinto an independent, but heteroscedastic, gaussian random effect. A linear predictor is fitted for the random effect variance to estimate the parameters in the CAR model.This gives a computationally efficient algorithm for moderately sized problems (e.g. n<5000).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Environmental data are spatial, temporal, and often come with many zeros. In this paper, we included space–time random effects in zero-inflated Poisson (ZIP) and ‘hurdle’ models to investigate haulout patterns of harbor seals on glacial ice. The data consisted of counts, for 18 dates on a lattice grid of samples, of harbor seals hauled out on glacial ice in Disenchantment Bay, near Yakutat, Alaska. A hurdle model is similar to a ZIP model except it does not mix zeros from the binary and count processes. Both models can be used for zero-inflated data, and we compared space–time ZIP and hurdle models in a Bayesian hierarchical model. Space–time ZIP and hurdle models were constructed by using spatial conditional autoregressive (CAR) models and temporal first-order autoregressive (AR(1)) models as random effects in ZIP and hurdle regression models. We created maps of smoothed predictions for harbor seal counts based on ice density, other covariates, and spatio-temporal random effects. For both models predictions around the edges appeared to be positively biased. The linex loss function is an asymmetric loss function that penalizes overprediction more than underprediction, and we used it to correct for prediction bias to get the best map for space–time ZIP and hurdle models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Scholars have found that socioeconomic status was one of the key factors that influenced early-stage lung cancer incidence rates in a variety of regions. This thesis examined the association between median household income and lung cancer incidence rates in Texas counties. A total of 254 individual counties in Texas with corresponding lung cancer incidence rates from 2004 to 2008 and median household incomes in 2006 were collected from the National Cancer Institute Surveillance System. A simple linear model and spatial linear models with two structures, Simultaneous Autoregressive Structure (SAR) and Conditional Autoregressive Structure (CAR), were used to link median household income and lung cancer incidence rates in Texas. The residuals of the spatial linear models were analyzed with Moran's I and Geary's C statistics, and the statistical results were used to detect similar lung cancer incidence rate clusters and disease patterns in Texas.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, the exchange rate forecasting performance of neural network models are evaluated against the random walk, autoregressive moving average and generalised autoregressive conditional heteroskedasticity models. There are no guidelines available that can be used to choose the parameters of neural network models and therefore, the parameters are chosen according to what the researcher considers to be the best. Such an approach, however,implies that the risk of making bad decisions is extremely high, which could explain why in many studies, neural network models do not consistently perform better than their time series counterparts. In this paper, through extensive experimentation, the level of subjectivity in building neural network models is considerably reduced and therefore giving them a better chance of Forecasting exchange rates with linear and nonlinear models 415 performing well. The results show that in general, neural network models perform better than the traditionally used time series models in forecasting exchange rates.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criteria (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken. By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The use of hierarchical Bayesian spatial models in the analysis of ecological data is increasingly prevalent. The implementation of these models has been heretofore limited to specifically written software that required extensive programming knowledge to create. The advent of WinBUGS provides access to Bayesian hierarchical models for those without the programming expertise to create their own models and allows for the more rapid implementation of new models and data analysis. This facility is demonstrated here using data collected by the Missouri Department of Conservation for the Missouri Turkey Hunting Survey of 1996. Three models are considered, the first uses the collected data to estimate the success rate for individual hunters at the county level and incorporates a conditional autoregressive (CAR) spatial effect. The second model builds upon the first by simultaneously estimating the success rate and harvest at the county level, while the third estimates the success rate and hunting pressure at the county level. These models are discussed in detail as well as their implementation in WinBUGS and the issues arising therein. Future areas of application for WinBUGS and the latest developments in WinBUGS are discussed as well.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Ecological studies are based on characteristics of groups of individuals, which are common in various disciplines including epidemiology. It is of great interest for epidemiologists to study the geographical variation of a disease by accounting for the positive spatial dependence between neighbouring areas. However, the choice of scale of the spatial correlation requires much attention. In view of a lack of studies in this area, this study aims to investigate the impact of differing definitions of geographical scales using a multilevel model. We propose a new approach -- the grid-based partitions and compare it with the popular census region approach. Unexplained geographical variation is accounted for via area-specific unstructured random effects and spatially structured random effects specified as an intrinsic conditional autoregressive process. Using grid-based modelling of random effects in contrast to the census region approach, we illustrate conditions where improvements are observed in the estimation of the linear predictor, random effects, parameters, and the identification of the distribution of residual risk and the aggregate risk in a study region. The study has found that grid-based modelling is a valuable approach for spatially sparse data while the SLA-based and grid-based approaches perform equally well for spatially dense data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A modelação e análise de séries temporais de valores inteiros têm sido alvo de grande investigação e desenvolvimento nos últimos anos, com aplicações várias em diversas áreas da ciência. Nesta tese a atenção centrar-se-á no estudo na classe de modelos basedos no operador thinning binomial. Tendo como base o operador thinning binomial, esta tese focou-se na construção e estudo de modelos SETINAR(2; p(1); p(2)) e PSETINAR(2; 1; 1)T , modelos autorregressivos de valores inteiros com limiares autoinduzidos e dois regimes, admitindo que as inovações formam uma sucessão de variáveis independentes com distribuição de Poisson. Relativamente ao primeiro modelo analisado, o modelo SETINAR(2; p(1); p(2)), além do estudo das suas propriedades probabilísticas e de métodos, clássicos e bayesianos, para estimar os parâmetros, analisou-se a questão da seleção das ordens, no caso de elas serem desconhecidas. Com este objetivo consideraram-se algoritmos de Monte Carlo via cadeias de Markov, em particular o algoritmo Reversible Jump, abordando-se também o problema da seleção de modelos, usando metodologias clássica e bayesiana. Complementou-se a análise através de um estudo de simulação e uma aplicação a dois conjuntos de dados reais. O modelo PSETINAR(2; 1; 1)T proposto, é também um modelo autorregressivo com limiares autoinduzidos e dois regimes, de ordem unitária em cada um deles, mas apresentando uma estrutura periódica. Estudaram-se as suas propriedades probabilísticas, analisaram-se os problemas de inferência e predição de futuras observações e realizaram-se estudos de simulação.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Latent variable models in finance originate both from asset pricing theory and time series analysis. These two strands of literature appeal to two different concepts of latent structures, which are both useful to reduce the dimension of a statistical model specified for a multivariate time series of asset prices. In the CAPM or APT beta pricing models, the dimension reduction is cross-sectional in nature, while in time-series state-space models, dimension is reduced longitudinally by assuming conditional independence between consecutive returns, given a small number of state variables. In this paper, we use the concept of Stochastic Discount Factor (SDF) or pricing kernel as a unifying principle to integrate these two concepts of latent variables. Beta pricing relations amount to characterize the factors as a basis of a vectorial space for the SDF. The coefficients of the SDF with respect to the factors are specified as deterministic functions of some state variables which summarize their dynamics. In beta pricing models, it is often said that only the factorial risk is compensated since the remaining idiosyncratic risk is diversifiable. Implicitly, this argument can be interpreted as a conditional cross-sectional factor structure, that is, a conditional independence between contemporaneous returns of a large number of assets, given a small number of factors, like in standard Factor Analysis. We provide this unifying analysis in the context of conditional equilibrium beta pricing as well as asset pricing with stochastic volatility, stochastic interest rates and other state variables. We address the general issue of econometric specifications of dynamic asset pricing models, which cover the modern literature on conditionally heteroskedastic factor models as well as equilibrium-based asset pricing models with an intertemporal specification of preferences and market fundamentals. We interpret various instantaneous causality relationships between state variables and market fundamentals as leverage effects and discuss their central role relative to the validity of standard CAPM-like stock pricing and preference-free option pricing.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

El aseguramiento de portafolio trae consigo unos costos de transacción asociados que son reconocidos por la teoría financiera pero que no han sido objeto de estudio de muchas aproximaciones empíricas. Mediante modelos econométricos de series de tiempo se puede pronosticar el número de rebalanceos necesarios para mantener un portafolio asegurado, así como el tiempo que debe transcurrir entre cada uno de estos. Para tal fin se usan modelos de Datos de Cuenta de Poisson Autorregresivos (ACP) modificados para captar las características de la serie y modelos de Duración Autorregresivos (ACD). Los modelos capturan la autocorrelación de las series y pronostican adecuadamente el costo de transacción asociado a los rebalanceos.