41 results for archimax copulas
Abstract:
Extreme rainfall events have triggered a significant number of flash floods on Madeira Island throughout its past and recent history. Madeira is a volcanic island where the spatial rainfall distribution is strongly affected by its rugged topography. In this thesis, annual maxima of daily rainfall from 25 rain gauge stations located on Madeira Island were modelled by the generalised extreme value distribution. The hypothesis of a Gumbel distribution was also tested by two methods, and the existence of a linear trend in the parameters of both distributions was analysed. Estimates of the 50- and 100-year return levels were also obtained. Still in a univariate context, the assumption that the distribution function belongs to the domain of attraction of an extreme value distribution was tested for monthly maximum rainfall data from the rainy season. The available data were then analysed in order to find the most suitable domain of attraction for the sampled distribution. In a different approach, a search for thresholds was also performed for daily rainfall values through a graphical analysis. In a multivariate context, the dependence between extreme rainfall values at the considered stations was studied using Kendall’s τ. This study suggests that factors such as altitude, slope orientation, distance between stations and proximity to the sea influence the spatial distribution of extreme rainfall. Groups of three pairwise associated stations were also obtained, and a family of extreme value copulas involving the Marshall–Olkin family was fitted to them; its parameters can be written as a function of the Kendall’s τ association measures of the obtained pairs.
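As a rough illustration of the return-level estimates mentioned in this abstract, the sketch below evaluates the standard generalised extreme value (GEV) quantile formula for the T-year return level. The parameter values and function names are hypothetical, chosen only for illustration; they are not estimates from the thesis, and the fitting step itself is not shown.

```python
import math

def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function; xi = 0 corresponds to the Gumbel case."""
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower endpoint (xi > 0)
        # or above the upper endpoint (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(T, mu, sigma, xi):
    """Level exceeded by the annual maximum with probability 1/T in a given year."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)

# Hypothetical parameters (location, scale, shape), not taken from the thesis.
mu, sigma, xi = 80.0, 25.0, 0.1
for T in (50, 100):
    print(T, round(gev_return_level(T, mu, sigma, xi), 1))
```

Setting the shape parameter to zero gives the Gumbel return level, which is the special case tested in the abstract.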
Abstract:
Graduate Program in Applied and Computational Mathematics - FCT
Abstract:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions, with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in every field where multivariate dependence is of interest, yet their use in clustering has not yet been investigated. The first part of this work reviews the literature on clustering methods, copula functions and microarray experiments. Attention focuses on the K–means (Hartigan, 1975; Hartigan and Wong, 1979), hierarchical (Everitt, 1974) and model–based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques, because their performance is compared. Then the probabilistic interpretation of Sklar’s theorem (Sklar, 1959), estimation methods for copulas such as Inference for Margins (Joe and Xu, 1996), and the Archimedean and Elliptical copula families are presented. Finally, applications of clustering methods and copulas to genetic and microarray experiments are highlighted. The second part contains the original contribution. A simulation study is performed to evaluate how well the K–means and the hierarchical bottom–up clustering methods identify clusters according to the dependence structure of the data generating process. Simulations are run under varying conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter) and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions (‘CoClust’ in brief) is proposed. The basic idea, the iterative procedure of the CoClust and a description of the R functions written for it, with their output, are given.
The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with model–based clustering using different measures of performance, such as the percentage of correctly identified numbers of clusters and the non-rejection percentage of H0 on . It is shown that the CoClust algorithm overcomes all the observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data, independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log–likelihood function of the copula and can account for virtually any dependence relationship between observations. Several distinctive characteristics of the CoClust are shown, e.g. its ability to identify the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001), both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
Abstract:
We propose an extension of the approach provided by Klüppelberg and Kuhn (2009) for inference on second-order structure moments. As in Klüppelberg and Kuhn (2009), we adopt a copula-based approach instead of assuming a normal distribution for the variables, thus relaxing the equality in distribution assumption. A new copula-based estimator for structure moments is investigated. The methodology of Klüppelberg and Kuhn (2009) is also extended to the copulas associated with the family of Eyraud-Farlie-Gumbel-Morgenstern distribution functions (Kotz, Balakrishnan, and Johnson, 2000, Equation 44.73). Finally, a comprehensive simulation study and an application to real financial data are performed in order to compare the different approaches.
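For readers unfamiliar with the Farlie-Gumbel-Morgenstern family mentioned in this abstract, the sketch below uses the standard bivariate FGM copula, C(u, v) = uv[1 + θ(1 − u)(1 − v)] with θ in [−1, 1], for which Kendall's τ equals 2θ/9. It samples from the copula by inverting the conditional distribution of V given U. The function names and the value of θ are illustrative assumptions, and this is not the estimator proposed in the paper.

```python
import math
import random

def fgm_cdf(u, v, theta):
    """Bivariate FGM copula C(u, v), with theta in [-1, 1]."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def fgm_conditional_inverse(u, w, theta):
    """Solve P(V <= v | U = u) = w for v.

    The conditional cdf is v * (1 + a * (1 - v)) with a = theta * (1 - 2u),
    a quadratic in v whose root in [0, 1] has the closed form below.
    """
    if w == 0.0:
        return 0.0
    a = theta * (1.0 - 2.0 * u)
    return 2.0 * w / (1.0 + a + math.sqrt((1.0 + a) ** 2 - 4.0 * a * w))

def fgm_sample(n, theta, seed=0):
    """Draw n pairs (u, v) from the FGM copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()
        w = rng.random()
        pairs.append((u, fgm_conditional_inverse(u, w, theta)))
    return pairs

theta = 0.8  # hypothetical dependence parameter
print("Kendall's tau implied by theta:", 2.0 * theta / 9.0)
print(fgm_sample(3, theta))
```

Because τ is confined to [−2/9, 2/9], the FGM family can only represent weak dependence, which is worth keeping in mind when comparing it with other copula families.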
Abstract:
We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set of tools for increasingly complex data collected in medical and public health studies. Our methods were motivated by and are illustrated with a state-of-the-art study of neuronal tracts in multiple sclerosis patients and healthy controls. Using the tools we have developed, we were able to find those locations along the tract most affected by the disease. However, our methods are general and highly relevant to many functional data sets. In addition to the application to one-dimensional tract profiles illustrated here, higher-dimensional extensions of the methodology could have direct applications to other biological data including functional and structural MRI.
Abstract:
In this paper, we extend the debate concerning Credit Default Swap valuation to include time-varying correlations and covariances. Traditional multivariate techniques treat the correlations between covariates as constant over time; however, this view is not supported by the data. Moreover, since financial data do not follow a normal distribution because of their heavy tails, modeling the data with a Generalized Linear Model (GLM) incorporating copulas emerges as a more robust technique than traditional approaches. This paper also includes an empirical analysis of the regime-switching dynamics of credit risk in the presence of liquidity, following the general practice of assuming that credit and market risk follow a Markov process. The study was based on Credit Default Swap data obtained from Bloomberg spanning the period from January 1, 2004 to August 8, 2006. The empirical examination of the regime-switching tendencies provided quantitative support for the anecdotal view that liquidity decreases as credit quality deteriorates. The analysis also examined the joint probability distribution of the credit risk determinants across credit quality through a copula function, which separates out the behavior embedded in the marginal gamma distributions so as to isolate the level of dependence captured by the copula. The results suggest that the time-varying joint correlation matrix performed far better than the constant correlation matrix, the centerpiece of linear regression models.
Abstract:
In tests where a considerable number of individuals do not have enough time to answer all the items, we have what is called the speededness effect. Using the unidimensional Item Response Theory (IRT) model in tests with speededness can lead to a series of erroneous interpretations, since that model assumes that respondents have enough time to answer all the items. In this work, we develop a Bayesian analysis of the three-dimensional IRT model proposed by Wollack and Cohen (2005), considering a dependence structure between the prior distributions of the latent traits, which we model using copulas. We present an estimation procedure for the proposed model and carry out a simulation study comparing it with the analysis by Bazan et al. (2010), which used independent prior distributions for the latent traits. Finally, we perform a sensitivity analysis of the model under study and present an application to a real data set from an EGRA subtest called Nonsense Words, administered in Peru in 2007. In this subtest, students are assessed orally by reading, sequentially, 50 nonsense words in 60 seconds, which characterizes the presence of the speededness effect.
Abstract:
Regular vine copulas are multivariate dependence models constructed from pair-copulas (bivariate copulas). In this paper, we allow the dependence parameters of the pair-copulas in a D-vine decomposition to be potentially time-varying, following a nonlinear restricted ARMA(1,m) process, in order to obtain a very flexible dependence model for applications to multivariate financial return data. We investigate the dependence among the broad stock market indexes from Germany (DAX), France (CAC 40), Britain (FTSE 100), the United States (S&P 500) and Brazil (IBOVESPA) both in a crisis and in a non-crisis period. We find evidence of stronger dependence among the indexes in bear markets. Surprisingly, though, the dynamic D-vine copula indicates the occurrence of a sharp decrease in dependence between the indexes FTSE and CAC in the beginning of 2011, and also between CAC and DAX during mid-2011 and in the beginning of 2008, suggesting the absence of contagion in these cases. We also evaluate the dynamic D-vine copula with respect to Value-at-Risk (VaR) forecasting accuracy in crisis periods. The dynamic D-vine outperforms the static D-vine in terms of predictive accuracy for our real data sets.
Abstract:
This thesis is concerned with modelling the dependence between risks in non-life insurance, particularly in the context of loss reserving methods and ratemaking. We set out the current context and the issues involved in modelling dependence, and the importance of such an approach given the advent of new standards and requirements from regulatory bodies regarding the solvency of general insurance companies. Recently, Shi and Frees (2011) suggested incorporating the dependence between two lines of business through a bivariate copula that captures the dependence between two equivalent cells of two run-off triangles. We propose two different approaches to generalize this model. The first is based on hierarchical Archimedean copulas, and the second on random effects and the Sarmanov family of bivariate distributions. We first consider, in Chapter 2, a model using the class of hierarchical Archimedean copulas, more precisely the family of partially nested copulas, in order to include dependence within and between two lines of business through calendar effects. We then consider an alternative model, from another class of the hierarchical Archimedean copula family, that of fully nested copulas, in order to model the dependence between more than two lines of business. A risk aggregation approach based on a model formed by a tree of bivariate copulas is also explored there. An important feature of the approach described in Chapter 3 is that inference on the dependence is carried out through the ranks of the residuals, in order to guard against a possible misspecification of the marginal distributions and of the copula governing the dependence. As a second approach, we also consider modelling dependence through random effects.
To do so, we consider the Sarmanov family of bivariate distributions, which allows flexible modelling within and between lines of business through calendar year, accident year and development period effects. Closed-form expressions for the joint distribution, together with an empirical illustration using run-off triangles, are presented in Chapter 4. We also propose a model with dynamic random effects, in which more weight is given to the most recent years, and use information from the correlated line in order to better predict the risk. This last approach is studied in Chapter 5 through a numerical application to claim counts, illustrating the usefulness of such a model in ratemaking. The thesis concludes with a summary of its scientific contributions, along with suggested avenues for further work and possible extensions.
Abstract:
Derived flood frequency analysis based on hydrological modelling requires long continuous precipitation time series with high temporal resolution. Often, the observation network of recording rainfall gauges is poor, especially with regard to the limited length of the available rainfall time series. Stochastic precipitation synthesis is a good alternative for either extending or regionalising rainfall series to provide adequate input for long-term rainfall-runoff modelling with subsequent estimation of design floods. Here, a new two-step procedure for stochastic synthesis of continuous hourly space-time rainfall is proposed and tested for the extension of short observed precipitation time series. First, a single-site alternating renewal model is presented to simulate independent hourly precipitation time series for several locations. The alternating renewal model describes wet spell durations, dry spell durations and wet spell intensities using univariate frequency distributions, separately for two seasons. The dependence between wet spell intensity and duration is accounted for by 2-copulas. A predefined profile is used to disaggregate the wet spells into hourly intensities. In the second step, a multi-site resampling procedure is applied to the synthetic point rainfall event series to reproduce the spatial dependence structure of rainfall. Resampling is carried out successively on all synthetic event series using simulated annealing with an objective function that considers three bivariate spatial rainfall characteristics. In a case study, synthetic precipitation is generated for some locations with short observation records in two mesoscale catchments of the Bode river basin in northern Germany. The synthetic rainfall data are then used for derived flood frequency analysis with the hydrological model HEC-HMS.
The results show good performance in reproducing average and extreme rainfall characteristics as well as in reproducing observed flood frequencies. The presented model has the potential to be used for ungauged locations through regionalisation of the model parameters.
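The first step described in this abstract, a single-site alternating renewal model with copula-linked wet spell duration and intensity, can be sketched as follows. This is only a minimal illustration under loud assumptions: exponential margins, a Gaussian copula, and all parameter values and names are hypothetical, since the abstract does not state which distributions or copulas were used. Seasonality, the hourly disaggregation profile and the multi-site resampling step are omitted.

```python
import math
import random
from statistics import NormalDist

def simulate_events(n_hours, rng, mean_dry=30.0, mean_wet=6.0,
                    mean_intensity=1.5, rho=-0.4):
    """Simulate (start, duration, intensity) wet spells over n_hours.

    Assumptions (not from the source): exponential dry spell durations,
    exponential margins for wet spell duration [h] and intensity [mm/h],
    joined by a Gaussian copula with correlation rho.
    """
    std_normal = NormalDist()
    t = 0.0
    events = []
    while True:
        dry = rng.expovariate(1.0 / mean_dry)
        # Gaussian copula: correlated normals mapped to uniforms.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        u_dur = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u_int = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Inverse exponential cdf gives the marginal values.
        duration = -mean_wet * math.log(1.0 - u_dur)
        intensity = -mean_intensity * math.log(1.0 - u_int)
        start = t + dry
        if start + duration > n_hours:
            break
        events.append((start, duration, intensity))
        t = start + duration
    return events

events = simulate_events(24.0 * 365, random.Random(1))
print(len(events), "wet spells; first:", events[0])
```

Separating the margins from the copula in this way means the duration and intensity distributions can be swapped out without touching the dependence step, which is the practical appeal of a copula in this setting.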
Abstract:
Dissertation (master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Estatística, 2015.