86 resultados para copula
Resumo:
Optimal design for generalized linear models has primarily focused on univariate data. Often experiments are performed that have multiple dependent responses described by regression type models, and it is of interest and of value to design the experiment for all these responses. This requires a multivariate distribution underlying a pre-chosen model for the data. Here, we consider the design of experiments for bivariate binary data which are dependent. We explore Copula functions which provide a rich and flexible class of structures to derive joint distributions for bivariate binary data. We present methods for deriving optimal experimental designs for dependent bivariate binary data using Copulas, and demonstrate that, by including the dependence between responses in the design process, more efficient parameter estimates are obtained than by the usual practice of simply designing for a single variable only. Further, we investigate the robustness of designs with respect to initial parameter estimates and Copula function, and also show the performance of compound criteria within this bivariate binary setting.
Resumo:
The most important aspect of modelling a geological variable, such as metal grade, is the spatial correlation. Spatial correlation describes the relationship between realisations of a geological variable sampled at different locations. Any method for spatially modelling such a variable should be capable of accurately estimating the true spatial correlation. Conventional kriged models are the most commonly used in mining for estimating grade or other variables at unsampled locations, and these models use the variogram or covariance function to model the spatial correlations in the process of estimation. However, this usage assumes the relationships of the observations of the variable of interest at nearby locations are only influenced by the vector distance between the locations. This means that these models assume linear spatial correlation of grade. In reality, the relationship with an observation of grade at a nearby location may be influenced by both distance between the locations and the value of the observations (ie non-linear spatial correlation, such as may exist for variables of interest in geometallurgy). Hence this may lead to inaccurate estimation of the ore reserve if a kriged model is used for estimating grade of unsampled locations when nonlinear spatial correlation is present. Copula-based methods, which are widely used in financial and actuarial modelling to quantify the non-linear dependence structures, may offer a solution. This method was introduced by Bárdossy and Li (2008) to geostatistical modelling to quantify the non-linear spatial dependence structure in a groundwater quality measurement network. Their copula-based spatial modelling is applied in this research paper to estimate the grade of 3D blocks. Furthermore, real-world mining data is used to validate this model. These copula-based grade estimates are compared with the results of conventional ordinary and lognormal kriging to present the reliability of this method.
Resumo:
The interdependence of Greece and other European stock markets and the subsequent portfolio implications are examined in wavelet and variational mode decomposition domain. In applying the decomposition techniques, we analyze the structural properties of data and distinguish between short and long term dynamics of stock market returns. First, the GARCH-type models are fitted to obtain the standardized residuals. Next, different copula functions are evaluated, and based on the conventional information criteria and time varying parameter, Joe-Clayton copula is chosen to model the tail dependence between the stock markets. The short-run lower tail dependence time paths show a sudden increase in comovement during the global financial crises. The results of the long-run dependence suggest that European stock markets have higher interdependence with Greece stock market. Individual country’s Value at Risk (VaR) separates the countries into two distinct groups. Finally, the two-asset portfolio VaR measures provide potential markets for Greece stock market investment diversification.
Resumo:
Overland rain retrieval using spaceborne microwave radiometer offers a myriad of complications as land presents itself as a radiometrically warm and highly variable background. Hence, land rainfall algorithms of the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) have traditionally incorporated empirical relations of microwave brightness temperature (Tb) with rain rate, rather than relying on physically based radiative transfer modeling of rainfall (as implemented in the TMI ocean algorithm). In this paper, sensitivity analysis is conducted using the Spearman rank correlation coefficient as benchmark, to estimate the best combination of TMI low-frequency channels that are highly sensitive to the near surface rainfall rate from the TRMM Precipitation Radar (PR). Results indicate that the TMI channel combinations not only contain information about rainfall wherein liquid water drops are the dominant hydrometeors but also aid in surface noise reduction over a predominantly vegetative land surface background. Furthermore, the variations of rainfall signature in these channel combinations are not understood properly due to their inherent uncertainties and highly nonlinear relationship with rainfall. Copula theory is a powerful tool to characterize the dependence between complex hydrological variables as well as aid in uncertainty modeling by ensemble generation. Hence, this paper proposes a regional model using Archimedean copulas, to study the dependence of TMI channel combinations with respect to precipitation, over the land regions of Mahanadi basin, India, using version 7 orbital data from the passive and active sensors on board TRMM, namely, TMI and PR. Studies conducted for different rainfall regimes over the study area show the suitability of Clayton and Gumbel copulas for modeling convective and stratiform rainfall types for the majority of the intraseasonal months. Furthermore, large ensembles of TMI Tb (from the most sensitive TMI channel combination) were generated conditional on various quantiles (25th, 50th, 75th, and 95th) of the convective and the stratiform rainfall. Comparatively greater ambiguity was observed to model extreme values of the convective rain type. Finally, the efficiency of the proposed model was tested by comparing the results with traditionally employed linear and quadratic models. Results reveal the superior performance of the proposed copula-based technique.
Resumo:
We define a copula process which describes the dependencies between arbitrarily many random variables independently of their marginal distributions. As an example, we develop a stochastic volatility model, Gaussian Copula Process Volatility (GCPV), to predict the latent standard deviations of a sequence of random variables. To make predictions we use Bayesian inference, with the Laplace approximation, and with Markov chain Monte Carlo as an alternative. We find both methods comparable. We also find our model can outperform GARCH on simulated and financial data. And unlike GARCH, GCPV can easily handle missing data, incorporate covariates other than time, and model a rich class of covariance structures.
Resumo:
The paper presents a new copula based method for measuring dependence between random variables. Our approach extends the Maximum Mean Discrepancy to the copula of the joint distribution. We prove that this approach has several advantageous properties. Similarly to Shannon mutual information, the proposed dependence measure is invariant to any strictly increasing transformation of the marginal variables. This is important in many applications, for example in feature selection. The estimator is consistent, robust to outliers, and uses rank statistics only. We derive upper bounds on the convergence rate and propose independence tests too. We illustrate the theoretical contributions through a series of experiments in feature selection and low-dimensional embedding of distributions.
Resumo:
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables or through generalized latent trait models acommodating measurements in the exponential family. However, when generalizing to non-Gaussian measured variables the latent variables typically influence both the dependence structure and the form of the marginal distributions, complicating interpretation and introducing artifacts. To address this problem we propose a novel class of Bayesian Gaussian copula factor models which decouple the latent factors from the marginal distributions. A semiparametric specification for the marginals based on the extended rank likelihood yields straightforward implementation and substantial computational gains. We provide new theoretical and empirical justifications for using this likelihood in Bayesian inference. We propose new default priors for the factor loadings and develop efficient parameter-expanded Gibbs sampling for posterior computation. The methods are evaluated through simulations and applied to a dataset in political science. The models in this paper are implemented in the R package bfa.
Resumo:
Background: Heckman-type selection models have been used to control HIV prevalence estimates for selection bias when participation in HIV testing and HIV status are associated after controlling for observed variables. These models typically rely on the strong assumption that the error terms in the participation and the outcome equations that comprise the model are distributed as bivariate normal.
Methods: We introduce a novel approach for relaxing the bivariate normality assumption in selection models using copula functions. We apply this method to estimating HIV prevalence and new confidence intervals (CI) in the 2007 Zambia Demographic and Health Survey (DHS) by using interviewer identity as the selection variable that predicts participation (consent to test) but not the outcome (HIV status).
Results: We show in a simulation study that selection models can generate biased results when the bivariate normality assumption is violated. In the 2007 Zambia DHS, HIV prevalence estimates are similar irrespective of the structure of the association assumed between participation and outcome. For men, we estimate a population HIV prevalence of 21% (95% CI = 16%–25%) compared with 12% (11%–13%) among those who consented to be tested; for women, the corresponding figures are 19% (13%–24%) and 16% (15%–17%).
Conclusions: Copula approaches to Heckman-type selection models are a useful addition to the methodological toolkit of HIV epidemiology and of epidemiology in general. We develop the use of this approach to systematically evaluate the robustness of HIV prevalence estimates based on selection models, both empirically and in a simulation study.
Resumo:
In dieser Arbeit werden mithilfe der Likelihood-Tiefen, eingeführt von Mizera und Müller (2004), (ausreißer-)robuste Schätzfunktionen und Tests für den unbekannten Parameter einer stetigen Dichtefunktion entwickelt. Die entwickelten Verfahren werden dann auf drei verschiedene Verteilungen angewandt. Für eindimensionale Parameter wird die Likelihood-Tiefe eines Parameters im Datensatz als das Minimum aus dem Anteil der Daten, für die die Ableitung der Loglikelihood-Funktion nach dem Parameter nicht negativ ist, und dem Anteil der Daten, für die diese Ableitung nicht positiv ist, berechnet. Damit hat der Parameter die größte Tiefe, für den beide Anzahlen gleich groß sind. Dieser wird zunächst als Schätzer gewählt, da die Likelihood-Tiefe ein Maß dafür sein soll, wie gut ein Parameter zum Datensatz passt. Asymptotisch hat der Parameter die größte Tiefe, für den die Wahrscheinlichkeit, dass für eine Beobachtung die Ableitung der Loglikelihood-Funktion nach dem Parameter nicht negativ ist, gleich einhalb ist. Wenn dies für den zu Grunde liegenden Parameter nicht der Fall ist, ist der Schätzer basierend auf der Likelihood-Tiefe verfälscht. In dieser Arbeit wird gezeigt, wie diese Verfälschung korrigiert werden kann sodass die korrigierten Schätzer konsistente Schätzungen bilden. Zur Entwicklung von Tests für den Parameter, wird die von Müller (2005) entwickelte Simplex Likelihood-Tiefe, die eine U-Statistik ist, benutzt. Es zeigt sich, dass für dieselben Verteilungen, für die die Likelihood-Tiefe verfälschte Schätzer liefert, die Simplex Likelihood-Tiefe eine unverfälschte U-Statistik ist. Damit ist insbesondere die asymptotische Verteilung bekannt und es lassen sich Tests für verschiedene Hypothesen formulieren. Die Verschiebung in der Tiefe führt aber für einige Hypothesen zu einer schlechten Güte des zugehörigen Tests. Es werden daher korrigierte Tests eingeführt und Voraussetzungen angegeben, unter denen diese dann konsistent sind. Die Arbeit besteht aus zwei Teilen. Im ersten Teil der Arbeit wird die allgemeine Theorie über die Schätzfunktionen und Tests dargestellt und zudem deren jeweiligen Konsistenz gezeigt. Im zweiten Teil wird die Theorie auf drei verschiedene Verteilungen angewandt: Die Weibull-Verteilung, die Gauß- und die Gumbel-Copula. Damit wird gezeigt, wie die Verfahren des ersten Teils genutzt werden können, um (robuste) konsistente Schätzfunktionen und Tests für den unbekannten Parameter der Verteilung herzuleiten. Insgesamt zeigt sich, dass für die drei Verteilungen mithilfe der Likelihood-Tiefen robuste Schätzfunktionen und Tests gefunden werden können. In unverfälschten Daten sind vorhandene Standardmethoden zum Teil überlegen, jedoch zeigt sich der Vorteil der neuen Methoden in kontaminierten Daten und Daten mit Ausreißern.
Eventive and stative passives and copula selection in Canadian and American heritage speaker Spanish
Resumo:
Spanish captures the difference between eventive and stative passives via an obligatory choice between two copula; verbal passives take the copula ser and adjectival passives take the copula estar. In this study, we compare and contrast US and Canadian heritage speakers of Spanish on their knowledge of this difference in relation to copula choice in Spanish. The backgrounds of the target groups differ significantly from each other in that only one of them, the Canadian group, has grown up in a societal multilingual environment. We discuss the results as being supportive of two non-mutually exclusive explanation factors: (a) French facilitates (bootstraps) the acquisition of eventive and stative passives and/or (b) the US/Canadian HS differences (e.g. status of bilingualism and the languages at stake) is a reflection of the uniqueness of the language contact situations and the effects this has on the input HSS receive.
Resumo:
In this review paper we collect several results about copula-based models, especially concerning regression models, by focusing on some insurance applications. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
We discuss the connection between information and copula theories by showing that a copula can be employed to decompose the information content of a multivariate distribution into marginal and dependence components, with the latter quantified by the mutual information. We define the information excess as a measure of deviation from a maximum-entropy distribution. The idea of marginal invariant dependence measures is also discussed and used to show that empirical linear correlation underestimates the amplitude of the actual correlation in the case of non-Gaussian marginals. The mutual information is shown to provide an upper bound for the asymptotic empirical log-likelihood of a copula. An analytical expression for the information excess of T-copulas is provided, allowing for simple model identification within this family. We illustrate the framework in a financial data set. Copyright (C) EPLA, 2009