92 resultados para Heavy Tail Distributions
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
A method to estimate an extreme quantile that requires no distributional assumptions is presented. The approach is based on transformed kernel estimation of the cumulative distribution function (cdf). The proposed method consists of a double transformation kernel estimation. We derive optimal bandwidth selection methods that have a direct expression for the smoothing parameter. The bandwidth can accommodate to the given quantile level. The procedure is useful for large data sets and improves quantile estimation compared to other methods in heavy tailed distributions. Implementation is straightforward and R programs are available.
Resumo:
[cat] Es presenta un estimador nucli transformat que és adequat per a distribucions de cua pesada. Utilitzant una transformació basada en la distribució de probabilitat Beta l’elecció del paràmetre de finestra és molt directa. Es presenta una aplicació a dades d’assegurances i es mostra com calcular el Valor en Risc.
Resumo:
[cat] Es presenta un estimador nucli transformat que és adequat per a distribucions de cua pesada. Utilitzant una transformació basada en la distribució de probabilitat Beta l’elecció del paràmetre de finestra és molt directa. Es presenta una aplicació a dades d’assegurances i es mostra com calcular el Valor en Risc.
Resumo:
We propose a new kernel estimation of the cumulative distribution function based on transformation and on bias reducing techniques. We derive the optimal bandwidth that minimises the asymptotic integrated mean squared error. The simulation results show that our proposed kernel estimation improves alternative approaches when the variable has an extreme value distribution with heavy tail and the sample size is small.
Resumo:
We present a real data set of claims amounts where costs related to damage are recorded separately from those related to medical expenses. Only claims with positive costs are considered here. Two approaches to density estimation are presented: a classical parametric and a semi-parametric method, based on transformation kernel density estimation. We explore the data set with standard univariate methods. We also propose ways to select the bandwidth and transformation parameters in the univariate case based on Bayesian methods. We indicate how to compare the results of alternative methods both looking at the shape of the overall density domain and exploring the density estimates in the right tail.
Resumo:
We establish the validity of subsampling confidence intervals for themean of a dependent series with heavy-tailed marginal distributions.Using point process theory, we study both linear and nonlinear GARCH-liketime series models. We propose a data-dependent method for the optimalblock size selection and investigate its performance by means of asimulation study.
Resumo:
A model for energy, pressure, and flow velocity distributions at the beginning of ultrarelativistic heavy ion collisions is presented, which can be used as an initial condition for hydrodynamic calculations. Our model takes into account baryon recoil for both target and projectile, arising from the acceleration of partons in an effective field F mu nu produced in the collision. The typical field strength (string tension) for RHIC energies is about 512 GeV/fm, which allows us to talk about string ropes. The results show that a quark-gluon plasma forms a tilted disk, such that the direction of the largest pressure gradient stays in the reaction plane, but deviates from both the beam and the usual transverse flow directions. Such initial conditions may lead to the creation of antiflow or third flow component [L. P. Csernai and D. Rhrich, Phys. Rev. Lett. B 458, 454 (1999)].
Resumo:
This paper develops a methodology to estimate the entire population distributions from bin-aggregated sample data. We do this through the estimation of the parameters of mixtures of distributions that allow for maximal parametric flexibility. The statistical approach we develop enables comparisons of the full distributions of height data from potential army conscripts across France's 88 departments for most of the nineteenth century. These comparisons are made by testing for differences-of-means stochastic dominance. Corrections for possible measurement errors are also devised by taking advantage of the richness of the data sets. Our methodology is of interest to researchers working on historical as well as contemporary bin-aggregated or histogram-type data, something that is still widely done since much of the information that is publicly available is in that form, often due to restrictions due to political sensitivity and/or confidentiality concerns.
Resumo:
We investigate the transition to synchronization in the Kuramoto model with bimodal distributions of the natural frequencies. Previous studies have concluded that the model exhibits a hysteretic phase transition if the bimodal distribution is close to a unimodal one, due to the shallowness the central dip. Here we show that proximity to the unimodal-bimodal border does not necessarily imply hysteresis when the width, but not the depth, of the central dip tends to zero. We draw this conclusion from a detailed study of the Kuramoto model with a suitable family of bimodal distributions.
Resumo:
It has been argued that by truncating the sample space of the negative binomial and of the inverse Gaussian-Poisson mixture models at zero, one is allowed to extend the parameter space of the model. Here that is proved to be the case for the more general three parameter Tweedie-Poisson mixture model. It is also proved that the distributions in the extended part of the parameter space are not the zero truncation of mixed poisson distributions and that, other than for the negative binomial, they are not mixtures of zero truncated Poisson distributions either. By extending the parameter space one can improve the fit when the frequency of one is larger and the right tail is heavier than is allowed by the unextended model. Considering the extended model also allows one to use the basic maximum likelihood based inference tools when parameter estimates fall in the extended part of the parameter space, and hence when the m.l.e. does not exist under the unextended model. This extended truncated Tweedie-Poisson model is proved to be useful in the analysis of words and species frequency count data.
Resumo:
"Vegeu el resum a l'inici del document del fitxer adjunt"
Resumo:
This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness of fit analysis and importantly statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described
Resumo:
We compare rain event size distributions derived from measurements in climatically different regions, which we find to be well approximated by power laws of similar exponents over broad ranges. Differences can be seen in the large-scale cutoffs of the distributions. Event duration distributions suggest that the scale-free aspects are related to the absence of characteristic scales in the meteorological mesoscale.
Resumo:
Tropical cyclones are affected by a large number of climatic factors, which translates into complex patterns of occurrence. The variability of annual metrics of tropical-cyclone activity has been intensively studied, in particular since the sudden activation of the North Atlantic in the mid 1990’s. We provide first a swift overview on previous work by diverse authors about these annual metrics for the North-Atlantic basin, where the natural variability of the phenomenon, the existence of trends, the drawbacks of the records, and the influence of global warming have been the subject of interesting debates. Next, we present an alternative approach that does not focus on seasonal features but on the characteristics of single events [Corral et al., Nature Phys. 6, 693 (2010)]. It is argued that the individual-storm power dissipation index (PDI) constitutes a natural way to describe each event, and further, that the PDI statistics yields a robust law for the occurrence of tropical cyclones in terms of a power law. In this context, methods of fitting these distributions are discussed. As an important extension to this work we introduce a distribution function that models the whole range of the PDI density (excluding incompleteness effects at the smallest values), the gamma distribution, consisting in a powerlaw with an exponential decay at the tail. The characteristic scale of this decay, represented by the cutoff parameter, provides very valuable information on the finiteness size of the basin, via the largest values of the PDIs that the basin can sustain. We use the gamma fit to evaluate the influence of sea surface temperature (SST) on the occurrence of extreme PDI values, for which we find an increase around 50 % in the values of these basin-wide events for a 0.49 C SST average difference. Similar findings are observed for the effects of the positive phase of the Atlantic multidecadal oscillation and the number of hurricanes in a season on the PDI distribution. In the case of the El Niño Southern oscillation (ENSO), positive and negative values of the multivariate ENSO index do not have a significant effect on the PDI distribution; however, when only extreme values of the index are used, it is found that the presence of El Niño decreases the PDI of the most extreme hurricanes.
Resumo:
The preceding two editions of CoDaWork included talks on the possible considerationof densities as infinite compositions: Egozcue and D´ıaz-Barrero (2003) extended theEuclidean structure of the simplex to a Hilbert space structure of the set of densitieswithin a bounded interval, and van den Boogaart (2005) generalized this to the setof densities bounded by an arbitrary reference density. From the many variations ofthe Hilbert structures available, we work with three cases. For bounded variables, abasis derived from Legendre polynomials is used. For variables with a lower bound, westandardize them with respect to an exponential distribution and express their densitiesas coordinates in a basis derived from Laguerre polynomials. Finally, for unboundedvariables, a normal distribution is used as reference, and coordinates are obtained withrespect to a Hermite-polynomials-based basis.To get the coordinates, several approaches can be considered. A numerical accuracyproblem occurs if one estimates the coordinates directly by using discretized scalarproducts. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictand variables and weights are proportional to thereference density. Finally, for the case of 2-order Hermite polinomials (normal reference)and 1-order Laguerre polinomials (exponential), one can also derive the coordinatesfrom their relationships to the classical mean and variance.Apart of these theoretical issues, this contribution focuses on the application of thistheory to two main problems in sedimentary geology: the comparison of several grainsize distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock orsediment, like their composition