996 resultados para Heavy-tailed distributions
Resumo:
We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the mixture of scaled Gaussian family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractable form with a closed-form probability density function whatever the dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these latter cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.
Resumo:
The Grubbs` measurement model is frequently used to compare several measuring devices. It is common to assume that the random terms have a normal distribution. However, such assumption makes the inference vulnerable to outlying observations, whereas scale mixtures of normal distributions have been an interesting alternative to produce robust estimates, keeping the elegancy and simplicity of the maximum likelihood theory. The aim of this paper is to develop an EM-type algorithm for the parameter estimation, and to use the local influence method to assess the robustness aspects of these parameter estimates under some usual perturbation schemes, In order to identify outliers and to criticize the model building we use the local influence procedure in a Study to compare the precision of several thermocouples. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Birnbaum-Saunders models have largely been applied in material fatigue studies and reliability analyses to relate the total time until failure with some type of cumulative damage. In many problems related to the medical field, such as chronic cardiac diseases and different types of cancer, a cumulative damage caused by several risk factors might cause some degradation that leads to a fatigue process. In these cases, BS models can be suitable for describing the propagation lifetime. However, since the cumulative damage is assumed to be normally distributed in the BS distribution, the parameter estimates from this model can be sensitive to outlying observations. In order to attenuate this influence, we present in this paper BS models, in which a Student-t distribution is assumed to explain the cumulative damage. In particular, we show that the maximum likelihood estimates of the Student-t log-BS models attribute smaller weights to outlying observations, which produce robust parameter estimates. Also, some inferential results are presented. In addition, based on local influence and deviance component and martingale-type residuals, a diagnostics analysis is derived. Finally, a motivating example from the medical field is analyzed using log-BS regression models. Since the parameter estimates appear to be very sensitive to outlying and influential observations, the Student-t log-BS regression model should attenuate such influences. The model checking methodologies developed in this paper are used to compare the fitted models.
Resumo:
Gene expression noise results in protein number distributions ranging from long-tailed to Gaussian. We show how long-tailed distributions arise from a stochastic model of the constituent chemical reactions and suggest that, in conjunction with cooperative switches, they lead to more sensitive selection of a subpopulation of cells with high protein number than is possible with Gaussian distributions. Single-cell-tracking experiments are presented to validate some of the assumptions of the stochastic simulations. We also examine the effect of DNA looping on the shape of protein distributions. We further show that when switches are incorporated in the regulation of a gene via a feedback loop, the distributions can become bimodal. This might explain the bimodal distribution of certain morphogens during early embryogenesis.
Resumo:
In this paper the question of the extent to which truncated heavy tailed random vectors, taking values in a Banach space, retain the characteristic features of heavy tailed random vectors, is answered from the point of view of the central limit theorem.
Resumo:
We investigate the relaxation of long-tailed distributions under stochastic dynamics that do not support such tails. Linear relaxation is found to be a borderline case in which long tails are exponentially suppressed in time but not eliminated. Relaxation stronger than linear suppresses long tails immediately, but may lead to strong transient peaks in the probability distribution. We also find that a delta-function initial distribution under stronger than linear decay displays not one but two different regimes of diffusive spreading.
Resumo:
In this paper, we propose exact inference procedures for asset pricing models that can be formulated in the framework of a multivariate linear regression (CAPM), allowing for stable error distributions. The normality assumption on the distribution of stock returns is usually rejected in empirical studies, due to excess kurtosis and asymmetry. To model such data, we propose a comprehensive statistical approach which allows for alternative - possibly asymmetric - heavy tailed distributions without the use of large-sample approximations. The methods suggested are based on Monte Carlo test techniques. Goodness-of-fit tests are formally incorporated to ensure that the error distributions considered are empirically sustainable, from which exact confidence sets for the unknown tail area and asymmetry parameters of the stable error distribution are derived. Tests for the efficiency of the market portfolio (zero intercepts) which explicitly allow for the presence of (unknown) nuisance parameter in the stable error distribution are derived. The methods proposed are applied to monthly returns on 12 portfolios of the New York Stock Exchange over the period 1926-1995 (5 year subperiods). We find that stable possibly skewed distributions provide statistically significant improvement in goodness-of-fit and lead to fewer rejections of the efficiency hypothesis.
Resumo:
The thesis has covered various aspects of modeling and analysis of finite mean time series with symmetric stable distributed innovations. Time series analysis based on Box and Jenkins methods are the most popular approaches where the models are linear and errors are Gaussian. We highlighted the limitations of classical time series analysis tools and explored some generalized tools and organized the approach parallel to the classical set up. In the present thesis we mainly studied the estimation and prediction of signal plus noise model. Here we assumed the signal and noise follow some models with symmetric stable innovations.We start the thesis with some motivating examples and application areas of alpha stable time series models. Classical time series analysis and corresponding theories based on finite variance models are extensively discussed in second chapter. We also surveyed the existing theories and methods correspond to infinite variance models in the same chapter. We present a linear filtering method for computing the filter weights assigned to the observation for estimating unobserved signal under general noisy environment in third chapter. Here we consider both the signal and the noise as stationary processes with infinite variance innovations. We derived semi infinite, double infinite and asymmetric signal extraction filters based on minimum dispersion criteria. Finite length filters based on Kalman-Levy filters are developed and identified the pattern of the filter weights. Simulation studies show that the proposed methods are competent enough in signal extraction for processes with infinite variance.Parameter estimation of autoregressive signals observed in a symmetric stable noise environment is discussed in fourth chapter. Here we used higher order Yule-Walker type estimation using auto-covariation function and exemplify the methods by simulation and application to Sea surface temperature data. We increased the number of Yule-Walker equations and proposed a ordinary least square estimate to the autoregressive parameters. Singularity problem of the auto-covariation matrix is addressed and derived a modified version of the Generalized Yule-Walker method using singular value decomposition.In fifth chapter of the thesis we introduced partial covariation function as a tool for stable time series analysis where covariance or partial covariance is ill defined. Asymptotic results of the partial auto-covariation is studied and its application in model identification of stable auto-regressive models are discussed. We generalize the Durbin-Levinson algorithm to include infinite variance models in terms of partial auto-covariation function and introduce a new information criteria for consistent order estimation of stable autoregressive model.In chapter six we explore the application of the techniques discussed in the previous chapter in signal processing. Frequency estimation of sinusoidal signal observed in symmetric stable noisy environment is discussed in this context. Here we introduced a parametric spectrum analysis and frequency estimate using power transfer function. Estimate of the power transfer function is obtained using the modified generalized Yule-Walker approach. Another important problem in statistical signal processing is to identify the number of sinusoidal components in an observed signal. We used a modified version of the proposed information criteria for this purpose.
Resumo:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewness and heavy-tailed distributions such as the skew-t, skew-slash and the skew-contaminated normal distributions. The main advantage of these class of distributions is that they have a nice hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robust aspects of this flexible class, against outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussions on the model selection criteria are given. The newly developed procedures are illustrated considering two simulations study, and a real data previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Linear mixed models were developed to handle clustered data and have been a topic of increasing interest in statistics for the past 50 years. Generally. the normality (or symmetry) of the random effects is a common assumption in linear mixed models but it may, sometimes, be unrealistic, obscuring important features of among-subjects variation. In this article, we utilize skew-normal/independent distributions as a tool for robust modeling of linear mixed models under a Bayesian paradigm. The skew-normal/independent distributions is an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal distribution, skew-t, skew-slash and the skew-contaminated normal distributions as special cases, providing an appealing robust alternative to the routine use of symmetric distributions in this type of models. The methods developed are illustrated using a real data set from Framingham cholesterol study. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
Linear mixed effects models are frequently used to analyse longitudinal data, due to their flexibility in modelling the covariance structure between and within observations. Further, it is easy to deal with unbalanced data, either with respect to the number of observations per subject or per time period, and with varying time intervals between observations. In most applications of mixed models to biological sciences, a normal distribution is assumed both for the random effects and for the residuals. This, however, makes inferences vulnerable to the presence of outliers. Here, linear mixed models employing thick-tailed distributions for robust inferences in longitudinal data analysis are described. Specific distributions discussed include the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted, and the Gibbs sampler and the Metropolis-Hastings algorithms are used to carry out the posterior analyses. An example with data on orthodontic distance growth in children is discussed to illustrate the methodology. Analyses based on either the Student-t distribution or on the usual Gaussian assumption are contrasted. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process for modelling distributions of the random effects and of residuals in linear mixed models, and the MCMC implementation allows the computations to be performed in a flexible manner.
Resumo:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.