30 results for Log-normal distribution
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
In this paper we introduce a new distribution, the slashed half-normal distribution, which can be seen as an extension of the half-normal distribution. It is shown that the resulting distribution has more kurtosis than the ordinary half-normal distribution. Moments and some properties are derived for the new distribution. Moment estimators and maximum likelihood estimators can be computed using numerical procedures. Results of two real data applications are reported, where model fitting is implemented using maximum likelihood estimation. The applications illustrate the better performance of the new distribution.
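Slash-type extensions of this kind typically admit a simple stochastic representation: a half-normal variate divided by an independent uniform raised to a power, with the tail made heavier as the exponent's denominator shrinks. The sketch below is a minimal illustration of that construction (parameter names and values are assumptions for the demo, not the paper's code), comparing sample excess kurtosis of the two distributions.

```python
import numpy as np

def sample_half_normal(rng, sigma, n):
    # |Z| with Z ~ N(0, sigma^2)
    return np.abs(rng.normal(0.0, sigma, n))

def sample_slashed_half_normal(rng, sigma, q, n):
    # Slash-type construction: X = |Z| / U^(1/q), U ~ Uniform(0, 1).
    # Larger q brings the distribution closer to the plain half-normal.
    z = np.abs(rng.normal(0.0, sigma, n))
    u = rng.uniform(0.0, 1.0, n)
    return z / u ** (1.0 / q)

def excess_kurtosis(x):
    # Sample excess kurtosis: E[(X - m)^4] / Var(X)^2 - 3
    x = np.asarray(x, dtype=float)
    m = x.mean()
    s2 = ((x - m) ** 2).mean()
    return ((x - m) ** 4).mean() / s2 ** 2 - 3.0

rng = np.random.default_rng(42)
k_hn = excess_kurtosis(sample_half_normal(rng, 1.0, 200_000))
k_shn = excess_kurtosis(sample_slashed_half_normal(rng, 1.0, 5.0, 200_000))
print(k_hn, k_shn)  # the slashed version is markedly heavier-tailed
```

With q = 5 the first four moments of the slashed variable exist, so the kurtosis comparison is well defined.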
Abstract:
In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
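One plausible formalization of such a consistency index (an illustrative sketch, not the authors' definition) is the overlap between the neighborhoods a word acquires in two different texts, e.g. a Jaccard index over co-occurrence sets:

```python
from collections import defaultdict

def neighborhoods(tokens, window=1):
    """Map each word to the set of words co-occurring within `window` positions."""
    nbhd = defaultdict(set)
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                nbhd[w].add(tokens[j])
    return nbhd

def consistency(word, tokens_a, tokens_b, window=1):
    """Jaccard overlap of the word's neighborhoods in two texts.

    0 = completely different contexts, 1 = identical neighborhoods."""
    na = neighborhoods(tokens_a, window).get(word, set())
    nb = neighborhoods(tokens_b, window).get(word, set())
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

text_a = "the cat sat on the mat".split()
text_b = "the cat lay on the mat".split()
print(consistency("the", text_a, text_b))
print(consistency("cat", text_a, text_b))
```

Ranking all words by such an index and fitting the resulting distribution is what the abstract reports as yielding a log-normal, in contrast to the Zipfian frequency ranking.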
Abstract:
This paper introduces a skewed log-Birnbaum-Saunders regression model based on the skewed sinh-normal distribution proposed by Leiva et al. [A skewed sinh-normal distribution and its properties and application to air pollution, Comm. Statist. Theory Methods 39 (2010), pp. 426-443]. Some influence methods, such as local influence and generalized leverage, are presented. Additionally, we derive the normal curvatures of local influence under some perturbation schemes. An empirical application to a real data set is presented in order to illustrate the usefulness of the proposed model.
Abstract:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353-365] proposed a skew-normal distribution under the centred parameterization (SNCP), as had been studied in [R.B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362-1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour of the latent trait distribution. They also developed a Metropolis-Hastings within Gibbs sampling (MHWGS) algorithm based on the density of the SNCP and showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271-275]. Our algorithm has only one Metropolis-Hastings step, in contrast to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146-178; R.J. Patz and B.W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342-366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599-607]. Moreover, we consider a modified beta prior (which generalizes the one considered in [3]) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of these priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, the number of items, and the asymmetry level on parameter recovery. Results of the simulation study indicated that our approach performs as well as that in [3] in terms of parameter recovery, mainly under the Jeffreys prior. They also indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is presented jointly with the development of model-fitting assessment tools, and the results are compared with those obtained by Azevedo et al. The results indicate that the hierarchical approach allows us to implement MCMC algorithms more easily, facilitates convergence diagnostics, and can be very useful for fitting more complex skew IRT models.
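The algorithmic pattern at issue, Metropolis-Hastings within Gibbs sampling, is easiest to see in a toy example: each sweep updates one block from its exact full conditional (a Gibbs step) and uses a single random-walk MH step for the other. The sketch below targets an assumed bivariate normal, not the authors' IRT sampler; it only illustrates the one-MH-step structure the abstract argues for.

```python
import numpy as np

def mhwgs(n_iter=5000, rho=0.8, seed=0):
    """Toy MH-within-Gibbs sampler for a standard bivariate normal with correlation rho.

    x is drawn from its exact full conditional (Gibbs step);
    y is updated with a single random-walk Metropolis-Hastings step."""
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    draws = np.empty((n_iter, 2))
    accept = 0
    s = np.sqrt(1.0 - rho ** 2)  # conditional standard deviation
    for t in range(n_iter):
        # Gibbs step: x | y ~ N(rho * y, 1 - rho^2)
        x = rng.normal(rho * y, s)
        # MH step for y: random-walk proposal, accept with prob min(1, ratio)
        y_prop = y + rng.normal(0.0, 0.5)
        log_ratio = (-(y_prop - rho * x) ** 2 + (y - rho * x) ** 2) / (2.0 * s ** 2)
        if np.log(rng.uniform()) < log_ratio:
            y, accept = y_prop, accept + 1
        draws[t] = x, y
    return draws, accept / n_iter

draws, acc_rate = mhwgs()
print(acc_rate)  # acceptance rate of the single MH step
```

In the IRT setting, the hierarchical (stochastic-representation) form of the SNCP plays the role of the tractable conditional here, which is what removes one of the two MH steps.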
Abstract:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
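The overdispersion mechanism described here is easy to verify by simulation in the classical special case where the multiplicative random effect is gamma distributed (the Poisson-gamma mixture, whose marginal is negative binomial); the GLG family contains this as a particular parameter setting. A hedged sketch, with assumed parameter values:

```python
import numpy as np

def poisson_gamma_mixture(rng, mu, alpha, n):
    """Poisson counts with a gamma-distributed multiplicative random effect.

    The random effect has mean 1 and variance alpha, so the marginal is
    negative binomial with mean mu and variance mu + alpha * mu**2."""
    b = rng.gamma(shape=1.0 / alpha, scale=alpha, size=n)
    return rng.poisson(mu * b)

rng = np.random.default_rng(7)
mu, alpha = 4.0, 0.5
y = poisson_gamma_mixture(rng, mu, alpha, 200_000)
print(y.mean(), y.var())  # variance exceeds the mean: overdispersion
```

With mu = 4 and alpha = 0.5 the marginal variance is 4 + 0.5 * 16 = 12, three times the mean, which a plain Poisson model cannot accommodate.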
Abstract:
Aims. We studied four young star clusters to characterise their anomalous extinction or variable reddening and assess whether these could be due to contamination by either dense clouds or circumstellar effects. Methods. We evaluated the extinction law (R-V) by adopting two methods: (i) the use of theoretical expressions based on the colour excess of stars with known spectral type; and (ii) the analysis of two-colour diagrams, where the slope of the observed colour distribution was compared to the normal distribution. An algorithm to reproduce the zero-age main-sequence (ZAMS) reddened colours was developed to derive the average visual extinction (A(V)) that provides the closest fit to the observational data. The structure of the clouds was evaluated by means of a statistical fractal analysis, designed to compare their geometric structure with the spatial distribution of the cluster members. Results. The cluster NGC 6530 is the only object of our sample affected by anomalous extinction. On average, the other clusters suffer normal extinction, but several of their members, mainly in NGC 2264, seem to have high R-V, probably because of circumstellar effects. The ZAMS fitting provides A(V) values that are in good agreement with those found in the literature. The fractal analysis shows that NGC 6530 has a centrally concentrated distribution of stars that differs from the substructures found in the density distribution of the cloud projected in the A(V) map, suggesting that the original cloud was changed by the cluster formation. However, the fractal dimension and statistical parameters of Berkeley 86, NGC 2244, and NGC 2264 indicate that there is a good cloud-cluster correlation, when compared to other works based on an artificial distribution of points.
Abstract:
The issue of assessing variance components is essential in deciding on the inclusion of random effects in the context of mixed models. In this work we discuss this problem by supposing nonlinear elliptical models for correlated data by using the score-type test proposed in Silvapulle and Silvapulle (1995). Being asymptotically equivalent to the likelihood ratio test and only requiring the estimation under the null hypothesis, this test provides a fairly easy computable alternative for assessing one-sided hypotheses in the context of the marginal model. Taking into account the possible non-normal distribution, we assume that the joint distribution of the response variable and the random effects lies in the elliptical class, which includes light-tailed and heavy-tailed distributions such as Student-t, power exponential, logistic, generalized Student-t, generalized logistic, contaminated normal, and the normal itself, among others. We compare the sensitivity of the score-type test under normal, Student-t and power exponential models for the kinetics data set discussed in Vonesh and Carter (1992) and fitted using the model presented in Russo et al. (2009). Also, a simulation study is performed to analyze the consequences of the kurtosis misspecification.
Abstract:
A rigorous asymptotic theory for Wald residuals in generalized linear models is not yet available. The authors provide matrix formulae of order O(n⁻¹), where n is the sample size, for the first two moments of these residuals. The formulae can be applied to many regression models widely used in practice. The authors suggest adjusted Wald residuals for these models, with approximately zero mean and unit variance. The expressions are used to analyze a real dataset. Some simulation results indicate that the adjusted Wald residuals are better approximated by the standard normal distribution than the Wald residuals.
Abstract:
Maize is one of the most important crops in the world. The products generated from this crop are largely used in the starch industry, the animal and human nutrition sector, and biomass energy production and refineries. For these reasons, there is much interest in estimating the potential grain yield of maize genotypes in relation to the environment in which they will be grown, as the productivity directly affects agribusiness or farm profitability. Questions like these can be investigated with ecophysiological crop models, which can be organized according to different philosophies and structures. The main objective of this work is to conceptualize a stochastic model for predicting maize grain yield and productivity under different conditions of water supply while considering the uncertainties of daily climate data. Therefore, one focus is to explain the model construction in detail, and the other is to present some results in light of the philosophy adopted. A deterministic model was built as the basis for the stochastic model. The former performed well in terms of the curve shape of the above-ground dry matter over time as well as the grain yield under full and moderate water-deficit conditions. Through the use of a triangular distribution for the harvest index and a bivariate normal distribution of the averaged daily solar radiation and air temperature, the stochastic model satisfactorily simulated grain productivity: 10,604 kg ha⁻¹ was found to be the most likely grain productivity, very similar to the productivity simulated by the deterministic model and to that observed under real conditions in a field experiment.
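The sampling scheme described, a triangular distribution for the harvest index combined with a bivariate normal for averaged daily solar radiation and air temperature, can be sketched as a short Monte Carlo loop. Every numeric value below (triangular bounds, means, covariance, and the biomass response) is a hypothetical placeholder, not the paper's calibration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical harvest-index bounds: min, mode, max (dimensionless)
hi = rng.triangular(0.40, 0.50, 0.55, size=n)

# Hypothetical joint distribution of mean daily solar radiation (MJ m^-2 d^-1)
# and air temperature (deg C), with a positive correlation
mean = np.array([18.0, 24.0])
cov = np.array([[4.0, 1.5],
                [1.5, 2.0]])
rad, temp = rng.multivariate_normal(mean, cov, size=n).T

# Hypothetical biomass response: linear in radiation, penalized away from
# an assumed optimum temperature of 25 deg C; the harvest index then
# converts above-ground biomass to grain
biomass = 1200.0 * rad * np.exp(-((temp - 25.0) / 8.0) ** 2)
yield_kg_ha = hi * biomass

print(np.median(yield_kg_ha))  # a summary of the simulated grain productivity
```

Repeating such draws produces the yield distribution from which a most-likely productivity, like the figure the abstract reports, can be read off.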
Abstract:
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses.
Abstract:
This paper considers likelihood-based inference for the family of power distributions. Widely applicable results are presented which can be used to conduct inference for all three parameters of the general location-scale extension of the family. More specific results are given for the special case of the power normal model. The analysis of a large data set, formed from density measurements for a certain type of pollen, illustrates the application of the family and the results for likelihood-based inference. Throughout, comparisons are made with analogous results for the direct parametrisation of the skew-normal distribution.
Minimal alterations on the enamel surface by micro-abrasion: in vitro roughness and wear assessments
Abstract:
Objective: To evaluate the in vitro changes on the enamel surface after a micro-abrasion treatment promoted by different products. Material and Methods: Fifty (50) fragments of bovine enamel (15 mm × 5 mm) were randomly assigned to five groups (n=10) according to the product utilized: G1 (control)= silicone polisher (TDV), G2= 37% phosphoric acid (3M/ESPE) + pumice stone (SS White), G3= Micropol (DMC Equipment), G4= Opalustre (Ultradent) and G5= Whiteness RM (FGM Dental Products). Roughness and wear were the response variables used to analyze these surfaces in four stages: baseline, 60 s and 120 s after the micro-abrasion, and after polishing, using a Hommel Tester T1000 device. After the tests, a normal distribution of the data was verified, and repeated-measures ANOVA (p≤0.05) was used to compare each product across the different stages. One-way ANOVA and Tukey tests were applied for individual comparisons between the products in each stage (p≤0.05). Results: Means and standard deviations of roughness and wear (µm) after all the promoted stages were: G1=7.26(1.81)/13.16(2.67), G2=2.02(0.62)/37.44(3.33), G3=1.81(0.91)/34.93(6.92), G4=1.92(0.29)/38.42(0.65) and G5=1.98(0.53)/33.45(2.66). At 60 seconds, all products tended to produce less surface roughness, with a variable gradual decrease over time. After polishing, there were no statistically significant differences between the groups, except for G1. Independent of the product utilized, enamel wear occurred after the micro-abrasion. Conclusions: In this in vitro study, enamel micro-abrasion presented itself as a conservative approach, regardless of the type of the paste compound utilized. These products promoted minor roughness alterations and minimal wear. The use of phosphoric acid and pumice stone showed results similar to those of commercial micro-abrasion products with regard to surface roughness and wear.
Abstract:
This study aimed to evaluate the spatial variability of the leaf content of macro- and micronutrients. The citrus orchard, 5 years of age and planted at regular intervals of 8 x 7 m, was managed under drip irrigation. Leaf samples were collected from each plant and analyzed in the laboratory. Data were analyzed using the software R, version 2.5.1 (Copyright (C) 2007), along with the geostatistics package geoR. All contents of macro- and micronutrients studied fitted a normal distribution and showed spatial dependence. The best-fit models, based on the likelihood, for the macro- and micronutrients were the spherical and the Matérn. For the macronutrients nitrogen, phosphorus, potassium, calcium, magnesium, and sulfur, minimum distances between samples of 37, 58, 29, 63, 46, and 15 m, respectively, are suggested, while for the micronutrients boron, copper, iron, manganese, and zinc the suggested distances are 29, 9, 113, 35, and 14 m, respectively.
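The spherical model mentioned above has a standard closed form, which is what makes such minimum sampling distances computable: semivariance rises from the nugget and flattens at the range a, beyond which samples are spatially uncorrelated. A minimal sketch with generic parameter names (not the fitted values from the study):

```python
def spherical_variogram(h, nugget, sill, a):
    """Spherical semivariogram.

    gamma(h) = nugget + (sill - nugget) * (1.5*(h/a) - 0.5*(h/a)**3) for 0 < h <= a,
    gamma(h) = sill for h > a, and gamma(0) = 0 by convention."""
    if h <= 0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Semivariance grows with lag distance and saturates at the sill
for h in (0, 10, 30, 63, 100):
    print(h, round(spherical_variogram(h, nugget=0.1, sill=1.0, a=63.0), 3))
```

A fitted range of, say, 63 m for calcium would translate directly into the suggested minimum between-sample distance for that nutrient.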
Abstract:
The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times for materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution as the logarithm of the beta-Birnbaum-Saunders distribution. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified for the possibility that long-term survivors may be present in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.
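The classical Birnbaum-Saunders law underlying both constructions has a well-known stochastic representation in terms of a standard normal variate, T = β[αZ/2 + √((αZ/2)² + 1)]², which makes simulation straightforward. A sampling sketch of the base BS distribution only (not the beta-BS extension), with assumed parameter values:

```python
import numpy as np

def sample_birnbaum_saunders(rng, alpha, beta, n):
    """Sample T = beta * (a*Z/2 + sqrt((a*Z/2)^2 + 1))^2 with Z ~ N(0, 1).

    alpha is the shape parameter; beta is the scale and also the median of T."""
    z = rng.normal(0.0, 1.0, n)
    w = alpha * z / 2.0
    return beta * (w + np.sqrt(w ** 2 + 1.0)) ** 2

rng = np.random.default_rng(3)
t = sample_birnbaum_saunders(rng, alpha=0.5, beta=2.0, n=200_000)
print(np.median(t))  # close to beta, the median of the BS distribution
```

Taking the logarithm of such draws gives the log-BS (sinh-normal) scale on which the regression models above are formulated.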
Abstract:
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which illustrates the usefulness of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.