76 results for LOG-LINEAR MODELS
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
We introduce the log-beta Weibull regression model based on the beta Weibull distribution (Famoye et al., 2005; Lee et al., 2007). We derive expansions for the moment generating function which do not depend on complicated functions. The new regression model represents a parametric family of models that includes as sub-models several widely known regression models that can be applied to censored survival data. We employ a frequentist analysis, a jackknife estimator, and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Further, for different parameter settings, sample sizes, and censoring percentages, several simulations are performed. In addition, the empirical distribution of some modified residuals is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to evaluate the model assumptions. The extended regression model is very useful for the analysis of real data and could give more realistic fits than other special regression models.
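The jackknife estimator mentioned in this abstract can be sketched generically as follows. This is only an illustration of the leave-one-out idea: it uses the sample mean as a stand-in estimator rather than the log-beta Weibull likelihood fit, and the function names and the data are made up for the example.

```python
from math import sqrt

def jackknife(data, estimator):
    """Return (bias-corrected estimate, jackknife standard error)."""
    n = len(data)
    theta_hat = estimator(data)
    # Leave-one-out replicates: recompute the estimator with observation i removed.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    # Bias-corrected jackknife estimate.
    estimate = n * theta_hat - (n - 1) * loo_mean
    # Jackknife standard error from the spread of the replicates.
    se = sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))
    return estimate, se

def mean(xs):
    return sum(xs) / len(xs)

sample = [2.1, 3.4, 1.9, 5.0, 4.2]  # made-up numbers, not data from the paper
estimate, se = jackknife(sample, mean)
print(estimate, se)
```

For the sample mean the jackknife estimate coincides with the mean itself and the standard error equals the usual s/sqrt(n), which makes the sketch easy to check by hand. In a regression setting, `estimator` would instead refit the model on each reduced data set.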
Abstract:
We introduce in this paper the class of linear models with first-order autoregressive elliptical errors. The score functions and the Fisher information matrices are derived for the parameters of interest and an iterative process is proposed for the parameter estimation. Some robustness aspects of the maximum likelihood estimates are discussed. The normal curvatures of local influence are also derived for some usual perturbation schemes, and diagnostic graphics to assess the sensitivity of the maximum likelihood estimates are proposed. The methodology is applied to analyse the daily log excess return on Microsoft, whose empirical distribution appears to have AR(1) and heavy-tailed errors. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Estimating a data transformation is very useful for obtaining response variables that closely satisfy a normal linear model. Generalized linear models enable the fitting of models to a wide range of data types. These models are based on exponential dispersion models. We propose a new class of transformed generalized linear models to extend the Box and Cox models and the generalized linear models. We use the generalized linear model framework to fit these models and discuss maximum likelihood estimation and inference. We give a simple formula to estimate the parameter that indexes the transformation of the response variable for a subclass of models. We also give a simple formula to estimate the rth moment of the original dependent variable. We explore the possibility of applying these models to time series data to extend the generalized autoregressive moving average models discussed by Benjamin et al. [Generalized autoregressive moving average models. J. Amer. Statist. Assoc. 98, 214-223]. The usefulness of these models is illustrated in a simulation study and in applications to three real data sets. (C) 2009 Elsevier B.V. All rights reserved.
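For readers unfamiliar with the Box and Cox models that this abstract extends, the classical Box-Cox power transformation of a positive response can be sketched as below. The function name is an assumption for illustration, and the transformation family actually used in the paper may be more general than this textbook form.

```python
from math import log

def box_cox(y, lam):
    """Classical Box-Cox transform of a positive value y with index lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires a positive response")
    if lam == 0:
        # Limiting case of (y**lam - 1) / lam as lam tends to 0.
        return log(y)
    return (y ** lam - 1) / lam

print(box_cox(4.0, 0.5))  # square-root-type transform
print(box_cox(4.0, 0))    # log transform
```

The index `lam` plays the role of the transformation parameter that the abstract says is estimated from the data.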
Abstract:
In this paper we extend partial linear models with normal errors to Student-t errors. Penalized likelihood equations are applied to derive the maximum likelihood estimates, which appear to be robust against outlying observations in the sense of the Mahalanobis distance. In order to study the sensitivity of the penalized estimates under some usual perturbation schemes in the model or data, the local influence curvatures are derived and some diagnostic graphics are proposed. A motivating example previously analyzed under normal errors is reanalyzed under Student-t errors. The local influence approach is used to compare the sensitivity of the model estimates. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Mixed linear models are commonly used in repeated measures studies. They account for the dependence amongst observations obtained from the same experimental unit. Often, the number of observations is small, and it is thus important to use inference strategies that incorporate small sample corrections. In this paper, we develop modified versions of the likelihood ratio test for fixed effects inference in mixed linear models. In particular, we derive a Bartlett correction to such a test, and also to a test obtained from a modified profile likelihood function. Our results generalize those in [Zucker, D.M., Lieberman, O., Manor, O., 2000. Improved small sample inference in the mixed linear model: Bartlett correction and adjusted likelihood. Journal of the Royal Statistical Society B, 62, 827-838] by allowing the parameter of interest to be vector-valued. Additionally, our Bartlett corrections allow for random effects nonlinear covariance matrix structure. We report simulation results which show that the proposed tests display superior finite sample behavior relative to the standard likelihood ratio test. An application is also presented and discussed. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Birnbaum-Saunders models have largely been applied in material fatigue studies and reliability analyses to relate the total time until failure with some type of cumulative damage. In many problems related to the medical field, such as chronic cardiac diseases and different types of cancer, cumulative damage caused by several risk factors might cause some degradation that leads to a fatigue process. In these cases, BS models can be suitable for describing the propagation lifetime. However, since the cumulative damage is assumed to be normally distributed in the BS distribution, the parameter estimates from this model can be sensitive to outlying observations. In order to attenuate this influence, we present in this paper BS models in which a Student-t distribution is assumed to explain the cumulative damage. In particular, we show that the maximum likelihood estimates of the Student-t log-BS models attribute smaller weights to outlying observations, which produces robust parameter estimates. Also, some inferential results are presented. In addition, based on local influence and deviance component and martingale-type residuals, a diagnostic analysis is derived. Finally, a motivating example from the medical field is analyzed using log-BS regression models. Since the parameter estimates appear to be very sensitive to outlying and influential observations, the Student-t log-BS regression model should attenuate such influences. The model checking methodologies developed in this paper are used to compare the fitted models.
Abstract:
Capybaras were monitored weekly from 1998 to 2006 by counting individuals in three anthropogenic environments (mixed agricultural fields, forest and open areas) of southeastern Brazil in order to examine the possible influence of environmental variables (temperature, humidity, wind speed, precipitation and global radiation) on the detectability of this species. There was consistent seasonality in the number of capybaras in the study area, with a specific seasonal pattern in each area. Log-linear models were fitted to the sample counts of adult capybaras separately for each sampled area, with an allowance for monthly effects, time trends and the effects of environmental variables. Log-linear models containing effects for the months of the year and a quartic time trend were highly significant. The effects of environmental variables on sample counts were different in each type of environment. As environmental variables affect capybara detectability, they should be considered in future species survey/monitoring programs.
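A log-linear model of the kind described in this abstract could be written, as a sketch, in the form below. The abstract states only that the models contain month effects, a time trend (quartic in the significant models) and environmental variables; the symbols, the choice of January as the reference month and the exact parameterisation are assumptions made here for illustration.

```latex
\log \mathrm{E}(Y_t) = \beta_0
  + \sum_{m=2}^{12} \alpha_m \, \mathbb{1}\{\mathrm{month}(t) = m\}
  + \sum_{k=1}^{4} \gamma_k \, t^{k}
  + \mathbf{x}_t^{\top} \boldsymbol{\delta}
```

Here $Y_t$ is the count of adult capybaras in week $t$ for one sampled area, the $\alpha_m$ are monthly effects, the $\gamma_k$ define the quartic time trend, and $\mathbf{x}_t$ collects the environmental variables (temperature, humidity, wind speed, precipitation and global radiation).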
Abstract:
Aims: The heterogeneity of the Brazilian population renders the extrapolation of pharmacogenomic data derived from well-defined ethnic groups inappropriate. We investigated the influence of self-reported 'race/color', geographical origin and genetic ancestry on the distribution of four VKORC1 SNPs and haplotypes in Brazilians. Comparative data were obtained from two major ancestral roots of Brazilians: Portuguese and Africans from former Portuguese colonies. Materials & methods: A total of 1037 healthy adult Brazilians, recruited in four different geographical regions and self-identified as white, brown or black (race/color categories), 89 Portuguese and 216 Africans from Angola and Mozambique were genotyped for the VKORC1 3673G>A (rs9923231), 5808T>G (rs2884737), 6853G>C (rs8050894) and 9041G>A (rs7294) polymorphisms using TaqMan® (Applied Biosystems, CA, USA) assays. VKORC1 haplotypes were statistically inferred using the haplo.stats software. We inferred the statistical association between the distribution of the VKORC1 polymorphisms among Brazilians and self-reported color, geographical region and genetic ancestry by fitting multinomial log-linear models via neural networks. Individual proportions of European and African ancestry were used to assess the impact of genetic admixture on the frequency distribution of VKORC1 polymorphisms among Brazilians, and for the comparison of Brazilians with Portuguese and Africans. Results: The frequency distribution of the 3673G>A and 5808T>G polymorphisms, and VKORC1 haplotypes among Brazilians varies across geographical regions, within self-reported color categories and according to the individual proportions of European and African genetic ancestry. Notably, the frequency of the warfarin sensitive VKORC1 3673A allele and the distribution of VKORC1 haplotypes varied continuously as the individual proportion of European ancestry increased in the entire cohort, independently of race/color categorization and geographical origin. Brazilians with more than 80% African ancestry differ significantly from Angolans and Mozambicans in frequency of the 3673G>A, 5808T>G and 6853G>C polymorphisms and haplotype distribution, whereas no such differences are observed between Brazilians with more than 90% European ancestry and Portuguese individuals. Conclusion: The diversity of the Brazilian population, evident in the distribution of VKORC1 polymorphisms, must be taken into account in the design of pharmacogenetic clinical trials and dealt with as a continuous variable. Warfarin dosing algorithms that include 'race' terms defined for other populations are clearly not applicable to the heterogeneous and extensively admixed Brazilian population.
Abstract:
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.
Abstract:
We introduce, for the first time, a new class of Birnbaum-Saunders nonlinear regression models potentially useful in lifetime data analysis. The class generalizes the regression model described by Rieck and Nedelman [Rieck, J.R., Nedelman, J.R., 1991. A log-linear model for the Birnbaum-Saunders distribution. Technometrics 33, 51-60]. We discuss maximum-likelihood estimation for the parameters of the model, and derive closed-form expressions for the second-order biases of these estimates. Our formulae are easily computed as ordinary linear regressions and are then used to define bias corrected maximum-likelihood estimates. Some simulation results show that the bias correction scheme yields nearly unbiased estimates without increasing the mean squared errors. Two empirical applications are analysed and discussed. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved.
Abstract:
OBJECTIVE: To analyze the trend in mortality from diarrhea among children under 5 years of age in the municipality of Osasco (SP) between 1980 and 2000. METHODS: This is an observational study with two designs: a descriptive one, which takes the individual as the unit of study, and an ecological one, which analyzes population aggregates and includes time series analysis. The data sources were the mortality information system of the State of São Paulo and the 1980, 1991 and 2000 censuses. Seasonal variation was described and, for the trend analysis, polynomial log-linear regression models were applied, using sociodemographic variables of the child and the mother. We analyzed the evolution of the municipality's sociodemographic indicators from 1980 to 2000, the mean rates of mortality from diarrhea among children under 5 years of age, and their differentials by district in the 1990s. RESULTS: Of the 1,360 deaths, 94.3% and 75.3% occurred, respectively, in children under 1 year and under 6 months of age. Mortality declined by 98.3%, with seasonality shifting from summer to autumn. The median age rose from 2 months in the first periods to 3 months in the last. The residual deaths remained among children of mothers aged 20 to 29 years with < 8 years of schooling. The relative risk between the most affected district and the municipal mean rate decreased from 3.4 to 1.3 from the first to the second five-year period of the 1990s. CONCLUSION: Our results point to an increase in the most vulnerable age and a probable change in the agent most frequently associated with death from diarrhea.
Abstract:
In this paper, we present various diagnostic methods for polyhazard models. Polyhazard models are a flexible family for fitting lifetime data. Their main advantage over single hazard models, such as the Weibull and the log-logistic models, is that they include a large number of nonmonotone hazard shapes, such as bathtub and multimodal curves. Some influence methods, such as the local influence and total local influence of an individual, are derived, analyzed and discussed. A discussion of the computation of the likelihood displacement as well as the normal curvature in the local influence method is presented. Finally, an example with real data is given for illustration.
Abstract:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The posterior joint density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about the experimental results.
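The underestimation of variability described in this abstract can be illustrated numerically. The sketch below does not implement the Bayesian DGLM of the paper; it swaps in a beta-binomial model, a standard textbook representation of overdispersed proportions, and compares its variance with the plain binomial variance. The function names and parameter values are illustrative assumptions.

```python
def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1 - p)

def beta_binomial_variance(n, a, b):
    """Variance of a count whose success probability is Beta(a, b) distributed."""
    p = a / (a + b)          # mean success probability
    rho = 1 / (a + b + 1)    # intra-class correlation induced by the random probability
    return n * p * (1 - p) * (1 + (n - 1) * rho)

n = 10
plain = binomial_variance(n, 0.5)
overdispersed = beta_binomial_variance(n, 2.0, 2.0)  # same mean probability, 0.5
print(plain, overdispersed)
```

Both models share the same mean, but the beta-binomial variance is inflated by the factor 1 + (n - 1) * rho. Standard errors computed under the plain binomial model would therefore be too small for such data, which is the mechanism behind the spurious significance the abstract warns about.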
Abstract:
We introduce in this paper a new class of discrete generalized nonlinear models to extend the binomial, Poisson and negative binomial models to cope with count data. This class of models includes some important models such as log-nonlinear models, logit, probit and negative binomial nonlinear models, generalized Poisson and generalized negative binomial regression models, among other models, which enables the fitting of a wide range of models to count data. We derive an iterative process for fitting these models by maximum likelihood and discuss inference on the parameters. The usefulness of the new class of models is illustrated with an application to a real data set. (C) 2008 Elsevier B.V. All rights reserved.