920 resultados para generalised linear mixed model
Resumo:
Linear mixed effects models have been widely used in analysis of data where responses are clustered around some random effects, so it is not reasonable to assume independence between observations in the same cluster. In most biological applications, it is assumed that the distributions of the random effects and of the residuals are Gaussian. This makes inferences vulnerable to the presence of outliers. Here, linear mixed effects models with normal/independent residual distributions for robust inferences are described. Specific distributions examined include univariate and multivariate versions of the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted and Markov chain Monte Carlo is used to carry out the posterior analysis. The procedures are illustrated using birth weight data on rats in a texicological experiment. Results from the Gaussian and robust models are contrasted, and it is shown how the implementation can be used for outlier detection. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process in linear mixed models, and they are easily implemented using data augmentation and MCMC techniques.
Resumo:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Despite the widespread popularity of linear models for correlated outcomes (e.g. linear mixed models and time series models), distribution diagnostic methodology remains relatively underdeveloped in this context. In this paper we present an easy-to-implement approach that lends itself to graphical displays of model fit. Our approach involves multiplying the estimated margional residual vector by the Cholesky decomposition of the inverse of the estimated margional variance matrix. The resulting "rotated" residuals are used to construct an empirical cumulative distribution function and pointwise standard errors. The theoretical framework, including conditions and asymptotic properties, involves technical details that are motivated by Lange and Ryan (1989), Pierce (1982), and Randles (1982). Our method appears to work well in a variety of circumstances, including models having independent units of sampling (clustered data) and models for which all observations are correlated (e.g., a single time series). Our methods can produce satisfactory results even for models that do not satisfy all of the technical conditions stated in our theory.
Resumo:
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Due to the non-closed form of the likelihood, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints often make it difficult to apply these iterative procedures to data sets with very large number of cases. This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM to subsetted versions of the data. Additional gains in efficiency are achieved for Poisson models, commonly used in disease mapping problems, because of their special collapsibility property which allows data reduction through summaries. Convergence of the proposed iterative procedure is guaranteed for canonical link functions. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. A simulation study demonstrates the algorithm's reliability in analyzing a data set with 12 million records for a (non-collapsible) logistic regression model.
Resumo:
In linear mixed models, model selection frequently includes the selection of random effects. Two versions of the Akaike information criterion (AIC) have been used, based either on the marginal or on the conditional distribution. We show that the marginal AIC is no longer an asymptotically unbiased estimator of the Akaike information, and in fact favours smaller models without random effects. For the conditional AIC, we show that ignoring estimation uncertainty in the random effects covariance matrix, as is common practice, induces a bias that leads to the selection of any random effect not predicted to be exactly zero. We derive an analytic representation of a corrected version of the conditional AIC, which avoids the high computational cost and imprecision of available numerical approximations. An implementation in an R package is provided. All theoretical results are illustrated in simulation studies, and their impact in practice is investigated in an analysis of childhood malnutrition in Zambia.
Resumo:
We introduce the log-beta Weibull regression model based on the beta Weibull distribution (Famoye et al., 2005; Lee et al., 2007). We derive expansions for the moment generating function which do not depend on complicated functions. The new regression model represents a parametric family of models that includes as sub-models several widely known regression models that can be applied to censored survival data. We employ a frequentist analysis, a jackknife estimator, and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Further, for different parameter settings, sample sizes, and censoring percentages, several simulations are performed. In addition, the empirical distribution of some modified residuals are displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to evaluate the model assumptions. The extended regression model is very useful for the analysis of real data and could give more realistic fits than other special regression models.
Resumo:
A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceived that the environ mental conditions and facilities shared within clinic affects the proportion cured as well as the failure risk for the uncured individuals. It necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed. Copyright (C) 2001 John Wiley & Sons, Ltd.
Resumo:
This report describes the full research proposal for the project \Balancing and lot-sizing mixed-model lines in the footwear industry", to be developed as part of the master program in Engenharia Electrotécnica e de Computadores - Sistemas de Planeamento Industrial of the Instituto Superior de Engenharia do Porto. The Portuguese footwear industry is undergoing a period of great development and innovation. The numbers speak for themselves, Portugal footwear exported 71 million pairs of shoes to over 130 countries in 2012. It is a diverse sector, which covers different categories of women, men and children shoes, each of them with various models. New and technologically advanced mixed-model assembly lines are being projected and installed to replace traditional mass assembly lines. Obviously there is a need to manage them conveniently and to improve their operations. This work focuses on balancing and lot-sizing stitching mixed-model lines in a real world environment. For that purpose it will be fundamental to develop and evaluate adequate effective solution methods. Different objectives may be considered, which are relevant for the companies, such as minimizing the number of workstations, and minimizing the makespan, while taking into account a lot of practical restrictions. The solution approaches will be based on approximate methods, namely by resorting to metaheuristics. To show the impact of having different lots in production the initial maximum amount for each lot is changed and a Tabu Search based procedure is used to improve the solutions. The developed approaches will be evaluated and tested. A special attention will be given to the solution of real applied problems. Future work may include the study of other neighbourhood structures related to Tabu Search and the development of ways to speed up the evaluation of neighbours, as well as improving the balancing solution method.
Resumo:
The aims of this study were to characterise the ground-level larval habitats of the mosquito Culex quinquefasciatus, to determine the relationships between habitat characteristics and larval abundance and to examine seasonal larval-stage variations in Córdoba city. Every two weeks for two years, 15 larval habitats (natural and artificial water bodies, including shallow wells, drains, retention ponds, canals and ditches) were visited and sampled for larval mosquitoes. Data regarding the water depth, temperature and pH, permanence, the presence of aquatic vegetation and the density of collected mosquito larvae were recorded. Data on the average air temperatures and accumulated precipitation during the 15 days prior to each sampling date were also obtained. Cx. quinquefasciatus larvae were collected throughout the study period and were generally most abundant in the summer season. Generalised linear mixed models indicated the average air temperature and presence of dicotyledonous aquatic vegetation as variables that served as important predictors of larval densities. Additionally, permanent breeding sites supported high larval densities. In Córdoba city and possibly in other highly populated cities at the same latitude with the same environmental conditions, control programs should focus on permanent larval habitats with aquatic vegetation during the early spring, when the Cx. quinquefasciatus population begins to increase.
Resumo:
PURPOSE: The longitudinal relaxation rate (R1 ) measured in vivo depends on the local microstructural properties of the tissue, such as macromolecular, iron, and water content. Here, we use whole brain multiparametric in vivo data and a general linear relaxometry model to describe the dependence of R1 on these components. We explore a) the validity of having a single fixed set of model coefficients for the whole brain and b) the stability of the model coefficients in a large cohort. METHODS: Maps of magnetization transfer (MT) and effective transverse relaxation rate (R2 *) were used as surrogates for macromolecular and iron content, respectively. Spatial variations in these parameters reflected variations in underlying tissue microstructure. A linear model was applied to the whole brain, including gray/white matter and deep brain structures, to determine the global model coefficients. Synthetic R1 values were then calculated using these coefficients and compared with the measured R1 maps. RESULTS: The model's validity was demonstrated by correspondence between the synthetic and measured R1 values and by high stability of the model coefficients across a large cohort. CONCLUSION: A single set of global coefficients can be used to relate R1 , MT, and R2 * across the whole brain. Our population study demonstrates the robustness and stability of the model. Magn Reson Med, 2014. © 2014 The Authors. Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. Magn Reson Med 73:1309-1314, 2015. © 2014 Wiley Periodicals, Inc.
Resumo:
Aim To evaluate the effects of using distinct alternative sets of climatic predictor variables on the performance, spatial predictions and future projections of species distribution models (SDMs) for rare plants in an arid environment. . Location Atacama and Peruvian Deserts, South America (18º30'S - 31º30'S, 0 - 3 000 m) Methods We modelled the present and future potential distributions of 13 species of Heliotropium sect. Cochranea, a plant group with a centre of diversity in the Atacama Desert. We developed and applied a sequential procedure, starting from climate monthly variables, to derive six alternative sets of climatic predictor variables. We used them to fit models with eight modelling techniques within an ensemble forecasting framework, and derived climate change projections for each of them. We evaluated the effects of using these alternative sets of predictor variables on performance, spatial predictions and projections of SDMs using Generalised Linear Mixed Models (GLMM). Results The use of distinct sets of climatic predictor variables did not have a significant effect on overall metrics of model performance, but had significant effects on present and future spatial predictions. Main conclusion Using different sets of climatic predictors can yield the same model fits but different spatial predictions of current and future species distributions. This represents a new form of uncertainty in model-based estimates of extinction risk that may need to be better acknowledged and quantified in future SDM studies.