10 results for random coefficient models
in Dalarna University College Electronic Archive
Abstract:
Random effect models have been widely applied in many fields of research. However, models with uncertain design matrices for random effects have been little investigated. In some applications with such problems, an expectation method has been used for simplicity. This method does not include the extra information about uncertainty in the design matrix. A closed-form solution for this problem is generally difficult to attain. We therefore propose a two-step algorithm for estimating the parameters, especially the variance components in the model. The implementation is based on Monte Carlo approximation and a Newton-Raphson-based EM algorithm. As an example, a simulated genetics dataset was analyzed. The results showed that the proportion of the total variance explained by the random effects was accurately estimated, whereas it was severely underestimated by the expectation method. By introducing heuristic search and optimization methods, the algorithm can possibly be developed to infer the 'model-based' best design matrix and the corresponding best estimates.
Abstract:
Gibrat's law predicts that firm growth is purely random and should be independent of firm size. We use a random effects-random coefficient model to test whether Gibrat's law holds on average in the studied sample as well as at the individual firm level in the Swedish energy market. No study has yet investigated whether Gibrat's law holds for individual firms, previous studies having instead estimated whether the law holds on average in the samples studied. The present results support the claim that Gibrat's law is more likely to be rejected ex ante when an entire firm population is considered, but more likely to be confirmed ex post after market selection has "cleaned" the original population of firms or when the analysis treats more disaggregated data. From a theoretical perspective, the results are consistent with models based on passive and active learning, indicating a steady state in the firm expansion process and that Gibrat's law is violated in the short term but holds in the long term once firms have reached a steady state. These results indicate that approximately 70 % of firms in the Swedish energy sector are in steady state, with only random fluctuations in size around that level over the 15 studied years.
Abstract:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes. The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
Abstract:
This paper presents a two-step pseudo-likelihood estimation technique for generalized linear mixed models in which the random effects are correlated between groups. The core idea is to handle the intractable integrals in the likelihood function with a multivariate Taylor approximation. The accuracy of the estimation technique is assessed in a Monte Carlo study. An application with a binary response variable is presented using a real data set on credit defaults from two Swedish banks. Thanks to the two-step estimation technique, the proposed algorithm outperforms conventional pseudo-likelihood algorithms in terms of computational time.
Abstract:
Objective: To investigate if a home environment test battery can be used to measure effects of Parkinson’s disease (PD) treatment intervention and disease progression.
Background: Seventy-seven patients diagnosed with advanced PD were recruited in an open longitudinal 36-month study at 10 clinics in Sweden and Norway; 40 of them were treated with levodopa-carbidopa intestinal gel (LCIG) and 37 patients were candidates for switching from oral PD treatment to LCIG. They utilized a mobile device test battery, consisting of self-assessments of symptoms and objective measures of motor function through a set of fine motor tests (tapping and spiral drawings), in their homes. Both the LCIG-naïve and LCIG-non-naïve patients used the test battery four times per day during week-long test periods.
Methods:
Assessments: The LCIG-naïve patients used the test battery at baseline (before LCIG), month 0 (first visit; at least 3 months after intraduodenal LCIG), and thereafter quarterly for the first year and biannually for the second and third years. The LCIG-non-naïve patients used the test battery from the first visit, i.e. month 0. Out of the 77 patients, only 65 utilized the test battery; 35 were LCIG-non-naïve and 30 LCIG-naïve. In 20 of the LCIG-naïve patients, assessments with the test battery were available during oral treatment and at least one test period after having started infusion treatment. Three LCIG-naïve patients did not use the test battery at baseline but had at least one test period of assessments thereafter. Hence, n=23 in the LCIG-naïve group. In total, symptom assessments in the full sample (including both patient groups) were collected during 379 test periods and 10079 test occasions. For 369 of these test periods, clinical assessments including UPDRS and PDQ-39 were performed in afternoons at the start of the test periods.
The repeated measurements of the test battery were processed and summarized into scores representing patients’ symptom severities over a test period, using statistical methods. Six conceptual dimensions were defined; four subjectively-reported: ‘walking’, ‘satisfied’, ‘dyskinesia’, and ‘off’ and two objectively-measured: ‘tapping’ and ‘spiral’. In addition, an ‘overall test score’ (OTS) was defined to represent the global health condition of the patient during a test period.
Statistical methods: Change in the test battery scores over time, that is at baseline and follow-up test periods, was assessed with linear mixed-effects models with patient ID as a random effect and test period as a fixed effect of interest. The within-patient variability of OTS was assessed using intra-class correlation coefficient (ICC), for the two patient groups. Correlations between clinical rating scores and test battery scores were assessed using Spearman’s rank correlations (rho).
Results: In LCIG-naïve patients, mean OTS compared to baseline was significantly improved from the first test period on LCIG treatment until month 24. However, there were no significant changes in mean OTS scores of LCIG-non-naïve patients, except for worse mean OTS at month 36 (p<0.01, n=16). The mean scores of all subjectively-reported dimensions improved significantly throughout the course of the study, except ‘walking’ at month 36 (p=0.41, n=4). However, there were no significant differences in mean scores of objectively-measured dimensions between baseline and other test periods, except improved ‘tapping’ at month 6 and month 36, and ‘spiral’ at month 3 (p<0.05). The LCIG-naïve patients had a higher within-subject variability in their OTS scores (ICC=0.67) compared to LCIG-non-naïve patients (ICC=0.71). The OTS correlated adequately with total UPDRS (rho=0.59) and total PDQ-39 (rho=0.59).
Conclusions: In this 3-year follow-up study of advanced PD patients treated with LCIG we found that it is possible to monitor PD progression over time using a home environment test battery. The significant improvements in the mean OTS scores indicate that the test battery is able to measure functional improvement with LCIG sustained over at least 24 months.
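To make the ICC comparison in the abstract above concrete, here is a minimal Python sketch of an intra-class correlation computed from repeated scores. The abstract does not say which ICC variant was used, so the one-way random-effects form on a balanced design is an assumption, and the function name `icc_oneway` and the simulated scores are purely illustrative. A lower ICC means a larger share of the total variance lies within patients, which is how ICC=0.67 can indicate higher within-subject variability than ICC=0.71.

```python
import numpy as np


def icc_oneway(scores):
    """One-way random-effects ICC for a balanced (patients x periods) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    patient_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    # Between-patient and within-patient mean squares from a one-way ANOVA.
    msb = k * np.sum((patient_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((scores - patient_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Simulated (hypothetical) scores: between-patient variance 0.7 and
# within-patient variance 0.3, so the population ICC is 0.7 / (0.7 + 0.3).
rng = np.random.default_rng(0)
n_patients, n_periods = 200, 6
patient_effect = rng.normal(0.0, np.sqrt(0.7), size=(n_patients, 1))
noise = rng.normal(0.0, np.sqrt(0.3), size=(n_patients, n_periods))
icc = icc_oneway(patient_effect + noise)
print(f"estimated ICC: {icc:.2f}")
```

With unbalanced data, as in the study, the variance components would normally be taken from a fitted mixed model rather than from these ANOVA mean squares.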
Abstract:
We present the hglm package for fitting hierarchical generalized linear models. It can be used for linear mixed models and generalized linear mixed models with random effects for a variety of links and a variety of distributions for both the outcomes and the random effects. Fixed effects can also be fitted in the dispersion part of the model.
Abstract:
Background: The sensitivity to microenvironmental changes varies among animals and may be under genetic control. It is essential to take this element into account when aiming at breeding robust farm animals. Here, linear mixed models with genetic effects in the residual variance part of the model can be used. Such models have previously been fitted using EM and MCMC algorithms. Results: We propose the use of double hierarchical generalized linear models (DHGLM), where the squared residuals are assumed to be gamma distributed and the residual variance is fitted using a generalized linear model. The algorithm iterates between two sets of mixed model equations, one on the level of observations and one on the level of variances. The method was validated using simulations and also by re-analyzing a data set on pig litter size that was previously analyzed using a Bayesian approach. The pig litter size data contained 10,060 records from 4,149 sows. The DHGLM was implemented using the ASReml software and the algorithm converged within three minutes on a Linux server. The estimates were similar to those previously obtained using Bayesian methodology, especially the variance components in the residual variance part of the model. Conclusions: We have shown that variance components in the residual variance part of a linear mixed model can be estimated using a DHGLM approach. The method enables analyses of animal models with large numbers of observations. An important future development of the DHGLM methodology is to include the genetic correlation between the random effects in the mean and residual variance parts of the model as a parameter of the DHGLM.
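The alternation described in the abstract above, between a fit on the level of observations and a fit on the level of variances, can be sketched in a stripped-down form. The Python sketch below is not the DHGLM of the paper and is unrelated to ASReml. It assumes a model with fixed effects only, omits random effects and the leverage correction, and uses illustrative names (`fit_mean_and_log_variance`, `X`, `Z`). It only shows a weighted mean fit alternating with a gamma log-link fit to the squared residuals.

```python
import numpy as np


def fit_mean_and_log_variance(X, Z, y, n_iter=25):
    """Alternate a weighted mean fit with a gamma log-link fit to squared residuals.

    X: design matrix for the mean; Z: design matrix for the log residual variance.
    Simplification: no random effects and no leverage correction.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    gamma = np.zeros(Z.shape[1])
    gamma[0] = np.log(np.mean((y - X @ beta) ** 2))  # assumes Z[:, 0] is an intercept
    for _ in range(n_iter):
        # Observation level: weighted least squares, weights = 1 / fitted variance.
        sw = np.exp(-0.5 * (Z @ gamma))
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        # Variance level: one IRLS step of a gamma GLM with log link.
        # For this link the working weights are constant, so the step is
        # an ordinary least-squares fit to the working response.
        d = (y - X @ beta) ** 2
        eta = Z @ gamma
        working = eta + d / np.exp(eta) - 1.0
        gamma = np.linalg.lstsq(Z, working, rcond=None)[0]
    return beta, gamma


# Simulated (hypothetical) data with a residual variance that depends on x.
rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
Z = X.copy()
true_beta = np.array([1.0, 2.0])
true_gamma = np.array([-1.0, 1.5])
y = X @ true_beta + rng.normal(size=n) * np.exp(0.5 * (Z @ true_gamma))

beta_hat, gamma_hat = fit_mean_and_log_variance(X, Z, y)
print("mean coefficients:", beta_hat)
print("log-variance coefficients:", gamma_hat)
```

In the full method, each of the two least-squares steps would be replaced by a set of mixed model equations, so that genetic effects can enter both the mean and the residual variance.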
Abstract:
We present a new version of the hglm package for fitting hierarchical generalized linear models (HGLM) with spatially correlated random effects. A CAR family for conditional autoregressive random effects was implemented. Eigen decomposition of the matrix describing the spatial structure (e.g. the neighborhood matrix) was used to transform the CAR random effects into an independent, but heteroscedastic, Gaussian random effect. A linear predictor is fitted for the random effect variance to estimate the parameters in the CAR model. This gives a computationally efficient algorithm for moderately sized problems (e.g. n<5000).
Abstract:
We present a new version (> 2.0) of the hglm package for fitting hierarchical generalized linear models (HGLMs) with spatially correlated random effects. CAR() and SAR() families for conditional and simultaneous autoregressive random effects were implemented. Eigen decomposition of the matrix describing the spatial structure (e.g., the neighborhood matrix) was used to transform the CAR/SAR random effects into an independent, but heteroscedastic, Gaussian random effect. A linear predictor is fitted for the random effect variance to estimate the parameters in the CAR and SAR models. This gives a computationally efficient algorithm for moderately sized problems.
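The eigen decomposition step for the CAR case can be checked numerically. The sketch below uses Python with NumPy and is not the hglm R code. It assumes a common CAR parametrisation in which the random effects have covariance tau * inverse(I - rho * W) for a symmetric neighborhood matrix W; the names `W`, `rho`, `tau` and `car_eigen_transform` and the four-region example are illustrative. Under that assumption, rotating by the eigenvectors of W gives independent effects whose inverse variances are linear in the eigenvalues, which is what allows a linear predictor to be fitted for the random effect variance.

```python
import numpy as np


def car_eigen_transform(W, rho, tau):
    """Eigenvectors, eigenvalues and variances of the rotated CAR effects.

    Assumes u ~ N(0, tau * inv(I - rho * W)) with symmetric W.
    With W = V diag(d) V^T, the rotated effect V^T u has independent
    components with variances tau / (1 - rho * d_i).
    """
    d, V = np.linalg.eigh(W)
    variances = tau / (1.0 - rho * d)
    return V, d, variances


# Hypothetical neighborhood matrix: four regions arranged on a line.
W = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
rho, tau = 0.4, 2.0
V, d, variances = car_eigen_transform(W, rho, tau)

# Covariance of the original effects and of the rotated effects.
cov_u = tau * np.linalg.inv(np.eye(4) - rho * W)
cov_u_star = V.T @ cov_u @ V
print(np.round(cov_u_star, 6))  # diagonal: independent but heteroscedastic
print(np.round(1.0 / variances, 6))  # equals 1/tau - (rho/tau) * d
```

Because the rotation is computed once from W, the spatial model reduces to an ordinary model with independent random effects and a known covariate (the eigenvalues) for their variances.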
Abstract:
Generalized linear mixed models are flexible tools for modeling non-normal data and are useful for accommodating overdispersion in Poisson regression models with random effects. Their main difficulty lies in parameter estimation, because there is no analytic solution for the maximization of the marginal likelihood. Many methods have been proposed for this purpose and many of them are implemented in software packages. The purpose of this study is to compare the performance of three different statistical principles (marginal likelihood, extended likelihood, and Bayesian analysis) via simulation studies. Real data on contact wrestling are used for illustration.
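The link between random effects and overdispersion mentioned in the abstract above can be seen in a small simulation. The Python sketch below assumes a Poisson model with a log link and a normally distributed random intercept per group. The parameter values and variable names are hypothetical and are not taken from the study. A plain Poisson model forces the variance to equal the mean, whereas the random intercept inflates the marginal variance.

```python
import numpy as np

rng = np.random.default_rng(42)
n_groups, n_per_group = 2000, 5
beta0, sigma2 = 1.0, 0.5  # hypothetical intercept and random-intercept variance

# One random intercept per group, shared by all observations in that group.
u = rng.normal(0.0, np.sqrt(sigma2), size=n_groups)
mu = np.exp(beta0 + np.repeat(u, n_per_group))
y = rng.poisson(mu)

sample_mean, sample_var = y.mean(), y.var(ddof=1)

# Marginal moments implied by the lognormal mixing distribution.
theory_mean = np.exp(beta0 + sigma2 / 2)
theory_var = theory_mean + theory_mean**2 * (np.exp(sigma2) - 1)

print(f"sample mean {sample_mean:.2f}, sample variance {sample_var:.2f}")
print(f"theory mean {theory_mean:.2f}, theory variance {theory_var:.2f}")
```

The marginal distribution of `y` has no closed form here, which is the estimation difficulty the abstract refers to: the likelihood involves an integral over the random intercept.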