921 resultados para random coefficient regression model
Resumo:
This paper investigates the usefulness of switching Gaussian state space models as a tool for implementing dynamic model selecting (DMS) or averaging (DMA) in time-varying parameter regression models. DMS methods allow for model switching, where a different model can be chosen at each point in time. Thus, they allow for the explanatory variables in the time-varying parameter regression model to change over time. DMA will carry out model averaging in a time-varying manner. We compare our exact approach to DMA/DMS to a popular existing procedure which relies on the use of forgetting factor approximations. In an application, we use DMS to select different predictors in an in ation forecasting application. We also compare different ways of implementing DMA/DMS and investigate whether they lead to similar results.
Resumo:
The context where the university admissions exams are performed is presented and the main concerns about this exams are outlined and discussed from a statistical point of view. The paper offers an illustration of the use of random coefficient models in the study of educational data. The association between two individual scores (one internal and the other external to the school) and the effect of the school in the external exam is analized by a regression model with random intercept and fixed slope. A variance component model for the analysis of the grading process is also presented. The paper ends with an outline of the main findings and the presentation of some specific proposals to improve and control the equity of the system. Some pedagogic reflections are also included.
Resumo:
This paper suggests a method for obtaining efficiency bounds in models containing either only infinite-dimensional parameters or both finite- and infinite-dimensional parameters (semiparametric models). The method is based on a theory of random linear functionals applied to the gradient of the log-likelihood functional and is illustrated by computing the lower bound for Cox's regression model
Resumo:
We present a model for transport in multiply scattering media based on a three-dimensional generalization of the persistent random walk. The model assumes that photons move along directions that are parallel to the axes. Although this hypothesis is not realistic, it allows us to solve exactly the problem of multiple scattering propagation in a thin slab. Among other quantities, the transmission probability and the mean transmission time can be calculated exactly. Besides being completely solvable, the model could be used as a benchmark for approximation schemes to multiple light scattering.
Resumo:
Some models have been developed using agrometeorological and remote sensing data to estimate agriculture production. However, it is expected that the use of SAR images can improve their performance. The main objective of this study was to estimate the sugarcane production using a multiple linear regression model which considers agronomic data and ALOS/PALSAR images obtained from 2007/08, 2008/09 and 2009/10 cropping seasons. The performance of models was evaluated by coefficient of determination, t-test, Willmott agreement index (d), random error and standard error. The model was able to explain 79%, 12% and 74% of the variation in the observed productions of the 2007/08, 2008/09 and 2009/10 cropping seasons, respectively. Performance of the model for the 2008/09 cropping season was poor because of the occurrence of a long period of drought in that season. When the three seasons were considered all together, the model explained 66% of the variation. Results showed that SAR-based yield prediction models can contribute and assist sugar mill technicians to improve such estimates.
Resumo:
Several Authors Have Discussed Recently the Limited Dependent Variable Regression Model with Serial Correlation Between Residuals. the Pseudo-Maximum Likelihood Estimators Obtained by Ignoring Serial Correlation Altogether, Have Been Shown to Be Consistent. We Present Alternative Pseudo-Maximum Likelihood Estimators Which Are Obtained by Ignoring Serial Correlation Only Selectively. Monte Carlo Experiments on a Model with First Order Serial Correlation Suggest That Our Alternative Estimators Have Substantially Lower Mean-Squared Errors in Medium Size and Small Samples, Especially When the Serial Correlation Coefficient Is High. the Same Experiments Also Suggest That the True Level of the Confidence Intervals Established with Our Estimators by Assuming Asymptotic Normality, Is Somewhat Lower Than the Intended Level. Although the Paper Focuses on Models with Only First Order Serial Correlation, the Generalization of the Proposed Approach to Serial Correlation of Higher Order Is Also Discussed Briefly.
Resumo:
Influences of inbreeding on daily milk yield (DMY), age at first calving (AFC), and calving intervals (CI) were determined on a highly inbred zebu dairy subpopulation of the Guzerat breed. Variance components were estimated using animal models in single-trait analyses. Two approaches were employed to estimate inbreeding depression: using individual increase in inbreeding coefficients or using inbreeding coefficients as possible covariates included in the statistical models. The pedigree file included 9,915 animals, of which 9,055 were inbred, with an average inbreeding coefficient of 15.2%. The maximum inbreeding coefficient observed was 49.45%, and the average inbreeding for the females still in the herd during the analysis was 26.42%. Heritability estimates were 0.27 for DMY and 0.38 for AFC. The genetic variance ratio estimated with the random regression model for CI ranged around 0.10. Increased inbreeding caused poorer performance in DMY, AFC, and CI. However, some of the cows with the highest milk yield were among the highly inbred animals in this subpopulation. Individual increase in inbreeding used as a covariate in the statistical models accounted for inbreeding depression while avoiding overestimation that may result when fitting inbreeding coefficients.
Resumo:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
In the present work we perform an econometric analysis of the Tribal art market. To this aim, we use a unique and original database that includes information on Tribal art market auctions worldwide from 1998 to 2011. In Literature, art prices are modelled through the hedonic regression model, a classic fixed-effect model. The main drawback of the hedonic approach is the large number of parameters, since, in general, art data include many categorical variables. In this work, we propose a multilevel model for the analysis of Tribal art prices that takes into account the influence of time on artwork prices. In fact, it is natural to assume that time exerts an influence over the price dynamics in various ways. Nevertheless, since the set of objects change at every auction date, we do not have repeated measurements of the same items over time. Hence, the dataset does not constitute a proper panel; rather, it has a two-level structure in that items, level-1 units, are grouped in time points, level-2 units. The main theoretical contribution is the extension of classical multilevel models to cope with the case described above. In particular, we introduce a model with time dependent random effects at the second level. We propose a novel specification of the model, derive the maximum likelihood estimators and implement them through the E-M algorithm. We test the finite sample properties of the estimators and the validity of the own-written R-code by means of a simulation study. Finally, we show that the new model improves considerably the fit of the Tribal art data with respect to both the hedonic regression model and the classic multilevel model.
Resumo:
We introduce a diagnostic test for the mixing distribution in a generalised linear mixed model. The test is based on the difference between the marginal maximum likelihood and conditional maximum likelihood estimates of a subset of the fixed effects in the model. We derive the asymptotic variance of this difference, and propose a test statistic that has a limiting chi-square distribution under the null hypothesis that the mixing distribution is correctly specified. For the important special case of the logistic regression model with random intercepts, we evaluate via simulation the power of the test in finite samples under several alternative distributional forms for the mixing distribution. We illustrate the method by applying it to data from a clinical trial investigating the effects of hormonal contraceptives in women.
Resumo:
Ecosystems are faced with high rates of species loss which has consequences for their functions and services. To assess the effects of plant species diversity on the nitrogen (N) cycle, we developed a model for monthly mean nitrate (NO3-N) concentrations in soil solution in 0-30 cm mineral soil depth using plant species and functional group richness and functional composition as drivers and assessing the effects of conversion of arable land to grassland, spatially heterogeneous soil properties, and climate. We used monthly mean NO3-N concentrations from 62 plots of a grassland plant diversity experiment from 2003 to 2006. Plant species richness (1-60) and functional group composition (1-4 functional groups: legumes, grasses, non-leguminous tall herbs, non-leguminous small herbs) were manipulated in a factorial design. Plant community composition, time since conversion from arable land to grassland, soil texture, and climate data (precipitation, soil moisture, air and soil temperature) were used to develop one general Bayesian multiple regression model for the 62 plots to allow an in-depth evaluation using the experimental design. The model simulated NO3-N concentrations with an overall Bayesian coefficient of determination of 0.48. The temporal course of NO3-N concentrations was simulated differently well for the individual plots with a maximum plot-specific Nash-Sutcliffe Efficiency of 0.57. The model shows that NO3-N concentrations decrease with species richness, but this relation reverses if more than approx. 25 % of legume species are included in the mixture. Presence of legumes increases and presence of grasses decreases NO3-N concentrations compared to mixtures containing only small and tall herbs. Altogether, our model shows that there is a strong influence of plant community composition on NO3-N concentrations.
Resumo:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^
Resumo:
Many destination marketing organizations in the United States and elsewhere are facing budget retrenchment for tourism marketing, especially for advertising. This study evaluates a three-stage model using Random Coefficient Logit (RCL) approach which controls for correlations between different non-independent alternatives and considers heterogeneity within individual’s responses to advertising. The results of this study indicate that the proposed RCL model results in a significantly better fit as compared to traditional logit models, and indicates that tourism advertising significantly influences tourist decisions with several variables (age, income, distance and Internet access) moderating these decisions differently depending on decision stage and product type. These findings suggest that this approach provides a better foundation for assessing, and in turn, designing more effective advertising campaigns.
Resumo:
BACKGROUND Respiratory tract infections and subsequent airway inflammation occur early in the life of infants with cystic fibrosis. However, detailed information about the microbial composition of the respiratory tract in infants with this disorder is scarce. We aimed to undertake longitudinal in-depth characterisation of the upper respiratory tract microbiota in infants with cystic fibrosis during the first year of life. METHODS We did this prospective cohort study at seven cystic fibrosis centres in Switzerland. Between Feb 1, 2011, and May 31, 2014, we enrolled 30 infants with a diagnosis of cystic fibrosis. Microbiota characterisation was done with 16S rRNA gene pyrosequencing and oligotyping of nasal swabs collected every 2 weeks from the infants with cystic fibrosis. We compared these data with data for an age-matched cohort of 47 healthy infants. We additionally investigated the effect of antibiotic treatment on the microbiota of infants with cystic fibrosis. Statistical methods included regression analyses with a multivariable multilevel linear model with random effects to correct for clustering on the individual level. FINDINGS We analysed 461 nasal swabs taken from the infants with cystic fibrosis; the cohort of healthy infants comprised 872 samples. The microbiota of infants with cystic fibrosis differed compositionally from that of healthy infants (p=0·001). This difference was also found in exclusively antibiotic-naive samples (p=0·001). The disordering was mainly, but not solely, due to an overall increase in the mean relative abundance of Staphylococcaceae in infants with cystic fibrosis compared with healthy infants (multivariable linear regression model stratified by age and adjusted for season; second month: coefficient 16·2 [95% CI 0·6-31·9]; p=0·04; third month: 17·9 [3·3-32·5]; p=0·02; fourth month: 21·1 [7·8-34·3]; p=0·002). Oligotyping analysis enabled differentiation between Staphylococcus aureus and coagulase-negative Staphylococci. Whereas the analysis showed a decrease in S aureus at and after antibiotic treatment, coagulase-negative Staphylococci increased. INTERPRETATION Our study describes compositional differences in the microbiota of infants with cystic fibrosis compared with healthy controls, and disordering of the microbiota on antibiotic administration. Besides S aureus, coagulase-negative Staphylococci also contributed to the disordering identified in these infants. These findings are clinically important in view of the crucial role that bacterial pathogens have in the disease progression of cystic fibrosis in early life. Our findings could be used to inform future studies of the effect of antibiotic treatment on the microbiota in infants with cystic fibrosis, and could assist in the prevention of early disease progression in infants with this disorder. FUNDING Swiss National Science Foundation, Fondation Botnar, the Swiss Society for Cystic Fibrosis, and the Swiss Lung Association Bern.
Resumo:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical Study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which tender the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression model with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.