927 resultados para LINEAR-REGRESSION MODELS
Resumo:
The estimation of the long-term wind resource at a prospective site based on a relatively short on-site measurement campaign is an indispensable task in the development of a commercial wind farm. The typical industry approach is based on the measure-correlate-predict �MCP� method where a relational model between the site wind velocity data and the data obtained from a suitable reference site is built from concurrent records. In a subsequent step, a long-term prediction for the prospective site is obtained from a combination of the relational model and the historic reference data. In the present paper, a systematic study is presented where three new MCP models, together with two published reference models �a simple linear regression and the variance ratio method�, have been evaluated based on concurrent synthetic wind speed time series for two sites, simulating the prospective and the reference site. The synthetic method has the advantage of generating time series with the desired statistical properties, including Weibull scale and shape factors, required to evaluate the five methods under all plausible conditions. In this work, first a systematic discussion of the statistical fundamentals behind MCP methods is provided and three new models, one based on a nonlinear regression and two �termed kernel methods� derived from the use of conditional probability density functions, are proposed. All models are evaluated by using five metrics under a wide range of values of the correlation coefficient, the Weibull scale, and the Weibull shape factor. Only one of all models, a kernel method based on bivariate Weibull probability functions, is capable of accurately predicting all performance metrics studied.
Resumo:
Logistic models are studied as a tool to convert dynamical forecast information (deterministic and ensemble) into probability forecasts. A logistic model is obtained by setting the logarithmic odds ratio equal to a linear combination of the inputs. As with any statistical model, logistic models will suffer from overfitting if the number of inputs is comparable to the number of forecast instances. Computational approaches to avoid overfitting by regularization are discussed, and efficient techniques for model assessment and selection are presented. A logit version of the lasso (originally a linear regression technique), is discussed. In lasso models, less important inputs are identified and the corresponding coefficient is set to zero, providing an efficient and automatic model reduction procedure. For the same reason, lasso models are particularly appealing for diagnostic purposes.
Resumo:
This paper aims to understand the physical processes causing the large spread in the storm track projections of the CMIP5 climate models. In particular, the relationship between the climate change responses of the storm tracks, as measured by the 2–6 day mean sea level pressure variance, and the equator-to-pole temperature differences at upper- and lower-tropospheric levels is investigated. In the southern hemisphere the responses of the upper- and lower-tropospheric temperature differences are correlated across the models and as a result they share similar associations with the storm track responses. There are large regions in which the storm track responses are correlated with the temperature difference responses, and a simple linear regression model based on the temperature differences at either level captures the spatial pattern of the mean storm track response as well explaining between 30 and 60 % of the inter-model variance of the storm track responses. In the northern hemisphere the responses of the two temperature differences are not significantly correlated and their associations with the storm track responses are more complicated. In summer, the responses of the lower-tropospheric temperature differences dominate the inter-model spread of the storm track responses. In winter, the responses of the upper- and lower-temperature differences both play a role. The results suggest that there is potential to reduce the spread in storm track responses by constraining the relative magnitudes of the warming in the tropical and polar regions.
Resumo:
In this paper we discuss the current state-of-the-art in estimating, evaluating, and selecting among non-linear forecasting models for economic and financial time series. We review theoretical and empirical issues, including predictive density, interval and point evaluation and model selection, loss functions, data-mining, and aggregation. In addition, we argue that although the evidence in favor of constructing forecasts using non-linear models is rather sparse, there is reason to be optimistic. However, much remains to be done. Finally, we outline a variety of topics for future research, and discuss a number of areas which have received considerable attention in the recent literature, but where many questions remain.
Resumo:
The purpose of this article is to present a new method to predict the response variable of an observation in a new cluster for a multilevel logistic regression. The central idea is based on the empirical best estimator for the random effect. Two estimation methods for multilevel model are compared: penalized quasi-likelihood and Gauss-Hermite quadrature. The performance measures for the prediction of the probability for a new cluster observation of the multilevel logistic model in comparison with the usual logistic model are examined through simulations and an application.
Resumo:
In this paper we discuss bias-corrected estimators for the regression and the dispersion parameters in an extended class of dispersion models (Jorgensen, 1997b). This class extends the regular dispersion models by letting the dispersion parameter vary throughout the observations, and contains the dispersion models as particular case. General formulae for the O(n(-1)) bias are obtained explicitly in dispersion models with dispersion covariates, which generalize previous results obtained by Botter and Cordeiro (1998), Cordeiro and McCullagh (1991), Cordeiro and Vasconcellos (1999), and Paula (1992). The practical use of the formulae is that we can derive closed-form expressions for the O(n(-1)) biases of the maximum likelihood estimators of the regression and dispersion parameters when the information matrix has a closed-form. Various expressions for the O(n(-1)) biases are given for special models. The formulae have advantages for numerical purposes because they require only a supplementary weighted linear regression. We also compare these bias-corrected estimators with two different estimators which are also bias-free to order O(n(-1)) that are based on bootstrap methods. These estimators are compared by simulation. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
This paper studies a smooth-transition (ST) type cointegration. The proposed ST cointegration allows for regime switching structure in a cointegrated system. It nests the linear cointegration developed by Engle and Granger (1987) and the threshold cointegration studied by Balke and Fomby (1997). We develop F-type tests to examine linear cointegration against ST cointegration in ST-type cointegrating regression models with or without time trends. The null asymptotic distributions of the tests are derived with stationary transition variables in ST cointegrating regression models. And it is shown that our tests have nonstandard limiting distributions expressed in terms of standard Brownian motion when regressors are pure random walks, while have standard asymptotic distributions when regressors contain random walks with nonzero drift. Finite-sample distributions of those tests are studied by Monto Carlo simulations. The small-sample performance of the tests states that our F-type tests have a better power when the system contains ST cointegration than when the system is linearly cointegrated. An empirical example for the purchasing power parity (PPP) data (monthly US dollar, Italy lira and dollar-lira exchange rate from 1973:01 to 1989:10) is illustrated by applying the testing procedures in this paper. It is found that there is no linear cointegration in the system, but there exits the ST-type cointegration in the PPP data.
Resumo:
We present the hglm package for fitting hierarchical generalized linear models. It can be used for linear mixed models and generalized linear mixed models with random effects for a variety of links and a variety of distributions for both the outcomes and the random effects. Fixed effects can also be fitted in the dispersion part of the model.
Resumo:
Background: The sensitivity to microenvironmental changes varies among animals and may be under genetic control. It is essential to take this element into account when aiming at breeding robust farm animals. Here, linear mixed models with genetic effects in the residual variance part of the model can be used. Such models have previously been fitted using EM and MCMC algorithms. Results: We propose the use of double hierarchical generalized linear models (DHGLM), where the squared residuals are assumed to be gamma distributed and the residual variance is fitted using a generalized linear model. The algorithm iterates between two sets of mixed model equations, one on the level of observations and one on the level of variances. The method was validated using simulations and also by re-analyzing a data set on pig litter size that was previously analyzed using a Bayesian approach. The pig litter size data contained 10,060 records from 4,149 sows. The DHGLM was implemented using the ASReml software and the algorithm converged within three minutes on a Linux server. The estimates were similar to those previously obtained using Bayesian methodology, especially the variance components in the residual variance part of the model. Conclusions: We have shown that variance components in the residual variance part of a linear mixed model can be estimated using a DHGLM approach. The method enables analyses of animal models with large numbers of observations. An important future development of the DHGLM methodology is to include the genetic correlation between the random effects in the mean and residual variance parts of the model as a parameter of the DHGLM.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Growth curves models provide a visual assessment of growth as a function of time, and prediction body weight at a specific age. This study aimed at estimating tinamous growth curve using different models, and at verifying their goodness of fit. A total number 11,639 weight records from 411 birds, being 6,671 from females and 3,095 from males, was analyzed. The highest estimates of a parameter were obtained using Brody (BD), von Bertalanffy (VB), Gompertz (GP,) and Logistic function (LG). Adult females were 5.7% heavier than males. The highest estimates of b parameter were obtained in the LG, GP, BID, and VB models. The estimated k parameter values in decreasing order were obtained in LG, GP, VB, and BID models. The correlation between the parameters a and k showed heavier birds are less precocious than the lighter. The estimates of intercept, linear regression coefficient, quadratic regression coefficient, and differences between quadratic coefficient of functions and estimated ties of quadratic-quadratic-quadratic segmented polynomials (QQQSP) were: 31.1732 +/- 2.41339; 3.07898 +/- 0.13287; 0.02689 +/- 0.00152; -0.05566 +/- 0.00193; 0.02349 +/- 0.00107, and 57 and 145 days, respectively. The estimated predicted mean error values (PME) of VB, GP, BID, LG, and QQQSP models were, respectively, 0.8353; 0.01715; -0.6939; -2.2453; and -0.7544%. The coefficient of determination (RI) and least square error values (MS) showed similar results. In conclusion, the VB and the QQQSP models adequately described tinamous growth. The best model to describe tinamous growth was the Gompertz model, because it presented the highest R-2 values, easiness of convergence, lower PME, and the easiness of parameter biological interpretation.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
It is often necessary to run response surface designs in blocks. In this paper the analysis of data from such experiments, using polynomial regression models, is discussed. The definition and estimation of pure error in blocked designs are considered. It is recommended that pure error is estimated by assuming additive block and treatment effects, as this is more consistent with designs without blocking. The recovery of inter-block information using REML analysis is discussed, although it is shown that it has very little impact if thc design is nearly orthogonally blocked. Finally prediction from blocked designs is considered and it is shown that prediction of many quantities of interest is much simpler than prediction of the response itself.