71 results for Geographic Regression Discontinuity


Relevance:

20.00% 20.00%

Publisher:

Abstract:

An efficient data-based modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks, with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each of the RBF kernels has its own kernel width parameter, and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each of which is associated with a kernel, one at a time within the orthogonal forward regression (OFR) procedure. Thus, each OFR step consists of one model term selection based on the LOO mean square error (LOOMSE), followed by the optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Since the same LOOMSE is adopted for model selection as in our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, our proposed new OFR algorithm is also capable of producing a very sparse RBF model with excellent generalization performance. Unlike our previous LROLS algorithm, which requires an additional iterative loop to optimize the regularization parameters as well as an additional procedure to optimize the kernel width, the proposed new OFR algorithm optimizes both the kernel widths and regularization parameters within the single OFR procedure, and consequently the required computational complexity is dramatically reduced. Nonlinear system identification examples are included to demonstrate the effectiveness of this new approach in comparison to the well-known approaches of support vector machine and least absolute shrinkage and selection operator as well as the LROLS algorithm.
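To illustrate the LOOMSE criterion mentioned in this abstract, the following is a minimal sketch, not the authors' OFR algorithm. It assumes a single shared kernel width and a single regularization parameter (the abstract uses one pair per kernel), and the function names `rbf_design` and `loomse` are hypothetical. For a regularized linear-in-the-parameters model, the leave-one-out residuals can be obtained from one fit by dividing each ordinary residual by one minus the matching diagonal entry of the hat matrix, so no refitting is needed.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian RBF design matrix for 1-D inputs (assumed kernel form)."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

def loomse(phi, y, lam):
    """Leave-one-out mean square error of a ridge-regularized linear model.

    Uses the identity e_loo[i] = e[i] / (1 - H[i, i]), where
    H = phi (phi^T phi + lam I)^-1 phi^T is the hat matrix.
    """
    n_terms = phi.shape[1]
    a = phi.T @ phi + lam * np.eye(n_terms)
    hat = phi @ np.linalg.solve(a, phi.T)
    resid = y - hat @ y
    loo_resid = resid / (1.0 - np.diag(hat))
    return float(np.mean(loo_resid ** 2))
```

A width or regularization value could then be chosen by evaluating `loomse` over candidate values and keeping the smallest result.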

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new class of parameter estimation algorithms is introduced for Gaussian process regression (GPR) models. It is shown that the integration of the GPR model with probability distance measures of (i) the integrated square error and (ii) Kullback–Leibler (K–L) divergence is analytically tractable. An efficient coordinate descent algorithm is proposed to iteratively estimate the kernel width using golden section search, which includes a fast gradient descent algorithm as an inner loop to estimate the noise variance. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
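The golden section search named in this abstract is a generic one-dimensional minimizer, and a minimal sketch of it is given here. The function name `golden_section_minimize` is hypothetical, the objective `f` stands in for whichever distance measure is being minimized over the kernel width, and the inner gradient descent loop for the noise variance is not shown. The method assumes `f` has a single minimum on the search interval.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)
```

Each iteration reuses one of the two previous function values, so only one new evaluation of `f` is needed per step.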

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems on covariates of more complex form such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising solution to reduce the model’s dimensionality to a manageable level, thus leading to efficient estimation. Most of the existing tensor-based methods independently estimate each individual regression problem based on tensor decomposition which allows the simultaneous projections of an input tensor to more than one direction along each mode. As a matter of fact, multi-dimensional data are collected under the same or very similar conditions, so that data share some common latent components but can also have their own independent parameters for each regression task. Therefore, it is beneficial to analyse regression parameters among all the regressions in a linked way. In this paper, we propose a tensor regression model based on Tucker Decomposition, which identifies not only the common components of parameters across all the regression tasks, but also independent factors contributing to each particular regression task simultaneously. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modeling further reduce the total number of parameters, with lower memory cost than their tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
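As a rough illustration of how a Tucker-structured coefficient tensor reduces dimensionality, the sketch below builds the coefficients from a small core and one factor matrix per mode, for three-mode covariates. It does not implement the linked analysis across tasks or the sparsity regulariser described in the abstract, and the function names `tucker_coefficients`, `predict` and `parameter_counts` are hypothetical.

```python
import numpy as np

def tucker_coefficients(core, factors):
    """Full coefficient tensor W = core x1 U1 x2 U2 x3 U3."""
    u1, u2, u3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)

def predict(x, core, factors):
    """Predict <X_n, W> for each sample without forming W explicitly.

    Each covariate tensor is first projected onto the factor matrices,
    then contracted with the small core.
    """
    u1, u2, u3 = factors
    z = np.einsum("nijk,ia,jb,kc->nabc", x, u1, u2, u3)
    return np.einsum("nabc,abc->n", z, core)

def parameter_counts(dims, ranks):
    """Number of free coefficients: unstructured versus Tucker form."""
    full = int(np.prod(dims))
    tucker = int(np.prod(ranks)) + sum(d * r for d, r in zip(dims, ranks))
    return full, tucker
```

For covariates of size 8 x 9 x 10 and ranks (2, 3, 2), `parameter_counts` gives 720 unstructured coefficients against 75 in Tucker form.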

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The tolerance of Callosobruchus maculatus from different geographical locations reared on two cowpea varieties, pale brown Ife Brown (IFBV) and dark brown IAR48 (IAR48V), to seed powder of Piper guineense (Schum and Thonn) was investigated. C. maculatus populations were collected from nine different locations across Osun state in the South-Western part of Nigeria. The main and interactive effects of cowpea variety, population origin and dose on C. maculatus tolerance to P. guineense were explored. It was observed that bruchids that emerged from IAR48V had greater tolerance of P. guineense than bruchids reared on IFBV. There were significant effects (P < 0.001) of cowpea variety, population and dose, and significant interactions among these factors (except variety × dose, P > 0.05) on the response of bruchids to P. guineense. When reared on IAR48V, bruchid populations from the North-Eastern part of the state showed greater tolerance to P. guineense than their counterparts from the South-West. This study underscores the importance of knowing the origin of the population and the cowpea variety on which C. maculatus developed when managing bruchid damage using P. guineense.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
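The inverse power curve transformation can be sketched as a table lookup, as below. The power curve values are invented for illustration and do not come from any manufacturer, the names `power_to_wind` and `censoring_flags` are hypothetical, and the censored regression fit itself is not shown. Observations at zero or nominal power only bound the wind speed from one side, which is why they are flagged as censored.

```python
import numpy as np

# Hypothetical power curve: wind speed (m/s) and power (kW), strictly
# increasing between cut-in speed and rated power.
CURVE_SPEED = np.array([3.0, 5.0, 7.0, 9.0, 11.0, 13.0])
CURVE_POWER = np.array([0.0, 200.0, 700.0, 1400.0, 1900.0, 2000.0])

def power_to_wind(power):
    """Map observed power to wind speed via the inverse power curve."""
    return np.interp(power, CURVE_POWER, CURVE_SPEED)

def censoring_flags(power):
    """Flag observations at the lower (zero) and upper (nominal) limits."""
    power = np.asarray(power, dtype=float)
    left = power <= CURVE_POWER[0]
    right = power >= CURVE_POWER[-1]
    return left, right
```

A linear model could then be fitted to the transformed values, with the flagged observations treated as censored rather than exact.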

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30 % too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q–Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
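The difference between a fit forced through the origin and an ordinary least squares fit with a free intercept can be shown with a few lines of code. This is a generic sketch on synthetic numbers, not a reproduction of the sunspot analysis, and the function names `fit_through_origin` and `fit_ols` are hypothetical.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least squares slope for the model y = slope * x (no intercept)."""
    return float(np.sum(x * y) / np.sum(x * x))

def fit_ols(x, y):
    """Ordinary least squares slope and intercept for y = slope * x + b."""
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)
```

When the true relation has a positive intercept, the forced fit compensates with a steeper slope, which inflates predictions at large `x`, and its residuals no longer average to zero.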

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents an open-source canopy height profile (CHP) toolkit designed for processing small-footprint full-waveform LiDAR data to obtain estimates of effective leaf area index (LAIe) and CHPs. The use of the toolkit is presented with a case study of LAIe estimation in discontinuous-canopy fruit plantations. The experiments are carried out in two study areas, namely, orange and almond plantations, with different percentages of canopy cover (48% and 40%, respectively). For comparison, two commonly used discrete-point LAIe estimation methods are also tested. The LiDAR LAIe values are first computed for each of the sites and each method as a whole, providing “apparent” site-level LAIe, which disregards the discontinuity of the plantations’ canopies. Since the toolkit allows for the calculation of the study area LAIe at different spatial scales, between-tree-level clumping can be easily accounted for and is then used to illustrate the impact of the discontinuity of canopy cover on LAIe retrieval. The LiDAR LAIe estimates are therefore computed at smaller scales as a mean of LAIe in various grid-cell sizes, providing estimates of “actual” site-level LAIe. Subsequently, the LiDAR LAIe results are compared with theoretical models of “apparent” LAIe versus “actual” LAIe, based on known percent canopy cover in each site. The comparison of those models to LiDAR LAIe derived from the smallest grid-cell sizes against the estimates of LAIe for the whole site has shown that the LAIe estimates obtained from the CHP toolkit provided values that are closest to those of theoretical models.
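The gap between "apparent" and "actual" site-level LAIe can be illustrated with a small sketch. It assumes that LAIe is derived from gap fraction through a Beer-Lambert relation with an extinction coefficient of 0.5, which is a common convention but is not stated in the abstract and may differ from what the toolkit does. The function names `laie_from_gap` and `site_laie` are hypothetical, and gap fractions must be greater than zero.

```python
import numpy as np

def laie_from_gap(gap_fraction, extinction=0.5):
    """Effective LAI from gap fraction, assuming LAIe = -ln(P_gap) / k."""
    gap = np.asarray(gap_fraction, dtype=float)
    return -np.log(gap) / extinction

def site_laie(cell_gaps, extinction=0.5):
    """Return (apparent, actual) site-level LAIe from per-cell gap fractions.

    apparent: computed once from the site-wide mean gap fraction.
    actual:   mean of the LAIe values computed separately in each grid cell.
    """
    cell_gaps = np.asarray(cell_gaps, dtype=float)
    apparent = float(laie_from_gap(cell_gaps.mean(), extinction))
    actual = float(laie_from_gap(cell_gaps, extinction).mean())
    return apparent, actual
```

Because the logarithm is concave, averaging gap fractions before taking the logarithm gives a lower value than averaging per-cell LAIe, so under this assumption the apparent value cannot exceed the actual one when canopy cover is uneven.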