8 results for Nonparametric regression techniques
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
Let $Y_i = f(x_i) + E_i$ $(1 \le i \le n)$ with given covariates $x_1 < x_2 < \cdots < x_n$, an unknown regression function $f$ and independent random errors $E_i$ with median zero. It is shown how to apply several linear rank test statistics simultaneously in order to test monotonicity of $f$ in various regions and to identify its local extrema.
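To illustrate what a single local linear rank statistic can look like, here is a minimal sketch. It is not the procedure of the paper: the Spearman-type scores, the fixed window width, the function names `local_rank_statistic` and `scan`, and the threshold are all assumptions made for illustration. The standardization also assumes that the errors inside a window are exchangeable and free of ties, which is stronger than the median-zero condition stated in the abstract. The simultaneous calibration across regions is not described in the abstract and is not attempted here.

```python
import math


def local_rank_statistic(y, start, stop):
    """Standardized Spearman-type linear rank statistic on y[start:stop].

    Large positive values suggest an increasing trend in the window,
    large negative values a decreasing one. Ties are assumed absent.
    """
    window = y[start:stop]
    m = len(window)
    if m < 2:
        raise ValueError("window needs at least two observations")
    # Ranks 1..m of the responses inside the window.
    order = sorted(range(m), key=lambda i: window[i])
    ranks = [0] * m
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    center = (m + 1) / 2
    scores = [r - center for r in ranks]            # centered rank scores
    weights = [i + 1 - center for i in range(m)]    # centered position weights
    t = sum(c * a for c, a in zip(weights, scores))
    # Variance of a linear rank statistic when the ranks form a uniformly
    # random permutation (no trend, exchangeable errors).
    var = sum(c * c for c in weights) * sum(a * a for a in scores) / (m - 1)
    return t / math.sqrt(var)


def scan(y, width, threshold):
    """Return (start, stop, z) for each window of the given width with |z| > threshold.

    The threshold is a hypothetical tuning value, not a calibrated critical value.
    """
    hits = []
    for start in range(len(y) - width + 1):
        z = local_rank_statistic(y, start, start + width)
        if abs(z) > threshold:
            hits.append((start, start + width, z))
    return hits
```

With these scores, a perfectly increasing window of m observations gives the value sqrt(m - 1), and a perfectly decreasing one gives its negative. Windows where the sign flips from clearly positive to clearly negative point towards a local maximum in between.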
Abstract:
This paper studied two different regression techniques for pelvic shape prediction: partial least squares regression (PLSR) and principal component regression (PCR). Three different types of predictors (surface landmarks, morphological parameters, and surface models of neighboring structures) were used in a cross-validation study to predict the pelvic shape. Results obtained with the two regression techniques were compared to the population mean model. In almost all the prediction experiments, both regression techniques generated better results than the population mean model, while the difference in prediction accuracy between the two regression methods was not statistically significant (α=0.01).
Abstract:
Fossil pollen data from stratigraphic cores are irregularly spaced in time due to non-linear age-depth relations. Moreover, their marginal distributions may vary over time. We address these features in a nonparametric regression model with errors that are monotone transformations of a latent continuous-time Gaussian process Z(T). Although Z(T) is unobserved, monotonicity allows it to be recovered under suitable regularity conditions, facilitating further computations such as estimation of the long-memory parameter and the Hermite coefficients. The estimation of Z(T) itself involves estimation of the marginal distribution function of the regression errors. These issues are considered in proposing a plug-in algorithm for optimal bandwidth selection and construction of confidence bands for the trend function. Some high-resolution time series of pollen records from Lago di Origlio in Switzerland, which go back ca. 20,000 years, are used to illustrate the methods.
Abstract:
This paper presents a comparison of principal component (PC) regression and regularized expectation maximization (RegEM) to reconstruct European summer and winter surface air temperature over the past millennium. Reconstruction is performed within a surrogate climate using the National Center for Atmospheric Research (NCAR) Climate System Model (CSM) 1.4 and the climate model ECHO-G 4, assuming different white and red noise scenarios to define the distortion of pseudoproxy series. We show how sensitivity tests lead to valuable “a priori” information that provides a basis for improving real world proxy reconstructions. Our results emphasize the need to carefully test and evaluate reconstruction techniques with respect to the temporal resolution and the spatial scale they are applied to. Furthermore, we demonstrate that uncertainties inherent to the predictand and predictor data have to be more rigorously taken into account. The comparison of the two statistical techniques, in the specific experimental setting presented here, indicates that more skilful results are achieved with RegEM, as low frequency variability is better preserved. We further detect seasonal differences in reconstruction skill at the continental scale; for example, the target temperature average is more adequately reconstructed for summer than for winter. For the specific predictor network given in this paper, both techniques underestimate the target temperature variations to an increasing extent as more noise is added to the signal, albeit less so with RegEM than with PC regression. We conclude that climate field reconstruction techniques can be improved and need to be further optimized in future applications.
Abstract:
Long-term measurements of CO2 flux can be obtained using the eddy covariance technique, but these datasets are affected by gaps which hinder the estimation of robust long-term means and annual ecosystem exchanges. We compare results obtained using three gap-fill techniques: multiple regression (MR), multiple imputation (MI), and artificial neural networks (ANNs), applied to a one-year dataset of hourly CO2 flux measurements collected in Lutjewad, over a flat agricultural area near the Wadden Sea dike in the north of the Netherlands. The dataset was split into two subsets: a learning set and a validation set. The performance of the gap-filling techniques was analysed by calculating statistical criteria: coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), maximum absolute error (MaxAE), and mean square bias (MSB). The gap-fill accuracy is seasonally dependent, with better results in cold seasons. The highest accuracy is obtained using the ANN technique, which is also less sensitive to environmental/seasonal conditions. We argue that filling gaps directly on measured CO2 fluxes is more advantageous than the common method of filling gaps on calculated net ecosystem change, because ANN is an empirical method and smaller scatter is expected when gap filling is applied directly to measurements.
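As a rough illustration of the validation criteria listed above, the following sketch scores gap-filled values against held-out measurements. The function name `gap_fill_scores` is hypothetical, and the formulas are common textbook definitions rather than ones confirmed by the abstract. In particular, R² is computed here as one minus the ratio of residual to total sum of squares, which is an assumption. MSB is left out because the abstract does not define it.

```python
import math


def gap_fill_scores(observed, predicted):
    """Score gap-filled values against held-out observations.

    Returns R2, RMSE, MAE and MaxAE using standard definitions
    (assumed here, not taken from the paper).
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MaxAE": max(abs(e) for e in errors),
    }
```

In a comparison like the one described, each gap-filling method would be fitted on the learning set and scored this way on the validation set, for example separately per season.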
Abstract:
We present an independent calibration model for the determination of biogenic silica (BSi) in sediments, developed from analysis of synthetic sediment mixtures and application of Fourier transform infrared spectroscopy (FTIRS) and partial least squares regression (PLSR) modeling. In contrast to current FTIRS applications for quantifying BSi, this new calibration is independent of conventional wet-chemical techniques and their associated measurement uncertainties. This approach also removes the need for developing internal calibrations between the two methods for individual sediment records. For the independent calibration, we produced six series of different synthetic sediment mixtures using two purified diatom extracts, with one extract mixed with quartz sand, calcite, 60/40 quartz/calcite and two different natural sediments, and a second extract mixed with one of the natural sediments. A total of 306 samples (51 samples per series) yielded BSi contents ranging from 0 to 100 %. The resulting PLSR calibration model between the FTIR spectral information and the defined BSi concentration of the synthetic sediment mixtures exhibits a strong cross-validated correlation (R²cv = 0.97) and a low root mean square error of cross-validation (RMSECV = 4.7 %). Application of the independent calibration to natural lacustrine and marine sediments yields robust BSi reconstructions. At present, the synthetic mixtures do not include the variation in organic matter that occurs in natural samples, which may explain the somewhat lower prediction accuracy of the calibration model for organic-rich samples.
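The cross-validated figures quoted above can be made concrete with a small sketch of leave-one-out cross-validation. It deliberately swaps PLSR on full spectra for an ordinary least squares line on a single predictor, so it shows only how RMSECV and a cross-validated R² are assembled, not the calibration itself. The function names `fit_line` and `loo_cv` are hypothetical. Defining R²cv as one minus the ratio of the cross-validated squared error to the total sum of squares is an assumption, and so is the leave-one-out scheme, since the abstract does not say how the samples were partitioned.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_cv(xs, ys):
    """Leave-one-out cross-validation; returns (RMSECV, R2cv)."""
    predictions = []
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    n = len(ys)
    press = sum((p - y) ** 2 for p, y in zip(predictions, ys))
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(press / n), 1 - press / ss_tot
```

Because every prediction comes from a model that never saw the held-out sample, both numbers describe predictive rather than fitted accuracy.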
Abstract:
We consider the problem of nonparametric estimation of a concave regression function F. We show that the supremum distance between the least squares estimator and F on a compact interval is typically of order $(\log(n)/n)^{2/5}$. This entails rates of convergence for the estimator’s derivative. Moreover, we discuss the impact of additional constraints on F such as monotonicity and pointwise bounds. Then we apply these results to the analysis of current status data, where the distribution function of the event times is assumed to be concave.
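In symbols, the statement above can be sketched as follows. The notation is an assumption, since the abstract does not fix one: observations $(x_i, Y_i)$, $1 \le i \le n$, a compact interval $[a,b]$, and $\hat F_n$ for the least squares estimator. The $O_p$ formulation is one way to read "typically of order" and may differ from the exact statement in the paper.

```latex
\hat F_n \in \operatorname*{arg\,min}_{G \text{ concave}} \; \sum_{i=1}^{n} \bigl( Y_i - G(x_i) \bigr)^2,
\qquad
\sup_{x \in [a,b]} \bigl| \hat F_n(x) - F(x) \bigr|
  = O_p\!\left( \left( \frac{\log(n)}{n} \right)^{2/5} \right).
```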