35 results for Non-parametric regression methods

in CentAUR: Central Archive University of Reading - UK


Relevance: 100.00%

Abstract:

The Normal Quantile Transform (NQT) has been used in many hydrological and meteorological applications in order to make the Cumulative Distribution Function (CDF) of observed, simulated and forecast river discharge, water level or precipitation data Gaussian. It is also at the heart of the meta-Gaussian model for assessing the total predictive uncertainty of the Hydrological Uncertainty Processor (HUP) developed by Krzysztofowicz. In the field of geo-statistics this transformation is better known as the Normal-Score Transform. In this paper, some possible problems caused by small sample sizes when applying the NQT in flood forecasting systems are discussed, and a novel way to solve them, combining extreme value analysis and non-parametric regression methods, is outlined. The method is illustrated with examples of hydrological stream-flow forecasts.
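For reference, a minimal sketch of the basic NQT step, assuming a one-dimensional sample and Weibull plotting positions; the variable names and data are illustrative, and the extreme-value extension described in the abstract is not shown.

```python
# Minimal sketch of the Normal Quantile Transform (NQT), assuming a 1-D
# sample of river discharges; values and names are illustrative only.
import numpy as np
from scipy import stats

def nqt(x):
    """Map a sample to standard-normal scores via its empirical CDF."""
    n = len(x)
    ranks = stats.rankdata(x)      # ranks 1..n, ties averaged
    p = ranks / (n + 1.0)          # Weibull plotting positions, keeps p in (0, 1)
    return stats.norm.ppf(p)       # inverse standard-normal CDF

discharge = np.array([12.0, 35.5, 8.2, 60.1, 22.7, 18.4])
z = nqt(discharge)                 # Gaussian-scaled values
```

With only a handful of observations, the plotting positions leave the transform undefined beyond the observed extremes, which is the small-sample tail problem the abstract addresses by bringing in extreme value analysis.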

Relevance: 100.00%

Abstract:

Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems complicate this approach: the non-linear relationship between wind speed and power production, and the limited range of power production between zero and the nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the turbine manufacturer's power curve. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve, so that simpler linear regression models can be used. Furthermore, the limited range of the transformed power production can be handled by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation, and (iii) with and without censoring. The results show that, with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
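A hedged sketch of the two ingredients described above, under simplifying assumptions: the power curve is inverted by interpolation over its monotone part, and the bounded range is handled with a Tobit-style censored linear regression fitted by maximum likelihood. The power-curve table, data and variable names are made up for illustration and are not from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Assumed manufacturer power curve: wind speed (m/s) -> normalised power [0, 1]
ws_grid = np.array([3, 5, 7, 9, 11, 13, 25], dtype=float)
pw_grid = np.array([0.0, 0.1, 0.35, 0.7, 0.95, 1.0, 1.0])

def inverse_power_curve(p):
    # Invert by interpolating over the monotone part of the curve (up to rated power)
    mono = slice(0, 6)
    return np.interp(p, pw_grid[mono], ws_grid[mono])

# Toy data: NWP wind-speed forecast x, observed normalised power p_obs
x = np.array([4.0, 6.5, 8.0, 10.0, 12.5, 14.0])
p_obs = np.array([0.05, 0.30, 0.55, 0.85, 1.00, 1.00])
y = inverse_power_curve(p_obs)          # "observed wind" implied by production

# Tobit-style censored linear regression: y* = a + b*x + e, with y
# right-censored at the wind speed corresponding to rated power.
upper = ws_grid[5]

def negloglik(theta):
    a, b, log_s = theta
    s = np.exp(log_s)
    mu = a + b * x
    cens = y >= upper
    ll = np.where(cens,
                  norm.logsf(upper, loc=mu, scale=s),   # P(y* >= upper)
                  norm.logpdf(y, loc=mu, scale=s))
    return -ll.sum()

fit = minimize(negloglik, x0=np.array([0.0, 1.0, 0.0]), method="Nelder-Mead")
print(fit.x)
```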

Relevance: 100.00%

Abstract:

The use of Bayesian inference in estimating time-frequency representations has, thus far, been limited to offline analysis of signals, using a smoothing-spline-based model of the time-frequency plane. In this paper we introduce a new framework that allows the routine use of Bayesian inference for online estimation of the time-varying spectral density of a locally stationary Gaussian process. The core of our approach is the use of a likelihood inspired by a local Whittle approximation. This choice, along with the use of a recursive algorithm for non-parametric estimation of the local spectral density, permits the use of a particle filter for estimating the time-varying spectral density online. We demonstrate the algorithm by tracking chirps and by analysing musical data.
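As a point of reference, a minimal sketch of a local Whittle log-likelihood evaluated on one sliding window; the AR(1)-shaped local spectrum is assumed purely for illustration, and the paper's non-parametric spectral model, recursion and particle filter are not reproduced here.

```python
import numpy as np

def local_whittle_loglik(segment, spec_fun, params):
    """Whittle approximation on one window: sum over Fourier frequencies of
    -(log f(w) + I(w) / f(w)), where I is the periodogram of the segment."""
    n = len(segment)
    freqs = np.fft.rfftfreq(n)[1:]                         # drop the zero frequency
    periodogram = np.abs(np.fft.rfft(segment))[1:] ** 2 / n
    f = spec_fun(freqs, *params)
    return -np.sum(np.log(f) + periodogram / f)

def ar1_spectrum(freqs, phi, sigma2):
    # Spectral density of an AR(1) process (illustrative local model only)
    return sigma2 / np.abs(1.0 - phi * np.exp(-2j * np.pi * freqs)) ** 2

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
window = x[200:328]                                        # one local segment
print(local_whittle_loglik(window, ar1_spectrum, (0.5, 1.0)))
```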

Relevance: 100.00%

Abstract:

This paper models the transmission of shocks between the US, Japanese and Australian equity markets. Tests for the existence of linear and non-linear transmission of volatility across the markets are performed using parametric and non-parametric techniques. In particular, the size and sign of return innovations are important factors in determining the degree of volatility spillovers. It is found that a multivariate asymmetric GARCH formulation can explain almost all of the non-linear causality between markets. These results have important implications for the construction of models and forecasts of international equity returns.
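For orientation, a univariate stand-in for the asymmetric GARCH idea: a GJR-GARCH(1,1,1) fit using the third-party arch package, in which the extra asymmetry term lets negative and positive innovations affect conditional variance differently. The returns are simulated placeholders, and the paper's multivariate formulation is not reproduced.

```python
import numpy as np
from arch import arch_model

rng = np.random.default_rng(1)
returns = 100 * rng.standard_normal(1000)    # placeholder for daily % returns

# o=1 adds the asymmetry (leverage) term, so negative return innovations can
# raise conditional variance more than positive ones of the same size.
model = arch_model(returns, vol="GARCH", p=1, o=1, q=1, dist="normal")
result = model.fit(disp="off")
print(result.summary())
```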

Relevance: 100.00%

Abstract:

In this paper, we study the role of the volatility risk premium for the forecasting performance of implied volatility. We introduce a non-parametric and parsimonious approach to adjust the model-free implied volatility for the volatility risk premium and implement this methodology using more than 20 years of options and futures data on three major energy markets. Using regression models and statistical loss functions, we find compelling evidence that the risk-premium-adjusted implied volatility significantly outperforms other models, including its unadjusted counterpart. Our main finding holds for different choices of volatility estimators and competing time-series models, underlining the robustness of our results.
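A hedged sketch of the general idea, not the paper's estimator: the premium is proxied by a rolling mean of past gaps between implied volatility and the volatility subsequently realized over the option horizon, using only information available at forecast time. The series, horizon and window below are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n, h = 1500, 21                                                   # sample size, horizon (days)
realized_fwd = pd.Series(0.25 + 0.05 * rng.standard_normal(n))    # vol realized over [t, t+h]
implied_vol = realized_fwd + 0.03 + 0.02 * rng.standard_normal(n) # biased-up implied vol at t

# The gap implied_vol[t] - realized_fwd[t] is only observable h days later,
# so shift it before averaging to avoid look-ahead bias.
premium = (implied_vol - realized_fwd).shift(h)
rolling_premium = premium.rolling(window=252, min_periods=60).mean()

adjusted_iv = implied_vol - rolling_premium        # risk-premium adjusted forecast
print(adjusted_iv.tail())
```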

Relevance: 100.00%

Abstract:

Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems with covariates of more complex form, such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising solution for reducing the model's dimensionality to a manageable level, leading to efficient estimation. Most existing tensor-based methods estimate each individual regression problem independently, based on a tensor decomposition that allows the simultaneous projection of an input tensor onto more than one direction along each mode. In practice, however, multi-dimensional data are collected under the same or very similar conditions, so the data share some common latent components while each regression task can also have its own independent parameters. It is therefore beneficial to analyse the regression parameters of all the regressions in a linked way. In this paper, we propose a tensor regression model based on Tucker decomposition, which simultaneously identifies the common components of the parameters across all the regression tasks and the independent factors contributing to each particular task. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modelling further reduce the total number of parameters, with lower memory cost than existing tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
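A minimal numpy sketch of the Tucker structure that makes such models parsimonious: the full coefficient tensor is generated from a small core and per-mode factor matrices, so a linear tensor regression prediction needs far fewer free parameters. The shapes, ranks and data are illustrative assumptions, and the paper's linked estimation and sparsity regulariser are not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K = 10, 8, 6          # dimensions of the tensor covariate
r1, r2, r3 = 2, 2, 2        # assumed Tucker ranks along each mode

G = rng.standard_normal((r1, r2, r3))      # core tensor (shared structure)
U1 = rng.standard_normal((I, r1))          # mode-1 factor matrix
U2 = rng.standard_normal((J, r2))          # mode-2 factor matrix
U3 = rng.standard_normal((K, r3))          # mode-3 factor matrix

# B = G x_1 U1 x_2 U2 x_3 U3  (multilinear products along each mode)
B = np.einsum('abc,ia,jb,kc->ijk', G, U1, U2, U3)

X = rng.standard_normal((I, J, K))         # one tensor-valued covariate
y_hat = np.sum(X * B)                      # linear tensor regression prediction

# Parameter count: 10*8*6 = 480 unconstrained vs 2*2*2 + 10*2 + 8*2 + 6*2 = 56
print(B.shape, G.size + U1.size + U2.size + U3.size)
```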

Relevance: 100.00%

Abstract:

We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups R_B above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower-acuity observer would have seen. The synthesised annual means of R_B are then re-scaled to the full observed RGO group number R_A using a variety of regression techniques. It is found that a very high correlation between R_A and R_B (r_AB > 0.98) does not prevent large errors in the intercalibration (for example, sunspot maximum values can be over 30% too large even for such levels of r_AB). In generating the backbone sunspot number (R_BB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile ("Q-Q") plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
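A sketch of the regression check discussed above: fit R_A on R_B with and without forcing the line through the origin, then inspect the residuals with Q-Q plots via scipy.stats.probplot. The data below are synthetic stand-ins, not RGO values.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
R_B = rng.uniform(0, 12, 80)                       # lower-acuity group counts
R_A = 1.5 + 1.2 * R_B + rng.standard_normal(80)    # full RGO group counts

# Ordinary least squares with an intercept
slope, intercept = np.polyfit(R_B, R_A, 1)
resid_ols = R_A - (intercept + slope * R_B)

# Fit forced through the origin (proportionality), as in the backbone method
k = np.sum(R_B * R_A) / np.sum(R_B ** 2)
resid_origin = R_A - k * R_B

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
stats.probplot(resid_ols, dist="norm", plot=axes[0])
axes[0].set_title("OLS with intercept")
stats.probplot(resid_origin, dist="norm", plot=axes[1])
axes[1].set_title("Forced through origin")
plt.tight_layout()
plt.show()
```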

Relevance: 100.00%

Abstract:

Although the sunspot-number series have existed since the mid-19th century, they are still the subject of intense debate, with the largest uncertainty being related to the "calibration" of the visual acuity of individual observers in the past. Daisy-chain regression methods have been applied to inter-calibrate the observers, which may lead to significant bias and error accumulation. Here we present a novel method to calibrate the visual acuity of the key observers to the reference data set of Royal Greenwich Observatory sunspot groups for the period 1900-1976, using the statistics of the active-day fraction. For each observer we independently evaluate an observational threshold S_S, defined such that the observer is assumed to miss all groups with an area smaller than S_S and to report all groups larger than S_S. Next, using a Monte-Carlo method, we construct from the reference data set a correction matrix for each observer. The correction matrices are significantly non-linear and cannot be approximated by a linear regression or proportionality. We emphasize that corrections based on a linear proportionality between annually averaged data lead to serious biases and distortions of the data. The correction matrices are applied to the original sunspot group records for each day, and finally the composite corrected series is produced for the period since 1748. The corrected series displays secular minima around 1800 (Dalton minimum) and 1900 (Gleissberg minimum), as well as the Modern grand maximum of activity in the second half of the 20th century. The uniqueness of the grand maximum is confirmed for the last 250 years. It is shown that the adoption of a linear relationship between the data of Wolf and Wolfer results in grossly inflated group numbers in the 18th and 19th centuries in some reconstructions.
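A hedged sketch of the thresholding and correction-matrix idea, with synthetic daily group areas standing in for the RGO reference data; the threshold value, area distribution and matrix size are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(5)
S_S = 50.0                   # observer assumed to miss groups smaller than this area
max_groups = 15
counts = np.zeros((max_groups + 1, max_groups + 1))   # [reported count, true count]

for _ in range(20000):
    n_true = rng.integers(0, max_groups + 1)                  # groups on a synthetic day
    areas = rng.lognormal(mean=4.0, sigma=1.0, size=n_true)   # synthetic group areas
    n_seen = int(np.sum(areas >= S_S))                        # groups above the threshold
    counts[n_seen, n_true] += 1

# Row-normalise: for each reported count, the distribution of the true count
row_sums = counts.sum(axis=1, keepdims=True)
correction_matrix = np.divide(counts, row_sums,
                              out=np.zeros_like(counts), where=row_sums > 0)
```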

Relevance: 100.00%

Abstract:

The purpose of this study was to improve the prediction of the quantity and type of Volatile Fatty Acids (VFA) produced from fermented substrate in the rumen of lactating cows. A model was formulated that describes the conversion of substrate (soluble carbohydrates, starch, hemi-cellulose, cellulose, and protein) into VFA (acetate, propionate, butyrate, and other VFA). Inputs to the model were observed rates of true rumen digestion of substrates, whereas outputs were observed molar proportions of VFA in rumen fluid. A literature survey generated data on 182 diets (96 roughage and 86 concentrate diets). Coefficient values that define the conversion of a specific substrate into VFA were estimated meta-analytically by regressing the model against observed VFA molar proportions using non-linear regression techniques. Coefficient estimates differed significantly, for acetate and propionate production in particular, between different types of substrate and between roughage and concentrate diets. Deviations of fitted from observed VFA molar proportions could be attributed entirely to random error. In addition to regression against observed data, simulation studies were performed to investigate the potential of the estimation method. Fitted coefficient estimates from simulated data sets appeared accurate, as did fitted rates of VFA production, although the model accounted for only a small fraction (at most 45%) of the variation in VFA molar proportions. The simulation results showed that the latter result was merely a consequence of the statistical analysis chosen and should not be interpreted as an indication of inaccuracy of the coefficient estimates. Deviations between fitted and observed values corresponded to those obtained in simulations.
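An illustrative sketch of the estimation idea, not the published model: molar proportions are predicted from substrate digestion rates via conversion coefficients, which are then estimated by non-linear least squares. The dimensions, data and starting values are invented, and with proportions as the only output the coefficients are identified only up to a common scale.

```python
import numpy as np
from scipy.optimize import least_squares

n_diets, n_substrates, n_vfa = 30, 5, 4
rng = np.random.default_rng(6)
digestion = rng.uniform(0.5, 3.0, size=(n_diets, n_substrates))  # kg/d truly digested

def predicted_proportions(coefs):
    C = coefs.reshape(n_substrates, n_vfa)       # mol VFA produced per kg substrate
    production = digestion @ C                   # mol/d of each VFA
    return production / production.sum(axis=1, keepdims=True)

# Synthetic "observed" proportions generated from known coefficients plus noise
true_C = rng.uniform(1.0, 6.0, size=(n_substrates, n_vfa))
observed = predicted_proportions(true_C.ravel())
observed = observed + 0.01 * rng.standard_normal(observed.shape)

def residuals(coefs):
    return (predicted_proportions(coefs) - observed).ravel()

# Note: proportions are scale-invariant, so only relative coefficient sizes
# are identified here; the fitted absolute scale is arbitrary.
fit = least_squares(residuals, x0=np.full(n_substrates * n_vfa, 3.0),
                    bounds=(0.0, np.inf))
print(fit.x.reshape(n_substrates, n_vfa))
```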

Relevance: 100.00%

Abstract:

The current energy requirements system used in the United Kingdom for lactating dairy cows utilizes key parameters such as metabolizable energy intake (MEI) at maintenance (ME_m), the efficiency of utilization of MEI for 1) maintenance, 2) milk production (k_l), 3) growth (k_g), and the efficiency of utilization of body stores for milk production (k_t). Traditionally, these have been determined using linear regression methods to analyze energy balance data from calorimetry experiments. Many studies have highlighted a number of concerns over current energy feeding systems, particularly in relation to these key parameters and the linear models used to analyze them. Therefore, a database containing 652 dairy cow observations was assembled from calorimetry studies in the United Kingdom. Five functions for analyzing energy balance data were considered: a straight line, two diminishing-returns functions (the Mitscherlich and the rectangular hyperbola), and two sigmoidal functions (the logistic and the Gompertz). Meta-analysis of the data was conducted to estimate k_g and k_t. Values of 0.83 to 0.86 and 0.66 to 0.69 were obtained for k_g and k_t with all the functions (with standard errors of 0.028 and 0.027, respectively), which differ considerably from previous reports of 0.60 to 0.75 for k_g and 0.82 to 0.84 for k_t. Using the estimated values of k_g and k_t, the data were corrected to allow for body tissue changes. Based on the definition of k_l as the derivative of the ratio of milk energy derived from MEI to MEI directed towards milk production, ME_m and k_l were determined. Meta-analysis of the pooled data showed that the average k_l ranged from 0.50 to 0.58 and ME_m ranged between 0.34 and 0.64 MJ/kg of BW^0.75 per day. Although the constrained Mitscherlich fitted the data as well as the straight line, more observations at high energy intakes (above 2.4 MJ/kg of BW^0.75 per day) are required to determine conclusively whether milk energy is related to MEI linearly or not.
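For reference, the five candidate response functions named above, written out in one common (assumed) parameterisation and fitted to illustrative data with scipy.optimize.curve_fit; the paper's exact parameterisations and calorimetry data are not reproduced.

```python
import numpy as np
from scipy.optimize import curve_fit

def straight_line(x, a, b):
    return a + b * x

def mitscherlich(x, ymax, k, c):
    return ymax * (1.0 - np.exp(-k * (x - c)))

def rectangular_hyperbola(x, ymax, k):
    return ymax * x / (k + x)

def logistic(x, ymax, k, x0):
    return ymax / (1.0 + np.exp(-k * (x - x0)))

def gompertz(x, ymax, b, k):
    return ymax * np.exp(-b * np.exp(-k * x))

# Illustrative data: milk energy vs MEI (both in MJ/kg BW^0.75 per day)
mei = np.linspace(0.5, 2.4, 25)
milk_energy = (0.9 * (1.0 - np.exp(-1.1 * (mei - 0.4)))
               + 0.02 * np.random.default_rng(7).standard_normal(25))

params, _ = curve_fit(mitscherlich, mei, milk_energy, p0=[1.0, 1.0, 0.3])
print(params)
```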