882 results for non-linear regression


Abstract:

The benefits of applying tree-based methods to modelling financial assets, as opposed to linear factor analysis, are increasingly being understood by market practitioners. Tree-based models such as CART (classification and regression trees) are particularly well suited to analysing stock market data, which is noisy and often contains non-linear relationships and high-order interactions. CART was originally developed in the 1980s by medical researchers disheartened by the stringent assumptions applied by traditional regression analysis (Breiman et al. [1984]). In the intervening years, CART has been successfully applied to many areas of finance such as the classification of financial distress of firms (see Frydman, Altman and Kao [1985]), asset allocation (see Sorensen, Mezrich and Miller [1996]), equity style timing (see Kao and Shumaker [1999]) and stock selection (see Sorensen, Miller and Ooi [2000])...
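As an illustration of the basic step behind a regression tree, the sketch below searches a single predictor for the threshold that minimises the summed squared error of the two resulting groups. It is a simplified, assumed example (one feature, one split, hypothetical function names), not the procedure used in the paper above.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the best single split on x.

    Candidate thresholds are midpoints between consecutive distinct
    sorted x values, as in a basic CART-style regression split.
    """
    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical data with a step between x = 2 and x = 3.
threshold, error = best_split([1, 2, 3, 4], [0.0, 0.0, 10.0, 10.0])
```

A full tree would apply this search recursively to each group and over every predictor, which is how such models pick up non-linear relationships and interactions without specifying them in advance.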

Abstract:

The Brix content of pineapple fruit can be non-invasively predicted from the second derivative of near infrared reflectance spectra. Correlations obtained using a NIRSystems 6500 spectrophotometer through multiple linear regression and modified partial least squares analyses using a post-dispersive configuration were comparable with those from a pre-dispersive configuration in terms of accuracy (e.g. coefficient of determination, R², 0.73; standard error of cross validation, SECV, 1.01 °Brix). The effective depth of sample assessed was slightly greater using the post-dispersive technique (about 20 mm for pineapple fruit), as expected in relation to the higher incident light intensity, relative to the pre-dispersive configuration. The effects of environmental variables such as temperature, humidity and external light, and of instrumental variables such as the number of scans averaged to form a spectrum, were considered with respect to the accuracy and precision of the measurement of absorbance at 876 nm, as a key term in the calibration for Brix, and of predicted Brix. The application of post-dispersive near infrared technology to in-line assessment of intact fruit in a packing shed environment is discussed.

Abstract:

The potential of near infra-red (NIR) spectroscopy for non-invasive measurement of fruit quality of pineapple (Ananas comosus var. Smooth Cayenne) and mango (Mangifera indica var. Kensington) fruit was assessed. A remote reflectance fibre optic probe, placed in contact with the fruit skin surface in a light-proof box, was used to deliver monochromatic light to the fruit, and to collect NIR reflectance spectra (760–2500 nm). The probe illuminated and collected reflected radiation from an area of about 16 cm². The NIR spectral attributes were correlated with pineapple juice Brix and with mango flesh dry matter (DM) measured from fruit flesh directly underlying the scanned area. The highest correlations for both fruit were found using the second derivative of the spectra (d2 log 1/R) and an additive calibration equation. Multiple linear regression (MLR) on pineapple fruit spectra (n = 85) gave a calibration equation using d2 log 1/R at wavelengths of 866, 760, 1232 and 832 nm with a multiple coefficient of determination (R²) of 0.75, and a standard error of calibration (SEC) of 1.21 °Brix. Modified partial least squares (MPLS) regression analysis yielded a calibration equation with R² = 0.91, SEC = 0.69, and a standard error of cross validation (SECV) of 1.09 °Brix. For mango, MLR gave a calibration equation using d2 log 1/R at 904, 872, 1660 and 1516 nm with R² = 0.90, SEC = 0.85% DM and a bias of 0.39. Using MPLS analysis, a calibration equation with R² = 0.98, SEC = 0.54 and SECV = 1.19 was obtained. We conclude that NIR technology offers the potential to assess fruit sweetness in intact whole pineapple and DM in mango fruit to within 1 °Brix and 1% DM, respectively, and could be used for the grading of fruit in fruit packing sheds.
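The spectral pre-treatment named in this abstract, d2 log 1/R, can be sketched as follows. The code converts reflectance to log(1/R) and takes a plain central second difference over evenly spaced wavelengths. This is an assumed simplification: instrument software typically uses gap and smoothing settings that the abstract does not state, and the function names here are hypothetical.

```python
import math


def log_inverse_reflectance(reflectance):
    """Convert reflectance values R (0 < R <= 1) to log10(1/R)."""
    return [-math.log10(r) for r in reflectance]


def second_difference(values):
    """Central second difference, a[i-1] - 2*a[i] + a[i+1].

    For evenly spaced wavelengths this is proportional to the second
    derivative; the result is two points shorter than the input.
    """
    return [
        values[i - 1] - 2 * values[i] + values[i + 1]
        for i in range(1, len(values) - 1)
    ]


# A quadratic sequence has a constant second difference of 2.
curvature = second_difference([0.0, 1.0, 4.0, 9.0, 16.0])
```

The second-difference values at the selected wavelengths would then serve as predictors in the multiple linear regression calibration.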

Abstract:

Leaf carbon (C) content, leaf nitrogen (N) content, and C:N ratio are especially useful for understanding plant-herbivore interactions and may be important in developing control methods for the invasive riparian plant Arundo donax L. We measured C content, N content, C:N ratio, and chlorophyll index (SPAD 502 reading) for 768 leaves from A. donax collected over a five-year period at several locations in California, Nevada, and Texas. Leaf N was more variable than leaf C, and thus we developed a linear regression equation for estimating A. donax leaf N from the leaf chlorophyll index (SPAD reading). When applied to two independent data sets, the equation (leaf N content % = -0.63 + 0.08 × SPAD) produced realistic estimates that matched seasonal and spatial trends reported from a natural A. donax population. Used in conjunction with the handheld SPAD 502 meter, the equation provides a rapid, non-destructive method for estimating A. donax leaf quality.
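The regression equation reported above is simple enough to apply directly. The sketch below wraps it in a function; the coefficients come from the abstract, while the function name and the example SPAD reading are illustrative assumptions.

```python
def leaf_n_percent(spad_reading):
    """Estimate A. donax leaf N content (%) from a SPAD 502 reading.

    Uses the equation quoted in the abstract:
    leaf N content % = -0.63 + 0.08 * SPAD.
    """
    return -0.63 + 0.08 * spad_reading


# Hypothetical reading of 40 SPAD units gives about 2.57 % N.
estimate = leaf_n_percent(40)
```

As with any linear calibration, estimates are only meaningful within the range of SPAD readings used to fit the equation, which the abstract does not state.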

Abstract:

The factors affecting the non-industrial, private forest landowners' (hereafter referred to using the acronym NIPF) strategic decisions in management planning are studied. A genetic algorithm is used to induce a set of rules predicting potential cut of the landowners' choices of preferred timber management strategies. The rules are based on variables describing the characteristics of the landowners and their forest holdings. The predictive ability of a genetic algorithm is compared to linear regression analysis using identical data sets. The data are cross-validated seven times applying both genetic algorithm and regression analyses in order to examine the data-sensitivity and robustness of the generated models. The optimal rule set derived from genetic algorithm analyses included the following variables: mean initial volume, landowner's positive price expectations for the next eight years, landowner being classified as farmer, and preference for the recreational use of forest property. When tested with previously unseen test data, the optimal rule set resulted in a relative root mean square error of 0.40. In the regression analyses, the optimal regression equation consisted of the following variables: mean initial volume, proportion of forestry income, intention to cut extensively in future, and positive price expectations for the next two years. The R2 of the optimal regression equation was 0.34 and the relative root mean square error obtained from the test data was 0.38. In both models, mean initial volume and positive stumpage price expectations were entered as significant predictors of potential cut of preferred timber management strategy. When tested with the complete data set of 201 observations, both the optimal rule set and the optimal regression model achieved the same level of accuracy.

Abstract:

The relationships among organisms and their surroundings can be of immense complexity. To describe and understand an ecosystem as a tangled bank, multiple ways of interaction and their effects have to be considered, such as predation, competition, mutualism and facilitation. Understanding the resulting interaction networks is a challenge in changing environments, e.g. to predict knock-on effects of invasive species and to understand how climate change impacts biodiversity. The elucidation of complex ecological systems with their interactions will benefit enormously from the development of new machine learning tools that aim to infer the structure of interaction networks from field data. In the present study, we propose a novel Bayesian regression and multiple changepoint model (BRAM) for reconstructing species interaction networks from observed species distributions. The model has been devised to allow robust inference in the presence of spatial autocorrelation and distributional heterogeneity. We have evaluated the model on simulated data that combine a trophic niche model with a stochastic population model on a 2-dimensional lattice, and we have compared the performance of our model with L1-penalized sparse regression (LASSO) and non-linear Bayesian networks with the BDe scoring scheme. In addition, we have applied our method to plant ground coverage data from the western shore of the Outer Hebrides with the objective of inferring the ecological interactions. (C) 2012 Elsevier B.V. All rights reserved.

Abstract:

Bridge construction responds to the need for environmentally friendly design of motorways and facilitates the passage through sensitive natural areas and the bypassing of urban areas. However, according to numerous research studies, bridge construction presents substantial budget overruns. Therefore, it is necessary early in the planning process for the decision makers to have reliable estimates of the final cost based on previously constructed projects. At the same time, the current European financial crisis reduces the available capital for investments and financial institutions are even less willing to finance transportation infrastructure. Consequently, it is even more necessary today to estimate the budget of high-cost construction projects (such as road bridges) with reasonable accuracy, in order for the state funds to be invested with lower risk and the projects to be designed with the highest possible efficiency. In this paper, a Bill-of-Quantities (BoQ) estimation tool for road bridges is developed in order to support the decisions made at the preliminary planning and design stages of highways. Specifically, a Feed-Forward Artificial Neural Network (ANN) with a hidden layer of 10 neurons is trained to predict the superstructure material quantities (concrete, pre-stressed steel and reinforcing steel) using the width of the deck, the adjusted length of span or cantilever and the type of the bridge as input variables. The training dataset includes actual data from 68 recently constructed concrete motorway bridges in Greece. According to the relevant metrics, the developed model captures very well the complex interrelations in the dataset and demonstrates strong generalisation capability. Furthermore, it outperforms the linear regression models developed for the same dataset. Therefore, the proposed cost estimation model stands as a useful and reliable tool for the construction industry as it enables planners to reach informed decisions for technical and economic planning of concrete bridge projects from their early implementation stages.

Abstract:

In this paper we propose exact likelihood-based mean-variance efficiency tests of the market portfolio in the context of the Capital Asset Pricing Model (CAPM), allowing for a wide class of error distributions which include normality as a special case. These tests are developed in the framework of multivariate linear regressions (MLR). It is well known, however, that despite their simple statistical structure, standard asymptotically justified MLR-based tests are unreliable. In financial econometrics, exact tests have been proposed for a few specific hypotheses [Jobson and Korkie (Journal of Financial Economics, 1982), MacKinlay (Journal of Financial Economics, 1987), Gibbons, Ross and Shanken (Econometrica, 1989), Zhou (Journal of Finance, 1993)], most of which depend on normality. For the Gaussian model, our tests correspond to Gibbons, Ross and Shanken’s mean-variance efficiency tests. In non-Gaussian contexts, we reconsider mean-variance efficiency tests allowing for multivariate Student-t and Gaussian mixture errors. Our framework allows us to cast more evidence on whether the normality assumption is too restrictive when testing the CAPM. We also propose exact multivariate diagnostic checks (including tests for multivariate GARCH and a multivariate generalization of the well-known variance ratio tests) and goodness-of-fit tests as well as a set estimate for the intervening nuisance parameters. Our results [over five-year subperiods] show the following: (i) multivariate normality is rejected in most subperiods, (ii) residual checks reveal no significant departures from the multivariate i.i.d. assumption, and (iii) mean-variance efficiency of the market portfolio is not rejected as frequently once the possibility of non-normal errors is allowed for.

Abstract:

In this paper, we propose several finite-sample specification tests for multivariate linear regressions (MLR) with applications to asset pricing models. We focus on departures from the i.i.d. errors assumption, at univariate and multivariate levels, with Gaussian and non-Gaussian (including Student t) errors. The univariate tests studied extend existing exact procedures by allowing for unspecified parameters in the error distributions (e.g., the degrees of freedom in the case of the Student t distribution). The multivariate tests are based on properly standardized multivariate residuals to ensure invariance to MLR coefficients and error covariances. We consider tests for serial correlation, tests for multivariate GARCH and sign-type tests against general dependencies and asymmetries. The procedures proposed provide exact versions of those applied in Shanken (1990), which consist in combining univariate specification tests. Specifically, we combine tests across equations using the MC test procedure to avoid Bonferroni-type bounds. Since non-Gaussian based tests are not pivotal, we apply the “maximized MC” (MMC) test method [Dufour (2002)], where the MC p-value for the tested hypothesis (which depends on nuisance parameters) is maximized (with respect to these nuisance parameters) to control the test’s significance level. The tests proposed are applied to an asset pricing model with observable risk-free rates, using monthly returns on New York Stock Exchange (NYSE) portfolios over five-year subperiods from 1926-1995. Our empirical results reveal the following. Whereas univariate exact tests indicate significant serial correlation, asymmetries and GARCH in some equations, such effects are much less prevalent once error cross-equation covariances are accounted for. In addition, significant departures from the i.i.d. hypothesis are less evident once we allow for non-Gaussian errors.
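The Monte Carlo (MC) p-value and its maximized variant mentioned above can be sketched in a few lines. The example assumes the usual rank-based MC p-value, (number of simulated statistics at least as large as the observed one, plus 1) divided by (number of simulations plus 1), and it approximates the maximization with a finite grid of nuisance-parameter values whose simulated statistics are supplied in advance. All names and numbers are hypothetical, and the paper's actual test statistics are not reproduced.

```python
def mc_pvalue(observed, simulated):
    """Rank-based Monte Carlo p-value for a right-tailed test."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


def mmc_pvalue(observed, simulated_by_nuisance):
    """Maximize the MC p-value over a grid of nuisance-parameter values.

    `simulated_by_nuisance` maps each candidate nuisance value to the
    statistics simulated under the null with that value.
    """
    return max(
        mc_pvalue(observed, sims) for sims in simulated_by_nuisance.values()
    )


# Hypothetical simulated statistics for two candidate degrees of freedom.
sims = {
    "df=3": [1.0, 2.0, 4.0, 5.0],
    "df=10": [0.5, 1.0, 2.0, 2.5],
}
p_single = mc_pvalue(3.0, sims["df=3"])
p_max = mmc_pvalue(3.0, sims)
```

Rejecting only when the maximized p-value falls below the chosen level is what keeps the test's size under control when the null distribution depends on unknown nuisance parameters.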

Abstract:

A study was designed to examine the relationships between protein, condensed tannin and cell wall carbohydrate content and composition and the nutritional quality of seven tropical legumes (Desmodium ovalifolium, Flemingia macrophylla, Leucaena leucocephala, L pallida, L macrophylla, Calliandra calothyrsus and Clitoria fairchildiana). Among the legume species studied, D ovalifolium showed the lowest concentration of nitrogen, while L leucocephala showed the highest. Fibre (NDF) content was lowest in C calothyrsus, L leucocephala and L pallida and highest in L macrophylla, which had no measurable condensed tannins. The highest tannin concentration was found in C calothyrsus. Total non-structural polysaccharides (NSP) varied among legume species (lowest in C calothyrsus and highest in D ovalifolium), and glucose and uronic acids were the most abundant carbohydrate constituents in all legumes. Total NSP losses were lowest in F macrophylla and highest in L leucocephala and L pallida. Gas accumulation and acetate and propionate levels were 50% less with F macrophylla and D ovalifolium as compared with L leucocephala. The highest levels of branched-chain fatty acids were observed with non-tanniniferous legumes, and negative concentrations were observed with some of the legumes with high tannin content (D ovalifolium and F macrophylla). Linear regression analysis showed that the presence of condensed tannins was more related to a reduction of the initial rate of gas production (0-48 h) than to the final amount of gas produced or the extent (144 h) of dry matter degradation, which could be due to differences in tannin chemistry. Consequently, more attention should be given in the future to elucidating the impact of tannin structure on the nutritional quality of tropical forage legumes. (C) 2003 Society of Chemical Industry.

Abstract:

This paper shows that a wavelet network and a linear term can be advantageously combined for the purpose of nonlinear system identification. The theoretical foundation of this approach is laid by proving that radial wavelets are orthogonal to linear functions. A constructive procedure for building such nonlinear regression structures, termed linear-wavelet models, is described. For illustration, simulation data are used to identify a model for a two-link robotic manipulator. The results show that the introduction of wavelets does improve the prediction ability of a linear model.

Abstract:

Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and the nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
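The inverse (power-to-wind) transformation described above can be sketched with a piecewise-linear power curve. The curve points below are invented for illustration and do not come from any manufacturer; observations at zero or nominal power are flagged as censored, since the inverse curve cannot recover a unique wind speed there.

```python
from bisect import bisect_left

# Hypothetical power curve: (wind speed in m/s, power in kW), increasing
# from cut-in speed to nominal (rated) power.
POWER_CURVE = [(3.0, 0.0), (6.0, 400.0), (9.0, 1300.0), (12.0, 2000.0)]


def power_to_wind(power_kw, curve=POWER_CURVE):
    """Map observed power to wind speed via the inverse power curve.

    Returns (wind_speed, censoring), where censoring is "left" at or
    below zero power, "right" at or above nominal power, else "none".
    """
    speeds = [s for s, _ in curve]
    powers = [p for _, p in curve]
    if power_kw <= powers[0]:
        return speeds[0], "left"
    if power_kw >= powers[-1]:
        return speeds[-1], "right"
    i = bisect_left(powers, power_kw)
    p0, p1 = powers[i - 1], powers[i]
    s0, s1 = speeds[i - 1], speeds[i]
    # Linear interpolation between the two neighbouring curve points.
    return s0 + (power_kw - p0) * (s1 - s0) / (p1 - p0), "none"


speed, flag = power_to_wind(850.0)
```

The transformed values, together with their censoring flags, are what a censored linear regression would then take as its response variable.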

Abstract:

Most studies involving statistical time series analysis rely on assumptions of linearity, which by its simplicity facilitates parameter interpretation and estimation. However, the linearity assumption may be too restrictive for many practical applications. The implementation of nonlinear models in time series analysis involves the estimation of a large set of parameters, frequently leading to overfitting problems. In this article, a predictability coefficient is estimated using a combination of nonlinear autoregressive models and the use of support vector regression in this model is explored. We illustrate the usefulness and interpretability of results by using electroencephalographic records of an epileptic patient.

Abstract:

The work described was part of the project "Innovative biological indicators to improve the efficiency of water and nitrogen use and the fruit quality in tree crops", a partnership between ISA and INRA. Field studies were conducted in Portugal on different irrigated plots of nectarine trees: a fully irrigated plot (unstressed plot) and a plot that was not irrigated for some days (stressed plot). The aim of this work was to investigate the effects of plant water stress on canopy temperature, to determine the non-water-stressed baseline and to observe diurnal and seasonal variations of the Crop Water Stress Index (CWSI). Canopy temperature, psychrometric and wind speed data were taken each half-hour, between 9:30 and 15:30 h. Results showed that canopy temperature was higher during the daytime, for both unstressed and stressed plots. A linear regression of the canopy-air temperature differential on the vapor pressure deficit (non-water-stressed baseline) gave r² = 0.65. During the stress period, the average canopy temperature of the stressed plot was up to 5.4°C higher than that of the unstressed plot. Diurnal and seasonal averages of CWSI values showed differences between unstressed and stressed plots during the stress period.
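The two steps named in this abstract, fitting the non-water-stressed baseline and computing CWSI, can be sketched as below. The sketch assumes the common empirical form CWSI = (dT - dT_lower) / (dT_upper - dT_lower), where dT is the canopy-air temperature differential, dT_lower comes from the baseline regression on vapor pressure deficit (VPD), and dT_upper is a fixed upper limit. The data values and the upper limit are hypothetical; the abstract reports neither the baseline coefficients nor the upper limit.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cwsi(delta_t, vpd, intercept, slope, delta_t_upper):
    """Crop Water Stress Index from the canopy-air differential and VPD."""
    delta_t_lower = intercept + slope * vpd  # non-water-stressed baseline
    return (delta_t - delta_t_lower) / (delta_t_upper - delta_t_lower)


# Hypothetical unstressed-plot data: VPD (kPa) and canopy-air differential (°C).
vpd_obs = [1.0, 2.0, 3.0, 4.0]
dt_obs = [1.0, -1.0, -3.0, -5.0]
intercept, slope = fit_line(vpd_obs, dt_obs)

# A stressed canopy 1.5 °C warmer than air at VPD = 2 kPa, upper limit 4 °C.
index = cwsi(1.5, 2.0, intercept, slope, delta_t_upper=4.0)
```

An index near 0 indicates a canopy transpiring at the baseline rate, and values approaching 1 indicate severe water stress.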

Abstract:

Globalization of dairy cattle breeding has created a need for international sire proofs. Some early methods for converting proofs from one population to another are based on simple linear regression. An alternative robust regression method based on the t-distribution is presented, and maximum likelihood and Bayesian techniques for analysis are described, including the situation in which some proofs are missing. Procedures were used to investigate the relationship between Holstein sire proofs obtained by two Uruguayan genetic evaluation programs. The results suggest that conversion equations developed from data including only sires having proofs in both populations can lead to distorted results, relative to estimates obtained using techniques for incomplete data. There was evidence of non-normality of regression residuals, which constitutes an additional source of bias. A robust estimator may not solve all problems, but can provide simple conversion equations that are less sensitive to outlying proofs and to departures from assumptions.