933 results for Improvement of regression predictions


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This research proposes a methodology to improve the individual prediction values computed by an existing regression model without changing either its parameters or its architecture. In other words, we aim to achieve more accurate results by adjusting the calculated regression prediction values, without modifying or rebuilding the original regression model. Our proposal is to adjust the regression prediction values using individual reliability estimates that indicate whether a single regression prediction is likely to produce an error considered critical by the user of the regression. The proposed method was tested in three sets of experiments using three different types of data. The first set of experiments used synthetically produced data, the second cross-sectional data from the public UCI Machine Learning Repository, and the third time series data from ISO-NE (Independent System Operator in New England). The experiments with synthetic data were performed to verify how the method behaves in controlled situations. In this case, the method improved predictions most for the cleaner artificially produced datasets, with progressively worse results as more random elements were added. The experiments with real data extracted from UCI and ISO-NE were done to investigate the applicability of the methodology in the real world. The proposed method was able to improve regression prediction values in about 95% of the experiments with real data.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with drug targets. However, the accuracy of most VS methods is constrained by limitations in the scoring function that describes biomolecular interactions, and even today these uncertainties are not completely understood. To improve the accuracy of the scoring functions used in most VS methods, we propose a novel hybrid approach in which neural networks (NNET) and support vector machines (SVM) are trained with databases of known active (drugs) and inactive compounds, and this information is then exploited to improve VS predictions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Fire blight is a disease that affects plants of the Rosaceae family and is caused by the bacterium Erwinia amylovora. Its host range includes fruit trees, such as pear, apple and quince, and ornamental plants of great commercial and economic interest. The disease has now spread and is widely distributed across all temperate regions of the world. In Spain, where the disease is not endemic, fire blight was first detected in 1995 in the north of the country (Euskadi), and several outbreaks have since appeared in other locations and been duly eradicated. Control of fire blight is largely ineffective in plants already affected by the disease, so it relies on measures aimed at preventing the spread of the pathogen and the introduction of the disease into non-endemic regions. In this work, thermotherapy was evaluated as a method for eradicating E. amylovora from asymptomatic plant propagation material. Thermotherapy was shown to be a viable method of eradicating E. amylovora from propagation material. Almost all Rosaceae species and varieties kept under humid conditions survived 7 hours at 45 ºC and more than 3 hours at 50 ºC, whereas more than 1 hour of exposure to 50 ºC with dry heat damaged the plant material and reduced sprouting. Treatments of 60 min at 45 ºC or 30 min at 50 ºC were sufficient to reduce the epiphytic population of E. amylovora to undetectable levels (5 × 10² cfu g⁻¹ fresh weight) in pear branches. Phosphonate derivatives and benzothiadiazole are effective in controlling fire blight in pear and apple under laboratory, greenhouse and field conditions. Plant defence inducers reduce disease levels by up to 40-60%.
The minimum time intervals needed to achieve the best disease control were 5 days for fosetyl-Al and 7 days for ethephon and benzothiadiazole, and the optimal doses of fosetyl-Al and benzothiadiazole were 3.72 g HPO₃²⁻ L⁻¹ and 150 mg a.i. L⁻¹, respectively. The efficacy of fosetyl-Al and benzothiadiazole in controlling fire blight improves when they are combined with antibiotics at half the antibiotic dose. Although the strategy of mixing products is more practical and easier to carry out in the field than the strategy of combining products, the best level of disease control is achieved with the combining strategy. The effect of benzothiadiazole and phosphonates on the Erwinia amylovora-pear interaction was analysed at the histological and ultrastructural level. Neither benzothiadiazole, fosetyl-Al nor ethephon induced structural changes in pear tissues 7 days after application. However, after inoculation with E. amylovora, structural disorganization of cells was observed in plants treated with fosetyl-Al and ethephon, whereas in plants treated with benzothiadiazole these tissue alterations were delayed. Two models (Maryblyt, Cougarblight) were evaluated in an orchard in Spain affected by the disease to determine the accuracy of their predictions. Two models were used to produce the risk map, the combined BRS-Powell and the modified BIS95. The results showed two zones, one with high and one with low disease risk. Maryblyt and Cougarblight are easy-to-use models, although their implementation in disease management programmes requires that they be evaluated and validated over a longer period and in areas where the disease is present.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this study was to assess and improve the accuracy of biotransfer models for organic pollutants (PCBs, PCDD/Fs, PBDEs, PFCAs, and pesticides) into cow's milk and beef, as used in human exposure assessment. Metabolic rate in cattle is known to be a key parameter for this biotransfer; however, few experimental data and no simulation methods are currently available. In this research, metabolic rate was estimated using existing QSAR biodegradation models for microorganisms (BioWIN) and fish (EPI-HL and IFS-HL). This simulated metabolic rate was then incorporated into the mechanistic cattle biotransfer models (RAIDAR, ACC-HUMAN, OMEGA, and CKow). Goodness-of-fit tests comparing the new model outputs to observed data showed that the performance of the RAIDAR, ACC-HUMAN and OMEGA models improved significantly with either of the QSARs. The CKow model is the only one that separates the processes in the gut and liver. This model showed the lowest residual error of all the models tested when the BioWIN model was used to represent the ruminant metabolic process in the gut and the two fish QSARs were used to represent the metabolic process in the liver. Our testing included EUSES and CalTOX, which are KOW-regression models that are widely used in regulatory assessment. New regressions based on the simulated rates of the two metabolic processes are also proposed as an alternative to KOW-regression models for screening risk assessment. The modified CKow model is more physiologically realistic but is as easy to use as existing KOW-regression models for estimating cattle biotransfer of organic pollutants.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The FE ('fixed effects') estimator of technical inefficiency performs poorly when N ('number of firms') is large and T ('number of time observations') is small. We propose estimators of both the firm effects and the inefficiencies, which have small-sample gains compared to the traditional FE estimator. The estimators are based on nonparametric kernel regression of unordered variables, which includes the FE estimator as a special case. In terms of global conditional MSE ('mean square error') criteria, it is proved that there are kernel estimators which are efficient relative to the FE estimators of firm effects and inefficiencies in finite samples. Monte Carlo simulations support our theoretical findings, and an empirical example shows how the traditional FE estimator and the proposed kernel FE estimator lead to very different conclusions about the inefficiency of Indonesian rice farmers.
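
The idea of a kernel over an unordered variable that contains the FE estimator as a special case can be illustrated with a small sketch. The abstract does not give the kernel or the model, so everything below is an assumption made for illustration: the firm index is smoothed with a weight of 1 for the firm's own observations and `lam` for every other firm's observations, the inputs are residuals from which regressor effects have already been removed, and inefficiency is measured as the distance from the best firm's effect. All function names are hypothetical.

```python
def fe_firm_effects(residuals_by_firm):
    """Traditional FE firm effect: the within-firm mean of the residuals."""
    return [sum(r) / len(r) for r in residuals_by_firm]


def kernel_firm_effects(residuals_by_firm, lam):
    """Firm effects smoothed over the unordered firm index.

    Assumed kernel (not taken from the abstract): weight 1 for a firm's own
    observations and lam (0 <= lam <= 1) for every other firm's observations.
    lam = 0 reproduces the FE estimator; lam = 1 gives the pooled mean.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    effects = []
    for i in range(len(residuals_by_firm)):
        numerator = 0.0
        denominator = 0.0
        for j, firm_residuals in enumerate(residuals_by_firm):
            weight = 1.0 if i == j else lam
            numerator += weight * sum(firm_residuals)
            denominator += weight * len(firm_residuals)
        effects.append(numerator / denominator)
    return effects


def inefficiencies(effects):
    """Inefficiency of each firm relative to the best (largest) firm effect."""
    best = max(effects)
    return [best - a for a in effects]


# Hypothetical panel: 3 firms observed over T = 2 periods each.
panel = [[1.0, 1.2], [0.4, 0.6], [0.9, 0.7]]
fe = fe_firm_effects(panel)
smoothed = kernel_firm_effects(panel, lam=0.5)
print(inefficiencies(fe))
print(inefficiencies(smoothed))
```

With a small T, each within-firm mean is noisy, and the maximum over many noisy means tends to overstate the frontier. Shrinking the firm effects toward the pooled mean, as a positive `lam` does in this sketch, pulls the estimated inefficiencies closer together, which is one way to read the small-sample gains the abstract describes.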

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, MSEP$_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples.
Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
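
The leave-one-out statistic recommended above can be computed without refitting the model n times, because each deleted residual equals the ordinary residual divided by one minus the leverage of that observation. The following is a minimal sketch for the single-predictor case only, with made-up data; the function names are illustrative and the study itself used multiple predictors.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def press_statistic(x, y):
    """PRESS via the leverage shortcut: sum of (e_i / (1 - h_ii))**2."""
    n = len(x)
    intercept, slope = fit_simple_ols(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    press = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (intercept + slope * xi)
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx  # h_ii for simple OLS
        press += (residual / (1.0 - leverage)) ** 2
    return press


def press_by_refitting(x, y):
    """Brute-force check: refit with each observation left out in turn."""
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_simple_ols(x_train, y_train)
        press += (y[i] - (intercept + slope * x[i])) ** 2
    return press


# Made-up sample for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
print(press_statistic(x, y), press_by_refitting(x, y))
```

Both functions return the same value up to rounding. The shortcut matters in practice because it lets the whole sample be used for fitting while still giving an out-of-sample style error estimate.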

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The objective of this study was to propose a multi-criteria optimization and decision-making technique to solve food engineering problems. This technique was demonstrated using experimental data obtained on osmotic dehydration of carrot cubes in a sodium chloride solution. The Aggregating Functions Approach, the Adaptive Random Search Algorithm, and the Penalty Functions Approach were used in this study to compute the initial set of non-dominated or Pareto-optimal solutions. Multiple non-linear regression analysis was performed on a set of experimental data in order to obtain particular multi-objective functions (responses), namely water loss, solute gain, rehydration ratio, three different colour criteria of rehydrated product, and sensory evaluation (organoleptic quality). Two multi-criteria decision-making approaches, the Analytic Hierarchy Process (AHP) and the Tabular Method (TM), were used simultaneously to choose the best alternative among the set of non-dominated solutions. The multi-criteria optimization and decision-making technique proposed in this study can facilitate the assessment of criteria weights, giving rise to a fairer, more consistent, and adequate final compromise solution or food process. This technique can be useful to food scientists in research and education, as well as to engineers involved in the improvement of a variety of food engineering processes.
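
The two stages described above, finding the non-dominated set and then picking one compromise from it, can be sketched in a few lines. This is an illustration under stated assumptions and not the study's procedure: every objective is expressed so that smaller is better, the candidate points and the weights are invented, and a plain weighted sum stands in for the aggregating function (in the study the weights would come from a method such as AHP).

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_by_weighted_sum(points, weights):
    """Pick the point with the smallest weighted sum of objectives.

    The objectives are assumed to be on comparable scales already;
    real responses would need normalizing first.
    """
    return min(points, key=lambda p: sum(w * v for w, v in zip(weights, p)))


# Invented candidates: (objective 1, objective 2), both to be minimized.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
choice = best_by_weighted_sum(front, weights=(0.7, 0.3))
print(front)
print(choice)
```

Separating the two steps keeps the preference information (the weights) out of the search for non-dominated solutions, so the same front can be reused when the weights are revised.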

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Long-term forecasts of pest pressure are central to the effective management of many agricultural insect pests. In the eastern cropping regions of Australia, serious infestations of Helicoverpa punctigera (Wallengren) and H. armigera (Hübner) (Lepidoptera: Noctuidae) are experienced annually. Regression analyses of a long series of light-trap catches of adult moths were used to describe the seasonal dynamics of both species. The size of the spring generation in eastern cropping zones could be related to rainfall in putative source areas in inland Australia. Subsequent generations could be related to the abundance of various crops in agricultural areas, rainfall and the magnitude of the spring population peak. As rainfall figured prominently as a predictor variable, and can itself be predicted using the Southern Oscillation Index (SOI), trap catches were also related to this variable. The geographic distribution of each species was modelled in relation to climate, and CLIMEX was used to predict temporal variation in abundance at given putative source sites in inland Australia using historical meteorological data. These predictions were then correlated with subsequent pest abundance data in a major cropping region. The regression-based and bioclimatic-based approaches to predicting pest abundance are compared, and their utility in predicting and interpreting pest dynamics is discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND: Abnormalities in serum phosphorus, calcium and parathyroid hormone (PTH) have been associated with poor survival in haemodialysis patients. This COSMOS (Current management Of Secondary hyperparathyroidism: a Multicentre Observational Study) analysis assesses the association of high and low serum phosphorus, calcium and PTH with the relative risk of mortality. Furthermore, the impact of changes in these parameters on the relative risk of mortality throughout the 3-year follow-up has been investigated. METHODS: COSMOS is a 3-year, multicentre, open-cohort, prospective study carried out in 6797 adult chronic haemodialysis patients randomly selected from 20 European countries. RESULTS: Using Cox proportional hazard regression models and penalized splines analysis, it was found that both high and low serum phosphorus, calcium and PTH were associated with a higher risk of mortality. The serum values associated with the minimum relative risk of mortality were 4.4 mg/dL for serum phosphorus, 8.8 mg/dL for serum calcium and 398 pg/mL for serum PTH. The lowest mortality risk ranges obtained using the previous values as the base were 3.6-5.2 mg/dL for serum phosphorus, 7.9-9.5 mg/dL for serum calcium and 168-674 pg/mL for serum PTH. Decreases in serum phosphorus and calcium and increases in serum PTH in patients with baseline values of >5.2 mg/dL (phosphorus), >9.5 mg/dL (calcium) and <168 pg/mL (PTH), respectively, were associated with improved survival. CONCLUSIONS: COSMOS provides evidence of the association of serum phosphorus, calcium and PTH with mortality, and suggests survival benefits of controlling chronic kidney disease-mineral and bone disorder biochemical parameters in CKD5D patients.