982 results for Prediction Error


Relevance: 30.00%

Abstract:

The purpose of this research is to determine the practical profit that can be achieved using neural network methods as a prediction instrument. The thesis investigates the ability of neural networks to forecast future events, tested on the example of price prediction during intraday trading on the stock market. The experiments produce predictions of average 1-, 2-, 5- and 10-minute prices based on one day of data, made by two different types of forecasting systems: one based on recurrent neural networks and one on backpropagation neural networks. The accuracy of the predictions is assessed by the absolute error and the market-direction error, and the economic effectiveness is estimated by a dedicated trading system. Finally, the best network structures are tested on data from a 31-day interval. The best average percentage profits per transaction (buy + sell) are 0.06668654, 0.188299453, 0.349854787 and 0.453178626, achieved for prediction periods of 1, 2, 5 and 10 minutes respectively. The investigation may be of interest to investors with access to a fast information channel refreshed every minute.
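To make the two evaluation measures concrete, here is a minimal sketch (not the thesis's actual system) computing the absolute error, the market-direction error, and a naive per-transaction profit for a series of minute prices; the price arrays and the buy-on-predicted-rise rule are illustrative assumptions.

```python
import numpy as np

# Hypothetical minute-close prices and one-step-ahead model predictions.
actual = np.array([100.0, 100.2, 100.1, 100.4, 100.3, 100.6])
predicted = np.array([100.1, 100.1, 100.2, 100.3, 100.5, 100.5])

# Mean absolute error of the price forecasts.
mae = np.mean(np.abs(predicted - actual))

# Market-direction error: fraction of steps where the predicted change
# has the wrong sign relative to the realized change.
pred_dir = np.sign(predicted[1:] - actual[:-1])
true_dir = np.sign(actual[1:] - actual[:-1])
direction_error = np.mean(pred_dir != true_dir)

# Naive per-trade profit: buy when a rise is predicted, sell one step
# later (ignores transaction costs and slippage).
trade_returns = (actual[1:] - actual[:-1]) / actual[:-1]
avg_profit_pct = 100 * trade_returns[pred_dir > 0].mean()

print(f"MAE={mae:.4f}, direction error={direction_error:.2%}, "
      f"avg profit per trade={avg_profit_pct:.4f}%")
```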

Relevance: 30.00%

Abstract:

Prediction filters are well-known models for signal estimation in communications, control and many other areas. The classical method for deriving linear prediction coding (LPC) filters is based on the minimization of a mean square error (MSE). Consequently, only second-order statistics are required, but the estimate is optimal only if the residue is independent and identically distributed (iid) Gaussian. In this paper, we derive the ML estimate of the prediction filter. Relationships with robust estimation of auto-regressive (AR) processes, with blind deconvolution and with source separation based on mutual information minimization are then detailed. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics. Experimental results emphasize the interest of this approach.
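For context, a minimal sketch of the second-order baseline the paper starts from: the MSE-optimal LPC filter obtained by solving the Yule-Walker (autocorrelation) normal equations. This illustrates the classical method only, not the paper's ML/high-order-statistics algorithm.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_coefficients(x, order):
    """Classical MSE-optimal LPC filter via the autocorrelation method."""
    x = np.asarray(x, dtype=float)
    # Biased autocorrelation estimates r[0..order].
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)]) / len(x)
    # Solve the symmetric Toeplitz normal equations R a = r[1:].
    a = solve_toeplitz(r[:-1], r[1:])
    return a  # prediction: x_hat[n] = sum_k a[k] * x[n-1-k]

# Example: synthesize an AR(2) process, then recover its coefficients.
rng = np.random.default_rng(0)
x = np.zeros(5000)
for n in range(2, len(x)):
    x[n] = 0.6 * x[n - 1] - 0.3 * x[n - 2] + rng.standard_normal()
print(lpc_coefficients(x, order=2))  # approximately [0.6, -0.3]
```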

Relevance: 30.00%

Abstract:

Our consumption of groundwater, in particular as drinking water and for irrigation, has increased considerably over the years, and groundwater is becoming an increasingly scarce and endangered resource. Nowadays we face many problems, ranging from water prospection to sustainable management and the remediation of polluted aquifers. Independently of the hydrogeological problem considered, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. The main limitation of these approaches is the computational cost of performing complex flow simulations for each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization. Due to computational constraints, state-of-the-art methods use approximate flow simulations to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run only for this subset, on the basis of which inference is made. Our objective is to increase the performance of this approach by using all of the available information, not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine-learning approach. For the subset identified by a classical approach (here the distance kernel method), both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses in order to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational cost and increases the accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter thus consists in learning, from a subset of realizations, the relationship between proxy and exact curves.
In the second part of the thesis, this strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty, and the individual correction of each proxy response leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only for uncertainty propagation but also, and perhaps even more so, for Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, in which the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy coupled to an error model provides the approximate response for the two-stage MCMC set-up, and we demonstrate an increase in acceptance rate by a factor of three with respect to one-stage MCMC. An open question remains: how to choose the size of the learning set and how to identify the realizations that optimize the construction of the error model. This requires an iterative strategy, such that, as new flow simulations are performed, the error model is improved by incorporating the new information. This is developed in the fourth part of the thesis, in which we apply the methodology to a saline-intrusion problem in a coastal aquifer.
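A minimal sketch of the functional error-model idea, under stated assumptions: ordinary PCA plus linear regression stand in for the FPCA machinery, the sigmoid "breakthrough" curves are synthetic, and all names are hypothetical. A small subset of realizations with both proxy and exact responses trains a map between reduced coordinates, which then corrects every remaining proxy curve.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)

# Synthetic "exact" breakthrough-like curves and cheap "proxy" versions
# (the proxy is damped and shifted, mimicking an approximate flow solver).
n = 300
params = rng.uniform(2, 8, size=n)
exact = np.array([1 / (1 + np.exp(-p * (t - 0.50) * 10)) for p in params])
proxy = 0.9 * np.array([1 / (1 + np.exp(-p * (t - 0.45) * 8)) for p in params])

train = rng.choice(n, size=30, replace=False)  # subset run with the exact model

# Reduce both families of curves to a few principal components.
pca_p, pca_e = PCA(n_components=3), PCA(n_components=3)
zp = pca_p.fit_transform(proxy)
ze = pca_e.fit_transform(exact[train])

# Regression from proxy scores to exact scores, fitted on the subset only.
reg = LinearRegression().fit(zp[train], ze)

# Correct all proxy curves: predicted "expected" exact responses.
exact_pred = pca_e.inverse_transform(reg.predict(zp))
rmse = np.sqrt(np.mean((exact_pred - exact) ** 2))
print(f"RMSE of corrected curves: {rmse:.4f}")
```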

Relevance: 30.00%

Abstract:

In this study, the effects of hot-air drying conditions on the color, water-holding capacity, and total phenolic content of dried apple were investigated using an artificial neural network as an intelligent modeling system. A genetic algorithm was then used to optimize the drying conditions. Apples were dried at three temperatures (40, 60, and 80 °C) and three air flow rates (0.5, 1, and 1.5 m/s). Using leave-one-out cross-validation, simulated and experimental data were in good agreement, with an error below 2.4%. Optimal values of the quality index were found at 62.9 °C and 1.0 m/s using the genetic algorithm.
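A hedged sketch of the leave-one-out validation step: each drying run is held out in turn and predicted by a small neural network trained on the rest. The tiny synthetic dataset, the quality-index values, and the network size are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Illustrative design: temperature (deg C), air speed (m/s) -> quality index.
X = np.array([[40, 0.5], [40, 1.0], [40, 1.5],
              [60, 0.5], [60, 1.0], [60, 1.5],
              [80, 0.5], [80, 1.0], [80, 1.5]], dtype=float)
y = np.array([0.62, 0.70, 0.66, 0.74, 0.81, 0.77, 0.60, 0.65, 0.58])

errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0),
    )
    model.fit(X[train_idx], y[train_idx])
    errors.append(abs(model.predict(X[test_idx])[0] - y[test_idx][0]))

# A genetic algorithm would then search the fitted surrogate for the
# optimal (temperature, air speed); a dense grid scan is a simple stand-in.
print(f"LOO mean absolute error: {np.mean(errors):.4f}")
```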

Relevance: 30.00%

Abstract:

In our study we use a kernel-based machine learning technique, support vector machine regression, to predict the melting point of drug-like compounds in terms of topological descriptors, topological charge indices, connectivity indices and 2D autocorrelations. The model was designed, trained and tested on a dataset of 100 compounds, and it was found that an SVM regression model with an RBF kernel could predict the melting point with a mean absolute error of 15.5854 and a root mean squared error of 19.7576.
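A minimal sketch of this kind of SVR model using scikit-learn; the random matrix stands in for the molecular descriptors, and the target values and hyperparameters are assumptions, not the study's data.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))  # placeholder molecular descriptors
y = 150 + 30 * X[:, 0] + 10 * X[:, 1] + rng.normal(scale=15, size=100)  # melting points (C)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=1.0))
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
mae = mean_absolute_error(y_te, pred)
rmse = mean_squared_error(y_te, pred) ** 0.5
print(f"MAE={mae:.4f}, RMSE={rmse:.4f}")
```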

Relevance: 30.00%

Abstract:

We contribute a quantitative and systematic model to capture etch non-uniformity in deep reactive ion etch of microelectromechanical systems (MEMS) devices. Deep reactive ion etch is commonly used in MEMS fabrication where high-aspect ratio features are to be produced in silicon. It is typical for many supposedly identical devices, perhaps of diameter 10 mm, to be etched simultaneously into one silicon wafer of diameter 150 mm. Etch non-uniformity depends on uneven distributions of ion and neutral species at the wafer level, and on local consumption of those species at the device, or die, level. An ion–neutral synergism model is constructed from data obtained from etching several layouts of differing pattern opening densities. Such a model is used to predict wafer-level variation with an r.m.s. error below 3%. This model is combined with a die-level model, which we have reported previously, on a MEMS layout. The two-level model is shown to enable prediction of both within-die and wafer-scale etch rate variation for arbitrary wafer loadings.
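For orientation, one textbook way to express ion-neutral synergism is a harmonic-mean-like rate law in which the etch rate saturates when either the ion flux or the neutral flux becomes scarce; the sketch below fits the coefficients of such a law by least squares. The functional form, flux values and rates are assumptions for illustration, not the authors' calibrated model.

```python
import numpy as np
from scipy.optimize import curve_fit

def synergism_rate(fluxes, k_ion, k_neutral):
    """Harmonic-mean-style ion-neutral synergism: the etch rate is
    limited by the scarcer species (an assumed textbook form)."""
    j_ion, j_neutral = fluxes
    return (k_ion * j_ion * k_neutral * j_neutral) / (
        k_ion * j_ion + k_neutral * j_neutral
    )

# Hypothetical measurements: fluxes fall as pattern opening density rises.
j_ion = np.array([1.0, 0.9, 0.8, 0.7, 0.6])
j_neutral = np.array([1.0, 0.7, 0.5, 0.35, 0.25])
etch_rate = np.array([0.48, 0.41, 0.35, 0.28, 0.22])  # um/min, illustrative

(k_i, k_n), _ = curve_fit(synergism_rate, (j_ion, j_neutral), etch_rate, p0=(1.0, 1.0))
pred = synergism_rate((j_ion, j_neutral), k_i, k_n)
rms_pct = 100 * np.sqrt(np.mean(((pred - etch_rate) / etch_rate) ** 2))
print(f"k_ion={k_i:.3f}, k_neutral={k_n:.3f}, rms error={rms_pct:.1f}%")
```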

Relevance: 30.00%

Abstract:

A new method for assessing forecast skill and predictability that involves the identification and tracking of extratropical cyclones has been developed and implemented to obtain detailed information about the prediction of cyclones that cannot be obtained from more conventional analysis methodologies. The cyclones were identified and tracked along the forecast trajectories, and statistics were generated to determine the rate at which the position and intensity of the forecast storms diverge from the analyzed tracks as a function of forecast lead time. The results show a higher level of skill in predicting the position of extratropical cyclones than the intensity. They also show that there is potential to improve the skill in predicting the position by 1–1.5 days and the intensity by 2–3 days via improvements to the forecast model. Further analysis shows that forecast storms move at a slower speed than analyzed storms on average and that there is a larger error in the predicted amplitudes of intense storms than of weaker storms. The results also show that some storms can be predicted up to 3 days before they are identified as an 850-hPa vorticity center in the analyses. In general, the results show a higher level of skill in the Northern Hemisphere (NH) than in the Southern Hemisphere (SH); however, the rapid growth of NH winter storms is not very well predicted. The impact that observations of different types have on the prediction of extratropical cyclones has also been explored, using forecasts integrated from analyses constructed from reduced observing systems. Terrestrial, satellite, and surface-based systems were investigated, and the results showed that the predictive skill of the terrestrial system was superior to that of the satellite system in the NH. Further analysis showed that the satellite system was not very good at predicting the growth of the storms. In the SH the terrestrial system has significantly less skill than the satellite system, highlighting the dominance of satellite observations in this hemisphere. The surface system has very poor predictive skill in both hemispheres.
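To make the position-error diagnostic concrete, a small sketch: great-circle separation between matched forecast and analyzed cyclone centers as a function of lead time. The track coordinates are hypothetical.

```python
import numpy as np

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between matched track points (degrees in)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

# Hypothetical matched tracks: one row per 12 h of lead time.
lead_h = np.array([0, 12, 24, 36, 48])
forecast = np.array([[50.0, -30.0], [51.2, -26.5], [52.0, -23.4],
                     [53.1, -20.6], [53.8, -17.2]])
analysis = np.array([[50.0, -30.0], [51.0, -26.0], [52.3, -22.1],
                     [53.9, -18.3], [55.0, -14.0]])

err = great_circle_km(forecast[:, 0], forecast[:, 1], analysis[:, 0], analysis[:, 1])
for t, e in zip(lead_h, err):
    print(f"T+{t:02d}h position error: {e:6.1f} km")
```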

Relevance: 30.00%

Abstract:

The prediction of extratropical cyclones by the European Centre for Medium-Range Weather Forecasts (ECMWF) and the National Centers for Environmental Prediction (NCEP) Ensemble Prediction Systems (EPS) has been investigated using an objective feature-tracking methodology to identify and track the cyclones along the forecast trajectories. Overall the results show that the ECMWF EPS has a slightly higher level of skill than the NCEP EPS in the Northern Hemisphere (NH). However, in the Southern Hemisphere (SH), NCEP has higher predictive skill than ECMWF for the intensity of the cyclones. The results from both EPS indicate a higher level of predictive skill for the position of extratropical cyclones than for their intensity and show that there is a larger spread in intensity than in position. Further analysis shows that the predicted propagation speed of cyclones is generally too slow for the ECMWF EPS, with a slight bias for the intensity of the cyclones to be overpredicted. This is also true for the NCEP EPS in the SH, whereas for the NCEP EPS in the NH the intensity of the cyclones is underpredicted. There is a small bias in both EPS for the cyclones to be displaced towards the poles. For each ensemble forecast of each cyclone, the predictive skill of the ensemble member that best predicts the cyclone's position and intensity was computed. The results are very encouraging, showing that the predictive skill of the best ensemble member is significantly higher than that of the control forecast in terms of both the position and intensity of the cyclones. The prediction of cyclones before they are identified as 850-hPa vorticity centers in the analysis cycle was also considered. It is shown that an indication of extratropical cyclones can be given by at least one ensemble member 7 days before they are identified in the analysis. Further analysis of the ECMWF EPS shows that the ensemble mean has a higher level of skill than the control forecast, particularly for the intensity of the cyclones, from day 3 of the forecast. There is a higher level of skill in the NH than in the SH, and the spread in the SH is correspondingly larger. The difference between the ensemble mean error and the spread is very small for the position of the cyclones, but the spread of the ensemble is smaller than the ensemble mean error for the intensity of the cyclones in both hemispheres. Results also show that the ECMWF control forecast has 0.5 to 1 day more skill than the perturbed members, for both the position and intensity of the cyclones, throughout the forecast.
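A hedged sketch of the ensemble diagnostics described here: for one scalar cyclone property (intensity), it compares the control error, the ensemble-mean error, the ensemble spread, and the best member's error. All numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)
n_cases, n_members = 200, 50

truth = rng.normal(5.0, 1.5, size=n_cases)            # analyzed intensity
control = truth + rng.normal(0.3, 1.0, size=n_cases)  # control forecast
members = truth[:, None] + rng.normal(0.3, 1.2, size=(n_cases, n_members))

ens_mean = members.mean(axis=1)
spread = members.std(axis=1).mean()          # mean ensemble spread
mean_err = np.abs(ens_mean - truth).mean()   # ensemble-mean error
control_err = np.abs(control - truth).mean()

# Best member per case: the one closest to the analyzed value.
best_err = np.abs(members - truth[:, None]).min(axis=1).mean()

# An underdispersive ensemble shows spread < ensemble-mean error.
print(f"control={control_err:.3f}  ens-mean={mean_err:.3f}  "
      f"best member={best_err:.3f}  spread={spread:.3f}")
```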

Relevance: 30.00%

Abstract:

A regional study of the prediction of extratropical cyclones by the European Centre for Medium-Range Weather Forecasts (ECMWF) Ensemble Prediction System (EPS) has been performed. An objective feature-tracking method has been used to identify and track the cyclones along the forecast trajectories. Forecast error statistics have then been produced for the position, intensity, and propagation speed of the storms. In previous work, data limitations meant it was only possible to present the diagnostics for the entire Northern Hemisphere (NH) or Southern Hemisphere. A larger data sample has allowed the diagnostics to be computed separately for smaller regions around the globe and has made it possible to explore the regional differences in the prediction of storms by the EPS. Results show that in the NH there is a larger ensemble mean error in the position of storms over the Atlantic Ocean. Further analysis revealed that this is mainly due to errors in the prediction of storm propagation speed rather than in direction. Forecast storms propagate too slowly in all regions, but the bias is about twice as large in the NH Atlantic region. The results show that storm intensity is generally overpredicted over the ocean and underpredicted over the land and that the absolute error in intensity is larger over the ocean than over the land. In the NH, large errors occur in the prediction of the intensity of storms that originate as tropical cyclones but then move into the extratropics. The ensemble is underdispersive for the intensity of cyclones (i.e., the spread is smaller than the mean error) in all regions. The spatial patterns of the ensemble mean error and ensemble spread are very different for the intensity of cyclones. Spatial distributions of the ensemble mean error suggest that large errors occur during the growth phase of storm development, but this is not indicated by the spatial distributions of the ensemble spread. In the NH there are further differences. First, the large errors in the prediction of the intensity of cyclones that originate in the tropics are not indicated by the spread. Second, the ensemble mean error is larger over the Pacific Ocean than over the Atlantic, whereas the opposite is true for the spread. The value of a storm-tracking approach to both weather forecasters and developers of forecast systems is also discussed.
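A small sketch of the propagation-speed diagnostic: mean along-track speed from successive positions, compared between a forecast and an analyzed track. A negative bias corresponds to forecast storms propagating too slowly; all values are hypothetical.

```python
import numpy as np

def haversine_km(p, q):
    """Great-circle distance (km) between two (lat, lon) points in degrees."""
    (lat1, lon1), (lat2, lon2) = np.radians(p), np.radians(q)
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

def mean_speed_kmh(track, step_h=12.0):
    """Mean propagation speed along a track of (lat, lon) rows."""
    dists = [haversine_km(track[i], track[i + 1]) for i in range(len(track) - 1)]
    return np.mean(dists) / step_h

forecast = np.array([[50.0, -30.0], [51.2, -26.5], [52.0, -23.4], [53.1, -20.6]])
analysis = np.array([[50.0, -30.0], [51.0, -26.0], [52.3, -22.1], [53.9, -18.3]])

bias = mean_speed_kmh(forecast) - mean_speed_kmh(analysis)
print(f"speed bias: {bias:+.1f} km/h (negative = forecast storms too slow)")
```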

Relevance: 30.00%

Abstract:

We describe a new methodology for comparing satellite radiation budget data with a numerical weather prediction (NWP) model. This is applied to data from the Geostationary Earth Radiation Budget (GERB) instrument on Meteosat-8. The methodology brings together, in near-real time, GERB broadband shortwave and longwave fluxes with simulations based on analyses produced by the Met Office global NWP model. Results for the period May 2003 to February 2005 illustrate the progressive improvements in the data products as various initial problems were resolved. In most areas the comparisons reveal systematic errors in the model's representation of surface properties and clouds, which are discussed elsewhere. However, for clear-sky regions over the oceans the model simulations are believed to be sufficiently accurate to allow the quality of the GERB fluxes themselves to be assessed and any changes in time of the performance of the instrument to be identified. Using model and radiosonde profiles of temperature and humidity as input to a single-column version of the model's radiation code, we conduct sensitivity experiments which provide estimates of the expected model errors over the ocean of about ±5–10 W m⁻² in clear-sky outgoing longwave radiation (OLR) and ±0.01 in clear-sky albedo. For the more recent data the differences between the observed and modeled OLR and albedo are well within these error estimates. The close agreement between the observed and modeled values, particularly for the most recent period, illustrates the value of the methodology. It also contributes to the validation of the GERB products and increases confidence in the quality of the data, prior to their release.
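As a toy illustration of the clear-sky comparison step: compute the mean observed-minus-modeled OLR difference over ocean points and check it against the quoted ±5–10 W m⁻² model-error envelope. The flux values are invented.

```python
import numpy as np

# Hypothetical clear-sky ocean OLR (W m^-2): GERB observations vs NWP model.
gerb_olr = np.array([288.1, 291.4, 285.9, 290.2, 287.5, 292.0])
model_olr = np.array([285.6, 289.9, 284.1, 288.8, 285.0, 290.3])

diff = gerb_olr - model_olr
bias, rms = diff.mean(), np.sqrt((diff ** 2).mean())

# Expected model-error envelope for clear-sky OLR (from the sensitivity tests).
within = abs(bias) <= 10.0
print(f"bias={bias:+.2f} W m^-2, rms={rms:.2f} W m^-2, within +/-10: {within}")
```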

Relevance: 30.00%

Abstract:

The precision farmer wants to manage the variation in soil nutrient status continuously, which requires reliable predictions at places between sampling sites. Ordinary kriging can be used for prediction if the data are spatially dependent and there is a suitable variogram model. However, even if data are spatially correlated, there are often few soil sampling sites in relation to the area to be managed. If intensive ancillary data are available and these are coregionalized with the sparse soil data, they could be used to increase the accuracy of predictions of the soil properties by methods such as cokriging, kriging with external drift and regression kriging. This paper compares the accuracy of predictions of the plant available N properties (mineral N and potentially available N) for two arable fields in Bedfordshire, United Kingdom, from ordinary kriging, cokriging, kriging with external drift and regression kriging. For the last three, intensive elevation data were used with the soil data. The mean squared errors of prediction from these methods of kriging were determined at validation sites where the values were known. Kriging with external drift resulted in the smallest mean squared error for two of the three properties examined, and cokriging for the other. The results suggest that the use of intensive ancillary data can increase the accuracy of predictions of soil properties in arable fields provided that the variables are related spatially.
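A minimal regression-kriging-style sketch: regress the soil property on the intensive covariate (elevation), then krige the residuals spatially and sum the two parts. scikit-learn's Gaussian process is used as a stand-in for a geostatistical package, and all data are synthetic assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)

# Synthetic field: coordinates (m), elevation (m), mineral N (mg/kg).
xy = rng.uniform(0, 500, size=(60, 2))
elev = 30 + 0.02 * xy[:, 0] + rng.normal(0, 1.0, 60)
n_min = 20 + 0.8 * elev + 5 * np.sin(xy[:, 1] / 80) + rng.normal(0, 1.5, 60)

train, test = np.arange(45), np.arange(45, 60)

# Step 1: drift model, a linear regression of the soil property on elevation.
drift = LinearRegression().fit(elev[train, None], n_min[train])
resid = n_min[train] - drift.predict(elev[train, None])

# Step 2: krige the residuals spatially (GP regression as kriging analogue).
gp = GaussianProcessRegressor(RBF(length_scale=100.0) + WhiteKernel(1.0),
                              normalize_y=True)
gp.fit(xy[train], resid)

# Regression-kriging prediction = drift + kriged residual.
pred = drift.predict(elev[test, None]) + gp.predict(xy[test])
mse = np.mean((pred - n_min[test]) ** 2)
print(f"validation MSE: {mse:.3f}")
```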

Relevance: 30.00%

Abstract:

The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work, three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types to these variables was tested using classification tree analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best-fitting tree. The final 9-node tree selected correctly classified 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria community types (66.7%) and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation point to the need for improved geomorphological mapping and suggest the presence of transitional communities between existing community types.
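A hedged sketch of this modeling workflow: fit a classification tree on the three predictors, pick the tree size by V-fold cross-validation, and evaluate an error matrix whose per-class precision corresponds to the user's accuracy. The predictors and community labels are synthetic placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import confusion_matrix, precision_score

rng = np.random.default_rng(5)

# Placeholder predictors: altitude (m), slope (deg), landform class (0-3).
n = 400
X = np.column_stack([rng.uniform(1800, 2400, n),
                     rng.uniform(0, 40, n),
                     rng.integers(0, 4, n)])
y = np.digitize(X[:, 0] + 5 * X[:, 1], [1900, 2100, 2300])  # 4 community types

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Choose the tree size by 10-fold cross-validation over leaf counts.
best = max(range(3, 15), key=lambda k: cross_val_score(
    DecisionTreeClassifier(max_leaf_nodes=k, random_state=0),
    X_tr, y_tr, cv=10).mean())

tree = DecisionTreeClassifier(max_leaf_nodes=best, random_state=0).fit(X_tr, y_tr)
print("error matrix:\n", confusion_matrix(y_te, tree.predict(X_te)))
print("user's accuracy per type:",
      precision_score(y_te, tree.predict(X_te), average=None, zero_division=0).round(3))
```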

Relevance: 30.00%

Abstract:

Space weather effects on technological systems originate with energy carried from the Sun to the terrestrial environment by the solar wind. In this study, we present results of modeling of solar corona-heliosphere processes to predict solar wind conditions at the L1 Lagrangian point upstream of Earth. In particular we calculate performance metrics for (1) empirical, (2) hybrid empirical/physics-based, and (3) full physics-based coupled corona-heliosphere models over an 8-year period (1995–2002). L1 measurements of the radial solar wind speed are the primary basis for validation of the coronal and heliosphere models studied, though other solar wind parameters are also considered. The models are from the Center for Integrated Space-Weather Modeling (CISM), which has developed a coupled model of the whole Sun-to-Earth system, from the solar photosphere to the terrestrial thermosphere. Simple point-by-point analysis techniques, such as mean-square-error and correlation coefficients, indicate that the empirical coronal-heliosphere model currently gives the best forecast of solar wind speed at 1 AU. A more detailed analysis shows that errors in the physics-based models are predominantly the result of small timing offsets to solar wind structures and that the large-scale features of the solar wind are actually well modeled. We suggest that additional "tuning" of the coupling between the coronal and heliosphere models could lead to a significant improvement of their accuracy. Furthermore, we note that the physics-based models accurately capture dynamic effects at solar wind stream interaction regions, such as magnetic field compression, flow deflection, and density buildup, which the empirical scheme cannot.
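A sketch of the timing-offset point: point-by-point metrics penalize a forecast whose structures arrive slightly late, while scanning lagged correlations reveals both the offset and the underlying agreement. The solar-wind series here is synthetic, with the "model" deliberately delayed by 18 hours.

```python
import numpy as np

rng = np.random.default_rng(11)
t = np.arange(0, 27 * 24)  # hourly samples over one solar rotation (and a second)

# Synthetic observed solar wind speed (km/s) with two high-speed streams.
observed = (400
            + 150 * np.exp(-((t % 324) - 80.0) ** 2 / 400.0)
            + 120 * np.exp(-((t % 324) - 220.0) ** 2 / 600.0)
            + rng.normal(0, 10, t.size))
model = np.roll(observed, 18)  # same structures, arriving 18 h late

def lagged_corr(obs, mod, lag):
    """Correlation after advancing the model series by `lag` samples."""
    if lag == 0:
        return np.corrcoef(obs, mod)[0, 1]
    return np.corrcoef(obs[:-lag], mod[lag:])[0, 1]

best = max(range(0, 49), key=lambda k: lagged_corr(observed, model, k))
print(f"zero-lag r={lagged_corr(observed, model, 0):.3f}, "
      f"best lag={best} h with r={lagged_corr(observed, model, best):.3f}")
```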

Relevance: 30.00%

Abstract:

During the past 15 years, a number of initiatives have been undertaken at national level to develop ocean forecasting systems operating at regional and/or global scales. The co-ordination between these efforts has been organized internationally through the Global Ocean Data Assimilation Experiment (GODAE). The French MERCATOR project is one of the leading participants in GODAE. The MERCATOR systems routinely assimilate a variety of observations such as multi-satellite altimeter data, sea-surface temperature and in situ temperature and salinity profiles, focusing on high-resolution scales of the ocean dynamics. The assimilation strategy in MERCATOR is based on a hierarchy of methods of increasing sophistication including optimal interpolation, Kalman filtering and variational methods, which are progressively deployed through the Système d'Assimilation MERCATOR (SAM) series. SAM-1 is based on a reduced-order optimal interpolation which can be operated using 'altimetry-only' or 'multi-data' set-ups; it relies on the concept of separability, assuming that the correlations can be separated into a product of horizontal and vertical contributions. The second release, SAM-2, is being developed to include new features from the singular evolutive extended Kalman (SEEK) filter, such as three-dimensional, multivariate error modes and adaptivity schemes. The third one, SAM-3, considers variational methods such as the incremental four-dimensional variational algorithm. Most operational forecasting systems evaluated during GODAE are based on least-squares statistical estimation assuming Gaussian errors. In the framework of the EU MERSEA (Marine EnviRonment and Security for the European Area) project, research is being conducted to prepare the next-generation operational ocean monitoring and forecasting systems. The research effort will explore nonlinear assimilation formulations to overcome limitations of the current systems. This paper provides an overview of the developments conducted in MERSEA with the SEEK filter, the Ensemble Kalman filter and the sequential importance re-sampling filter.
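To ground the least-squares estimation underlying optimal interpolation and Kalman filtering, a toy analysis update: x_a = x_f + K(y - H x_f) with gain K = B H^T (H B H^T + R)^(-1). The state size, covariances and observation operator below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)
n, m = 8, 3  # state size, number of observations
idx = np.arange(n)

# Background (forecast) state and its error covariance B.
x_truth = np.sin(np.linspace(0, np.pi, n))
x_f = x_truth + rng.normal(0, 0.2, n)
B = 0.04 * np.exp(-np.abs(np.subtract.outer(idx, idx)) / 2.0)

# Observation operator H (observe 3 state components) and obs error R.
H = np.zeros((m, n))
H[0, 1] = H[1, 4] = H[2, 6] = 1.0
R = 0.01 * np.eye(m)
y = H @ x_truth + rng.normal(0, 0.1, m)

# Optimal-interpolation / Kalman analysis update.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)  # gain
x_a = x_f + K @ (y - H @ x_f)                 # analysis

print("forecast RMSE:", np.sqrt(np.mean((x_f - x_truth) ** 2)).round(4))
print("analysis RMSE:", np.sqrt(np.mean((x_a - x_truth) ** 2)).round(4))
```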