917 results for vector error correction model
Abstract:
The ECMWF ensemble weather forecasts are generated by perturbing the initial conditions of the forecast using a subset of the singular vectors of the linearised propagator. Previous results show that, when creating probabilistic forecasts from this ensemble, better forecasts are obtained if the mean of the spread and the variability of the spread are calibrated separately. We show results from a simple linear model suggesting that this may be a generic property of all singular-vector-based ensemble forecasting systems that use only a subset of the full set of singular vectors.
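A minimal sketch of the separate calibration the abstract advocates, assuming hypothetical arrays of raw ensemble spreads and a target spread implied by verification: shift the raw spread to match the target mean, then scale the anomalies to match the target variability, rather than applying a single multiplicative factor.

```python
# Toy illustration, not the ECMWF system: calibrate spread mean and
# spread variability separately. `raw_spread` and `target_spread` are
# hypothetical arrays of ensemble standard deviations.
import numpy as np

rng = np.random.default_rng(0)
raw_spread = rng.gamma(shape=4.0, scale=0.5, size=1000)        # under-dispersed toy spread
target_spread = 1.5 * rng.gamma(shape=4.0, scale=0.6, size=1000)

# Separate calibration: shift to match the mean, scale anomalies to
# match the variability.
scale = target_spread.std() / raw_spread.std()
calibrated = target_spread.mean() + scale * (raw_spread - raw_spread.mean())

print(calibrated.mean(), calibrated.std())   # now matches target mean and std
```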
Abstract:
The formulation of a new process-based crop model, the General Large-Area Model (GLAM) for annual crops, is presented. The model has been designed to operate on spatial scales commensurate with those of global and regional climate models. It aims to simulate the impact of climate on crop yield. Procedures for model parameter determination and optimisation are described, and demonstrated for the prediction of groundnut (i.e. peanut; Arachis hypogaea L.) yields across India for the period 1966-1989. Optimal parameters (e.g. extinction coefficient, transpiration efficiency, rate of change of harvest index) were stable over space and time, provided the estimate of the yield technology trend was based on the full 24-year period. The model has two location-specific parameters: the planting date and the yield-gap parameter. The latter varies spatially and is determined by calibration. The optimal value varies slightly when different input data are used. The model was tested using a historical data set on a 2.5° × 2.5° grid to simulate yields. Three sites are examined in detail: grid cells from Gujarat in the west, Andhra Pradesh towards the south, and Uttar Pradesh in the north. Agreement between observed and modelled yield was variable, with correlation coefficients of 0.74, 0.42 and 0, respectively. Skill was highest where the climate signal was greatest, and correlations were comparable to or greater than correlations with seasonal mean rainfall. Yields from all 35 cells were aggregated to simulate all-India yield. The correlation coefficient between observed and simulated yields was 0.76, and the root mean square error was 8.4% of the mean yield. The model can easily be extended to any annual crop for the investigation of the impacts of climate variability (or change) on crop yield over large areas.
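A minimal sketch of the two skill scores reported for the all-India aggregation: the observed/simulated correlation coefficient and the root mean square error expressed as a percentage of the mean observed yield. The yield series below are hypothetical placeholders, not GLAM output.

```python
# Hypothetical observed and simulated yield series (kg/ha).
import numpy as np

observed = np.array([812., 790., 845., 910., 760., 880.])
simulated = np.array([800., 805., 830., 895., 780., 860.])

r = np.corrcoef(observed, simulated)[0, 1]
rmse_pct = np.sqrt(np.mean((observed - simulated) ** 2)) / observed.mean() * 100
print(f"r = {r:.2f}, RMSE = {rmse_pct:.1f}% of mean yield")
```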
Abstract:
Reanalysis data provide an excellent test bed for impacts prediction systems, because they represent an upper limit on the skill of climate models. Indian groundnut (Arachis hypogaea L.) yields have been simulated using the General Large-Area Model (GLAM) for annual crops and the European Centre for Medium-Range Weather Forecasts (ECMWF) 40-yr reanalysis (ERA-40). The ability of ERA-40 to represent the Indian summer monsoon has been examined. The ability of GLAM, when driven with daily ERA-40 data, to model both observed yields and observed relationships between subseasonal weather and yield has been assessed. Mean yields were simulated well across much of India. Correlations between observed and modeled yields, where these are significant, are comparable to correlations between observed yields and ERA-40 rainfall. Uncertainties due to the input planting window, crop duration, and weather data have been examined. A reduction in the root-mean-square error of simulated yields was achieved by applying bias correction techniques to the precipitation. The stability of the relationship between weather and yield over time has been examined. Weather-yield correlations vary on decadal time scales, and this has direct implications for the accuracy of yield simulations. Analysis of the skewness of both detrended yields and precipitation suggests that nonclimatic factors are partly responsible for this nonstationarity. Evidence from other studies, including data on cereal and pulse yields, indicates that this result is not particular to groundnut yield. The detection and modeling of nonstationary weather-yield relationships emerges from this study as an important part of the process of understanding and predicting the impacts of climate variability and change on crop yields.
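The abstract does not specify which bias correction was applied; mean scaling is one standard choice, sketched below on hypothetical daily rainfall series (in practice the factor would be computed per grid cell, often per calendar month).

```python
# Toy mean-scaling bias correction of a reanalysis rainfall series.
import numpy as np

rng = np.random.default_rng(1)
obs_rain = rng.gamma(0.5, 8.0, size=3650)          # hypothetical station rainfall (mm/day)
era_rain = 0.7 * rng.gamma(0.5, 8.0, size=3650)    # reanalysis series with a dry bias

# One multiplicative factor makes the long-term means match.
factor = obs_rain.mean() / era_rain.mean()
era_corrected = factor * era_rain
print(f"bias factor = {factor:.2f}")
```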
Abstract:
Grass-based diets are of increasing socio-economic importance in dairy cattle farming, but their low supply of glucogenic nutrients may limit the production of milk. Current evaluation systems that assess the energy supply and requirements are based on metabolisable energy (ME) or net energy (NE). These systems do not consider the characteristics of the energy-delivering nutrients. In contrast, mechanistic models take into account the site of digestion, the type of nutrient absorbed and the type of nutrient required for production of milk constituents, and may therefore give a better prediction of supply and requirement of nutrients. The objective of the present study is to compare the ability of three energy evaluation systems, viz. the Dutch NE system, the Agricultural and Food Research Council (AFRC) ME system, and the Feed into Milk (FIM) ME system, and of a mechanistic model based on Dijkstra et al. [Simulation of digestion in cattle fed sugar cane: prediction of nutrient supply for milk production with locally available supplements. J. Agric. Sci., Cambridge 127, 247-60] and Mills et al. [A mechanistic model of whole-tract digestion and methanogenesis in the lactating dairy cow: model development, evaluation and application. J. Anim. Sci. 79, 1584-97] to predict the feed value of grass-based diets for milk production. The dataset for evaluation consists of 41 treatments of grass-based diets (at least 0.75 g ryegrass/g diet on a DM basis). For each model, the predicted energy or nutrient supply, based on observed intake, was compared with the predicted requirement based on observed performance. Assessment of the error of energy or nutrient supply relative to requirement is made by calculation of the mean square prediction error (MSPE) and of the concordance correlation coefficient (CCC). All energy evaluation systems predicted energy requirement to be 6-11% lower than energy supply. The root MSPE (expressed as a proportion of the supply) was lowest for the mechanistic model (0.061), followed by the Dutch NE system (0.082), the FIM ME system (0.097) and the AFRC ME system (0.118). For the energy evaluation systems, the error due to overall bias of prediction dominated the MSPE, whereas for the mechanistic model, proportionally 0.76 of the MSPE was due to random variation. CCC analysis confirmed the higher accuracy and precision of the mechanistic model compared with the energy evaluation systems. The error of prediction was positively related to grass protein content for the Dutch NE system, and was also positively related to grass dry matter intake level for all models. In conclusion, current energy evaluation systems overestimate energy supply relative to energy requirement on grass-based diets for dairy cattle. The mechanistic model predicted glucogenic nutrients to limit performance of dairy cattle on grass-based diets, and proved to be more accurate and precise than the energy systems. The mechanistic model could be improved by allowing the parameters for glucose requirements for maintenance and utilization to be variable.
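A hedged sketch of the two agreement statistics used in the comparison: the MSPE with its standard three-way decomposition (overall bias, deviation of the regression slope from one, random variation) and Lin's concordance correlation coefficient. The supply and requirement values below are hypothetical, not the study's data.

```python
# MSPE decomposition and CCC on hypothetical supply/requirement pairs.
import numpy as np

supply = np.array([105., 118., 99., 124., 110., 131., 102., 116.])
requirement = np.array([98., 108., 95., 112., 104., 119., 96., 107.])

mspe = np.mean((supply - requirement) ** 2)
r = np.corrcoef(supply, requirement)[0, 1]
s_s, s_r = supply.std(), requirement.std()          # population std (ddof=0)

bias_sq = (supply.mean() - requirement.mean()) ** 2  # overall bias
slope_dev = (s_s - r * s_r) ** 2                     # deviation from slope 1
random_var = (1 - r ** 2) * s_r ** 2                 # random variation
assert np.isclose(mspe, bias_sq + slope_dev + random_var)   # exact identity

ccc = 2 * r * s_s * s_r / (s_s**2 + s_r**2 + (supply.mean() - requirement.mean())**2)
print(f"root MSPE / supply = {np.sqrt(mspe) / supply.mean():.3f}, CCC = {ccc:.3f}")
```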
Abstract:
Diebold and Lamb (1997) argue that, since the long-run elasticity of supply derived from the Nerlovian model entails a ratio of random variables, it is without moments. They propose minimum expected loss estimation to correct this problem but in so doing ignore the fact that a non-white-noise error is implicit in the model. We show that, as a consequence, the estimator is biased, and demonstrate that Bayesian estimation, which fully accounts for the error structure, is preferable.
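The moment problem can be illustrated numerically: a ratio whose denominator is normally distributed has such heavy tails that no moments exist, so the running sample mean of simulated ratios never settles down. The coefficients below are arbitrary placeholders, not the Nerlovian model itself.

```python
# Monte Carlo illustration that a ratio of normals has no finite moments.
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000
num = 1.0 + rng.normal(size=n)    # toy numerator estimate
den = 0.5 + rng.normal(size=n)    # denominator can be arbitrarily close to 0
ratio = num / den

running_mean = np.cumsum(ratio) / np.arange(1, n + 1)
print(running_mean[[999, 99_999, 999_999]])   # does not converge as n grows
```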
Abstract:
The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work, three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types to these variables was tested using classification tree analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best-fitting tree. The final nine-node tree selected correctly classified 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria community types (66.7%) and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation point to the need for improved geomorphological mapping and suggest the presence of transitional communities between existing community types.
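A minimal sketch of the field-validation statistics reported above: an error (confusion) matrix, overall accuracy, and per-class user's accuracy (correct predictions divided by the total predicted for that class). The counts below are hypothetical, not the study's data.

```python
# Error-matrix accuracy measures on a hypothetical 4-class validation.
import numpy as np

# rows = predicted community type, columns = observed community type
error_matrix = np.array([
    [8, 0, 0, 0],    # Crepis-Cirsium
    [1, 6, 2, 0],    # Telephium-Herniaria
    [2, 1, 12, 4],   # Peucedanum-Alyssum
    [0, 2, 3, 10],   # Dianthus-Lomelosia
])

overall_accuracy = np.trace(error_matrix) / error_matrix.sum()
users_accuracy = np.diag(error_matrix) / error_matrix.sum(axis=1)  # per predicted class
print(overall_accuracy, users_accuracy)
```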
A hierarchical Bayesian model for predicting the functional consequences of amino-acid polymorphisms
Abstract:
Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; therefore methods are needed to narrow down the search to the most promising sites. For this, a possible approach is to use structural and sequence-based information on the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself. We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance by using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations, and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates compared with both a non-hierarchical BMARS model and a frequentist MARS model, as well as a support vector machine classifier and an optimally pruned classification tree.
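The hierarchical BMARS model itself is beyond a short sketch, but the evaluation protocol is easy to illustrate: train on one protein's mutations, test on the other's, and report out-of-sample misclassification for the baseline classifiers the abstract compares against (an SVM and a classification tree). The features and labels below are random placeholders, not the mutagenesis data.

```python
# Cross-protein train/test protocol with two baseline classifiers.
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X_lac, y_lac = rng.normal(size=(300, 5)), rng.integers(0, 2, 300)   # "lac repressor"
X_lys, y_lys = rng.normal(size=(150, 5)), rng.integers(0, 2, 150)   # "lysozyme"

for clf in (SVC(), DecisionTreeClassifier(max_depth=4)):
    clf.fit(X_lac, y_lac)                        # train on one protein...
    err = np.mean(clf.predict(X_lys) != y_lys)   # ...test on the other
    print(type(clf).__name__, f"misclassification rate = {err:.2f}")
```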
Abstract:
The paper considers meta-analysis of diagnostic studies that use a continuous score for classification of study participants into healthy or diseased groups. Classification is often done on the basis of a threshold or cut-off value, which might vary between studies. Consequently, conventional meta-analysis methodology focusing solely on separate analysis of sensitivity and specificity might be confounded by a potentially unknown variation of the cut-off value. To cope with this phenomenon it is suggested to use instead an overall estimate of the misclassification error, previously suggested and used as Youden's index; furthermore, it is argued that this index is less prone to between-study variation of cut-off values. A simple Mantel-Haenszel estimator is suggested as a summary measure of the overall misclassification error, which adjusts for a potential study effect. The measure of the misclassification error based on Youden's index is advantageous in that it easily allows an extension to a likelihood approach, which is then able to cope with unobserved heterogeneity via a nonparametric mixture model. All methods are illustrated with an example of a diagnostic meta-analysis on duplex Doppler ultrasound, with angiography as the standard, for stroke prevention.
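A hedged sketch of the summary measure described above: Youden's index J = sensitivity + specificity - 1 computed per study, pooled across studies with Mantel-Haenszel-type weights (J equals the risk difference in positive-test rates between diseased and healthy groups, so the usual Mantel-Haenszel risk-difference weights n1·n0/N apply). The 2×2 counts are hypothetical.

```python
# Pooled Youden index via Mantel-Haenszel weights on hypothetical studies.
# per study: (true positives, diseased n1, false positives, healthy n0)
studies = [(45, 50, 5, 40), (30, 38, 8, 52), (60, 70, 12, 80)]

num = den = 0.0
for tp, n1, fp, n0 in studies:
    j = tp / n1 - fp / n0            # sensitivity - (1 - specificity)
    w = n1 * n0 / (n1 + n0)          # Mantel-Haenszel weight
    num += w * j
    den += w

print(f"pooled Youden index = {num / den:.3f}")
```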
Nonlinear system identification using particle swarm optimisation tuned radial basis function models
Abstract:
A novel particle swarm optimisation (PSO) tuned radial basis function (RBF) network model is proposed for identification of non-linear systems. At each stage of the orthogonal forward regression (OFR) model construction process, PSO is adopted to tune one RBF unit's centre vector and diagonal covariance matrix by minimising the leave-one-out (LOO) mean square error (MSE). This PSO-aided OFR automatically determines how many tunable RBF nodes are sufficient for modelling. Compared with the state-of-the-art local regularisation assisted orthogonal least squares algorithm based on the LOO MSE criterion for constructing fixed-node RBF network models, the PSO-tuned RBF model construction produces more parsimonious RBF models with better generalisation performance and is often more efficient in model construction. The effectiveness of the proposed PSO-aided OFR algorithm for constructing tunable-node RBF models is demonstrated using three real data sets.
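A minimal sketch of the core step described above: use a bare-bones global-best PSO to tune one Gaussian RBF unit's centre and width by minimising the closed-form LOO mean square error of the resulting linear-in-the-parameters model (LOO residuals are e_i/(1 - h_ii) with h the hat matrix). This is a toy 1-D version, not the authors' full multi-stage OFR procedure.

```python
# PSO tuning of a single RBF unit (centre, width) by LOO MSE, toy 1-D data.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 80)
y = np.exp(-(x - 0.7) ** 2) + 0.05 * rng.normal(size=x.size)

def loo_mse(params):
    centre, width = params
    width = max(abs(width), 1e-3)
    phi = np.column_stack([np.ones_like(x), np.exp(-((x - centre) / width) ** 2)])
    theta, *_ = np.linalg.lstsq(phi, y, rcond=None)
    h = np.einsum("ij,jk,ik->i", phi, np.linalg.pinv(phi.T @ phi), phi)  # h_ii
    e = y - phi @ theta
    return np.mean((e / (1.0 - h)) ** 2)     # closed-form LOO residuals

# Bare-bones global-best PSO over (centre, width).
n_particles, n_iter = 20, 60
pos = rng.uniform([-3, 0.1], [3, 3], size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([loo_mse(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()
for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    f = np.array([loo_mse(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[pbest_f.argmin()].copy()

print("tuned centre/width:", gbest, "LOO MSE:", pbest_f.min())
```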
Abstract:
Objective: This paper presents a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. The purpose of this study is to explore the use of fractal and lacunarity analysis for the characterization and classification of both tumor lesions and normal breast parenchyma in mammography. Materials and methods: We conducted comparative evaluations of five popular fractal dimension estimation methods for the characterization of the texture of mass lesions and architectural distortion. We applied the concept of lacunarity to the description of the spatial distribution of the pixel intensities in mammographic images. These methods were tested with a set of 57 breast masses and 60 normal breast parenchyma samples (dataset 1), and with another set of 19 architectural distortions and 41 normal breast parenchyma samples (dataset 2). Support vector machines (SVM) were used as a pattern classification method for tumor classification. Results: Experimental results showed that the fractal dimension of regions of interest (ROIs) depicting mass lesions and architectural distortion was statistically significantly lower than that of normal breast parenchyma for all five methods. Receiver operating characteristic (ROC) analysis showed that the fractional Brownian motion (FBM) method generated the highest area under the ROC curve (Az = 0.839 for dataset 1 and 0.828 for dataset 2) among the five methods for both datasets. Lacunarity analysis showed that the ROIs depicting mass lesions and architectural distortion had higher lacunarities than the ROIs depicting normal breast parenchyma. The combination of FBM fractal dimension and lacunarity yielded higher Az values (0.903 and 0.875, respectively) than those based on a single feature alone for both datasets. The application of the SVM improved the performance of the fractal-based features in differentiating tumor lesions from normal breast parenchyma by generating higher Az values. Conclusion: The FBM texture model is the most appropriate model for characterizing mammographic images, as its self-affinity assumption is a better approximation. Lacunarity is an effective counterpart measure to the fractal dimension in texture feature extraction from mammographic images. The classification results obtained in this work suggest that the SVM is an effective method with great potential for classification in mammographic image analysis.
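A generic sketch of the two texture measures discussed above: a box-counting estimate of fractal dimension on a binarised image, and gliding-box lacunarity (variance over squared mean of box masses, plus one). This is an illustration on random data, not the paper's FBM estimator or its datasets.

```python
# Box-counting fractal dimension and gliding-box lacunarity on a toy image.
import numpy as np

rng = np.random.default_rng(5)
img = rng.random((128, 128)) > 0.6               # hypothetical binary texture

def box_count_dimension(binary, sizes=(2, 4, 8, 16, 32)):
    counts = []
    for s in sizes:
        view = binary[: binary.shape[0] // s * s, : binary.shape[1] // s * s]
        blocks = view.reshape(view.shape[0] // s, s, view.shape[1] // s, s)
        counts.append((blocks.sum(axis=(1, 3)) > 0).sum())   # occupied boxes
    slope, _ = np.polyfit(np.log(1 / np.array(sizes)), np.log(counts), 1)
    return slope                                  # D = d log N / d log(1/s)

def lacunarity(gray, box=8):
    # gliding box: distribution of box "mass"; lacunarity = var/mean^2 + 1
    masses = np.array([gray[i:i + box, j:j + box].sum()
                       for i in range(gray.shape[0] - box + 1)
                       for j in range(gray.shape[1] - box + 1)], dtype=float)
    return masses.var() / masses.mean() ** 2 + 1.0

print(box_count_dimension(img), lacunarity(img.astype(float)))
```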
Abstract:
An efficient model identification algorithm for a large class of linear-in-the-parameters models is introduced that simultaneously optimises the model approximation ability, sparsity and robustness. The derived model parameters in each forward regression step are initially estimated via orthogonal least squares (OLS), and then tuned with a new gradient-descent learning algorithm based on basis pursuit that minimises the l1 norm of the parameter estimate vector. The model subset selection cost function includes a D-optimality design criterion that maximises the determinant of the design matrix of the subset, to ensure model robustness and to enable the model selection procedure to terminate automatically at a sparse model. The proposed approach is based on the forward OLS algorithm using the modified Gram-Schmidt procedure. Both the parameter tuning procedure, based on basis pursuit, and the model selection criterion, based on D-optimality, which is effective in ensuring model robustness, are integrated with the forward regression. As a consequence, the inherent computational efficiency associated with the conventional forward OLS approach is maintained in the proposed algorithm. Examples demonstrate the effectiveness of the new approach.
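A hedged sketch (the basis-pursuit l1 tuning is omitted) of forward OLS selection with a D-optimality term. Under modified Gram-Schmidt the selected columns are orthogonal, so the determinant of the design matrix grows by the energy w'w of each new orthogonal column; scoring each candidate by its error-reduction ratio plus beta·log(w'w) therefore penalises near-collinear columns, and the selection stops at a sparse model when no candidate scores positive. The weighting beta is a demo hyperparameter.

```python
# Forward OLS subset selection with a D-optimality stopping rule.
import numpy as np

def forward_ols_doptimality(X, y, beta=0.1):
    X = X / np.linalg.norm(X, axis=0)          # unit-energy candidate columns
    residual, yty = y.astype(float).copy(), float(y @ y)
    selected = []
    for _ in range(X.shape[1]):
        energy = np.einsum("ij,ij->j", X, X)
        g = X.T @ residual                     # projections of residual
        score = (g ** 2 / (np.maximum(energy, 1e-12) * yty)
                 + beta * np.log(np.maximum(energy, 1e-12)))
        score[selected] = -np.inf
        k = int(score.argmax())
        if score[k] <= 0:                      # D-optimality stopping rule
            break
        selected.append(k)
        w = X[:, k] / np.sqrt(energy[k])
        residual -= w * (w @ residual)
        X -= np.outer(w, w @ X)                # modified Gram-Schmidt step
    return selected

rng = np.random.default_rng(9)
X = rng.normal(size=(100, 10))
X[:, 7] = X[:, 0] + 0.01 * rng.normal(size=100)     # near-duplicate regressor
y = 2 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=100)
print(forward_ols_doptimality(X, y))                # picks a sparse subset
```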
Abstract:
This paper introduces a new neurofuzzy model construction and parameter estimation algorithm from observed finite data sets, based on a Takagi and Sugeno (T-S) inference mechanism and a new extended Gram-Schmidt orthogonal decomposition algorithm, for the modeling of a priori unknown dynamical systems in the form of a set of fuzzy rules. The first contribution of the paper is the introduction of a one-to-one mapping between a fuzzy rule-base and a model matrix feature subspace using the T-S inference mechanism. This link enables the numerical properties associated with a rule-based matrix subspace, the relationships amongst these matrix subspaces, and the correlation between the output vector and a rule-base matrix subspace, to be investigated and extracted as rule-based knowledge to enhance model transparency. The matrix subspace spanned by a fuzzy rule is initially derived as the input regression matrix multiplied by a weighting matrix that consists of the corresponding fuzzy membership functions over the training data set. Model transparency is explored by the derivation of an equivalence between an A-optimality experimental design criterion of the weighting matrix and the average model output sensitivity to the fuzzy rule, so that rule-bases can be effectively measured by their identifiability via the A-optimality experimental design criterion. The A-optimality experimental design criterion of the weighting matrices of fuzzy rules is used to construct an initial model rule-base. An extended Gram-Schmidt algorithm is then developed to estimate the parameter vector for each rule. This new algorithm decomposes the model rule-bases via an orthogonal subspace decomposition approach, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level. This new approach is computationally simpler than the conventional Gram-Schmidt algorithm for resolving high-dimensional regression problems, whereby it is computationally desirable to decompose complex models into a few submodels rather than a single model with a large number of input variables and the associated curse-of-dimensionality problem. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
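A hedged sketch of the rule-ranking idea described above: each fuzzy rule's matrix subspace is the input regression matrix weighted by that rule's membership values, and rules can be scored by the A-optimality criterion trace((P'P)^-1), where small values indicate a well-identified rule. The memberships, centres and regressors below are toy values, not the paper's construction.

```python
# A-optimality scoring of T-S fuzzy rules on toy 1-D data.
import numpy as np

rng = np.random.default_rng(11)
x = rng.uniform(-1, 1, size=200)
X = np.column_stack([np.ones_like(x), x])       # T-S consequents: affine in x

centres = [-0.8, 0.0, 0.8]                      # three hypothetical rules
for c in centres:
    mu = np.exp(-((x - c) / 0.4) ** 2)          # Gaussian membership values
    P = mu[:, None] * X                         # rule-weighted regressors
    a_opt = np.trace(np.linalg.inv(P.T @ P))    # A-optimality score
    print(f"rule centred at {c:+.1f}: trace((P'P)^-1) = {a_opt:.3f}")
```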
Abstract:
The identification of non-linear systems using only observed finite datasets has become a mature research area over the last two decades. A class of linear-in-the-parameters models with universal approximation capabilities has been intensively studied and widely used due to the availability of many linear learning algorithms and their inherent convergence conditions. This article presents a systematic overview of basic research on model selection approaches for linear-in-the-parameters models. One of the fundamental problems in non-linear system identification is to find the minimal model with the best model generalisation performance from observational data only. The important concepts in achieving good model generalisation used in various non-linear system-identification algorithms are first reviewed, including Bayesian parameter regularisation and model selection criteria based on cross-validation and experimental design. A significant advance in machine learning has been the development of the support vector machine as a means of identifying kernel models based on the structural risk minimisation principle. Developments in convex optimisation-based model construction algorithms, including support vector regression algorithms, are outlined. Input selection algorithms and on-line system identification algorithms are also included in this review. Finally, some industrial applications of non-linear models are discussed.
Abstract:
The analysis-error variance of a 3D-FGAT assimilation is examined analytically using a simple scalar equation. It is shown that the analysis-error variance may be greater than the error variances of the inputs. The results are illustrated numerically with a scalar example and a shallow-water model.
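A worked scalar example of the effect described above. For an analysis x_a = x_b + k(y - x_b) with uncorrelated background and observation errors, the analysis-error variance is var_a = (1-k)^2 var_b + k^2 var_o, which falls below both input variances only for a suitable gain k; one simple route to the paper's result is a gain far from optimal (the 3D-FGAT mechanism itself involves the time mismatch between innovation and increment). The numbers are illustrative only.

```python
# Scalar analysis-error variance for a well-chosen vs badly chosen gain.
var_b, var_o = 1.0, 1.0

for k in (0.5, 1.5):
    var_a = (1 - k) ** 2 * var_b + k ** 2 * var_o
    print(f"k = {k}: analysis-error variance = {var_a}")
# k = 0.5 gives 0.5 (below both inputs); k = 1.5 gives 2.5 (above both).
```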
Abstract:
The effects of meson fluctuations are studied in a nonlocal generalization of the Nambu–Jona-Lasinio model, by including terms of next-to-leading order (NLO) in 1/Nc. In the model with only scalar and pseudoscalar interactions, NLO contributions to the quark condensate are found to be very small. This is a result of cancellation between virtual mesons and Fock terms, which occurs for the parameter sets of most interest. In the quark self-energy, similar cancellations arise in the tadpole diagrams, although not in other NLO pieces, which contribute at the 25% level. The effects on pion properties are also found to be small. NLO contributions from real ππ intermediate states increase the sigma meson mass by 30%. In an extended model with vector and axial interactions, there are indications that NLO effects could be larger.