26 resultados para Empirical Best Linear Unbiased Predictor
em CentAUR: Central Archive University of Reading - UK
Resumo:
The incidence-severity relationship for cashew gummosis, caused by Lasiodiplodia theobromae, was studied to determine the feasibility of using disease incidence to estimate indirectly disease severity in order to establish the potential damage caused by this disease in semiarid north-eastern Brazil. Epidemics were monitored in two cashew orchards, from 1995 to 1998 in an experimental field composed of 28 dwarf clones, and from 2000 to 2002 in a commercial orchard of a single clone. The two sites were located 10 km from each other. Logarithmic transformation achieved the best linear adjustment of incidence and severity data as determined by coefficients of determination for place, age and pooled data. A very high correlation between incidence and severity was found in both fields, with different disease pressures, different cashew genotypes, different ages and at several epidemic stages. Thus, the easily assessed gummosis incidence could be used to estimate gummosis severity levels.
Resumo:
A significant challenge in the prediction of climate change impacts on ecosystems and biodiversity is quantifying the sources of uncertainty that emerge within and between different models. Statistical species niche models have grown in popularity, yet no single best technique has been identified reflecting differing performance in different situations. Our aim was to quantify uncertainties associated with the application of 2 complimentary modelling techniques. Generalised linear mixed models (GLMM) and generalised additive mixed models (GAMM) were used to model the realised niche of ombrotrophic Sphagnum species in British peatlands. These models were then used to predict changes in Sphagnum cover between 2020 and 2050 based on projections of climate change and atmospheric deposition of nitrogen and sulphur. Over 90% of the variation in the GLMM predictions was due to niche model parameter uncertainty, dropping to 14% for the GAMM. After having covaried out other factors, average variation in predicted values of Sphagnum cover across UK peatlands was the next largest source of variation (8% for the GLMM and 86% for the GAMM). The better performance of the GAMM needs to be weighed against its tendency to overfit the training data. While our niche models are only a first approximation, we used them to undertake a preliminary evaluation of the relative importance of climate change and nitrogen and sulphur deposition and the geographic locations of the largest expected changes in Sphagnum cover. Predicted changes in cover were all small (generally <1% in an average 4 m2 unit area) but also highly uncertain. Peatlands expected to be most affected by climate change in combination with atmospheric pollution were Dartmoor, Brecon Beacons and the western Lake District.
Resumo:
Preparing for episodes with risks of anomalous weather a month to a year ahead is an important challenge for governments, non-governmental organisations, and private companies and is dependent on the availability of reliable forecasts. The majority of operational seasonal forecasts are made using process-based dynamical models, which are complex, computationally challenging and prone to biases. Empirical forecast approaches built on statistical models to represent physical processes offer an alternative to dynamical systems and can provide either a benchmark for comparison or independent supplementary forecasts. Here, we present a simple empirical system based on multiple linear regression for producing probabilistic forecasts of seasonal surface air temperature and precipitation across the globe. The global CO2-equivalent concentration is taken as the primary predictor; subsequent predictors, including large-scale modes of variability in the climate system and local-scale information, are selected on the basis of their physical relationship with the predictand. The focus given to the climate change signal as a source of skill and the probabilistic nature of the forecasts produced constitute a novel approach to global empirical prediction. Hindcasts for the period 1961–2013 are validated against observations using deterministic (correlation of seasonal means) and probabilistic (continuous rank probability skill scores) metrics. Good skill is found in many regions, particularly for surface air temperature and most notably in much of Europe during the spring and summer seasons. For precipitation, skill is generally limited to regions with known El Niño–Southern Oscillation (ENSO) teleconnections. The system is used in a quasi-operational framework to generate empirical seasonal forecasts on a monthly basis.
Phosphorus dynamics and export in streams draining micro-catchments: Development of empirical models
Resumo:
Annual total phosphorus (TP) export data from 108 European micro-catchments were analyzed against descriptive catchment data on climate (runoff), soil types, catchment size, and land use. The best possible empirical model developed included runoff, proportion of agricultural land and catchment size as explanatory variables but with a low explanation of the variance in the dataset (R-2 = 0.37). Improved country specific empirical models could be developed in some cases. The best example was from Norway where an analysis of TP-export data from 12 predominantly agricultural micro-catchments revealed a relationship explaining 96% of the variance in TP-export. The explanatory variables were in this case soil-P status (P-AL), proportion of organic soil, and the export of suspended sediment. Another example is from Denmark where an empirical model was established for the basic annual average TP-export from 24 catchments with percentage sandy soils, percentage organic soils, runoff, and application of phosphorus in fertilizer and animal manure as explanatory variables (R-2 = 0.97).
Resumo:
The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types against these variables was tested using classification trees analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best fitting tree. The final 9node tree selected, classified correctly 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria community type (66.7%) and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation points to the need for improved geomorphological mapping and suggests the presence of transitional communities between existing community types.
Resumo:
The contributions of different time scales to extratropical teleconnections are examined. By applying empirical orthogonal functions and correlation analyses to reanalysis data, it is shown that eddies with periods shorter than 10 days have no linear contribution to teleconnectivity. Instead, synoptic variability follows wavelike patterns along the storm tracks, interpreted as propagating baroclinic disturbances. In agreement with preceding studies, it is found that teleconnections such as the North Atlantic Oscillation (NAO) and the Pacific–North America (PNA) pattern occur only at low frequencies, typically for periods more than 20 days. Low-frequency potential vorticity variability is shown to follow patterns analogous to known teleconnections but with shapes that differ considerably from them. It is concluded that the role, if any, of synoptic eddies in determining and forcing teleconnections needs to be sought in nonlinear interactions with the slower transients. The present results demonstrate that daily variability of teleconnection indices cannot be interpreted in terms of the teleconnection patterns, only the slow part of the variability.
Resumo:
Space weather effects on technological systems originate with energy carried from the Sun to the terrestrial environment by the solar wind. In this study, we present results of modeling of solar corona-heliosphere processes to predict solar wind conditions at the L1 Lagrangian point upstream of Earth. In particular we calculate performance metrics for (1) empirical, (2) hybrid empirical/physics-based, and (3) full physics-based coupled corona-heliosphere models over an 8-year period (1995–2002). L1 measurements of the radial solar wind speed are the primary basis for validation of the coronal and heliosphere models studied, though other solar wind parameters are also considered. The models are from the Center for Integrated Space-Weather Modeling (CISM) which has developed a coupled model of the whole Sun-to-Earth system, from the solar photosphere to the terrestrial thermosphere. Simple point-by-point analysis techniques, such as mean-square-error and correlation coefficients, indicate that the empirical coronal-heliosphere model currently gives the best forecast of solar wind speed at 1 AU. A more detailed analysis shows that errors in the physics-based models are predominately the result of small timing offsets to solar wind structures and that the large-scale features of the solar wind are actually well modeled. We suggest that additional “tuning” of the coupling between the coronal and heliosphere models could lead to a significant improvement of their accuracy. Furthermore, we note that the physics-based models accurately capture dynamic effects at solar wind stream interaction regions, such as magnetic field compression, flow deflection, and density buildup, which the empirical scheme cannot.
Resumo:
This note presents a robust method for estimating response surfaces that consist of linear response regimes and a linear plateau. The linear response-and-plateau model has fascinated production scientists since von Liebig (1855) and, as Upton and Dalton indicated, some years ago in this Journal, the response-and-plateau model seems to fit the data in many empirical studies. The estimation algorithm evolves from Bayesian implementation of a switching-regression (finite mixtures) model and demonstrates routine application of Gibbs sampling and data augmentation-techniques that are now in widespread application in other disciplines.
Resumo:
When assessing hypotheses, the possibility and consequences of false-positive conclusions should be considered along with the avoidance of false-negative ones. A recent assessment of the system of rice intensification (SRI) by McDonald et al. [McDonald, A.J., Hobbs, P.R., Riha, S.J., 2006. Does the system of rice intensification outperform conventional best management? A synopsis of the empirical record. Field Crops Res. 96, 31-36] provides a good example where this was not done as it was preoccupied with avoiding false-positives only. It concluded, based on a desk study using secondary data assembled selectively from diverse sources and with a 95% level of confidence, that 'best management practices' (BMPs) on average produce 11% higher rice yields than SRI methods, and that, therefore, SRI has little to offer beyond what is already known by scientists.
Resumo:
The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types against these variables was tested using classification trees analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best fitting tree. The final 9node tree selected, classified correctly 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria community type (66.7%) and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation points to the need for improved geomorphological mapping and suggests the presence of transitional communities between existing community types.
Resumo:
Analyses of high-density single-nucleotide polymorphism (SNP) data, such as genetic mapping and linkage disequilibrium (LD) studies, require phase-known haplotypes to allow for the correlation between tightly linked loci. However, current SNP genotyping technology cannot determine phase, which must be inferred statistically. In this paper, we present a new Bayesian Markov chain Monte Carlo (MCMC) algorithm for population haplotype frequency estimation, particulary in the context of LD assessment. The novel feature of the method is the incorporation of a log-linear prior model for population haplotype frequencies. We present simulations to suggest that 1) the log-linear prior model is more appropriate than the standard coalescent process in the presence of recombination (>0.02cM between adjacent loci), and 2) there is substantial inflation in measures of LD obtained by a "two-stage" approach to the analysis by treating the "best" haplotype configuration as correct, without regard to uncertainty in the recombination process. Genet Epidemiol 25:106-114, 2003. (C) 2003 Wiley-Liss, Inc.
Resumo:
Tremor is a clinical feature characterized by oscillations of a part of the body. The detection and study of tremor is an important step in investigations seeking to explain underlying control strategies of the central nervous system under natural (or physiological) and pathological conditions. It is well established that tremorous activity is composed of deterministic and stochastic components. For this reason, the use of digital signal processing techniques (DSP) which take into account the nonlinearity and nonstationarity of such signals may bring new information into the signal analysis which is often obscured by traditional linear techniques (e.g. Fourier analysis). In this context, this paper introduces the application of the empirical mode decomposition (EMD) and Hilbert spectrum (HS), which are relatively new DSP techniques for the analysis of nonlinear and nonstationary time-series, for the study of tremor. Our results, obtained from the analysis of experimental signals collected from 31 patients with different neurological conditions, showed that the EMD could automatically decompose acquired signals into basic components, called intrinsic mode functions (IMFs), representing tremorous and voluntary activity. The identification of a physical meaning for IMFs in the context of tremor analysis suggests an alternative and new way of detecting tremorous activity. These results may be relevant for those applications requiring automatic detection of tremor. Furthermore, the energy of IMFs was visualized as a function of time and frequency by means of the HS. This analysis showed that the variation of energy of tremorous and voluntary activity could be distinguished and characterized on the HS. Such results may be relevant for those applications aiming to identify neurological disorders. In general, both the HS and EMD demonstrated to be very useful to perform objective analysis of any kind of tremor and can therefore be potentially used to perform functional assessment.
Resumo:
The identification of non-linear systems using only observed finite datasets has become a mature research area over the last two decades. A class of linear-in-the-parameter models with universal approximation capabilities have been intensively studied and widely used due to the availability of many linear-learning algorithms and their inherent convergence conditions. This article presents a systematic overview of basic research on model selection approaches for linear-in-the-parameter models. One of the fundamental problems in non-linear system identification is to find the minimal model with the best model generalisation performance from observational data only. The important concepts in achieving good model generalisation used in various non-linear system-identification algorithms are first reviewed, including Bayesian parameter regularisation and models selective criteria based on the cross validation and experimental design. A significant advance in machine learning has been the development of the support vector machine as a means for identifying kernel models based on the structural risk minimisation principle. The developments on the convex optimisation-based model construction algorithms including the support vector regression algorithms are outlined. Input selection algorithms and on-line system identification algorithms are also included in this review. Finally, some industrial applications of non-linear models are discussed.