24 results for prediction accuracy
in CentAUR: Central Archive University of Reading - UK
Abstract:
Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem. Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure. Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26% with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods. Availability: The SVM classifier is available from the authors. Work is in progress to make the method available on-line and to integrate the SVM predictions into the PSIPRED server.
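The three-state accuracy Q3 quoted above is the percentage of residues whose predicted state matches the observed one, averaged per protein. A minimal sketch follows, assuming the usual three-letter state alphabet (H, E, C); the function names and example strings are hypothetical and are not taken from the authors' software.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def mean_q3_per_protein(pairs):
    """Average Q3 over (predicted, observed) pairs, one pair per protein."""
    scores = [q3(p, o) for p, o in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data: H = helix, E = strand, C = coil.
    proteins = [("HHCC", "HHCC"), ("HHEE", "HHCC")]
    print(mean_q3_per_protein(proteins))
```

Averaging per protein, rather than pooling all residues, gives short and long chains equal weight in the final score.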
Abstract:
Disease-weather relationships influencing Septoria leaf blotch (SLB) preceding growth stage (GS) 31 were identified using data from 12 sites in the UK covering 8 years. Based on these relationships, an early-warning predictive model for SLB on winter wheat was formulated to predict the occurrence of a damaging epidemic (defined as disease severity of 5% or more on the top three leaf layers). The final model was based on accumulated rain > 3 mm in the 80-day period preceding GS 31 (roughly from early February to the end of April) and accumulated minimum temperature with a 0 °C base in the 50-day period starting from 120 days preceding GS 31 (approximately January and February). The model was validated on an independent data set on which the prediction accuracy was influenced by cultivar resistance. Over all observations, the model had a true positive proportion of 0.61, a true negative proportion of 0.73, a sensitivity of 0.83, and a specificity of 0.18. True negative proportion increased to 0.85 for resistant cultivars and decreased to 0.50 for susceptible cultivars. Potential fungicide savings are most likely to be made with resistant cultivars, but such benefits would need to be identified with an in-depth evaluation.
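The two weather predictors described above can be sketched as window sums over daily records. This is an illustration only: it assumes that "accumulated rain > 3 mm" means the sum of rainfall on days exceeding 3 mm, and that accumulated minimum temperature is the sum of daily minima above the 0 °C base. The abstract does not give the model's decision thresholds, so none are shown. All names are hypothetical.

```python
def accumulated_rain(daily_rain_mm, gs31_index, window_days=80, threshold_mm=3.0):
    """Sum rainfall on days above threshold_mm in the window ending just before GS 31.

    Assumption: only days with more than threshold_mm of rain contribute.
    """
    start = gs31_index - window_days
    if start < 0:
        raise ValueError("not enough daily records before GS 31")
    window = daily_rain_mm[start:gs31_index]
    return sum(r for r in window if r > threshold_mm)


def accumulated_min_temp(daily_tmin_c, gs31_index, start_offset=120,
                         window_days=50, base_c=0.0):
    """Sum daily minimum temperature above base_c over a window that starts
    start_offset days before GS 31 and lasts window_days days."""
    start = gs31_index - start_offset
    if start < 0:
        raise ValueError("not enough daily records before GS 31")
    window = daily_tmin_c[start:start + window_days]
    return sum(max(t - base_c, 0.0) for t in window)


if __name__ == "__main__":
    # Hypothetical records: index 130 is the day GS 31 is reached.
    rain = [0.0] * 130
    rain[60], rain[100], rain[129] = 10.0, 2.0, 5.0
    tmin = [1.0] * 130
    print(accumulated_rain(rain, 130), accumulated_min_temp(tmin, 130))
```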
Abstract:
MOTIVATION: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering or consensus based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ - a novel MQAP that compares 3D models of proteins without the need for CPU intensive structural alignments by utilising the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. RESULTS: The ModFOLDclustQ method is competitive with leading clustering based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over 5 times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction. AVAILABILITY: The ModFOLDclustQ and ModFOLDclust2 methods are available to download from: http://www.reading.ac.uk/bioinf/downloads/ CONTACT: l.j.mcguffin@reading.ac.uk.
Abstract:
The efficacy of explicit and implicit learning paradigms was examined during the very early stages of learning the perceptual-motor anticipation task of predicting ball direction from temporally occluded footage of soccer penalty kicks. In addition, the effect of instructional condition on point-of-gaze during learning was examined. A significant improvement in horizontal prediction accuracy was observed in the explicit learning group; however, similar improvement was evident in a placebo group who watched footage of soccer matches. Only the explicit learning intervention resulted in changes in eye movement behaviour and increased awareness of relevant postural cues. Results are discussed in terms of methodological and practical issues regarding the employment of implicit perceptual training interventions. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Two studies of assault victims examined the roles of (a) disorganized trauma memories in the development of posttraumatic stress disorder (PTSD), (b) peritraumatic cognitive processing in the development of problematic memories and PTSD, and (c) ongoing dissociation and negative appraisals of memories in maintaining symptomatology. In the cross-sectional study (n = 81), comparisons of current, past, and no-PTSD groups suggested that peritraumatic cognitive processing is related to the development of disorganized memories and PTSD. Ongoing dissociation and negative appraisals served to maintain PTSD symptoms. The prospective study (n = 73) replicated these findings longitudinally. Cognitive and memory assessments completed within 12 weeks postassault predicted 6-month symptoms. Assault severity measures explained 22% of symptom variance; measures of cognitive processing, memory disorganization, and appraisals increased prediction accuracy to 71%.
Abstract:
This paper proposes and tests a new framework for weighting recursive out-of-sample prediction errors according to their corresponding levels of in-sample estimation uncertainty. In essence, we show how to use the maximum possible amount of information from the sample in the evaluation of the prediction accuracy, by commencing the forecasts at the earliest opportunity and weighting the prediction errors. Via a Monte Carlo study, we demonstrate that the proposed framework selects the correct model from a set of candidate models considerably more often than the existing standard approach when only a small sample is available. We also show that the proposed weighting approaches result in tests of equal predictive accuracy that have much better sizes than the standard approach. An application to an exchange rate dataset highlights relevant differences in the results of tests of predictive accuracy based on the standard approach versus the framework proposed in this paper.
Abstract:
Ruminant production is a vital part of the food industry, but it raises environmental concerns, partly due to the associated methane outputs. Efficient methane mitigation and estimation of emissions from ruminants require accurate prediction tools. Equations recommended by international organizations or scientific studies have been developed with animals fed conserved forages and concentrates and should be used with caution for grazing cattle. The aim of the current study was to develop prediction equations with animals fed fresh grass, so that they are better suited to pasture-based systems and to animals at lower feeding levels. A study with 25 nonpregnant nonlactating cows fed solely fresh-cut grass at maintenance energy level was performed over two consecutive grazing seasons. Grass of broad feeding quality, due to contrasting harvest dates, maturity, fertilisation and grass varieties, from eight swards was offered. Cows were offered the experimental diets for at least 2 weeks before being housed in calorimetric chambers over 3 consecutive days, with feed intake measurements and total urine and faeces collections performed daily. Methane emissions were measured over the last 2 days. Prediction models were developed from 100 3-day averaged records. Internal validation of these equations, and those recommended in the literature, was performed. The existing models used in greenhouse gas inventories underestimated methane emissions from animals fed fresh-cut grass at maintenance, while the new models, using the same predictors, improved prediction accuracy. Error in the prediction of methane outputs decreased when grass nutrient, metabolisable energy and digestible organic matter concentrations were added as predictors to equations already containing dry matter or energy intakes, possibly because they explain feed digestibility and the type of energy-supplying nutrients more efficiently.
Predictions based on readily available farm-level data, such as liveweight and grass nutrient concentrations were also generated and performed satisfactorily. New models may be recommended for predictions of methane emissions from grazing cattle at maintenance or low feeding levels.
Abstract:
Insect returns from the UK's Doppler weather radars were collected in the summers of 2007 and 2008, to ascertain their usefulness in providing information about boundary layer winds. Such observations could be assimilated into numerical weather prediction models to improve forecasts of convective showers before precipitation begins. Significant numbers of insect returns were observed during daylight hours on a number of days through this period, when they were detected at up to 30 km range from the radars, and up to 2 km above sea level. The range of detectable insect returns was found to vary with time of year and temperature. There was also a very weak correlation with wind speed and direction. Use of a dual-polarized radar revealed that the insects did not orient themselves at random, but showed distinct evidence of common orientation on several days, sometimes at an angle to their direction of travel. Observation minus model background residuals of wind profiles showed greater bias and standard deviation than that of other wind measurement types, which may be due to the insects' headings/airspeeds and to imperfect data extraction. The method used here, similar to the Met Office's procedure for extracting precipitation returns, requires further development as clutter contamination remained one of the largest error contributors. Wind observations derived from the insect returns would then be useful for data assimilation applications.
Abstract:
The precision farmer wants to manage the variation in soil nutrient status continuously, which requires reliable predictions at places between sampling sites. Ordinary kriging can be used for prediction if the data are spatially dependent and there is a suitable variogram model. However, even if data are spatially correlated, there are often few soil sampling sites in relation to the area to be managed. If intensive ancillary data are available and these are coregionalized with the sparse soil data, they could be used to increase the accuracy of predictions of the soil properties by methods such as cokriging, kriging with external drift and regression kriging. This paper compares the accuracy of predictions of the plant available N properties (mineral N and potentially available N) for two arable fields in Bedfordshire, United Kingdom, from ordinary kriging, cokriging, kriging with external drift and regression kriging. For the last three, intensive elevation data were used with the soil data. The mean squared errors of prediction from these methods of kriging were determined at validation sites where the values were known. Kriging with external drift resulted in the smallest mean squared error for two of the three properties examined, and cokriging for the other. The results suggest that the use of intensive ancillary data can increase the accuracy of predictions of soil properties in arable fields provided that the variables are related spatially. (c) 2005 Elsevier B.V. All rights reserved.
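The comparison above rests on the mean squared error of prediction at validation sites where the true values are known. A minimal sketch of that ranking step is given below; it does not implement any kriging method, and the method names and numbers in the example are hypothetical placeholders, not results from the paper.

```python
def mean_squared_error(predicted, observed):
    """Mean of squared differences between predictions and known values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def rank_methods(predictions_by_method, observed):
    """Return (method, MSE) pairs sorted from smallest to largest error."""
    scores = [(name, mean_squared_error(pred, observed))
              for name, pred in predictions_by_method.items()]
    return sorted(scores, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical values at three validation sites.
    observed = [1.0, 2.0, 3.0]
    predictions = {
        "method_a": [1.0, 2.0, 4.0],
        "method_b": [2.0, 3.0, 4.0],
    }
    for name, mse in rank_methods(predictions, observed):
        print(name, round(mse, 3))
```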
Abstract:
The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work, three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types against these variables was tested using classification tree analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best fitting tree. The final 9-node tree selected correctly classified 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria (66.7%) community types and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation point to the need for improved geomorphological mapping and suggest the presence of transitional communities between existing community types.
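The overall and user's accuracies reported above are both read off an error matrix. A minimal sketch follows, assuming rows are the mapped (predicted) classes and columns the field (reference) classes, so that user's accuracy is the diagonal count divided by the row total. The counts in the example are hypothetical and are not the study's data.

```python
def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal of the error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def users_accuracy(matrix):
    """Per-class share of mapped samples confirmed in the field.

    Assumption: rows are mapped classes, columns are reference classes.
    """
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


if __name__ == "__main__":
    # Hypothetical two-class error matrix.
    matrix = [[8, 2],
              [1, 9]]
    print(overall_accuracy(matrix), users_accuracy(matrix))
```

User's accuracy answers the map reader's question: of the locations the map labels as a given community type, what share actually hold that type in the field?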
Abstract:
Space weather effects on technological systems originate with energy carried from the Sun to the terrestrial environment by the solar wind. In this study, we present results of modeling of solar corona-heliosphere processes to predict solar wind conditions at the L1 Lagrangian point upstream of Earth. In particular we calculate performance metrics for (1) empirical, (2) hybrid empirical/physics-based, and (3) full physics-based coupled corona-heliosphere models over an 8-year period (1995–2002). L1 measurements of the radial solar wind speed are the primary basis for validation of the coronal and heliosphere models studied, though other solar wind parameters are also considered. The models are from the Center for Integrated Space-Weather Modeling (CISM) which has developed a coupled model of the whole Sun-to-Earth system, from the solar photosphere to the terrestrial thermosphere. Simple point-by-point analysis techniques, such as mean-square-error and correlation coefficients, indicate that the empirical coronal-heliosphere model currently gives the best forecast of solar wind speed at 1 AU. A more detailed analysis shows that errors in the physics-based models are predominately the result of small timing offsets to solar wind structures and that the large-scale features of the solar wind are actually well modeled. We suggest that additional “tuning” of the coupling between the coronal and heliosphere models could lead to a significant improvement of their accuracy. Furthermore, we note that the physics-based models accurately capture dynamic effects at solar wind stream interaction regions, such as magnetic field compression, flow deflection, and density buildup, which the empirical scheme cannot.
Abstract:
In this paper the meteorological processes responsible for transporting tracer during the second ETEX (European Tracer EXperiment) release are determined using the UK Met Office Unified Model (UM). The UM-predicted distribution of tracer is also compared with observations from the ETEX campaign. The dominant meteorological process is a warm conveyor belt which transports large amounts of tracer away from the surface up to a height of 4 km over a 36 h period. Convection is also an important process, transporting tracer to heights of up to 8 km. Potential sources of error when using an operational numerical weather prediction model to forecast air quality are also investigated. These potential sources of error include model dynamics, model resolution and model physics. In the UM a semi-Lagrangian monotonic advection scheme is used with cubic polynomial interpolation. This can predict unrealistic negative values of tracer which are subsequently set to zero, and hence results in an overprediction of tracer concentrations. In order to conserve mass in the UM tracer simulations it was necessary to include a flux corrected transport method. Model resolution can also affect the accuracy of predicted tracer distributions. Low resolution simulations (50 km grid length) were unable to resolve a change in wind direction observed during ETEX 2; this led to an error in the transport direction and hence an error in tracer distribution. High resolution simulations (12 km grid length) captured the change in wind direction and hence produced a tracer distribution that compared better with the observations. The representation of convective mixing was found to have a large effect on the vertical transport of tracer. Turning off the convective mixing parameterisation in the UM significantly reduced the vertical transport of tracer. Finally, air quality forecasts were found to be sensitive to the timing of synoptic scale features.
Errors in the position of the cold front relative to the tracer release location of only 1 h resulted in changes in the predicted tracer concentrations that were of the same order of magnitude as the absolute tracer concentrations.
Abstract:
This study investigated the potential application of mid-infrared spectroscopy (MIR 4,000–900 cm−1) for the determination of milk coagulation properties (MCP), titratable acidity (TA), and pH in Brown Swiss milk samples (n = 1,064). Because MCP directly influence the efficiency of the cheese-making process, there is strong industrial interest in developing a rapid method for their assessment. Currently, the determination of MCP involves time-consuming laboratory-based measurements, and it is not feasible to carry out these measurements on the large numbers of milk samples associated with milk recording programs. Mid-infrared spectroscopy is an objective and nondestructive technique providing rapid real-time analysis of food compositional and quality parameters. Analysis of milk rennet coagulation time (RCT, min), curd firmness (a30, mm), TA (SH°/50 mL; SH° = Soxhlet-Henkel degree), and pH was carried out, and MIR data were recorded over the spectral range of 4,000 to 900 cm−1. Models were developed by partial least squares regression using untreated and pretreated spectra. The MCP, TA, and pH prediction models were improved by using the combined spectral ranges of 1,600 to 900 cm−1, 3,040 to 1,700 cm−1, and 4,000 to 3,470 cm−1. The root mean square errors of cross-validation for the developed models were 2.36 min (RCT, range 24.9 min), 6.86 mm (a30, range 58 mm), 0.25 SH°/50 mL (TA, range 3.58 SH°/50 mL), and 0.07 (pH, range 1.15). The most successfully predicted attributes were TA, RCT, and pH. The model for the prediction of TA provided approximate prediction (R2 = 0.66), whereas the predictive models developed for RCT and pH could discriminate between high and low values (R2 = 0.59 to 0.62). It was concluded that, although the models require further development to improve their accuracy before their application in industry, MIR spectroscopy has potential application for the assessment of RCT, TA, and pH during routine milk analysis in the dairy industry. 
The implementation of such models could be a means of improving MCP through phenotypic-based selection programs and to amend milk payment systems to incorporate MCP into their payment criteria.
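The root mean square error of cross-validation quoted above pools the squared errors of predictions made for samples that were held out while the model was fitted. A minimal sketch follows; to stay dependency-free it swaps in a one-predictor ordinary least squares line as a stand-in for partial least squares regression, so it illustrates the error metric only, not the authors' calibration. All names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (stand-in for PLS regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmsecv(xs, ys, n_folds=5):
    """Root mean square error over predictions for held-out samples."""
    n = len(xs)
    squared_errors = []
    for fold in range(n_folds):
        # Every n_folds-th sample, starting at index `fold`, is held out.
        test_idx = set(range(fold, n, n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        squared_errors.extend(
            (ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx
        )
    return math.sqrt(sum(squared_errors) / n)


if __name__ == "__main__":
    # Hypothetical data: a noisy linear relationship.
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
    print(rmsecv(xs, ys))
```

Comparing the result with the range of the measured attribute, as the abstract does (for example 2.36 min against a range of 24.9 min for RCT), puts the error in context.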
Abstract:
The potential of near infrared spectroscopy in conjunction with partial least squares regression to predict Miscanthus × giganteus and short rotation coppice willow quality indices was examined. Moisture, calorific value, ash and carbon content were predicted with a root mean square error of cross validation of 0.90% (R2 = 0.99), 0.13 MJ/kg (R2 = 0.99), 0.42% (R2 = 0.58), and 0.57% (R2 = 0.88), respectively. The moisture and calorific value prediction models had excellent accuracy while the carbon and ash models were fair and poor, respectively. The results indicate that near infrared spectroscopy has the potential to predict quality indices of dedicated energy crops; however, the models must be further validated on a wider range of samples prior to implementation. The utilization of such models would assist in the optimal use of the feedstock based on its biomass properties.