936 results for leave one out cross validation
Abstract:
The presented thesis considered three different system approach topics to ensure yield and plant health in organically grown potatoes and tomatoes. The first topic describes interactions between late blight (Phytophthora infestans) incidence and soil nitrogen supply on yield in organic potato farming focussing in detail on the yield loss relationship of late blight based on results of several field trials. The interactive effects of soil N-supply, climatic conditions and late blight on the yield were studied in the presence and absence of copper fungicides from 2002-2004 for the potato cultivar Nicola. Under conditions of central Germany the use of copper significantly reduced late blight in almost all cases (15-30 %). However, the reductions in disease through copper application did not result in statistically significant yield increases (+0 – +10 %). Subsequently, only 30 % of the variation in yield could be attributed to disease reductions. A multiple regression model (R²Max), however, including disease reduction, growth duration and temperature sum from planting until 60 % disease severity was reached and soil mineral N contents 10 days after emergence could explain 75 % of the observed variations in yield. The second topic describes the effect of some selected organic fertilisers and biostimulant products on nitrogen-mineralization and efficiency, yield and diseases in organic potato and tomato trials. The organic fertilisers Biofeed Basis (BFB, plant derived, AgroBioProducts, Wageningen, Netherlands) and BioIlsa 12,5 Export (physically hydrolysed leather shavings, hair and skin of animals; ILSA, Arizignano, Italy) and two biostimulant products BioFeed Quality (BFQ, multi-compound seaweed extract, AgroBioProducts) and AUSMA (aqueous pine and spruce needle extract, A/S BIOLAT, Latvia), were tested. 
Both fertilisers supplied considerable amounts of nitrogen during the main uptake phases of the crops and achieved yields as high as or higher than the control with horn meal fertilisation. The N-efficiency of the tested fertilisers in potatoes ranged from 90 to 159 kg yield per kg N input. The most effective treatments in tomatoes were the combinations of fertiliser BFB with the biostimulants AUSMA and BFQ. Both biostimulants significantly increased the share of healthy fruit and/or the number of fruits. BFQ significantly increased potato yields (+6 %) in one out of two years and reduced R. solani infestation in the potatoes. This suggests that the biostimulants had effects on plant metabolism and resistance properties. However, no effects of biostimulants on potato late blight could be observed in the field. The third topic focused on the effect of suppressive composts and seed tuber health on the saprophytic pathogen Rhizoctonia solani in organic potato systems. In the present study, 5 t ha-1 DM of a yard and bio-waste (60/40) compost produced in a 5-month composting process and a 15-month-old 100 % yard waste compost were used to assess the effects on potato infection with R. solani when applying composts within the limits allowed. Across the differences in initial seed tuber infestation and 12 cultivars, 5 t DM ha-1 of high-quality composts, applied in the seed tuber area, reduced the infestation of harvested potatoes with black scurf, tuber malformations and dry core tubers by 20 to 84 %, 20 to 49 % and 38 to 54 %, respectively, while marketable yields increased by 5 to 25 % due to lower rates of waste after sorting (marketable yield is gross yield minus malformed tubers, tubers with dry core, and tubers with black scurf on more than 15 % of the skin). The rate of initial black scurf infection of the seed tubers also significantly affected tuber number, health and quality.
Compared to healthy seed tubers, initial black scurf sclerotia infestation of 2-5 and >10 % of the tuber surface led in untreated plots to a decrease in marketable yields of 14-19 and 44-66 %, an increase in black scurf severity of 8-40 and 34-86 %, and an increase in the amount of malformed and dry core tubers of 32-57 and 109-214 %, respectively.
Abstract:
The increasing interconnection of information and communication systems leads to a further increase in complexity and thus to a further growth in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long since ceased to offer protection against intrusion attempts into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyse information from network components and hosts in order to detect unusual behaviour and security violations automatically. While signature-based approaches can detect only already known attack patterns, anomaly-based IDS are also able to detect new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the massive volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic "queuing concept" to process the large volume of incoming network data, continuously constructs network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on the "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), a model of the network's normal state (NNB) and an "update model". In OptiFilter, Tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are further analysed and converted into connection vectors.
To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is investigated intensively and substantially extended. Several approaches are proposed and discussed in this dissertation. A classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches to initialising the weight vectors and through strengthening of the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to further examine the unknown connections detected by the EGHSOM and to check whether they are normal. However, network traffic data change constantly because of the concept drift phenomenon, which produces non-stationary network data in real time. This phenomenon is better controlled by the update model. The EGHSOM model can detect new anomalies effectively, and the NNB model adapts optimally to changes in the network data. In the experimental investigations the framework showed promising results. In the first experiment the framework was evaluated in offline mode. OptiFilter was evaluated with offline, synthetic and realistic data. The adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy. In the second experiment the framework was installed on a 1 to 10 GB network link and evaluated in online mode in real time. OptiFilter successfully converted the massive volume of network data into structured connection vectors, and the adaptive classifier classified them precisely.
The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: processing of the collected network data, achievement of the best performance (such as overall accuracy), detection of unknown connections, and development of a detection model for intrusion attempts that works in real time.
Abstract:
Background: the self-administered survey is the most widely used and reliable way to investigate health-related behaviours in adolescents. In general, a significant group of participants responds inconsistently to some items on related topics, particularly sensitive ones; consequently, those items must be excluded from the analysis. To date, the demographic characteristics of students who respond consistently to a survey and those who do not have not been compared extensively. Objective: to compare some demographic variables related to inconsistent responses about sexual behaviour among secondary school students in Santa Marta, Colombia. Method: a probabilistic cluster sample of students completed an anonymous survey on sexual intercourse. Logistic regression was used to adjust the survey variables for which responses were inconsistent. Results: a total of 3,813 students completed the survey. A group of 3,575 students (93.8%) responded consistently to the items on sexual behaviour and a group of 238 (6.2%) responded inconsistently. After adjusting for socioeconomic stratum, the students who most often responded inconsistently were male (OR = 2.1; 95% CI 1.6-2.8) and attended private schools (OR = 3.5; 95% CI 2.6-4.8). Conclusions: approximately one in twenty students responds inconsistently to questions about sexual behaviour. Inconsistent responses are associated with private-school students and male sex. More research is needed.
Abstract:
The aim of this paper is essentially twofold: first, to describe the use of spherical nonparametric estimators for determining statistical diagnostic fields from ensembles of feature tracks on a global domain, and second, to report the application of these techniques to data derived from a modern general circulation model. New spherical kernel functions are introduced that are more efficiently computed than the traditional exponential kernels. The data-driven techniques of cross-validation to determine the amount of smoothing objectively, and adaptive smoothing to vary the smoothing locally, are also considered. Also introduced are techniques for combining seasonal statistical distributions to produce longer-term statistical distributions. Although all calculations are performed globally, only the results for the Northern Hemisphere winter (December, January, February) and Southern Hemisphere winter (June, July, August) cyclonic activity are presented, discussed, and compared with previous studies. Overall, results for the two hemispheric winters are in good agreement with previous studies, both for model-based studies and observational studies.
Abstract:
Maps of kriged soil properties for precision agriculture are often based on a variogram estimated from too few data because the costs of sampling and analysis are often prohibitive. If the variogram has been computed by the usual method of moments, it is likely to be unstable when there are fewer than 100 data. The scale of variation in soil properties should be investigated prior to sampling by computing a variogram from ancillary data, such as an aerial photograph of the bare soil. If the sampling interval suggested by this is large in relation to the size of the field there will be too few data to estimate a reliable variogram for kriging. Standardized variograms from aerial photographs can be used with standardized soil data that are sparse, provided the data are spatially structured and the nugget:sill ratio is similar to that of a reliable variogram of the property. The problem remains of how to set this ratio in the absence of an accurate variogram. Several methods of estimating the nugget:sill ratio for selected soil properties are proposed and evaluated. Standardized variograms with nugget:sill ratios set by these methods are more similar to those computed from intensive soil data than are variograms computed from sparse soil data. The results of cross-validation and mapping show that the standardized variograms provide more accurate estimates, and preserve the main patterns of variation better than those computed from sparse data.
Abstract:
Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. 
The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
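For readers unfamiliar with the estimator named in this abstract, the following is a minimal sketch of Matheron's method-of-moments semivariance on a regularly spaced one-dimensional transect. The function name `matheron_variogram`, the transect values, and the single injected outlier are illustrative assumptions, not data or code from the paper.

```python
def matheron_variogram(z, max_lag):
    """Method-of-moments semivariance for lags 1..max_lag.

    Assumes z holds observations at equally spaced positions along a line,
    so the lag h is simply an index offset.
    """
    gamma = {}
    for h in range(1, max_lag + 1):
        pairs = [(z[i], z[i + h]) for i in range(len(z) - h)]
        if not pairs:
            break
        # gamma(h) = sum of squared differences / (2 * number of pairs)
        gamma[h] = sum((a - b) ** 2 for a, b in pairs) / (2.0 * len(pairs))
    return gamma


# Hypothetical transect (made-up values).
z = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
clean = matheron_variogram(z, 2)

# Same transect with one value replaced by an outlier.
z_outlier = list(z)
z_outlier[2] = 14.0
contaminated = matheron_variogram(z_outlier, 2)

print("clean:", clean)
print("contaminated:", contaminated)
```

On these made-up values, replacing one observation with an outlier raises the lag-1 semivariance, which is consistent in direction with the inflated nugget and sill variances the abstract reports. The robust estimators mentioned in the abstract are not implemented here.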
Abstract:
The aim of the study was to establish and verify a predictive vegetation model for plant community distribution in the alti-Mediterranean zone of the Lefka Ori massif, western Crete. Based on previous work, three variables were identified as significant determinants of plant community distribution, namely altitude, slope angle and geomorphic landform. The response of four community types to these variables was tested using classification tree analysis in order to model community type occurrence. V-fold cross-validation plots were used to determine the length of the best-fitting tree. The final 9-node tree selected correctly classified 92.5% of the samples. The results were used to provide decision rules for the construction of a spatial model for each community type. The model was implemented within a Geographical Information System (GIS) to predict the distribution of each community type in the study site. The evaluation of the model in the field using an error matrix gave an overall accuracy of 71%. The user's accuracy was higher for the Crepis-Cirsium (100%) and Telephium-Herniaria (66.7%) community types and relatively lower for the Peucedanum-Alyssum and Dianthus-Lomelosia community types (63.2% and 62.5%, respectively). Misclassification and field validation point to the need for improved geomorphological mapping and suggest the presence of transitional communities between existing community types.
Abstract:
BACKGROUND: The serum peptidome may be a valuable source of diagnostic cancer biomarkers. Previous mass spectrometry (MS) studies have suggested that groups of related peptides discriminatory for different cancer types are generated ex vivo from abundant serum proteins by tumor-specific exopeptidases. We tested 2 complementary serum profiling strategies to see if similar peptides could be found that discriminate ovarian cancer from benign cases and healthy controls. METHODS: We subjected identically collected and processed serum samples from healthy volunteers and patients to automated polypeptide extraction on octadecylsilane-coated magnetic beads and separately on ZipTips before MALDI-TOF MS profiling at 2 centers. The 2 platforms were compared and case control profiling data analyzed to find altered MS peak intensities. We tested models built from training datasets for both methods for their ability to classify a blinded test set. RESULTS: Both profiling platforms had CVs of approximately 15% and could be applied for high-throughput analysis of clinical samples. The 2 methods generated overlapping peptide profiles, with some differences in peak intensity in different mass regions. In cross-validation, models from training data gave diagnostic accuracies up to 87% for discriminating malignant ovarian cancer from healthy controls and up to 81% for discriminating malignant from benign samples. Diagnostic accuracies up to 71% (malignant vs healthy) and up to 65% (malignant vs benign) were obtained when the models were validated on the blinded test set. CONCLUSIONS: For ovarian cancer, altered MALDI-TOF MS peptide profiles alone cannot be used for accurate diagnoses.
Abstract:
This research is associated with the goal of the horticultural sector of the Colombian southwest, which is to obtain climatic information, specifically, to predict the monthly average temperature at sites where it has not been measured. The data correspond to monthly average temperature, and were recorded at meteorological stations in Valle del Cauca, Colombia, South America. Two components are identified in the data of this research: (1) a component due to temporal aspects, determined by characteristics of the time series, the distribution of the monthly average temperature through the months, and the temporal phenomena that increased (El Niño) and decreased (La Niña) the temperature values, and (2) a component due to the sites, which is determined by the clear differentiation of two populations, the valley and the mountains, which are associated with the pattern of monthly average temperature and with the altitude. Finally, due to the closeness between meteorological stations it is possible to find spatial correlation between data from nearby sites. In the first instance a random coefficient model without spatial covariance structure in the errors is obtained by month and geographical location (mountains and valley, respectively). Models for wet periods in mountains show a normal distribution in the errors; models for the valley and dry periods in mountains do not exhibit a normal pattern in the errors. In models of mountains and wet periods, omni-directional weighted variograms for residuals show spatial continuity. The random coefficient model without spatial covariance structure in the errors and the random coefficient model with spatial covariance structure in the errors both capture the influence of the El Niño and La Niña phenomena, which indicates that the inclusion of the random part in the model is appropriate. The altitude variable contributes significantly in the models for mountains.
In general, the cross-validation process indicates that the random coefficient models with spherical and Gaussian spatial covariance structures are the best models for the wet periods in mountains, and the worst model is the model used by the Colombian Institute for Meteorology, Hydrology and Environmental Studies (IDEAM) to predict temperature.
Abstract:
The surface geometries of the p(√7 × √7)R19°-(4CO) and c(2 × 4)-(2CO) layers on Ni{111} and of the clean Ni{111} surface were determined by low-energy electron diffraction structure analysis. For the clean surface, small but significant contractions of d(12) and d(23) (both 2.02 Å) were found with respect to the bulk interlayer distance (2.03 Å). In the c(2 × 4)-(2CO) structure these distances are expanded, with values of d(12) = 2.08 Å and d(23) = 2.06 Å and buckling of 0.08 and 0.02 Å in the first and second layer, respectively. CO resides near hcp and fcc hollow sites with relatively large lateral shifts away from the ideal positions, leading to unequal C-Ni bond lengths between 1.76 and 1.99 Å. For the p(√7 × √7)R19°-(4CO) layer two best-fit geometries were found, which agree in most of their atomic positions, except for one out of four CO molecules, which is either near atop or between bridge and atop. The remaining three molecules reside near hcp and fcc sites, again with large lateral deviations from their ideal positions. The average C-Ni bond length for these molecules is, however, the same as for CO on hollow sites at low coverage. The average C-Ni bond length at hollow sites, the interlayer distances, and the buckling in the first Ni layer are similar to the c(2 × 4)-(2CO) geometry; only the buckling in the second layer (0.08 Å) is significantly larger. Lateral and vertical shifts of the Ni atoms in the first layer lead to unsymmetric environments for the CO molecules, which can be regarded as an imprint of the chiral p(√7 × √7)R19° lattice geometry onto the substrate.
Abstract:
A new parameter-estimation algorithm, which minimises the cross-validated prediction error for linear-in-the-parameter models, is proposed, based on stacked regression and an evolutionary algorithm. It is initially shown that cross-validation is very important for prediction in linear-in-the-parameter models using a criterion called the mean dispersion error (MDE). Stacked regression, which can be regarded as a sophisticated type of cross-validation, is then introduced based on an evolutionary algorithm, to produce a new parameter-estimation algorithm, which preserves the parsimony of a concise model structure that is determined using the forward orthogonal least-squares (OLS) algorithm. The PRESS prediction errors are used for cross-validation, and the sunspot and Canadian lynx time series are used to demonstrate the new algorithms.
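The PRESS statistic mentioned in this abstract is the sum of squared leave-one-out prediction errors, and for ordinary least squares it can be obtained from a single fit via the leverages. The sketch below illustrates that identity for a straight-line model only; it does not implement the stacked regression or evolutionary algorithm of the paper. The function names and the data are made-up assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def press_shortcut(x, y):
    """PRESS from one fit: sum of (e_i / (1 - h_ii))^2."""
    n = len(x)
    a, b = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    total = 0.0
    for xi, yi in zip(x, y):
        h = 1.0 / n + (xi - mx) ** 2 / sxx  # leverage of point i
        e = yi - (a + b * xi)               # ordinary residual
        total += (e / (1.0 - h)) ** 2
    return total


def press_refit(x, y):
    """PRESS by explicitly refitting with each point left out in turn."""
    total = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total


# Made-up illustrative data (not the sunspot or lynx series).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]

print(press_shortcut(x, y), press_refit(x, y))
```

The two functions agree up to floating-point rounding, which is why leave-one-out cross-validation is cheap for linear-in-the-parameter models: no refitting is needed.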
Abstract:
The potential of near infrared spectroscopy in conjunction with partial least squares regression to predict Miscanthus × giganteus and short rotation coppice willow quality indices was examined. Moisture, calorific value, ash and carbon content were predicted with a root mean square error of cross-validation of 0.90% (R² = 0.99), 0.13 MJ/kg (R² = 0.99), 0.42% (R² = 0.58), and 0.57% (R² = 0.88), respectively. The moisture and calorific value prediction models had excellent accuracy, while the carbon and ash models were fair and poor, respectively. The results indicate that near infrared spectroscopy has the potential to predict quality indices of dedicated energy crops; however, the models must be further validated on a wider range of samples prior to implementation. The utilization of such models would assist in the optimal use of the feedstock based on its biomass properties.
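As a reminder of what the reported error measure means, the usual definition of the root mean square error of cross-validation can be written as follows. The notation is an assumption introduced here, not taken from the abstract: $y_i$ is the reference value for sample $i$, $\hat{y}_{(-i)}$ is the prediction for that sample from a model fitted without the fold containing it, and $n$ is the number of samples.

```latex
\mathrm{RMSECV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_{(-i)}\right)^{2}}
```

With leave-one-out cross-validation each fold contains a single sample, so the sum under the root is the PRESS statistic.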
Abstract:
The objective of this study was to investigate the potential application of mid-infrared spectroscopy for determination of selected sensory attributes in a range of experimentally manufactured processed cheese samples. This study also evaluates mid-infrared spectroscopy against other recently proposed techniques for predicting sensory texture attributes. Processed cheeses (n = 32) of varying compositions were manufactured on a pilot scale. After 2 and 4 wk of storage at 4 °C, mid-infrared spectra (640 to 4,000 cm⁻¹) were recorded and samples were scored on a scale of 0 to 100 for 9 attributes using descriptive sensory analysis. Models were developed by partial least squares regression using raw and pretreated spectra. The mouth-coating and mass-forming models were improved by using a reduced spectral range (930 to 1,767 cm⁻¹). The remaining attributes were most successfully modeled using a combined range (930 to 1,767 cm⁻¹ and 2,839 to 4,000 cm⁻¹). The root mean square errors of cross-validation for the models were 7.4 (firmness; range 65.3), 4.6 (rubbery; range 41.7), 7.1 (creamy; range 60.9), 5.1 (chewy; range 43.3), 5.2 (mouth-coating; range 37.4), 5.3 (fragmentable; range 51.0), 7.4 (melting; range 69.3), and 3.1 (mass-forming; range 23.6). These models had a good practical utility. Model accuracy ranged from approximate quantitative predictions to excellent predictions (range error ratio = 9.6). In general, the models compared favorably with previously reported instrumental texture models and near-infrared models, although the creamy, chewy, and melting models were slightly weaker than the previously reported near-infrared models. We concluded that mid-infrared spectroscopy could be successfully used for the nondestructive and objective assessment of processed cheese sensory quality.
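The range error ratio quoted in this abstract can be reproduced from the figures it lists, assuming the ratio is taken as the score range divided by the RMSECV (the abstract does not state the formula, so this definition is an assumption). The dictionary and function names below are illustrative.

```python
# (RMSECV, score range) pairs as quoted in the abstract, 0-100 sensory scale.
models = {
    "firmness": (7.4, 65.3),
    "rubbery": (4.6, 41.7),
    "creamy": (7.1, 60.9),
    "chewy": (5.1, 43.3),
    "mouth-coating": (5.2, 37.4),
    "fragmentable": (5.3, 51.0),
    "melting": (7.4, 69.3),
    "mass-forming": (3.1, 23.6),
}


def range_error_ratio(rmsecv, score_range):
    """Assumed definition: range of reference values divided by RMSECV."""
    return score_range / rmsecv


rer = {name: range_error_ratio(*vals) for name, vals in models.items()}
best = max(rer, key=rer.get)

# List the models from highest to lowest ratio.
for name in sorted(rer, key=rer.get, reverse=True):
    print(f"{name:14s} RER = {rer[name]:.1f}")
```

Under this assumed definition the largest ratio belongs to the fragmentable model and rounds to 9.6, matching the value given in the abstract.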
Abstract:
The objective of this study was to determine the potential of mid-infrared spectroscopy in conjunction with partial least squares (PLS) regression to predict various quality parameters in cheddar cheese. Cheddar cheeses (n = 24) were manufactured and stored at 8 °C for 12 mo. Mid-infrared spectra (640 to 4,000 cm⁻¹) were recorded after 4, 6, 9, and 12 mo of storage. At 4, 6, and 9 mo, the water-soluble nitrogen (WSN) content of the samples was determined and the samples were also evaluated for 11 sensory texture attributes using descriptive sensory analysis. The mid-infrared spectra were subjected to a number of pretreatments, and predictive models were developed for all parameters. Age was predicted using scatter-corrected, 1st derivative spectra with a root mean square error of cross-validation (RMSECV) of 1 mo, while WSN was predicted using 1st derivative spectra (RMSECV = 2.6%). The sensory texture attributes most successfully predicted were rubbery, crumbly, chewy, and mass-forming. These attributes were modeled using 2nd derivative spectra and had corresponding RMSECV values in the range of 2.5 to 4.2 on a scale of 0 to 100. It was concluded that mid-infrared spectroscopy has the potential to predict age, WSN, and several sensory texture attributes of cheddar cheese.