995 results for OBJECT CLASSIFICATION


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this study, 137 samples of corn distillers dried grains with solubles (DDGS) from a range of geographical origins (Jilin Province of China, Heilongjiang Province of China, the USA and Europe) were collected and analysed. Two independent laboratories used different near-infrared spectrometers combined with different chemometric packages to investigate the feasibility of classifying the geographical origin of DDGS. Based on the same dataset, one laboratory developed a partial least squares discriminant analysis model and the other developed an orthogonal partial least squares discriminant analysis model. Results showed that both models could perfectly classify DDGS samples from the different geographical origins. These promising results encourage larger-scale efforts to produce datasets that can be used to differentiate the geographical origin of DDGS; such efforts are required to provide higher-level food security measures on a global scale.
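To make the idea of partial least squares discriminant analysis concrete, here is a minimal sketch in plain Python. It is not the model from either laboratory: it assumes only two classes instead of the four origins, uses synthetic "spectra" generated by a hypothetical `make_spectra` helper, and fits PLS1 on a 0/1 class label with the NIPALS algorithm, assigning a sample to class 1 when the predicted score reaches 0.5. The function names, the number of components and the noise level are all illustrative assumptions.

```python
import math
import random


def fit_plsda(X, y, n_components=2):
    """Fit a two-class PLS-DA model (PLS1 on a 0/1 label) with NIPALS."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred spectra
    f = [yi - y_mean for yi in y]                              # centred labels
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with the label.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict(model, x):
    """Return the predicted class (0 or 1) for one spectrum."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    score = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        score += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return 1 if score >= 0.5 else 0


def make_spectra(n_per_class, rng, n_bands=20):
    """Hypothetical synthetic data: class 1 has extra absorbance in bands 5-9."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [math.sin(j / 3.0) + rng.gauss(0.0, 0.1) for j in range(n_bands)]
            if label == 1:
                for j in range(5, 10):
                    row[j] += 0.5
            X.append(row)
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_spectra(20, rng)
X_test, y_test = make_spectra(10, rng)
model = fit_plsda(X_train, y_train)
accuracy = sum(predict(model, x) == yi for x, yi in zip(X_test, y_test)) / len(y_test)
print("held-out accuracy:", accuracy)
```

With real near-infrared data, a multi-class version would typically regress on one indicator column per origin and assign each sample to the column with the largest predicted value. An established chemometrics or machine learning package would be the usual choice over hand-written NIPALS.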

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes, since they allow root causes to be analysed even when the number of input variables is high. In this work, we investigate the performance of three classification techniques within a Monte Carlo strategy aimed at root cause analysis. We consider the naive Bayes classifier and the logistic regression model with two different implementations for controlling model complexity: a LASSO-like implementation with L1-norm regularization, and a fully Bayesian implementation of the logistic model, the so-called relevance vector machine. Several challenges can arise when estimating such models, mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, a number of variables higher than the number of available data points, and unbalanced datasets. Using an ecological dataset and a semiconductor manufacturing dataset, we show the advantages and drawbacks of each method, highlighting the superior classification accuracy of the relevance vector machine with respect to the other classifiers. Moreover, we show how combining the proposed techniques with the Monte Carlo approach yields more robust insights into the problem under analysis when faced with challenging modelling conditions.
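The combination of L1-regularized logistic regression with a Monte Carlo strategy can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not specify the resampling scheme, so the sketch assumes random subsampling without replacement and reports how often each variable receives a non-zero weight. The solver (proximal gradient descent), the synthetic data from the hypothetical `make_data` helper, and all parameter values are likewise assumptions.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.15, step=0.5, iters=100):
    """L1-penalised logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # avoid overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def selection_frequency(X, y, runs=20, fraction=0.7, seed=0):
    """Fraction of Monte Carlo subsamples in which each variable is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(runs):
        idx = rng.sample(range(n), int(fraction * n))
        w, _ = fit_l1_logistic([X[i] for i in idx], [y[i] for i in idx])
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / runs for c in counts]


def make_data(n=120, p=6, seed=1):
    """Hypothetical data: only variable 0 influences the class label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + 0.3 * rng.gauss(0.0, 1.0) > 0 else 0 for row in X]
    return X, y


X, y = make_data()
freq = selection_frequency(X, y)
for j, fj in enumerate(freq):
    print("variable", j, "selected in", fj, "of runs")
```

Variables that keep a non-zero weight across most subsamples are more credible root cause candidates than variables selected only occasionally, which is one way to read the abstract's claim about more robust insights. The same loop could wrap a naive Bayes or relevance vector machine fit in place of `fit_l1_logistic`.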