750 resultados para CLASSIFIER


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Universidade Estadual de Campinas. Faculdade de Educação Física

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Fifty Bursa of Fabricius (BF) were examined by conventional optical microscopy and digital images were acquired and processed using Matlab® 6.5 software. The Artificial Neuronal Network (ANN) was generated using Neuroshell® Classifier software and the optical and digital data were compared. The ANN was able to make a comparable classification of digital and optical scores. The use of ANN was able to classify correctly the majority of the follicles, reaching sensibility and specificity of 89% and 96%, respectively. When the follicles were scored and grouped in a binary fashion the sensibility increased to 90% and obtained the maximum value for the specificity of 92%. These results demonstrate that the use of digital image analysis and ANN is a useful tool for the pathological classification of the BF lymphoid depletion. In addition it provides objective results that allow measuring the dimension of the error in the diagnosis and classification therefore making comparison between databases feasible.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This work proposes a new approach using a committee machine of artificial neural networks to classify masses found in mammograms as benign or malignant. Three shape factors, three edge-sharpness measures, and 14 texture measures are used for the classification of 20 regions of interest (ROIs) related to malignant tumors and 37 ROIs related to benign masses. A group of multilayer perceptrons (MLPs) is employed as a committee machine of neural network classifiers. The classification results are reached by combining the responses of the individual classifiers. Experiments involving changes in the learning algorithm of the committee machine are conducted. The classification accuracy is evaluated using the area A. under the receiver operating characteristics (ROC) curve. The A, result for the committee machine is compared with the A, results obtained using MLPs and single-layer perceptrons (SLPs), as well as a linear discriminant analysis (LDA) classifier Tests are carried out using the student's t-distribution. The committee machine classifier outperforms the MLP SLP, and LDA classifiers in the following cases: with the shape measure of spiculation index, the A, values of the four methods are, in order 0.93, 0.84, 0.75, and 0.76; and with the edge-sharpness measure of acutance, the values are 0.79, 0.70, 0.69, and 0.74. Although the features with which improvement is obtained with the committee machines are not the same as those that provided the maximal value of A(z) (A(z) = 0.99 with some shape features, with or without the committee machine), they correspond to features that are not critically dependent on the accuracy of the boundaries of the masses, which is an important result. (c) 2008 SPIE and IS&T.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 <= r <= 21 (85.2%) and r >= 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable to or better in completeness over the full magnitude range 15 <= r <= 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (> 80%) while simultaneously achieving low contamination (similar to 2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 <= r <= 21.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We investigate the sensitivity of the composite cellular automaton of H. Fuks [Phys. Rev. E 55, R2081 (1997)] to noise and assess the density classification performance of the resulting probabilistic cellular automaton (PCA) numerically. We conclude that the composite PCA performs the density classification task reliably only up to very small levels of noise. In particular, it cannot outperform the noisy Gacs-Kurdyumov-Levin automaton, an imperfect classifier, for any level of noise. While the original composite CA is nonergodic, analyses of relaxation times indicate that its noisy version is an ergodic automaton, with the relaxation times decaying algebraically over an extended range of parameters with an exponent very close (possibly equal) to the mean-field value.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multi-variate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Introduction: Internet users are increasingly using the worldwide web to search for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies that were developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier for Brazilian Portuguese-language web-based content within or outside of the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The strategies proposed were constructed using content-based vector methods for text classification, such that Naive Bayes was used for the task of classifying vector patterns with characteristics obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH for the problem that was put forward. This approach achieved better accuracy for this pattern classification task (0.94 sensitivity, specificity and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, this tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it could be shown that MeSH presents important results when used for the task of classifying web-based content focusing on the lay public. It was also possible to show from this study that MeSH was able to map out mutable non-deterministic characteristics of the web. (c) 2010 Elsevier Inc. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Age-related changes in running kinematics have been reported in the literature using classical inferential statistics. However, this approach has been hampered by the increased number of biomechanical gait variables reported and subsequently the lack of differences presented in these studies. Data mining techniques have been applied in recent biomedical studies to solve this problem using a more general approach. In the present work, we re-analyzed lower extremity running kinematic data of 17 young and 17 elderly male runners using the Support Vector Machine (SVM) classification approach. In total, 31 kinematic variables were extracted to train the classification algorithm and test the generalized performance. The results revealed different accuracy rates across three different kernel methods adopted in the classifier, with the linear kernel performing the best. A subsequent forward feature selection algorithm demonstrated that with only six features, the linear kernel SVM achieved 100% classification performance rate, showing that these features provided powerful combined information to distinguish age groups. The results of the present work demonstrate potential in applying this approach to improve knowledge about the age-related differences in running gait biomechanics and encourages the use of the SVM in other clinical contexts. (C) 2010 Elsevier Ltd. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A hybrid system to automatically detect, locate and classify disturbances affecting power quality in an electrical power system is presented in this paper. The disturbances characterized are events from an actual power distribution system simulated by the ATP (Alternative Transients Program) software. The hybrid approach introduced consists of two stages. In the first stage, the wavelet transform (WT) is used to detect disturbances in the system and to locate the time of their occurrence. When such an event is flagged, the second stage is triggered and various artificial neural networks (ANNs) are applied to classify the data measured during the disturbance(s). A computational logic using WTs and ANNs together with a graphical user interface (GU) between the algorithm and its end user is then implemented. The results obtained so far are promising and suggest that this approach could lead to a useful application in an actual distribution system. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed-crop competitiveness using as input categorical variables for the total density of weeds and corresponding proportions of narrow and broad-leaved weeds. The inferred categorical variable values for the weed-crop competitiveness along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage are then used as input for a second Bayesian network classifier to infer categorical variables values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationship among the nodes of the first and second Bayesian classifiers in a supervised fashion, respectively. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focused on the knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) into exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper, we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods-cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN)were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper examines the article system in interlanguage grammar focusing on Japanese learners of English, whose native language lacks articles. It will be demonstrated that for the acquisition of the English article system, count/mass distinctions and definiteness are the crucial factors. Although Japanese does not employ the article system to encode these aspects, it will be argued that they are nevertheless syntactically encoded through its classifier system. Hence, the problem for these learners must be to map these features onto the appropriate surface forms as the Missing Surface Inflection Hypothesis predicts (Prévost & White 2000). This suggestion will further be supported empirically by a fill-in-the article task. It will be concluded that these Japanese learners understand the English article system fairly well, possibly due to their native language, yet have problems with realizing the relevant features (i.e. count/mass distinctions and definiteness) in the target language.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Data mining is the process to identify valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases, (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, seme-structured, or unstructured. Data can be in text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal data with large volume, distributed, time variant, noisy, and high dimensionality. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in the data mining, particularly for data mining applications into engineering fields. Together with regression, classification is mainly for predictive modelling. So far, there have been a number of classification algorithms in practice. According to (Sebastiani, 2002), the main classification algorithms can be categorized as: decision tree and rule based approach such as C4.5 (Quinlan, 1996); probability methods such as Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001), neural networks methods (Rumelhart, Hinton & Wiliams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been observed in the literature so far. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. In this paper, feature selection techniques are firstly described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms: support vector machine and probability classifier are chosen to be the spike occurrence predictors and are discussed in details. Realistic market data are used to test the proposed model with promising results.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Fogo selvagem (FS) is mediated by pathogenic, predominantly IgG4, anti-desmoglein 1 (Dsg1) autoantibodies and is endemic in Limao Verde, Brazil. IgG and IgG subclass autoantibodies were tested in a sample of 214 FS patients and 261 healthy controls by Dsg1 ELISA. For model selection, the sample was randomly divided into training (50%), validation (25%), and test (25%) sets. Using the training and validation sets, IgG4 was chosen as the best predictor of FS, with index values above 6.43 classified as FS. Using the test set, IgG4 has sensitivity of 92% (95% confidence interval (95% CI): 82-95%), specificity of 97% (95% CI: 89-100%), and area under the curve of 0.97 ( 95% CI: 0.94-1.00). The IgG4 positive predictive value (PPV) in Limao Verde (3% FS prevalence) was 49%. The sensitivity, specificity, and PPV of IgG anti-Dsg1 were 87, 91, and 23%, respectively. The IgG4-based classifier was validated by testing 11 FS patients before and after clinical disease and 60 Japanese pemphigus foliaceus patients. It classified 21 of 96 normal individuals from a Limao Verde cohort as having FS serology. On the basis of its PPV, half of the 21 individuals may currently have preclinical FS and could develop clinical disease in the future. Identifying individuals during preclinical FS will enhance our ability to identify the etiological agent(s) triggering FS.