898 resultados para classification accuracy


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, an improved stochastic discrimination (SD) is introduced to reduce the error rate of the standard SD in the context of multi-class classification problem. The learning procedure of the improved SD consists of two stages. In the first stage, a standard SD, but with shorter learning period is carried out to identify an important space where all the misclassified samples are located. In the second stage, the standard SD is modified by (i) restricting sampling in the important space; and (ii) introducing a new discriminant function for samples in the important space. It is shown by mathematical derivation that the new discriminant function has the same mean, but smaller variance than that of standard SD for samples in the important space. It is also analyzed that the smaller the variance of the discriminant function, the lower the error rate of the classifier. Consequently, the proposed improved SD improves standard SD by its capability of achieving higher classification accuracy. Illustrative examples axe provided to demonstrate the effectiveness of the proposed improved SD.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stochastic discrimination (SD) depends on a discriminant function for classification. In this paper, an improved SD is introduced to reduce the error rate of the standard SD in the context of a two-class classification problem. The learning procedure of the improved SD consists of two stages. Initially a standard SD, but with shorter learning period is carried out to identify an important space where all the misclassified samples are located. Then the standard SD is modified by 1) restricting sampling in the important space, and 2) introducing a new discriminant function for samples in the important space. It is shown by mathematical derivation that the new discriminant function has the same mean, but with a smaller variance than that of the standard SD for samples in the important space. It is also analyzed that the smaller the variance of the discriminant function, the lower the error rate of the classifier. Consequently, the proposed improved SD improves standard SD by its capability of achieving higher classification accuracy. Illustrative examples are provided to demonstrate the effectiveness of the proposed improved SD.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of the Heuristic Search Bayesian network learning. The Markov Blanket concept, as well as a proposed ""approximate Markov Blanket"" are used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of the heuristic search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confinn that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 Algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this study was to examine the reliability, validity and classification accuracy of the South Oaks Gambling Screen (SOGS) in a sample of the Brazilian population. Participants in this study were drawn from three sources: 71 men and women from the general population interviewed at a metropolitan train station; 116 men and women encountered at a bingo venue; and 54 men and women undergoing treatment for gambling. The SOGS and a DSM-IV-based instrument were applied by trained researchers. The internal consistency of the SOGS was 0.75 according to the Cronbach`s alpha model, and construct validity was good. A significant difference among groups was demonstrated by ANOVA (F ((2.238)) = 221.3, P < 0.001). The SOGS items and DSM-IV symptoms were highly correlated (r = 0.854, P < 0.01). The SOGS also presented satisfactory psychometric properties: sensitivity (100), specificity (74.7), positive predictive rate (60.7), negative predictive rate (100) and misclassification rate (0.18). However, a cut-off score of eight improved classification accuracy and reduced the rate of false positives: sensitivity (95.4), specificity (89.8), positive predictive rate (78.5), negative predictive rate (98) and misclassification rate (0.09). Thus, the SOGS was found to be reliable and valid in the Brazilian population.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]In this paper, we focus on gender recognition in challenging large scale scenarios. Firstly, we review the literature results achieved for the problem in large datasets, and select the currently hardest dataset: The Images of Groups. Secondly, we study the extraction of features from the face and its local context to improve the recognition accuracy. Diff erent descriptors, resolutions and classfii ers are studied, overcoming previous literature results, reaching an accuracy of 89.8%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Remote sensing data is routinely used in ecology to investigate the relationship between landscape pattern as characterised by land use and land cover maps, and ecological processes. Multiple factors related to the representation of geographic phenomenon have been shown to affect characterisation of landscape pattern resulting in spatial uncertainty. This study investigated the effect of the interaction between landscape spatial pattern and geospatial processing methods statistically; unlike most papers which consider the effect of each factor in isolation only. This is important since data used to calculate landscape metrics typically undergo a series of data abstraction processing tasks and are rarely performed in isolation. The geospatial processing methods tested were the aggregation method and the choice of pixel size used to aggregate data. These were compared to two components of landscape pattern, spatial heterogeneity and the proportion of landcover class area. The interactions and their effect on the final landcover map were described using landscape metrics to measure landscape pattern and classification accuracy (response variables). All landscape metrics and classification accuracy were shown to be affected by both landscape pattern and by processing methods. Large variability in the response of those variables and interactions between the explanatory variables were observed. However, even though interactions occurred, this only affected the magnitude of the difference in landscape metric values. Thus, provided that the same processing methods are used, landscapes should retain their ranking when their landscape metrics are compared. For example, highly fragmented landscapes will always have larger values for the landscape metric "number of patches" than less fragmented landscapes. But the magnitude of difference between the landscapes may change and therefore absolute values of landscape metrics may need to be interpreted with caution. The explanatory variables which had the largest effects were spatial heterogeneity and pixel size. These explanatory variables tended to result in large main effects and large interactions. The high variability in the response variables and the interaction of the explanatory variables indicate it would be difficult to make generalisations about the impact of processing on landscape pattern as only two processing methods were tested and it is likely that untested processing methods will potentially result in even greater spatial uncertainty. © 2013 Elsevier B.V.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This work proposes a new approach using a committee machine of artificial neural networks to classify masses found in mammograms as benign or malignant. Three shape factors, three edge-sharpness measures, and 14 texture measures are used for the classification of 20 regions of interest (ROIs) related to malignant tumors and 37 ROIs related to benign masses. A group of multilayer perceptrons (MLPs) is employed as a committee machine of neural network classifiers. The classification results are reached by combining the responses of the individual classifiers. Experiments involving changes in the learning algorithm of the committee machine are conducted. The classification accuracy is evaluated using the area A. under the receiver operating characteristics (ROC) curve. The A, result for the committee machine is compared with the A, results obtained using MLPs and single-layer perceptrons (SLPs), as well as a linear discriminant analysis (LDA) classifier Tests are carried out using the student's t-distribution. The committee machine classifier outperforms the MLP SLP, and LDA classifiers in the following cases: with the shape measure of spiculation index, the A, values of the four methods are, in order 0.93, 0.84, 0.75, and 0.76; and with the edge-sharpness measure of acutance, the values are 0.79, 0.70, 0.69, and 0.74. Although the features with which improvement is obtained with the committee machines are not the same as those that provided the maximal value of A(z) (A(z) = 0.99 with some shape features, with or without the committee machine), they correspond to features that are not critically dependent on the accuracy of the boundaries of the masses, which is an important result. (c) 2008 SPIE and IS&T.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

PURPOSE: Fatty liver disease (FLD) is an increasing prevalent disease that can be reversed if detected early. Ultrasound is the safest and ubiquitous method for identifying FLD. Since expert sonographers are required to accurately interpret the liver ultrasound images, lack of the same will result in interobserver variability. For more objective interpretation, high accuracy, and quick second opinions, computer aided diagnostic (CAD) techniques may be exploited. The purpose of this work is to develop one such CAD technique for accurate classification of normal livers and abnormal livers affected by FLD. METHODS: In this paper, the authors present a CAD technique (called Symtosis) that uses a novel combination of significant features based on the texture, wavelet transform, and higher order spectra of the liver ultrasound images in various supervised learning-based classifiers in order to determine parameters that classify normal and FLD-affected abnormal livers. RESULTS: On evaluating the proposed technique on a database of 58 abnormal and 42 normal liver ultrasound images, the authors were able to achieve a high classification accuracy of 93.3% using the decision tree classifier. CONCLUSIONS: This high accuracy added to the completely automated classification procedure makes the authors' proposed technique highly suitable for clinical deployment and usage.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Chronic Liver Disease is a progressive, most of the time asymptomatic, and potentially fatal disease. In this paper, a semi-automatic procedure to stage this disease is proposed based on ultrasound liver images, clinical and laboratorial data. In the core of the algorithm two classifiers are used: a k nearest neighbor and a Support Vector Machine, with different kernels. The classifiers were trained with the proposed multi-modal feature set and the results obtained were compared with the laboratorial and clinical feature set. The results showed that using ultrasound based features, in association with laboratorial and clinical features, improve the classification accuracy. The support vector machine, polynomial kernel, outperformed the others classifiers in every class studied. For the Normal class we achieved 100% accuracy, for the chronic hepatitis with cirrhosis 73.08%, for compensated cirrhosis 59.26% and for decompensated cirrhosis 91.67%.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In the last years, volunteers have been contributing massively to what we know nowadays as Volunteered Geographic Information. This huge amount of data might be hiding a vast geographical richness and therefore research needs to be conducted to explore their potential and use it in the solution of real world problems. In this study we conduct an exploratory analysis of data from the OpenStreetMap initiative. Using the Corine Land Cover database as reference and continental Portugal as the study area, we establish a possible correspondence between both classification nomenclatures, evaluate the quality of OpenStreetMap polygon features classification against Corine Land Cover classes from level 1 nomenclature, and analyze the spatial distribution of OpenStreetMap classes over continental Portugal. A global classification accuracy around 76% and interesting coverage areas’ values are remarkable and promising results that encourages us for future research on this topic.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Introduction: Responses to external stimuli are typically investigated by averaging peri-stimulus electroencephalography (EEG) epochs in order to derive event-related potentials (ERPs) across the electrode montage, under the assumption that signals that are related to the external stimulus are fixed in time across trials. We demonstrate the applicability of a single-trial model based on patterns of scalp topographies (De Lucia et al, 2007) that can be used for ERP analysis at the single-subject level. The model is able to classify new trials (or groups of trials) with minimal a priori hypotheses, using information derived from a training dataset. The features used for the classification (the topography of responses and their latency) can be neurophysiologically interpreted, because a difference in scalp topography indicates a different configuration of brain generators. An above chance classification accuracy on test datasets implicitly demonstrates the suitability of this model for EEG data. Methods: The data analyzed in this study were acquired from two separate visual evoked potential (VEP) experiments. The first entailed passive presentation of checkerboard stimuli to each of the four visual quadrants (hereafter, "Checkerboard Experiment") (Plomp et al, submitted). The second entailed active discrimination of novel versus repeated line drawings of common objects (hereafter, "Priming Experiment") (Murray et al, 2004). Four subjects per experiment were analyzed, using approx. 200 trials per experimental condition. These trials were randomly separated in training (90%) and testing (10%) datasets in 10 independent shuffles. In order to perform the ERP analysis we estimated the statistical distribution of voltage topographies by a Mixture of Gaussians (MofGs), which reduces our original dataset to a small number of representative voltage topographies. We then evaluated statistically the degree of presence of these template maps across trials and whether and when this was different across experimental conditions. Based on these differences, single-trials or sets of a few single-trials were classified as belonging to one or the other experimental condition. Classification performance was assessed using the Receiver Operating Characteristic (ROC) curve. Results: For the Checkerboard Experiment contrasts entailed left vs. right visual field presentations for upper and lower quadrants, separately. The average posterior probabilities, indicating the presence of the computed template maps in time and across trials revealed significant differences starting at ~60-70 ms post-stimulus. The average ROC curve area across all four subjects was 0.80 and 0.85 for upper and lower quadrants, respectively and was in all cases significantly higher than chance (unpaired t-test, p<0.0001). In the Priming Experiment, we contrasted initial versus repeated presentations of visual object stimuli. Their posterior probabilities revealed significant differences, which started at 250ms post-stimulus onset. The classification accuracy rates with single-trial test data were at chance level. We therefore considered sub-averages based on five single trials. We found that for three out of four subjects' classification rates were significantly above chance level (unpaired t-test, p<0.0001). Conclusions: The main advantage of the present approach is that it is based on topographic features that are readily interpretable along neurophysiologic lines. As these maps were previously normalized by the overall strength of the field potential on the scalp, a change in their presence across trials and between conditions forcibly reflects a change in the underlying generator configurations. The temporal periods of statistical difference between conditions were estimated for each training dataset for ten shuffles of the data. Across the ten shuffles and in both experiments, we observed a high level of consistency in the temporal periods over which the two conditions differed. With this method we are able to analyze ERPs at the single-subject level providing a novel tool to compare normal electrophysiological responses versus single cases that cannot be considered part of any cohort of subjects. This aspect promises to have a strong impact on both basic and clinical research.