898 resultados para classification accuracy


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Automatic generation of classification rules has been an increasingly popular technique in commercial applications such as Big Data analytics, rule based expert systems and decision making systems. However, a principal problem that arises with most methods for generation of classification rules is the overfit-ting of training data. When Big Data is dealt with, this may result in the generation of a large number of complex rules. This may not only increase computational cost but also lower the accuracy in predicting further unseen instances. This has led to the necessity of developing pruning methods for the simplification of rules. In addition, classification rules are used further to make predictions after the completion of their generation. As efficiency is concerned, it is expected to find the first rule that fires as soon as possible by searching through a rule set. Thus a suit-able structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for construction of rule based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They introduce some novel methods and techniques developed by them recently. These methods and techniques are also discussed in comparison to existing ones with respect to efficient processing of Big Data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Various complex oscillatory processes are involved in the generation of the motor command. The temporal dynamics of these processes were studied for movement detection from single trial electroencephalogram (EEG). Autocorrelation analysis was performed on the EEG signals to find robust markers of movement detection. The evolution of the autocorrelation function was characterised via the relaxation time of the autocorrelation by exponential curve fitting. It was observed that the decay constant of the exponential curve increased during movement, indicating that the autocorrelation function decays slowly during motor execution. Significant differences were observed between movement and no moment tasks. Additionally, a linear discriminant analysis (LDA) classifier was used to identify movement trials with a peak accuracy of 74%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware and software technologies allow to capture streaming data. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as it is generated in real-time. Data stream classification is one of the most important DSM techniques allowing to classify previously unseen data instances. Different to traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits a poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and fast exist, however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary based on Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also MC-NN is competitive compared with existing data stream classifiers in terms of accuracy and speed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The personalised conditioning system (PCS) is widely studied. Potentially, it is able to reduce energy consumption while securing occupants’ thermal comfort requirements. It has been suggested that automatic optimised operation schemes for PCS should be introduced to avoid energy wastage and discomfort caused by inappropriate operation. In certain automatic operation schemes, personalised thermal sensation models are applied as key components to help in setting targets for PCS operation. In this research, a novel personal thermal sensation modelling method based on the C-Support Vector Classification (C-SVC) algorithm has been developed for PCS control. The personal thermal sensation modelling has been regarded as a classification problem. During the modelling process, the method ‘learns’ an occupant’s thermal preferences from his/her feedback, environmental parameters and personal physiological and behavioural factors. The modelling method has been verified by comparing the actual thermal sensation vote (TSV) with the modelled one based on 20 individual cases. Furthermore, the accuracy of each individual thermal sensation model has been compared with the outcomes of the PMV model. The results indicate that the modelling method presented in this paper is an effective tool to model personal thermal sensations and could be integrated within the PCS for optimised system operation and control.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, often unfeasible in many real image processing applications. Markov Random Field model parameters are estimated by Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen`s Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Condition monitoring of wooden railway sleepers applications are generallycarried out by visual inspection and if necessary some impact acoustic examination iscarried out intuitively by skilled personnel. In this work, a pattern recognition solutionhas been proposed to automate the process for the achievement of robust results. Thestudy presents a comparison of several pattern recognition techniques together withvarious nonstationary feature extraction techniques for classification of impactacoustic emissions. Pattern classifiers such as multilayer perceptron, learning cectorquantization and gaussian mixture models, are combined with nonstationary featureextraction techniques such as Short Time Fourier Transform, Continuous WaveletTransform, Discrete Wavelet Transform and Wigner-Ville Distribution. Due to thepresence of several different feature extraction and classification technqies, datafusion has been investigated. Data fusion in the current case has mainly beeninvestigated on two levels, feature level and classifier level respectively. Fusion at thefeature level demonstrated best results with an overall accuracy of 82% whencompared to the human operator.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Parkinson's disease (PD) is a degenerative illness whose cardinal symptoms include rigidity, tremor, and slowness of movement. In addition to its widely recognized effects PD can have a profound effect on speech and voice.The speech symptoms most commonly demonstrated by patients with PD are reduced vocal loudness, monopitch, disruptions of voice quality, and abnormally fast rate of speech. This cluster of speech symptoms is often termed Hypokinetic Dysarthria.The disease can be difficult to diagnose accurately, especially in its early stages, due to this reason, automatic techniques based on Artificial Intelligence should increase the diagnosing accuracy and to help the doctors make better decisions. The aim of the thesis work is to predict the PD based on the audio files collected from various patients.Audio files are preprocessed in order to attain the features.The preprocessed data contains 23 attributes and 195 instances. On an average there are six voice recordings per person, By using data compression technique such as Discrete Cosine Transform (DCT) number of instances can be minimized, after data compression, attribute selection is done using several WEKA build in methods such as ChiSquared, GainRatio, Infogain after identifying the important attributes, we evaluate attributes one by one by using stepwise regression.Based on the selected attributes we process in WEKA by using cost sensitive classifier with various algorithms like MultiPass LVQ, Logistic Model Tree(LMT), K-Star.The classified results shows on an average 80%.By using this features 95% approximate classification of PD is acheived.This shows that using the audio dataset, PD could be predicted with a higher level of accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the service life of water supply network (WSN) growth, the growing phenomenon of aging pipe network has become exceedingly serious. As urban water supply network is hidden underground asset, it is difficult for monitoring staff to make a direct classification towards the faults of pipe network by means of the modern detecting technology. In this paper, based on the basic property data (e.g. diameter, material, pressure, distance to pump, distance to tank, load, etc.) of water supply network, decision tree algorithm (C4.5) has been carried out to classify the specific situation of water supply pipeline. Part of the historical data was used to establish a decision tree classification model, and the remaining historical data was used to validate this established model. Adopting statistical methods were used to access the decision tree model including basic statistical method, Receiver Operating Characteristic (ROC) and Recall-Precision Curves (RPC). These methods has been successfully used to assess the accuracy of this established classification model of water pipe network. The purpose of classification model was to classify the specific condition of water pipe network. It is important to maintain the pipeline according to the classification results including asset unserviceable (AU), near perfect condition (NPC) and serious deterioration (SD). Finally, this research focused on pipe classification which plays a significant role in maintaining water supply networks in the future.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper reports on a sensor array able to distinguish tastes and used to classify red wines. The array comprises sensing units made from Langmuir-Blodgett (LB) films of conducting polymers and lipids and layer-by-layer (LBL) films from chitosan deposited onto gold interdigitated electrodes. Using impedance spectroscopy as the principle of detection, we show that distinct clusters can be identified in principal component analysis (PCA) plots for six types of red wine. Distinction can be made with regard to vintage, vineyard and brands of the red wine. Furthermore, if the data are treated with artificial neural networks (ANNs), this artificial tongue can identify wine samples stored under different conditions. This is illustrated by considering 900 wine samples, obtained with 30 measurements for each of the five bottles of the six wines, which could be recognised with 100% accuracy using the algorithms Standard Backpropagation and Backpropagation momentum in the ANNs. (C) 2003 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objectives:The aim of this in vitro study was to assess the inter- and intra-examiner reproducibility and the accuracy of the International Caries Detection and Assessment System-II (ICDAS-II) in detecting occlusal caries.Methods:One hundred and sixty-three molars were independently assessed twice by two experienced dentists using the 0- to 6-graded ICDAS-II. The teeth were histologically prepared and classified using two different histological systems [Ekstrand et al. (1997) Caries Research vol. 31, pp. 224-231; Lussi et al. (1999) Caries Research vol. 33, pp. 261-266] and assessed for caries extension. Sensitivity, specificity, accuracy and area under the ROC curve (A(z)) were obtained at D(2) and D(3) thresholds. Unweighted kappa coefficient was used to assess inter- and intra-examiner reproducibility.Results:For the Ekstrand et al. histological classification the sensitivity was 0.99 and 1.00, specificity 1.00 and 0.69 and accuracy 0.99 and 0.76 at D(2) and D(3), respectively. For the Lussi et al. histological classification the sensitivity was 0.91 and 0.75, specificity 0.47 and 0.62 and accuracy 0.86 and 0.68 at D(2) and D(3), respectively. The A(z) varied from 0.54 to 0.73. The inter- and intra-examiner kappa values were 0.51 and 0.58, respectively.Conclusions:ICDAS-II presented good reproducibility and accuracy in detecting occlusal caries, especially caries lesions in the outer half of the enamel.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objectives: (1) To evaluate the intraobserver agreement related to image interpretation and (2) to compare the accuracy of 100%, 200% and 400% zoomed digital images in the detection of simulated periodontal bone defects.Methods: Periodontal bone defects were created in 60 pig hemi-mandibles with slow-speed burs 0.5 mm, 1.0 mm, 1.5 mm, 2.0 mm and 3.0 mm in diameter. 180 standardized digital radiographs were made using Schick sensor and evaluated at 100%, 200% and 400% zooming. The intraobserver agreement was estimated by Kappa statistic (kappa). For the evaluation of diagnostic accuracy receiver operating characteristic (ROC) analysis was performed followed by chi-square test to compare the areas under ROC curves according to each level of zooming.Results: For 100%, 200% and 400% zooming the intraobserver agreement was moderate (kappa = 0.48, kappa = 0.54 and kappa = 0.43, respectively) and there were similar performances in the discrimination capacity, with ROC areas of 0.8611 (95% CI: 0.7660-0.9562), 0.8600 (95% CI: 0.7659-0.9540), and 0.8368 (95% CI: 0.7346-0.9390), respectively, with no statistical significant differences (chi(2)-test; P = 0.8440).Conclusions: A moderate intraobserver agreement was observed in the classification of periodontal bone defects and the 100%, 200% and 400% zoomed digital images presented similar performances in the detection of periodontal bone defects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Majority of biometric researchers focus on the accuracy of matching using biometrics databases, including iris databases, while the scalability and speed issues have been neglected. In the applications such as identification in airports and borders, it is critical for the identification system to have low-time response. In this paper, a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), is utilized as a classifier in a pre-developed iris recognition system. The aim of this paper is to verify the effectiveness of OPF in the field of iris recognition, and its performance for various scale iris databases. This paper investigates several classifiers, which are widely used in iris recognition papers, and the response time along with accuracy. The existing Gauss-Laguerre Wavelet based iris coding scheme, which shows perfect discrimination with rotary Hamming distance classifier, is used for iris coding. The performance of classifiers is compared using small, medium, and large scale databases. Such comparison shows that OPF has faster response for large scale database, thus performing better than more accurate but slower Bayesian classifier.