89 results for Naïve Bayes classifier
Abstract:
Design of speaker identification schemes for a small number of speakers (around 10) with a high degree of accuracy in a controlled environment is a practical proposition today. When the number of speakers is large (say 50–100), many of these schemes cannot be directly extended, as both recognition error and computation time increase monotonically with population size. The feature selection problem is also complex for such schemes. Though there were earlier attempts to rank-order features based on statistical distance measures, it has been observed only recently that the two best independent measurements are not necessarily the best pair of measurements for pattern classification. We propose here a systematic approach to the problem using the decision tree or hierarchical classifier with the following objectives: (1) Design of an optimal policy at each node of the tree given the tree structure, i.e., the tree skeleton and the features to be used at each node. (2) Determination of the optimal feature measurement and decision policy given only the tree skeleton. Applicability of optimization procedures such as dynamic programming in the design of such trees is studied. The experimental results deal with the design of a 50-speaker identification scheme based on this approach.
Abstract:
Inovirus is a helical array of alpha-helical protein asymmetric units surrounding a DNA core. X-ray fibre diffraction studies show that the Pf1 species of Inovirus can undergo a reversible temperature-induced transition between two similar structural forms having slightly different virion helix parameters. Molecular models of the two forms show no evidence for altered interactions between the protein and either the solvent or the viral DNA; but there are significant differences in the shape and orientation of the protein asymmetric unit, related to the changes in the virion parameters. Normal modes involving libration of whole asymmetric units are in a frequency range with appreciable entropy of libration, and the structural transition may be related to changes in libration.
Abstract:
Inovirus is a helical array of alpha-helical protein asymmetric units surrounding a DNA core. X-ray fibre diffraction studies show that the Pf1 species of Inovirus can undergo a reversible temperature-induced transition between two similar structural forms having slightly different virion helix parameters. Molecular models of the two forms show no evidence for altered interactions between the protein and either the solvent or the viral DNA; but there are significant differences in the shape and orientation of the protein asymmetric unit, related to the changes in the virion parameters. Normal modes involving libration of whole asymmetric units are in a frequency range with appreciable entropy of libration, and the structural transition may be related to changes in libration.
Abstract:
A 4 Å electron-density map of Pf1 filamentous bacterial virus has been calculated from x-ray fiber diffraction data by using the maximum-entropy method. This method produces a map that is free of features due to noise in the data and enables incomplete isomorphous-derivative phase information to be supplemented by information about the nature of the solution. The map shows gently curved (banana-shaped) rods of density about 70 Å long, oriented roughly parallel to the virion axis but slewing by about 1/6th turn while running from a radius of 28 Å to one of 13 Å. Within these rods, there is a helical periodicity with a pitch of 5 to 6 Å. We interpret these rods to be the helical subunits of the virion. The position of strongly diffracted intensity on the x-ray fiber pattern shows that the basic helix of the virion is right handed and that neighboring nearly parallel protein helices cross one another in an unusual negative sense.
Abstract:
In this paper we present a novel macroblock mode decision algorithm to speed up H.264/SVC Intra frame encoding. We replace the complex mode-decision calculations by a classifier which has been trained specifically to minimize the reduction in RD performance. This results in a significant speedup in encoding. The results show that machine learning has great potential and can reduce the complexity substantially with negligible impact on quality: the proposed method reduces encoding time to about 70% in the base layer and up to 50% in the enhancement layer of the reference implementation.
Abstract:
This work describes an online handwritten character recognition system working in combination with an offline recognition system. The online input data is also converted into an offline image and recognized in parallel by both online and offline strategies. Features are proposed for offline recognition, and a disambiguation step is employed in the offline system for the samples for which the confidence level of the classifier is low. The outputs are then combined probabilistically, resulting in a classifier that outperforms both individual systems. Experiments are performed for Kannada, a South Indian language, over a database of 295 classes. The accuracy of the online recognizer improves by 11% when the combination with the offline system is used.
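The abstract above does not say which rule is used to combine the online and offline outputs. As a minimal sketch, assuming each recognizer returns a posterior probability per class and that the two are treated as conditionally independent (the same assumption a naïve Bayes classifier makes), a product-rule combination could look like this. The function name, class labels and probabilities are hypothetical and chosen only for illustration.

```python
def combine_posteriors(online, offline):
    """Product-rule fusion of two per-class posterior dictionaries.

    Assumption (not from the abstract): both recognizers emit posteriors
    over the same label set and are treated as conditionally independent.
    """
    labels = online.keys() & offline.keys()
    product = {label: online[label] * offline[label] for label in labels}
    total = sum(product.values())
    if total == 0:
        raise ValueError("the two recognizers share no class with non-zero probability")
    # Renormalize so the fused scores sum to one.
    return {label: p / total for label, p in product.items()}


# Hypothetical posteriors for one character sample.
online = {"ka": 0.5, "ga": 0.4, "ha": 0.1}
offline = {"ka": 0.3, "ga": 0.6, "ha": 0.1}

combined = combine_posteriors(online, offline)
best = max(combined, key=combined.get)
print(best, round(combined[best], 3))
```

In this toy case the online recognizer alone would pick "ka", while the fused score favours "ga", which illustrates how a combination can overturn a weak decision by one system.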
Abstract:
In this paper, we propose a novel dexterous technique for fast and accurate recognition of online handwritten Kannada and Tamil characters. Based on the primary classifier output and prior knowledge, the best classifier is chosen from a set of three classifiers for second-stage classification. Prior knowledge is obtained through analysis of the confusion matrix of the primary classifier, which helped in identifying the multiple sets of confused characters. Further, studies were carried out to check the performance of the secondary classifiers in disambiguating among the confusion sets. Using this technique we have achieved an average accuracy of 92.6% for Kannada characters on the MILE lab dataset and 90.2% for Tamil characters on the HP Labs dataset.
Abstract:
This paper is concerned with off-line signature verification. Four different types of pattern representation schemes have been implemented, viz., geometric features, moment-based representations, envelope characteristics and tree-structured wavelet features. The individual feature components in a representation are weighted by their pattern characterization capability using Genetic Algorithms. The conclusions of the four subsystems (each depending on a representation scheme) are combined to form a final decision on the validity of the signature. Threshold-based classifiers (including the traditional confidence-interval classifier), neighbourhood classifiers and their combinations were studied. Benefits of using forged signatures for training purposes have been assessed. Experimental results show that combination of the feature-based classifiers increases verification accuracy. (C) 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
Abstract:
In this paper, we show that it is possible to reduce the complexity of Intra MB coding in H.264/AVC based on a novel chance-constrained classifier. Using pairs of simple mean-variance values, our technique is able to reduce the complexity of the Intra MB coding process with a negligible loss in PSNR. We present an alternative approach to the classification problem that is equivalent to machine learning. Implementation results show that the proposed method reduces encoding time to about 20% of the reference implementation with an average loss of 0.05 dB in PSNR.
Abstract:
Due to its wide applicability, semi-supervised learning is an attractive method for using unlabeled data in classification. In this work, we present a semi-supervised support vector classifier that is designed using a quasi-Newton method for nonsmooth convex functions. The proposed algorithm is suitable for dealing with a very large number of examples and features. Numerical experiments on various benchmark datasets showed that the proposed algorithm is fast and gives improved generalization performance over the existing methods. Further, a non-linear semi-supervised SVM has been proposed based on a multiple label switching scheme. This non-linear semi-supervised SVM is found to converge faster and to improve generalization performance on several benchmark datasets. (C) 2010 Elsevier Ltd. All rights reserved.
Abstract:
This paper introduces a scheme for classification of online handwritten characters based on polynomial regression of the sampled points of the sub-strokes in a character. The segmentation is done based on the velocity profile of the written character, and this requires smoothing of the velocity profile. We propose a novel scheme for smoothing the velocity profile curve and identifying the critical points at which to segment the character. We also propose another method for segmentation based on human eye perception. We then extract two sets of features for recognition of handwritten characters. Each sub-stroke is a simple curve, a part of the character, and is represented by the distance of each point from the first point. This forms the first feature vector for each character. The second feature vector consists of the coefficients obtained from the B-splines fitted to the control knots obtained from the segmentation algorithm. The feature vector is fed to the SVM classifier, which gives an efficiency of 68% using the polynomial regression technique and 74% using the spline fitting method.
Abstract:
Growing concern over the status of global and regional bioenergy resources has necessitated the analysis and monitoring of land cover and land use parameters on spatial and temporal scales. The knowledge of land cover and land use is very important in understanding natural resources utilization, conversion and management. Land cover, land use intensity and land use diversity are land quality indicators for sustainable land management. Optimal management of resources aids in maintaining the ecosystem balance and thereby ensures the sustainable development of a region. Thus sustainable development of a region requires a synoptic ecosystem approach in the management of natural resources that relates to the dynamics of natural variability and the effects of human intervention on key indicators of biodiversity and productivity. Spatial and temporal tools such as remote sensing (RS), geographic information system (GIS) and global positioning system (GPS) provide spatial and attribute data at regular intervals and, with the functionalities of a decision support system, aid in visualisation, querying, analysis, etc., which would aid in sustainable management of natural resources. Remote sensing data and GIS technologies play an important role in spatially evaluating bioresource availability and demand. This paper explores various land cover and land use techniques that could be used for bioresources monitoring considering the spatial data of Kolar district, Karnataka state, India. Slope and distance based vegetation indices are computed for qualitative and quantitative assessment of land cover using remote spectral measurements. Different-scale mapping of the land use pattern in Kolar district is done using supervised classification approaches. Slope based vegetation indices show that the area under vegetation ranges from 47.65% to 49.05%, while distance based vegetation indices show a range from 40.40% to 47.41%.
Land use analyses using the maximum likelihood classifier indicate that 46.69% is agricultural land, 42.33% is wasteland (barren land), 4.62% is built up, 3.07% is plantation, 2.77% is natural forest and 0.53% is water bodies. The comparative analysis of various classifiers indicates that the Gaussian maximum likelihood classifier has the least errors. The computation of taluk-wise bioresource status shows that Chikballapur Taluk has better availability of resources compared to other taluks in the district.
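The abstract does not name the specific vegetation indices it uses. As a rough illustration of how a slope based index yields an "area under vegetation" percentage, the sketch below computes NDVI, a widely used slope based index, from near-infrared and red reflectance and counts the pixels above a cut-off. The threshold of 0.2 and the reflectance values are assumptions made for the example, not figures from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; treat as non-vegetated
    return (nir - red) / total


def vegetation_fraction(nir_band, red_band, threshold=0.2):
    """Fraction of pixels whose NDVI exceeds the threshold.

    The threshold is a hypothetical cut-off; in practice it is tuned
    per scene and sensor.
    """
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    vegetated = sum(1 for v in values if v > threshold)
    return vegetated / len(values)


# Hypothetical reflectances for four pixels.
nir_band = [0.50, 0.45, 0.20, 0.10]
red_band = [0.10, 0.15, 0.18, 0.12]

fraction = vegetation_fraction(nir_band, red_band)
print(f"{fraction:.0%} of pixels classified as vegetation")
```

Distance based indices follow the same pattern but measure each pixel's distance from a soil line in red/near-infrared space, which is one reason the two families can report different vegetation percentages for the same scene.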
Abstract:
Support Vector Clustering has gained reasonable attention from researchers in exploratory data analysis due to its firm theoretical foundation in statistical learning theory. The hard partitioning of the data set achieved by Support Vector Clustering may not be acceptable in real-world scenarios. Rough Support Vector Clustering is an extension of Support Vector Clustering that attains a soft partitioning of the data set. But the Quadratic Programming Problem involved in Rough Support Vector Clustering makes it computationally expensive for large datasets. In this paper, we propose the Rough Core Vector Clustering algorithm, which is a computationally efficient realization of Rough Support Vector Clustering. Here the Rough Support Vector Clustering problem is formulated as an approximate Minimum Enclosing Ball problem and is solved using an approximate Minimum Enclosing Ball finding algorithm. Experiments with several large multi-class datasets, such as Forest Cover Type and other multi-class datasets taken from the LIBSVM page, show that the proposed strategy is efficient and finds meaningful soft cluster abstractions that provide superior generalization performance compared with the SVM classifier.
Abstract:
This paper investigates a new Glowworm Swarm Optimization (GSO) clustering algorithm for hierarchical splitting and merging of automatic multi-spectral satellite image classification (land cover mapping problem). Amongst the multiple benefits and uses of remote sensing, one of the most important has been its use in solving the problem of land cover mapping. Image classification forms the core of the solution to the land cover mapping problem. No single classifier can classify all the basic land cover classes of an urban region in a satisfactory manner. In unsupervised classification methods, the automatic generation of clusters to classify a huge database is not exploited to its full potential. The proposed methodology searches for the best possible number of clusters and their centers using Glowworm Swarm Optimization (GSO). Using these clusters, we classify by merging based on a parametric method (k-means technique). The performance of the proposed unsupervised classification technique is evaluated for a Landsat 7 thematic mapper image. Results are evaluated in terms of the classification efficiency: individual, average and overall.
Abstract:
We present a fractal coding method to recognize online handwritten Tamil characters and propose a novel technique to increase the efficiency in terms of time while coding and decoding. This technique exploits the redundancy in data, thereby achieving better compression and lower memory usage. It also reduces the encoding time and causes little distortion during reconstruction. Experiments have been conducted to use these fractal codes to classify the online handwritten Tamil characters from the IWFHR 2006 competition dataset. In one approach, we use the fractal coding and decoding process. A recognition accuracy of 90% has been achieved by using DTW for distortion evaluation during the classification and encoding processes, as compared to 78% using a nearest neighbor classifier. In other experiments, we use the fractal code, fractal dimensions and features derived from fractal codes as features in separate classifiers. While the fractal code is successful as a feature, the other two features are not able to capture the wide within-class variations.
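To make the role of DTW in the abstract above more concrete, here is a minimal sketch of dynamic time warping used as the distance in a template-matching classifier. It is a simplification under stated assumptions: the sequences are one-dimensional numbers rather than pen trajectories or fractal codes, the local cost is the absolute difference, and the template labels are hypothetical. It is not the authors' pipeline.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance."""
    label, _ = min(templates, key=lambda item: dtw_distance(sample, item[1]))
    return label


# Hypothetical templates: (label, sequence).
templates = [
    ("a", [0, 1, 2, 1, 0]),
    ("b", [2, 1, 0, 1, 2]),
]
sample = [0, 1, 1, 2, 1, 0]  # template "a" written more slowly

print(classify(sample, templates))
```

Because DTW lets one sequence stretch against the other, the slower sample still aligns with template "a" at zero cost, whereas a point-by-point distance would require resampling both sequences to the same length first.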