857 results for Tribal classification
Abstract:
The widely used Bayesian classifier assumes equal prior probabilities for all classes. However, equal priors may not guarantee high classification accuracy for individual classes. Here, we propose a novel technique, the Hybrid Bayesian Classifier (HBC), in which the class prior probabilities are determined by unmixing supplemental low-spatial, high-spectral resolution multispectral (MS) data and are then assigned to every pixel of high-spatial, low-spectral resolution MS data for Bayesian classification. This is demonstrated with two separate experiments. In the first, class abundances are estimated per pixel by unmixing Moderate Resolution Imaging Spectroradiometer (MODIS) data and used as prior probabilities, while posterior probabilities are determined from training data collected on the ground; these are used to classify Indian Remote Sensing Satellite LISS-III MS data with the Bayesian classifier. In the second experiment, abundances obtained by unmixing Landsat Enhanced Thematic Mapper Plus data are used as priors, and posterior probabilities are determined from the ground data to classify IKONOS MS images with the Bayesian classifier. The results indicate that HBC systematically exploits the information from the two image sources, improving the overall accuracy of LISS-III MS classification by 6% and of IKONOS MS classification by 9%. Including prior probabilities increased the average producer's and user's accuracies by 5.5% and 6.5% for LISS-III MS with six classes, and by 12.5% and 5.4% for IKONOS MS with five classes.
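The role of per-pixel priors described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and all numeric values are hypothetical, and the quantity estimated from the training data is treated here as a class-conditional likelihood, which is an assumption made for the illustration.

```python
def posterior_with_priors(likelihoods, priors):
    """Combine class likelihoods with per-pixel priors using Bayes' rule."""
    weighted = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("all classes have zero probability for this pixel")
    return [w / total for w in weighted]


def classify_pixel(likelihoods, priors):
    """Return the index of the class with the highest posterior."""
    post = posterior_with_priors(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical values for one pixel and three classes.
likelihoods = [0.30, 0.25, 0.05]        # assumed to come from training data
equal_priors = [1 / 3, 1 / 3, 1 / 3]    # conventional equal-prior assumption
abundance_priors = [0.10, 0.80, 0.10]   # assumed abundances from unmixing

label_equal = classify_pixel(likelihoods, equal_priors)
label_hybrid = classify_pixel(likelihoods, abundance_priors)
```

With these made-up numbers the equal-prior rule picks the first class, while the abundance-based priors shift the decision to the second class, which is the kind of effect the abstract attributes to the supplemental data.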
Abstract:
In this paper, we give a brief review of pattern classification algorithms based on discriminant analysis. We then apply these algorithms to classify movement direction based on multivariate local field potentials recorded from a microelectrode array in the primary motor cortex of a monkey performing a reaching task. Using different methods, we obtain prediction accuracies between 55% and 90%, all significantly above the chance level of 12.5%.
Abstract:
Proving the unsatisfiability of propositional Boolean formulas has applications in a wide range of fields. Minimal Unsatisfiable Sets (MUS) are signatures of the property of unsatisfiability in formulas and our understanding of these signatures can be very helpful in answering various algorithmic and structural questions relating to unsatisfiability. In this paper, we explore some combinatorial properties of MUS and use them to devise a classification scheme for MUS. We also derive bounds on the sizes of MUS in Horn, 2-SAT and 3-SAT formulas.
Abstract:
In this paper, we consider the problem of time series classification. Using piecewise linear interpolation, we obtain various novel kernels that can be used with support vector machines to design classifiers capable of deciding the class of a given time series. The approach is general and applicable in many scenarios. We apply the method to the task of online Tamil handwritten character recognition, with promising results.
Abstract:
This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time series analysis of satellite images, using pixel spectral information for image classification and region-based segmentation for extracting water-covered regions. MODIS satellite images are analyzed in three stages: before, during and after the flood. Water regions are extracted from the MODIS images using image classification (based on spectral information) and image segmentation (based on spatial information). Multi-temporal MODIS images from "normal" (non-flood) and flood time periods are processed in two steps. In the first step, image classifiers such as Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs) separate the image pixels into water and non-water groups based on their spectral features. In the second step, the classified image is segmented using spatial features of the water pixels to remove misclassified water pixels. From the results obtained, we evaluate the performance of the method and conclude that combining image classification (SVM and ANN) with region-based image segmentation is an accurate and reliable approach for extracting water-covered regions. (c) 2012 COSPAR. Published by Elsevier Ltd. All rights reserved.
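The second step, cleaning a classified water mask using spatial information, might be sketched as follows. The abstract does not state the segmentation rule, so the rule used here (dropping 4-connected water regions smaller than a size threshold) is an assumption, as are the function name and the toy mask.

```python
from collections import deque


def remove_small_water_regions(mask, min_size):
    """Keep only 4-connected regions of 1s that contain at least min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one connected water region.
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions below the threshold are treated as misclassified.
            if len(region) >= min_size:
                for y, x in region:
                    cleaned[y][x] = 1
    return cleaned


# Hypothetical classifier output: 1 = water, 0 = non-water.
classified = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
cleaned = remove_small_water_regions(classified, min_size=3)
```

Here the isolated water pixel in the second row is removed, while the four-pixel block is kept.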
Abstract:
In this paper we study the problem of designing SVM classifiers when the kernel matrix, K, is affected by uncertainty. Specifically, K is modeled as a positive affine combination of given positive semidefinite kernels, with the coefficients ranging in a norm-bounded uncertainty set. We treat the problem using the Robust Optimization methodology. This reduces the uncertain SVM problem to a deterministic conic quadratic problem, which can be solved in principle by a polynomial-time Interior Point (IP) algorithm. However, for large-scale classification problems, IP methods become intractable and one has to resort to first-order gradient-type methods. The strategy we use here is to reformulate the robust counterpart of the uncertain SVM problem as a saddle point problem and employ a special gradient scheme which works directly on the convex-concave saddle function. The algorithm is a simplified version of a general scheme due to Juditski and Nemirovski (2011). It achieves an O(1/T^2) reduction of the initial error after T iterations. A comprehensive empirical study on both synthetic data and real-world protein structure data sets shows that the proposed formulations achieve the desired robustness, and that the saddle-point-based algorithm significantly outperforms the IP method.
Abstract:
In the design of practical web page classification systems, one often encounters a situation in which the labeled training set is created by choosing some examples from each class, but the class proportions in this set are not the same as those in the test distribution to which the classifier will actually be applied. The problem is made worse when the amount of training data is also small. In this paper we explore and adapt binary SVM methods that make use of unlabeled data from the test distribution, viz., Transductive SVMs (TSVMs) and expectation regularization/constraint (ER/EC) methods, to deal with this situation. We empirically show that when the labeled training data is small, a TSVM designed using the class ratio tuned by minimizing the loss on the labeled set yields the best performance; its performance is good even when the deviation between the class ratios of the labeled training set and the test set is quite large. When the labeled training data is sufficiently large, an unsupervised Gaussian mixture model can be used to get a very good estimate of the class ratio in the test set; when this estimate is used, both TSVM and ER/EC give their best possible performance, with TSVM coming out superior. The ideas in the paper can be easily extended to multi-class SVMs and MaxEnt models.
Abstract:
The present approach uses stopwords and the gaps that occur between successive stopwords (formed by content words) as features for sentiment classification.
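One way to read this feature scheme is sketched below. The abstract gives no details, so the stopword list, the whitespace tokenization, and the choice to encode each gap as the number of content words it contains are all assumptions made for illustration.

```python
# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "is", "a", "of", "and", "but", "not", "it", "was"}


def stopword_gap_features(text):
    """Return stopwords and the sizes of the content-word gaps between them."""
    features = []
    gap = 0                 # content words seen since the last stopword
    seen_stopword = False
    for token in text.lower().split():
        if token in STOPWORDS:
            if seen_stopword:
                # Record the gap between the previous stopword and this one.
                features.append(("gap", gap))
            features.append(("stop", token))
            seen_stopword = True
            gap = 0
        else:
            gap += 1
    return features


features = stopword_gap_features("The movie was not good but the acting was great")
```

Content words before the first stopword or after the last one are ignored in this sketch, since they do not lie between two successive stopwords.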
Abstract:
Time series classification deals with the problem of classification of data that is multivariate in nature. This means that one or more of the attributes is in the form of a sequence. The notion of similarity or distance, used in time series data, is significant and affects the accuracy, time, and space complexity of the classification algorithm. There exist numerous similarity measures for time series data, but each of them has its own disadvantages. Instead of relying upon a single similarity measure, our aim is to find the near optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weightage given to different similarity measures evolves over a number of generations so as to get the best combination. We test our approach on a number of benchmark time series datasets and present promising results.
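The idea of weighting several similarity measures can be sketched as below. This is only an illustration under stated assumptions: the two measures (Euclidean and Manhattan distance on equal-length series), the 1-nearest-neighbor classifier, and the leave-one-out accuracy used as a fitness value are choices made here, not details from the abstract, and the genetic search over the weights is omitted.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Assumed pool of measures; the abstract does not name the ones combined.
MEASURES = (euclidean, manhattan)


def combined_distance(a, b, weights):
    """Weighted sum of the individual distance measures."""
    return sum(w * m(a, b) for w, m in zip(weights, MEASURES))


def loo_accuracy(series, labels, weights):
    """Leave-one-out 1-NN accuracy; a candidate fitness for a weight vector."""
    correct = 0
    for i, query in enumerate(series):
        nearest = min(
            (j for j in range(len(series)) if j != i),
            key=lambda j: combined_distance(query, series[j], weights),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(series)


# Hypothetical toy data: two classes of short series.
series = [[0, 0, 0], [0, 1, 0], [5, 5, 5], [5, 6, 5]]
labels = [0, 0, 1, 1]
fitness = loo_accuracy(series, labels, weights=(0.5, 0.5))
```

A genetic algorithm would evaluate a function like `loo_accuracy` for each candidate weight vector and evolve the population toward higher values.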
Abstract:
This paper presents a new hierarchical clustering algorithm for crop stage classification using hyperspectral satellite images. Among the many benefits and uses of remote sensing, one important application is solving the problem of crop stage classification. Modern commercial imaging satellites, owing to their large volume of satellite imagery, offer greater opportunities for automated image analysis. Hence, we propose an unsupervised algorithm, the Hierarchical Artificial Immune System (HAIS), which consists of two steps: splitting the cluster centers and merging them. The high dimensionality of the data is reduced with the help of Principal Component Analysis (PCA). The classification results are compared with those of the K-means and Artificial Immune System algorithms. From the results obtained, we conclude that the proposed hierarchical clustering algorithm is accurate.
Abstract:
Subsurface lithology and seismic site classification of the Lucknow urban center, located in the central part of the Indo-Gangetic Basin (IGB), are presented based on detailed shallow subsurface investigations and borehole analysis. These were carried out through 47 seismic surface wave tests using multichannel analysis of surface waves (MASW) and 23 boreholes drilled up to 30 m with standard penetration test (SPT) N values. Subsurface lithology profiles drawn from the drilled boreholes show low- to medium-compressibility clay and silty to poorly graded sand down to a depth of 30 m. In addition, reports of deeper boreholes (depth >150 m) were collected from the Lucknow Jal Nigam (Water Corporation), Government of Uttar Pradesh, to understand the deeper subsoil stratification. These reports show the presence of clay mixed with sand and Kankar at some locations down to a depth of 150 m, followed by layers of sand, clay, and Kankar up to 400 m. Based on the available details, shallow and deeper cross-sections through Lucknow are presented. Shear wave velocity (SWV) and N-SPT values were measured for the study area using MASW and SPT testing. Measured SWV and N-SPT values for the same locations were found to be comparable. These values were used to estimate 30 m average values of N-SPT (N30) and SWV (Vs30) for seismic site classification of the study area as per the National Earthquake Hazards Reduction Program (NEHRP) soil classification system. Under the NEHRP classification, the entire study area falls into site classes C and D based on Vs30 and site classes D and E based on N30. The issue of larger amplification during future seismic events is highlighted for the major part of the study area that comes under site classes D and E. Also, the mismatch of site classes based on N30 and Vs30 raises the question of the suitability of the NEHRP classification system for the study region. Further, 17 sets of SPT and SWV data are used to develop a correlation between N-SPT and SWV. This represents a first attempt at seismic site classification and at a correlation between N-SPT and SWV in the Indo-Gangetic Basin.
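The 30 m averaging and site classification step can be sketched as follows. The class limits are the standard NEHRP boundaries (shear wave velocity in m/s, N in blows per 0.3 m); the layer profiles are hypothetical and not taken from the study, and the function names are illustrative.

```python
def average_over_30m(thicknesses, values):
    """Thickness-weighted harmonic average over the top 30 m: 30 / sum(d_i / v_i)."""
    if abs(sum(thicknesses) - 30.0) > 1e-9:
        raise ValueError("layer thicknesses must add up to 30 m")
    return 30.0 / sum(d / v for d, v in zip(thicknesses, values))


def nehrp_class_from_vs30(vs30):
    """NEHRP site class from the 30 m average shear wave velocity (m/s)."""
    if vs30 > 1500:
        return "A"
    if vs30 > 760:
        return "B"
    if vs30 > 360:
        return "C"
    if vs30 >= 180:
        return "D"
    return "E"


def nehrp_class_from_n30(n30):
    """NEHRP site class from the 30 m average SPT N value (classes C to E only)."""
    if n30 > 50:
        return "C"
    if n30 >= 15:
        return "D"
    return "E"


# Hypothetical two-layer profile: 10 m over 20 m.
layers = [10.0, 20.0]
vs30 = average_over_30m(layers, [200.0, 400.0])   # shear wave velocities in m/s
n30 = average_over_30m(layers, [8.0, 12.0])       # SPT N values

class_by_vs = nehrp_class_from_vs30(vs30)
class_by_n = nehrp_class_from_n30(n30)
```

For this made-up profile the velocity-based class is D while the N-based class is E, the same kind of mismatch between the two criteria that the abstract reports.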
Abstract:
This paper presents an efficient approach to the modeling and classification of vehicles using their magnetic signatures. A database was created from magnetic signatures collected over a wide range of vehicles (cars). A vehicle is modeled as an array of magnetic dipoles. The strength of the magnetic dipoles and the separation between them vary across vehicles and depend on the metallic composition and configuration of the vehicle. Based on the magnetic dipole data model, we present a novel method to extract a feature vector from the magnetic signature. A linear support vector machine is then used to classify the vehicles based on the extracted feature vectors.
Abstract:
Effective conservation and management of natural resources requires up-to-date information on land cover (LC) types and their dynamics. LC dynamics are being captured using multi-resolution remote sensing (RS) data with appropriate classification strategies. RS data combined with important environmental layers (either remotely acquired or derived from ground measurements) would, however, be more effective in addressing LC dynamics and associated changes. Compared with conventional classification techniques, these ancillary layers provide additional information for delineating the decision boundaries of LC classes. This communication ascertains the possibility of improving the classification accuracy of RS data with ancillary and derived geographical layers such as vegetation index, temperature, digital elevation model (DEM), aspect, slope and texture. This has been implemented in three terrains of varying topography. The study would help in selecting appropriate ancillary data, depending on the terrain, for better classified information.