955 resultados para Multiple classification


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper investigates a new Glowworm Swarm Optimization (GSO) clustering algorithm for hierarchical splitting and merging of automatic multi-spectral satellite image classification (land cover mapping problem). Amongst the multiple benefits and uses of remote sensing, one of the most important has been its use in solving the problem of land cover mapping. Image classification forms the core of the solution to the land cover mapping problem. No single classifier can prove to classify all the basic land cover classes of an urban region in a satisfactory manner. In unsupervised classification methods, the automatic generation of clusters to classify a huge database is not exploited to their full potential. The proposed methodology searches for the best possible number of clusters and its center using Glowworm Swarm Optimization (GSO). Using these clusters, we classify by merging based on parametric method (k-means technique). The performance of the proposed unsupervised classification technique is evaluated for Landsat 7 thematic mapper image. Results are evaluated in terms of the classification efficiency - individual, average and overall.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a new hierarchical clustering algorithm for crop stage classification using hyperspectral satellite image. Amongst the multiple benefits and uses of remote sensing, one of the important application is to solve the problem of crop stage classification. Modern commercial imaging satellites, owing to their large volume of satellite imagery, offer greater opportunities for automated image analysis. Hence, we propose a unsupervised algorithm namely Hierarchical Artificial Immune System (HAIS) of two steps: splitting the cluster centers and merging them. The high dimensionality of the data has been reduced with the help of Principal Component Analysis (PCA). The classification results have been compared with K-means and Artificial Immune System algorithms. From the results obtained, we conclude that the proposed hierarchical clustering algorithm is accurate.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work proposes a boosting-based transfer learning approach for head-pose classification from multiple, low-resolution views. Head-pose classification performance is adversely affected when the source (training) and target (test) data arise from different distributions (due to change in face appearance, lighting, etc). Under such conditions, we employ Xferboost, a Logitboost-based transfer learning framework that integrates knowledge from a few labeled target samples with the source model to effectively minimize misclassifications on the target data. Experiments confirm that the Xferboost framework can improve classification performance by up to 6%, when knowledge is transferred between the CLEAR and FBK four-view headpose datasets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Maximum entropy approach to classification is very well studied in applied statistics and machine learning and almost all the methods that exists in literature are discriminative in nature. In this paper, we introduce a maximum entropy classification method with feature selection for large dimensional data such as text datasets that is generative in nature. To tackle the curse of dimensionality of large data sets, we employ conditional independence assumption (Naive Bayes) and we perform feature selection simultaneously, by enforcing a `maximum discrimination' between estimated class conditional densities. For two class problems, in the proposed method, we use Jeffreys (J) divergence to discriminate the class conditional densities. To extend our method to the multi-class case, we propose a completely new approach by considering a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate conditional densities of multiple classes. In order to reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)), based on AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to a multiple distributions case. As far as the theoretical justifications are concerned we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. Performance and comparative study of the proposed algorithms have been demonstrated on large dimensional text and gene expression datasets that show our methods scale up very well with large dimensional datasets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Establishing functional relationships between multi-domain protein sequences is a non-trivial task. Traditionally, delineating functional assignment and relationships of proteins requires domain assignments as a prerequisite. This process is sensitive to alignment quality and domain definitions. In multi-domain proteins due to multiple reasons, the quality of alignments is poor. We report the correspondence between the classification of proteins represented as full-length gene products and their functions. Our approach differs fundamentally from traditional methods in not performing the classification at the level of domains. Our method is based on an alignment free local matching scores (LMS) computation at the amino-acid sequence level followed by hierarchical clustering. As there are no gold standards for full-length protein sequence classification, we resorted to Gene Ontology and domain-architecture based similarity measures to assess our classification. The final clusters obtained using LMS show high functional and domain architectural similarities. Comparison of the current method with alignment based approaches at both domain and full-length protein showed superiority of the LMS scores. Using this method we have recreated objective relationships among different protein kinase sub-families and also classified immunoglobulin containing proteins where sub-family definitions do not exist currently. This method can be applied to any set of protein sequences and hence will be instrumental in analysis of large numbers of full-length protein sequences.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thiolases are enzymes involved in lipid metabolism. Thiolases remove the acetyl-CoA moiety from 3-ketoacyl-CoAs in the degradative reaction. They can also catalyze the reverse Claisen condensation reaction, which is the first step of biosynthetic processes such as the biosynthesis of sterols and ketone bodies. In human, six distinct thiolases have been identified. Each of these thiolases is different from the other with respect to sequence, oligomeric state, substrate specificity and subcellular localization. Four sequence fingerprints, identifying catalytic loops of thiolases, have been described. In this study genome searches of two mycobacterial species (Mycobacterium tuberculosis and Mycobacterium smegmatis), were carried out, using the six human thiolase sequences as queries. Eight and thirteen different thiolase sequences were identified in M. tuberculosis and M. smegmatis, respectively. In addition, thiolase-like proteins (one encoded in the Mtb and two in the Msm genome) were found. The purpose of this study is to classify these mostly uncharacterized thiolases and thiolase-like proteins. Several other sequences obtained by searches of genome databases of bacteria, mammals and the parasitic protist family of the Trypanosomatidae were included in the analysis. Thiolase-like proteins were also found in the trypanosomatid genomes, but not in those of mammals. In order to study the phylogenetic relationships at a high confidence level, additional thiolase sequences were included such that a total of 130 thiolases and thiolase-like protein sequences were used for the multiple sequence alignment. The resulting phylogenetic tree identifies 12 classes of sequences, each possessing a characteristic set of sequence fingerprints for the catalytic loops. From this analysis it is now possible to assign the mycobacterial thiolases to corresponding homologues in other kingdoms of life. The results of this bioinformatics analysis also show interesting differences between the distributions of M. tuberculosis and M. smegmatis thiolases over the 12 different classes. (C) 2014 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Among the multiple advantages and applications of remote sensing, one of the most important uses is to solve the problem of crop classification, i.e., differentiating between various crop types. Satellite images are a reliable source for investigating the temporal changes in crop cultivated areas. In this letter, we propose a novel bat algorithm (BA)-based clustering approach for solving crop type classification problems using a multispectral satellite image. The proposed partitional clustering algorithm is used to extract information in the form of optimal cluster centers from training samples. The extracted cluster centers are then validated on test samples. A real-time multispectral satellite image and one benchmark data set from the University of California, Irvine (UCI) repository are used to demonstrate the robustness of the proposed algorithm. The performance of the BA is compared with two other nature-inspired metaheuristic techniques, namely, genetic algorithm and particle swarm optimization. The performance is also compared with the existing hybrid approach such as the BA with K-means. From the results obtained, it can be concluded that the BA can be successfully applied to solve crop type classification problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article describes the progress of the River Communities Project which commenced in 1977. This project aimed to develop a sensitive and practical system for river site classification using macroinvertebrates as an objective means of appraising the status of British rivers. The relationship between physical and chemical features of sites and their biological communities were examined. Sampling was undertaken on 41 British rivers. Ordination techniques were used to analyze data and the sites were classified into 16 groups using multiple discrimination analysis. The potential for using the environmental data to predict to which group a site belonged and the fauna likely to be present was investigated.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the problem of one-class classification (OCC) one of the classes, the target class, has to be distinguished from all other possible objects, considered as nontargets. In many biomedical problems this situation arises, for example, in diagnosis, image based tumor recognition or analysis of electrocardiogram data. In this paper an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques-Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description-using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is available for high dimensional data; it is worth mentioning that it can be used for any kind of data (continuous, discrete, or nominal), whereas state-of-the-art approaches application is not straightforward when nominal variables are present.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Visual recognition problems often involve classification of myriads of pixels, across scales, to locate objects of interest in an image or to segment images according to object classes. The requirement for high speed and accuracy makes the problems very challenging and has motivated studies on efficient classification algorithms. A novel multi-classifier boosting algorithm is proposed to tackle the multimodal problems by simultaneously clustering samples and boosting classifiers in Section 2. The method is extended into an online version for object tracking in Section 3. Section 4 presents a tree-structured classifier, called Super tree, to further speed up the classification time of a standard boosting classifier. The proposed methods are demonstrated for object detection, tracking and segmentation tasks. © 2013 Springer-Verlag Berlin Heidelberg.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. The 3D trajectory, occlusion, and segmentation information are utilized in extracting stabilized views of the moving object. Trajectory-guided recognition (TGR) is proposed as a new and efficient method for adaptive classification of action. The TGR approach is demonstrated using "motion history images" that are then recognized via a mixture of Gaussian classifier. The system was tested in recognizing various dynamic human outdoor activities; e.g., running, walking, roller blading, and cycling. Experiments with synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, that are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate that a two-to-three order of magnitude speed up is gained over the standard shape context classifier, with minimal losses in classification accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.