121 results for clustering


Relevance:

20.00%

Abstract:

In this paper, an empirical analysis to examine the effects of image segmentation with different colour models using the fuzzy c-means (FCM) clustering algorithm is conducted. A qualitative evaluation method based on human perceptual judgement is used. Two sets of complex images, i.e., outdoor scenes and satellite imagery, are used for demonstration. These images are employed to examine the characteristics of image segmentation using FCM with eight different colour models. The results obtained from the experimental study are compared and analysed. It is found that the CIELAB colour model yields the best outcomes in colour image segmentation with FCM.
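The FCM step that the abstract above relies on can be sketched as follows. This is a minimal illustration of the standard fuzzy c-means updates, not the authors' implementation: the fuzzifier `m = 2`, the Euclidean distance, the deterministic initialisation, and the toy pixel values are all assumptions, and the conversion of pixels into a particular colour model (such as CIELAB) is left out.

```python
import math


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard FCM on a list of equal-length tuples.

    Returns (centres, memberships), where memberships[k][i] is the degree
    to which point k belongs to cluster i.
    """
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start, taking centres spread over the input order.
    centres = [list(points[(i * n) // c]) for i in range(c)]
    u = [[0.0] * c for _ in range(n)]
    for _ in range(iters):
        # Membership update: u_ki = d_ki^(-2/(m-1)) / sum_j d_kj^(-2/(m-1)).
        for k, p in enumerate(points):
            d = [math.dist(p, ctr) for ctr in centres]
            if min(d) == 0.0:
                # The point sits exactly on a centre: full membership there.
                hit = d.index(0.0)
                u[k] = [1.0 if i == hit else 0.0 for i in range(c)]
            else:
                inv = [dist ** (-2.0 / (m - 1.0)) for dist in d]
                total = sum(inv)
                u[k] = [v / total for v in inv]
        # Centre update: mean of the points weighted by u_ki^m.
        new_centres = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            sw = sum(w)
            new_centres.append(
                [sum(w[k] * points[k][j] for k in range(n)) / sw for j in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, u


# Hypothetical pixels: three dark and three bright values in some 3-channel colour model.
pixels = [(10, 10, 10), (12, 11, 9), (11, 9, 10),
          (200, 190, 210), (205, 195, 200), (198, 192, 208)]
centres, memberships = fuzzy_c_means(pixels, c=2)
# A hard segmentation assigns each pixel to its highest-membership cluster.
segments = [row.index(max(row)) for row in memberships]
```

Comparing colour models in this setting amounts to running the same routine on the same image after expressing the pixels in each model, since the distances (and therefore the memberships) change with the representation.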

Relevance:

20.00%

Abstract:

Both instance-level knowledge and attribute-level knowledge can improve clustering quality, but how to utilize both of them effectively is an essential problem to solve. This paper proposes a wrapper framework for semi-supervised clustering, which aims to gracefully integrate both kinds of prior knowledge in the clustering process: the instance-level knowledge in the form of pairwise constraints and the attribute-level knowledge in the form of attribute order preferences. The wrapped algorithm is then designed as a semi-supervised clustering process which transforms this clustering problem into an optimization problem. The experimental results demonstrate the effectiveness and potential of the proposed method.

Relevance:

20.00%

Abstract:

In this paper, a new image segmentation approach that integrates color and texture features using the fuzzy c-means clustering algorithm is described. To demonstrate the applicability of the proposed approach to satellite image retrieval, an interactive region-based image query system is designed and developed. A database comprising 400 multispectral satellite images is used to evaluate the performance of the system. The results are analyzed and discussed, and a performance comparison with other methods is included. The outcomes reveal that the proposed approach is able to improve the quality of the segmentation results as well as the retrieval performance.

Relevance:

20.00%

Abstract:

We present a novel approach to improving subspace clustering by exploiting spatial constraints. The new method encourages the sparse solution to be consistent with the spatial geometry of the tracked points by embedding weights into the sparse formulation. By doing so, we are able to correct sparse representations in a principled manner without introducing much additional computational cost. We discuss alternative ways to treat missing and corrupted data using the latest theory in robust lasso regression and suggest numerical algorithms to solve the proposed formulation. The experiments on the benchmark Johns Hopkins 155 dataset demonstrate that exploiting spatial constraints significantly improves motion segmentation.

Relevance:

20.00%

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on utilizing machine learning techniques to classify network traffic based on packet-level and flow-level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
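One simple way to honour "same cluster" constraints inside K-Means is to assign each constrained set of flows as a single unit. The sketch below illustrates that idea only and should not be read as the modified algorithm from the paper, whose details the abstract does not give. The function name `constrained_kmeans`, the assumption that the must-link sets are disjoint, the deterministic initialisation, and the toy two-feature flows are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def constrained_kmeans(points, must_link_sets, k, iters=50):
    """K-Means in which every must-link set is assigned to one cluster as a whole.

    Assumption: the sets in must_link_sets are disjoint sets of point indices.
    """
    n = len(points)
    grouped = {i for s in must_link_sets for i in s}
    # Unconstrained points become singleton groups, so one code path handles both.
    groups = [sorted(s) for s in must_link_sets]
    groups += [[i] for i in range(n) if i not in grouped]
    # Assumption: deterministic start, taking centres spread over the input order.
    centres = [list(points[(j * n) // k]) for j in range(k)]
    labels = [-1] * n
    for _ in range(iters):
        new_labels = labels[:]
        for group in groups:
            # The whole group goes to the centre with the lowest total squared distance.
            costs = [sum(sq_dist(points[i], c) for i in group) for c in centres]
            best = costs.index(min(costs))
            for i in group:
                new_labels[i] = best
        if new_labels == labels:
            break
        labels = new_labels
        # Standard centre update; an empty cluster keeps its previous centre.
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Hypothetical flows described by two features; flows 0 and 4 are known to be correlated.
flows = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.9), (3.2, 3.1)]
free_labels, _ = constrained_kmeans(flows, [], k=2)
tied_labels, _ = constrained_kmeans(flows, [{0, 4}], k=2)
```

In this toy data, flow 4 is nearer to the second group of flows by distance alone, so the constraint changes its cluster. Assigning groups rather than individual points also reduces the number of assignment decisions per iteration.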

Relevance:

20.00%

Abstract:

This paper presents a new spectral clustering method called correlation preserving indexing (CPI), which is performed in the correlation similarity measure space. In this framework, the documents are projected into a low-dimensional semantic space in which the correlations between the documents in the local patches are maximized while the correlations between the documents outside these patches are minimized simultaneously. Since the intrinsic geometrical structure of the document space is often embedded in the similarities between the documents, correlation as a similarity measure is more suitable for detecting the intrinsic geometrical structure of the document space than Euclidean distance. Consequently, the proposed CPI method can effectively discover the intrinsic structures embedded in high-dimensional document space. The effectiveness of the new method is demonstrated by extensive experiments conducted on various data sets and by comparison with existing document clustering methods.
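The claim that correlation suits documents better than Euclidean distance can be illustrated with a small example. The sketch below is not CPI itself. It assumes that correlation is computed as a normalised inner product of term-count vectors (the abstract does not give the definition), and the three four-term documents are invented for illustration.

```python
import math


def correlation(x, y):
    """Normalised inner product of two term-count vectors (assumed definition)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical term counts over a four-word vocabulary.
short_doc = [2, 1, 0, 0]    # short document on topic A
long_doc = [20, 10, 0, 0]   # ten times longer, same term proportions
other_doc = [0, 0, 2, 1]    # short document on an unrelated topic B

# Correlation ignores document length and groups the two topic-A documents.
same_topic_corr = correlation(short_doc, long_doc)
cross_topic_corr = correlation(short_doc, other_doc)

# Euclidean distance is dominated by length and places the short topic-A
# document closer to the unrelated short document than to the long one.
same_topic_dist = math.dist(short_doc, long_doc)
cross_topic_dist = math.dist(short_doc, other_doc)
```

Here the two documents on the same topic are maximally correlated while being far apart in Euclidean terms, which is the kind of mismatch that a correlation-based similarity avoids.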

Relevance:

20.00%

Abstract:

Satellite image processing is a complex task that has received considerable attention from many researchers. In this paper, an interactive image query system for satellite imagery searching and retrieval is proposed. As in most image retrieval systems, extraction of image features is the most important step and has a great impact on the retrieval performance. Thus, a new technique that fuses color and texture features for segmentation is introduced. Applicability of the proposed technique is assessed using a database containing multispectral satellite imagery. The experiments demonstrate that the proposed segmentation technique is able to improve the quality of the segmentation results as well as the retrieval performance.

Relevance:

20.00%

Abstract:

In this study, an interactive Content-Based Image Retrieval (CBIR) system that allows searching and retrieving images from databases is designed and developed. Based on the fuzzy c-means clustering algorithm, the CBIR system fuses color and texture features in image segmentation. A technique to form compound queries based on the combined features of different images is devised. This technique gives users better control over the search criteria, so that a higher retrieval performance can be achieved. A database consisting of skin cancer imagery is used to demonstrate the applicability of the CBIR system.

Relevance:

20.00%

Abstract:

We present a novel method for document clustering using sparse representation of documents in conjunction with spectral clustering. An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities. This document affinity is encoded through a graph on which spectral clustering is performed. The decomposition into multiple subspaces allows documents to be part of a sub-group that shares a smaller set of similar vocabulary, thus allowing for cleaner clusters. Extensive experimental evaluations on two real-world datasets from Reuters-21578 and 20Newsgroup corpora show that our proposed method consistently outperforms state-of-the-art algorithms. Significantly, the performance improvement over other methods is prominent for these datasets.

Relevance:

20.00%

Abstract:

In this paper, we propose a novel sparse subspace clustering method that regularizes sparse subspace representation by exploiting the structural sharing between tasks and data points via group sparse coding. We derive simple, provably convergent, and computationally efficient algorithms for solving the proposed group formulations. We demonstrate the advantage of the framework on three challenging benchmark datasets ranging from medical record data to image and text clustering and show that it consistently outperforms rival methods.

Relevance:

20.00%

Abstract:

In recent years, significant effort has been devoted to predicting protein functions from protein interaction data generated by high-throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering-based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and on prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, the clustering itself is a dynamic process, and the function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature of clustering is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering-based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets, and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.