131 results for speaker clustering


Relevance:

20.00%

Publisher:

Abstract:

Satellite image processing is a complex task that has received considerable attention from many researchers. In this paper, an interactive image query system for satellite imagery searching and retrieval is proposed. As in most image retrieval systems, extraction of image features is the most important step and has a great impact on retrieval performance. Thus, a new technique that fuses color and texture features for segmentation is introduced. The applicability of the proposed technique is assessed using a database containing multispectral satellite imagery. The experiments demonstrate that the proposed segmentation technique improves both the quality of the segmentation results and the retrieval performance.

Relevance:

20.00%

Publisher:

Abstract:

In this study, an interactive Content-Based Image Retrieval (CBIR) system that allows searching and retrieving images from databases is designed and developed. Based on the fuzzy c-means clustering algorithm, the CBIR system fuses color and texture features in image segmentation. A technique to form compound queries based on the combined features of different images is devised. This technique gives users better control over the search criteria, so that higher retrieval performance can be achieved. A database consisting of skin cancer imagery is used to demonstrate the applicability of the CBIR system.
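The abstract does not describe how the features are fused or how the clustering is configured. As a minimal sketch, assuming that fusion means concatenating a color vector with a texture vector per pixel, standard fuzzy c-means could be run on the fused vectors as follows. The names `fuse`, `texture_weight` and the toy pixel values are hypothetical and are not taken from the paper.

```python
import random


def fuse(color, texture, texture_weight=1.0):
    """Concatenate color and (weighted) texture features into one vector.

    Assumption: simple concatenation; the paper's fusion rule is not given.
    """
    return list(color) + [texture_weight * t for t in texture]


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length feature vectors.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are means weighted by membership raised to the power m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Memberships follow from relative distances to the centers.
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[k][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for k in range(n_clusters)
            ]
            new_row = [
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


# Toy data: three reddish, smooth pixels and three bluish, rough pixels.
pixels = [
    fuse([0.90, 0.10, 0.10], [0.20]),
    fuse([0.85, 0.15, 0.10], [0.25]),
    fuse([0.95, 0.05, 0.12], [0.18]),
    fuse([0.10, 0.10, 0.90], [0.80]),
    fuse([0.12, 0.08, 0.85], [0.75]),
    fuse([0.08, 0.12, 0.95], [0.82]),
]
centers, memberships = fuzzy_c_means(pixels, n_clusters=2)
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
print(labels)
```

Taking the cluster with the highest membership turns the soft assignment into a hard segmentation. Raising `texture_weight` would make texture count more than color in the distance computation.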

Relevance:

20.00%

Publisher:

Abstract:

Speaker of the House of Representatives Peter Slipper has stepped aside following allegations of sexual harassment and the misuse of cab-charge vouchers.

Relevance:

20.00%

Publisher:

Abstract:

We present a novel method for document clustering using sparse representation of documents in conjunction with spectral clustering. An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities. This document affinity is encoded through a graph on which spectral clustering is performed. The decomposition into multiple subspaces allows documents to be part of a sub-group that shares a smaller set of similar vocabulary, thus allowing for cleaner clusters. Extensive experimental evaluations on two real-world datasets from the Reuters-21578 and 20Newsgroup corpora show that our proposed method consistently outperforms state-of-the-art algorithms. Notably, the performance improvement over other methods is pronounced on these datasets.
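The abstract does not state the optimization problem itself. As an illustration only, a formulation commonly used in the sparse subspace clustering literature is given below; the symbols $X$ (matrix whose columns are the document vectors $x_i$), $c_i$ (sparse coefficient vector for document $i$), $\lambda$ (sparsity weight) and $W$ (affinity matrix) are assumptions and may differ from the paper's notation.

```latex
% Each document is expressed as a sparse combination of the other documents:
\min_{c_i} \; \tfrac{1}{2}\,\lVert x_i - X c_i \rVert_2^2 + \lambda \lVert c_i \rVert_1
\quad \text{subject to} \quad c_{ii} = 0

% The coefficient matrix C = [c_1, \dots, c_n] is symmetrized into an affinity graph:
W = \lvert C \rvert + \lvert C \rvert^{\top}
```

Spectral clustering is then run on the graph with weights $W$. Because each $c_i$ is fitted against all other documents at once, the affinity reflects the whole collection rather than one pair at a time.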

Relevance:

20.00%

Publisher:

Abstract:

In this paper we propose a novel sparse subspace clustering method that regularizes sparse subspace representation by exploiting the structural sharing between tasks and data points via group sparse coding. We derive simple, provably convergent, and computationally efficient algorithms for solving the proposed group formulations. We demonstrate the advantage of the framework on three challenging benchmark datasets ranging from medical record data to image and text clustering, and show that it consistently outperforms rival methods.

Relevance:

20.00%

Publisher:

Abstract:

In recent years, significant effort has been devoted to predicting protein functions from protein interaction data generated by high-throughput techniques. However, predicting protein functions correctly and reliably remains a challenge. Many computational methods have recently been proposed for this task, and among them clustering-based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and on prediction algorithms that statically predict functions from the clusters related to the unannotated proteins. In fact, clustering itself is a dynamic process, and function prediction should take this dynamic feature into consideration. Unfortunately, it is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering-based prediction method that traces the functions of relevant annotated proteins across all clusters generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict the functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets, and the results demonstrated its effectiveness compared with representative existing methods.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an integration of a novel document vector representation technique and a novel Growing Self Organizing Process. In this new approach, each document is represented as a low-dimensional vector composed of the indices and weights derived from the keywords of the document.

An index-based similarity calculation method is employed on this low-dimensional feature space, and the growing self-organizing process is modified to comply with the new feature representation model.

The initial experiments show that this novel integration outperforms state-of-the-art Self Organizing Map based techniques for text clustering in terms of efficiency while preserving the same accuracy level.

Relevance:

20.00%

Publisher:

Abstract:

Text clustering can be considered a four-step process consisting of feature extraction, text representation, document clustering and cluster interpretation. Most text clustering models consider text as an unordered collection of words. However, the semantics of text would be better captured if word sequences were taken into account.

In this paper we propose a sequence-based text clustering model in which four novel sequence-based components are introduced, one in each of the four steps of the text clustering process.

Experiments conducted on the Reuters dataset and Sydney Morning Herald (SMH) news archives demonstrate the advantage of the proposed sequence-based model, in terms of capturing context with semantics, accuracy and speed, compared to clustering of documents based on single words and n-gram based models.
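The point about unordered words losing meaning can be illustrated with a small sketch. It uses contiguous word bigrams only as the simplest stand-in for word sequences; the paper's own sequence-based components are not described in the abstract, and the function names here are hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Unordered word counts: word order is discarded."""
    return Counter(text.lower().split())


def word_bigrams(text):
    """Counts of adjacent word pairs: local word order is kept."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


a = "dog bites man"
b = "man bites dog"

# Same words, so the unordered representations are identical.
print(bag_of_words(a) == bag_of_words(b))
# Different order, so the sequence representations differ.
print(word_bigrams(a) == word_bigrams(b))
```

A clustering model built on the first representation cannot tell the two sentences apart, while one built on sequences can.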

Relevance:

20.00%

Publisher:

Abstract:

This study proposes an improved evolving model, Evolving Tree (ETree) with Fuzzy c-Means (FCM), for text document visualization. ETree forms a hierarchical tree structure in which nodes (i.e., trunks) are allowed to grow and split into child nodes (i.e., leaves), and each node represents a cluster of documents. However, ETree adopts a relatively simple approach to splitting its nodes. Thus, FCM is adopted as an alternative for node splitting in ETree. An experimental study is conducted using articles from a flagship conference of Universiti Malaysia Sarawak (UNIMAS), the Engineering Conference (ENCON). The experimental results are analyzed and discussed, and the outcome shows that the proposed ETree-FCM model is effective for text document clustering and visualization.

Relevance:

20.00%

Publisher:

Abstract:

Multimedia content often comes with weakly annotated data such as tags, links and interactions. This weakly annotated data is called side information. It is auxiliary information about the data and provides hints for exploring the link structure of the data. Most clustering algorithms use pure data only. A model that combines pure data with side information, such as images with tags or documents with keywords, can better capture the underlying structure of the data. We demonstrate how to incorporate different types of side information into a recently proposed Bayesian nonparametric model, the distance dependent Chinese restaurant process (DD-CRP). When side information takes the form of subsets of discrete labels, our algorithm embeds the affinity of this information into the decay function of the DD-CRP. This makes it possible to measure distance based on arbitrary side information instead of only the spatial layout or time stamp of observations. At the same time, for noisy and incomplete side information, we set the decay function so that the DD-CRP reduces to the traditional Chinese restaurant process, thus avoiding adverse effects from such information. Experimental evaluations on two real-world datasets, NUS WIDE and 20 Newsgroups, show that exploiting side information in the DD-CRP significantly improves clustering performance.
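In the DD-CRP, each item i links to another item j with prior probability proportional to f(d_ij), where f is the decay function and d_ij a distance, or to itself with probability proportional to a concentration parameter alpha. The abstract does not say which distance or decay the authors use for label subsets. The sketch below is an assumption for illustration: it uses Jaccard distance between tag sets and an exponential decay, and a hypothetical `reliable` flag to switch to the constant decay over earlier items that recovers the traditional CRP prior.

```python
import math


def jaccard_distance(tags_a, tags_b):
    """1 - |A n B| / |A u B|; two empty sets are treated as maximally far.

    Assumption: this distance is chosen for illustration only.
    """
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def link_prior(i, tag_sets, alpha=1.0, scale=0.5, reliable=True):
    """Prior over which item j the item i links to (j == i is a self-link).

    reliable=True : exponential decay of the tag-set distance (assumed form).
    reliable=False: decay is 1 for earlier items and 0 for later ones,
                    which gives the traditional CRP prior.
    """
    weights = []
    for j in range(len(tag_sets)):
        if j == i:
            weights.append(alpha)  # self-link starts a new cluster
        elif reliable:
            d = jaccard_distance(tag_sets[i], tag_sets[j])
            weights.append(math.exp(-d / scale))
        else:
            weights.append(1.0 if j < i else 0.0)
    total = sum(weights)
    return [w / total for w in weights]


tags = [{"cat", "pet"}, {"car"}, {"cat", "pet", "cute"}]
print(link_prior(2, tags))                  # favors item 0, which shares tags
print(link_prior(2, tags, reliable=False))  # uniform over earlier items and self
```

With `reliable=True`, items whose tags overlap are more likely to be linked and therefore to share a cluster. With `reliable=False`, the tags have no influence on the prior, which matches the fallback behavior the abstract describes for noisy or incomplete side information.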