131 results for speaker clustering


Abstract:

Despite significant advances in wireless sensor networks (WSNs), energy conservation remains one of the most important research challenges. One approach commonly used to prolong network lifetime is to aggregate data at the cluster heads (CHs). However, CHs may fail or function incorrectly for a number of reasons, such as power instability. During a failure, the CHs are unable to collect and transfer data correctly, which degrades the performance of the WSN. Early detection of CH failure reduces data loss and allows recovery with minimal effort. This paper proposes a self-configurable clustering mechanism to detect the disordered CHs and replace them with other nodes. Simulation results verify the effectiveness of the proposed approach.

Abstract:

Pixel color has proven to be a useful and robust cue for detecting most objects of interest, such as fire. In this paper, a hybrid intelligent algorithm is proposed to detect fire pixels in the background of an image. The proposed algorithm combines a computational search method based on a swarm intelligence technique with the K-medoids clustering method to form a Fire-based Color Space (FCS); the new technique converts the RGB color system to FCS through a 3×3 matrix. The algorithm consists of five main stages: (1) extracting fire and non-fire pixels manually from the original image; (2) using K-medoids clustering to find a cost function that minimizes the error value; (3) applying Particle Swarm Optimization (PSO) to search for the best W components in order to minimize the fitness function; (4) reporting the best matrix, including feature weights, and using this matrix to convert all original images in the database to the new color space; and (5) using the Otsu threshold technique to binarize the final images. Compared with some state-of-the-art techniques, the experimental results show the ability and efficiency of the new method to detect fire pixels in color images.
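The last two stages of the abstract above (a 3×3 color-space conversion followed by Otsu binarization) can be sketched as follows. This is a minimal illustration only: the matrix `W` is a made-up placeholder rather than the PSO-optimized matrix from the paper, the function names are hypothetical, and thresholding the first converted channel is an assumption, since the abstract does not say which channel is binarized.

```python
import numpy as np

def to_fcs(image_rgb, W):
    """Apply a 3x3 linear color transform to every pixel of an H x W x 3 image."""
    return image_rgb.astype(float) @ W.T

def otsu_threshold(channel, bins=256):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(channel.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist).astype(float)   # pixel count in bins up to and including k
    w1 = w0[-1] - w0                     # pixel count in bins above k
    s0 = np.cumsum(hist * centers)       # intensity sum of the lower class
    s1 = s0[-1] - s0                     # intensity sum of the upper class
    valid = (w0 > 0) & (w1 > 0)          # both classes must be non-empty
    between = np.zeros(bins)
    between[valid] = w0[valid] * w1[valid] * (
        s0[valid] / w0[valid] - s1[valid] / w1[valid]
    ) ** 2
    # Use the upper edge of the best split bin as the threshold.
    return edges[1:][np.argmax(between)]

# Hypothetical weights: emphasize red relative to green and blue.
W = np.array([[1.0, -0.5, -0.5],
              [0.0,  1.0,  0.0],
              [0.0,  0.0,  1.0]])

image = np.full((4, 4, 3), (30, 60, 90), dtype=np.uint8)  # bluish background
image[:2, :2] = (255, 80, 0)                              # fire-like patch
fcs = to_fcs(image, W)
fire_mask = fcs[..., 0] > otsu_threshold(fcs[..., 0])
```

In this toy image the fire-like patch and the background fall on opposite sides of the threshold, so `fire_mask` marks only the top-left 2×2 patch.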

Abstract:

Many vision problems, such as motion segmentation and face clustering, deal with high-dimensional data. However, these high-dimensional data usually lie in a low-dimensional structure. Sparse representation is a powerful principle for solving a number of clustering problems with high-dimensional data. This principle is motivated by an idealized model of data points based on linear algebra theory. However, real data in computer vision are unlikely to follow the ideal model perfectly. In this paper, we exploit mixed norm regularization for sparse subspace clustering. The regularization term is a convex combination of the ℓ1 norm, which promotes sparsity at the individual level, and the ℓ2/ℓ1 block norm, which promotes group sparsity. Combining these regularization terms provides more accurate modeling, which in turn leads to a better solution for the affinity matrix used in sparse subspace clustering. This could help achieve better performance on motion segmentation and face clustering problems. The formulation also caters for different types of data corruption. To solve the formulation, we derive a provably convergent and computationally efficient algorithm based on the alternating direction method of multipliers (ADMM) framework. We demonstrate that this formulation outperforms other state-of-the-art methods on both motion segmentation and face clustering.
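As a rough illustration of the regularizer described above, the sketch below evaluates a convex combination of the ℓ1 norm and an ℓ2/ℓ1 block norm for a coefficient matrix. The weight `alpha`, the function name, and the choice of matrix columns as groups are assumptions made for this example; the abstract does not specify how the groups are defined.

```python
import numpy as np

def mixed_norm(C, alpha):
    """Return alpha * ||C||_1 + (1 - alpha) * ||C||_{2,1}.

    Assumption: each column of C is treated as one group for the block norm.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    l1 = np.abs(C).sum()                       # entry-wise sparsity
    l21 = np.linalg.norm(C, axis=0).sum()      # sum of column-wise Euclidean norms
    return alpha * l1 + (1.0 - alpha) * l21

C = np.array([[3.0, 0.0],
              [4.0, 0.0]])
value = mixed_norm(C, alpha=0.5)
```

For this `C`, the ℓ1 term is 7 and the block term is 5, so the combined value with `alpha=0.5` is 6.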

Abstract:

Side information, or auxiliary information associated with documents or image content, provides hints for clustering. We propose a new model, the side information dependent Chinese restaurant process, which exploits side information in a Bayesian nonparametric model to improve data clustering. We introduce side information into the framework of the distance dependent Chinese restaurant process using a robust decay function to handle noisy side information. The threshold parameter of the decay function is updated automatically during Gibbs sampling. A fast inference algorithm is proposed. We evaluate our approach on four datasets: Cora, 20 Newsgroups, NUS-WIDE and one medical dataset. The types of side information explored in this paper include citations, authors, tags, keywords and auxiliary clinical information. A comparison with state-of-the-art approaches based on standard performance measures (NMI, F1) clearly shows the superiority of our approach.

Abstract:

Maximum target coverage with a minimum number of sensor nodes, known as the MCMS problem, is an important problem in directional sensor networks (DSNs). For guaranteed coverage and event reporting, the underlying mechanism must ensure that all targets are covered by the sensors and that the resulting network is connected. Existing solutions allow individual sensor nodes to determine the sensing direction for maximum target coverage, which produces redundant sensing coverage and considerable overhead. Gathering nodes into clusters might provide a better solution to this problem. In this paper, we design distributed clustering and target coverage algorithms that address the problem in an energy-efficient way. To the best of our knowledge, this is the first work that exploits cluster heads to determine the active sensing nodes and their directions for solving target coverage problems in DSNs. Our extensive simulation study shows that our system outperforms a number of state-of-the-art approaches.

Abstract:

Privacy-preserving data mining aims to keep data safe, yet useful. However, algorithms that provide strong guarantees often end up with low utility. We propose a novel privacy-preserving framework that thwarts an adversary from inferring an unknown data point by ensuring that the estimation error is almost invariant to the inclusion or exclusion of that data point. By focusing directly on the estimation error of the data point, our framework significantly lowers the perturbation required. We use this framework to propose a new privacy-aware K-means clustering algorithm. Using both synthetic and real datasets, we demonstrate that the utility of this algorithm is almost equal to that of unperturbed K-means and, at strict privacy levels, almost twice as good as that of its differential privacy counterpart.

Abstract:

This paper investigates the problem of minimizing data transfer between different data centers of the cloud during the neurological diagnostics of cardiac autonomic neuropathy (CAN). This problem has not been considered in the literature before. All classifiers previously considered for the diagnostics of CAN assume complete access to all data, which would lead to an enormous data transfer burden during training if such classifiers were deployed in the cloud. We introduce a new model of clustering-based multi-layer distributed ensembles (CBMLDE). It is designed to eliminate the need to transfer data between different data centers when training the classifiers. We conducted experiments using a dataset derived from the extensive DiScRi database. Our comprehensive tests determined the best combinations of options for setting up CBMLDE classifiers. The results demonstrate that CBMLDE classifiers not only completely eliminate the need for patient data transfer, but also significantly outperform all base classifiers and simpler counterpart models in all cloud frameworks.

Abstract:

Because real-world networks contain potentially important information, link prediction has become a focus of interest in different branches of science. Nevertheless, in the "big data" era, link prediction faces significant challenges, such as how to make predictions on massive data efficiently and accurately. In this paper, we propose two novel node-coupling clustering approaches and their extensions for link prediction, which combine the coupling degrees of the common neighbor nodes of a predicted node pair with the cluster geometries of nodes. We then present an experimental evaluation that compares the prediction accuracy and effectiveness of our approaches with those of representative existing methods on two synthetic datasets and six real-world datasets. The experimental results show that our approaches outperform the existing methods.

Abstract:

In many real-world computer vision applications, such as multi-camera surveillance, the objects of interest are captured by visual sensors concurrently, resulting in multi-view data. These views usually provide complementary information to each other. One recent and powerful computer vision method for clustering is sparse subspace clustering (SSC); however, it was not designed for multi-view data, which violate its linear separability assumption. To integrate complementary information between views, multi-view clustering algorithms are required to improve clustering performance. In this paper, we propose a novel multi-view subspace clustering method that searches for a unified latent structure to serve as a global affinity matrix in subspace clustering. Because it integrates the affinity matrices of the individual views, this global affinity matrix can best represent the relationship between clusters. This could help achieve better performance on face clustering. To solve the formulation, we derive a provably convergent and computationally efficient algorithm based on the alternating direction method of multipliers (ADMM) framework. We demonstrate that this formulation outperforms alternatives based on state-of-the-art methods on challenging multi-view face datasets.

Abstract:

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on the widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ the MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identification method to connect the borders of the clustering results for each cluster. Finally, we divide Beijing into traffic subareas using the proposed approach, based on real-world trajectory data sets generated by 12,000 taxis over a period of one month. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, greater accuracy, and better scalability, and can effectively divide traffic subareas with big taxi trajectory data.
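To make the MapReduce formulation of K-Means mentioned above concrete, here is a single-process sketch of one iteration written as a map step and a reduce step. It uses plain Euclidean distance and does not reproduce the modified distance metric, initialization strategy, or Hadoop setup of Par3PKM; all function names are hypothetical.

```python
import math
from collections import defaultdict

def map_assign(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)),
                  key=lambda i: math.dist(point, centroids[i]))
    return nearest, (point, 1)

def reduce_mean(values):
    """Reduce step: average all points emitted under one centroid index."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + x for t, x in zip(total, point)]
        count += n
    return tuple(t / count for t in total)

def kmeans_iteration(points, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    groups = defaultdict(list)
    for p in points:                      # in Hadoop, mappers run in parallel
        key, value = map_assign(p, centroids)
        groups[key].append(value)         # shuffle: group values by key
    # A centroid that received no points is kept unchanged.
    return [reduce_mean(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))]

points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids = kmeans_iteration(points, [(1, 1), (9, 9)])
```

In a real deployment the map and reduce functions would run on separate workers, and the driver would repeat the iteration until the centroids stop moving.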

Abstract:

Utility companies provide electricity to a large number of consumers. These companies need an accurate forecast of next-day electricity demand. Any forecast error results in either reliability issues or increased costs for the company. Because of the widespread roll-out of smart meters, a large amount of high-resolution consumption data that was not available in the past is now accessible. This new data can be used to improve the load forecast and, as a result, increase the reliability and decrease the expenses of electricity providers. In this paper, a number of methods for improving the load forecast using smart meter data are discussed. In these methods, consumers are first divided into a number of clusters. A neural network is then trained for each cluster, and the forecasts of these networks are added together to form the prediction for the aggregated load. It is demonstrated that clustering increases the forecast accuracy significantly. The criteria used for grouping consumers play an important role in this process. In this work, three different feature selection methods for clustering consumers are explained, and the effect of feature extraction methods on forecast error is investigated.

Abstract:

Clustering is applied in wireless sensor networks to increase energy efficiency. Clustering methods in wireless sensor networks differ from those in traditional data mining systems. This paper proposes a novel clustering algorithm, named MSTME, based on the Minimal Spanning Tree (MST) and the Maximum Energy resource on sensors. The specific constraints of clustering in wireless sensor networks and several evaluation metrics are also given. When evaluated by these metrics, MSTME performs better than the known clustering methods Low Energy Adaptive Clustering Hierarchy (LEACH) and Base Station Controlled Dynamic Clustering Protocol (BCDCP). Simulation results show that MSTME increases energy efficiency and network lifetime compared with LEACH and BCDCP in two-hop and multi-hop networks, respectively. © World Scientific Publishing Company.

Abstract:

A novel cluster head (CH) selection algorithm, named MSTME, based on both the Minimal Spanning Tree and the Maximum Energy resource on sensors, is presented for prolonging the lifetime of wireless sensor networks. MSTME satisfies three principles of optimal CHs: they have the most energy among the sensors in their local clusters, they group approximately the same number of nearby sensors into clusters, and they are distributed evenly across the network in terms of location. Simulation shows that the network lifetime under MSTME exceeds that of its counterparts in two-hop and multi-hop wireless sensor networks.
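The abstract above does not describe how MSTME builds clusters from the spanning tree, so the sketch below should be read only as a generic illustration of the two ingredients it names. It assumes a standard MST clustering rule (cut the k-1 longest tree edges) and then picks the highest-energy node in each resulting cluster as the CH. The function names, the cutting rule, and the sample data are assumptions made for this example and are not taken from the paper.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (dist, u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n          # (distance to the tree, nearest tree node)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def cluster_heads(points, energy, k):
    """Cut the k-1 longest MST edges, then pick the max-energy node per cluster."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(u)] = find(v)
    heads = {}
    for node in range(len(points)):
        root = find(node)
        if root not in heads or energy[node] > energy[heads[root]]:
            heads[root] = node
    return sorted(heads.values())

sensors = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
energy = [1.0, 5.0, 2.0, 3.0, 4.0]
heads = cluster_heads(sensors, energy, k=2)
```

With these sample sensors, the single long edge between the two groups is cut, and nodes 1 and 4 are chosen because they hold the most energy in their respective clusters.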

Abstract:

Cluster analysis has been identified as a core task in data mining. What constitutes a cluster, or a good clustering, may depend on the background of the researchers and on the application. This paper proposes two optimization criteria, abstract degree and fidelity, in the field of image abstraction. To satisfy the fidelity criterion, a novel clustering algorithm named Global Optimized Color-based DBSCAN Clustering (GOC-DBSCAN) is presented. A non-optimized version of GOC-DBSCAN based on local color information, called HSV-DBSCAN, is also given. Both are based on the HSV color space. The clusters produced by GOC-DBSCAN are analyzed to find the factors that affect both abstract degree and fidelity. Examples show that, in general, the greater the abstract degree, the lower the fidelity. They also show that GOC-DBSCAN outperforms HSV-DBSCAN when the two are evaluated by the two optimization criteria.

Abstract:

In this paper, a novel approach is proposed to automatically generate both a watercolor painting and a pencil sketch drawing, or binary contour image, from a realistic-style photo using DBSCAN color clustering in the HSV color space. While the color clusters produced by the proposed method help to create the watercolor painting, the noise pixels are useful for generating the pencil sketch drawing. Moreover, noise pixels are reassigned to color clusters by a novel algorithm to refine the contours in the watercolor painting. The main goal of this paper is to inspire the imagination of non-professional artists so that they can easily produce traditional-style paintings by adjusting only a few parameters. Another contribution of this paper is an easy method for producing the binary contour image, which is a by-product of mining image data with DBSCAN clustering. The binary image is thus useful in resource-limited systems for reducing data while keeping enough image information. © 2007 IEEE.