775 resultados para Algorithm clustering
Resumo:
The Evidence Accumulation Clustering (EAC) paradigm is a clustering ensemble method which derives a consensus partition from a collection of base clusterings obtained using different algorithms. It collects from the partitions in the ensemble a set of pairwise observations about the co-occurrence of objects in a same cluster and it uses these co-occurrence statistics to derive a similarity matrix, referred to as co-association matrix. The Probabilistic Evidence Accumulation for Clustering Ensembles (PEACE) algorithm is a principled approach for the extraction of a consensus clustering from the observations encoded in the co-association matrix based on a probabilistic model for the co-association matrix parameterized by the unknown assignments of objects to clusters. In this paper we extend the PEACE algorithm by deriving a consensus solution according to a MAP approach with Dirichlet priors defined for the unknown probabilistic cluster assignments. In particular, we study the positive regularization effect of Dirichlet priors on the final consensus solution with both synthetic and real benchmark data.
Resumo:
"Expectation-Maximization'' (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix $P$, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of $P$ and provide new results analyzing the effect that $P$ has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of Gaussian mixture models.
Resumo:
We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Resumo:
A graph clustering algorithm constructs groups of closely related parts and machines separately. After they are matched for the least intercell moves, a refining process runs on the initial cell formation to decrease the number of intercell moves. A simple modification of this main approach can deal with some practical constraints, such as the popular constraint of bounding the maximum number of machines in a cell. Our approach makes a big improvement in the computational time. More importantly, improvement is seen in the number of intercell moves when the computational results were compared with best known solutions from the literature. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
Radial basis functions can be combined into a network structure that has several advantages over conventional neural network solutions. However, to operate effectively the number and positions of the basis function centres must be carefully selected. Although no rigorous algorithm exists for this purpose, several heuristic methods have been suggested. In this paper a new method is proposed in which radial basis function centres are selected by the mean-tracking clustering algorithm. The mean-tracking algorithm is compared with k means clustering and it is shown that it achieves significantly better results in terms of radial basis function performance. As well as being computationally simpler, the mean-tracking algorithm in general selects better centre positions, thus providing the radial basis functions with better modelling accuracy
Resumo:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues is concerning energy reduction and the balancing of energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks have been shown to be an efficient way to improve the network performance and have received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm specifically with mobile sink support for home automation networks. In this work, the network is divided into several clusters and cluster heads are selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short range communications. The ACO algorithm has been utilized in this work in order to find the optimal mobility trajectory for the mobile sink. Extensive simulation results from this research show that the proposed algorithm significantly improves home network performance when using mobile sinks in terms of energy consumption and network lifetime as compared to other routing algorithms currently deployed for home automation networks.
Resumo:
A large amount of biological data has been produced in the last years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover. it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three known other algorithms. The proposed algorithm presented the best clustering results confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
The Capacitated Centered Clustering Problem (CCCP) consists of defining a set of p groups with minimum dissimilarity on a network with n points. Demand values are associated with each point and each group has a demand capacity. The problem is well known to be NP-hard and has many practical applications. In this paper, the hybrid method Clustering Search (CS) is implemented to solve the CCCP. This method identifies promising regions of the search space by generating solutions with a metaheuristic, such as Genetic Algorithm, and clustering them into clusters that are then explored further with local search heuristics. Computational results considering instances available in the literature are presented to demonstrate the efficacy of CS. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions, and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters, and assign the pixels to these clusters using the pheromone gradient. We applied ASCA to detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters needs not to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.
Resumo:
Peer reviewed