26 results for Concept Clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clustering is a difficult task: there is no single cluster definition and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms (e.g., MOCK, Multi-Objective Clustering with automatic K-determination, and MOCLE, Multi-Objective Clustering Ensemble) were proposed to tackle these problems. However, the output of such algorithms can often contain a high number of partitions, making it difficult for an expert to analyze all of them manually. To deal with this problem, we present two selection strategies, both based on the corrected Rand index, to choose a subset of solutions. To test them, they are applied to the sets of solutions produced by MOCK and MOCLE for several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both proposed selection strategies are very effective. They can significantly reduce the number of solutions and, at the same time, keep the quality and the diversity of the partitions in the original set of solutions. (C) 2010 Elsevier B.V. All rights reserved.
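The corrected Rand index mentioned in this abstract measures agreement between two partitions of the same objects, adjusted for chance. The following is a minimal sketch of the standard Hubert and Arabie formulation only; it does not reproduce the paper's two selection strategies, and the function name `corrected_rand` is a hypothetical one chosen for illustration.

```python
from collections import Counter
from math import comb


def corrected_rand(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two hard partitions.

    Standard Hubert-Arabie formulation; illustrative helper, not the
    paper's selection strategy.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no object pairs to disagree on

    # Pairs of objects placed together in both partitions.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    maximum = (sum_a + sum_b) / 2          # largest attainable agreement
    if maximum == expected:
        return 1.0  # both partitions are trivial in the same way
    return (sum_cells - expected) / (maximum - expected)


same = corrected_rand([0, 0, 1, 1], [1, 1, 0, 0])   # same grouping, renamed labels
cross = corrected_rand([0, 0, 1, 1], [0, 1, 0, 1])  # every pair split differently
```

A value near 1 indicates two near-identical partitions, so a selection procedure could plausibly use pairwise values like these to discard redundant solutions; that use is an assumption about the general idea, not a description of the authors' procedure.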

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its own bias, making it more suitable for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm that solves this formulation using GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm produced the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we present an algorithm for cluster analysis that integrates aspects from cluster ensemble and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm, with a special crossover operator, which uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
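For readers unfamiliar with the rank-based indexes named in this abstract, the sketch below computes the classical, unweighted Goodman-Kruskal gamma and Kendall tau-a from concordant and discordant pairs. It is not the weighted formulation the paper derives, and the function names are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def concordance_counts(x, y):
    """Count concordant and discordant pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        sign = (xi - xj) * (yi - yj)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return concordant, discordant


def goodman_kruskal_gamma(x, y):
    """Classical gamma: tied pairs are dropped from the denominator."""
    c, d = concordance_counts(x, y)
    if c + d == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (c - d) / (c + d)


def kendall_tau_a(x, y):
    """Classical tau-a: all n(n-1)/2 pairs stay in the denominator."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    c, d = concordance_counts(x, y)
    return (c - d) / (n * (n - 1) / 2)


# Two of the six pairs are tied, so the two indexes disagree.
x = [1, 2, 2, 3]
y = [1, 2, 3, 3]
gamma = goodman_kruskal_gamma(x, y)  # 4 concordant, 0 discordant -> 1.0
tau = kendall_tau_a(x, y)            # 4 / 6
```

Both indexes use only the ordering of the values, not their magnitudes, which is the kind of sensitivity difference the abstract refers to.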

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e., repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clustering quality or validation indices allow the evaluation of the quality of a clustering in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices, mostly based on the concepts of cluster compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective regarding the appropriate use of quality indices for the purpose of clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented which considers the identification of appropriate index thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
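As one concrete instance of an index built on compactness and separation, the sketch below computes the Dunn index: the smallest distance between points in different clusters divided by the largest cluster diameter. The Dunn index is chosen here only as a well-known example; the abstract does not say which indices the paper studies, and the function name `dunn_index` is hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Dunn index of a hard partition (higher means compact, well-separated clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    # Compactness: largest distance between two points of the same cluster.
    diameter = max(
        (dist(p, q) for members in clusters.values() for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Separation: smallest distance between points of different clusters.
    separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    if diameter == 0.0:
        return float("inf")  # every cluster is a single point or duplicates
    return separation / diameter


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # groups the nearby points: 5 / 1
poor = dunn_index(points, [0, 1, 0, 1])  # groups the distant points: 1 / 5
```

Comparing `good` and `poor` shows how such an index ranks two candidate partitions of the same data without access to a reference solution.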

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this article we introduce the concept of a gradient-like nonlinear semigroup as an intermediate concept between a gradient nonlinear semigroup (those possessing a Lyapunov function, see [J.K. Hale, Asymptotic Behavior of Dissipative Systems, Math. Surveys Monogr., vol. 25, Amer. Math. Soc., 1989]) and a nonlinear semigroup possessing a gradient-like attractor. We prove that a perturbation of a gradient-like nonlinear semigroup remains a gradient-like nonlinear semigroup. Moreover, for non-autonomous dynamical systems we introduce the concept of a gradient-like evolution process and prove that a non-autonomous perturbation of a gradient-like nonlinear semigroup is a gradient-like evolution process. For gradient-like nonlinear semigroups and evolution processes, we prove continuity, characterization and (pullback and forwards) exponential attraction of their attractors under perturbation, extending the results of [A.N. Carvalho, J.A. Langa, J.C. Robinson, A. Suarez, Characterization of non-autonomous attractors of a perturbed gradient system, J. Differential Equations 236 (2007) 570-603] on characterization and of [A.V. Babin, M.I. Vishik, Attractors in Evolutionary Equations, Stud. Math. Appl., vol. 25, North-Holland, Amsterdam, 1992] on exponential attraction. (C) 2009 Elsevier Inc. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We study a symplectic chain with a non-local form of coupling by means of a standard map lattice where the interaction strength decreases with the lattice distance as a power law, in such a way that one can pass continuously from a local (nearest-neighbor) to a global (mean-field) type of coupling. We investigate the formation of map clusters, or spatially coherent structures generated by the system dynamics. Such clusters are found to be related to stickiness of chaotic phase-space trajectories near periodic island remnants, and also to the behavior of the diffusion coefficient. An approximate two-dimensional map is derived to explain some of the features of this connection. (C) 2008 Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The electrical impedance of an electrolytic cell in the shape of a slab is analyzed. We have numerically solved the differential equations governing the redistribution of the ions in the presence of an external electric field, and compared the results with those obtained by solving the linear approximation of these equations. The control parameters in our study are the amplitude and the frequency of the applied voltage, assumed to be a simple harmonic function of time. We show that for large amplitudes of the applied voltage, the actual current is no longer harmonic at low frequencies. From this result it follows that the electrical impedance of a cell is a useful concept only in the case where the linear approximation of the fundamental equations of the problem works well.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We use the deformed sine-Gordon models recently presented by Bazeia et al. [1] to take the first steps towards defining the concept of quasi-integrability. We consider one such definition and use it to calculate an infinite number of quasi-conserved quantities through a modification of the usual techniques of integrable field theories. Performing an expansion around the sine-Gordon theory, we are able to evaluate the charges and the anomalies of their conservation laws in a perturbative power series in a small parameter which describes the "closeness" to the integrable sine-Gordon model. We show that in the case of two-soliton scattering the charges, up to first order of perturbation, are conserved asymptotically, i.e., their values are the same in the distant past and future, when the solitons are well separated. We indicate that this property may or may not hold at higher orders, depending on the behavior of the two-soliton solution under a special parity transformation. For closely bound systems, such as breather-like field configurations, the situation is however more complex, and perhaps the anomalies have a different structure, implying that the concept of quasi-integrability does not apply in the same way as in the scattering of solitons. We back up our results with data from many numerical simulations, which also demonstrate the existence of long-lived breather-like and wobble-like states in these models.