821 resultados para Grid-based clustering approach


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the knowledge-based clustering approaches reported in the literature, explicit know ledge, typically in the form of a set of concepts, is used in computing similarity or conceptual cohesiveness between objects and in grouping them. We propose a knowledge-based clustering approach in which the domain knowledge is also used in the pattern representation phase of clustering. We argue that such a knowledge-based pattern representation scheme reduces the complexity of similarity computation and grouping phases. We present a knowledge-based clustering algorithm for grouping hooks in a library.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Clustering techniques are used in regional flood frequency analysis (RFFA) to partition watersheds into natural groups or regions with similar hydrologic responses. The linear Kohonen's self‐organizing feature map (SOFM) has been applied as a clustering technique for RFFA in several recent studies. However, it is seldom possible to interpret clusters from the output of an SOFM, irrespective of its size and dimensionality. In this study, we demonstrate that SOFMs may, however, serve as a useful precursor to clustering algorithms. We present a two‐level. SOFM‐based clustering approach to form regions for FFA. In the first level, the SOFM is used to form a two‐dimensional feature map. In the second level, the output nodes of SOFM are clustered using Fuzzy c‐means algorithm to form regions. The optimal number of regions is based on fuzzy cluster validation measures. Effectiveness of the proposed approach in forming homogeneous regions for FFA is illustrated through application to data from watersheds in Indiana, USA. Results show that the performance of the proposed approach to form regions is better than that based on classical SOFM.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Regionalization approaches are widely used in water resources engineering to identify hydrologically homogeneous groups of watersheds that are referred to as regions. Pooled information from sites (depicting watersheds) in a region forms the basis to estimate quantiles associated with hydrological extreme events at ungauged/sparsely gauged sites in the region. Conventional regionalization approaches can be effective when watersheds (data points) corresponding to different regions can be separated using straight lines or linear planes in the space of watershed related attributes. In this paper, a kernel-based Fuzzy c-means (KFCM) clustering approach is presented for use in situations where such linear separation of regions cannot be accomplished. The approach uses kernel-based functions to map the data points from the attribute space to a higher-dimensional space where they can be separated into regions by linear planes. A procedure to determine optimal number of regions with the KFCM approach is suggested. Further, formulations to estimate flood quantiles at ungauged sites with the approach are developed. Effectiveness of the approach is demonstrated through Monte-Carlo simulation experiments and a case study on watersheds in United States. Comparison of results with those based on conventional Fuzzy c-means clustering, Region-of-influence approach and a prior study indicate that KFCM approach outperforms the other approaches in forming regions that are closer to being statistically homogeneous and in estimating flood quantiles at ungauged sites. Key Points Kernel-based regionalization approach is presented for flood frequency analysis Kernel procedure to estimate flood quantiles at ungauged sites is developed A set of fuzzy regions is delineated in Ohio, USA

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers' consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed. (C) 2016 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers’ consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers’ consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The keyword based search technique suffers from the problem of synonymic and polysemic queries. Current approaches address only theproblem of synonymic queries in which different queries might have the same information requirement. But the problem of polysemic queries,i.e., same query having different intentions, still remains unaddressed. In this paper, we propose the notion of intent clusters, the members of which will have the same intention. We develop a clustering algorithm that uses the user session information in query logs in addition to query URL entries to identify cluster of queries having the same intention. The proposed approach has been studied through case examples from the actual log data from AOL, and the clustering algorithm is shown to be successful in discerning the user intentions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Delineation of homogeneous precipitation regions (regionalization) is necessary for investigating frequency and spatial distribution of meteorological droughts. The conventional methods of regionalization use statistics of precipitation as attributes to establish homogeneous regions. Therefore they cannot be used to form regions in ungauged areas, and they may not be useful to form meaningful regions in areas having sparse rain gauge density. Further, validation of the regions for homogeneity in precipitation is not possible, since the use of the precipitation statistics to form regions and subsequently to test the regional homogeneity is not appropriate. To alleviate this problem, an approach based on fuzzy cluster analysis is presented. It allows delineation of homogeneous precipitation regions in data sparse areas using large scale atmospheric variables (LSAV), which influence precipitation in the study area, as attributes. The LSAV, location parameters (latitude, longitude and altitude) and seasonality of precipitation are suggested as features for regionalization. The approach allows independent validation of the identified regions for homogeneity using statistics computed from the observed precipitation. Further it has the ability to form regions even in ungauged areas, owing to the use of attributes that can be reliably estimated even when no at-site precipitation data are available. The approach was applied to delineate homogeneous annual rainfall regions in India, and its effectiveness is illustrated by comparing the results with those obtained using rainfall statistics, regionalization based on hard cluster analysis, and meteorological sub-divisions in India. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Among the multiple advantages and applications of remote sensing, one of the most important uses is to solve the problem of crop classification, i.e., differentiating between various crop types. Satellite images are a reliable source for investigating the temporal changes in crop cultivated areas. In this letter, we propose a novel bat algorithm (BA)-based clustering approach for solving crop type classification problems using a multispectral satellite image. The proposed partitional clustering algorithm is used to extract information in the form of optimal cluster centers from training samples. The extracted cluster centers are then validated on test samples. A real-time multispectral satellite image and one benchmark data set from the University of California, Irvine (UCI) repository are used to demonstrate the robustness of the proposed algorithm. The performance of the BA is compared with two other nature-inspired metaheuristic techniques, namely, genetic algorithm and particle swarm optimization. The performance is also compared with the existing hybrid approach such as the BA with K-means. From the results obtained, it can be concluded that the BA can be successfully applied to solve crop type classification problems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A large amount of biological data has been produced in the last years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover. it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three known other algorithms. The proposed algorithm presented the best clustering results confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes a method for data clustering based on complex networks theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by taking into account five community detection algorithms. The network-based clustering approach is applied in two real-world databases and two sets of artificially generated data. The obtained results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarities between pairs of objects. In addition, the community identification method based on the greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate. (C) 2012 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The selection of predefined analytic grids (partitions of the numeric ranges) to represent input and output functions as histograms has been proposed as a mechanism of approximation in order to control the tradeoff between accuracy and computation times in several áreas ranging from simulation to constraint solving. In particular, the application of interval methods for probabilistic function characterization has been shown to have advantages over other methods based on the simulation of random samples. However, standard interval arithmetic has always been used for the computation steps. In this paper, we introduce an alternative approximate arithmetic aimed at controlling the cost of the interval operations. Its distinctive feature is that grids are taken into account by the operators. We apply the technique in the context of probability density functions in order to improve the accuracy of the probability estimates. Results show that this approach has advantages over existing approaches in some particular situations, although computation times tend to increase significantly when analyzing large functions.