10 results for Fuzzy similarity

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

20.00%

Publisher:

Abstract:

Analysis of floristic similarity relationships between plant communities can detect patterns of species occurrence and also explain conditioning factors. Searching for such patterns, floristic similarity relationships among Atlantic Forest sites situated at Ibiuna Plateau, São Paulo State, Brazil, were analyzed by multivariate techniques. Twenty-one forest fragments and six sites within a continuous Forest Reserve were included in the analyses. Floristic composition and structure of the tree community (minimum dbh 5 cm) were assessed using the point-centered quarter method. Two methods were used for multivariate analysis: Detrended Correspondence Analysis (DCA) and Two-Way Indicator Species Analysis (TWINSPAN). Similarity relationships among the study areas were based on the successional stage of the community and also on spatial proximity. The more similar the successional stage of the communities, the higher the floristic similarity between them, especially if the communities are geographically close. A floristic gradient from north to south was observed, suggesting a transition between biomes, since northern indicator species are mostly heliophytes, occurring also in cerrado vegetation and seasonal semideciduous forest, while southern indicator species are mostly ombrophilous and climax species typical of dense evergreen Atlantic Forest.

Relevance:

20.00%

Publisher:

Abstract:

This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

There is a family of well-known external clustering validity indexes to measure the degree of compatibility or similarity between two hard partitions of a given data set, including partitions with different numbers of categories. A unified, fully equivalent set-theoretic formulation for an important class of such indexes was derived and extended to the fuzzy domain in a previous work by the author [Campello, R.J.G.B., 2007. A fuzzy extension of the Rand index and other related indexes for clustering and classification assessment. Pattern Recognition Lett., 28, 833-841]. However, the proposed fuzzy set-theoretic formulation is not valid as a general approach for comparing two fuzzy partitions of data. Instead, it is an approach for comparing a fuzzy partition against a hard referential partition of the data into mutually disjoint categories. In this paper, generalized external indexes for comparing two data partitions with overlapping categories are introduced. These indexes can be used as general measures for comparing two partitions of the same data set into overlapping categories. An important issue that is seldom addressed in the literature is also treated in the paper, namely, how to compare two partitions of different subsamples of data. A number of pedagogical examples and three simulation experiments are presented and analyzed in detail. A review of recent related work compiled from the literature is also provided. (c) 2010 Elsevier B.V. All rights reserved.
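
For readers unfamiliar with the hard-partition indexes that this abstract takes as its starting point, the classical Rand index can be sketched as follows. This is only the well-known pair-counting definition for two hard partitions, not the fuzzy or overlapping-category generalization the paper introduces; the function name `rand_index` and the label-list input format are assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classical Rand index between two hard partitions of the same objects.

    labels_a[i] and labels_b[i] are the category labels that the two
    partitions assign to object i. The partitions may use different
    numbers of categories. (Hypothetical helper, not from the paper.)
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")

    agreements = 0
    for i, j in pairs:
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement when both partitions put the two
        # objects together, or both partitions keep them apart.
        if same_in_a == same_in_b:
            agreements += 1
    return agreements / len(pairs)
```

Label names do not matter, only the grouping: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` treats the two partitions as identical.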

Relevance:

20.00%

Publisher:

Abstract:

An important feature of a database management system (DBMS) is its client/server architecture, where managing shared memory among the clients and the server is always a tough issue. Similarity queries are especially sensitive to this kind of architecture, since the answer sizes vary widely. Usually, the answer to a similarity query is fully processed and sent in full to the user, who is often interested in just part of it, e.g. just a few elements closer to or farther from the query reference. Compelling the DBMS to retrieve the full answer, only to ignore most of it, is at the very least a waste of server processing power. Paging the answer is a technique that splits the answer into several pages, following client requests. Despite the success of paging in traditional queries, little work has been done to support it in similarity queries. In this work, we present a technique that not only provides paging in similarity range or k-nearest neighbor queries, but also supports them in two variations: the forward similarity query and the backward similarity query. They return elements either increasingly farther from or increasingly closer to the query reference. The reported experiments show that, depending on the proportion of the interesting part over the full answer, both techniques allow answering queries much faster than in the non-paged way. (C) 2010 Elsevier Inc. All rights reserved.
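
The idea of forward and backward paging can be illustrated with a minimal sketch. It is not the technique of the paper: it assumes an in-memory collection, computes every distance up front with a linear scan, and only defers the ordering work to the moment a page is requested. The name `paged_similarity_query` and its parameters are hypothetical.

```python
import heapq
from itertools import count


def paged_similarity_query(items, query, distance, page_size, backward=False):
    """Yield pages of items ordered by distance to the query reference.

    Forward mode returns elements increasingly farther from the query;
    backward mode returns elements increasingly closer to it.
    (Illustrative linear-scan sketch, not the paper's access method.)
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    sign = -1 if backward else 1
    tie_breaker = count()  # keeps heap entries comparable for equal distances
    heap = [(sign * distance(query, item), next(tie_breaker), item)
            for item in items]
    heapq.heapify(heap)  # O(n); each later page costs O(page_size * log n)

    while heap:
        page = []
        while heap and len(page) < page_size:
            _, _, item = heapq.heappop(heap)
            page.append(item)
        yield page
```

Because the function is a generator, a client that only needs the first page pays for a single `next(...)` call and never forces the remaining pages to be ordered.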

Relevance:

20.00%

Publisher:

Abstract:

Modern database applications are increasingly employing database management systems (DBMS) to store multimedia and other complex data. To adequately support the queries required to retrieve these kinds of data, the DBMS needs to answer similarity queries. However, the standard structured query language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactical constructions to express similarity predicates into the existing SQL syntax and describes the implementation of a similarity retrieval engine that allows posing similarity queries using the language extension in a relational DBMS. The engine allows the evaluation of every aspect of the proposed extension, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright (c) 2008 John Wiley & Sons, Ltd.

Relevance:

20.00%

Publisher:

Abstract:

This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.

Relevance:

20.00%

Publisher:

Abstract:

Each square complex matrix is unitarily similar to an upper triangular matrix with diagonal entries in any prescribed order. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular $n \times n$ matrices that are not similar to direct sums of square matrices of smaller sizes, or are in general position and have the same main diagonal. We prove that $A$ and $B$ are unitarily similar if and only if $\|h(A_k)\| = \|h(B_k)\|$ for all $h \in \mathbb{C}[x]$ and $k = 1, \dots, n$, where $A_k := [a_{ij}]_{i,j=1}^{k}$ and $B_k := [b_{ij}]_{i,j=1}^{k}$ are the leading principal $k \times k$ submatrices of $A$ and $B$, and $\|\cdot\|$ is the Frobenius norm. (C) 2011 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

A square matrix is nonderogatory if its Jordan blocks have distinct eigenvalues. We give canonical forms for (1) nonderogatory complex matrices up to unitary similarity, and (2) pairs of complex matrices up to similarity, in which one matrix has distinct eigenvalues. The types of these canonical forms are given by undirected and, respectively, directed graphs with no undirected cycles. (C) 2011 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

Hologram quantitative structure-activity relationships (HQSAR) were applied to a data set of 41 cruzain inhibitors. The best HQSAR model ($Q^2 = 0.77$; $R^2 = 0.90$), employing Surflex-Sim as the training and test set generator, was obtained using atoms, bonds, and connections as fragment distinctions and 4-7 as fragment size. This model was then used to predict the potencies of 12 test set compounds, giving a satisfactory predictive $R^2$ value of 0.88. The contribution maps obtained from the best HQSAR model are in agreement with the biological activities of the study compounds. The Trypanosoma cruzi cruzain shares high similarity with the mammalian homolog cathepsin L. The selectivity toward cruzain was checked using a database of 123 compounds, which corresponds to the 41 cruzain inhibitors used in the HQSAR model development plus 82 cathepsin L inhibitors. We screened these compounds by ROCS (Rapid Overlay of Chemical Structures), a Gaussian-shape volume overlap filter that can rapidly identify shapes that match the query molecule. Remarkably, ROCS was able to rank the first 37 hits as being only cruzain inhibitors. In addition, the area under the curve (AUC) obtained with ROCS was 0.96, indicating that the method was very efficient at distinguishing between cruzain and cathepsin L inhibitors. (c) 2007 Elsevier Ltd. All rights reserved.
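
To make the AUC figure concrete, the sketch below shows how an area under the ROC curve can be computed from any list of screening scores and active/inactive labels. It uses the standard rank-based (Mann-Whitney) formulation and is independent of ROCS; the function name `roc_auc` and the toy inputs in the usage note are assumptions, not data from the paper.

```python
def roc_auc(scores, is_active):
    """Area under the ROC curve for a scored screening list.

    Equals the probability that a randomly chosen active compound is
    scored above a randomly chosen inactive one, with ties counted
    as one half. (Generic helper, not part of ROCS.)
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    inactives = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for active_score in actives:
        for inactive_score in inactives:
            if active_score > inactive_score:
                wins += 1.0
            elif active_score == inactive_score:
                wins += 0.5
    return wins / (len(actives) * len(inactives))
```

A value of 1.0 means every active outranks every inactive, while 0.5 corresponds to a random ordering; for example, `roc_auc([0.9, 0.8, 0.3, 0.1], [True, True, False, False])` gives a perfect separation.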

Relevance:

20.00%

Publisher:

Abstract:

Cytochrome P450 (CYP450) is a class of enzymes for which substrate identification is particularly important. It would help medicinal chemists to design drugs with fewer side effects due to drug-drug interactions and to extensive genetic polymorphism. Herein, we discuss the application of 2D and 3D similarity searches in identifying reference structures with higher capacity to retrieve substrates of three important CYP enzymes (CYP2C9, CYP2D6, and CYP3A4). On the basis of the complementarities of multiple reference structures selected by different similarity search methods, we proposed the fusion of their individual Tanimoto scores into a consensus Tanimoto score (T(consensus)). Using this new score, true positive rates of 63% (CYP2C9) and 81% (CYP2D6) were achieved with false positive rates of 4% for the CYP2C9-CYP2D6 data set. Extended similarity searches were carried out on a validation data set, and the results showed that by using the T(consensus) score, not only did the area under the ROC curve increase, but more substrates were also recovered at the beginning of the ranked list.
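
As a rough illustration of score fusion, the sketch below computes Tanimoto coefficients between a candidate and several reference structures and fuses them into one value. The abstract does not state the fusion rule, so the default of taking the maximum is purely an assumption, as are the function names and the representation of fingerprints as sets of "on" bit positions.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def consensus_tanimoto(candidate, references, fuse=max):
    """Fuse the candidate's Tanimoto scores against several references.

    ASSUMPTION: `fuse=max` is a placeholder rule; the paper's actual
    T(consensus) definition is not given in the abstract.
    """
    if not references:
        raise ValueError("need at least one reference structure")
    return fuse([tanimoto(candidate, ref) for ref in references])
```

Any other reducer over a list of scores can be passed as `fuse`, for instance `lambda scores: sum(scores) / len(scores)` for mean fusion.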