21 results for Labeling hierarchical clustering
Abstract:
Soybean is a crop of great importance worldwide. In plant breeding programs, knowledge of genetic diversity is extremely important, and multivariate analyses are frequently used for this purpose. The aim of the present study was therefore to evaluate the genetic divergence among soybean crosses using multivariate techniques. In total, 16 crosses in the F2 generation of inbreeding were evaluated. The traits evaluated were plant height at maturity, height of the first pod, number of branches per plant, number of pods per plant, number of nodes per plant, hundred-seed weight, grain yield and oil content. The analyses used Euclidean distance, the UPGMA and Ward hierarchical clustering methods, and principal component analysis. Genetic distances estimated with the Euclidean distance ranged from 1.24 to 8.13, with the smallest distance observed between crosses C1 and C4 and the greatest between crosses C2 and C6. Both the UPGMA and Ward clustering methods grouped the crosses into five distinct groups. The first three principal components explained 86.2% of the variance contained in the original eight variables. The traits APM, NV, NR, NN, PG% and oil were the main contributors to genetic divergence. Multivariate techniques were crucial to the analysis of genetic diversity, and the Ward and UPGMA clustering methods and principal component analysis gave consistent results; the simultaneous use of these tools in the genetic analysis of crosses is therefore recommended.
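The workflow described in this abstract (Euclidean distances, UPGMA and Ward clustering, principal components) might be sketched roughly as follows. This is only an illustration under stated assumptions: the trait matrix is random placeholder data, not the study's measurements, the standardization step is an assumption (the abstract does not say whether traits were scaled), and the name `cluster_crosses` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_crosses(traits, n_groups=5):
    """Hypothetical helper: distances, UPGMA/Ward groups, and PCA variance shares."""
    # Assumption: traits are standardized so no single unit dominates the distance.
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)

    # Pairwise Euclidean distances between crosses (condensed form).
    dists = pdist(z, metric="euclidean")

    # UPGMA is average linkage; Ward minimizes within-group variance.
    upgma_tree = linkage(dists, method="average")
    ward_tree = linkage(z, method="ward")

    # Cut each tree into at most n_groups clusters.
    upgma = fcluster(upgma_tree, t=n_groups, criterion="maxclust")
    ward = fcluster(ward_tree, t=n_groups, criterion="maxclust")

    # PCA via SVD of the (already centered) standardized matrix.
    _, singular_values, _ = np.linalg.svd(z, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return dists, upgma, ward, explained


# Placeholder data: 16 crosses x 8 traits, random values for illustration only.
rng = np.random.default_rng(0)
traits = rng.normal(size=(16, 8))
dists, upgma, ward, explained = cluster_crosses(traits)
print("distance range:", dists.min(), dists.max())
print("variance explained by first three components:", explained[:3].sum())
```

With real data, comparing the `upgma` and `ward` label vectors would show whether the two methods agree on the grouping, which is the consistency the abstract reports.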
Abstract:
Graduate Program in Agronomy (Plant Production) - FCAV
Long-term clinical evaluation of the color stability and stainability of acrylic resin denture teeth
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Although association mining has received much attention in recent years, the huge number of rules it generates hampers its use. To overcome this problem, many post-processing approaches have been suggested, such as clustering, which organizes the rules into groups that contain, in some sense, similar knowledge. Nevertheless, clustering can aid the user only if good descriptors are associated with each group. This is a relevant issue, since the labels give the user a view of the topics to be explored and help guide their search. This is useful, for example, when the user has no a priori idea of where to start. Thus, the analysis of different labeling methods for association rule clustering is important. In light of these arguments, this paper analyzes some labeling methods through two proposed measures. The first, Precision, measures how accurately the labels found by a method represent the rules contained in their group; the second, Repetition Frequency, determines how the labels are distributed across the clusters. As a result, it was possible to identify the methods and the domain organizations with the best performance that can be applied to clusters of association rules.
Abstract:
A simulation study was made of the effects of mixing two evolutionary forces (natural selection and random genetic drift), combined in a single data matrix of gene frequencies, on the resulting genetic distances among populations. Twenty-one kinds of simulated gene frequency surfaces, for 15 populations linearly distributed over geographic space, were used to construct 21 data matrices combining different proportions of two types of surfaces (gradients and random surfaces). These matrices were analysed by Unweighted Pair-Group Method with Arithmetic Averages (UPGMA) clustering and by Principal Coordinate Analysis. The results show that ordination is more accurate than UPGMA in revealing the spatial patterns in the genetic distances, when compared with results obtained using the Mantel test, which directly compares genetic and geographic distances.
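The Mantel test mentioned in this abstract can be illustrated with a minimal permutation sketch. Everything here is an assumption made for illustration: the function name `mantel_test`, the use of Pearson correlation, the one-sided permutation p-value, and the simulated gradient-plus-noise frequencies are not taken from the study.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def mantel_test(d1, d2, n_perm=999, seed=0):
    """Hypothetical helper: Pearson-based Mantel test for two square distance matrices."""
    n = d1.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of populations counted once
    r_obs = np.corrcoef(d1[upper], d2[upper])[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(n_perm):
        # Permute rows and columns of one matrix together to keep it a valid distance matrix.
        perm = rng.permutation(n)
        r_perm = np.corrcoef(d1[perm][:, perm][upper], d2[upper])[0, 1]
        if r_perm >= r_obs:
            at_least_as_large += 1

    # One-sided p-value; the observed statistic counts as one permutation.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return r_obs, p_value


# Placeholder data: 15 populations on a line, 5 loci following a gradient plus noise.
rng = np.random.default_rng(1)
positions = np.arange(15, dtype=float)
geo = np.abs(positions[:, None] - positions[None, :])
freqs = np.linspace(0.1, 0.9, 15)[:, None] + rng.normal(0.0, 0.02, size=(15, 5))
gen = squareform(pdist(freqs, metric="euclidean"))

r_obs, p_value = mantel_test(gen, geo)
print("Mantel r:", r_obs, "p-value:", p_value)
```

Mixing in a larger share of random noise relative to the gradient would weaken the correlation, which loosely mirrors the mixing of gradient and random surfaces described in the abstract.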
Abstract:
Nowadays, organizations face the problem of keeping their information protected, available and trustworthy. In this context, machine learning techniques have been extensively applied to this task. Since manual labeling is very expensive, several works attempt to handle intrusion detection with traditional clustering algorithms. In this paper, we introduce a new pattern recognition technique called Optimum-Path Forest (OPF) clustering to this task. Experiments on three public datasets have shown that the OPF classifier may be a suitable tool for detecting intrusions on computer networks, since it outperformed some state-of-the-art unsupervised techniques. © 2012 IEEE.