765 results for Grouping, clustering, campi, associazione
Abstract:
The CMS High-Level Trigger (HLT) is responsible for ensuring that data samples with potentially interesting events are recorded with high efficiency and good quality. This paper gives an overview of the HLT and focuses on its commissioning using cosmic rays. The selection of triggers that were deployed is presented and the online grouping of triggered events into streams and primary datasets is discussed. Tools for online and offline data quality monitoring for the HLT are described, and the operational performance of the muon HLT algorithms is reviewed. The average time taken for the HLT selection and its dependence on detector and operating conditions are presented. The HLT performed reliably and helped provide a large dataset. This dataset has proven to be invaluable for understanding the performance of the trigger and the CMS experiment as a whole. © 2010 IOP Publishing Ltd and SISSA.
Abstract:
Land use classification has become increasingly important in recent years, since it makes it possible to identify illegal land use and to monitor deforested areas. Although several works in the literature address this problem, we propose here land use recognition by means of Optimum-Path Forest (OPF) clustering, which has not previously been applied in this context. Experiments comparing Optimum-Path Forest, Mean Shift and K-Means demonstrated the robustness of OPF for automatic land use classification of images obtained by the CBERS-2B and Ikonos-2 satellites. © 2011 IEEE.
Abstract:
The significant volume of work accidents in cities causes a considerable loss to society. The development of Spatial Data Mining technologies presents a new perspective for the extraction of knowledge from the correlation between conventional and spatial attributes. One of the most important techniques of Spatial Data Mining is Spatial Clustering, which groups similar spatial objects to find a distribution of patterns, taking into account the geographical position of the objects. Applying this technique to the health area will provide information that can contribute to the planning of more adequate strategies for the prevention of work accidents. The original contribution of this work is to present an application of tools developed for Spatial Clustering that supply a set of graphic resources, which have helped to discover knowledge and to support management in the area of work accidents. © 2011 IEEE.
Abstract:
The post-processing of association rules is a difficult task, since a large number of patterns can be obtained. Many approaches have been developed to overcome this problem, such as objective measures and clustering, which are respectively used to: (i) highlight the potentially interesting knowledge in the domain; (ii) structure the domain, organizing the rules in groups that contain, in some way, similar knowledge. However, objective measures neither reduce nor organize the collection of rules, which makes the domain difficult to understand. Clustering, on the other hand, neither reduces the exploration space nor directs the user towards interesting knowledge, which makes the search for relevant knowledge harder. This work proposes the PAR-COM (Post-processing Association Rules with Clustering and Objective Measures) methodology, which combines clustering and objective measures to reduce the association rule exploration space and direct the user to what is potentially interesting. PAR-COM thereby minimizes the user's effort during post-processing.
Abstract:
Structural Health Monitoring (SHM) denotes a system with the ability to detect and interpret adverse changes in a structure. One of the critical challenges for the practical implementation of an SHM system is the ability to detect damage under changing environmental conditions. This paper aims to characterize the effects of temperature, load and damage on the sensor measurements obtained with piezoelectric transducer (PZT) patches. Data sets are collected on thin aluminum specimens under different environmental conditions and artificially induced damage states. A fuzzy clustering algorithm is used to organize the sensor measurements into a set of clusters, so that the variation in the sensor data can be attributed to temperature, load or induced damage.
Abstract:
The identification of non-technical losses has been of great importance in the last decade. Since the available datasets contain hundreds of legal and illegal profiles, a method that groups the data into subprofiles can narrow the search for consumers who commit large-scale fraud. In this context, an electric power company may be interested in examining a specific profile of illegal consumer in greater depth. In this paper, we introduce the Optimum-Path Forest (OPF) clustering technique to this task, and we evaluate its behavior on a dataset provided by a Brazilian electric power company with different values of an OPF parameter. © 2011 IEEE.
Abstract:
Wireless Sensor Networks (WSN) are a special kind of ad-hoc network that is usually deployed in a monitoring field in order to detect some physical phenomenon. Due to the low dependability of individual nodes, small radio coverage and large areas to be monitored, the nodes are generally organized in small clusters. Moreover, a large number of WSN nodes are usually deployed in the monitoring area to increase WSN dependability. Therefore, optimal cluster head positioning is a desirable characteristic in a WSN. In this paper, we propose a hybrid clustering algorithm based on community detection in complex networks and the traditional K-means clustering technique: the QK-Means algorithm. Simulation results show that QK-Means detects communities and sub-communities, so that the lost message rate is decreased and WSN coverage is increased. © 2012 IEEE.
Abstract:
Although association mining has received considerable attention in recent years, the huge number of rules generated hampers its use. To overcome this problem, many post-processing approaches have been suggested, such as clustering, which organizes the rules in groups that contain, in some way, similar knowledge. Nevertheless, clustering can aid the user only if good descriptors are associated with each group. This is a relevant issue, since the labels give the user a view of the topics to be explored and help to guide their search. This is useful, for example, when the user has no prior idea of where to start. Thus, the analysis of different labeling methods for association rule clustering is important. This paper analyzes some labeling methods through two proposed measures. The first, Precision, measures how well the methods find labels that accurately represent the rules contained in each group; the second, Repetition Frequency, determines how the labels are distributed across the clusters. As a result, it was possible to identify the methods and the domain organizations with the best performance that can be applied to clusters of association rules.
Abstract:
In this paper we propose a nature-inspired approach that can boost the Optimum-Path Forest (OPF) clustering algorithm by optimizing its parameters in a discrete lattice. Experiments on two public datasets have shown that the proposed algorithm can achieve parameter values similar to those found by exhaustive search. However, the proposed technique is faster than the traditional one, which makes it attractive for intrusion detection in large-scale traffic networks. © 2012 IEEE.
Abstract:
Image categorization by means of bags of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest, which are mapped to a Hilbert space representing a visual dictionary that aims to comprise the most discriminative features in a set of images. The main problem of such approaches, however, is to find a compact and representative dictionary. Finding such a dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to find such a dictionary automatically by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.
Abstract:
Nowadays, organizations face the problem of keeping their information protected, available and trustworthy. Machine learning techniques have been extensively applied to this task. Since manual labeling is very expensive, several works attempt to handle intrusion detection with traditional clustering algorithms. In this paper, we introduce a new pattern recognition technique called Optimum-Path Forest (OPF) clustering to this task. Experiments on three public datasets have shown that OPF may be a suitable tool to detect intrusions in computer networks, since it outperformed some state-of-the-art unsupervised techniques. © 2012 IEEE.
Abstract:
Includes bibliography
Abstract:
The objective of this study was to define production environments by grouping different environmental factors and, consequently, to assess genotype by production environment interactions on weaning weight (WW) in the Angus populations of Brazil and Uruguay. Climatic conditions were represented by monthly temperature means (°C), minimum and maximum temperatures in winter and summer, respectively, and accumulated rainfall (mm/year). The modal month of birth and of weaning, and calf weight (kg) and age (days) at weaning, were used as indicators of management conditions of 33 and 161 herds in 13 and 34 regions in Uruguay and Brazil, respectively. Two approaches were developed: (a) a bi-character analysis of extreme sub-datasets within each environmental factor (bottom and top 33% of regions); (b) a cluster analysis using standardized environmental factors, in which three different production environments (including farms from both countries) were defined. To identify the variables that influenced cluster formation, a discriminant analysis was carried out beforehand. Management (month, age and weight at weaning) and climatic factors (accumulated rainfall and winter and summer temperatures) were the most important factors in the clustering of farms. Bi- or trivariate analyses were performed to estimate heritability and genetic correlations for WW in extreme sub-datasets within each environmental factor or between clusters, using MTDFREML software. Heritability estimates of WW in the first approach ranged from 0.27 to 0.54, and genetic correlations between top and bottom sub-datasets within environmental factors ranged from -0.29 to 0.70. In the cluster approach, heritabilities were 0.58±0.04 for cluster 1, 0.31±0.01 for cluster 2 and 0.40±0.02 for cluster 3. Genetic correlations were 0.27±0.08, 0.32±0.09 and 0.33±0.09 between clusters 1 and 2, 1 and 3, and 2 and 3, respectively. Both approaches suggest the existence of genotype by environment interaction for weaning weight in the Angus breed of Brazil and Uruguay. © 2012 Elsevier B.V.
Abstract:
Many topics related to association mining have received attention in the research community, especially those focused on the discovery of interesting knowledge. A promising approach related to this topic is the application of clustering in the pre-processing step to help the user find the relevant associative patterns of the domain. In this paper, we propose nine metrics to support the evaluation of this kind of approach. The metrics are important since they provide criteria to: (a) analyze the methodologies, (b) identify their positive and negative aspects, (c) carry out comparisons among them and, therefore, (d) help users select the most suitable solution for their problems. Some experiments were carried out to show how the metrics can be used and to demonstrate their usefulness. © 2013 Springer-Verlag GmbH.
Abstract:
Graduate Program in Mathematics Education - IGCE