856 results for Data Driven Clustering
Abstract:
The taxonomy of the N(2)-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of the phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is carried out by introducing perturbations through sub-sampling techniques. The method proved effective in grouping the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as well as sub-clusters within each detected cluster. (C) 2008 Elsevier B.V. All rights reserved.
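The sub-sampling stability idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy 1-D `points` stand in for RFLP-PCR profiles, the function names are hypothetical, and pairwise co-membership agreement is just one possible stability score, not necessarily the one used by the authors.

```python
import random
from itertools import combinations

def average_linkage(points, k):
    """Agglomerative average-linkage clustering of 1-D points into k clusters."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(abs(points[i] - points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the closest pair of clusters (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] += clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for idx in members:
            labels[idx] = c
    return labels

def pair_agreement(labels_a, labels_b):
    """Fraction of item pairs on which two labelings agree about co-membership."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

def stability(points, k, n_rounds=20, fraction=0.8, seed=0):
    """Mean agreement between the full clustering and clusterings of sub-samples."""
    rng = random.Random(seed)
    full = average_linkage(points, k)
    scores = []
    for _ in range(n_rounds):
        idx = sorted(rng.sample(range(len(points)), int(fraction * len(points))))
        sub = average_linkage([points[i] for i in idx], k)
        scores.append(pair_agreement([full[i] for i in idx], sub))
    return sum(scores) / len(scores)

# Toy data: two well-separated groups (hypothetical values).
points = [0.0, 0.1, 0.2, 0.3, 0.4, 10.0, 10.1, 10.2, 10.3, 10.4]
print(stability(points, k=2))
```

A score close to 1 suggests that the partition survives the perturbation; lower scores suggest that the chosen number of clusters is not well supported by the data.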
Abstract:
The development of strategies for structural health monitoring (SHM) has become increasingly important because of the necessity of preventing undesirable damage. This paper describes an approach to this problem using vibration data. It involves a three-stage process: reduction of the time-series data using principal component analysis (PCA), the development of a data-based auto-regressive moving average (ARMA) model from data on an undamaged structure, and the classification of whether or not the structure is damaged using a fuzzy clustering approach. The approach is applied to data from a benchmark structure from Los Alamos National Laboratory, USA. Two fuzzy clustering algorithms are compared: fuzzy c-means (FCM) and Gustafson-Kessel (GK) algorithms. It is shown that while both fuzzy clustering algorithms are effective, the GK algorithm marginally outperforms the FCM algorithm. (C) 2008 Elsevier Ltd. All rights reserved.
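As a rough illustration of the final classification stage only, the sketch below runs the standard fuzzy c-means updates on a scalar feature. It is a sketch under stated assumptions: the `features` values are invented, the function names are hypothetical, the PCA and ARMA stages are omitted, and the GK variant (which adapts a distance metric per cluster) is not shown.

```python
def memberships(values, centers, m=2.0):
    """Fuzzy membership of each value in each cluster (each row sums to 1)."""
    rows = []
    for x in values:
        # Floor the distance to avoid division by zero when x sits on a center.
        d = [max(abs(x - c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(len(d)))
            for i in range(len(d))
        ])
    return rows

def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=100):
    """Alternate membership and center updates for a fixed number of iterations."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the data range.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    for _ in range(n_iter):
        u = memberships(values, centers, m)
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(n_clusters)
        ]
    return centers, memberships(values, centers, m)

# Hypothetical scalar damage feature: low values "undamaged", high values "damaged".
features = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
centers, u = fuzzy_c_means(features)
for x, row in zip(features, u):
    print(x, [round(v, 3) for v in row])
```

Unlike a hard assignment, each measurement receives a degree of membership in every cluster, so borderline cases remain visible instead of being forced into one class.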
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The frequency of adenine mononucleotides (A), dinucleotides (AA) and clusters, and the positions of clusters, were studied in 502 molecules of the 5S rRNA. All frequencies were reduced in the evolutive lines of vertebrates, plants and fungi, in parallel with increasing organismic complexity. No change was observed in invertebrates. All frequencies were increased in mitochondria, plastids and mycoplasmas. The presumed relatives to the ancestors of the organelles, Rhodobacteria alfa and Cyanobacteria, showed intermediate values, relative to the eubacterial averages. Firmibacterid showed a very high number of cluster sites. Clusters were more frequent in single-stranded regions in all organisms. The routes of organelles and mycoplasmas accumulated clusters at faster rates in double-stranded regions. Rates of change were higher for AA and clusters than for A in plants, vertebrates and organelles, higher for cluster sites and A in mycoplasmas, and higher for AA and A in fungi. These data indicated that selection pressures acted more strongly on adenine clustering than on adenine frequency. It is proposed that AA and clusters, as sites of lower informational content, have the property of tolerating positional variation in the sites of other molecules (or other regions of the same molecule) that interact with the adenines. This reasoning was consistent with the degrees of genic polymorphism, low in plants and vertebrates and high in invertebrates. In the eubacteria endosymbiontic or parasitic to eukaryotes, the more tolerant RNA would be better adapted to interactions with the homologous nucleus-derived ribosomal proteins; the intermediate values observed in their precursors were interpreted as preadaptive. Among other groups, only the Deinococcus-Thermus eubacteria showed excessive AA and cluster contents, possibly related to their peculiar tolerance to mutagens, and the Ciliates showed excessive AA contents, indicative of retention of primitive characters.
PHYLOGENETIC STUDIES OF SOME SPECIES OF THE GENUS COFFEA. 2. NUMERICAL ANALYSIS OF ISOENZYMATIC DATA
Abstract:
Thirteen species of Coffea were studied for five enzyme systems, including alpha and beta esterase, alkaline phosphatase, acid phosphatase, malate dehydrogenase and acid dehydrogenase. Three coefficients of similarity (Simple Matching, Jaccard and Ochiai) and three different clustering methods (Single Linkage, Complete Linkage and Unweighted Pair Group Method with Arithmetic Averages, UPGMA) were used to analyse the data. The phylogenetic relationships among the twelve diploid species and between them and the tetraploid species C. arabica showed that similarity among species of the same subsection is not always greater than among species of different subsections. In addition, although there are several similarity groups in common, established by isoenzymatic polymorphism, morphological characteristics, chemical data, crossability and geographic distribution, there is no common trend among the phylogenetic relationships as indicated by all these different evaluating procedures.
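The three similarity coefficients named in this abstract have standard definitions for presence/absence data, sketched below. The two band profiles and the function names are hypothetical and serve only to show how the coefficients differ; returning 0.0 when a coefficient is undefined is a convention assumed here.

```python
from math import sqrt

def match_counts(x, y):
    """Count shared presences (a), mismatches (b, c) and shared absences (d)."""
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d

def simple_matching(x, y):
    # Shared absences count as agreement.
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)

def jaccard(x, y):
    # Shared absences are ignored.
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0

def ochiai(x, y):
    # Shared presences relative to the geometric mean of the band counts.
    a, b, c, _ = match_counts(x, y)
    denom = sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0

# Hypothetical isoenzyme band profiles (1 = band present, 0 = band absent).
species_1 = [1, 1, 0, 1, 0, 0]
species_2 = [1, 0, 0, 1, 1, 0]
print(simple_matching(species_1, species_2))
print(jaccard(species_1, species_2))
print(ochiai(species_1, species_2))
```

Because Simple Matching rewards shared absences while Jaccard and Ochiai do not, the same pair of species can rank differently under each coefficient, which is one reason studies of this kind compare several of them.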
Abstract:
In this paper we focus on providing coordinated visual strategies to assist users in performing tasks driven by the presence of temporal and spatial attributes. We introduce temporal visualization techniques targeted at such tasks, and illustrate their use with an application involving a climate classification process. The climate classification requires extensive processing of a database containing daily rain precipitation values collected over more than fifty years at several spatial locations in the São Paulo state, Brazil. We identify user exploration tasks typically conducted as part of the data preparation required in this process, and then describe how such tasks may be assisted by the multiple visual techniques provided. Issues related to the use of the multiple techniques by an end-user are also discussed.
Abstract:
An intelligent system that emulates human decision behaviour based on visual data acquisition is proposed. The approach is useful in applications where images are used to supply information to specialists who will choose suitable actions. An artificial neural classifier aids a fuzzy decision support system to deal with uncertainty and imprecision present in available information. Advantages of both techniques are exploited complementarily. As an example, this method was applied in automatic focus checking and adjustment in video monitor manufacturing. Copyright © 2005 IFAC.
Abstract:
The significant volume of work accidents in the cities causes a considerable loss to society. The development of Spatial Data Mining technologies presents a new perspective for the extraction of knowledge from the correlation between conventional and spatial attributes. One of the most important techniques of Spatial Data Mining is Spatial Clustering, which clusters similar spatial objects to find a distribution of patterns, taking into account the geographical position of the objects. Applying this technique to the health area will provide information that can contribute towards the planning of more adequate strategies for the prevention of work accidents. The original contribution of this work is to present an application of tools developed for Spatial Clustering which supply a set of graphic resources that have helped to discover knowledge and support management in the work accidents area. © 2011 IEEE.
Abstract:
Structural Health Monitoring (SHM) denotes a system with the ability to detect and interpret adverse changes in a structure. One of the critical challenges for practical implementation of an SHM system is the ability to detect damage under changing environmental conditions. This paper aims to characterize the temperature, load and damage effects in the sensor measurements obtained with piezoelectric transducer (PZT) patches. Data sets are collected on thin aluminum specimens under different environmental conditions and artificially induced damage states. The fuzzy clustering algorithm is used to organize the sensor measurements into a set of clusters, which makes it possible to attribute the variation in sensor data to temperature, load or any induced damage.
Abstract:
Nowadays, organizations face the problem of keeping their information protected, available and trustworthy. In this context, machine learning techniques have also been extensively applied to this task. Since manual labeling is very expensive, several works attempt to handle intrusion detection with traditional clustering algorithms. In this paper, we introduce a new pattern recognition technique called Optimum-Path Forest (OPF) clustering to this task. Experiments on three public datasets have shown that the OPF classifier may be a suitable tool to detect intrusions on computer networks, since it outperformed some state-of-the-art unsupervised techniques. © 2012 IEEE.
Abstract:
Many topics related to association mining have received attention in the research community, especially the ones focused on the discovery of interesting knowledge. A promising approach, related to this topic, is the application of clustering in the pre-processing step to aid the user to find the relevant associative patterns of the domain. In this paper, we propose nine metrics to support the evaluation of this kind of approach. The metrics are important since they provide criteria to: (a) analyze the methodologies, (b) identify their positive and negative aspects, (c) carry out comparisons among them and, therefore, (d) help the users to select the most suitable solution for their problems. Some experiments were done in order to present how the metrics can be used and their usefulness. © 2013 Springer-Verlag GmbH.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Spatial Data Mining to Support Environmental Management and Decision Making - A Case Study in Brazil
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)