912 results for hierarchical clustering techniques
Abstract:
Mobile robots need autonomy to fulfill their tasks. Such autonomy is related to their capacity to explore and recognize their navigation environments. In this context, the present work considers techniques for the classification and extraction of features from images, using artificial neural networks. These images are used in the mapping and localization system of the LACE (Automation and Evolutive Computing Laboratory) mobile robot. To this end, the robot uses a sensory system composed of ultrasound sensors and a catadioptric vision system equipped with a camera and a conical mirror. The mapping system is composed of three modules; two of them are presented in this paper: the classifier and the characterizer modules. Simulation results for these modules are also presented.
Abstract:
A simulation study was made of the effects of mixing two evolutionary forces (natural selection and random genetic drift), combined in a single data matrix of gene frequencies, on the resulting genetic distances among populations. Twenty-one kinds of simulated gene frequency surfaces, for 15 populations linearly distributed over geographic space, were used to construct 21 data matrices combining different proportions of two types of surfaces (gradients and random surfaces). These matrices were analysed by Unweighted Pair-Group Method with Arithmetic Averages (UPGMA) clustering and by Principal Coordinate Analysis. The results show that ordination is more accurate than UPGMA in revealing the spatial patterns in the genetic distances, in comparison with results obtained using the Mantel test, which compares genetic and geographic distances directly.
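The Mantel test mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a Pearson correlation between the upper triangles of two distance matrices and a one-sided permutation p-value. The function names, the 999 permutations, and the toy gradient of gene frequencies over 15 linearly arranged populations are all assumptions made for the example.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (correlation, one-sided permutation p-value).

    Rows and columns of dist_a are permuted together, which keeps the
    matrix a valid distance matrix under each relabelling.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    b_flat = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), b_flat)
    at_least_as_large = 1  # the observed labelling counts as one permutation
    for _ in range(permutations):
        perm = list(range(n))
        rng.shuffle(perm)
        permuted = [[dist_a[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), b_flat) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Toy data (assumed): 15 populations on a line, gene frequency rising linearly.
positions = list(range(15))
frequencies = [i / 14 for i in positions]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[abs(a - b) for b in frequencies] for a in frequencies]

mantel_r, mantel_p = mantel(genetic, geographic)
print(mantel_r, mantel_p)
```

With a pure gradient the genetic distances are proportional to the geographic ones, so the correlation is essentially 1. Mixing in random surfaces, as the study does, would lower it.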
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
One common problem in all basic techniques of knowledge representation is the handling of the trade-off between precision of inferences and resource constraints, such as time and memory. Michalski and Winston (1986) suggested the Censored Production Rule (CPR) as an underlying representation and computational mechanism to enable logic-based systems to exhibit variable precision in which certainty varies while specificity stays constant. As an extension of CPR, the Hierarchical Censored Production Rules (HCPRs) system of knowledge representation, proposed by Bharadwaj & Jain (1992), exhibits both variable certainty and variable specificity and offers mechanisms for handling the trade-off between the two. An HCPR has the form: Decision If (preconditions) Unless (censor) Generality (general_information) Specificity (specific_information). As a step towards a generalized knowledge representation, an Extended Hierarchical Censored Production Rules (EHCPRs) system is suggested in this paper. With the inclusion of new operators, an Extended Hierarchical Censored Production Rule (EHCPR) takes the general form: Concept If (Preconditions) Unless (Exceptions) Generality (General-Concept) Specificity (Specific Concepts) Has_part (default: structural-parts) Has_property (default: characteristic-properties) Has_instance (instances). The paper shows how semantic networks and frames are represented in terms of EHCPRs. Multiple inheritance, inheritance with and without cancellation, recognition with partial match, and a few default logic problems are shown to be tackled efficiently in the proposed system.
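The general EHCPR form quoted in this abstract maps naturally onto a record type. The sketch below is only an illustration of that form, not the system described in the paper. The field names follow the operators in the abstract, while the `holds` and `inherited_property` helpers, the `check_exceptions` flag, and the bird example are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class EHCPR:
    """One rule: Concept If ... Unless ... Generality ... Specificity ..."""
    concept: str
    preconditions: List[str] = field(default_factory=list)   # If
    exceptions: List[str] = field(default_factory=list)      # Unless
    generality: Optional[str] = None                         # Generality
    specificity: List[str] = field(default_factory=list)     # Specificity
    has_part: List[str] = field(default_factory=list)        # Has_part
    has_property: Dict[str, str] = field(default_factory=dict)  # Has_property
    has_instance: List[str] = field(default_factory=list)    # Has_instance


def holds(rule: EHCPR, facts: Set[str], check_exceptions: bool = True) -> bool:
    """True when every precondition is a known fact and no exception is.

    Skipping the exceptions (check_exceptions=False) mimics a cheaper,
    less certain inference; this flag is an assumption of the sketch.
    """
    if not all(p in facts for p in rule.preconditions):
        return False
    if check_exceptions and any(e in facts for e in rule.exceptions):
        return False
    return True


def inherited_property(rules: Dict[str, EHCPR], concept: str, name: str) -> Optional[str]:
    """Walk up the Generality links; the nearest definition wins,
    so a specific concept can cancel a default set by a general one."""
    seen = set()
    current = concept
    while current is not None and current not in seen:
        seen.add(current)
        rule = rules[current]
        if name in rule.has_property:
            return rule.has_property[name]
        current = rule.generality
    return None


rules = {
    "bird": EHCPR("bird", preconditions=["has_feathers"],
                  specificity=["robin", "penguin"], has_part=["wings"],
                  has_property={"locomotion": "flies"}),
    "robin": EHCPR("robin", preconditions=["red_breast"], generality="bird",
                   has_instance=["robin_1"]),
    "penguin": EHCPR("penguin", preconditions=["lives_in_antarctica"],
                     generality="bird", has_property={"locomotion": "swims"}),
}
can_fly = EHCPR("can_fly", preconditions=["is_bird"], exceptions=["is_penguin"])

print(inherited_property(rules, "robin", "locomotion"))
print(inherited_property(rules, "penguin", "locomotion"))
print(holds(can_fly, {"is_bird", "is_penguin"}))
```

Here the robin inherits the default from its general concept, while the penguin's own property cancels it.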
Abstract:
The genetic divergence in 20 Eucalyptus spp. clones was evaluated by multivariate techniques based on 167 RAPD markers, of which 155 were polymorphic and 12 monomorphic. The genetic distances were obtained as the arithmetic complement of the Jaccard and the Sorensen-Nei and Li coefficients and evaluated by the hierarchical methods of Single Linkage clustering and Unweighted Pair Group Method with Arithmetic Mean (UPGMA). Regardless of the dissimilarity coefficient, the greatest divergence was found between clones 7 and 17 and the smallest between clones 11 and 14. Clone clustering was little influenced by the applied procedure; adopting the same percentage of divergence, UPGMA identified two fewer groups for the Sorensen-Nei and Li coefficient. The clones showed considerable genetic divergence, which is partly associated with the origin of the study material. The clusters formed by the UPGMA algorithm combined with the arithmetic complement of the Jaccard coefficient were the most consistent.
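The two dissimilarity measures and the two linkage rules named in this abstract can be sketched in plain Python. This is a toy illustration under stated assumptions, not the study's analysis: the four binary marker profiles are invented, and the function names are hypothetical. The Sorensen-Nei and Li coefficient is taken here as 2a / (2a + b + c), where a counts bands shared by both clones and b + c counts bands present in only one.

```python
def shared_and_unshared(x, y):
    """Count bands present in both profiles and bands present in only one."""
    shared = sum(1 for i, j in zip(x, y) if i and j)
    unshared = sum(1 for i, j in zip(x, y) if bool(i) != bool(j))
    return shared, unshared


def jaccard_distance(x, y):
    """Arithmetic complement of the Jaccard coefficient."""
    a, bc = shared_and_unshared(x, y)
    return 1 - a / (a + bc) if a + bc else 0.0


def nei_li_distance(x, y):
    """Arithmetic complement of the Sorensen-Nei and Li coefficient."""
    a, bc = shared_and_unshared(x, y)
    return 1 - 2 * a / (2 * a + bc) if a + bc else 0.0


def agglomerate(dist, linkage="average"):
    """Agglomerative clustering on a full distance matrix.

    linkage="single" uses the closest pair of members; "average" is UPGMA,
    the mean over all pairs of original members. Returns the merges as
    (members_a, members_b, height) in the order they happen.
    """
    clusters = {i: [i] for i in range(len(dist))}
    next_id = len(dist)
    merges = []

    def between(c1, c2):
        pairs = [dist[i][j] for i in clusters[c1] for j in clusters[c2]]
        return min(pairs) if linkage == "single" else sum(pairs) / len(pairs)

    while len(clusters) > 1:
        ids = sorted(clusters)
        height, a, b = min((between(a, b), a, b)
                           for k, a in enumerate(ids) for b in ids[k + 1:])
        merges.append((clusters[a], clusters[b], height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Invented marker profiles (1 = band present) for four hypothetical clones.
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
jaccard = [[jaccard_distance(p, q) for q in profiles] for p in profiles]
upgma_merges = agglomerate(jaccard, linkage="average")
single_merges = agglomerate(jaccard, linkage="single")
for members_a, members_b, height in upgma_merges:
    print(members_a, members_b, round(height, 3))
```

Swapping `jaccard_distance` for `nei_li_distance` when building the matrix gives the second coefficient. Comparing the two merge lists at a fixed height is one way to see how the choice of coefficient changes the number of groups.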
Abstract:
Autonomous robots must be able to learn and maintain models of their environments. In this context, the present work considers techniques for the classification and extraction of features from images, combined with artificial neural networks, for use in the mapping and localization system of the mobile robot of the Laboratory of Automation and Evolutive Computing (LACE). To this end, the robot uses a sensory system composed of ultrasound sensors and a catadioptric vision system formed by a camera and a conical mirror. The mapping system is composed of three modules. Two of them are presented in this paper: the classifier and the characterizer modules. The first module uses a hierarchical neural network to perform the classification; the second uses techniques for extracting image attributes and recognizing invariant patterns extracted from the set of place images. The neural network of the classifier module is structured in two layers, reason and intuition, and is trained to classify each place explored by the robot into one of four predefined classes. The final result of the exploration is a topological map of the explored environment. Simulation results for both modules of the mapping system are presented in this paper. © 2008 IEEE.
Abstract:
The significant volume of work accidents in cities causes a substantial loss to society. The development of Spatial Data Mining technologies presents a new perspective for the extraction of knowledge from the correlation between conventional and spatial attributes. One of the most important Spatial Data Mining techniques is Spatial Clustering, which groups similar spatial objects to find patterns in their distribution, taking into account the geographical position of the objects. Applying this technique to the health area will provide information that can contribute to planning more adequate strategies for the prevention of work accidents. The original contribution of this work is an application of tools developed for Spatial Clustering, which supply a set of graphic resources that have helped to discover knowledge and support management in the area of work accidents. © 2011 IEEE.
Abstract:
Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest which are mapped to a Hilbert Space representing a visual dictionary, which aims to comprise the most discriminative features in a set of images. Nevertheless, the main problem of such approaches is to find a compact and representative dictionary. Finding such a representative dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.
Abstract:
Nowadays, organizations face the problem of keeping their information protected, available and trustworthy. In this context, machine learning techniques have been extensively applied to intrusion detection. Since manual labeling is very expensive, several works attempt to handle intrusion detection with traditional clustering algorithms. In this paper, we introduce a new pattern recognition technique called Optimum-Path Forest (OPF) clustering to this task. Experiments on three public datasets have shown that the OPF classifier may be a suitable tool to detect intrusions on computer networks, since it outperformed some state-of-the-art unsupervised techniques. © 2012 IEEE.
Abstract:
This paper introduces the Optimum-Path Forest (OPF) classifier for static video summarization, with results comparable to those obtained by some state-of-the-art video summarization techniques. The experiments were conducted using several image descriptors on two public datasets, followed by an analysis of OPF robustness with respect to one ad hoc parameter. Future work will aim to improve OPF effectiveness on each distinct video category.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary-based optimization techniques. As the OPF classifier performs an exhaustive search to find the size of each sample's neighborhood that allows it to reach the minimum graph cut as a quality measure, we compared several optimization techniques that can obtain graph cut values close to the ones obtained by brute force. Experiments on two public datasets in the context of unsupervised network intrusion detection have shown that the evolutionary optimization techniques can find suitable values for the neighborhood faster than the exhaustive search. Additionally, we have shown that it is not necessary to employ many agents for such a task, since the neighborhood size is defined by discrete values, which constrains the set of possible solutions to a few.
Abstract:
The automatic disambiguation of word senses (i.e., the identification of which of the meanings is used in a given context for a word that has multiple meanings) is essential for such applications as machine translation and information retrieval, and represents a key step for developing the so-called Semantic Web. Humans disambiguate words in a straightforward fashion, but this does not apply to computers. In this paper we address the problem of Word Sense Disambiguation (WSD) by treating texts as complex networks, and show that word senses can be distinguished upon characterizing the local structure around ambiguous words. Our goal was not to obtain the best possible disambiguation system, but we nevertheless found that in half of the cases our approach outperforms traditional shallow methods. We show that the hierarchical connectivity and clustering of words are usually the most relevant features for WSD. The results reported here shed light on the relationship between semantic and structural parameters of complex networks. They also indicate that when combined with traditional techniques the complex network approach may be useful to enhance the discrimination of senses in large texts. Copyright (C) EPLA, 2012
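One of the structural features this abstract names, the clustering of words, can be illustrated with a local clustering coefficient on a word-adjacency network. The sketch below is a simplified assumption, not the authors' pipeline: it links each pair of consecutive tokens with an undirected edge and skips any preprocessing such as lemmatization or stopword removal. The function names and the sample sentence are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def adjacency_network(tokens):
    """Undirected network linking every pair of consecutive, distinct tokens."""
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            graph[left].add(right)
            graph[right].add(left)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Invented sample text containing the ambiguous word "bank".
tokens = "the bank of the river bank near the old bank".split()
word_graph = adjacency_network(tokens)

print(sorted(word_graph["bank"]))
print(clustering_coefficient(word_graph, "bank"))
```

Computed per occurrence window rather than over a whole text, a value like this could serve as one entry in a feature vector describing the local structure around an ambiguous word.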
Abstract:
Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. As with other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
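A distance suited to the simplex can be sketched with the centred log-ratio (clr) transform from compositional data analysis: the Aitchison distance is the Euclidean distance between clr-transformed compositions. The code below is an illustration of that general framework and is not taken from Simcluster. The function names, the optional pseudocount for zero counts, and the tag counts are assumptions.

```python
from math import log, sqrt


def closure(counts, pseudocount=0.0):
    """Rescale counts to proportions summing to 1.

    A positive pseudocount is one simple way to avoid log(0) for tags
    with zero counts; it is an assumption of this sketch.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [s / total for s in shifted]


def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of two compositions."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


# Invented tag counts for three libraries; library_b is library_a sequenced
# ten times deeper, so both describe the same composition.
library_a = [10, 20, 70]
library_b = [100, 200, 700]
library_c = [70, 20, 10]

same = aitchison_distance(closure(library_a), closure(library_b))
different = aitchison_distance(closure(library_a), closure(library_c))
print(same, different)
```

Because only ratios between parts matter, sequencing depth drops out of the comparison. A distance matrix built this way could then be passed to any hierarchical clustering routine.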