977 results for feature representation
Abstract:
The proposed model, called the combinatorial and competitive spatio-temporal memory (CCSTM), provides an elegant solution to the general problem of storing and recalling spatio-temporal patterns in which states or sequences of states can recur in various contexts. For example, fig. 1 shows two state sequences that have a common subsequence, C and D. The CCSTM assumes that any state has a distributed representation as a collection of features. Each feature has an associated competitive module (CM) containing K cells. On any given occurrence of a particular feature, A, exactly one of the cells in its module, CM_A, will be chosen to represent it. The particular set of cells active on the previous time step determines which cells are chosen to represent instances of their associated features on the current time step. If we assume that typically S features are active in any state, then any state has K^S different neural representations. This huge space of possible neural representations of any state is what underlies the model's ability to store and recall numerous context-sensitive state sequences. The purpose of this paper is simply to describe this mechanism.
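As a rough illustration of the counting argument in this abstract, the following Python sketch computes the K^S code count and picks one cell per competitive module as a function of the previously active cells. The names `count_codes` and `choose_code` are hypothetical, and the seeded pseudo-random choice is only a stand-in for whatever competition rule the model actually uses, which the abstract does not specify.

```python
import random


def count_codes(K: int, S: int) -> int:
    """Number of distinct cell-level codes for a state with S active
    features when each feature's competitive module (CM) holds K cells."""
    return K ** S


def choose_code(features, prev_code, K):
    """Pick exactly one cell index per active feature.

    The choice depends only on the feature and on the cells active at the
    previous time step (prev_code: feature -> cell index), so the same
    state can receive different codes in different contexts.
    ASSUMPTION: a seeded pseudo-random pick replaces the model's real
    competitive selection.
    """
    context = repr(sorted(prev_code.items()))
    code = {}
    for feature in sorted(features):
        rng = random.Random(f"{feature}|{context}")  # string seeds are reproducible
        code[feature] = rng.randrange(K)
    return code


if __name__ == "__main__":
    print(count_codes(K=4, S=3))
    print(choose_code({"C", "D"}, {"A": 1, "B": 0}, K=8))
    print(choose_code({"C", "D"}, {"E": 3, "F": 2}, K=8))
```

With K = 4 cells per module and S = 3 active features, a single state already has 64 possible codes, which is the source of the capacity the abstract describes.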
Abstract:
This paper proposes max separation clustering (MSC), a new non-hierarchical clustering method used for feature extraction from optical emission spectroscopy (OES) data for plasma etch process control applications. OES data is high dimensional and inherently highly redundant with the result that it is difficult if not impossible to recognize useful features and key variables by direct visualization. MSC is developed for clustering variables with distinctive patterns and providing effective pattern representation by a small number of representative variables. The relationship between signal-to-noise ratio (SNR) and clustering performance is highlighted, leading to a requirement that low SNR signals be removed before applying MSC. Experimental results on industrial OES data show that MSC with low SNR signal removal produces effective summarization of the dominant patterns in the data.
Abstract:
Thesis (Ph.D.)--University of Washington, 2013
Abstract:
Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this momentum, many researchers still regard unsupervised learning as a key element of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows that line of thought and addresses several research topics related to the problem of density estimation through Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions touch on sampling, partition function estimation, optimization, and the learning of invariant representations. The thesis begins by presenting a new adaptive sampling algorithm that automatically adjusts the temperature of the Markov chains being simulated in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate, as well as faster convergence. Our results are presented for BMs, but the method is general and applicable to learning any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. In contrast to traditional approaches that treat a given model as a black box, we propose instead to exploit the dynamics of learning by estimating the successive changes in the log-partition function incurred at each parameter update.
The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm that makes it possible to apply the natural gradient efficiently to Boltzmann machines with thousands of units. Until now, its adoption was limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by exploiting a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in terms of computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of "spike & slab" restricted Boltzmann machines (ssRBM), which we modify so that they can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This translates into increased invariance at the representation level and a better classification rate when few labeled data are available. We conclude this thesis with an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of "pooling" in complementary vector subspaces.
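The matrix-free idea behind MFNG can be sketched in a few lines of Python: a natural-gradient direction is obtained by solving F x = g with an iterative linear solver that only ever asks for products F v. This is a minimal sketch under stated assumptions, not the thesis implementation: the Fisher matrix is taken to be the covariance of per-sample statistics, conjugate gradient stands in for whichever solver the thesis uses, and all function names and the `damping` parameter are hypothetical.

```python
def fisher_vector_product(stats, v, damping=1e-3):
    """Return (F + damping * I) v, where F is the covariance of the rows
    of `stats`, without ever building the d x d matrix F."""
    n, d = len(stats), len(v)
    mean = [sum(row[j] for row in stats) / n for j in range(d)]
    out = [damping * vj for vj in v]
    for row in stats:
        centered = [row[j] - mean[j] for j in range(d)]
        dot = sum(c * vj for c, vj in zip(centered, v))
        for j in range(d):
            out[j] += centered[j] * dot / n
    return out


def conjugate_gradient(matvec, b, iters=50, tol=1e-12):
    """Solve A x = b for a symmetric positive-definite A that is
    available only through the callable matvec(v) = A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(iters):
        if rs < tol:
            break
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


if __name__ == "__main__":
    # Toy statistics whose covariance is diag(0.5, 2.0).
    stats = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
    gradient = [1.0, 2.0]
    direction = conjugate_gradient(
        lambda v: fisher_vector_product(stats, v, damping=0.0), gradient
    )
    print(direction)
```

Memory use stays linear in the number of parameters because only vectors are stored, which is the property the abstract highlights.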
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed to generate a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
Abstract:
This paper describes a proposed new approach to knowledge processing in the application domain of computer network security intrusion detection systems (NIDS). The approach focuses on a topic-map-based representation of the features of the threat pattern space, together with knowledge of the situated efficacy of alternative candidate algorithms for pattern recognition within the NIDS domain. An integrative knowledge representation framework for virtualisation, data intelligence and learning loop architecting in the NIDS domain is thus described, together with specific aspects of its deployment.
Abstract:
A new class of shape features for region classification and high-level recognition is introduced. The novel Randomised Region Ray (RRR) features can be used to train binary decision trees for object category classification using an abstract representation of the scene. In particular, we address the problem of human detection using an over-segmented input image. We therefore do not rely on pixel values for training; instead, we design and train specialised classifiers on the sparse set of semantic regions that compose the image. Thanks to the abstract nature of the input, the trained classifier has the potential to be fast and applicable to extreme imagery conditions. We demonstrate and evaluate its performance in people detection using a pedestrian dataset.
Abstract:
We introduce a flexible technique for interactive exploration of vector field data through classification derived from user-specified feature templates. Our method is founded on the observation that, while similar features within the vector field may be spatially disparate, they share similar neighborhood characteristics. Users generate feature-based visualizations by interactively highlighting well-accepted and domain specific representative feature points. Feature exploration begins with the computation of attributes that describe the neighborhood of each sample within the input vector field. Compilation of these attributes forms a representation of the vector field samples in the attribute space. We project the attribute points onto the canonical 2D plane to enable interactive exploration of the vector field using a painting interface. The projection encodes the similarities between vector field points within the distances computed between their associated attribute points. The proposed method is performed at interactive rates for enhanced user experience and is completely flexible as showcased by the simultaneous identification of diverse feature types.
Abstract:
This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies to feature selection in the context of pattern recognition. The known approaches to this problem are branch-and-bound algorithms and heuristics that only partially explore the search space. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the other known ones by exploring the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm, which is based on the representation and exploration of the search space by new lattice properties proven here. Several experiments with well-known public data indicate the superiority of the proposed method over sequential floating forward selection (SFFS), a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method obtained better or equal results in similar or even shorter computational time. (C) 2009 Elsevier Ltd. All rights reserved.
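The pruning opportunity created by property (ii) can be illustrated with a short Python sketch. It walks a single chain of nested feature subsets and stops as soon as the cost rises, since a U-shaped cost cannot fall again further along that chain. This shows only the chain property, not the branch-and-bound algorithm of the paper; `chain_minimum` and the toy cost function are hypothetical.

```python
def chain_minimum(chain, cost):
    """Find the minimum-cost subset along one lattice chain.

    `chain` is a list of nested subsets (each contained in the next).
    ASSUMPTION: `cost` is U-shaped along the chain, so the first strict
    increase means every later element can be skipped.
    Returns (best_subset, best_cost, number_of_cost_evaluations).
    """
    best_set, best_cost = chain[0], cost(chain[0])
    evaluated = 1
    for subset in chain[1:]:
        current = cost(subset)
        evaluated += 1
        if current > best_cost:
            break  # past the bottom of the U: prune the rest of the chain
        best_set, best_cost = subset, current
    return best_set, best_cost, evaluated


if __name__ == "__main__":
    # Toy cost: too few features underfit, too many overfit.
    def toy_cost(subset):
        return (len(subset) - 2) ** 2

    chain = [frozenset("abcd"[:k]) for k in range(5)]
    print(chain_minimum(chain, toy_cost))
```

In the toy run the walk evaluates four of the five subsets and never visits the full feature set.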
Abstract:
Most face recognition approaches require prior training, in which a given distribution of faces is assumed in order to predict the identity of test faces. Such an approach may have difficulty identifying faces that belong to distributions different from the one provided during training. A face recognition technique that performs well regardless of training is therefore interesting to consider as a basis for more sophisticated methods. In this work, the Census Transform is applied to describe the faces. Based on a scanning window that extracts local histograms of Census Features, we present a method that directly matches face samples. With this simple technique, 97.2% of the faces in the FERET fa/fb test were correctly recognized. Although this is an easy test set, we have found no other approaches in the literature that compare faces directly with such performance. Room for further improvement is also identified. Among other techniques, we demonstrate how the use of SVMs over the Census Histogram representation can increase the recognition performance.
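For readers unfamiliar with the Census Transform, here is a minimal Python sketch of the standard 3x3 variant followed by a histogram of the resulting codes. The bit convention (a bit is 1 when the neighbour is darker than the centre), the window size, and the function names are assumptions; the abstract does not state which variant or matching measure the authors use.

```python
def census_transform(img):
    """Return the 8-bit census code of every interior pixel.

    `img` is a list of rows of grey values. Each code packs eight
    neighbour comparisons, read row by row around the centre pixel.
    ASSUMPTION: bit = 1 when the neighbour is strictly darker.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    height, width = len(img), len(img[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            code = 0
            for dy, dx in offsets:
                bit = 1 if img[y + dy][x + dx] < img[y][x] else 0
                code = (code << 1) | bit
            row.append(code)
        codes.append(row)
    return codes


def census_histogram(codes):
    """Count how often each of the 256 possible codes occurs."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    brighter = [[v + 10 for v in row] for row in patch]
    # Codes depend only on the ordering of grey values, so a uniform
    # brightness shift leaves them unchanged.
    print(census_transform(patch) == census_transform(brighter))
```

Because the codes are built from comparisons alone, histograms of them are insensitive to monotonic lighting changes, which is one reason such features can be matched directly without training.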
Abstract:
This article reflects on geographic representation in the coats of arms of Latin American countries, showing how the physical aspects of the landscape, the elements of the economy and the republican symbols were used by local elites to compose an imaginary nation in the nineteenth century. This process of "naturalization of territory" was an important feature of the national discourse because, at that time, most Latin American nations were multi-ethnic states with strong class differences, a large illiterate population and a territory that was very tenuous from the point of view of national integration. Thus, the geographic imagery used in coats of arms conveyed strong messages to citizens, and these heraldic symbols can become an important research source for unravelling the process of building the imaginary nation.
Abstract:
Feature selection aims to find the most important information in order to save computational effort and data storage. We formulate this task as a combinatorial optimization problem, since the exponential growth of possible solutions makes an exhaustive search infeasible. In this work, we propose a new nature-inspired feature selection technique based on bat behavior, namely the binary bat algorithm. The wrapper approach combines the exploration power of the bats with the speed of the optimum-path forest classifier to find a better data representation. Experiments on public datasets have shown that the proposed technique can indeed improve the effectiveness of the optimum-path forest and outperform some well-known swarm-based techniques. © 2013 Elsevier Inc. All rights reserved.
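One way a continuous swarm method can be made to search over feature subsets is to pass each velocity component through a sigmoid and threshold it against a uniform random number. The Python sketch below shows that binarization step only. Treat it as an assumption about how a binary bat algorithm may work: the abstract does not describe the update rule, and `binarize` and `selected_features` are hypothetical names.

```python
import math
import random


def sigmoid(v):
    """Logistic transfer function mapping a velocity to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def binarize(velocity, rng):
    """Turn a real-valued velocity vector into a 0/1 feature mask.

    ASSUMPTION: feature j is switched on when sigmoid(velocity[j])
    exceeds a fresh uniform random number, a common binarization in
    swarm-based feature selection.
    """
    return [1 if sigmoid(v) > rng.random() else 0 for v in velocity]


def selected_features(mask, names):
    """Names of the features kept by a 0/1 mask."""
    return [name for bit, name in zip(mask, names) if bit]


if __name__ == "__main__":
    rng = random.Random(0)
    mask = binarize([50.0, -50.0, 0.0], rng)
    print(mask, selected_features(mask, ["f1", "f2", "f3"]))
```

In a wrapper setting, each mask produced this way would be scored by training the classifier on the selected features only, and that score would drive the next velocity update.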
Abstract:
An important requirement for computer systems developed for the agricultural sector is to handle the heterogeneity of data generated in different processes. Most problems related to this heterogeneity arise from the lack of a standard across the different computing solutions proposed. An efficient way to address this is to create a single standard for data exchange. The study of the actual process involved in cotton production was based on research by the Brazilian Agricultural Research Corporation (EMBRAPA) that reports all phases, compiled from several theoretical and practical studies related to the cotton crop. The proposed standard starts with the identification of the most important classes of data involved in the process and includes an ontology that systematizes the concepts related to the production of cotton fiber, resulting in a set of classes, relations, functions and instances. The results are used as a reference for the development of computational tools, transforming implicit knowledge into applications that support the knowledge described. This research is based on data from the Midwest of Brazil. The cotton process was chosen as a case study because Brazil is one of the major players and several improvements are required for system integration in this segment.
Abstract:
Breaking synoptic-scale Rossby waves (RWB) at the tropopause level are central to the daily weather evolution in the extratropics and the subtropics. RWB leads to pronounced meridional transport of heat, moisture, momentum, and chemical constituents. RWB events are manifest as elongated and narrow structures in the tropopause-level potential vorticity (PV) field. A feature-based validation approach is used to assess the representation of Northern Hemisphere RWB in present-day climate simulations carried out with the ECHAM5-HAM climate model at three different resolutions (T42L19, T63L31, and T106L31) against the ERA-40 reanalysis data set. An objective identification algorithm extracts RWB events from the isentropic PV field and allows quantifying the frequency of occurrence of RWB. The biases in the frequency of RWB are then compared to biases in the time mean tropopause-level jet wind speeds. The ECHAM5-HAM model captures the location of the RWB frequency maxima in the Northern Hemisphere at all three resolutions. However, at coarse resolution (T42L19) the overall frequency of RWB, i.e. the frequency averaged over all seasons and the entire hemisphere, is underestimated by 28%. The higher-resolution simulations capture the overall frequency of RWB much better, with a minor difference between T63L31 and T106L31 (frequency errors of −3.5 and 6%, respectively). The number of large-size RWB events is significantly underestimated by the T42L19 experiment and well represented in the T106L31 simulation. On the local scale, however, significant differences to ERA-40 are found in the higher-resolution simulations. These differences are regionally confined and vary with the season. The most striking difference between T106L31 and ERA-40 is that ECHAM5-HAM overestimates the frequency of RWB in the subtropical Atlantic in all seasons except for spring. This bias maximum is accompanied by an equatorward extension of the subtropical westerlies.
Abstract:
Synaesthesia has multifaceted consequences for both subjective experience and cognitive performance. Here, I broach the issue of how synaesthesia is represented in semantic memory. I hypothesize that, in grapheme-colour synaesthesia for example, colour is represented as an additional feature in the semantic network, which enables the formation of associations that are not present in non-synaesthetes. Thus, synaesthesia provokes richer memory representations that enable learning opportunities not available to non-synaesthetes, provides additional memory cues, and may trigger creative ideas.