887 resultados para Invariant Features
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
Resumo:
Perceiving the world visually is a basic act for humans, but for computers it is still an unsolved problem. The variability present innatural environments is an obstacle for effective computer vision. The goal of invariant object recognition is to recognise objects in a digital image despite variations in, for example, pose, lighting or occlusion. In this study, invariant object recognition is considered from the viewpoint of feature extraction. Thedifferences between local and global features are studied with emphasis on Hough transform and Gabor filtering based feature extraction. The methods are examined with respect to four capabilities: generality, invariance, stability, and efficiency. Invariant features are presented using both Hough transform and Gabor filtering. A modified Hough transform technique is also presented where the distortion tolerance is increased by incorporating local information. In addition, methods for decreasing the computational costs of the Hough transform employing parallel processing and local information are introduced.
Resumo:
The span of writer identification extends to broad domes like digital rights administration, forensic expert decisionmaking systems, and document analysis systems and so on. As the success rate of a writer identification scheme is highly dependent on the features extracted from the documents, the phase of feature extraction and therefore selection is highly significant for writer identification schemes. In this paper, the writer identification in Malayalam language is sought for by utilizing feature extraction technique such as Scale Invariant Features Transform (SIFT).The schemes are tested on a test bed of 280 writers and performance evaluated
Resumo:
Since last two decades researches have been working on developing systems that can assistsdrivers in the best way possible and make driving safe. Computer vision has played a crucialpart in design of these systems. With the introduction of vision techniques variousautonomous and robust real-time traffic automation systems have been designed such asTraffic monitoring, Traffic related parameter estimation and intelligent vehicles. Among theseautomatic detection and recognition of road signs has became an interesting research topic.The system can assist drivers about signs they don’t recognize before passing them.Aim of this research project is to present an Intelligent Road Sign Recognition System basedon state-of-the-art technique, the Support Vector Machine. The project is an extension to thework done at ITS research Platform at Dalarna University [25]. Focus of this research work ison the recognition of road signs under analysis. When classifying an image its location, sizeand orientation in the image plane are its irrelevant features and one way to get rid of thisambiguity is to extract those features which are invariant under the above mentionedtransformation. These invariant features are then used in Support Vector Machine forclassification. Support Vector Machine is a supervised learning machine that solves problemin higher dimension with the help of Kernel functions and is best know for classificationproblems.
Resumo:
One of the most significant research topics in computer vision is object detection. Most of the reported object detection results localise the detected object within a bounding box, but do not explicitly label the edge contours of the object. Since object contours provide a fundamental diagnostic of object shape, some researchers have initiated work on linear contour feature representations for object detection and localisation. However, linear contour feature-based localisation is highly dependent on the performance of linear contour detection within natural images, and this can be perturbed significantly by a cluttered background. In addition, the conventional approach to achieving rotation-invariant features is to rotate the feature receptive field to align with the local dominant orientation before computing the feature representation. Grid resampling after rotation adds extra computational cost and increases the total time consumption for computing the feature descriptor. Though it is not an expensive process if using current computers, it is appreciated that if each step of the implementation is faster to compute especially when the number of local features is increasing and the application is implemented on resource limited ”smart devices”, such as mobile phones, in real-time. Motivated by the above issues, a 2D object localisation system is proposed in this thesis that matches features of edge contour points, which is an alternative method that takes advantage of the shape information for object localisation. This is inspired by edge contour points comprising the basic components of shape contours. In addition, edge point detection is usually simpler to achieve than linear edge contour detection. Therefore, the proposed localization system could avoid the need for linear contour detection and reduce the pathological disruption from the image background. Moreover, since natural images usually comprise many more edge contour points than interest points (i.e. corner points), we also propose new methods to generate rotation-invariant local feature descriptors without pre-rotating the feature receptive field to improve the computational efficiency of the whole system. In detail, the 2D object localisation system is achieved by matching edge contour points features in a constrained search area based on the initial pose-estimate produced by a prior object detection process. The local feature descriptor obtains rotation invariance by making use of rotational symmetry of the hexagonal structure. Therefore, a set of local feature descriptors is proposed based on the hierarchically hexagonal grouping structure. Ultimately, the 2D object localisation system achieves a very promising performance based on matching the proposed features of edge contour points with the mean correct labelling rate of the edge contour points 0.8654 and the mean false labelling rate 0.0314 applied on the data from Amsterdam Library of Object Images (ALOI). Furthermore, the proposed descriptors are evaluated by comparing to the state-of-the-art descriptors and achieve competitive performances in terms of pose estimate with around half-pixel pose error.
Resumo:
Este trabajo presenta un sistema para detectar y clasificar objetos binarios según la forma de éstos. En el primer paso del procedimiento, se aplica un filtrado para extraer el contorno del objeto. Con la información de los puntos de forma se obtiene un descriptor BSM con características altamente descriptivas, universales e invariantes. En la segunda fase del sistema se aprende y se clasifica la información del descriptor mediante Adaboost y Códigos Correctores de Errores. Se han usado bases de datos públicas, tanto en escala de grises como en color, para validar la implementación del sistema diseñado. Además, el sistema emplea una interfaz interactiva en la que diferentes métodos de procesamiento de imágenes pueden ser aplicados.
Resumo:
L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires.
Resumo:
Retrieval of similar anatomical structures of brain MR images across patients would help the expert in diagnosis of diseases. In this paper, modified local binary pattern with ternary encoding called modified local ternary pattern (MOD-LTP) is introduced, which is more discriminant and less sensitive to noise in near-uniform regions, to locate slices belonging to the same level from the brain MR image database. The ternary encoding depends on a threshold, which is a user-specified one or calculated locally, based on the variance of the pixel intensities in each window. The variancebased local threshold makes the MOD-LTP more robust to noise and global illumination changes. The retrieval performance is shown to improve by taking region-based moment features of MODLTP and iteratively reweighting the moment features of MOD-LTP based on the user’s feedback. The average rank obtained using iterated and weighted moment features of MOD-LTP with a local variance-based threshold, is one to two times better than rotational invariant LBP (Unay, D., Ekin, A. and Jasinschi, R.S. (2010) Local structure-based region-of-interest retrieval in brain MR images. IEEE Trans. Inf. Technol. Biomed., 14, 897–903.) in retrieving the first 10 relevant images
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
We propose a method for image recognition on the base of projections. Radon transform gives an opportunity to map image into space of its projections. Projection properties allow constructing informative features on the base of moments that can be successfully used for invariant recognition. Offered approach gives about 91-97% of correct recognition.
Resumo:
Classification procedures, including atmospheric correction satellite images as well as classification performance utilizing calibration and validation at different levels, have been investigated in the context of a coarse land-cover classification scheme for the Pachitea Basin. Two different correction methods were tested against no correction in terms of reflectance correction towards a common response for pseudo-invariant features (PIF). The accuracy of classifications derived from each of the three methods was then assessed in a discriminant analysis using crossvalidation at pixel, polygon, region, and image levels. Results indicate that only regression adjusted images using PIFs show no significant difference between images in any of the bands. A comparison of classifications at different levels suggests though that at pixel, polygon, and region levels the accuracy of the classifications do not significantly differ between corrected and uncorrected images. Spatial patterns of land-cover were analyzed in terms of colonization history, infrastructure, suitability of the land, and landownership. The actual use of the land is driven mainly by the ability to access the land and markets as is obvious in the distribution of land cover as a function of distance to rivers and roads. When considering all rivers and roads a threshold distance at which disproportional agro-pastoral land cover switches from over represented to under represented is at about 1km. Best land use suggestions seem not to affect the choice of land use. Differences in abundance of land cover between watersheds are more prevailing than differences between colonist and indigenous groups.
Resumo:
Many tracking algorithms have difficulties dealing with occlusions and background clutters, and consequently don't converge to an appropriate solution. Tracking based on the mean shift algorithm has shown robust performance in many circumstances but still fails e.g. when encountering dramatic intensity or colour changes in a pre-defined neighbourhood. In this paper, we present a robust tracking algorithm that integrates the advantages of mean shift tracking with those of tracking local invariant features. These features are integrated into the mean shift formulation so that tracking is performed based both on mean shift and feature probability distributions, coupled with an expectation maximisation scheme. Experimental results show robust tracking performance on a series of complicated real image sequences. © 2010 IEEE.
Resumo:
The invariant chain (Ii) prevents binding of ligands to major histocompatibility complex (MHC) class II molecules in the endoplasmic reticulum and during intracellular transport. Stepwise removal of the Ii in a trans-Golgi compartment renders MHC class II molecules accessible for peptide loading, with CLIP (class II-associated Ii peptides) as the final fragment to be released. Here we show that CLIP can be subdivided into distinct functional regions. The C-terminal segment (residues 92-105) of the CLIP-(81-105) fragment mediates inhibition of self- and antigenic peptide binding to HLA-DR2 molecules. In contrast, the N-terminal segment CLIP-(81-98) binds to the Staphylococcus aureus enterotoxin B contact site outside the peptide-binding groove on the alpha 1 domain and does not interfere with peptide binding. Its functional significance appears to lie in the contribution to CLIP removal: the dissociation of CLIP-(81-105) is characterized by a fast off-rate, which is accelerated at endosomal pH, whereas in the absence of the N-terminal CLIP-(81-91), the off-rate of C-terminal CLIP-(92-105) is slow and remains unaltered at low pH. Mechanistically, the N-terminal segment of CLIP seems to prevent tight interactions of CLIP side chains with specificity pockets in the peptide-binding groove that normally occurs during maturation of long-lived class II-peptide complexes.