964 results for Invariant Object Recognition
Abstract:
Vision is probably our most dominant sense, and the one from which we derive most of our information about the world around us. Through vision we can perceive what things are like, where they are, and how they move. From the images we perceive with our visual system we can extract features such as colour, texture and shape, and thanks to this information we are able to recognise objects even when they are observed under completely different conditions. For example, we can identify the same object when we observe it from different viewpoints, distances, lighting conditions, and so on. Computer Vision attempts to emulate the human visual system by means of an image capture system, a computer, and a set of programs. The desired goal is to develop a system that can understand an image in a way similar to how a person would. This thesis focuses on texture analysis for the purpose of surface recognition. The main motivation is to solve the problem of classifying textured surfaces that have been captured under different conditions, such as camera distance or illumination direction. In this way, the classification errors caused by these changes in capture conditions are reduced. This work presents in detail a texture recognition system that allows us to classify images of different surfaces captured under different conditions. The proposed system is based on a 3D model of the surface (which includes colour and shape information) obtained by means of the technique known as 4-Source Colour Photometric Stereo (CPS). This information is subsequently used by a texture prediction method to generate new 2D images of the textures under new conditions.
These generated virtual images form the basis of our recognition system, since they are used as reference models for our texture classifier. The proposed recognition system combines Co-occurrence Matrices for texture feature extraction with the nearest neighbour classifier. This classifier also allows us to approximate the illumination direction present in the images used to test the recognition system. That is, we are able to predict the illumination angle under which the test images were captured. The results obtained in the various experiments carried out demonstrate the viability of both the texture prediction system and the recognition system.
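The combination of co-occurrence matrices and a nearest neighbour classifier described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the pixel offset, the three statistics (contrast, energy, homogeneity), the toy images and the `(surface, illumination angle)` labels are all assumptions made for the example.

```python
from math import dist

def glcm(image, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    total = max(total, 1)  # avoid dividing by zero on degenerate images
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy and homogeneity of a normalised co-occurrence matrix."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    energy = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i, j in cells)
    return (contrast, energy, homogeneity)

def nearest_neighbour(query, references):
    """Return the label of the reference feature vector closest to the query."""
    return min(references, key=lambda item: dist(query, item[1]))[0]

# Hypothetical reference models: each label pairs a surface with the
# illumination angle under which its (virtual) image was rendered.
stripes = [[0, 1, 0, 1] for _ in range(4)]
flat = [[0, 0, 0, 0] for _ in range(4)]
references = [
    (("stripes", 0), glcm_features(glcm(stripes, 2))),
    (("flat", 0), glcm_features(glcm(flat, 2))),
]

test_image = [[1, 0, 1, 0] for _ in range(4)]
label = nearest_neighbour(glcm_features(glcm(test_image, 2)), references)
```

Because each reference label carries an illumination angle, the label returned for a test image gives an estimate of the angle as well as the surface class, which mirrors the idea stated in the abstract.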
Abstract:
Garment information tracking is required for clean room garment management. In this paper, we present a robust camera-based system that implements Optical Character Recognition (OCR) techniques for garment label recognition. In the system, a camera is used for image capture; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then adopted for object detection in the binary image as part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the result is verified against a system database. The system has been tested, and the results show that it is capable of coping with variations in lighting, digit twisting, background complexity, and font orientation. The system's performance in terms of digit recognition rate has met the design requirement. It achieved real-time and error-free garment information tracking during testing.
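The first two stages of the pipeline above (adaptive thresholding, then connected component labelling to find candidate regions) can be sketched in a few lines. This is an illustrative sketch under assumptions, not the paper's implementation: the local-mean rule, the window `radius`, the `offset` parameter and 4-connectivity are all choices made for the example.

```python
from collections import deque

def adaptive_threshold(image, radius=1, offset=0):
    """Mark a pixel as foreground (1) if it is darker than its local mean minus offset."""
    rows, cols = len(image), len(image[0])
    binary = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            if image[r][c] < sum(window) / len(window) - offset:
                binary[r][c] = 1
    return binary

def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected foreground regions."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                top, left = min(top, y), min(left, x)
                bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Toy label image: bright background (200) with two dark one-pixel-wide strokes (50).
image = [[200] * 7 for _ in range(5)]
for row in range(1, 4):
    image[row][1] = 50
    image[row][5] = 50

boxes = connected_components(adaptive_threshold(image))
```

Each bounding box is a candidate region of interest that a digit classifier (an ANN in the paper) would then receive as input.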
Abstract:
Light Detection And Ranging (LIDAR) is an important modality in terrain and land surveying for many environmental, engineering and civil applications. This paper presents the framework for a recently developed unsupervised classification algorithm called Skewness Balancing for object and ground point separation in airborne LIDAR data. The main advantages of the algorithm are threshold-freedom and independence from LIDAR data format and resolution, while preserving object and terrain details. The framework for Skewness Balancing has been built in this contribution with a prediction model in which unknown LIDAR tiles can be categorised as “hilly” or “moderate” terrains. Accuracy assessment of the model is carried out using cross-validation with an overall accuracy of 95%. An extension to the algorithm is developed to address the overclassification issue for hilly terrain. For moderate terrain, the results show that from the classified tiles detached objects (buildings and vegetation) and attached objects (bridges and motorway junctions) are separated from bare earth (ground, roads and yards) which makes Skewness Balancing ideal to be integrated into geographic information system (GIS) software packages.
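The core idea behind skewness balancing, as it is commonly described, is that bare-earth elevations are roughly symmetrically distributed while objects add a positive tail; the highest points are removed until the skewness is no longer positive. The sketch below assumes that description and omits the framework, terrain prediction model and hilly-terrain extension mentioned in the abstract. The function names and the sample elevations are hypothetical.

```python
def skewness(values):
    """Population skewness: third central moment over the variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5

def skewness_balancing(elevations):
    """Split elevations into (ground, objects) without a user-supplied threshold."""
    ground = sorted(elevations)
    objects = []
    # Peel off the highest remaining point while the distribution is right-skewed.
    while len(ground) > 2 and skewness(ground) > 0:
        objects.append(ground.pop())
    return ground, objects

# Hypothetical elevations in centimetres: flat terrain plus two elevated returns.
elevations = [100, 98, 250, 101, 99, 300, 102, 100]
ground, objects = skewness_balancing(elevations)
```

No height threshold appears anywhere in the loop, which is the "threshold-freedom" the abstract refers to. With real floating-point elevations a small tolerance on the `> 0` test may be needed.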
Abstract:
A new class of shape features for region classification and high-level recognition is introduced. The novel Randomised Region Ray (RRR) features can be used to train binary decision trees for object category classification using an abstract representation of the scene. In particular, we address the problem of human detection using an over-segmented input image. We therefore do not rely on pixel values for training; instead, we design and train specialised classifiers on the sparse set of semantic regions that compose the image. Thanks to the abstract nature of the input, the trained classifier has the potential to be fast and applicable to extreme imaging conditions. We demonstrate and evaluate its performance in people detection using a pedestrian dataset.
Abstract:
This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through development of a logic-based inference engine based on Prolog. Threat detection performance is evaluated by testing against a range of datasets describing realistic situations, and the results demonstrate a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner).
Abstract:
Shape provides some of the most relevant information about an object, which makes it one of the most important visual attributes used to characterize objects. This paper introduces a novel approach for shape characterization, which combines modeling a shape as a complex network with the analysis of its complexity in a dynamic evolution context. Descriptors computed through this approach prove to be efficient in shape characterization and incorporate many desirable properties, such as scale and rotation invariance. Experiments using two different shape databases (an artificial shapes database and a leaf shape database) are presented in order to evaluate the method, and its results are compared to traditional shape analysis methods found in the literature. (C) 2009 Published by Elsevier B.V.
Abstract:
For the last two decades, researchers have been working on systems that assist drivers as effectively as possible and make driving safer. Computer vision has played a crucial part in the design of these systems. With the introduction of vision techniques, various autonomous and robust real-time traffic automation systems have been designed, such as traffic monitoring, traffic-related parameter estimation and intelligent vehicles. Among these, automatic detection and recognition of road signs has become an interesting research topic. The system can inform drivers about signs they do not recognize before they pass them. The aim of this research project is to present an Intelligent Road Sign Recognition System based on a state-of-the-art technique, the Support Vector Machine. The project is an extension of the work done at the ITS research Platform at Dalarna University [25]. The focus of this research work is on the recognition of road signs under analysis. When classifying an image, the sign's location, size and orientation in the image plane are irrelevant features, and one way to remove this ambiguity is to extract features that are invariant under the above-mentioned transformations. These invariant features are then used in a Support Vector Machine for classification. The Support Vector Machine is a supervised learning machine that solves problems in a higher-dimensional space with the help of kernel functions and is best known for classification problems.
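To make the notion of an invariant feature concrete, the sketch below computes the first Hu moment invariant of a binary shape. The abstract does not say which invariant features the project uses, so Hu's invariant is swapped in here purely as a well-known illustration; the function name and the toy shape are assumptions.

```python
def hu_phi1(pixels):
    """First Hu invariant, eta20 + eta02, of a shape given as (x, y) pixel coordinates."""
    n = len(pixels)  # zeroth moment (area) of a binary shape
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Central moments remove the dependence on location.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # Normalising second-order moments by area squared removes the dependence on size.
    return (mu20 + mu02) / n ** 2

shape = [(0, 0), (1, 0), (2, 0), (0, 1)]          # small L-shaped blob
shifted = [(x + 5, y + 7) for x, y in shape]      # translated copy
rotated = [(-y, x) for x, y in shape]             # copy rotated by 90 degrees

phi_original = hu_phi1(shape)
phi_shifted = hu_phi1(shifted)
phi_rotated = hu_phi1(rotated)
```

All three values agree, so a classifier fed this feature cannot be confused by where the sign sits in the image or how it is turned. On a pixel grid, scale invariance holds only approximately because of discretisation.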
Abstract:
This thesis presents a system to recognise and classify road and traffic signs for the purpose of developing an inventory of them which could assist highway engineers in updating and maintaining them. It uses images taken by a camera from a moving vehicle. The system is based on three major stages: colour segmentation, recognition, and classification. Four colour segmentation algorithms are developed and tested: a shadow and highlight invariant algorithm, a dynamic threshold algorithm, a modification of de la Escalera’s algorithm, and a fuzzy colour segmentation algorithm. All algorithms are tested using hundreds of images, and the shadow-highlight invariant algorithm is eventually chosen as the best performer because it is immune to shadows and highlights. It is also robust, as it was tested in different lighting conditions, weather conditions, and times of day. A successful segmentation rate of approximately 97% was achieved using this algorithm. Recognition of traffic signs is carried out using a fuzzy shape recogniser. Based on four shape measures (rectangularity, triangularity, ellipticity, and octagonality), fuzzy rules were developed to determine the shape of the sign. Among these shape measures, octagonality has been introduced in this research. The final decision of the recogniser is based on the combination of both the colour and shape of the sign. The recogniser was tested in a variety of testing conditions, giving an overall performance of approximately 88%. Classification was undertaken using a Support Vector Machine (SVM) classifier. The classification is carried out in two stages: classification of the rim’s shape followed by classification of the interior of the sign. The classifier was trained and tested using binary images in addition to five different types of features: Geometric moments, Zernike moments, Legendre moments, Orthogonal Fourier-Mellin moments, and Binary Haar features.
The performance of the SVM was tested using different features, kernels, SVM types, SVM parameters, and moment orders. The average classification rate achieved is about 97%. Binary images show the best testing results, followed by Legendre moments. The linear kernel gives the best testing results, followed by RBF. C-SVM shows very good performance, but ν-SVM gives better results in some cases.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
In this thesis we consider systems of finitely many particles moving on paths given by a strong Markov process and undergoing branching and reproduction at random times. The branching rate of a particle, its number of offspring and their spatial distribution are allowed to depend on the particle's position and possibly on the configuration of coexisting particles. In addition there is immigration of new particles, with the rate of immigration and the distribution of immigrants possibly depending on the configuration of pre-existing particles as well. In the first two chapters of this work, we concentrate on the case that the joint motion of particles is governed by a diffusion with interacting components. The resulting process of particle configurations was studied by E. Löcherbach (2002, 2004) and is known as a branching diffusion with immigration (BDI). Chapter 1 contains a detailed introduction of the basic model assumptions, in particular an assumption of ergodicity which guarantees that the BDI process is positive Harris recurrent with finite invariant measure on the configuration space. This object and a closely related quantity, namely the invariant occupation measure on the single-particle space, are investigated in Chapter 2 where we study the problem of the existence of Lebesgue-densities with nice regularity properties. For example, it turns out that the existence of a continuous density for the invariant measure depends on the mechanism by which newborn particles are distributed in space, namely whether branching particles reproduce at their death position or their offspring are distributed according to an absolutely continuous transition kernel. In Chapter 3, we assume that the quantities defining the model depend only on the spatial position but not on the configuration of coexisting particles. 
In this framework (which was considered by Höpfner and Löcherbach (2005) in the special case that branching particles reproduce at their death position), the particle motions are independent, and we can allow for more general Markov processes instead of diffusions. The resulting configuration process is a branching Markov process in the sense introduced by Ikeda, Nagasawa and Watanabe (1968), complemented by an immigration mechanism. Generalizing results obtained by Höpfner and Löcherbach (2005), we give sufficient conditions for ergodicity in the sense of positive recurrence of the configuration process and finiteness of the invariant occupation measure in the case of general particle motions and offspring distributions.
Abstract:
Pictorial representations of three-dimensional objects are often used to investigate animal cognitive abilities; however, investigators rarely evaluate whether the animals conceptualize the two-dimensional image as the object it is intended to represent. We tested for picture recognition in lion-tailed macaques by presenting five monkeys with digitized images of familiar foods on a touch screen. Monkeys viewed images of two different foods and learned that they would receive a piece of the one they touched first. After demonstrating that they would reliably select images of their preferred foods on one set of foods, animals were transferred to images of a second set of familiar foods. We assumed that if the monkeys recognized the images, they would spontaneously select images of their preferred foods on the second set of foods. Three monkeys selected images of their preferred foods significantly more often than chance on their first transfer session. In an additional test of the monkeys' picture recognition abilities, animals were presented with pairs of food images containing a medium-preference food paired with either a high-preference food or a low-preference food. The same three monkeys selected the medium-preference foods significantly more often when they were paired with low-preference foods and significantly less often when those same foods were paired with high-preference foods. Our novel design provided convincing evidence that macaques recognized the content of two-dimensional images on a touch screen. Results also suggested that the animals understood the connection between the two-dimensional images and the three-dimensional objects they represented.
Abstract:
Human invariant natural killer T (NKT) cell TCRs bind to CD1d via an "invariant" Vα24-Jα18 chain (iNKTα) paired to semi-invariant Vβ11 chains (iNKTβ). Single-amino acid variations at position 93 (p93) of iNKTα, immediately upstream of the "invariant" CDR3α region, have been reported in a substantial proportion of human iNKT-cell clones (4-30%). Although p93, a serine in most human iNKT-cell TCRs, makes no contact with CD1d, it could affect CD1d binding by altering the conformation of the crucial CDR3α loop. By generating recombinant refolded iNKT-cell TCRs, we show that natural single-nucleotide variations in iNKTα, translating to serine, threonine, asparagine or isoleucine at p93, exert a powerful effect on CD1d binding, with up to 28-fold differences in affinity between these variants. This effect was observed with CD1d loaded with either the artificial α-galactosylceramide antigens KRN7000 or OCH, or the endogenous glycolipid β-galactosylceramide, and its importance for autoreactive recognition of endogenous lipids was demonstrated by the binding of variant iNKT-cell TCR tetramers to cell surface expressed CD1d. The serine-containing variant showed the strongest CD1d binding, offering an explanation for its predominance in vivo. Complementary molecular dynamics modeling studies were consistent with an impact of p93 on the conformation of the CDR3α loop.
Abstract:
The goal of this study was to investigate recognition memory performance across the lifespan and to determine how estimates of recollection and familiarity contribute to performance. In each of three experiments, participants from five groups from 14 up to 85 years of age (children, young adults, middle-aged adults, young-old adults, and old-old adults) were presented with high- and low-frequency words in a study phase and were tested immediately afterwards and/or after a one day retention interval. The results showed that word frequency and retention interval affected recognition memory performance as well as estimates of recollection and familiarity. Across the lifespan, the trajectory of recognition memory followed an inverse u-shape function that was neither affected by word frequency nor by retention interval. The trajectory of estimates of recollection also followed an inverse u-shape function, and was especially pronounced for low-frequency words. In contrast, estimates of familiarity did not differ across the lifespan. The results indicate that age differences in recognition memory are mainly due to differences in processes related to recollection while the contribution of familiarity-based processes seems to be age-invariant.
Abstract:
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition, based on the Bag of Features (BoF) model. An extensive technical investigation was conducted for the identification and optimization of the best performing components involved in the BoF architecture, as well as the estimation of the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset with nearly 5,000 food images was created and organized into 11 classes. The optimized system computes dense local features, using the scale-invariant feature transform on the HSV color space, builds a visual dictionary of 10,000 visual words by using the hierarchical k-means clustering and finally classifies the food images with a linear support vector machine classifier. The system achieved classification accuracy of the order of 78%, thus proving the feasibility of the proposed approach in a very challenging image dataset.
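The quantisation step at the heart of the Bag of Features model can be sketched as follows: each local descriptor is assigned to its nearest visual word, and the image is represented by the normalised histogram of word counts. This is a toy illustration only. The study uses dense SIFT descriptors, a 10,000-word dictionary built with hierarchical k-means and a linear SVM, none of which is reproduced here; the 2-D descriptors, the three-word dictionary and the function name are assumptions.

```python
from math import dist

def bof_histogram(descriptors, dictionary):
    """Normalised histogram of nearest visual words for one image."""
    counts = [0] * len(dictionary)
    for descriptor in descriptors:
        # Brute-force nearest visual word by Euclidean distance.
        word = min(range(len(dictionary)),
                   key=lambda k: dist(descriptor, dictionary[k]))
        counts[word] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [c / total for c in counts]

# Hypothetical visual dictionary (cluster centres) and local descriptors.
dictionary = [(0, 0), (10, 10), (0, 10)]
descriptors = [(1, 1), (0, 2), (9, 9), (1, 9)]

histogram = bof_histogram(descriptors, dictionary)
```

The resulting fixed-length histogram is what a classifier such as a linear SVM would take as input, regardless of how many local descriptors the image produced.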