891 results for Computer Imaging, Vision, Pattern Recognition and Graphics
Abstract:
In this paper, the fusion of probabilistic knowledge-based classification rules and learning automata theory is proposed and as a result we present a set of probabilistic classification rules with self-learning capability. The probabilities of the classification rules change dynamically guided by a supervised reinforcement process aimed at obtaining an optimum classification accuracy. This novel classifier is applied to the automatic recognition of digital images corresponding to visual landmarks for the autonomous navigation of an unmanned aerial vehicle (UAV) developed by the authors. The classification accuracy of the proposed classifier and its comparison with well-established pattern recognition methods is finally reported.
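The abstract does not say which reinforcement scheme updates the rule probabilities. As a rough illustration only, the sketch below assumes the classical linear reward-inaction scheme from learning automata theory; the function name, the `rate` parameter and the example probabilities are hypothetical and not taken from the paper.

```python
def reward_inaction_update(probs, chosen, rewarded, rate=0.1):
    """Linear reward-inaction update of a probability vector (assumed scheme).

    probs    -- current probabilities of the competing rules (sum to 1)
    chosen   -- index of the rule that fired
    rewarded -- True if the supervisor confirmed the classification
    rate     -- learning rate in (0, 1); hypothetical default
    """
    if not rewarded:
        # Inaction: a wrong classification leaves the probabilities unchanged.
        return list(probs)
    # Reward: move the chosen rule towards 1 and shrink the others
    # proportionally, so the vector still sums to 1.
    return [
        p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
        for i, p in enumerate(probs)
    ]


probs = [0.5, 0.3, 0.2]
updated = reward_inaction_update(probs, chosen=1, rewarded=True)
print(updated)  # approximately [0.45, 0.37, 0.18]
```

Repeating this update over labelled training samples gradually concentrates probability on the rules that classify correctly, which is the general behaviour the abstract describes.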
Abstract:
In this paper we propose an innovative method for the automatic detection and tracking of road traffic signs using an onboard stereo camera. It combines monocular and stereo analysis strategies to increase the reliability of the detections, so that it can boost the performance of any traffic sign recognition scheme. First, an adaptive color- and appearance-based detection is applied at the single-camera level to generate a set of traffic sign hypotheses. In turn, stereo information allows for sparse 3D reconstruction of potential traffic signs through a SURF-based matching strategy. Specifically, the plane that best fits the cloud of 3D points traced back from feature matches is estimated using a RANSAC-based approach to improve robustness to outliers. Temporal consistency of the 3D information is ensured through a Kalman-based tracking stage. This also allows for the generation of a predicted 3D traffic sign model, which is in turn used to enhance the previously mentioned color-based detector through a feedback loop, thus improving detection accuracy. The proposed solution has been tested with real sequences under several illumination conditions and in both urban areas and highways, achieving very high detection rates in challenging environments, including rapid motion and significant perspective distortion.
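The RANSAC plane-fitting step mentioned above can be sketched as follows. This is a minimal, generic version in pure Python, not the authors' implementation; the function names, the distance threshold, the iteration count and the synthetic points are all assumptions made for illustration.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) of the plane n.x + d = 0 through three points,
    or None if the points are (nearly) collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # Cross product of the two edge vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Keep the candidate plane with the most points within `threshold`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate sample, draw again
        n, d = plane
        inliers = [
            pt for pt in points
            if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic data: 25 points on the plane z = 2 plus four outliers.
points = [(float(x), float(y), 2.0) for x in range(5) for y in range(5)]
points += [(0.0, 0.0, 5.0), (1.0, 3.0, -4.0), (2.0, 2.0, 9.5), (4.0, 1.0, 7.3)]
(normal, d), inliers = ransac_plane(points)
print(normal, d, len(inliers))
```

In practice the winning inlier set is usually refined with a least-squares fit; that refinement is omitted here to keep the sketch short.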
Abstract:
In recent years, the computer vision community has shown great interest in depth-based applications thanks to the performance and flexibility of the new generation of RGB-D imagery. In this paper, we present an efficient background subtraction algorithm based on the fusion of multiple region-based classifiers that processes depth and color data provided by RGB-D cameras. Foreground objects are detected by combining a region-based foreground prediction (based on depth data) with different background models (based on a Mixture of Gaussians algorithm) providing color and depth descriptions of the scene at pixel and region level. The information given by these modules is fused in a mixture-of-experts fashion to improve the foreground detection accuracy. The main contributions of the paper are the region-based models of both background and foreground, built from the depth and color data. The results obtained on different database sequences demonstrate that the proposed approach achieves higher detection accuracy than existing state-of-the-art techniques.
Abstract:
One of the greatest challenges for the scientific community is to give machines the capabilities of the human visual and cognitive systems so that, for example, in video surveillance environments, they can automatically provide a reliable description of what is happening in the scene. This thesis proposes a reference framework and uses it to discuss and set out the steps needed to develop more intelligent systems capable of extracting and analysing, at different levels of abstraction and by means of independent processing modules, the information needed to understand what is happening in a broad set of scenarios of different kinds. It starts from a requirements analysis and identifies the current challenges for this type of system, which in themselves constitute the objectives of this thesis, thereby contributing a knowledge-based data model that makes it possible to analyse different situations in which people and vehicles are the main actors, while leaving the door open to adaptation to other domains. Likewise, the different processes that can be launched internally are studied, along with the need to integrate feedback mechanisms at different levels that allow the system to adapt better to changes in the environment. As a result, a hierarchical reference framework is proposed that integrates perception, interpretation and learning capabilities to overcome the challenges identified in this field, making it possible to develop more robust, flexible and intelligent surveillance systems capable of operating in a variety of environments. Experimental results obtained on different data samples (mainly video sequences) demonstrate the effectiveness of the proposed framework compared with others proposed in the past.
A first case study demonstrates the creation of a monitoring system for outdoor parking environments for vehicle detection and the analysis of free parking spaces. A second case study demonstrates the flexibility of the proposed reference framework to adapt to the requirements of a completely different surveillance environment, namely a smart home where the automatic analysis of activities of daily living is the focus of the study. ABSTRACT One of the most ambitious objectives for the Computer Vision and Pattern Recognition research community is for machines to achieve capacities similar to those of the human visual and cognitive system, and thus provide a trustworthy description of what is happening in the scene under surveillance. Accordingly, a number of well-established architectural frameworks for scenario understanding, aimed at developing applications that work in a variety of environments, can be found in the literature. In this thesis, a highly descriptive methodology for the development of scene understanding applications is presented. It consists of a set of formal guidelines that let machines extract and analyse, at different levels of abstraction and by means of independent processing modules that interact with each other, the information necessary to understand a broad set of different real-world surveillance scenarios. Taking into account the challenges posed by working at both low and high levels, we contribute a highly descriptive knowledge-based data model for the analysis of different situations in which people and vehicles are the main actors, leaving the door open for the development of interesting applications in diverse smart domains. Recommendations to let systems achieve high-level behaviour understanding are also provided.
Furthermore, we propose integrating feedback mechanisms so that any system can better understand its environment and the surrounding logical context, thus reducing uncertainty and noise and increasing its robustness and precision in the face of low-level or high-level errors. As a result, a hierarchical cognitive reference architecture is proposed that integrates the perception, interpretation, attention and learning capabilities needed to overcome the main challenges identified in this area of research, thus making it possible to develop more robust, flexible and smart surveillance systems that cope with the different requirements of a variety of environments. Once the crucial issues that should be treated explicitly in the design of this kind of system have been formulated and discussed, experimental results show the effectiveness of the proposed framework compared with others proposed in the past. Two case studies were implemented to test the capabilities of the framework. The first case study shows how the proposed framework can be used to create intelligent parking monitoring systems. The second case study demonstrates the flexibility of the system to cope with the requirements of a completely different environment, a smart home where activities of daily living are performed. Finally, general conclusions and lines of future work to further enhance the capabilities of the proposed framework are presented.
Abstract:
In this paper, we propose a novel method for the unsupervised clustering of graphs in the context of the constellation approach to object recognition. The method is an EM central clustering algorithm which builds prototypical graphs on the basis of fast matching with graph transformations. Our experiments, both with random graphs and in realistic situations (visual localization), show that our prototypes improve on the set median graphs and also on the prototypes derived from our previous incremental method. We also discuss how the method scales with a growing number of images.
Abstract:
The need to digitise music scores has led to the development of Optical Music Recognition (OMR) tools. Unfortunately, the performance of these systems is still far from providing acceptable results. This situation forces the user to be involved in the process due to the need of correcting the mistakes made during recognition. However, this correction is performed over the output of the system, so these interventions are not exploited to improve the performance of the recognition. This work sets the scenario in which human and machine interact to accurately complete the OMR task with the least possible effort for the user.
Abstract:
In endothermic insects, thermoregulatory mechanisms modulate heat transfer from the thorax to the abdomen to avoid overheating or cooling, so that flight can be sustained for longer. Scarabaeus sacer and S. cicatricosus, two sympatric species with the same habitat and food preferences, showed daily temporal segregation, with S. cicatricosus being more active during the warmer hours of the day, whereas S. sacer avoids them. In the case of S. sacer, its endothermy pattern suggested an adaptive capacity for thorax heat retention. In S. cicatricosus, an active ‘heat exchanger’ mechanism was suggested. However, no empirical evidence had been documented until now. Thermographic sequences recorded during flight showed evidence of the existence of both thermoregulatory mechanisms. In S. sacer, infrared sequences showed a possible heat insulator (passive thermal window), which prevents heat transfer from the meso- and metathorax to the abdomen during flight. In S. cicatricosus, infrared sequences revealed clear and effective heat flow between the thorax and abdomen (abdominal heat transfer) that should be considered the main mechanism of thermoregulation. This was related to a subsequent increase in abdominal pumping (as a cooling mechanism) during flight. Computer microtomography scanning, anatomical dissections and internal air volume measurements showed two possible heat retention mechanisms for S. sacer: the abdominal air sacs and the development of the internal abdominal sternites, which could explain the thermoregulation between thorax and abdomen. Our results suggest that interspecific interactions between sympatric species are regulated by very different mechanisms. These mechanisms create unique thermal niches for the different species, thereby preventing competition and modulating the spatio-temporal distribution and composition of dung beetle assemblages.
Abstract:
In this work, a modified version of the elastic bunch graph matching (EBGM) algorithm for face recognition is introduced. First, faces are detected by using a fuzzy skin detector based on the RGB color space. Then, the fiducial points for the facial graph are extracted automatically by adjusting a grid of points to the result of an edge detector. After that, the positions of the nodes, their relations with their neighbors and their Gabor jets are calculated in order to obtain the feature vector defining each face. A self-organizing map (SOM) framework is then presented, in which the calculation of the winning neuron and the recognition process are performed by using a similarity function that takes into account both the geometric and texture information of the facial graph. The set of experiments carried out for our SOM-EBGM method shows the accuracy of our proposal when compared with other state-of-the-art methods.
Abstract:
"Supported in part by Contract AT(11-1) 1018 with the U.S. Atomic Energy Commission and the Advanced Research Projects Agency."
Abstract:
"C00-1018-1213"--Cover.
Abstract:
Supported by: Contract AT (11-1)-1018 with the U.S. Atomic Energy Commission and the Advanced Research Projects Agency.
Abstract:
We introduce a new second-order method of texture analysis called Adaptive Multi-Scale Grey Level Co-occurrence Matrix (AMSGLCM), based on the well-known Grey Level Co-occurrence Matrix (GLCM) method. The method deviates significantly from GLCM in that features are extracted, not via a fixed 2D weighting function of co-occurrence matrix elements, but by a variable summation of matrix elements in 3D localized neighborhoods. We subsequently present a new methodology for extracting optimized, highly discriminant features from these localized areas using adaptive Gaussian weighting functions. Genetic Algorithm (GA) optimization is used to produce a set of features whose classification worth is evaluated by discriminatory power and feature correlation considerations. We critically appraised the performance of our method and GLCM in pairwise classification of images from visually similar texture classes, captured from Markov Random Field (MRF) synthesized, natural, and biological origins. In these cross-validated classification trials, our method demonstrated significant benefits over GLCM, including increased feature discriminatory power, automatic feature adaptability, and significantly improved classification performance.
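For context, the baseline GLCM computation that the abstract contrasts with AMSGLCM can be sketched as below. This covers only the standard co-occurrence matrix and one fixed-weight feature (contrast); it does not implement the adaptive multi-scale method. The function names, the pixel offset and the small test image are illustrative assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of grey levels (i, j) for pixel pairs separated
    by the offset (dx, dy). `image` is a list of rows of integer levels."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: a fixed weighting (i - j)^2 applied to the
    normalised co-occurrence matrix."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
m = glcm(image, levels=4)  # horizontal neighbour, one pixel to the right
print(m)            # [[2, 2, 1, 0], [0, 2, 0, 0], [0, 0, 3, 1], [0, 0, 0, 1]]
print(contrast(m))  # 7/12, about 0.583
```

The weighting `(i - j) ** 2` is the kind of fixed 2D function of matrix elements that, according to the abstract, AMSGLCM replaces with adaptive summation over localized neighborhoods.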
Abstract:
The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
Abstract:
Automatic signature verification is a well-established and active area of research with numerous applications such as bank check verification, ATM access, etc. This paper proposes a novel approach to the problem of automatic off-line signature verification and forgery detection. The proposed approach is based on fuzzy modeling that employs the Takagi-Sugeno (TS) model. Signature verification and forgery detection are carried out using angle features extracted using the box approach. Each feature corresponds to a fuzzy set. The features are fuzzified by an exponential membership function involved in the TS model, which is modified to include structural parameters. The structural parameters are devised to take account of possible variations due to handwriting styles and to reflect moods. The membership functions constitute weights in the TS model. The optimization of the output of the TS model with respect to the structural parameters yields the solution for the parameters. We have also derived two TS models, by considering a rule for each input feature in the first formulation (multiple rules) and by considering a single rule for all input features in the second formulation. In this work, we have found that the TS model with multiple rules is better than the TS model with a single rule for detecting three types of forgeries (random, skilled and unskilled) from a large database of sample signatures, in addition to verifying genuine signatures. We have also devised three approaches, viz., an innovative approach and two intuitive approaches, using the TS model with multiple rules for improved performance. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Abstract:
Aim: Polysomnography (PSG) is the current standard protocol for sleep disordered breathing (SDB) investigation in children. Presently, there are limited reliable screening tests for both central (CE) and obstructive (OE) respiratory events. This study compared three indices, derived from pulse oximetry and electrocardiogram (ECG), with the PSG gold standard. These indices were heart rate (HR) variability, arterial blood oxygen de-saturation (SaO2) and pulse transit time (PTT). Methods: 15 children (12 male) from routine PSG studies were recruited (aged 3-14 years). The characteristics of the three indices were based on known criteria for respiratory events (RPE). Their estimation, singly and in combination, was evaluated against simultaneously scored PSG recordings. Results: 215 RPE and 215 tidal breathing events were analysed. For OE, the obtained sensitivity was HR (0.703), SaO2 (0.047), PTT (0.750), considering all three indices (0) and any one of the indices (0.828), while specificity was (0.891), (0.938), (0.922), (0.953) and (0.859) respectively. For CE, the sensitivity was HR (0.715), SaO2 (0.278), PTT (0.662), considering all indices (0.040) and any one of the indices (0.868), while specificity was (0.815), (0.954), (0.901), (0.960) and (0.762) respectively. Conclusions: Preliminary findings herein suggest that the latter combination of these non-invasive indices is a promising screening method for SDB in children.
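To make the reported figures easier to read, the sketch below shows how sensitivity and specificity are computed when the three indices are combined with an "all" rule versus an "any" rule. The event list is made up purely for illustration and is not the study's data; the function name `evaluate` is likewise hypothetical.

```python
def evaluate(events, rule):
    """Return (sensitivity, specificity) of a combination rule.

    events -- list of (is_respiratory_event, (hr, sao2, ptt)) pairs, where
              each flag says whether that index signalled an event
    rule   -- function combining the three flags, e.g. any or all
    Assumes the list contains at least one true event and one non-event.
    """
    tp = fn = tn = fp = 0
    for is_event, flags in events:
        detected = rule(flags)
        if is_event and detected:
            tp += 1
        elif is_event:
            fn += 1
        elif detected:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scored events: four respiratory events, four tidal breaths.
events = [
    (True, (True, False, True)),
    (True, (True, False, False)),
    (True, (False, False, False)),
    (True, (True, True, True)),
    (False, (False, False, False)),
    (False, (True, False, False)),
    (False, (False, False, False)),
    (False, (False, False, False)),
]
print(evaluate(events, any))  # (0.75, 0.75)
print(evaluate(events, all))  # (0.25, 1.0)
```

Requiring all indices to agree raises specificity at the cost of sensitivity, while accepting any single index does the opposite, which matches the direction of the trade-off in the results above.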