179 results for occlusions
Abstract:
Handling appearance variations is a very challenging problem for visual tracking. Existing methods usually solve this problem by relying on an effective appearance model with two features: (1) being capable of discriminating the tracked target from its background, (2) being robust to the target's appearance variations during tracking. Instead of integrating the two requirements into the appearance model, in this paper, we propose a tracking method that deals with these problems separately based on sparse representation in a particle filter framework. Each target candidate defined by a particle is linearly represented by the target and background templates with an additive representation error. Discriminating the target from its background is achieved by activating the target templates or the background templates in the linear system in a competitive manner. The target's appearance variations are directly modeled as the representation error. An online algorithm is used to learn the basis functions that sparsely span the representation error. The linear system is solved via ℓ1 minimization. The candidate with the smallest reconstruction error using the target templates is selected as the tracking result. We test the proposed approach using four sequences with heavy occlusions, large pose variations, drastic illumination changes and low foreground-background contrast. The proposed approach shows excellent performance in comparison with two latest state-of-the-art trackers.
Abstract:
Object tracking is an active research area owing to its importance in human-computer interfaces, teleconferencing and video surveillance. However, reliable tracking of objects in the presence of occlusions, pose and illumination changes is still a challenging topic. In this paper, we introduce a novel tracking approach that fuses two cues, namely colour and spatio-temporal motion energy, within a particle filter based framework. We compute a measure of coherent motion over two image frames, which reveals the spatio-temporal dynamics of the target. At the same time, the importance of both colour and motion energy cues is determined in the stage of reliability evaluation. This determination helps maintain the performance of the tracking system against abrupt appearance changes. Experimental results demonstrate that the proposed method outperforms other state-of-the-art techniques on the test datasets used.
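The abstract above does not state how the two cues are combined, so the following is only a minimal sketch under an assumed rule: each particle's colour and motion likelihoods are mixed by a convex combination, with a hypothetical scalar `colour_reliability` standing in for the output of the reliability evaluation stage. All function and parameter names are illustrative and not taken from the paper.

```python
def fuse_cue_weights(colour_lik, motion_lik, colour_reliability):
    """Return normalised particle weights from two per-particle cue likelihoods.

    colour_reliability is a hypothetical value in [0, 1]; the motion cue
    receives the complementary weight 1 - colour_reliability (an assumption).
    """
    if not 0.0 <= colour_reliability <= 1.0:
        raise ValueError("colour_reliability must lie in [0, 1]")
    if len(colour_lik) != len(motion_lik):
        raise ValueError("both cues need one likelihood per particle")
    fused = [
        colour_reliability * c + (1.0 - colour_reliability) * m
        for c, m in zip(colour_lik, motion_lik)
    ]
    total = sum(fused)
    if total <= 0.0:
        # Degenerate case: no cue supports any particle, fall back to uniform.
        return [1.0 / len(fused)] * len(fused)
    return [w / total for w in fused]


def estimate_state(particles, weights):
    """Weighted mean of 2D particle positions given as (x, y) tuples."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return (x, y)
```

With this rule, lowering `colour_reliability` when the colour cue becomes unreliable (for example under an abrupt illumination change) shifts the estimate towards the particles favoured by the motion cue.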
Abstract:
In this paper we extend the minimum-cost network flow approach to multi-target tracking by incorporating a motion model, allowing the tracker to better cope with long-term occlusions and missed detections. In our new method, the tracking problem is solved iteratively. First, an initial tracking solution is found without the help of motion information. Given this initial set of tracklets, the motion at each detection is estimated and used to refine the tracking solution. Finally, special edges are added to the tracking graph, allowing a further revised tracking solution to be found, where distant tracklets may be linked based on motion similarity. Our system has been tested on the PETS S2.L1 and Oxford town-center sequences, outperforming the baseline system and achieving results comparable with the current state of the art.
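As an illustration of linking distant tracklets by motion similarity, the sketch below scores a candidate link by extrapolating the earlier tracklet at constant velocity across the gap and measuring how far the prediction lands from the first detection of the later tracklet. The constant-velocity model, the `(x, y, frame)` detection format and the function names are assumptions made for this example; the abstract does not specify the edge costs actually used in the flow graph.

```python
import math


def end_velocity(tracklet):
    """Velocity (per frame) estimated from the last two detections (x, y, frame)."""
    (x0, y0, t0), (x1, y1, t1) = tracklet[-2], tracklet[-1]
    dt = t1 - t0
    return ((x1 - x0) / dt, (y1 - y0) / dt)


def link_cost(earlier, later):
    """Distance between the constant-velocity prediction of `earlier`
    and the first detection of `later`; infinite if they overlap in time."""
    vx, vy = end_velocity(earlier)
    x_end, y_end, t_end = earlier[-1]
    x_start, y_start, t_start = later[0]
    gap = t_start - t_end
    if gap <= 0:
        return math.inf
    predicted_x = x_end + vx * gap
    predicted_y = y_end + vy * gap
    return math.hypot(predicted_x - x_start, predicted_y - y_start)
```

A low cost suggests that the two tracklets could belong to the same target that went undetected during the gap, for instance while occluded.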
Abstract:
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention. Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions.
Abstract:
Attention is usually modelled by sequential fixation of peaks in saliency maps. Those maps code local conspicuity: complexity, colour and texture. Such features have no relation to entire objects, unless also disparity and optical flow are considered, which often segregate entire objects from their background. Recently we developed a model of local gist vision: which types of objects are about where in a scene. This model addresses man-made objects which are dominated by a small shape repertoire: squares, rectangles, trapeziums, triangles, circles and ellipses. Only exploiting local colour contrast, the model can detect these shapes by a small hierarchy of cell layers devoted to low- and mid-level geometry. The model has been tested successfully on video sequences containing traffic signs and other scenes, and partial occlusions were not problematic.
Abstract:
Target tracking with bearing-only sensors is a challenging problem when the target moves dynamically in complex scenarios. Besides the partial observability of such sensors, they have limited fields of view, occlusions can occur, etc. In those cases, cooperative approaches with multiple tracking robots are interesting, but the different sources of uncertain information need to be considered appropriately in order to achieve better estimates. Even though there exist probabilistic filters that can estimate the position of a target while dealing with uncertainties, bearing-only measurements usually bring additional problems with initialization and data association. In this paper, we propose a multi-robot triangulation method with a dynamic baseline that can triangulate bearing-only measurements in a probabilistic manner to produce 3D observations. This method is combined with a decentralized stochastic filter and used to tackle those initialization and data association issues. The approach is validated with simulations and field experiments where a team of aerial and ground robots with cameras track a dynamic target.
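To make the geometry behind triangulating bearing-only measurements concrete, here is a deterministic sketch: two robots at known positions each observe a bearing direction, and the 3D observation is taken as the midpoint of the shortest segment joining the two rays. This is the classical midpoint method, not the probabilistic formulation the abstract describes, and the function name and return convention are assumptions for illustration.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_bearings(p1, d1, p2, d2, eps=1e-9):
    """Triangulate two bearing rays p + s*d (s >= 0) in 3D.

    Returns (point, gap): the midpoint of the closest points on the two
    rays and the distance between those points. Returns None when the rays
    are nearly parallel or the closest approach lies behind a sensor.
    """
    w = [a - b for a, b in zip(p1, p2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if denom <= eps * a * c:
        return None  # parallel (or zero-length) bearings: no baseline to exploit
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    if s < 0.0 or t < 0.0:
        return None  # the intersection would be behind one of the sensors
    q1 = [p + s * u for p, u in zip(p1, d1)]
    q2 = [p + t * u for p, u in zip(p2, d2)]
    point = tuple((x + y) / 2.0 for x, y in zip(q1, q2))
    gap = math.sqrt(sum((x - y) ** 2 for x, y in zip(q1, q2)))
    return point, gap
```

The returned `gap` could serve as a crude consistency check when deciding whether two bearings refer to the same target, since bearings of the same target should pass close to each other.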
Abstract:
Carried out under joint supervision (cotutelle) with the M2S laboratory of Rennes 2
Abstract:
Introduction. In utero, infection of the maternal and fetal membranes (chorioamnionitis) often goes unnoticed and, particularly when associated with acidemia due to umbilical cord occlusion (UCO), as would occur during labour, can cause brain injury and have long-term peri- and postnatal neurological repercussions for the fetus. There is currently no means of detecting these pathological conditions early in utero in order to prevent or limit such damage. Hypotheses. 1) The fetal electroencephalogram (EEG) obtained from the fetal scalp could serve as an auxiliary tool to electronic fetal heart rate (FHR) monitoring for the early detection of fetal acidemia and neurological insult; 2) the sampling rate of the fetal ECG (fECG) has a major impact on continuous monitoring of fetal heart rate variability (fHRV) in the prediction of fetal acidemia; 3) the patterns of correlation between fHRV and pro-inflammatory cytokines will reflect the spontaneous versus inflammatory response states of the cholinergic anti-inflammatory pathway (CAP); 4) through the development of a mathematical prediction model, prediction of pH and base excess (BE) at birth will be possible with only one hour of fECG monitoring. Methods. In a series of basic and clinical studies, using sheep and a cohort of women in labour as the experimental and clinical models respectively, we modelled 1) cerebral hypoxia resulting from sequences of umbilical cord occlusions of increasing severity until a critical pH limit of 7.00 was reached, as an experimental method analogous to human labour, to test the first and second hypotheses; 2) moderate fetal inflammation, by administering LPS to another animal cohort, to test the third hypothesis; and 3) a mathematical prediction model built from clinically validated parameters and measures that would make it possible to determine the predictors of fetal distress, to test the last hypothesis. Results. The repetitive UCO series resulted in marked acidosis (arterial pH 7.35±0.01 to 7.00±0.01), a decrease in EEG amplitudes synchronized with the UCO-induced FHR decelerations and accompanied by a pathological drop in arterial blood pressure (ABP), and a marked increase in fHRV with worsening hypoxia-acidemia at 1000 Hz, but not at 4 Hz, the sampling rate used clinically. LPS administration causes systemic inflammation in the fetus, with IL-6 peaking 3 h later and fHRV changes precisely tracking this temporal cytokine profile. Clinically, with our original and validation cohorts, a statistical model based on a matrix of 103 fHRV measures (R2 = 0.90, P < 0.001) predicts pH but not BE, with one hour of FHR recording before pushing. Conclusions. The decrease in EEG amplitude suggests a neuroprotective adaptive shutdown mechanism of the brain and suggests that fetal EEG may be a useful adjunct to FHR monitoring during high-risk labour in women. The ability of fHRV to detect worsening hypoxia-acidemia early in the fetus at 1000 Hz versus 4 Hz suggests that a more sensitive mode of fetal ECG acquisition could be a solution. Distinctive profiles of fHRV measures, identified in correlation with levels of inflammation, open a new avenue for characterizing the inflammatory profile of the fetal response to infection. Clinically, bedside monitoring that predicts pH and BE at birth from fHRV measures would allow more explicit visual interpretations for more accurate obstetric decision-making during labour.
Abstract:
In vascular surgery, access to the femoral artery, whether by surgical incision or by a percutaneous approach, is very frequently used for a multitude of vascular or endovascular interventions: various bypasses, the treatment of arterial occlusions, aneurysm repair and stent-graft placement. The overall objective of this research project is to facilitate femoral artery approaches and reduce their risks through a better anatomical understanding of the femoral triangle. The methodology relied on cadavers specially embalmed using the method developed by Walter Thiel. The results presented in this thesis made it possible to propose solutions to clinical problems in vascular surgery. First, the study of the cutaneous vascularization of the femoral triangle led to the proposal of new surgical incisions to limit cutaneous devascularization of wounds and thereby reduce the healing problems observed. Next, we validated the radiographic and ultrasound identification of the femoral artery where it crosses the inguinal ligament, in order to facilitate identification of a suitable arterial puncture site. Finally, we developed a simple ultrasound method that facilitates the percutaneous approach to the femoral artery, even in obese patients. This research project has multiple benefits for clinicians: the study provides a better three-dimensional anatomical understanding of the femoral triangle, and the techniques proposed in this thesis could improve surgical practice and facilitate physicians' work. However, these proposals must now be validated clinically.
Abstract:
A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty in isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, attracted and pay attention modes as being appropriate for data and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating with a 3D from 2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both due to a reduction in the number of features, and due to a reduction in the number of matches during recognition using the information derived during selection. 
Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.
Abstract:
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three scales or one scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.
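For readers unfamiliar with Gaussian-derivative descriptors, the sketch below builds sampled derivative-of-Gaussian kernels up to the third order and stacks their responses at one location into a small descriptor. It is a 1D simplification chosen for brevity: the descriptors evaluated in the abstract are computed on 2D images, and the kernel radius of 4 sigma, the function names and the convolution convention are assumptions of this example.

```python
import math


def gaussian_derivative_kernel(sigma, order, radius=None):
    """Sampled derivative of a 1D Gaussian for order 0, 1, 2 or 3."""
    if order not in (0, 1, 2, 3):
        raise ValueError("order must be 0, 1, 2 or 3")
    if radius is None:
        radius = int(math.ceil(4 * sigma))  # assumed truncation radius
    kernel = []
    for k in range(-radius, radius + 1):
        g = math.exp(-k * k / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        if order == 0:
            factor = 1.0
        elif order == 1:
            factor = -k / sigma ** 2
        elif order == 2:
            factor = (k * k - sigma ** 2) / sigma ** 4
        else:
            factor = (3.0 * sigma ** 2 * k - k ** 3) / sigma ** 6
        kernel.append(factor * g)
    return kernel


def local_descriptor(signal, index, sigma):
    """Responses of derivative orders 1 to 3 at `index`, computed by convolution."""
    descriptor = []
    for order in (1, 2, 3):
        kernel = gaussian_derivative_kernel(sigma, order)
        radius = len(kernel) // 2
        if index - radius < 0 or index + radius >= len(signal):
            raise IndexError("kernel support falls outside the signal")
        response = sum(
            kernel[k + radius] * signal[index - k] for k in range(-radius, radius + 1)
        )
        descriptor.append(response)
    return descriptor
```

On a linear ramp the first-order response is close to the slope while the higher orders stay near zero, which is the behaviour expected of smoothed derivatives.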
Abstract:
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor, based on a set of oriented Gaussian derivative filters, is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.
Abstract:
This paper deals with the problem of navigation for an unmanned underwater vehicle (UUV) through image mosaicking. It represents a first step towards a real-time vision-based navigation system for a small-class low-cost UUV. We propose a navigation system composed of: (i) an image mosaicking module which provides velocity estimates; and (ii) an extended Kalman filter based on the hydrodynamic equation of motion, previously identified for this particular UUV. The resulting system is able to estimate the position and velocity of the robot. Moreover, it is able to deal with visual occlusions that usually appear when the sea bottom does not have enough visual features to solve the correspondence problem in a certain area of the trajectory.
Abstract:
This paper presents a complete solution for creating accurate 3D textured models from monocular video sequences. The methods are developed within the framework of sequential structure from motion, where a 3D model of the environment is maintained and updated as new visual information becomes available. The camera position is recovered by directly associating the 3D scene model with local image observations. Compared to standard structure from motion techniques, this approach decreases the error accumulation while increasing the robustness to scene occlusions and feature association failures. The obtained 3D information is used to generate high quality, composite visual maps of the scene (mosaics). The visual maps are used to create texture-mapped, realistic views of the scene.