959 results for 3D Object Tracking


Relevance:

90.00%

Publisher:

Abstract:

A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation, few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and can foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images with computed tomography (CT) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image called EMMA. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation. Finally, we will describe a number of additional real-world applications that can be solved efficiently and reliably using EMMA. EMMA can be used in machine learning to find maximally informative projections of high-dimensional data. EMMA can also be used to detect and correct corruption in magnetic resonance images (MRI).
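As a rough illustration of intensity-based alignment by mutual information, the sketch below scores candidate alignments of a 1D "model" signal against an "image" signal and keeps the best one. It is not EMMA: it swaps in a plain histogram estimate of mutual information and an exhaustive search over integer shifts, whereas the abstract describes an implementation based on stochastic approximation. The function names, the bin count and the assumption that intensities lie in [0, 1] are all illustrative choices.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=8, lo=0.0, hi=1.0):
    """Histogram estimate of I(X;Y) in bits for paired samples.

    Assumption: intensities are normalised to [lo, hi]; bins=8 is arbitrary.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def bin_of(v):
        # Map a value to a bin index, clamping out-of-range values.
        i = int((v - lo) / (hi - lo) * bins)
        return min(max(i, 0), bins - 1)

    pairs = [(bin_of(x), bin_of(y)) for x, y in zip(xs, ys)]
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(a for a, _ in pairs)
    py = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def best_shift(model, image, max_shift=3):
    """Return the integer shift of `model` that maximises MI with `image`.

    Exhaustive search stands in for the stochastic optimisation in the source.
    """
    def score(s):
        # Compare only the samples that overlap for this shift.
        pairs = [(model[i], image[i + s]) for i in range(len(model))
                 if 0 <= i + s < len(image)]
        return mutual_information([p[0] for p in pairs],
                                  [p[1] for p in pairs])
    return max(range(-max_shift, max_shift + 1), key=score)
```

Because mutual information only asks whether one signal predicts the other, the score stays high even when the image is an inverted copy of the model, a case where plain correlation would give a strongly negative value.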

Relevance:

90.00%

Publisher:

Abstract:

This paper sketches a hypothetical cortical architecture for visual 3D object recognition based on a recent computational model. The view-centered scheme relies on modules for learning from examples, such as HyperBF-like networks. Such models capture a class of explanations we call Memory-Based Models (MBM) that contains sparse population coding, memory-based recognition, and codebooks of prototypes. Unlike the sigmoidal units of some artificial neural networks, the units of MBMs are consistent with the description of cortical neurons. We describe how an example of an MBM may be realized in terms of cortical circuitry and biophysical mechanisms, consistent with psychophysical and physiological data.

Relevance:

90.00%

Publisher:

Abstract:

This paper describes a real-time multi-camera surveillance system that can be applied to a range of application domains. This integrated system is designed to observe crowded scenes and has mechanisms to improve tracking of objects that are in close proximity. The four component modules described in this paper are (i) motion detection using a layered background model, (ii) object tracking based on local appearance, (iii) hierarchical object recognition, and (iv) fused multisensor object tracking using multiple features and geometric constraints. This integrated approach to complex scene tracking is validated against a number of representative real-world scenarios to show that robust, real-time analysis can be performed. Copyright (C) 2007 Hindawi Publishing Corporation. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

90.00%

Publisher:

Abstract:

This thesis belongs to the line of research on 3D data processing, and in particular 3D Object Recognition. It first gives an overview of the main structured representations of 3D data, which are a necessary prerequisite for implementing 3D data processing algorithms efficiently, and then presents a new 3D Keypoint Detection algorithm that was developed and proposed by the Computer Vision Laboratory of the University of Bologna, where I carried out my thesis work.

Relevance:

90.00%

Publisher:

Abstract:

For broadcasting purposes, Mixed Reality, the combination of real and virtual scene content, has become ubiquitous. Mixed Reality recording still requires expensive studio setups and is often limited to simple color keying. We present a system for Mixed Reality applications which uses depth keying and provides three-dimensional mixing of real and artificial content. It features enhanced realism through automatic shadow computation, which we consider a core issue for obtaining realism and a convincing visual perception, alongside the correct alignment of the two modalities and correct occlusion handling. Furthermore, we present a way to support the placement of virtual content in the scene. The core feature of our system is the incorporation of a Time-of-Flight (ToF) camera. This device delivers real-time depth images of the environment at a reasonable resolution and quality. The camera is used to build a static environment model, and it also allows correct handling of mutual occlusions between real and virtual content, shadow computation and enhanced content planning. The presented system is inexpensive, compact, mobile and flexible, and provides convenient calibration procedures. Chroma keying is replaced by depth keying, which is performed efficiently on the Graphics Processing Unit (GPU) using an environment model and the current ToF camera image. Dynamic scene content is thereby extracted and tracked automatically, and this information is used for planning and alignment of virtual content. An additional sustainable feature is that depth maps of the mixed content are available in real time, which makes the approach suitable for future 3DTV productions. The paper gives an overview of the whole system, including camera calibration, environment model generation, real-time keying and mixing of virtual and real content, shadowing for virtual content and dynamic object tracking for content planning.
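The idea of keying on depth rather than color can be sketched in a few lines. The example below is a CPU illustration in plain Python, not the GPU implementation the abstract describes; the function names, the `margin` tolerance and the convention that a depth of 0.0 marks an invalid measurement are all assumptions made for the sketch.

```python
def depth_key(depth, background, margin=0.05, invalid=0.0):
    """Return a foreground mask from a live depth image and a static model.

    A pixel is foreground when its live depth is closer to the camera than
    the stored environment model by more than `margin` (hypothetical
    tolerance, in the same units as the depth values).
    """
    mask = []
    for d_row, b_row in zip(depth, background):
        row = []
        for d, b in zip(d_row, b_row):
            # Ignore pixels with no valid measurement in either image.
            valid = d != invalid and b != invalid
            row.append(valid and d < b - margin)
        mask.append(row)
    return mask


def mix(real_rgb, real_depth, virtual_rgb, virtual_depth):
    """Per-pixel occlusion: keep whichever of real or virtual content is nearer.

    Pixels without virtual content can carry a virtual depth of float("inf").
    """
    out = []
    for rr, rd, vr, vd in zip(real_rgb, real_depth, virtual_rgb, virtual_depth):
        out.append([r if d <= v else c
                    for r, d, c, v in zip(rr, rd, vr, vd)])
    return out
```

Comparing depths per pixel is what lets real objects pass both in front of and behind virtual ones, which color keying alone cannot express.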

Relevance:

90.00%

Publisher:

Abstract:

Shading reduces the power output of a photovoltaic (PV) system. The design engineering of PV systems requires modeling and evaluating shading losses. Some PV systems are affected by complex shading scenes whose resulting PV energy losses are very difficult to evaluate with current modeling tools. Several specialized PV design and simulation software packages can evaluate shading losses. They generally have a Graphical User Interface (GUI) through which the user can draw a 3D shading scene and then evaluate the corresponding PV energy losses. The complexity of the objects that these tools can handle is relatively limited. We have created a software solution, 3DPV, which evaluates the energy losses induced by complex 3D scenes on PV generators. The 3D objects can be imported from specialized 3D modeling software or from a 3D object library. The shadows cast by this 3D scene on the PV generator are then evaluated directly on the Graphics Processing Unit (GPU). Thanks to the recent development of GPUs for the video game industry, the shadows can be evaluated with a very high spatial resolution that reaches well beyond the PV cell level, in very short calculation times. A PV simulation model then translates the geometrical shading into PV energy output losses. 3DPV has been implemented using WebGL, which allows it to run directly in a Web browser without requiring any local installation by the user. This also lets it take full advantage of information already available on the Internet, such as 3D object libraries. This contribution describes, step by step, the method that allows 3DPV to evaluate the PV energy losses caused by complex shading. We then apply this methodology to several application cases encountered in PV system design. Keywords: 3D, modeling, simulation, GPU, shading, losses, shadow mapping, solar, photovoltaic, PV, WebGL

Relevance:

90.00%

Publisher:

Abstract:

Feature vectors can be anything from simple surface normals to more complex feature descriptors. Feature extraction is important for solving various computer vision problems, e.g. registration, object recognition and scene understanding. Most of these techniques cannot be computed online due to their complexity and the context in which they are applied. Therefore, computing these features in real time for many points in the scene is impossible. In this work, a hardware-based implementation of 3D feature extraction and 3D object recognition is proposed to accelerate these methods and therefore the entire pipeline of RGB-D based computer vision systems where such features are typically used. Using a GPU as a general-purpose processor can achieve considerable speed-ups compared with a CPU implementation. In this work, advantageous results are obtained using the GPU to accelerate the computation of a 3D descriptor based on the calculation of 3D semi-local surface patches of partial views. This allows descriptor computation at several points of a scene in real time. Benefits of the accelerated descriptor have been demonstrated in object recognition tasks. Source code will be made publicly available as a contribution to the Open Source Point Cloud Library.

Relevance:

90.00%

Publisher:

Abstract:

During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the camera is mounted on the robot's end effector and the robot moves to adapt its path and grasp objects correctly. For this reason, in this type of environment, a visual recognition system must be implemented to recognize objects and obtain their positions in the scene automatically and autonomously. Furthermore, in industrial environments, all objects manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, 3D recognition descriptors have been studied and analyzed for application in these environments. Second, a visual recognition system built on a specific distributed client-server architecture has been proposed for recognizing industrial objects that lack these appearance features. Our system has been implemented to overcome recognition problems when objects can only be recognized by geometric shape and the simplicity of the shapes could create ambiguity. Finally, some real tests are performed and illustrated to verify the satisfactory performance of the proposed system.

Relevance:

90.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

90.00%

Publisher:

Abstract:

A real-time three-dimensional (3D) object sensing and reconstruction scheme is presented that can be applied on any arbitrary corporeal shape. Operation is demonstrated on several calibrated objects. The system uses curvature sensors based upon in-line fiber Bragg gratings encapsulated in a low-temperature curing synthetic silicone. New methods to quantitatively evaluate the performance of a 3D object-sensing scheme are developed and appraised. It is shown that the sensing scheme yields a volumetric error of 1% to 9%, depending on the object.

Relevance:

90.00%

Publisher:

Abstract:

We present a video-based system which interactively captures the geometry of a 3D object in the form of a point cloud, then recognizes and registers known objects in this point cloud in a matter of seconds (fig. 1). In order to achieve interactive speed, we exploit both efficient inference algorithms and parallel computation, often on a GPU. The system can be broken down into two distinct phases: geometry capture, and object inference. We now discuss these in further detail. © 2011 IEEE.

Relevance:

90.00%

Publisher:

Abstract:

This paper addresses the problem of automatically obtaining the object/background segmentation of a rigid 3D object observed in a set of images that have been calibrated for camera pose and intrinsics. Such segmentations can be used to obtain a shape representation of a potentially texture-less object by computing a visual hull. We propose an automatic approach where the object to be segmented is identified by the pose of the cameras instead of user input such as 2D bounding rectangles or brush-strokes. The key behind our method is a pairwise MRF framework that combines (a) foreground/background appearance models, (b) epipolar constraints and (c) weak stereo correspondence into a single segmentation cost function that can be efficiently solved by Graph-cuts. The segmentation thus obtained is further improved using silhouette coherency and then used to update the foreground/background appearance models, which are fed into the next Graph-cut computation. These two steps are iterated until the segmentation converges. Our method can automatically provide a 3D surface representation even in texture-less scenes where MVS methods might fail. Furthermore, it confers improved performance in images where the object is not readily separable from the background in colour space, an area that previous segmentation approaches have found challenging. © 2011 IEEE.

Relevance:

90.00%

Publisher:

Abstract:

To perform at the highest level, athletes must have above-average perceptual-cognitive ability. This faculty, reflected on the field in athletes' vision and game intelligence, makes it possible to extract the key information from the visual scene. Sport science has long observed perceptual-cognitive expertise within athletes' own sporting environment. Recently, studies have reported that this expertise can also show outside that context, for example during everyday activities. In addition, recent theories about the brain's plasticity have led researchers to develop tools for training athletes' perceptual-cognitive abilities in order to make them perform better on the field. These methods are most often specific to the context of the targeted discipline. However, a new perceptual-cognitive training tool, called 3-Dimensional Multiple Object Tracking (3D-MOT) and free of any sporting context, has recently emerged and has been the subject of our research. One of our objectives was to demonstrate both sport-specific and non-specific perceptual-cognitive expertise in athletes within a single study. We assessed the perception of biological motion in soccer players and non-athletes in a virtual reality room. The athletes consistently outperformed the novices in efficiency and reaction time when discriminating the direction of biological motion, both in a soccer-specific exercise (a kick) and in an everyday action (walking). These results mean that athletes are better able to perceive human biological movements performed by others. Playing soccer therefore seems to confer a fundamental advantage that goes beyond the functions specific to practising a sport.
These findings should be set alongside athletes' exceptional performance in processing dynamic visual scenes that are likewise free of sporting context. Soccer players outperformed novices on the 3D-MOT test, which consists of tracking moving targets and engages perceptual-cognitive abilities. Their visual tracking speed and their learning ability were both higher. These results confirmed data previously obtained in athletes. 3D-MOT is an attentional tracking test that engages the active processing of dynamic visual information, in particular selective, dynamic and sustained attention as well as working memory. The tool can be used to train athletes' perceptual-cognitive functions. Soccer players trained on 3D-MOT over 30 sessions showed a 15% improvement in passing decision-making on the field compared with players in control groups. These data demonstrate for the first time a perceptual-cognitive transfer from the laboratory to the field following perceptual-cognitive training that is not contextual to the targeted athlete's sport. Our research helps in understanding athletes' expertise through the specific and non-specific approaches, and also presents perceptual-cognitive training tools, in particular 3D-MOT, for improving performance in elite sport.

Relevance:

90.00%

Publisher:

Abstract:

This thesis analyses several search methods for 3D data. It gives a general overview of the field of Computer Vision, the state of the art in acquisition sensors, and some of the formats used to describe 3D data. It then examines 3D Object Recognition in more depth: besides describing the full matching process between Local Features, it focuses on the detection phase for salient points. In particular, it analyses a Learned Keypoint detector based on machine learning techniques. The detector is illustrated with implementations of two neighbour search algorithms: one exhaustive (K-d tree) and one approximate (Radial Search). Finally, experimental evaluations of the efficiency and speed of the detector implemented with the different search methods are reported, showing a real performance improvement with approximate search and no considerable loss of accuracy.
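To make the neighbour-search step concrete, here is a minimal pure-Python k-d tree with a nearest-neighbour query and a fixed-radius query. It is a generic textbook sketch, not the thesis's implementation: the function names, the tuple-based tree layout and the choice to run the radius query on the same tree are assumptions, and the abstract does not say how its "Radial Search" is realised.

```python
def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the coordinates
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2                    # median point becomes the node
    return (pts[mid],
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, depth=0, best=None):
    """Nearest neighbour of `query`: returns (squared_distance, point)."""
    if node is None:
        return best
    point, left, right = node
    d = sq_dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best


def within_radius(node, query, radius, depth=0, found=None):
    """Collect every stored point within `radius` of `query`."""
    if found is None:
        found = []
    if node is None:
        return found
    point, left, right = node
    if sq_dist(point, query) <= radius * radius:
        found.append(point)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    within_radius(near, query, radius, depth + 1, found)
    # The far side can only hold matches if the plane lies within the radius.
    if abs(diff) <= radius:
        within_radius(far, query, radius, depth + 1, found)
    return found
```

A typical use is to build the tree once over the point cloud with `build_kdtree(points)` and then call `nearest` or `within_radius` for each candidate keypoint, so the sorting cost is paid once and shared across all queries.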