888 resultados para Computer vision system


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A vision system for recognizing rigid and articulated three-dimensional objects in two-dimensional images is described. Geometrical models are extracted from a commercial computer aided design package. The models are then augmented with appearance and functional information which improves the system's hypothesis generation, hypothesis verification, and pose refinement. Significant advantages over existing CAD-based vision systems, which utilize only information available in the CAD system, are realized. Examples show the system recognizing, locating, and tracking a variety of objects in a robot work-cell and in natural scenes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis deals with the challenging problem of designing systems able to perceive objects in underwater environments. In the last few decades research activities in robotics have advanced the state of art regarding intervention capabilities of autonomous systems. State of art in fields such as localization and navigation, real time perception and cognition, safe action and manipulation capabilities, applied to ground environments (both indoor and outdoor) has now reached such a readiness level that it allows high level autonomous operations. On the opposite side, the underwater environment remains a very difficult one for autonomous robots. Water influences the mechanical and electrical design of systems, interferes with sensors by limiting their capabilities, heavily impacts on data transmissions, and generally requires systems with low power consumption in order to enable reasonable mission duration. Interest in underwater applications is driven by needs of exploring and intervening in environments in which human capabilities are very limited. Nowadays, most underwater field operations are carried out by manned or remotely operated vehicles, deployed for explorations and limited intervention missions. Manned vehicles, directly on-board controlled, expose human operators to risks related to the stay in field of the mission, within a hostile environment. Remotely Operated Vehicles (ROV) currently represent the most advanced technology for underwater intervention services available on the market. These vehicles can be remotely operated for long time but they need support from an oceanographic vessel with multiple teams of highly specialized pilots. Vehicles equipped with multiple state-of-art sensors and capable to autonomously plan missions have been deployed in the last ten years and exploited as observers for underwater fauna, seabed, ship wrecks, and so on. On the other hand, underwater operations like object recovery and equipment maintenance are still challenging tasks to be conducted without human supervision since they require object perception and localization with much higher accuracy and robustness, to a degree seldom available in Autonomous Underwater Vehicles (AUV). This thesis reports the study, from design to deployment and evaluation, of a general purpose and configurable platform dedicated to stereo-vision perception in underwater environments. Several aspects related to the peculiar environment characteristics have been taken into account during all stages of system design and evaluation: depth of operation and light conditions, together with water turbidity and external weather, heavily impact on perception capabilities. The vision platform proposed in this work is a modular system comprising off-the-shelf components for both the imaging sensors and the computational unit, linked by a high performance ethernet network bus. The adopted design philosophy aims at achieving high flexibility in terms of feasible perception applications, that should not be as limited as in case of a special-purpose and dedicated hardware. Flexibility is required by the variability of underwater environments, with water conditions ranging from clear to turbid, light backscattering varying with daylight and depth, strong color distortion, and other environmental factors. Furthermore, the proposed modular design ensures an easier maintenance and update of the system over time. Performance of the proposed system, in terms of perception capabilities, has been evaluated in several underwater contexts taking advantage of the opportunity offered by the MARIS national project. Design issues like energy power consumption, heat dissipation and network capabilities have been evaluated in different scenarios. Finally, real-world experiments, conducted in multiple and variable underwater contexts, including open sea waters, have led to the collection of several datasets that have been publicly released to the scientific community. The vision system has been integrated in a state of the art AUV equipped with a robotic arm and gripper, and has been exploited in the robot control loop to successfully perform underwater grasping operations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a search for new sensor systems and new methods for underwater vehicle positioning based on visual observation, this paper presents a computer vision system based on coded light projection. 3D information is taken from an underwater scene. This information is used to test obstacle avoidance behaviour. In addition, the main ideas for achieving stabilisation of the vehicle in front of an object are presented

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a search for new sensor systems and new methods for underwater vehicle positioning based on visual observation, this paper presents a computer vision system based on coded light projection. 3D information is taken from an underwater scene. This information is used to test obstacle avoidance behaviour. In addition, the main ideas for achieving stabilisation of the vehicle in front of an object are presented

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]In this paper, a basic conceptual architecture aimed at the design of Computer Vision System is qualitatively described. The proposed architecture addresses the design of vision systems in a modular fashion using modules with three distinct units or components: a processing network or diagnostics unit, a control unit and a communications unit. The control of the system at the modules level is designed based on a Discrete Events Model. This basic methodology has been used to design a realtime active vision system for detection, tracking and recognition of people. It is made up of three functional modules aimed at the detection, tracking, recognition of moving individuals plus a supervision module.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sendo uma forma natural de interação homem-máquina, o reconhecimento de gestos implica uma forte componente de investigação em áreas como a visão por computador e a aprendizagem computacional. O reconhecimento gestual é uma área com aplicações muito diversas, fornecendo aos utilizadores uma forma mais natural e mais simples de comunicar com sistemas baseados em computador, sem a necessidade de utilização de dispositivos extras. Assim, o objectivo principal da investigação na área de reconhecimento de gestos aplicada à interacção homemmáquina é o da criação de sistemas, que possam identificar gestos específicos e usálos para transmitir informações ou para controlar dispositivos. Para isso as interfaces baseados em visão para o reconhecimento de gestos, necessitam de detectar a mão de forma rápida e robusta e de serem capazes de efetuar o reconhecimento de gestos em tempo real. Hoje em dia, os sistemas de reconhecimento de gestos baseados em visão são capazes de trabalhar com soluções específicas, construídos para resolver um determinado problema e configurados para trabalhar de uma forma particular. Este projeto de investigação estudou e implementou soluções, suficientemente genéricas, com o recurso a algoritmos de aprendizagem computacional, permitindo a sua aplicação num conjunto alargado de sistemas de interface homem-máquina, para reconhecimento de gestos em tempo real. A solução proposta, Gesture Learning Module Architecture (GeLMA), permite de forma simples definir um conjunto de comandos que pode ser baseado em gestos estáticos e dinâmicos e que pode ser facilmente integrado e configurado para ser utilizado numa série de aplicações. É um sistema de baixo custo e fácil de treinar e usar, e uma vez que é construído unicamente com bibliotecas de código. As experiências realizadas permitiram mostrar que o sistema atingiu uma precisão de 99,2% em termos de reconhecimento de gestos estáticos e uma precisão média de 93,7% em termos de reconhecimento de gestos dinâmicos. Para validar a solução proposta, foram implementados dois sistemas completos. O primeiro é um sistema em tempo real capaz de ajudar um árbitro a arbitrar um jogo de futebol robótico. A solução proposta combina um sistema de reconhecimento de gestos baseada em visão com a definição de uma linguagem formal, o CommLang Referee, à qual demos a designação de Referee Command Language Interface System (ReCLIS). O sistema identifica os comandos baseados num conjunto de gestos estáticos e dinâmicos executados pelo árbitro, sendo este posteriormente enviado para um interface de computador que transmite a respectiva informação para os robôs. O segundo é um sistema em tempo real capaz de interpretar um subconjunto da Linguagem Gestual Portuguesa. As experiências demonstraram que o sistema foi capaz de reconhecer as vogais em tempo real de forma fiável. Embora a solução implementada apenas tenha sido treinada para reconhecer as cinco vogais, o sistema é facilmente extensível para reconhecer o resto do alfabeto. As experiências também permitiram mostrar que a base dos sistemas de interação baseados em visão pode ser a mesma para todas as aplicações e, deste modo facilitar a sua implementação. A solução proposta tem ainda a vantagem de ser suficientemente genérica e uma base sólida para o desenvolvimento de sistemas baseados em reconhecimento gestual que podem ser facilmente integrados com qualquer aplicação de interface homem-máquina. A linguagem formal de definição da interface pode ser redefinida e o sistema pode ser facilmente configurado e treinado com um conjunto de gestos diferentes de forma a serem integrados na solução final.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Lecture notes in computational vision and biomechanics series, ISSN 2212-9391, vol. 19"

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tese de Doutoramento em Engenharia de Eletrónica e de Computadores

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays despite improvements in usability and intuitiveness users have to adapt to the proposed systems to satisfy their needs. For instance, they must learn how to achieve tasks, how to interact with the system, and fulfill system's specifications. This paper proposes an approach to improve this situation enabling graphical user interface redefinition through virtualization and computer vision with the aim of increasing the system's usability. To achieve this goal the approach is based on enriched task models, virtualization and picture-driven computing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

in RoboCup 2007: Robot Soccer World Cup XI

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a pattern recognition method focused on paintings images. The purpose is construct a system able to recognize authors or art styles based on common elements of his work (here called patterns). The method is based on comparing images that contain the same or similar patterns. It uses different computer vision techniques, like SIFT and SURF, to describe the patterns in descriptors, K-Means to classify and simplify these descriptors, and RANSAC to determine and detect good results. The method are good to find patterns of known images but not so good if they are not.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

L’objectif de cette thèse par articles est de présenter modestement quelques étapes du parcours qui mènera (on espère) à une solution générale du problème de l’intelligence artificielle. Cette thèse contient quatre articles qui présentent chacun une différente nouvelle méthode d’inférence perceptive en utilisant l’apprentissage machine et, plus particulièrement, les réseaux neuronaux profonds. Chacun de ces documents met en évidence l’utilité de sa méthode proposée dans le cadre d’une tâche de vision par ordinateur. Ces méthodes sont applicables dans un contexte plus général, et dans certains cas elles on tété appliquées ailleurs, mais ceci ne sera pas abordé dans le contexte de cette de thèse. Dans le premier article, nous présentons deux nouveaux algorithmes d’inférence variationelle pour le modèle génératif d’images appelé codage parcimonieux “spike- and-slab” (CPSS). Ces méthodes d’inférence plus rapides nous permettent d’utiliser des modèles CPSS de tailles beaucoup plus grandes qu’auparavant. Nous démontrons qu’elles sont meilleures pour extraire des détecteur de caractéristiques quand très peu d’exemples étiquetés sont disponibles pour l’entraînement. Partant d’un modèle CPSS, nous construisons ensuite une architecture profonde, la machine de Boltzmann profonde partiellement dirigée (MBP-PD). Ce modèle a été conçu de manière à simplifier d’entraînement des machines de Boltzmann profondes qui nécessitent normalement une phase de pré-entraînement glouton pour chaque couche. Ce problème est réglé dans une certaine mesure, mais le coût d’inférence dans le nouveau modèle est relativement trop élevé pour permettre de l’utiliser de manière pratique. Dans le deuxième article, nous revenons au problème d’entraînement joint de machines de Boltzmann profondes. Cette fois, au lieu de changer de famille de modèles, nous introduisons un nouveau critère d’entraînement qui donne naissance aux machines de Boltzmann profondes à multiples prédictions (MBP-MP). Les MBP-MP sont entraînables en une seule étape et ont un meilleur taux de succès en classification que les MBP classiques. Elles s’entraînent aussi avec des méthodes variationelles standard au lieu de nécessiter un classificateur discriminant pour obtenir un bon taux de succès en classification. Par contre, un des inconvénients de tels modèles est leur incapacité de générer deséchantillons, mais ceci n’est pas trop grave puisque la performance de classification des machines de Boltzmann profondes n’est plus une priorité étant donné les dernières avancées en apprentissage supervisé. Malgré cela, les MBP-MP demeurent intéressantes parce qu’elles sont capable d’accomplir certaines tâches que des modèles purement supervisés ne peuvent pas faire, telles que celle de classifier des données incomplètes ou encore celle de combler intelligemment l’information manquante dans ces données incomplètes. Le travail présenté dans cette thèse s’est déroulé au milieu d’une période de transformations importantes du domaine de l’apprentissage à réseaux neuronaux profonds qui a été déclenchée par la découverte de l’algorithme de “dropout” par Geoffrey Hinton. Dropout rend possible un entraînement purement supervisé d’architectures de propagation unidirectionnel sans être exposé au danger de sur- entraînement. Le troisième article présenté dans cette thèse introduit une nouvelle fonction d’activation spécialement con ̧cue pour aller avec l’algorithme de Dropout. Cette fonction d’activation, appelée maxout, permet l’utilisation de aggrégation multi-canal dans un contexte d’apprentissage purement supervisé. Nous démontrons comment plusieurs tâches de reconnaissance d’objets sont mieux accomplies par l’utilisation de maxout. Pour terminer, sont présentons un vrai cas d’utilisation dans l’industrie pour la transcription d’adresses de maisons à plusieurs chiffres. En combinant maxout avec une nouvelle sorte de couche de sortie pour des réseaux neuronaux de convolution, nous démontrons qu’il est possible d’atteindre un taux de succès comparable à celui des humains sur un ensemble de données coriace constitué de photos prises par les voitures de Google. Ce système a été déployé avec succès chez Google pour lire environ cent million d’adresses de maisons.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper describes a novel integrated vision system in which two autonomous visual modules are combined to interpret a dynamic scene. The first module employs a 3D model-based scheme to track rigid objects such as vehicles. The second module uses a 2D deformable model to track non-rigid objects such as people. The principal contribution is a novel method for handling occlusion between objects within the context of this hybrid tracking system. The practical aim of the work is to derive a scene description that is sufficiently rich to be used in a range of surveillance tasks. The paper describes each of the modules in outline before detailing the method of integration and the handling of occlusion in particular. Experimental results are presented to illustrate the performance of the system in a dynamic outdoor scene involving cars and people.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Computer vision applications generally split their problem into multiple simpler tasks. Likewise research often combines algorithms into systems for evaluation purposes. Frameworks for modular vision provide interfaces and mechanisms for algorithm combination and network transparency. However, these don’t provide interfaces efficiently utilising the slow memory in modern PCs. We investigate quantitatively how system performance varies with different patterns of memory usage by the framework for an example vision system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The rapid development of data transfer through internet made it easier to send the data accurate and faster to the destination. There are many transmission media to transfer the data to destination like e-mails; at the same time it is may be easier to modify and misuse the valuable information through hacking. So, in order to transfer the data securely to the destination without any modifications, there are many approaches like cryptography and steganography. This paper deals with the image steganography as well as with the different security issues, general overview of cryptography, steganography and digital watermarking approaches.  The problem of copyright violation of multimedia data has increased due to the enormous growth of computer networks that provides fast and error free transmission of any unauthorized duplicate and possibly manipulated copy of multimedia information. In order to be effective for copyright protection, digital watermark must be robust which are difficult to remove from the object in which they are embedded despite a variety of possible attacks. The message to be send safe and secure, we use watermarking. We use invisible watermarking to embed the message using LSB (Least Significant Bit) steganographic technique. The standard LSB technique embed the message in every pixel, but my contribution for this proposed watermarking, works with the hint for embedding the message only on the image edges alone. If the hacker knows that the system uses LSB technique also, it cannot decrypt correct message. To make my system robust and secure, we added cryptography algorithm as Vigenere square. Whereas the message is transmitted in cipher text and its added advantage to the proposed system. The standard Vigenere square algorithm works with either lower case or upper case. The proposed cryptography algorithm is Vigenere square with extension of numbers also. We can keep the crypto key with combination of characters and numbers. So by using these modifications and updating in this existing algorithm and combination of cryptography and steganography method we develop a secure and strong watermarking method. Performance of this watermarking scheme has been analyzed by evaluating the robustness of the algorithm with PSNR (Peak Signal to Noise Ratio) and MSE (Mean Square Error) against the quality of the image for large amount of data. While coming to see results of the proposed encryption, higher value of 89dB of PSNR with small value of MSE is 0.0017. Then it seems the proposed watermarking system is secure and robust for hiding secure information in any digital system, because this system collect the properties of both steganography and cryptography sciences.