809 results for Monocular video
Abstract:
Dissertation presented to the Master's Program in Communication of the Universidade Municipal de São Caetano do Sul - USCS
Abstract:
The results of the analyses performed on these data indicated significant differences in the increase of the amplitude of the nasal horizontal meridian plane of the monocular visual field, measured in angular units. The differences were interpreted as indicative of the influence of the three different levels of complexity of the visual stimuli. It was therefore concluded that the collative variable of complexity influences the perceptual act of visual recognition.
Abstract:
Videos are among the main existing means of disseminating knowledge, information and entertainment. However, despite their good quality and good public acceptance, current videos still restrict the viewer to a single point of view. Some studies are currently being developed aiming to offer the viewer greater freedom to decide from where they would like to watch the scene. The type of video to be produced by these initiatives has been generically called 3D video. This work proposes an architecture for capturing and displaying 3D video in real time using the color and depth information of the scene, captured for each pixel of each video frame. The depth information can be obtained using 3D cameras, disparity extraction algorithms based on stereo, or with the aid of structured light. From the depth information it is possible to compute new viewpoints of the scene using a 3D warping algorithm. Due to the unavailability of 3D cameras during this work, the proposed architecture was validated using a synthetic environment built with computer graphics techniques. This prototype was also used to analyze several computer vision algorithms that use stereoscopic images to extract scene depth in real time. The use of a controlled environment allowed a very thorough analysis of the quality of the depth maps produced by these algorithms, leading us to conclude that they are not yet appropriate for applications that require real-time 3D video capture.
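As a minimal sketch of the 3D warping step the abstract mentions (not the dissertation's actual implementation), the following Python/NumPy function back-projects each pixel to 3D using a per-pixel depth map and re-projects it into a new viewpoint; the function name, the assumption of positive depth, and the known intrinsics K and relative pose (R, t) are all illustrative:

```python
import numpy as np

def warp_to_new_view(color, depth, K, R, t):
    """Forward-warp a color+depth frame into a new camera pose.

    color: (H, W, 3) image; depth: (H, W) positive depths in meters;
    K: (3, 3) intrinsics; R, t: pose of the new view relative to the
    reference camera. All inputs are hypothetical placeholders.
    """
    H, W = depth.shape
    # Pixel grid in homogeneous coordinates, one column per pixel.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T

    # Back-project to 3-D points in the reference camera frame.
    rays = np.linalg.inv(K) @ pix
    pts = rays * depth.reshape(1, -1)      # scale each ray by its depth

    # Rigid transform into the new view, then perspective projection.
    proj = K @ (R @ pts + t.reshape(3, 1))
    proj = proj[:2] / proj[2]

    # Splat colors into the target image (nearest pixel, no z-buffering
    # or hole filling, so occlusions and disocclusions are not handled).
    out = np.zeros_like(color)
    x = np.round(proj[0]).astype(int)
    y = np.round(proj[1]).astype(int)
    ok = (x >= 0) & (x < W) & (y >= 0) & (y < H)
    out[y[ok], x[ok]] = color.reshape(-1, 3)[ok]
    return out
```

A production renderer would additionally need z-buffering and hole filling, since forward splatting leaves gaps wherever the new viewpoint exposes previously occluded surfaces.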
Abstract:
Image stitching is the process of joining several images to obtain a bigger view of a scene. It is used, for example, in tourism to transmit to the viewer the sensation of being in another place. I present an inexpensive solution for automatic real-time video and image stitching with two web cameras as the video/image sources. The proposed solution relies on the use of several markers in the scene as reference points for the stitching algorithm. The implemented algorithm is divided into four main steps: marker detection, camera pose determination (with respect to the markers), video/image scaling and 3D transformation, and image translation. Wii remote controllers are used to support several steps of the process. The built-in IR camera provides clean marker detection, which facilitates camera pose determination. The only restriction of the algorithm is that the markers have to be in the field of view when capturing the scene. Several tests were made to evaluate the final algorithm. The algorithm is able to perform video stitching at a frame rate between 8 and 13 fps. The joining of the two videos/images is good, with minor misalignments in objects at the same depth as the markers; misalignments in the background and foreground are bigger. The capture process is simple enough that anyone can perform a stitching after a very short explanation. Although real-time video stitching can be achieved by this affordable approach, there are a few shortcomings in the current version. For example, contrast inconsistency along the stitching line could be reduced by applying a color correction algorithm to every source video. In addition, the misalignments in stitched images due to camera lens distortion could be eased by an optical correction algorithm. The work was developed in Apple's Quartz Composer, a visual programming environment. A library of extended functions was developed using Xcode tools, also from Apple.
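The plane-induced homography at the heart of such marker-based stitching can be sketched in a few lines of Python with OpenCV. The marker image coordinates are assumed to be already detected (by the Wii remote IR camera in the thesis itself); the function name and canvas sizing are illustrative:

```python
import cv2
import numpy as np

def stitch_pair(img_left, img_right, pts_left, pts_right):
    """Warp img_right onto img_left's image plane using marker
    correspondences. pts_left / pts_right are Nx2 arrays of the same
    markers seen by both cameras (N >= 4), assumed given here."""
    # Homography mapping right-image marker positions onto the left image.
    H, _ = cv2.findHomography(np.float32(pts_right), np.float32(pts_left),
                              cv2.RANSAC)

    h, w = img_left.shape[:2]
    # Canvas wide enough to hold both views side by side.
    canvas = cv2.warpPerspective(img_right, H, (w * 2, h))
    canvas[0:h, 0:w] = img_left   # paste the reference view on top
    return canvas
```

Because a homography is exact only for points on the markers' plane, objects at other depths ghost slightly, which is consistent with the misalignments in the background and foreground reported above.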
Abstract:
AIRES, Kelson R. T.; ARAÚJO, Hélder J.; MEDEIROS, Adelardo A. D. Plane Detection from Monocular Image Sequences. In: VISUALIZATION, IMAGING AND IMAGE PROCESSING, 2008, Palma de Mallorca, Spain. Proceedings... Palma de Mallorca: VIIP, 2008.
Abstract:
The goal of this work is to propose a SLAM (Simultaneous Localization and Mapping) solution based on the Extended Kalman Filter (EKF) to enable a robot to navigate an environment using information from odometry and pre-existing lines on the floor. Initially, a segmentation step is necessary to classify parts of the image as floor or non-floor. Then image processing identifies floor lines, and the parameters of these lines are mapped to the world using a homography matrix. Finally, the identified lines are used in SLAM as landmarks in order to build a feature map. In parallel, using the corrected robot pose, the uncertainty about the pose, and the non-floor part of the image, it is possible to build an occupancy grid map and generate a metric map with the description of the obstacles. Greater autonomy for the robot is attained by using the two types of map obtained (the metric map and the feature map). Thus, it is possible to run path planning tasks in parallel with localization and mapping. Practical results are presented to validate the proposal.
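A hedged Python sketch of the homography step: the image-to-floor homography, the choice of two points per line, and the (rho, theta) line parameterization used as an EKF landmark are assumptions for illustration, not details taken from the thesis:

```python
import numpy as np

def image_line_to_world(p1_img, p2_img, H_floor):
    """Map two image points of a detected floor line to world (floor-plane)
    coordinates via a pre-calibrated homography H_floor, and return the
    line as (rho, theta), a common landmark parameterization."""
    def apply_h(p):
        q = H_floor @ np.array([p[0], p[1], 1.0])
        return q[:2] / q[2]          # back to inhomogeneous coordinates

    w1, w2 = apply_h(p1_img), apply_h(p2_img)
    d = w2 - w1
    theta = np.arctan2(d[1], d[0]) + np.pi / 2   # line normal direction
    rho = w1[0] * np.cos(theta) + w1[1] * np.sin(theta)
    return rho, theta
```

Since any point on the line has the same projection onto the normal, rho is independent of which two points are chosen, which is what makes (rho, theta) a stable landmark representation.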
Abstract:
The development and refinement of techniques for simultaneous localization and mapping (SLAM) for an autonomous mobile robot, and for building local 3-D maps from a sequence of images, are widely studied in scientific circles. This work presents a monocular visual SLAM technique based on the extended Kalman filter that uses features found in a sequence of images with the SURF (Speeded Up Robust Features) descriptor and determines which features can be used as landmarks through a technique based on delayed initialization from 3-D straight lines. For this, only the coordinates of the features found in the image and the intrinsic and extrinsic camera parameters are available. It is possible to determine the position of the landmarks only when depth information is available. Tests showed that, during the route, the mobile robot detects the presence of features in the images and, through the proposed technique for delayed initialization of landmarks, adds new landmarks to the state vector of the extended Kalman filter (EKF) after estimating the depth of the features. With the estimated positions of the landmarks, it was possible to estimate the updated position of the robot at each step, obtaining good results that demonstrate the effectiveness of the monocular visual SLAM system proposed in this work.
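A minimal Python/OpenCV sketch of the SURF detection-and-matching front end such a system needs; the file names and thresholds are placeholders, and SURF, being patented, lives in the non-free contrib module (ORB is a common free substitute):

```python
import cv2

# SURF requires opencv-contrib-python built with OPENCV_ENABLE_NONFREE.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

# Hypothetical consecutive frames from the robot's camera.
img_prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img_curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

kp1, des1 = surf.detectAndCompute(img_prev, None)
kp2, des2 = surf.detectAndCompute(img_curr, None)

# Brute-force matching with Lowe's ratio test to keep reliable tracks.
# Features tracked across several frames would become candidates for
# delayed initialization once their depth can be estimated.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.7 * n.distance]
print(f"{len(good)} tentative feature tracks")
```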
Abstract:
In Simultaneous Localization and Mapping (SLAM), a robot placed at an unknown location in an arbitrary environment must be able to build a representation of this environment (a map) and localize itself within it simultaneously, using only information captured by the robot's sensors and known control signals. Recently, driven by advances in computing power, work in this area has proposed using a video camera as the sensor, and thus Visual SLAM emerged. It has several approaches, and the vast majority of them work basically by extracting features from the environment, computing the necessary correspondences, and using these to estimate the required parameters. This work presents a monocular visual SLAM system that uses direct image registration to compute the image reprojection error, and optimization methods that minimize this error and thus obtain the parameters for the robot pose and the map of the environment directly from the pixels of the images. In this way, the feature extraction and matching steps are not needed, enabling our system to work well in environments where traditional approaches have difficulty. Moreover, by addressing the SLAM problem as proposed in this work, we avoid a problem very common in traditional approaches, known as error propagation. Owing to the high computational cost of this approach, several types of optimization methods were tested in order to find a good balance between good estimates and processing time. The results presented in this work show the success of this system in different environments.
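To make the direct-registration idea concrete, the sketch below reduces it to a toy problem: estimating a pure 2-D translation between two grayscale frames by Gauss-Newton minimization of the photometric error, with no feature extraction or matching. The full system estimates a 6-DoF pose and scene depth, so this illustrates only the error-driven optimization, not the thesis's implementation:

```python
import numpy as np

def photometric_align(ref, cur, iters=20):
    """Estimate a 2-D translation (dx, dy) between two grayscale frames
    by minimizing the photometric error directly over pixel intensities.
    Function name and interface are illustrative."""
    ref = ref.astype(float)
    cur = cur.astype(float)
    gy, gx = np.gradient(cur)          # image gradients (rows, cols)
    ys, xs = np.mgrid[0:ref.shape[0], 0:ref.shape[1]]
    p = np.zeros(2)                    # translation parameters
    for _ in range(iters):
        # Warp the current frame by the running estimate
        # (nearest-pixel sampling keeps the sketch short).
        xw = np.clip(xs + p[0], 0, ref.shape[1] - 1).astype(int)
        yw = np.clip(ys + p[1], 0, ref.shape[0] - 1).astype(int)
        r = (cur[yw, xw] - ref).ravel()              # photometric residuals
        J = np.stack([gx[yw, xw].ravel(), gy[yw, xw].ravel()], axis=1)
        # Gauss-Newton step: solve (J^T J) dp = -J^T r (damped for safety).
        dp = np.linalg.solve(J.T @ J + 1e-6 * np.eye(2), -J.T @ r)
        p += dp
        if np.linalg.norm(dp) < 1e-3:                # converged
            break
    return p
```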
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The aim of this study was to compare the learning process of a highly complex ballet skill following demonstrations by point-light and video models. Sixteen participants, divided into point-light and video groups (ns = 8), performed 160 trials of a pirouette, equally distributed in blocks of 20 trials, alternating periods of demonstration and practice, with a retention test a day later. Measures of head and trunk oscillation, coordination, disparity from the model, and movement time difference showed similarities between the video and point-light groups; ballet experts' evaluations indicated superior performance in the video group over the point-light group. Results are discussed in terms of the task requirement of dissociation between head and trunk rotations, focusing on the hypothesis of the sufficiency and higher relevance of information contained in biological motion models applied to the learning of complex motor skills.
Abstract:
Based on analyses of high-speed video recordings of cloud-to-ground lightning in Brazil and the USA, the characteristics of positive cloud-to-ground (+CG) leaders are presented. The high frame rates permitted the average two-dimensional speeds of development along the channel paths to be resolved with good accuracy. The values range from 0.3 to 6.0 × 10⁵ m s⁻¹ with a mean of 2.7 × 10⁵ m s⁻¹. Contrary to what is usually assumed, downward +CG leader speeds are similar to downward -CG leader speeds. Our observations also show that the speeds tend to increase by a factor of 1.1 to 6.5 as they approach the ground. The presence of short-duration recoil leaders (RLs) during the development of positive leaders reveals a highly branched structure that is not usually recorded with conventional photographic and video cameras. The existence of the RLs may help to explain observations of UHF-VHF radiation during the development of +CG flashes.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)