892 results for video data
Abstract:
Estimating the fundamental matrix (F), to determine the epipolar geometry between a pair of images or video frames, is a basic step for a wide variety of vision-based functions used in construction operations, such as camera-pair calibration, automatic progress monitoring, and 3D reconstruction. Currently, robust methods (e.g., SIFT + normalized eight-point algorithm + RANSAC) are widely used in the construction community for this purpose. Although they can provide acceptable accuracy, the significant amount of required computational time impedes their adoption in real-time applications, especially video data analysis with many frames per second. Aiming to overcome this limitation, this paper presents and evaluates the accuracy of a solution to find F by combining the use of two speedy and consistent methods: SURF for the selection of a robust set of point correspondences and the normalized eight-point algorithm. This solution is tested extensively on construction site image pairs including changes in viewpoint, scale, illumination, rotation, and moving objects. The results demonstrate that this method can be used for real-time applications (5 image pairs per second with the resolution of 640 × 480) involving scenes of the built environment.
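As a rough illustration of the "normalized" part of the normalized eight-point algorithm named above, the sketch below applies the standard Hartley conditioning to a list of pixel coordinates: the points are translated so that their centroid sits at the origin and scaled so that their mean distance from it is √2. The function name `normalize_points` and the sample coordinates are illustrative assumptions, not taken from the paper; the SURF matching and the linear solve for F are omitted.

```python
import math

def normalize_points(points):
    """Hartley normalization of 2D points.

    Returns (normalized_points, T), where T is the 3x3 similarity transform
    (nested lists) mapping homogeneous pixel coordinates to normalized ones.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean distance of the points from their centroid.
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    if mean_dist == 0:
        raise ValueError("points must not all coincide")
    s = math.sqrt(2) / mean_dist
    T = [[s, 0.0, -s * cx],
         [0.0, s, -s * cy],
         [0.0, 0.0, 1.0]]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, T

# Hypothetical pixel coordinates in a 640 x 480 image.
pts = [(100.0, 120.0), (540.0, 80.0), (320.0, 400.0), (60.0, 300.0)]
norm_pts, T = normalize_points(pts)
```

In the usual formulation, each image's points are normalized with their own transform (T and T'), the matrix is estimated from the normalized correspondences, and the result is mapped back to pixel coordinates as F = T'ᵀ F̂ T.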
Abstract:
Tracking applications provide real-time on-site information that can be used to detect travel path conflicts, calculate crew productivity and eliminate unnecessary processes at the site. This paper presents the validation of a novel vision-based tracking methodology at the Egnatia Odos Motorway in Thessaloniki, Greece. Egnatia Odos is a motorway that connects Turkey with Italy through Greece. Its multiple open construction sites serve as an ideal multi-site test bed for validating construction site tracking methods. The vision-based tracking methodology uses video cameras and computer algorithms to calculate the 3D position of project-related entities (e.g. personnel, materials and equipment) in construction sites. The approach provides an unobtrusive, inexpensive way of effectively identifying and tracking the 3D location of entities. The process followed in this study starts by acquiring video data from multiple synchronous cameras at several large-scale project sites of Egnatia Odos, such as tunnels, interchanges and bridges under construction. Subsequent steps include the evaluation of the collected data and, finally, the 3D tracking of selected entities (heavy equipment and personnel). The accuracy and precision of the method's results are evaluated by comparing them with the actual 3D position of the object, thus assessing the 3D tracking method's effectiveness.
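The comparison against actual 3D positions described above could be summarized along the following lines. This is a minimal sketch under assumptions of its own: accuracy is taken as the mean Euclidean error and precision as the standard deviation of that error, which the abstract does not specify, and the function names and coordinates are hypothetical.

```python
import math
import statistics

def tracking_errors(estimated, ground_truth):
    """Euclidean distance between each estimated and actual 3D position."""
    if len(estimated) != len(ground_truth):
        raise ValueError("both tracks must have the same number of samples")
    return [math.dist(e, g) for e, g in zip(estimated, ground_truth)]

def accuracy_and_precision(estimated, ground_truth):
    """Return (mean error, standard deviation of the error).

    Assumed definitions: a lower mean error means a more accurate track,
    a lower spread means a more precise one.
    """
    errors = tracking_errors(estimated, ground_truth)
    return statistics.mean(errors), statistics.pstdev(errors)

# Hypothetical positions in metres.
estimated = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
actual = [(0.0, 0.0, 0.0)] * 3
mean_err, spread = accuracy_and_precision(estimated, actual)
```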
Abstract:
The purpose of this supplemental project was to collect invaluable data from the large-scale construction sites of the Egnatia Odos motorway, needed to validate a novel automated vision-tracking method created under the parent grant. For this purpose, one US graduate and three US undergraduate students traveled to Greece for 4 months and worked together with 2 Greek graduate students of the local faculty collaborator. This team of students monitored project activities and scheduled data collection trips on a daily basis, set up a mobile video data collection lab on the back of a truck, and drove to various sites every day to collect hundreds of hours of video from multiple cameras on a large variety of activities ranging from soil excavation to bridge construction. The US students were underrepresented students from minority groups who had never visited a foreign country. As a result, this trip was a major life experience for them. They learned how to live in a non-English-speaking country, communicate with Greek students, workers and engineers, lead a project in a very unfamiliar environment, troubleshoot the myriad problems that hampered their progress daily and, above all, collaborate effectively and efficiently with other cultures. They returned to the US more mature, with improved leadership and problem-solving skills and a wider perspective of their profession.
Abstract:
We develop a convex relaxation of maximum a posteriori estimation of a mixture of regression models. Although our relaxation involves a semidefinite matrix variable, we reformulate the problem to eliminate the need for general semidefinite programming. In particular, we provide two reformulations that admit fast algorithms. The first is a max-min spectral reformulation exploiting quasi-Newton descent. The second is a min-min reformulation consisting of fast alternating steps of closed-form updates. We evaluate the methods against Expectation-Maximization in a real problem of motion segmentation from video data.
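For orientation, the Expectation-Maximization baseline mentioned above can be sketched for a deliberately simplified model. This is not the paper's convex relaxation: the sketch assumes one-dimensional regressions through the origin (y = w_k x + noise), Gaussian noise with a fixed, shared standard deviation, and hand-picked initial slopes, all of which are illustrative choices.

```python
import math

def em_mixture_of_regressions(xs, ys, slopes, sigma=0.5, iters=20):
    """EM for a K-component mixture of regressions y = w_k * x + noise."""
    k = len(slopes)
    slopes = list(slopes)
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities, computed in log space to avoid underflow.
        resp = []
        for x, y in zip(xs, ys):
            logp = [math.log(weights[j]) - 0.5 * ((y - slopes[j] * x) / sigma) ** 2
                    for j in range(k)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: responsibility-weighted least squares per component.
        for j in range(k):
            num = sum(r[j] * x * y for r, x, y in zip(resp, xs, ys))
            den = sum(r[j] * x * x for r, x in zip(resp, xs))
            if den > 0:
                slopes[j] = num / den
            # Floor the mixing weight so its logarithm stays defined.
            weights[j] = max(sum(r[j] for r in resp) / len(xs), 1e-12)
    return slopes, weights

# Synthetic data drawn from two lines, y = 2x and y = -x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0] * 2
ys = [2.0 * x for x in xs[:5]] + [-x for x in xs[5:]]
slopes, weights = em_mixture_of_regressions(xs, ys, slopes=[1.0, -0.5])
```

EM of this kind depends on its initialization and can stall in local optima, which is the usual motivation for seeking a convex alternative.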
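Abstract:
Scientific data are an important scientific and technological resource, and the shared management of scientific data is increasingly a frontier area of concern for academia and government. The earth sciences span many disciplines, and their objects of study are complex and often cover large spatial and temporal scales; research therefore produces large volumes of data with widely differing structures, such as graphic data, tabular data, text data, and image data. How to store, publish, and display these heterogeneous data in a database system environment is the first problem that scientific data management must face. Based on an analysis of the characteristics of earth science data, and drawing on practical experience in building the Loess Plateau Data Center, this paper studies the classification and organization of earth science research data with the goal of shared management of scientific data. It describes the basic attributes and characteristics of earth science research data, such as heterogeneity, density, and compositeness; proposes classifying and organizing datasets into three basic types (relational, spatial, and file); and puts forward basic principles and methods for compiling datasets, as well as approaches to grading, protecting, and sharing scientific data. Practice shows that this data classification and organization scheme suits the characteristics of earth science research data, integrates scientific data management with computer network technology and information technology, and is advanced in both concept and technique, with a wide range of applications.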
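Abstract:
This paper introduces a video monitoring system for a micro remotely operated vehicle (ROV), comprising the hardware and the software-based video data processing techniques. The hardware handles the acquisition, transmission, and display of the video signal. The video data processing techniques handle the capture and storage of the video signal and the adjustment of picture quality. The system has been deployed on a micro ROV with good results.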
Abstract:
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation approaches. This paper describes an alternative formulation for dense scene flow estimation that provides convincing results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. To handle the aperture problems inherent in the estimation task, a multi-scale method along with a novel adaptive smoothing technique is used to gain a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with basic multi-scale approaches. Internally, the framework generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than standard stereo and optical flow methods allow. Experiments with synthetic and real test data demonstrate the effectiveness of the approach.
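To make the link between disparity, optical flow and 3D motion concrete, here is a minimal deterministic sketch. It assumes a rectified stereo pair with a pinhole model (focal length `f` in pixels, baseline in metres, principal point `cx`, `cy`) and ignores the probability distributions and regularization the paper describes; all names and numbers are illustrative.

```python
def backproject(u, v, disparity, f, baseline, cx, cy):
    """3D point (X, Y, Z) for pixel (u, v) in a rectified stereo pair."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = f * baseline / disparity          # depth from disparity
    return ((u - cx) * z / f, (v - cy) * z / f, z)

def scene_flow(u, v, d0, flow, d1, f, baseline, cx, cy):
    """3D motion of the point seen at pixel (u, v) at time t.

    d0   -- disparity at (u, v) at time t
    flow -- optical flow (du, dv) of that pixel between t and t + 1
    d1   -- disparity at the displaced pixel at time t + 1
    """
    du, dv = flow
    p0 = backproject(u, v, d0, f, baseline, cx, cy)
    p1 = backproject(u + du, v + dv, d1, f, baseline, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))

# A point on the optical axis moving 1 m toward the cameras.
motion = scene_flow(320, 240, 10.0, (0, 0), 12.5,
                    f=500.0, baseline=0.1, cx=320.0, cy=240.0)
```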
Abstract:
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
Abstract:
We present results of a study into the performance of a variety of different image transform-based feature types for speaker-independent visual speech recognition of isolated digits. This includes the first reported use of features extracted using a discrete curvelet transform. The study will show a comparison of some methods for selecting features of each feature type and show the relative benefits of both static and dynamic visual features. The performance of the features will be tested on both clean video data and also video data corrupted in a variety of ways to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter which simulates camera and/or head movement during recording.
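The abstract does not define the jitter corruption beyond saying that it simulates camera and/or head movement. One plausible reading, sketched here purely as an assumption, is an independent random translation of each frame by a few pixels; the function names, the zero padding, and the uniform shift distribution are all hypothetical.

```python
import random

def shift_frame(frame, dx, dy, fill=0):
    """Translate a 2D frame (list of rows) by (dx, dy) pixels, padding with `fill`."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx       # source pixel for this output pixel
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = frame[sy][sx]
    return out

def add_jitter(frames, max_shift, seed=0):
    """Shift every frame by an independent random offset in [-max_shift, max_shift]."""
    rng = random.Random(seed)
    jittered = []
    for frame in frames:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        jittered.append(shift_frame(frame, dx, dy))
    return jittered

# Two tiny hypothetical grayscale frames.
clip = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
corrupted = add_jitter(clip, max_shift=1)
```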
Abstract:
This project involved creative artists working with older people with dementia and staff from two Belfast Health and Social Care Trust supported housing centres in a mixed programme of dance, painting, music and drama which culminated in an open workshop with relatives and friends of the tenants. The study steered away from traditional medical models of art/music/dance therapy where the participant is perceived as a ‘patient’ in favour of identifying the participant as a ‘student’ who avails of a life-long learning experience. A key premise was that access to the arts is a human right, especially in the context of advancing age and cognitive impairment. According to one of the tenants of Mullan Mews, the project served to ‘awaken - or reawaken - folk with dementia to the endless vista of possibility already in their lives if they will only look for it’. A phenomenographic analysis of video data generated by the project emphasises the importance of the individual experiences of participants in the programme. The evidence from these storylines gained strength from the development of a documentary-style film text that has proved successful in capturing and translating the live experience of the project participants into a supportive text that goes beyond the written word.
Abstract:
Sleep quality and duration are increasingly recognised as being important prognostic parameters in the assessment of an individual's health. However, reliable non-invasive long-term monitoring of sleep in a non-clinical setting remains a challenging problem. This paper describes the validation of a novel under-mattress pressure-sensing sleep monitoring modality that can be seamlessly integrated into existing home environments and provides a pervasive and distributed solution for monitoring long-term changes in sleep patterns and sleep disorders in adults. 410 minutes of concomitant Under Mattress Bed Sensor (UMBS) and strain gauge data were analysed from eight healthy adults lying passively. In this analysis, customised respiration rate detection algorithms yielded a mean difference of −0.12 breaths per five minutes and a mean percentage error (MPE) of 0.16% when the sensor was placed beneath the mattress. 1,491 minutes of UMBS and video data were recorded simultaneously from four participants in order to assess the movement detection efficacy of customised UMBS algorithms. These algorithms yielded accuracies, sensitivities and specificities of over 90% when compared to a video-based movement detection gold standard. A reduced data set (267 minutes) of wrist actigraphy, the gold standard ambulatory sleep monitor, was recorded. The UMBS was shown to outperform the movement detection ability of wrist actigraphy and has the added advantage of not requiring active subject participation.
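The accuracy, sensitivity and specificity figures reported against the video-based reference follow the standard confusion-matrix definitions. The sketch below shows that calculation for per-epoch movement labels; the epoch-by-epoch framing, the function name, and the sample labels are assumptions made for illustration and are not data from the study.

```python
def detection_metrics(predicted, reference):
    """Accuracy, sensitivity and specificity of binary movement labels.

    predicted -- sensor-derived labels per epoch (truthy = movement)
    reference -- labels per epoch from the video-based reference
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both classes")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of true movements detected
        "specificity": tn / (tn + fp),   # share of still epochs kept as still
    }

# Hypothetical labels for eight epochs.
metrics = detection_metrics([1, 1, 0, 0, 1, 0, 0, 0],
                            [1, 1, 1, 0, 0, 0, 0, 0])
```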
Abstract:
Genetic Programming (GP) is a widely used methodology for solving various computational problems. GP's problem-solving ability is usually hindered by its long execution times. In this thesis, GP is applied toward real-time computer vision. In particular, object classification and tracking using a parallel GP system is discussed. First, a study of suitable GP languages for object classification is presented. Two main GP approaches for visual pattern classification, namely the block-classifiers and the pixel-classifiers, were studied. Results showed that the pixel-classifiers generally performed better. Using these results, a suitable language was selected for the real-time implementation. Synthetic video data was used in the experiments. The goal of the experiments was to evolve a unique classifier for each texture pattern that existed in the video. The experiments revealed that the system was capable of correctly tracking the textures in the video. The performance of the system was on par with real-time requirements.
Abstract:
Industrialized countries such as Canada must cope with an aging population. In particular, the majority of elderly people, living at home and often alone, face risky situations such as falls. In this context, video surveillance is an innovative solution that can allow them to live normally in a secure environment. The idea would be to install a network of cameras in the person's apartment to detect a fall automatically. In the event of a problem, a message could be sent, depending on the urgency, to the emergency services or to the family over a secure internet connection. To keep the system low-cost, we limited the number of cameras to one per room, which led us to explore monocular fall detection methods. We first explored the problem from a 2D (image) point of view, focusing on the large changes in the person's silhouette during a fall. Data on the normal activities of an elderly person were modeled with a Gaussian mixture, allowing us to detect any abnormal event. Our method was validated using a video library of simulated falls and realistic normal activities. However, 3D information such as the person's location relative to the environment can be very valuable for a behavior analysis system. Although a multi-camera system is preferable for obtaining 3D information, we showed that with a single calibrated camera it is possible to localize a person in the environment by means of the head. Specifically, the person's head, modeled as an ellipsoid, is tracked through the image sequence with a particle filter. The accuracy of the 3D head localization was evaluated on a library of video sequences containing the true 3D locations obtained by a motion capture system.
An example application using the 3D trajectory of the head is proposed in the context of fall detection. In conclusion, a video surveillance system for fall detection with a single camera per room is entirely feasible. To minimize the risk of false alarms, a hybrid method combining 2D and 3D information could be considered.
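The idea of modeling normal activity with a Gaussian mixture and flagging anything unlikely under it can be sketched as follows. The sketch assumes a single scalar feature (for example, some silhouette shape-change score), mixture parameters that have already been fitted, and a hand-chosen log-likelihood threshold; none of these specifics come from the thesis.

```python
import math

def gmm_log_likelihood(x, components):
    """Log-density of x under a 1D Gaussian mixture.

    components -- list of (weight, mean, std) tuples, weights summing to 1
    """
    terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in components
    ]
    top = max(terms)                      # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def is_abnormal(x, components, threshold):
    """Flag x as abnormal when it is too unlikely under the normal-activity model."""
    return gmm_log_likelihood(x, components) < threshold

# Hypothetical model of normal activity with two modes.
normal_model = [(0.6, 0.0, 1.0), (0.4, 5.0, 1.0)]
flags = [is_abnormal(v, normal_model, threshold=-10.0) for v in (0.0, 5.0, 20.0)]
```

Lowering the threshold reduces false alarms at the cost of missed events, which is the trade-off the hybrid 2D/3D suggestion above aims to ease.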
Abstract:
As the popularity of digital videos increases, a large number of illegal videos are being generated and published. Video copies are generated by performing various sorts of transformations on the original video data. To identify such illegal videos effectively, image features that are invariant to various transformations must be extracted for similarity matching. An image feature can be either a local feature or a global feature. Of the two, local features are powerful and have been applied in a wide variety of computer vision applications. This paper focuses on various recently proposed local detectors and descriptors that are invariant to a number of image transformations.
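Similarity matching on local descriptors is commonly done with a nearest-neighbour search followed by a distance ratio test (often called Lowe's ratio test). The sketch below illustrates that generic technique only; the survey does not say which matching rule it has in mind, and the tiny 2D "descriptors" and the 0.8 ratio are illustrative assumptions.

```python
import math

def match_descriptors(query, reference, ratio=0.8):
    """Match each query descriptor to its nearest reference descriptor.

    A match is kept only when the nearest distance is clearly smaller than
    the second-nearest one, which discards ambiguous correspondences.
    Returns a list of (query_index, reference_index) pairs.
    """
    matches = []
    for qi, q in enumerate(query):
        dists = sorted((math.dist(q, r), ri) for ri, r in enumerate(reference))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches

# Hypothetical descriptors from an original frame and a suspected copy.
reference = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
query = [(0.5, 0.0), (5.0, 0.0)]   # the second one is ambiguous
matches = match_descriptors(query, reference)
```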
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.