988 results for Video Processing


Relevance:

60.00%

Publisher:

Abstract:

Image and video compression play a major role in the world today, allowing the storage and transmission of large volumes of multimedia content. However, processing this information requires substantial computational resources, so improving the computational performance of these compression algorithms is very important. The Multidimensional Multiscale Parser (MMP) is a pattern-matching-based compression algorithm for multimedia content, namely images, that achieves high compression ratios while maintaining good image quality (Rodrigues et al. [2008]). However, it takes longer to execute than other existing algorithms. Therefore, two parallel implementations for GPUs were proposed by Ribeiro [2016] and Silva [2015], in CUDA and OpenCL-GPU, respectively. In this dissertation, to complement that work, we propose two parallel versions that run the MMP algorithm on the CPU: one resorting to OpenMP and another that converts the existing OpenCL-GPU version into OpenCL-CPU. The proposed solutions improve the computational performance of MMP by factors of 3 and 2.7, respectively. High Efficiency Video Coding (HEVC/H.265) is the most recent standard for image and video compression. Its impressive compression performance makes it a target for many adaptations, particularly for holoscopic (or light field) image/video processing. Some of the proposed modifications to encode this new multimedia content are based on geometry-based disparity compensations (SS), developed by Conti et al. [2014], and a Geometric Transformations (GT) module, proposed by Monteiro et al. [2015]. These HEVC-based compression algorithms for holoscopic images implement a specific search for similar micro-images that is more efficient than the one performed by HEVC, but their implementation is considerably slower than HEVC.
To achieve better execution times, we use the OpenCL API as the GPU-enabling language to increase the module's performance. With its most costly setting, we reduce the GT module execution time from 6.9 days to less than 4 hours, effectively attaining a speedup of 45 times.
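As a rough consistency check on the figures quoted above, speedup is the ratio of the baseline time to the accelerated time. The sketch below assumes exactly 4 hours for the accelerated run (the abstract only says it is under 4 hours), so the first value is a lower bound. The function name is illustrative and not taken from the dissertation.

```python
def speedup(baseline_hours: float, accelerated_hours: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_hours <= 0:
        raise ValueError("accelerated time must be positive")
    return baseline_hours / accelerated_hours


baseline_hours = 6.9 * 24  # 6.9 days expressed in hours

# Assumption: 4.0 h is used as an upper limit on the GPU run time,
# so this ratio is a lower bound on the real speedup.
lower_bound = speedup(baseline_hours, 4.0)

# Run time implied by the reported speedup of 45.
implied_hours = baseline_hours / 45

print(f"lower bound on speedup: {lower_bound:.1f}")
print(f"run time implied by a speedup of 45: {implied_hours:.2f} h")
```

A speedup of 45 implies a run time a little under 4 hours, which agrees with the reported reduction.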

Relevance:

40.00%

Publisher:

Abstract:

A Distributed Wireless Smart Camera (DWSC) network is a special type of Wireless Sensor Network (WSN) that processes captured images in a distributed manner. While image processing on DWSCs has great potential for growth, with a vast practical application domain that includes security surveillance and health care, it suffers from tremendous constraints. In addition to the limitations of conventional WSNs, image processing on DWSCs requires more computational power, bandwidth and energy, which presents significant challenges for large-scale deployments. This dissertation develops a number of algorithms that are highly scalable, portable, energy efficient and performance efficient, taking into account the practical constraints imposed by the hardware and the nature of WSNs. More specifically, these algorithms tackle the problems of multi-object tracking and localisation in distributed wireless smart camera networks and of determining optimal camera configurations. Addressing the first problem, multi-object tracking and localisation, requires solving a large array of sub-problems. The sub-problems discussed in this dissertation are calibration of internal parameters, multi-camera calibration for localisation and object handover for tracking. These topics have been covered extensively in the computer vision literature; however, new algorithms must be devised to accommodate the various constraints introduced and required by the DWSC platform. A technique has been developed for the automatic calibration of low-cost cameras that are assumed to be restricted in their freedom of movement to either pan or tilt movements.
Camera internal parameters, including focal length, principal point, lens distortion parameter and the angle and axis of rotation, can be recovered from a minimum set of two images from the camera, provided that the axis of rotation between the two images goes through the camera's optical centre and is parallel to either the vertical (panning) or horizontal (tilting) axis of the image. For object localisation, a novel approach has been developed for the calibration of a network of non-overlapping DWSCs in terms of their ground plane homographies, which can then be used for localising objects. In the proposed approach, a robot travels through the camera network while updating its position in a global coordinate frame, which it broadcasts to the cameras. The cameras use this, along with the image plane location of the robot, to compute a mapping from their image planes to the global coordinate frame. This is combined with an occupancy map generated by the robot during the mapping process to localise objects moving within the network. In addition, to deal with the problem of object handover between DWSCs with non-overlapping fields of view, a highly scalable, distributed protocol has been designed. Cameras that follow the proposed protocol transmit object descriptions to a selected set of neighbours that are determined using a predictive forwarding strategy. The received descriptions are then matched at the subsequent camera on the object's path using a probability maximisation process with locally generated descriptions. The second problem, camera placement, emerges naturally when these pervasive devices are put into real use. The locations, orientations, lens types etc. of the cameras must be chosen so that the utility of the network is maximised (e.g. maximum coverage) while user requirements are met.
To deal with this, a statistical formulation of the problem of determining optimal camera configurations has been introduced and a Trans-Dimensional Simulated Annealing (TDSA) algorithm has been proposed to effectively solve the problem.
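The ground plane calibration step described in this abstract can be illustrated with a minimal sketch, assuming exactly four robot sightings per camera and a homography with its last entry fixed to 1. The abstract does not say how the mapping is computed, so the exact four-point solve, the function names and the sample coordinates below are all illustrative assumptions. A real deployment would more likely use many sightings and a least-squares fit.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(image_pts, ground_pts):
    """Estimate a 3x3 image-to-ground homography from four point pairs."""
    if len(image_pts) != 4 or len(ground_pts) != 4:
        raise ValueError("this sketch needs exactly four correspondences")
    rows, rhs = [], []
    for (x, y), (gx, gy) in zip(image_pts, ground_pts):
        # Two linear equations per correspondence, with h33 fixed to 1.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -gx * x, -gx * y])
        rhs.append(gx)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -gy * x, -gy * y])
        rhs.append(gy)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, pt):
    """Map an image point to ground plane coordinates."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity")
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical data: pixel positions where the robot was seen, and the
# global positions (in metres) it broadcast at those moments.
pixels = [(0.0, 0.0), (640.0, 0.0), (640.0, 480.0), (0.0, 480.0)]
metres = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
H = estimate_homography(pixels, metres)
centre = apply_homography(H, (320.0, 240.0))
print(centre)
```

Once a camera holds its own homography, any detected object's image position can be mapped to the shared coordinate frame with `apply_homography`.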

Relevance:

40.00%

Publisher:

Abstract:

We explore the use of natural language understanding and image processing to index and query American Football tapes. We present a model for representing spatio-temporal characteristics of multiple objects in dynamic scenes in this domain, and a recognition system which uses the model to recognise American Football plays.

Relevance:

40.00%

Publisher:

Abstract:

A main objective of human movement analysis is the quantitative description of joint kinematics and kinetics. This information has great potential to address clinical problems in both orthopaedics and motor rehabilitation. Previous studies have shown that assessing kinematics and kinetics from stereophotogrammetric data requires a setup phase, special equipment and expertise to operate. In addition, this procedure may make subjects feel uneasy and may hinder their walking. The general aim of this thesis is the implementation and evaluation of new 2D markerless techniques, in order to contribute to the development of an alternative to traditional stereophotogrammetric techniques. The study first focused on estimating the kinematics of the ankle-foot complex during the stance phase of gait. Two particular cases were considered: barefoot subjects and subjects wearing ankle socks. The use of socks was investigated in view of the development of the hybrid method proposed in this work. Different algorithms were analyzed, evaluated and implemented in order to obtain a 2D markerless solution that estimates the kinematics in both cases. The proposed technique was validated against a traditional stereophotogrammetric system. The technique offers an alternative to the traditional stereophotogrammetric system that is easy to configure and more comfortable for the subject. The technique was then improved so that knee flexion/extension could also be measured with a 2D markerless approach. The main changes to the implementation concerned occlusion handling and background segmentation. With these additional constraints, the proposed technique was applied to the estimation of knee flexion/extension and compared with a traditional stereophotogrammetric system.
Results showed that the knee flexion/extension estimates from the traditional stereophotogrammetric system and the proposed markerless system were highly comparable, making the latter a potential alternative for clinical use. A contribution has also been made to the estimation of lower limb kinematics in children with cerebral palsy (CP). For this purpose, a hybrid technique was proposed that uses high-cut underwear and ankle socks as “segmental markers” in combination with a markerless methodology. The proposed hybrid technique differs from the abovementioned markerless technique in the algorithm chosen. Results showed that the proposed hybrid technique can become a simple and low-cost alternative to traditional stereophotogrammetric systems.
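To make the measured quantity concrete, here is a minimal sketch of how a knee flexion angle could be computed from three 2D image points in the sagittal plane. It assumes hip, knee and ankle positions are already available as pixel coordinates. How the thesis obtains such points is not stated in the abstract, and the function and argument names are hypothetical.

```python
import math


def knee_flexion_deg(hip, knee, ankle):
    """Return knee flexion in degrees: 0 for a straight leg, larger as it bends."""
    thigh = (hip[0] - knee[0], hip[1] - knee[1])
    shank = (ankle[0] - knee[0], ankle[1] - knee[1])
    if thigh == (0, 0) or shank == (0, 0):
        raise ValueError("joint points must be distinct")
    dot = thigh[0] * shank[0] + thigh[1] * shank[1]
    cross = thigh[0] * shank[1] - thigh[1] * shank[0]
    # Included angle between thigh and shank, in the range 0 to 180 degrees.
    included = math.degrees(math.atan2(abs(cross), dot))
    return 180.0 - included


# Hypothetical pixel coordinates (x to the right, y downwards).
straight = knee_flexion_deg((100, 100), (100, 200), (100, 300))
bent = knee_flexion_deg((100, 100), (100, 200), (200, 200))
print(straight, bent)
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` suffers near 0 and 180 degrees, which matters for a nearly straight leg.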

Relevance:

40.00%

Publisher:

Abstract:

Schizophrenia is a mental disorder characterized by a breakdown of cognitive processes and by a deficit of typical emotional responses. The effectiveness of computerized tasks has been demonstrated in the field of cognitive rehabilitation. However, current rehabilitation programs based on virtual environments normally focus on higher cognitive functions and do not cover social cognition training. This paper presents a set of video-based tasks specifically designed for the rehabilitation of emotional processing deficits in patients in the early stages of schizophrenia or schizoaffective disorders. These tasks are part of the Mental Health program of the Guttmann NeuroPersonalTrainer® cognitive tele-rehabilitation platform, and are innovative from both a clinical and a technological perspective relative to traditional therapeutic contents.

Relevance:

30.00%

Publisher:

Abstract:

Identifying an individual from surveillance video is a difficult, time-consuming and labour-intensive process. The proposed system aims to streamline this process by filtering out unwanted scenes and enhancing an individual's face through super-resolution. An automatic face recognition system is then used to identify the subject or present the human operator with likely matches from a database. A person tracker speeds up subject detection and super-resolution by tracking moving subjects and cropping a region of interest around the subject's face, which reduces the number and the size of the image frames to be super-resolved. In this paper, experiments have been conducted to demonstrate how the optical flow super-resolution method used improves surveillance imagery for visual inspection as well as for automatic face recognition on an Eigenface and an Elastic Bunch Graph Matching system. The optical flow based method has also been benchmarked against the "hallucination" algorithm, interpolation methods and the original low-resolution images. Results show that both super-resolution algorithms improved recognition rates significantly. Although the hallucination method resulted in slightly higher recognition rates, the optical flow method produced fewer artifacts and more visually correct images suitable for human consumption.

Relevance:

30.00%

Publisher:

Abstract:

This study explores the effects of use-simulated and peripheral placements in video games on attitude to the brand. Results indicate that placements do not lead to enhanced brand attitude, even when controlling for involvement and skill. It appears this is due to constraints on brand information processing in a game context.

Relevance:

30.00%

Publisher:

Abstract:

Surveillance video is generally characterised by low resolution and poor quality due to environmental, storage and processing limitations. It is extremely difficult for computers and human operators to identify individuals from these videos. To overcome this problem, super-resolution can be used in conjunction with an automated face recognition system to enhance the spatial resolution of video frames containing the subject and to narrow down the number of manual verifications performed by the human operator by presenting a list of the most likely candidates from the database. As the super-resolution reconstruction process is ill-posed, visual artifacts are often generated as a result. These artifacts can be visually distracting to humans and/or affect machine recognition algorithms. While it is intuitive that higher resolution should lead to improved recognition accuracy, the effects of super-resolution and such artifacts on face recognition performance have not been systematically studied. This paper aims to address this gap while illustrating that super-resolution allows more accurate identification of individuals from low-resolution surveillance footage. The proposed optical flow-based super-resolution method is benchmarked against Baker et al.’s hallucination and Schultz et al.’s super-resolution techniques on images from the Terrascope and XM2VTS databases. Ground truth and interpolated images were also tested to provide a baseline for comparison. Results show that a suitable super-resolution system can improve the discriminability of surveillance video and enhance face recognition accuracy. The experiments also show that Schultz et al.’s method fails when dealing with surveillance footage due to its assumption of rigid objects in the scene. The hallucination and optical flow-based methods performed comparably, with the optical flow-based method producing fewer of the visually distracting artifacts that interfere with human recognition.
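The interpolated baseline mentioned in this abstract can be pictured with a small sketch. The abstract does not say which interpolation was used, so bilinear interpolation is assumed here purely for illustration. The function name, the integer scale factor and the list-of-rows image format are also assumptions.

```python
def bilinear_upscale(img, factor):
    """Upscale a 2D grayscale image (a list of rows) by an integer factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    h, w = len(img), len(img[0])
    out = []
    for i in range(h * factor):
        # Map the output row back to source coordinates, clamped to the image.
        y = min(i / factor, h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(w * factor):
            x = min(j / factor, w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            # Blend the four neighbouring source pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


low_res = [[0, 10],
           [20, 30]]
upscaled = bilinear_upscale(low_res, 2)
for r in upscaled:
    print(r)
```

Interpolation of this kind only blends existing pixels and adds no new detail, which is why it serves as a baseline against which super-resolution methods are compared.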

Relevance:

30.00%

Publisher:

Abstract:

A new method for the detection of abnormal vehicle trajectories is proposed. It couples optical flow extraction of vehicle velocities with a neural network classifier. Abnormal trajectories are indicative of drunk or sleepy drivers. A single feature of the vehicle, e.g. a tail light, is isolated and the optical flow is computed only around this feature rather than at each pixel in the image.
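The idea of estimating motion only around one feature can be sketched as follows. This example swaps in simple block matching (a sum of squared differences search) in place of the optical flow method, which the abstract does not specify. The function names, patch size and search range are illustrative assumptions.

```python
def patch_ssd(prev, curr, cy, cx, dy, dx, half):
    """Sum of squared differences between a patch in prev and a shifted patch in curr."""
    total = 0.0
    for oy in range(-half, half + 1):
        for ox in range(-half, half + 1):
            diff = prev[cy + oy][cx + ox] - curr[cy + dy + oy][cx + dx + ox]
            total += diff * diff
    return total


def track_feature(prev, curr, centre, half=1, search=2):
    """Return the (dy, dx) displacement of the patch centred on the feature."""
    cy, cx = centre
    h, w = len(curr), len(curr[0])
    if not (half <= cy < h - half and half <= cx < w - half):
        raise ValueError("feature patch must lie inside the frame")
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose patch would leave the frame.
            if not (half <= cy + dy < h - half and half <= cx + dx < w - half):
                continue
            score = patch_ssd(prev, curr, cy, cx, dy, dx, half)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


def frame_with_light(y, x, size=7):
    """Hypothetical test frame: dark image with one bright 'tail light' pixel."""
    frame = [[0] * size for _ in range(size)]
    frame[y][x] = 255
    return frame


motion = track_feature(frame_with_light(3, 3), frame_with_light(4, 5), (3, 3))
print(motion)
```

Restricting the search to one small patch keeps the cost per frame independent of image size. A sequence of such displacements would form the trajectory passed to a classifier.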

Relevance:

30.00%

Publisher:

Abstract:

We propose an approach that employs eigen light-fields for face recognition across pose in video. Faces of a subject are collected from video frames and combined based on pose to obtain a set of probe light-fields. These probe data are then projected onto the principal subspace of the eigen light-fields, within which the classification takes place. We modify the original light-field projection and find that the modified version is more robust in the proposed system. Evaluation on the VidTIMIT dataset demonstrates that the eigen light-fields method is able to take advantage of the multiple observations contained in the video.

Relevance:

30.00%

Publisher:

Abstract:

From a law enforcement standpoint, the ability to search for a person matching a semantic description (e.g. 1.8 m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, a real-time vision-based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segment fitting algorithm follows, which considers global and local information together with multiple collinear measurements. A GPU-accelerated implementation of the algorithm is also investigated in the experiment. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.
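One simple way to picture the collinearity test behind joining short segments into a long line is sketched below. The abstract does not describe the paper's fitting method, so this sketch substitutes a plain total least squares line fit through the segment endpoints, followed by a residual check. The function names and the tolerance value are assumptions.

```python
import math


def fit_line(points):
    """Total least squares line fit. Returns (cx, cy, theta) for the best line."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points)
    syy = sum((p[1] - cy) ** 2 for p in points)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points)
    # Orientation of the principal axis of the point scatter.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, theta


def max_residual(points, line):
    """Largest perpendicular distance from any point to the fitted line."""
    cx, cy, theta = line
    s, c = math.sin(theta), math.cos(theta)
    return max(abs(-(x - cx) * s + (y - cy) * c) for x, y in points)


def are_collinear(segments, tol=0.1):
    """Decide whether the segments lie on one line, within tol pixels."""
    points = [p for seg in segments for p in seg]
    return max_residual(points, fit_line(points)) <= tol


on_line = [((0, 0), (1, 1)), ((2, 2), (3, 3))]
off_line = [((0, 0), (1, 1)), ((2, 4), (3, 5))]
print(are_collinear(on_line), are_collinear(off_line))
```

Total least squares measures perpendicular distance, so it handles near-vertical lines that would break an ordinary regression of y on x.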

Relevance:

30.00%

Publisher:

Abstract:

Quality-based frame selection is a crucial task in video face recognition, both to improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network, and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database show that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.
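The selection pipeline in this abstract (normalise each cue, fuse, keep the best frames) can be sketched as below. The paper fuses scores with a neural network; this sketch substitutes a fixed weighted average so that it stays self-contained. The cue values, weights and function names are hypothetical.

```python
def min_max_normalise(values):
    """Rescale raw cue scores to the range 0 to 1 across the frames."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # cue carries no information
    return [(v - lo) / (hi - lo) for v in values]


def select_best_frames(cue_scores, weights, k):
    """Return (indices of the k best frames, fused score per frame)."""
    names = list(cue_scores)
    n = len(cue_scores[names[0]])
    normalised = {name: min_max_normalise(cue_scores[name]) for name in names}
    total_w = sum(weights[name] for name in names)
    # Assumption: weighted average stands in for the neural network fusion.
    fused = [
        sum(weights[name] * normalised[name][i] for name in names) / total_w
        for i in range(n)
    ]
    ranked = sorted(range(n), key=lambda i: fused[i], reverse=True)
    return ranked[:k], fused


# Hypothetical raw scores for three frames.
cues = {
    "sharpness": [10.0, 50.0, 30.0],
    "symmetry": [0.2, 0.9, 0.4],
}
weights = {"sharpness": 1.0, "symmetry": 1.0}
best, fused = select_best_frames(cues, weights, k=2)
print(best, fused)
```

Normalising each cue first keeps a cue with a large numeric range, such as sharpness, from dominating the fused score.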