804 results for Monocular video
Abstract:
With the recent rise in popularity and usage of HTTP Adaptive Streaming (HAS) techniques, various studies have been carried out in this area, generally focused on the technical enhancement of HAS technology and applications. However, the lack of a common HAS standard has led to multiple proprietary approaches developed by major Internet companies. The emerging MPEG-DASH standard specifies the packaging of the video content and the HTTP syntax, but leaves all details of the adaptation behavior to the client implementation. Nevertheless, to design an adaptation algorithm that optimizes the viewing experience of the end user, multimedia service providers need to know the Quality of Experience (QoE) of different adaptation schemes. Taking this into account, the objective of this experiment was to study the QoE of a HAS-based video broadcast model. The experiment was carried out as a subjective study of end-user responses to various possible client behaviors for changing the video quality, taking different QoE influence factors into account. The experimental results provide good insight into the QoE of different adaptation schemes, which HAS clients can exploit when designing adaptation algorithms.
Quality-optimization algorithm based on stochastic dynamic programming for MPEG DASH video streaming
Abstract:
In contrast to traditional push-based protocols, adaptive streaming techniques such as Dynamic Adaptive Streaming over HTTP (DASH) focus on the client, which dynamically requests portions of the content at different qualities to cope with limited and variable bandwidth while aiming to maximize the quality perceived by the user. Since the DASH adaptation logic at the client is not covered by the standard, we propose a solution based on Stochastic Dynamic Programming (SDP) techniques to find the optimal request policies that guarantee the users' Quality of Experience (QoE). Our algorithm is evaluated in a simulated streaming session and is compared with other adaptation approaches. The results show that our proposal outperforms them in terms of QoE, requesting higher qualities on average.
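As a rough illustration of what a stochastic dynamic programming formulation of segment requests can look like, the following minimal sketch solves a toy finite-horizon problem by backward induction. It is not the algorithm of the paper: the bitrates, the i.i.d. bandwidth distribution, the buffer model, the reward (bitrate minus a stall penalty) and the names `step` and `solve` are all assumptions chosen for brevity.

```python
# Toy finite-horizon stochastic dynamic programme for segment-quality selection.
# Every constant below is an illustrative assumption, not a value from the paper.

BITRATES = [1, 2, 4]                   # Mbit/s of each quality level (hypothetical)
BANDWIDTH = {1: 0.3, 2: 0.4, 4: 0.3}   # Mbit/s -> probability, i.i.d. per segment
SEGMENT = 4                            # segment duration, in quarter-seconds
MAX_BUFFER = 40                        # buffer capacity, in quarter-seconds
STALL_PENALTY = 2.0                    # cost per quarter-second of stalling


def step(buffer, bitrate, bandwidth):
    """Return (reward, next_buffer) after downloading one segment."""
    # Download time in quarter-seconds; exact for the powers of two used here.
    download = SEGMENT * bitrate // bandwidth
    stall = max(0, download - buffer)
    next_buffer = min(MAX_BUFFER, max(0, buffer - download) + SEGMENT)
    reward = bitrate - STALL_PENALTY * stall
    return reward, next_buffer


def solve(horizon):
    """Backward induction over `horizon` segments.

    Returns (policy, value) where policy[t][buffer] is the index of the
    quality to request at step t, and value[buffer] is the expected total
    reward from step 0 onwards.
    """
    value = [0.0] * (MAX_BUFFER + 1)   # terminal value: nothing left to gain
    policy = []
    for _ in range(horizon):
        new_value, decisions = [], []
        for buffer in range(MAX_BUFFER + 1):
            best = None
            for action, bitrate in enumerate(BITRATES):
                expected = 0.0
                for bandwidth, prob in BANDWIDTH.items():
                    reward, nxt = step(buffer, bitrate, bandwidth)
                    expected += prob * (reward + value[nxt])
                if best is None or expected > best[0]:
                    best = (expected, action)
            new_value.append(best[0])
            decisions.append(best[1])
        value = new_value
        policy.insert(0, decisions)    # earlier steps go to the front
    return policy, value
```

Calling `solve(20)` yields a lookup table that a client could consult with its current buffer level. In this toy model an empty buffer always maps to the lowest quality, because any download then stalls playback and the stall penalty grows with the bitrate.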
Abstract:
A frame-level distortion model based on perceptual features of the human visual system is proposed to improve the performance of unequal error protection strategies and provide better quality of experience to users in Side-by-Side 3D video delivery systems.
Abstract:
One of the key factors for a given application to take advantage of cloud computing is the ability to scale in an efficient, fast and reliable way. In centralized multi-party video conferencing, dynamically scaling a running conversation is a complex problem. In this paper we propose a methodology to divide the Multipoint Control Unit (the video conferencing server) into simpler units, called broadcasters. Each broadcaster receives the media from a participant, processes it and forwards it to the rest. These broadcasters can be distributed among a group of CPUs. By using this methodology, video conferencing systems can scale in a more granular way, improving the deployment.
Abstract:
Assessing video quality is a complex task. Most pixel-based metrics do not correlate well enough with subjective results, yet algorithms need to correspond to human perception when analyzing the quality of a video sequence. To analyze the perceived quality resulting from specific video artifacts in specific regions of interest, we present a novel methodology for generating test sequences that allow the impact of each individual distortion to be analyzed. From the results of subjective assessment it is possible to create psychovisual models based on weighting pixels that belong to different regions of interest, distinguished by color, position, motion or content. The subjective assessment yields interesting results that demonstrate the need for new metrics adapted to the human visual system.
Abstract:
Shading reduces the power output of a photovoltaic (PV) system. The design engineering of PV systems requires modeling and evaluating shading losses. Some PV systems are affected by complex shading scenes whose resulting PV energy losses are very difficult to evaluate with current modeling tools. Several specialized PV design and simulation software packages can evaluate shading losses. They generally provide a Graphical User Interface (GUI) through which the user can draw a 3D shading scene and then evaluate the corresponding PV energy losses. The complexity of the objects that these tools can handle is relatively limited. We have created a software solution, 3DPV, which allows evaluating the energy losses induced by complex 3D scenes on PV generators. The 3D objects can be imported from specialized 3D modeling software or from a 3D object library. The shadows cast by this 3D scene on the PV generator are then evaluated directly on the Graphics Processing Unit (GPU). Thanks to the recent development of GPUs for the video game industry, the shadows can be evaluated with a very high spatial resolution that reaches well beyond the PV cell level, in very short calculation times. A PV simulation model then translates the geometrical shading into PV energy output losses. 3DPV has been implemented using WebGL, which allows it to run directly in a Web browser without requiring any local installation by the user. This also makes it possible to take full advantage of information already available on the Internet, such as 3D object libraries. This contribution describes, step by step, the method that allows 3DPV to evaluate the PV energy losses caused by complex shading. We then apply this methodology to several application cases encountered in PV system design and illustrate the results. Keywords: 3D, modeling, simulation, GPU, shading, losses, shadow mapping, solar, photovoltaic, PV, WebGL
Abstract:
This paper gives an overview of three recent studies by the authors on the topic of 3D video Quality of Experience (QoE). Two of the studies [1,2] investigated different psychological dimensions that may be needed for describing 3D video QoE, and the third investigated the visibility and annoyance of crosstalk [3]. The results show that the video quality scale could be sufficient for evaluating the S3D video experience for coding and spatial resolution reduction distortions. It was also confirmed that, with a more complex mixture of degradations, more than one scale should be used to capture the QoE. The crosstalk study found a linear relationship between the perceived crosstalk and the amount of crosstalk.
Abstract:
In order to cater for users' quality of experience (QoE) requirements, video services based on HTTP adaptive streaming (HAS) have become popular recently. User QoE feedback can be instrumental in improving the capabilities of such services. Perceptual quality experiments that involve humans are considered to be the most valid method for assessing QoE. Besides lab-based subjective experiments, crowdsourcing-based subjective assessment of video quality is gaining popularity as an alternative method. This paper presents insights into a study that investigates perceptual preferences of various adaptive video streaming scenarios through crowdsourcing-based subjective quality assessment.
Abstract:
We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimation of the decoding delay is essential to achieve a minimum communication latency. As opposed to single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views require a systematic analysis of this decoding delay, which we carry out using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation on general-purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder, and we provide an iterative algorithm to jointly compute frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
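The graph-based view of decoding delay can be illustrated with a small sketch. It treats the prediction structure as a directed acyclic graph and computes the earliest time each frame can finish decoding. This is a simplification and not the framework of the paper: it assumes fixed processing times and as many cores as needed, whereas the abstract states that processing times depend on decoder load. The function name `completion_times` and the frame labels are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def completion_times(deps, proc_time, arrival):
    """Earliest time at which each frame finishes decoding.

    deps[f]      -> frames that f is predicted from (decoded before f)
    proc_time[f] -> time needed to decode f once it can start (assumed fixed)
    arrival[f]   -> time at which the coded frame is received
    """
    finish = {}
    # static_order() yields every frame after all of its reference frames.
    for frame in TopologicalSorter(deps).static_order():
        ready = max([arrival[frame]] + [finish[d] for d in deps.get(frame, ())])
        finish[frame] = ready + proc_time[frame]
    return finish


# Two views, two time instants; view 1 is predicted from view 0.
deps = {
    "v1t0": ["v0t0"],
    "v0t1": ["v0t0"],
    "v1t1": ["v0t1", "v1t0"],
}
proc_time = {"v0t0": 3, "v1t0": 2, "v0t1": 2, "v1t1": 2}
arrival = {"v0t0": 0, "v1t0": 0, "v0t1": 4, "v1t1": 4}

finish = completion_times(deps, proc_time, arrival)
# Per-frame delay under this simplified model: finish time minus arrival time.
delay = {f: finish[f] - arrival[f] for f in finish}
```

Under these assumptions the delay of a frame is governed by the longest chain of reference frames leading to it, which is why inter-view prediction tends to increase latency compared with decoding each view independently.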
Abstract:
Low-cost systems that can obtain a high-quality foreground segmentation almost independently of the existing illumination conditions for indoor environments are very desirable, especially for security and surveillance applications. In this paper, a novel foreground segmentation algorithm that uses only a Kinect depth sensor is proposed to satisfy the aforementioned system characteristics. This is achieved by combining a mixture of Gaussians-based background subtraction algorithm with a new Bayesian network that robustly predicts the foreground/background regions between consecutive time steps. The Bayesian network explicitly exploits the intrinsic characteristics of the depth data by means of two dynamic models that estimate the spatial and depth evolution of the foreground/background regions. The most remarkable contribution is the depth-based dynamic model that predicts the changes in the foreground depth distribution between consecutive time steps. This is a key difference with regard to visible imagery, where the color/gray distribution of the foreground is typically assumed to be constant. Experiments carried out on two different depth-based databases demonstrate that the proposed combination of algorithms is able to obtain a more accurate segmentation of the foreground/background than other state-of-the-art approaches.
Abstract:
HTTP adaptive streaming (HAS) has become a reliable distribution technology offering significant advantages in terms of both user-perceived Quality of Experience (QoE) and resource utilization for content and network service providers. By trading off video quality, HAS is able to adapt to the available bandwidth and display requirements so that it can deliver video content to a variety of devices over the Internet. However, so far there is insufficient knowledge of how the adaptation techniques affect the end user's visual experience. Therefore, this paper presents a comparative analysis of different bitrate adaptation strategies in adaptive streaming of monoscopic and stereoscopic video. This has been done through a subjective experiment that tests the end-user response to video quality variations, taking visual comfort into consideration. The experimental outcomes provide good insight into the factors that can influence the QoE of different adaptation strategies.
Abstract:
Vision-based object detection from a moving platform becomes particularly challenging in the field of advanced driver assistance systems (ADAS). In this context, onboard vision-based vehicle verification strategies become critical, facing challenges derived from the variability of vehicle appearance, illumination, and vehicle speed. In this paper, an optimized HOG configuration for onboard vehicle verification is proposed which considers not only its spatial and orientation resolution, but also descriptor processing strategies and classification. An in-depth analysis of the optimal HOG settings for onboard vehicle verification is presented, in the context of SVM classification with different kernels. In contrast to many existing approaches, the evaluation is carried out on a public and heterogeneous database of vehicle and non-vehicle images in different areas of the road, yielding excellent verification rates that outperform other similar approaches in the literature.
Abstract:
The importance of vision-based systems for Sense-and-Avoid is increasing as remotely piloted and autonomous UAVs become part of the non-segregated airspace. The development and evaluation of these systems demand flight scenario images, which are expensive and risky to obtain. Augmented Reality techniques now allow real flight scenario images to be composited with 3D aircraft models to produce useful realistic images for system development and benchmarking purposes at a much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are first positioned in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm which fuses the images obtained from the 3D scene with images from real UAV flights, taking into account on-board camera vibrations. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations allow the development and quantitative evaluation of aircraft detection and tracking algorithms. This paper presents the software developed to create a public dataset of 24 videos together with their annotations and some tracking application results.
Abstract:
Video Quality Assessment needs to correspond to human perception. Pixel-based metrics (PSNR or MSE) fail in many circumstances because they do not take into account the spatio-temporal properties of human visual perception. In this paper we propose a new pixel-weighted method to improve video quality metrics for artifact evaluation. The method applies a psychovisual model based on motion, level of detail, pixel location and the appearance of human faces, which brings the quality estimate closer to the human eye's response. Subjective tests were conducted to tune the psychovisual model and to demonstrate the noticeable improvement obtained when pixels are weighted according to the analyzed factors instead of being treated equally. The analysis demonstrates the need for models adapted to the specific way content is viewed, and the model represents an advance in quality estimation for sequences in which a specific artifact is analyzed.
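To make the idea of pixel weighting concrete, the sketch below computes a PSNR from a weighted mean squared error, where each pixel's squared error is scaled by a perceptual weight. It is only an illustration under assumptions: the function name `weighted_psnr`, the list-of-rows image representation and the example weights are hypothetical, and the actual psychovisual model (motion, detail, location, faces) that would produce the weight map is not reproduced here.

```python
import math


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR computed from a weighted mean squared error.

    reference, distorted, weights: equally sized 2D lists (rows of pixels).
    A larger weight marks a pixel as perceptually more important.
    """
    weighted_error = 0.0
    total_weight = 0.0
    for ref_row, dis_row, w_row in zip(reference, distorted, weights):
        for r, d, w in zip(ref_row, dis_row, w_row):
            weighted_error += w * (r - d) ** 2
            total_weight += w
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    wmse = weighted_error / total_weight
    # Identical images give zero error, i.e. an unbounded PSNR.
    return math.inf if wmse == 0 else 10.0 * math.log10(peak ** 2 / wmse)


reference = [[100, 100], [100, 100]]
distorted = [[110, 100], [100, 100]]      # one pixel is off by 10 levels
uniform = [[1, 1], [1, 1]]                # reduces to the ordinary PSNR
salient = [[3, 1], [1, 1]]                # the damaged pixel matters more

plain_score = weighted_psnr(reference, distorted, uniform)
salient_score = weighted_psnr(reference, distorted, salient)
```

With uniform weights the function reduces to the conventional PSNR. When the damaged pixel lies in a highly weighted region, the score drops, which is the behavior a perceptually weighted metric is meant to capture.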
Abstract:
This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by the users of 3D video systems, analyzing the impact of effects introduced by all the elements of the 3D video processing chain. To this end, various subjective assessment tests are presented, specifically designed to evaluate the systems under consideration and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is presented, based on evaluating typical degradations that may appear during content creation, for instance due to incorrect camera calibration or video processing algorithms (e.g., 2D to 3D conversion).
Moreover, the generation of a high-quality dataset of 3D stereoscopic videos is described; the dataset is freely available to the research community and has already been widely used in different works related to 3D video. In addition, another inter-laboratory subjective study is presented, analyzing the impact of coding impairments and representation formats of stereoscopic video. Also, three subjective tests are presented studying the effects of transmission events that take place in Internet Protocol Television (IPTV) networks and adaptive streaming scenarios for 3D video. For these cases, a novel subjective evaluation methodology, called Content-Immersive Evaluation of Transmission Impairments (CIETI), was proposed. It was specifically designed to evaluate transmission events while simulating realistic home-viewing conditions, in order to obtain more representative conclusions about the visual experience of the end users. Finally, two subjective experiments are presented, comparing various current 3D displays available in the consumer market and evaluating perceptual factors of Super Multiview Video (SMV) systems, which are expected to be the future technology for consumer 3D displays thanks to a promising visualization of 3D content without specific glasses. The work presented in this thesis has made it possible to understand perceptual and technical factors related to the processing and visualization of 3D video content, which may be useful in the development of new technologies and approaches for QoE evaluation, both subjective methodologies and objective metrics.