61 resultados para 3D Video Telecommunication Multimedia
em Universidad Politécnica de Madrid
Resumo:
Las tecnologías de vídeo en 3D han estado al alza en los últimos años, con abundantes avances en investigación unidos a una adopción generalizada por parte de la industria del cine, y una importancia creciente en la electrónica de consumo. Relacionado con esto, está el concepto de vídeo multivista, que abarca el vídeo 3D, y puede definirse como un flujo de vídeo compuesto de dos o más vistas. El vídeo multivista permite prestaciones avanzadas de vídeo, como el vídeo estereoscópico, el “free viewpoint video”, contacto visual mejorado mediante vistas virtuales, o entornos virtuales compartidos. El propósito de esta tesis es salvar un obstáculo considerable de cara al uso de vídeo multivista en sistemas de comunicación: la falta de soporte para esta tecnología por parte de los protocolos de señalización existentes, que hace imposible configurar una sesión con vídeo multivista mediante mecanismos estándar. Así pues, nuestro principal objetivo es la extensión del Protocolo de Inicio de Sesión (SIP) para soportar la negociación de sesiones multimedia con flujos de vídeo multivista. Nuestro trabajo se puede resumir en tres contribuciones principales. En primer lugar, hemos definido una extensión de señalización para configurar sesiones SIP con vídeo 3D. Esta extensión modifica el Protocolo de Descripción de Sesión (SDP) para introducir un nuevo atributo de nivel de medios, y un nuevo tipo de dependencia de descodificación, que contribuyen a describir los formatos de vídeo 3D que pueden emplearse en una sesión, así como la relación entre los flujos de vídeo que componen un flujo de vídeo 3D. La segunda contribución consiste en una extensión a SIP para manejar la señalización de videoconferencias con flujos de vídeo multivista. Se definen dos nuevos paquetes de eventos SIP para describir las capacidades y topología de los terminales de conferencia, por un lado, y la configuración espacial y mapeo de flujos de una conferencia, por el otro. También se describe un mecanismo para integrar el intercambio de esta información en el proceso de inicio de una conferencia SIP. Como tercera y última contribución, introducimos el concepto de espacio virtual de una conferencia, o un sistema de coordenadas que incluye todos los objetos relevantes de la conferencia (como dispositivos de captura, pantallas, y usuarios). Explicamos cómo el espacio virtual se relaciona con prestaciones de conferencia como el contacto visual, la escala de vídeo y la fidelidad espacial, y proporcionamos reglas para determinar las prestaciones de una conferencia a partir del análisis de su espacio virtual, y para generar espacios virtuales durante la configuración de conferencias.
Resumo:
Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze those factors which will influence the quality of experience in those 3D video services, focusing on effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.
Resumo:
A frame-level distortion model based on perceptual features of the human visual system is proposed to improve the performance of unequal error protection strategies and provide better quality of experience to users in Side-by-Side 3D video delivery systems.
Resumo:
This paper gives an overview of three recent studies by the authors on the topic of 3D video Quality of Experience (QoE). Two of studies [1,2] investigated different psychological dimension that may be needed for describing 3D video QoE and the third the visibility and annoyance of crosstalk[3]. The results shows that the video quality scale could be sufficient for evaluating S3D video experience for coding and spatial resolution reduction distortions. It was also confirmed that with a more complex mixture of degradations more than one scale should be used to capture the QoE in these cases. The study found a linear relationship between the perceived crosstalk and the amount of crosstalk.
Resumo:
Esta tesis presenta un estudio exhaustivo sobre la evaluación de la calidad de experiencia (QoE, del inglés Quality of Experience) percibida por los usuarios de sistemas de vídeo 3D, analizando el impacto de los efectos introducidos por todos los elementos de la cadena de procesamiento de vídeo 3D. Por lo tanto, se presentan varias pruebas de evaluación subjetiva específicamente diseñadas para evaluar los sistemas considerados, teniendo en cuenta todos los factores perceptuales relacionados con la experiencia visual tridimensional, tales como la percepción de profundidad y la molestia visual. Concretamente, se describe un test subjetivo basado en la evaluación de degradaciones típicas que pueden aparecer en el proceso de creación de contenidos de vídeo 3D, por ejemplo debidas a calibraciones incorrectas de las cámaras o a algoritmos de procesamiento de la señal de vídeo (p. ej., conversión de 2D a 3D). Además, se presenta el proceso de generación de una base de datos de vídeos estereoscópicos de alta calidad, disponible gratuitamente para la comunidad investigadora y que ha sido utilizada ampliamente en diferentes trabajos relacionados con vídeo 3D. Asimismo, se presenta otro estudio subjetivo, realizado entre varios laboratorios, con el que se analiza el impacto de degradaciones causadas por la codificación de vídeo, así como diversos formatos de representación de vídeo 3D. Igualmente, se describen tres pruebas subjetivas centradas en el estudio de posibles efectos causados por la transmisión de vídeo 3D a través de redes de televisión sobre IP (IPTV, del inglés Internet Protocol Television) y de sistemas de streaming adaptativo de vídeo. Para estos casos, se ha propuesto una innovadora metodología de evaluación subjetiva de calidad vídeo, denominada Content-Immersive Evaluation of Transmission Impairments (CIETI), diseñada específicamente para evaluar eventos de transmisión simulando condiciones realistas de visualización de vídeo en ámbitos domésticos, con el fin de obtener conclusiones más representativas sobre la experiencia visual de los usuarios finales. Finalmente, se exponen dos experimentos subjetivos comparando varias tecnologías actuales de televisores 3D disponibles en el mercado de consumo y evaluando factores perceptuales de sistemas Super Multiview Video (SMV), previstos a ser la tecnología futura de televisores 3D de consumo, gracias a una prometedora visualización de contenido 3D sin necesidad de gafas específicas. El trabajo presentado en esta tesis ha permitido entender los factores perceptuales y técnicos relacionados con el procesamiento y visualización de contenidos de vídeo 3D, que pueden ser de utilidad en el desarrollo de nuevas tecnologías y técnicas de evaluación de la QoE, tanto metodologías subjetivas como métricas objetivas. ABSTRACT This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by the users of 3D video systems, analyzing the impact of effects introduced by all the elements of the 3D video processing chain. Therefore, various subjective assessment tests are presented, particularly designed to evaluate the systems under consideration, and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is presented, based on evaluating typical degradations that may appear during the content creation, for instance due to incorrect camera calibration or video processing algorithms (e.g., 2D to 3D conversion). Moreover, the process of generation of a high-quality dataset of 3D stereoscopic videos is described, which is freely available for the research community, and has been already widely used in different works related with 3D video. In addition, another inter-laboratory subjective study is presented analyzing the impact of coding impairments and representation formats of stereoscopic video. Also, three subjective tests are presented studying the effects of transmission events that take place in Internet Protocol Television (IPTV) networks and adaptive streaming scenarios for 3D video. For these cases, a novel subjective evaluation methodology, called Content-Immersive Evaluation of Transmission Impairments (CIETI), was proposed, which was especially designed to evaluate transmission events simulating realistic home-viewing conditions, to obtain more representative conclusions about the visual experience of the end users. Finally, two subjective experiments are exposed comparing various current 3D displays available in the consumer market, and evaluating perceptual factors of Super Multiview Video (SMV) systems, expected to be the future technology for consumer 3D displays thanks to a promising visualization of 3D content without specific glasses. The work presented in this thesis has allowed to understand perceptual and technical factors related to the processing and visualization of 3D video content, which may be useful in the development of new technologies and approaches for QoE evaluation, both subjective methodologies and objective metrics.
Resumo:
Immersion and interaction have been identified as key factors influencing the quality of experience in stereoscopic video systems. The work presented here aims to create a new paradigm for 3D Multimedia consumption exploiting these factors in order to increase user involvement. We use a 5-sided CAVETM environment to support 3D panoramic video reproduction, real-time insertion of synthetic objects into the three-dimensional scene and real-time user interaction with the inserted elements. In this paper we describe our system requirements, functionalities, conceptual design and preliminary implementation results emphasizing the most relevant challenges accomplished. The focus is on three main issues: the generation of stereoscopic video panoramas; the synchronous reproduction of immersive 3D video across multiple screens; and, the real-time insertion algorithm implemented for the integration of synthetic objects into the stereoscopic video. These results have been successfully integrated into the graphic engine managing the operation of the CAVETM infrastructure.
Experimental Prototype Merging Stereo Panoramic Video and Interactive 3D Content in a 5-sided CAVETM
Resumo:
Immersion and interaction have been identified as key factors influencing the quality of experience in stereoscopic video systems. An experimental prototype designed to explore the influence of these factors in 3D video applications is described here1. The focus is on the real-time insertion algorithm of new 3D models into the original video streams. Using this algorithm, our prototype is aimed to explore a new interaction paradigm ? similar to the augmented reality approach ? with 3D video applications.
Resumo:
Esta tesis estudia la monitorización y gestión de la Calidad de Experiencia (QoE) en los servicios de distribución de vídeo sobre IP. Aborda el problema de cómo prevenir, detectar, medir y reaccionar a las degradaciones de la QoE desde la perspectiva de un proveedor de servicios: la solución debe ser escalable para una red IP extensa que entregue flujos individuales a miles de usuarios simultáneamente. La solución de monitorización propuesta se ha denominado QuEM(Qualitative Experience Monitoring, o Monitorización Cualitativa de la Experiencia). Se basa en la detección de las degradaciones de la calidad de servicio de red (pérdidas de paquetes, disminuciones abruptas del ancho de banda...) e inferir de cada una una descripción cualitativa de su efecto en la Calidad de Experiencia percibida (silencios, defectos en el vídeo...). Este análisis se apoya en la información de transporte y de la capa de abstracción de red de los flujos codificados, y permite caracterizar los defectos más relevantes que se observan en este tipo de servicios: congelaciones, efecto de “cuadros”, silencios, pérdida de calidad del vídeo, retardos e interrupciones en el servicio. Los resultados se han validado mediante pruebas de calidad subjetiva. La metodología usada en esas pruebas se ha desarrollado a su vez para imitar lo más posible las condiciones de visualización de un usuario de este tipo de servicios: los defectos que se evalúan se introducen de forma aleatoria en medio de una secuencia de vídeo continua. Se han propuesto también algunas aplicaciones basadas en la solución de monitorización: un sistema de protección desigual frente a errores que ofrece más protección a las partes del vídeo más sensibles a pérdidas, una solución para minimizar el impacto de la interrupción de la descarga de segmentos de Streaming Adaptativo sobre HTTP, y un sistema de cifrado selectivo que encripta únicamente las partes del vídeo más sensibles. También se ha presentado una solución de cambio rápido de canal, así como el análisis de la aplicabilidad de los resultados anteriores a un escenario de vídeo en 3D. ABSTRACT This thesis proposes a comprehensive approach to the monitoring and management of Quality of Experience (QoE) in multimedia delivery services over IP. It addresses the problem of preventing, detecting, measuring, and reacting to QoE degradations, under the constraints of a service provider: the solution must scale for a wide IP network delivering individual media streams to thousands of users. The solution proposed for the monitoring is called QuEM (Qualitative Experience Monitoring). It is based on the detection of degradations in the network Quality of Service (packet losses, bandwidth drops...) and the mapping of each degradation event to a qualitative description of its effect in the perceived Quality of Experience (audio mutes, video artifacts...). This mapping is based on the analysis of the transport and Network Abstraction Layer information of the coded stream, and allows a good characterization of the most relevant defects that exist in this kind of services: screen freezing, macroblocking, audio mutes, video quality drops, delay issues, and service outages. The results have been validated by subjective quality assessment tests. The methodology used for those test has also been designed to mimic as much as possible the conditions of a real user of those services: the impairments to evaluate are introduced randomly in the middle of a continuous video stream. Based on the monitoring solution, several applications have been proposed as well: an unequal error protection system which provides higher protection to the parts of the stream which are more critical for the QoE, a solution which applies the same principles to minimize the impact of incomplete segment downloads in HTTP Adaptive Streaming, and a selective scrambling algorithm which ciphers only the most sensitive parts of the media stream. A fast channel change application is also presented, as well as a discussion about how to apply the previous results and concepts in a 3D video scenario.
Resumo:
We present a novel framework for encoding latency analysis of arbitrary multiview video coding prediction structures. This framework avoids the need to consider an specific encoder architecture for encoding latency analysis by assuming an unlimited processing capacity on the multiview encoder. Under this assumption, only the influence of the prediction structure and the processing times have to be considered, and the encoding latency is solved systematically by means of a graph model. The results obtained with this model are valid for a multiview encoder with sufficient processing capacity and serve as a lower bound otherwise. Furthermore, with the objective of low latency encoder design with low penalty on rate-distortion performance, the graph model allows us to identify the prediction relationships that add higher encoding latency to the encoder. Experimental results for JMVM prediction structures illustrate how low latency prediction structures with a low rate-distortion penalty can be derived in a systematic manner using the new model.
Resumo:
There is an increasing need of easy and affordable technologies to automatically generate virtual 3D models from their real counterparts. In particular, 3D human reconstruction has driven the creation of many clever techniques, most of them based on the visual hull (VH) concept. Such techniques do not require expensive hardware; however, they tend to yield 3D humanoids with realistic bodies but mediocre faces, since VH cannot handle concavities. On the other hand, structured light projectors allow to capture very accurate depth data, and thus to reconstruct realistic faces, but they are too expensive to use several of them. We have developed a technique to merge a VH-based 3D mesh of a reconstructed humanoid and the depth data of its face, captured by a single structured light projector. By combining the advantages of both systems in a simple setting, we are able to reconstruct realistic 3D human models with believable faces.
Resumo:
We present an adaptive unequal error protection (UEP) strategy built on the 1-D interleaved parity Application Layer Forward Error Correction (AL-FEC) code for protecting the transmission of stereoscopic 3D video content encoded with Multiview Video Coding (MVC) through IP-based networks. Our scheme targets the minimization of quality degradation produced by packet losses during video transmission in time-sensitive application scenarios. To that end, based on a novel packet-level distortion model, it selects in real time the most suitable packets within each Group of Pictures (GOP) to be protected and the most convenient FEC technique parameters, i.e., the size of the FEC generator matrix. In order to make these decisions, it considers the relevance of the packet, the behavior of the channel, and the available bitrate for protection purposes. Simulation results validate both the distortion model introduced to estimate the importance of packets and the optimization of the FEC technique parameter values.
Resumo:
Automatic 2D-to-3D conversion is an important application for filling the gap between the increasing number of 3D displays and the still scant 3D content. However, existing approaches have an excessive computational cost that complicates its practical application. In this paper, a fast automatic 2D-to-3D conversion technique is proposed, which uses a machine learning framework to infer the 3D structure of a query color image from a training database with color and depth images. Assuming that photometrically similar images have analogous 3D structures, a depth map is estimated by searching the most similar color images in the database, and fusing the corresponding depth maps. Large databases are desirable to achieve better results, but the computational cost also increases. A clustering-based hierarchical search using compact SURF descriptors to characterize images is proposed to drastically reduce search times. A significant computational time improvement has been obtained regarding other state-of-the-art approaches, maintaining the quality results.
Resumo:
This paper presents a comparison among different consumer 3D display technologies by means of a subjective assessment test. Therefore, four 55-in displays have been considered: one autostereoscopic display, one stereoscopic with polarized passive glasses, and two with active shutter glasses. In addition, a high-quality 3D video database has been used to show diverse material with both views in high definition. To carry out the test, standard recommendations have been followed considering also some modifications looking for a test environment more similar to real home viewing conditions, with the objective of obtaining more representative conclusions. Moreover, several perceptual factors have been considered to study the performance of the displays, such as picture quality, depth perception, and visual discomfort. The obtained results show interesting issues, like the performance improvement of active shutter glasses technology, the high performance of the polarized glasses technology in terms of quality and comfort, and the need of improvement of the autostereoscopic displays to complement the visual comfort to reach a global high-quality visual experience.
Resumo:
Although 3DTV has led the evolution of television market, its delivery by broadcast networks is still small. Now, 3DTV transmis-sions are usually done by combining both views into one common frame (side by side) to be able to use standard HDTV transmission equipment. Today, orthogonal subsampling is mostly used, but other alternatives will appear soon. Here, different subsampling schemes for both progressive and interlaced 3DTV are considered. For each possible scheme, its pre-served frequency content is analyzed and a simple interpolation filter is designed. The analysis is carried out for progressive and interlaced video and the designed filters are applied on different sequences, showing the advantages and disadvantages of the different options
Resumo:
Recently, broadcasted 3D video content has reached households with the first generation of 3DTV. However, few studies have been done to analyze the Quality of Experience (QoE) perceived by the end-users in this scenario. This paper studies the impact of trans- mission errors in 3DTV, considering that the video is delivered in side-by-side format over a conventional packet-based network. For this purpose, a novel evaluation methodology based on standard sin- gle stimulus methods and with the aim of keeping as close as pos- sible the home environment viewing conditions has been proposed. The effects of packet losses in monoscopic and stereoscopic videos are compared from the results of subjective assessment tests. Other aspects were also measured concerning 3D content as naturalness, sense of presence and visual fatigue. The results show that although the final perceived QoE is acceptable, some errors cause important binocular rivalry, and therefore, substantial visual discomfort.