950 results for 3D Video Telecommunication Multimedia
Abstract:
The main aim of this thesis is to characterize the content of 3D video. As a first step, the spatial and temporal complexity of 3D content was studied using the conventional techniques applied to 2D video. In particular, Spatial Information (SI) and Temporal Information (TI) are the two indicators used to characterize the spatial and temporal content of 3D video. A complete description of 3D video must also characterize depth. To this end, new depth indicators were proposed, based on statistical evaluation of depth-map histograms. The first depth indicator is based on the mean and standard deviation of the data distribution in the depth map. Another metric proposed in this work estimates depth by computing the entropy of the depth map. Finally, the fourth algorithm implemented combines a thresholding technique with an analysis of the residual histogram values, computing their kurtosis index. The proposed algorithms were tested by comparing the metrics proposed in this work with earlier ones, and also against the results of subjective tests. The experimental results show the effectiveness of the proposed solutions in assessing depth in 3D video. Finally, one of the new indicators was applied to a database of 3D videos to complete the characterization of 3D content.
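The histogram statistics named in this abstract can be sketched as follows. This is only an illustrative sketch, not the thesis implementation: the function name `depth_indicators`, the 2D-list depth-map representation, the simple "keep values above a threshold" rule, and the use of Pearson (non-excess) kurtosis over the surviving depth values are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def _kurtosis(values):
    """Pearson kurtosis m4 / m2^2; NaN if undefined (hypothetical helper)."""
    n = len(values)
    if n < 2:
        return float("nan")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return float("nan")
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


def depth_indicators(depth_map, threshold=0):
    """Compute simple statistics of a depth map given as a 2D list of ints.

    `threshold` is an assumed parameter: values at or below it are
    discarded before the kurtosis is computed.
    """
    values = [v for row in depth_map for v in row]
    n = len(values)

    # Mean and standard deviation of the depth distribution.
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

    # Shannon entropy (in bits) of the normalized depth histogram.
    hist = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in hist.values())

    # Kurtosis of the values that survive thresholding.
    kept = [v for v in values if v > threshold]

    return {
        "mean": mean,
        "std": std,
        "entropy": entropy,
        "kurtosis": _kurtosis(kept),
    }
```

Calling `depth_indicators(frame_depth)` on one depth map per frame gives per-frame values. How these are aggregated over a sequence is not stated in the abstract and is left open here.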
Abstract:
3D video technologies have been on the rise in recent years, with abundant research advances coupled with widespread adoption by the film industry and growing importance in consumer electronics. Related to this is the concept of multiview video, which encompasses 3D video and can be defined as a video stream composed of two or more views. Multiview video enables advanced video features such as stereoscopic video, free viewpoint video, improved eye contact through virtual views, and shared virtual environments. The purpose of this thesis is to overcome a considerable obstacle to the use of multiview video in communication systems: the lack of support for this technology in existing signaling protocols, which makes it impossible to set up a session with multiview video through standard mechanisms. Our main objective is therefore to extend the Session Initiation Protocol (SIP) to support the negotiation of multimedia sessions with multiview video streams. Our work can be summarized in three main contributions. First, we have defined a signaling extension for setting up SIP sessions with 3D video. This extension modifies the Session Description Protocol (SDP) to introduce a new media-level attribute and a new decoding dependency type, which help describe the 3D video formats that can be used in a session, as well as the relationship between the video streams that make up a 3D video stream. The second contribution is an extension to SIP for handling the signaling of videoconferences with multiview video streams. Two new SIP event packages are defined to describe the capabilities and topology of conference terminals, on the one hand, and the spatial configuration and stream mapping of a conference, on the other. A mechanism for integrating the exchange of this information into the SIP conference setup process is also described. As the third and final contribution, we introduce the concept of the virtual space of a conference: a coordinate system that includes all the relevant objects of the conference (such as capture devices, displays, and users). We explain how the virtual space relates to conference features such as eye contact, video scale, and spatial fidelity, and we provide rules for determining the features of a conference from the analysis of its virtual space, and for generating virtual spaces during conference setup.
Abstract:
This paper presents an empirical study of affine invariant feature detectors for matching on video sequences of people with non-rigid surface deformation. Recent advances in feature detection and wide baseline matching have focused on static scenes. Video frames of human movement capture highly non-rigid deformation such as loose hair, cloth creases, skin stretching and free flowing clothing. This study evaluates the performance of six widely used feature detectors for sparse temporal correspondence on single view and multiple view video sequences. Both the number of features detected and their temporal matching are evaluated quantitatively, with and without ground truth correspondence. Recall-accuracy analysis of feature matching is reported for temporal correspondence on single view and multiple view sequences of people with variation in clothing and movement. This analysis shows that existing feature detection and matching algorithms are unreliable for fast movement with common clothing.
Abstract:
Visual fixation is employed by humans and some animals to keep a specific 3D location at the center of the visual gaze. Inspired by this phenomenon in nature, this paper explores the idea of transferring this mechanism to video stabilization for a handheld video camera. A novel approach is presented that stabilizes a video by fixating on automatically extracted 3D target points. This approach differs from existing automatic solutions, which stabilize the video by smoothing. To determine the 3D target points, the recorded scene is analyzed with a state-of-the-art structure-from-motion algorithm, which estimates camera motion and reconstructs a 3D point cloud of the static scene objects. Special algorithms are presented that search for either virtual or real 3D target points that back-project close to the center of the image for as long a period of time as possible. The stabilization algorithm then transforms the original images of the sequence so that these 3D target points are kept exactly in the center of the image, which, in the case of real 3D target points, produces a perfectly stable result at the image center. Furthermore, different methods of additional user interaction are investigated. It is shown that the stabilization process can easily be controlled and that it can be combined with state-of-the-art tracking techniques to obtain a powerful image stabilization tool. The approach is evaluated on a variety of videos taken with a handheld camera in natural scenes.
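The core idea of keeping a 3D target point at the image center can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's algorithm: it assumes a pinhole camera with a single focal length `f` and principal point at the image center, takes the per-frame pose (`R`, `t`) as given (in the paper it would come from structure-from-motion), and compensates with a plain 2D translation, whereas the abstract does not say which image transformation is actually used. The function names `project` and `fixation_shift` are hypothetical.

```python
def project(point, R, t, f, cx, cy):
    """Project a 3D world point into pixel coordinates with a pinhole camera.

    R is a 3x3 rotation (list of rows), t a 3-vector, f the focal length in
    pixels, and (cx, cy) the principal point. All are assumed inputs.
    """
    # World -> camera coordinates: Xc = R * X + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("target point is behind the camera")
    # Perspective division and shift to pixel coordinates.
    u = f * xc[0] / xc[2] + cx
    v = f * xc[1] / xc[2] + cy
    return u, v


def fixation_shift(point, R, t, f, width, height):
    """Return the (dx, dy) translation that moves the projected target
    point onto the image center for one frame."""
    cx, cy = width / 2.0, height / 2.0
    u, v = project(point, R, t, f, cx, cy)
    return cx - u, cy - v


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    shift = fixation_shift((0.1, -0.05, 2.0), identity, (0.0, 0.0, 0.0),
                           f=1000.0, width=640, height=480)
    print(shift)
```

Applying the returned shift to every frame keeps the chosen target at the center. A pure translation ignores perspective effects away from the center, which is one reason a real system would likely use a richer warp.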
Abstract:
Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze those factors which will influence the quality of experience in those 3D video services, focusing on effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.
Abstract:
A frame-level distortion model based on perceptual features of the human visual system is proposed to improve the performance of unequal error protection strategies and provide better quality of experience to users in Side-by-Side 3D video delivery systems.
Abstract:
This paper gives an overview of three recent studies by the authors on the topic of 3D video Quality of Experience (QoE). Two of the studies [1,2] investigated the different psychological dimensions that may be needed to describe 3D video QoE, and the third investigated the visibility and annoyance of crosstalk [3]. The results show that the video quality scale could be sufficient for evaluating the S3D video experience for coding and spatial resolution reduction distortions. It was also confirmed that, with a more complex mixture of degradations, more than one scale should be used to capture the QoE. The crosstalk study found a linear relationship between the perceived crosstalk and the amount of crosstalk.
Abstract:
This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by the users of 3D video systems, analyzing the impact of the effects introduced by all the elements of the 3D video processing chain. To this end, several subjective assessment tests are presented, specifically designed to evaluate the systems under consideration and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is presented that evaluates typical degradations that may appear during content creation, for instance due to incorrect camera calibration or video processing algorithms (e.g., 2D to 3D conversion). Moreover, the process of generating a high-quality dataset of stereoscopic 3D videos is described; the dataset is freely available to the research community and has already been widely used in different works related to 3D video. In addition, an inter-laboratory subjective study is presented that analyzes the impact of coding impairments and of different representation formats of stereoscopic video. Three further subjective tests study the effects of transmission events that take place in Internet Protocol Television (IPTV) networks and in adaptive streaming scenarios for 3D video. For these cases, a novel subjective evaluation methodology, called Content-Immersive Evaluation of Transmission Impairments (CIETI), was proposed. It was specifically designed to evaluate transmission events while simulating realistic home-viewing conditions, in order to obtain more representative conclusions about the visual experience of end users. Finally, two subjective experiments are presented: one compares several current 3D displays available in the consumer market, and the other evaluates perceptual factors of Super Multiview Video (SMV) systems, which are expected to be the future technology for consumer 3D displays thanks to their promising visualization of 3D content without special glasses. The work presented in this thesis has made it possible to understand perceptual and technical factors related to the processing and visualization of 3D video content, which may be useful in the development of new technologies and approaches for QoE evaluation, both subjective methodologies and objective metrics.
Abstract:
Objective: To evaluate the efficacy of vision therapy in two cases of intermittent exotropia (IX(T)) by complementing the clinical examination with 3-D video-oculography (VOG), and to show the potential applicability of this technology for this purpose. Methods: We report the binocular alignment changes occurring after vision therapy in a 36-year-old woman with an IX(T) of 25 prism diopters (Δ) at far and 18 Δ at near, and in a 10-year-old child with 8 Δ of IX(T) in primary position associated with 6 Δ of left eye hypotropia. Both patients presented good corrected visual acuity in both eyes. VOG analysis showed instability of the ocular deviation and also revealed vertical and torsional components. Binocular vision therapy was prescribed and performed, including different types of vergence, accommodation, and diplopia awareness training. Results: After therapy, excellent ranges of fusional vergence and a “to-the-nose” near point of convergence were obtained. The 3-D VOG examination (SensoMotoric Instruments, Teltow, Germany) confirmed the compensation of the deviation with a high level of stability of binocular alignment. Significant improvement was observed after therapy in the vertical and torsional components, which became more stable. Patients were very satisfied with the outcome of vision therapy. Conclusion: 3-D VOG is a useful technique for providing an objective record of the compensation of the ocular deviation and of the stability of binocular alignment achieved after vision therapy in cases of IX(T), and it provides a detailed analysis of vertical and torsional improvements.