67 results for Video semantics


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Nowadays, HTTP adaptive streaming (HAS) has become a reliable distribution technology offering significant advantages in terms of both user-perceived Quality of Experience (QoE) and resource utilization for content and network service providers. By trading off video quality, HAS adapts to the available bandwidth and display requirements so that it can deliver video content to a variety of devices over the Internet. However, little is yet known about how adaptation techniques affect the end user's visual experience. Therefore, this paper presents a comparative analysis of different bitrate adaptation strategies in adaptive streaming of monoscopic and stereoscopic video. The analysis is based on a subjective experiment that tests end-user response to video quality variations, taking visual comfort into account. The experimental outcomes provide good insight into the factors that can influence the QoE of different adaptation strategies.
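As an illustration of what a bitrate adaptation strategy does, the following is a minimal sketch of a throughput-based rule. It is not one of the strategies compared in the paper; the function name `select_bitrate`, the bitrate ladder, the throughput trace, and the 0.8 safety factor are all assumptions made for the example.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest representation that fits within a safety fraction
    of the measured throughput; fall back to the lowest representation."""
    budget = throughput_kbps * safety
    candidates = [b for b in sorted(ladder_kbps) if b <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)


# Hypothetical bitrate ladder and per-chunk throughput measurements (kbps).
ladder = [500, 1000, 2500, 5000]
throughput_trace = [6500, 3000, 900, 400, 4000]

chosen = [select_bitrate(ladder, t) for t in throughput_trace]
print(chosen)  # [5000, 1000, 500, 500, 2500]
```

A rule like this reacts to every throughput change, so it can switch quality abruptly. How viewers perceive such switches is the kind of question a subjective experiment addresses.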

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Vision-based object detection from a moving platform is particularly challenging in the field of advanced driver assistance systems (ADAS). In this context, onboard vision-based vehicle verification strategies become critical, facing challenges derived from the variability of vehicle appearance, illumination, and vehicle speed. In this paper, an optimized HOG configuration for onboard vehicle verification is proposed that considers not only the spatial and orientation resolution of the descriptor, but also descriptor processing strategies and classification. An in-depth analysis of the optimal HOG settings for onboard vehicle verification is presented, in the context of SVM classification with different kernels. In contrast to many existing approaches, the evaluation is carried out on a public and heterogeneous database of vehicle and non-vehicle images from different areas of the road, yielding excellent verification rates that outperform other similar approaches in the literature.
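To make the notion of orientation resolution concrete, here is a simplified sketch of the basic HOG building block: a gradient orientation histogram for a single cell. It is not the configuration proposed in the paper. The function name `cell_orientation_histogram`, the 9-bin default, the central-difference gradients, and the hard bin assignment (no interpolation, no block normalization) are assumptions chosen for brevity.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Unsigned gradient orientation histogram (0 to 180 degrees) for one
    cell, given as a 2D list of gray levels. Border pixels are skipped."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against an angle that rounds to 180.0.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A vertical edge: all gradient energy falls into the first (0 degree) bin.
edge = [[0, 0, 10, 10] for _ in range(4)]
print(cell_orientation_histogram(edge))
```

Changing `bins` changes the orientation resolution, while the cell size controls the spatial resolution.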

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The importance of vision-based systems for Sense-and-Avoid is increasing as remotely piloted and autonomous UAVs become part of non-segregated airspace. The development and evaluation of these systems demand flight scenario images, which are expensive and risky to obtain. Augmented Reality techniques now allow real flight scenario images to be composited with 3D aircraft models, producing useful realistic images for system development and benchmarking at a much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are first positioned in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm that fuses the images obtained from the 3D scene with images from real UAV flights, taking onboard camera vibrations into account. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations make it possible to develop and quantitatively evaluate aircraft detection and tracking algorithms. This paper presents the software developed to create a public dataset of 24 videos, together with their annotations and some tracking application results.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Video Quality Assessment needs to correspond to human perception. Pixel-based metrics (PSNR or MSE) fail in many circumstances because they do not take into account the spatio-temporal properties of human visual perception. In this paper we propose a new pixel-weighted method to improve video quality metrics for artifact evaluation. The method applies a psychovisual model based on motion, level of detail, pixel location, and the appearance of human faces, which brings the measured quality closer to the response of the human eye. Subjective tests were conducted to tune the psychovisual model and to demonstrate the noticeable improvement of an algorithm when pixels are weighted according to the factors analyzed instead of being treated equally. The analysis demonstrates the need for models adapted to the way specific content is viewed, and the proposed model improves quality assessment for sequences in which a particular artifact is analyzed.
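The general idea of pixel weighting can be sketched as a weighted MSE and the PSNR derived from it. This is only an illustration under assumptions: the abstract does not give the paper's formula, and the function name `weighted_psnr` and the weight map used here (which would come from a psychovisual model in practice) are hypothetical.

```python
import math


def weighted_psnr(ref, dist, weights, peak=255.0):
    """PSNR computed from a weighted MSE: each squared pixel error is scaled
    by its weight and the sum is normalized by the total weight."""
    num = 0.0
    den = 0.0
    for ref_row, dist_row, w_row in zip(ref, dist, weights):
        for r, d, w in zip(ref_row, dist_row, w_row):
            num += w * (r - d) ** 2
            den += w
    wmse = num / den
    if wmse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / wmse)


# Toy 2x2 frames with a single distorted pixel (hypothetical values).
ref = [[100, 100], [100, 100]]
dist = [[110, 100], [100, 100]]
uniform = [[1, 1], [1, 1]]   # equal weights reproduce ordinary PSNR
salient = [[3, 1], [1, 1]]   # the distorted pixel is assumed more visible

print(weighted_psnr(ref, dist, uniform))
print(weighted_psnr(ref, dist, salient))
```

With uniform weights the result equals the conventional PSNR. Giving the distorted pixel a larger weight lowers the score, which reflects the assumption that an error in a visually important region should count for more.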

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by users of 3D video systems, analyzing the impact of the effects introduced by all the elements of the 3D video processing chain. To this end, several subjective assessment tests are presented, specifically designed to evaluate the systems under consideration and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is described that evaluates typical degradations that may appear during 3D video content creation, for instance due to incorrect camera calibration or to video processing algorithms (e.g., 2D-to-3D conversion). Moreover, the generation of a high-quality dataset of stereoscopic 3D videos is described; the dataset is freely available to the research community and has already been widely used in different works related to 3D video. In addition, an inter-laboratory subjective study is presented that analyzes the impact of coding impairments and of various 3D video representation formats. Three further subjective tests study the effects of transmission events in Internet Protocol Television (IPTV) networks and in adaptive video streaming systems for 3D video. For these cases, a novel subjective video quality evaluation methodology, called Content-Immersive Evaluation of Transmission Impairments (CIETI), was proposed. It was specifically designed to evaluate transmission events under simulated realistic home-viewing conditions, in order to obtain more representative conclusions about the visual experience of end users. Finally, two subjective experiments are presented that compare various current 3D displays available in the consumer market and evaluate perceptual factors of Super Multiview Video (SMV) systems, which are expected to be the future technology for consumer 3D displays thanks to their promising visualization of 3D content without special glasses. The work presented in this thesis has made it possible to understand perceptual and technical factors related to the processing and visualization of 3D video content, which may be useful in the development of new technologies and of QoE evaluation techniques, both subjective methodologies and objective metrics.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Video quality assessment remains necessary for defining the criteria that characterize a signal meeting the viewing requirements imposed by the user. New technologies, such as stereoscopic 3D video and formats beyond high definition, impose new criteria that must be analyzed to achieve the highest possible user satisfaction. The problems identified during this doctoral thesis include phenomena that affect different phases of the audiovisual production chain and a variety of content types. First, content generation should be controlled through parameters that prevent visual discomfort and, consequently, visual fatigue, especially for stereoscopic 3D content, both animation and live action. Second, quality measurement for the video compression stage uses metrics that are sometimes not adapted to the user's perception. Psychovisual models and visual attention diagrams make it possible to weight image regions so that more importance is given to the pixels the user is most likely to look at. These two areas of work are linked through the concept of saliency. Saliency is the capacity of the human visual system to characterize a viewed image by weighting the areas that are most attractive to the human eye. In the generation of stereoscopic content, saliency refers mainly to the depth simulated by the optical illusion, measured as the distance from the virtual object to the human eye. In two-dimensional video, however, saliency is not based on depth but on other features, such as motion, level of detail, pixel position in the frame, and the appearance of faces, which are the basic factors of the visual attention model developed here. To detect the characteristics of a stereoscopic video sequence that are most likely to cause visual discomfort, the extensive literature on the topic was reviewed and preliminary subjective tests with users were carried out. The conclusion was that discomfort arose when there was an abrupt change in the distribution of simulated depths in the image, apart from other degradations such as the so-called window violation. Further subjective tests, focused on analyzing these effects with different depth distributions, sought to pin down the parameters that define such images. The results show that abrupt changes occur in scenes with motion and large negative disparities, which interfere with the accommodation and vergence processes of the human eye and increase the time the crystalline lens needs to focus. To improve quality metrics through models adapted to the human visual system, subjective tests were also conducted to determine the importance of each factor in masking a given degradation. The results show a slight improvement when weighting and visual attention masks are applied, which bring objective quality measures closer to the response of the human eye.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

HTTP adaptive streaming (HAS) has become widespread in multimedia services because it allows service providers to improve network resource utilization and the user's Quality of Experience (QoE). With this technology, video playback interruptions are reduced, since the HAS client takes the network and server status, as well as the capabilities of the user device, into account to adapt the quality to current conditions. Adaptation can be done using different strategies. To provide optimal QoE, the perceptual impact of adaptation strategies should be studied from the user's point of view. However, the time-varying video quality caused by adaptation, which usually takes place over a long interval, introduces a new type of impairment that makes the subjective evaluation of adaptive streaming systems challenging. The contribution of this paper is twofold. First, it investigates the testing methodology for evaluating HAS QoE by comparing the subjective experimental outcomes obtained with the standardized ACR method and with a semi-continuous method developed to evaluate long sequences. It also examines the influence of using audiovisual stimuli to evaluate video-related impairments. Second, it investigates the impact of several technical adaptation factors, including quality switching amplitude and chunk size, in combination with a wide range of commercial content types. The results provide good insight into appropriate testing methods for evaluating HAS QoE and into the design of switching strategies with optimal visual quality.