71 resultados para Video Installation
Resumo:
Research in stereoscopic 3D coding, transmission and subjective assessment methodology depends largely on the availability of source content that can be used in cross-lab evaluations. While several studies have already been presented using proprietary content, comparisons between the studies are difficult since discrepant contents are used. Therefore in this paper, a freely available dataset of high quality Full-HD stereoscopic sequences shot with a semiprofessional 3D camera is introduced in detail. The content was designed to be suited for usage in a wide variety of applications, including high quality studies. A set of depth maps was calculated from the stereoscopic pair. As an application example, a subjective assessment has been performed using coding and spatial degradations. The Absolute Category Rating with Hidden Reference method was used. The observers were instructed to vote on video quality only. Results of this experiment are also freely available and will be presented in this paper as a first step towards objective video quality measurement for 3DTV.
Resumo:
In this paper we propose an innovative method for the automatic detection and tracking of road traffic signs using an onboard stereo camera. It involves a combination of monocular and stereo analysis strategies to increase the reliability of the detections such that it can boost the performance of any traffic sign recognition scheme. Firstly, an adaptive color and appearance based detection is applied at single camera level to generate a set of traffic sign hypotheses. In turn, stereo information allows for sparse 3D reconstruction of potential traffic signs through a SURF-based matching strategy. Namely, the plane that best fits the cloud of 3D points traced back from feature matches is estimated using a RANSAC based approach to improve robustness to outliers. Temporal consistency of the 3D information is ensured through a Kalman-based tracking stage. This also allows for the generation of a predicted 3D traffic sign model, which is in turn used to enhance the previously mentioned color-based detector through a feedback loop, thus improving detection accuracy. The proposed solution has been tested with real sequences under several illumination conditions and in both urban areas and highways, achieving very high detection rates in challenging environments, including rapid motion and significant perspective distortion
Resumo:
The increasing use of video editing software requires faster and more efficient editing tools. As a first step, these tools perform a temporal segmentation in shots that allows a later building of indexes describing the video content. Here, we propose a novel real-time high-quality shot detection strategy, suitable for the last generation of video editing software requiring both low computational cost and high quality results. While abrupt transitions are detected through a very fast pixel-based analysis, gradual transitions are obtained from an efficient edge-based analysis. Both analyses are reinforced with a motion analysis that helps to detect and discard false detections. This motion analysis is carried out exclusively over a reduced set of candidate transitions, thus maintaining the computational requirements demanded by new applications to fulfill user needs.
Resumo:
Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze those factors which will influence the quality of experience in those 3D video services, focusing on effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.
Resumo:
We present a novel framework for the analysis and optimization of encoding latency for multiview video. Firstly, we characterize the elements that have an influence in the encoding latency performance: (i) the multiview prediction structure and (ii) the hardware encoder model. Then, we provide algorithms to find the encoding latency of any arbitrary multiview prediction structure. The proposed framework relies on the directed acyclic graph encoder latency (DAGEL) model, which provides an abstraction of the processing capacity of the encoder by considering an unbounded number of processors. Using graph theoretic algorithms, the DAGEL model allows us to compute the encoding latency of a given prediction structure, and determine the contribution of the prediction dependencies to it. As an example of DAGEL application, we propose an algorithm to reduce the encoding latency of a given multiview prediction structure up to a target value. In our approach, a minimum number of frame dependencies are pruned, until the latency target value is achieved, thus minimizing the degradation of the rate-distortion performance due to the removal of the prediction dependencies. Finally, we analyze the latency performance of the DAGEL derived prediction structures in multiview encoders with limited processing capacity.
Resumo:
Analysis of minimally invasive surgical videos is a powerful tool to drive new solutions for achieving reproducible training programs, objective and transparent assessment systems and navigation tools to assist surgeons and improve patient safety. This paper presents how video analysis contributes to the development of new cognitive and motor training and assessment programs as well as new paradigms for image-guided surgery.
Resumo:
El esquema actual que existe en el ámbito de la normalización y el diseño de nuevos estándares de codificación de vídeo se está convirtiendo en una tarea difícil de satisfacer la evolución y dinamismo de la comunidad de codificación de vídeo. El problema estaba centrado principalmente en poder explotar todas las características y similitudes entre los diferentes códecs y estándares de codificación. Esto ha obligado a tener que rediseñar algunas partes comunes a varios estándares de codificación. Este problema originó la aparición de una nueva iniciativa de normalización dentro del comité ISO/IEC MPEG, llamado Reconfigurable Video Coding (RVC). Su principal idea era desarrollar un estándar de codificación de vídeo que actualizase e incrementase progresivamente una biblioteca de los componentes, aportando flexibilidad y la capacidad de tener un código reconfigurable mediante el uso de un nuevo lenguaje orientado a flujo de Actores/datos denominado CAL. Este lenguaje se usa para la especificación de la biblioteca estándar y para la creación de instancias del modelo del decodificador. Más tarde, se desarrolló un nuevo estándar de codificación de vídeo denominado High Efficiency Video Coding (HEVC), que actualmente se encuentra en continuo proceso de actualización y desarrollo, que mejorase la eficiencia y compresión de la codificación de vídeo. Obviamente se ha desarrollado una visión de HEVC empleando la metodología de RVC. En este PFC, se emplean diferentes implementaciones de estándares empleando RVC. Por ejemplo mediante los decodificadores Mpeg 4 Part 2 SP y Mpeg 4 Part 10 CBP y PHP así como del nuevo estándar de codificación HEVC, resaltando las características y utilidad de cada uno de ellos. En RVC los algoritmos se describen mediante una clase de actores que intercambian flujos de datos (tokens) para realizar diferentes acciones. El objetivo de este proyecto es desarrollar un programa que, partiendo de los decodificadores anteriormente mencionados, una serie de secuencia de vídeo en diferentes formatos de compresión y una distribución estándar de los actores (para cada uno de los decodificadores), sea capaz de generar diferentes distribuciones de los actores del decodificador sobre uno o varios procesadores del sistema sobre el que se ejecuta, para conseguir la mayor eficiencia en la codificación del vídeo. La finalidad del programa desarrollado en este proyecto es la de facilitar la realización de las distribuciones de los actores sobre los núcleos del sistema, y obtener las mejores configuraciones posibles de una manera automática y eficiente. ABSTRACT. The current scheme that exists in the field of standardization and the design of new video coding standards is becoming a difficult task to meet the evolving and dynamic community of video encoding. The problem was centered mainly in order to exploit all the features and similarities between different codecs and encoding standards. This has forced redesigning some parts common to several coding standards. This problem led to the emergence of a new initiative for standardization within the ISO / IEC MPEG committee, called Reconfigurable Video Coding (RVC). His main idea was to develop a video coding standard and gradually incrementase to update a library of components, providing flexibility and the ability to have a reconfigurable code using a new flow -oriented language Actors / data called CAL. This language is used for the specification of the standard library and to the instantiation model decoder. Later, a new video coding standard called High Efficiency Video Coding (HEVC), which currently is in continuous process of updating and development, which would improve the compression efficiency and video coding is developed. Obviously has developed a vision of using the methodology HEVC RVC. In this PFC, different implementations using RVC standard are used. For example, using decoders MPEG 4 Part 2 SP and MPEG 4 Part 10 CBP and PHP and the new coding standard HEVC, highlighting the features and usefulness of each. In RVC, the algorithms are described by a class of actors that exchange streams of data (tokens) to perform different actions. The objective of this project is to develop a program that, based on the aforementioned decoders, a series of video stream in different compression formats and a standard distribution of actors (for each of the decoders), is capable of generating different distributions decoder actors on one or more processors of the system on which it runs, to achieve greater efficiency in video coding. The purpose of the program developed in this project is to facilitate the realization of the distributions of the actors on the cores of the system, and get the best possible settings automatically and efficiently.
Resumo:
In recent years, the increasing sophistication of embedded multimedia systems and wireless communication technologies has promoted a widespread utilization of video streaming applications. It has been reported in 2013 that youngsters, aged between 13 and 24, spend around 16.7 hours a week watching online video through social media, business websites, and video streaming sites. Video applications have already been blended into people daily life. Traditionally, video streaming research has focused on performance improvement, namely throughput increase and response time reduction. However, most mobile devices are battery-powered, a technology that grows at a much slower pace than either multimedia or hardware developments. Since battery developments cannot satisfy expanding power demand of mobile devices, research interests on video applications technology has attracted more attention to achieve energy-efficient designs. How to efficiently use the limited battery energy budget becomes a major research challenge. In addition, next generation video standards impel to diversification and personalization. Therefore, it is desirable to have mechanisms to implement energy optimizations with greater flexibility and scalability. In this context, the main goal of this dissertation is to find an energy management and optimization mechanism to reduce the energy consumption of video decoders based on the idea of functional-oriented reconfiguration. System battery life is prolonged as the result of a trade-off between energy consumption and video quality. Functional-oriented reconfiguration takes advantage of the similarities among standards to build video decoders reconnecting existing functional units. If a feedback channel from the decoder to the encoder is available, the former can signal the latter changes in either the encoding parameters or the encoding algorithms for energy-saving adaption. The proposed energy optimization and management mechanism is carried out at the decoder end. This mechanism consists of an energy-aware manager, implemented as an additional block of the reconfiguration engine, an energy estimator, integrated into the decoder, and, if available, a feedback channel connected to the encoder end. The energy-aware manager checks the battery level, selects the new decoder description and signals to build a new decoder to the reconfiguration engine. It is worth noting that the analysis of the energy consumption is fundamental for the success of the energy management and optimization mechanism. In this thesis, an energy estimation method driven by platform event monitoring is proposed. In addition, an event filter is suggested to automate the selection of the most appropriate events that affect the energy consumption. At last, a detailed study on the influence of the training data on the model accuracy is presented. The modeling methodology of the energy estimator has been evaluated on different underlying platforms, single-core and multi-core, with different characteristics of workload. All the results show a good accuracy and low on-line computation overhead. The required modifications on the reconfiguration engine to implement the energy-aware manager have been assessed under different scenarios. The results indicate a possibility to lengthen the battery lifetime of the system in two different use-cases.
Resumo:
The paper examines the video game industry in the perspective of being the paradigm of innovation in digital media and content. In particular, it analyses the response to two main factors that have impacted this industry over the last decade. First, it tracks the evolution of its global market and its emerging geography with the rise of Asia. Second, within this global landscape the paper explores how the changes derived from mobile and on-line gaming enabled major transformations of this industry. From here, some conclusions on the lessons from the evolution of this sector for the whole media and content industries are presented.
Resumo:
With the recent increased popularity and high usage of HTTP Adaptive Streaming (HAS) techniques, various studies have been carried out in this area which generally focused on the technical enhancement of HAS technology and applications. However, a lack of common HAS standard led to multiple proprietary approaches which have been developed by major Internet companies. In the emerging MPEG-DASH standard the packagings of the video content and HTTP syntax have been standardized; but all the details of the adaptation behavior are left to the client implementation. Nevertheless, to design an adaptation algorithm which optimizes the viewing experience of the enduser, the multimedia service providers need to know about the Quality of Experience (QoE) of different adaptation schemes. Taking this into account, the objective of this experiment was to study the QoE of a HAS-based video broadcast model. The experiment has been carried out through a subjective study of the end user response to various possible clients’ behavior for changing the video quality taking different QoE-influence factors into account. The experimental conclusions have made a good insight into the QoE of different adaptation schemes which can be exploited by HAS clients for designing the adaptation algorithms.
Quality-optimization algorithm based on stochastic dynamic programming for MPEG DASH video streaming
Resumo:
In contrast to traditional push-based protocols, adaptive streaming techniques like Dynamic Adaptive Streaming over HTTP (DASH) fix attention on the client, who dynamically requests different-quality portions of the content to cope with a limited and variable bandwidth but aiming at maximizing the quality perceived by the user. Since DASH adaptation logic at the client is not covered by the standard, we propose a solution based on Stochastic Dynamic Programming (SDP) techniques to find the optimal request policies that guarantee the users' Quality of Experience (QoE). Our algorithm is evaluated in a simulated streaming session and is compared with other adaptation approaches. The results show that our proposal outperforms them in terms of QoE, requesting higher qualities on average.
Resumo:
A frame-level distortion model based on perceptual features of the human visual system is proposed to improve the performance of unequal error protection strategies and provide better quality of experience to users in Side-by-Side 3D video delivery systems.
Resumo:
One of the key factors for a given application to take advantage of cloud computing is the ability to scale in an efficient, fast and reliable way. In centralized multi-party video conferencing, dynamically scaling a running conversation is a complex problem. In this paper we propose a methodology to divide the Multipoint Control Unit (the video conferencing server) into more simple units, broadcasters. Each broadcaster receives the media from a participant, processes it and forwards it to the rest. These broadcasters can be distributed among a group of CPUs. By using this methodology, video conferencing systems can scale in a more granular way, improving the deployment.
Resumo:
Assessing video quality is a complex task. While most pixel-based metrics do not present enough correlation between objective and subjective results, algorithms need to correspond to human perception when analyzing quality in a video sequence. For analyzing the perceived quality derived from concrete video artifacts in determined region of interest we present a novel methodology for generating test sequences which allow the analysis of impact of each individual distortion. Through results obtained after subjective assessment it is possible to create psychovisual models based on weighting pixels belonging to different regions of interest distributed by color, position, motion or content. Interesting results are obtained in subjective assessment which demonstrates the necessity of new metrics adapted to human visual system.
Resumo:
This paper gives an overview of three recent studies by the authors on the topic of 3D video Quality of Experience (QoE). Two of studies [1,2] investigated different psychological dimension that may be needed for describing 3D video QoE and the third the visibility and annoyance of crosstalk[3]. The results shows that the video quality scale could be sufficient for evaluating S3D video experience for coding and spatial resolution reduction distortions. It was also confirmed that with a more complex mixture of degradations more than one scale should be used to capture the QoE in these cases. The study found a linear relationship between the perceived crosstalk and the amount of crosstalk.