797 results for VIDEO STREAMING
Abstract:
HTTP adaptive streaming has become widespread in multimedia services because it can adapt to the characteristics of various viewing devices and to dynamic network conditions. Various studies target the optimization of adaptation strategies. However, to provide an optimal viewing experience to the end user, it is crucial to understand the Quality of Experience (QoE) delivered by different adaptation schemes. This paper surveys the state of the art in subjective evaluation of adaptive streaming QoE and highlights the challenges and open research questions related to QoE assessment.
Abstract:
We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimation of the decoding delay is essential to achieve a minimum communication latency. As opposed to single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views require a systematic analysis of this decoding delay, which we carry out using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation on general-purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder, and we provide an iterative algorithm to jointly compute frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
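As a rough illustration of how a prediction structure can be treated as a graph, the sketch below computes when each frame finishes decoding from its dependencies. It is not the iterative algorithm of the paper: it assumes fixed, load-independent processing times and as many cores as needed, and the frame names, arrival times and 30 ms processing time are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def completion_times(deps, arrival, proc_time):
    """Return the decoding completion time of every frame.

    deps maps a frame to the frames it is predicted from (its references).
    A frame can start once it has arrived and all its references are decoded.
    Simplification: processing times are fixed and cores are unlimited.
    """
    done = {}
    # static_order() yields every reference before the frames that use it.
    for frame in TopologicalSorter(deps).static_order():
        ready = max([arrival[frame]] + [done[ref] for ref in deps.get(frame, ())])
        done[frame] = ready + proc_time[frame]
    return done


# Hypothetical two-view example: view 1 is predicted from view 0.
deps = {
    "v0_t0": [],
    "v1_t0": ["v0_t0"],
    "v0_t1": ["v0_t0"],
    "v1_t1": ["v1_t0", "v0_t1"],
}
arrival = {"v0_t0": 0, "v1_t0": 0, "v0_t1": 40, "v1_t1": 40}  # ms
proc_time = {frame: 30 for frame in deps}                      # ms

done = completion_times(deps, arrival, proc_time)
# Delay of a frame: time between its arrival and the end of its decoding.
delays = {frame: done[frame] - arrival[frame] for frame in deps}
worst_case_delay = max(delays.values())
```

In this toy case the inter-view dependency, not the frame's own processing time, determines the worst-case delay, which is the kind of effect a systematic graph analysis is meant to expose.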
Abstract:
This Bachelor's thesis explains the procedure followed to study, design and develop Ackuaria, a portal for monitoring and analyzing real-time communication statistics. It then presents the results obtained and the graphical interface developed for a better user experience. Ackuaria relies on Licode, an open-source project developed at the Universidad Politécnica de Madrid, specifically in the Grupo de Internet de Nueva Generación (Next Generation Internet Group) of the Escuela Técnica Superior de Ingenieros de Telecomunicación. Licode makes it possible to create a streaming and videoconferencing service on the user's own infrastructure. It is designed to be fully scalable and is aimed mainly at the Cloud, although it works perfectly well on physical infrastructure. Licode is in turn based on WebRTC, a protocol developed by the W3C (World Wide Web Consortium) and the IETF (Internet Engineering Task Force) for sending and receiving audio, video and data streams through the browser. It requires no additional installation, so setting up a Peer-to-Peer videoconferencing session is very simple. Licode uses an MCU (Multipoint Control Unit) to avoid making every connection between users Peer-to-Peer. The MCU acts as one more WebRTC client through which all streams pass, and it multiplexes and redirects them wherever necessary. This saves a very significant amount of bandwidth and device resources. Users of Licode, and of videoconferencing services in general, have a growing need to manage their infrastructure on the basis of reliable data and statistics. Their goals vary widely: from studying the behavior of WebRTC in different scenarios to monitoring usage so that the time published by each user can later be accounted for.
In every case there was a common need for a tool that shows what is happening in the Licode service at any moment and stores all the information for later analysis. To develop Ackuaria, real-time communications were studied to determine which parameters were essential and useful to monitor. Based on this study, the Licode architecture was updated so that it gathers all the necessary data and sends it in a form that Ackuaria can collect. The monitoring portal then processes this information and displays it in a clear and orderly way, and also provides a REST API to the user.
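The bandwidth saving attributed to the MCU can be illustrated with simple stream counting. The sketch below is only a back-of-the-envelope model, assuming every participant publishes exactly one stream and subscribes to all others; the function names are hypothetical and are not part of Licode or WebRTC.

```python
def uplink_streams_per_client(participants, topology):
    """Number of copies of its own stream that each client must upload."""
    if participants < 2:
        return 0
    if topology == "mesh":   # full Peer-to-Peer: one copy per remote peer
        return participants - 1
    if topology == "mcu":    # one copy to the MCU, which redistributes it
        return 1
    raise ValueError(f"unknown topology: {topology}")


def total_connections(participants, topology):
    """Number of connections needed in the whole session."""
    if topology == "mesh":   # one connection per pair of peers
        return participants * (participants - 1) // 2
    if topology == "mcu":    # one connection per client to the MCU
        return participants
    raise ValueError(f"unknown topology: {topology}")


for n in (3, 6, 10):
    print(n, uplink_streams_per_client(n, "mesh"), uplink_streams_per_client(n, "mcu"))
```

Under these assumptions the per-client upload grows linearly with the number of participants in a full mesh but stays constant with an MCU, at the cost of concentrating the load on the server.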
Abstract:
Low-cost systems that can obtain a high-quality foreground segmentation almost independently of the existing illumination conditions for indoor environments are very desirable, especially for security and surveillance applications. In this paper, a novel foreground segmentation algorithm that uses only a Kinect depth sensor is proposed to satisfy the aforementioned system characteristics. This is achieved by combining a mixture of Gaussians-based background subtraction algorithm with a new Bayesian network that robustly predicts the foreground/background regions between consecutive time steps. The Bayesian network explicitly exploits the intrinsic characteristics of the depth data by means of two dynamic models that estimate the spatial and depth evolution of the foreground/background regions. The most remarkable contribution is the depth-based dynamic model that predicts the changes in the foreground depth distribution between consecutive time steps. This is a key difference with regard to visible imagery, where the color/gray distribution of the foreground is typically assumed to be constant. Experiments carried out on two different depth-based databases demonstrate that the proposed combination of algorithms is able to obtain a more accurate segmentation of the foreground/background than other state-of-the-art approaches.
Abstract:
Vision-based object detection from a moving platform is particularly challenging in the field of advanced driver assistance systems (ADAS). In this context, onboard vision-based vehicle verification strategies become critical, facing challenges derived from the variability of vehicle appearance, illumination, and vehicle speed. In this paper, an optimized HOG configuration for onboard vehicle verification is proposed which considers not only its spatial and orientation resolution, but also descriptor processing strategies and classification. An in-depth analysis of the optimal HOG settings for onboard vehicle verification is presented, in the context of SVM classification with different kernels. In contrast to many existing approaches, the evaluation is carried out on a public and heterogeneous database of vehicle and non-vehicle images in different areas of the road, yielding excellent verification rates that outperform other similar approaches in the literature.
Abstract:
The importance of vision-based systems for Sense-and-Avoid is increasing as remotely piloted and autonomous UAVs become part of the non-segregated airspace. The development and evaluation of these systems demand flight scenario images, which are expensive and risky to obtain. Augmented Reality techniques now allow real flight scenario images to be composited with 3D aircraft models to produce useful, realistic images for system development and benchmarking at a much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are first positioned in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm that fuses the images obtained from the 3D scene with images from real UAV flights, taking into account onboard camera vibrations. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations make it possible to develop and quantitatively evaluate aircraft detection and tracking algorithms. This paper presents the software developed to create a public dataset of 24 videos, together with their annotations and some tracking application results.
Abstract:
Video Quality Assessment needs to correspond to human perception. Pixel-based metrics (PSNR or MSE) fail in many circumstances because they do not take into account the spatio-temporal properties of human visual perception. In this paper we propose a new pixel-weighted method to improve video quality metrics for artifact evaluation. The method applies a psychovisual model based on motion, level of detail, pixel location and the appearance of human faces, which brings the measured quality closer to the response of the human eye. Subjective tests were carried out to adjust the psychovisual model and to demonstrate the noticeable improvement obtained when pixels are weighted according to the factors analyzed instead of being treated equally. The analysis demonstrates the need for models adapted to the specific way content is viewed, and the model represents an advance in quality measurement for sequences in which a particular artifact is analyzed.
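To make the idea of pixel weighting concrete, the following sketch computes a PSNR in which each squared error is scaled by a per-pixel weight before averaging. It is a generic weighted PSNR, not the metric defined in the paper: the weight map here is a hypothetical stand-in for whatever the psychovisual model (motion, detail, location, faces) would produce, and 8-bit samples are assumed.

```python
import math


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR computed from a weighted mean squared error.

    reference, distorted and weights are equally sized 2-D lists.
    A larger weight makes an error at that pixel count more.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for ref_row, dis_row, w_row in zip(reference, distorted, weights):
        for ref, dis, w in zip(ref_row, dis_row, w_row):
            weighted_error += w * (ref - dis) ** 2
            weight_sum += w
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    wmse = weighted_error / weight_sum
    if wmse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / wmse)


reference = [[100, 100], [100, 100]]
distorted = [[100, 100], [100, 110]]   # one pixel is off by 10
uniform = [[1, 1], [1, 1]]             # reduces to ordinary PSNR
salient = [[1, 1], [1, 5]]             # hypothetical: damaged pixel is salient

plain_score = weighted_psnr(reference, distorted, uniform)
salient_score = weighted_psnr(reference, distorted, salient)
```

With uniform weights the function reduces to ordinary PSNR; giving the damaged pixel a higher weight lowers the score, reflecting that an error in a region viewers attend to is penalized more.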
Abstract:
Video quality assessment remains necessary to define the criteria that characterize a signal meeting the viewing requirements imposed by the user. New technologies, such as stereoscopic 3D video and formats beyond high definition, impose new criteria that must be analyzed to achieve the highest possible user satisfaction. Among the problems detected during this doctoral thesis, phenomena were identified that affect different phases of the audiovisual production chain and varied types of content. First, the content generation process must be controlled through parameters that prevent visual discomfort and, consequently, visual fatigue, especially for stereoscopic 3D content, both animation and live action. Second, quality measurement for the video compression phase uses metrics that are sometimes not adapted to the user's perception. Psychovisual models and visual attention diagrams would make it possible to weight image areas so that more importance is given to the pixels the user is most likely to look at. These two blocks are related through the definition of the term saliency. Saliency is the capacity of the visual system to characterize a viewed image by weighting the areas that are most attractive to the human eye. In stereoscopic content generation, saliency refers mainly to the depth simulated by the optical illusion, measured as the distance from the virtual object to the human eye. In two-dimensional video, however, saliency is not based on depth but on other elements, such as motion, level of detail, pixel position and the appearance of faces, which are the basic factors of the visual attention model developed.
To detect the characteristics of a stereoscopic video sequence that are most likely to cause visual discomfort, the extensive literature on the subject was reviewed and preliminary subjective tests with users were carried out. The conclusion was that discomfort occurred when there was an abrupt change in the distribution of simulated depths in the image, apart from other degradations such as the so-called window violation. New subjective tests, focused on analyzing these effects with different depth distributions, sought to pin down the parameters that define such images. The results show that the abrupt changes occur in settings with motion and large negative disparities, which interfere with the accommodation and vergence processes of the human eye and increase the time the crystalline lens needs to focus. For the improvement of quality metrics through models adapted to the human visual system, subjective tests were also carried out to help determine the importance of each factor in masking a given degradation. The results show a slight improvement when weighting and visual attention masks are applied, which bring the objective quality parameters closer to the response of the human eye.
Abstract:
Acknowledgements We would like to thank Erik Rexstad and Rob Williams for useful reviews of this manuscript. The collection of visual and acoustic data was funded by the UK Department of Energy & Climate Change, the Scottish Government, Collaborative Offshore Wind Research into the Environment (COWRIE) and Oil & Gas UK. Digital aerial surveys were funded by Moray Offshore Renewables Ltd and additional funding for analysis of the combined datasets was provided by Marine Scotland. Collaboration between the University of Aberdeen and Marine Scotland was supported by MarCRF. We thank colleagues at the University of Aberdeen, Moray First Marine, NERI, Hi-Def Aerial Surveying Ltd and Ravenair for essential support in the field, particularly Tim Barton, Bill Ruck, Rasmus Nielson and Dave Rutter. Thanks also to Andy Webb, David Borchers, Len Thomas, Kelly McLeod, David L. Miller, Dinara Sadykova and Thomas Cornulier for advice on survey design and statistical approaches. Data Accessibility Data are available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.cf04g
Abstract:
Streaming potentials across cloned epithelial Na+ channels (ENaC) incorporated into planar lipid bilayers were measured. We found that the establishment of an osmotic pressure gradient (Δπ) across a channel-containing membrane mimicked the activation effects of a hydrostatic pressure differential (ΔP) on αβγ-rENaC, although with a quantitative difference in the magnitude of the driving forces. Moreover, the imposition of a Δπ negated channel activation by ΔP when the Δπ was directed against ΔP. A streaming potential of 2.0 ± 0.7 mV was measured across αβγ-rat ENaC (rENaC)-containing bilayers at 100 mM symmetrical [Na+] in the presence of a 2 Osmol/kg sucrose gradient. Assuming single-file movement of ions and water within the conduction pathway, we conclude that between two and three water molecules are translocated together with a single Na+ ion. A minimal effective pore diameter of 3 Å that could accommodate two water molecules even in single file is in contrast with the 2-Å diameter predicted from the selectivity properties of αβγ-rENaC. The fact that activation of αβγ-rENaC by ΔP can be reproduced by the imposition of Δπ suggests that water movement through the channel is also an important determinant of channel activity.
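The step from a measured streaming potential to a number of water molecules per ion can be sketched with the standard single-file relation ΔΨ = N · v_w · Δπ / (z · F), with Δπ = R · T · Δc. The abstract does not state which formula or temperature the authors used, so the values below are assumptions: a monovalent ion (z = 1), a partial molar volume of water of about 18 cm³/mol, an assumed temperature of 22 °C, and osmol/kg treated as osmol/L.

```python
R = 8.314462618      # gas constant, J/(mol K)
F = 96485.33212      # Faraday constant, C/mol
V_WATER = 18.0e-6    # partial molar volume of water, m^3/mol (approximate)


def waters_per_ion(streaming_potential_mV, osmolality_gradient,
                   temperature_K=295.15, valence=1):
    """Estimate water molecules coupled to each ion in a single-file pore.

    osmolality_gradient is in osmol/kg and is treated as osmol/L (assumption).
    temperature_K defaults to 22 degrees C, which is an assumed value.
    """
    # van 't Hoff: osmotic pressure in Pa (1 osmol/L = 1000 mol/m^3).
    delta_pi = R * temperature_K * osmolality_gradient * 1000.0
    # Potential contributed by each coupled water molecule, in mV.
    mV_per_water = V_WATER * delta_pi / (valence * F) * 1000.0
    return streaming_potential_mV / mV_per_water


n_waters = waters_per_ion(2.0, 2.0)
print(f"{n_waters:.2f} water molecules per Na+ ion")
```

Under these assumptions each coupled water molecule contributes a little under 1 mV at a 2 Osmol/kg gradient, so a 2.0 mV streaming potential corresponds to slightly more than two water molecules per ion, in line with the range quoted in the abstract.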
Abstract:
Toxoplasma gondii is a member of the phylum Apicomplexa, a diverse group of intracellular parasites that share a unique form of gliding motility. Gliding is substrate dependent and occurs without apparent changes in cell shape and in the absence of traditional locomotory organelles. Here, we demonstrate that gliding is characterized by three distinct forms of motility: circular gliding, upright twirling, and helical rotation. Circular gliding commences while the crescent-shaped parasite lies on its right side, from where it moves in a counterclockwise manner at a rate of ∼1.5 μm/s. Twirling occurs when the parasite rights itself vertically, remaining attached to the substrate by its posterior end and spinning clockwise. Helical gliding is similar to twirling except that it occurs while the parasite is positioned horizontally, resulting in forward movement that follows the path of a corkscrew. The parasite begins lying on its left side (where the convex side is defined as dorsal) and initiates a clockwise revolution along the long axis of the crescent-shaped body. Time-lapse video analyses indicated that helical gliding is a biphasic process. During the first 180° of the turn, the parasite moves forward one body length at a rate of ∼1–3 μm/s. In the second phase, the parasite flips onto its left side, in the process undergoing little net forward motion. All three forms of motility were disrupted by inhibitors of actin filaments (cytochalasin D) and myosin ATPase (butanedione monoxime), indicating that they rely on an actinomyosin motor in the parasite. Gliding motility likely provides the force for active penetration of the host cell and may participate in dissemination within the host and thus is of both fundamental and practical interest.