991 results for Video coding
Analysis of the ORCC and Vivado HLS Tools for the Synthesis of RVC-CAL Dataflow Models
Abstract:
This final-year project (Proyecto Fin de Grado) studies how to generate VHDL (VHSIC Hardware Description Language) models from dataflow models written in RVC-CAL (Reconfigurable Video Coding, CAL Actor Language) using Vivado HLS (Vivado High-Level Synthesis), one of the tools in the Xilinx Vivado suite. Once the VHDL model has been obtained, the intention is to use the Xilinx tools to program it into an FPGA (Field Programmable Gate Array) or into the Zynq device, also developed by Xilinx. RVC-CAL is a dataflow language that describes the functionality of functional blocks called actors. The functions an actor performs are defined as actions, and a single actor may contain several different actions. Actors can communicate with one another to form a network of actors. Vivado HLS produces a VHDL design from a model written in C, so generating VHDL models from RVC-CAL models requires a preliminary phase in which the RVC-CAL models are compiled into their C equivalents. The ORCC compiler (Open RVC-CAL Compiler) is the tool that produces C designs from RVC-CAL models. ORCC does not produce executable code directly; it generates source code to be compiled by another tool, which in this project is GCC (GNU Compiler Collection) on Linux. In short, the project has three distinct points of study: 1. Dataflow models in RVC-CAL are compiled by ORCC to obtain their translation into C. 2. The equivalent C designs are synthesized in Vivado HLS to obtain the VHDL models. 3. The resulting VHDL models would be processed by the Xilinx tools to produce the bitstream that is programmed into an FPGA or the Zynq device.
In studying the second point, we encountered a number of problematic elements that affect the Vivado HLS synthesis of the C designs generated by ORCC. They stem from the way the ORCC-generated C specification is structured, which Vivado HLS cannot handle at certain stages of synthesis. We therefore propose a "manual" transformation of the ORCC-generated designs that alters the original models as little as possible, so that Vivado HLS can complete the synthesis and produce a correct VHDL file.
The document follows the structure of a research report. It first sets out the motivations behind the work and the objectives it aims to achieve. It then reviews the state of the art of the elements needed for the work, providing the basic concepts required to follow the document. The RVC-CAL and VHDL languages are described and the ORCC and Vivado tools are introduced, with an analysis of the strengths and main features of both tools. With the behaviour of both tools established, the document describes the solutions developed in our study of the synthesis of RVC-CAL models, highlighting the problematic points noted above that Vivado HLS cannot handle when synthesizing the C designs generated by ORCC. It then presents the solutions proposed for the errors that arose during synthesis, which aim at a C specification better suited to Vivado HLS synthesis and thus at suitable VHDL models. Finally, a set of conclusions is drawn from the analyses and developments carried out, and several lines of future work are proposed through which the study could be continued and the research completed.
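The actor, action and network concepts described in this abstract can be illustrated with a minimal Python sketch. It is not RVC-CAL syntax and not code generated by ORCC. The `Actor` class, the `run` scheduler and the `clip`/`sink` actors are hypothetical names introduced only for illustration, and the guard-based action selection is a simplifying assumption.

```python
from collections import deque


class Actor:
    """Hypothetical functional block: consumes tokens from an input FIFO
    and fires the first action whose guard accepts the token."""

    def __init__(self, name, actions):
        self.name = name
        self.actions = actions  # list of (guard, body) pairs
        self.inbox = deque()    # input FIFO
        self.targets = []       # actors connected to the output

    def connect(self, other):
        self.targets.append(other)

    def fire(self):
        """Fire at most one action; return True if an action fired."""
        if not self.inbox:
            return False
        token = self.inbox[0]
        for guard, body in self.actions:
            if guard(token):
                self.inbox.popleft()
                result = body(token)
                for target in self.targets:
                    target.inbox.append(result)
                return True
        return False


def run(network):
    """Naive scheduler: keep sweeping the network until no actor can fire."""
    fired = True
    while fired:
        fired = False
        for actor in network:
            if actor.fire():
                fired = True


collected = []
# One actor with two different actions: clamp negatives, pass the rest.
clip = Actor("clip", [
    (lambda t: t < 0, lambda t: 0),
    (lambda t: True, lambda t: t),
])
# A sink actor that only records what it receives.
sink = Actor("sink", [(lambda t: True, lambda t: collected.append(t))])

clip.connect(sink)
clip.inbox.extend([3, -2, 7])
run([clip, sink])
print(collected)
```

The two guarded actions on `clip` mirror the statement that one actor may hold several different actions, and the `connect` call forms the smallest possible network.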
Abstract:
As digital systems move away from traditional desktop setups, new interaction paradigms are emerging that better integrate with users’ real-world surroundings and better support users’ individual needs. While promising, these modern interaction paradigms also present new challenges, such as a lack of paradigm-specific tools to systematically evaluate and fully understand their use. This dissertation tackles this issue by framing empirical studies of three novel digital systems in embodied cognition, an exciting new perspective in cognitive science where the body and its interactions with the physical world take a central role in human cognition. This is achieved by, first, focusing the design of all these systems on tangible interaction, a contemporary interaction paradigm that emphasizes physical interaction; and second, by comprehensively studying user performance in these systems through a set of novel performance metrics grounded on epistemic actions, a relatively well-established and well-studied construct in the literature on embodied cognition. The first system presented in this dissertation is an augmented Four-in-a-row board game. Three different versions of the game were developed, based on three different interaction paradigms (tangible, touch and mouse), and a repeated-measures study involving 36 participants measured the occurrence of three simple epistemic actions across these three interfaces. The results highlight the relevance of epistemic actions in such a task and suggest that the different interaction paradigms afford instantiation of these actions in different ways. Additionally, the tangible version of the system supports the most rapid execution of these actions, providing novel quantitative insights into the real benefits of tangible systems. The second system presented in this dissertation is a tangible tabletop scheduling application. 
Two studies with single and paired users provide several insights into the impact of epistemic actions on the user experience when these are performed outside of a system’s sensing boundaries. These insights are clustered by the form, size and location of ideal interface areas for such offline epistemic actions to occur, as well as how physical tokens can be designed to better support them. Finally, based on the results obtained to this point, the last study presented in this dissertation directly addresses the lack of empirical tools to formally evaluate tangible interaction. It presents a video-coding framework grounded on a systematic literature review of 78 papers, and evaluates its value as a metric through a 60-participant study performed across three different research laboratories. The results highlight the usefulness and power of epistemic actions as a performance metric for tangible systems. In sum, through the use of such novel metrics in each of the three studies presented, this dissertation provides a better understanding of the real impact and benefits of designing and developing systems that feature tangible interaction.
Abstract:
This dissertation presents work on 3D video coding that is compatible with 2D video. It is based on the development of a method to improve, at the decoder, the reconstruction of a downsampled view resulting from a simulcast transmission using the H.265 video coding standard (informally known as High Efficiency Video Coding (HEVC)). Although it maintains compatibility with 2D video, simulcast transmission usually requires a high bitrate. In the absence of suitable 3D coding tools, the bitrate can be reduced by using asymmetric video compression, in which the base view is coded at the original spatial resolution while the auxiliary view is coded at a lower spatial resolution and upsampled at the decoder. The method developed aims to improve the upsampled auxiliary view at the decoder using information from the details of the base view, that is, its high-frequency components. This process relies on affine transforms to perform a geometric mapping between the high-frequency information of the full-resolution base view and the lower-resolution auxiliary view. In addition, to preserve the continuity of image content between regions and avoid blocking artifacts, the mapping uses a triangular mesh of the auxiliary view applied to the detail image obtained from the base view. The proposed technique is compared with a block-matching disparity estimation method, and the results show that for some sequences it improves not only objective quality (PSNR), by up to 2.2 dB, but also subjective quality, for the same overall compression ratio.
Abstract:
Medical imaging technology and applications are continuously evolving, dealing with images of increasing spatial and temporal resolutions, which allow easier and more accurate medical diagnosis. However, this increase in resolution demands a growing amount of data to be stored and transmitted. Despite the high coding efficiency achieved by the most recent image and video coding standards in lossy compression, they are not well suited for quality-critical medical image compression where either near-lossless or lossless coding is required. In this dissertation, two different approaches to improve lossless coding of volumetric medical images, such as Magnetic Resonance and Computed Tomography, were studied and implemented using the latest standard, High Efficiency Video Coding (HEVC). In the first approach, the use of geometric transformations to perform inter-slice prediction was investigated. For the second approach, a pixel-wise prediction technique, based on Least-Squares prediction, that exploits inter-slice redundancy was proposed to extend the current HEVC lossless tools. Experimental results show a bitrate reduction between 45% and 49% when compared with DICOM recommended encoders, and 13.7% when compared with standard HEVC.
Abstract:
Dissertation (Master's), Universidade de Brasília, Faculdade de Tecnologia, 2016.
Abstract:
Image and video compression play a major role in the world today, allowing the storage and transmission of large volumes of multimedia content. However, processing this information requires substantial computational resources, so improving the computational performance of these compression algorithms is very important. The Multidimensional Multiscale Parser (MMP) is a pattern-matching-based compression algorithm for multimedia content, namely images, that achieves high compression ratios while maintaining good image quality (Rodrigues et al. [2008]). However, in comparison with other existing algorithms, it takes some time to execute. Therefore, two parallel implementations for GPUs were proposed by Ribeiro [2016] and Silva [2015] in CUDA and OpenCL-GPU, respectively. In this dissertation, to complement that work, we propose two parallel versions that run the MMP algorithm on the CPU: one resorting to OpenMP and another that converts the existing OpenCL-GPU implementation into OpenCL-CPU. The proposed solutions improve the computational performance of MMP by 3× and 2.7×, respectively. High Efficiency Video Coding (HEVC/H.265) is the most recent standard for image and video compression. Its impressive compression performance makes it a target for many adaptations, particularly for holoscopic (or light field) image and video processing. Some of the modifications proposed to encode this new multimedia content are based on geometry-based disparity compensations (SS), developed by Conti et al. [2014], and a Geometric Transformations (GT) module, proposed by Monteiro et al. [2015]. These HEVC-based compression algorithms for holoscopic images implement a specific search for similar micro-images that is more efficient than the one performed by HEVC, but their implementation is considerably slower than HEVC. 
To achieve better execution times, we chose the OpenCL API as the GPU-enabling language to increase the performance of the module. With its most costly setting, we are able to reduce the GT module execution time from 6.9 days to less than 4 hours, effectively attaining a speedup of 45×.
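As a quick consistency check on the reported figures, the arithmetic can be worked through in a few lines of Python. This is only a sketch: it assumes the baseline is exactly 6.9 days and treats the speedup as a factor of 45, whereas the abstract gives only the bound "less than 4 hours" for the accelerated run.

```python
HOURS_PER_DAY = 24

baseline_hours = 6.9 * HOURS_PER_DAY  # 6.9 days expressed in hours
reported_speedup = 45                 # speedup factor stated in the abstract

# Execution time implied by the reported speedup (assumption: exact factor).
accelerated_hours = baseline_hours / reported_speedup

# Minimum speedup needed to get below the 4-hour bound.
min_speedup_for_4h = baseline_hours / 4

print(f"baseline: {baseline_hours:.1f} h")
print(f"implied accelerated time: {accelerated_hours:.2f} h")
print(f"speedup needed for 4 h: {min_speedup_for_4h:.1f}x")
```

Under these assumptions the implied run time sits below the four-hour bound, so the two reported numbers agree with each other.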
Abstract:
In free viewpoint applications, the images are captured by an array of cameras that acquire a scene of interest from different perspectives. Any intermediate viewpoint not included in the camera array can be virtually synthesized by the decoder, at a quality that depends on the distance between the virtual view and the camera views available at the decoder. Hence, it is beneficial for any user to receive camera views that are close to each other for synthesis. This is, however, not always feasible in bandwidth-limited overlay networks, where every node may ask for different camera views. In this work, we propose an optimized delivery strategy for free viewpoint streaming over overlay networks. We introduce the concept of layered quality-of-experience (QoE), which describes the level of interactivity offered to clients. Based on these levels of QoE, camera views are organized into layered subsets. These subsets are then delivered to clients through a prioritized network coding streaming scheme, which accommodates network and client heterogeneity and effectively exploits the resources of the overlay network. Simulation results show that, in a scenario with limited bandwidth or channel reliability, the proposed method outperforms baseline network coding approaches, where the different levels of QoE are not taken into account in the delivery strategy optimization.
Abstract:
In this work, we propose a novel network coding enabled NDN architecture for the delivery of scalable video. Our scheme utilizes network coding in order to address the problem that arises in the original NDN protocol, where optimal use of the bandwidth and caching resources necessitates the coordination of the forwarding decisions. To optimize the performance of the proposed network coding based NDN protocol and render it appropriate for transmission of scalable video, we devise a novel rate allocation algorithm that decides on the optimal rates of Interest messages sent by clients and intermediate nodes. This algorithm guarantees that the achieved flow of Data objects will maximize the average quality of the video delivered to the client population. To support the handling of Interest messages and Data objects when intermediate nodes perform network coding, we modify the standard NDN protocol and introduce the use of Bloom filters, which efficiently store additional information about the Interest messages and Data objects. The proposed architecture is evaluated for transmission of scalable video over PlanetLab topologies. The evaluation shows that the proposed scheme performs very close to the optimal performance.
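The Bloom filters mentioned in this abstract are compact set summaries with no false negatives and a tunable false-positive rate. The following is a minimal generic sketch, not the authors' data structure: the class, its sizing parameters and the NDN-style names are all hypothetical, and the abstract does not say what information the filters actually store.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter; num_bits and num_hashes are illustrative choices."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
empty = BloomFilter()

# Hypothetical content names, used only to exercise the filter.
seen.add("/video/layer0/segment1")
seen.add("/video/layer1/segment1")

print("/video/layer0/segment1" in seen)   # added items are always reported
print("/video/layer0/segment1" in empty)  # an empty filter reports nothing
```

The appeal in a forwarding node is that membership information fits in a fixed number of bits regardless of how many names have been recorded, at the cost of occasional false positives.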
Abstract:
Research in stereoscopic 3D coding, transmission and subjective assessment methodology depends largely on the availability of source content that can be used in cross-lab evaluations. While several studies have already been presented using proprietary content, comparisons between the studies are difficult because different content is used. Therefore, in this paper, a freely available dataset of high-quality Full-HD stereoscopic sequences shot with a semiprofessional 3D camera is introduced in detail. The content was designed to be suited for use in a wide variety of applications, including high-quality studies. A set of depth maps was calculated from the stereoscopic pair. As an application example, a subjective assessment has been performed using coding and spatial degradations. The Absolute Category Rating with Hidden Reference method was used. The observers were instructed to vote on video quality only. Results of this experiment are also freely available and will be presented in this paper as a first step towards objective video quality measurement for 3DTV.
Abstract:
This paper will look at the benefits and limitations of content distribution using Forward Error Correction (FEC) in conjunction with the Transmission Control Protocol (TCP). FEC can be used to reduce the number of retransmissions which would usually result from a lost packet. The requirement for TCP to deal with any losses is then greatly reduced. There are however side-effects to using FEC as a countermeasure to packet loss: an additional requirement for bandwidth. When applications such as real-time video conferencing are needed, delay must be kept to a minimum, and retransmissions are certainly not desirable. A balance, therefore, between additional bandwidth and delay due to retransmissions must be struck. Our results show that the throughput of data can be significantly improved when packet loss occurs using a combination of FEC and TCP, compared to relying solely on TCP for retransmissions. Furthermore, a case study applies the result to demonstrate the achievable improvements in the quality of streaming video perceived by end users.
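The trade-off described in this abstract, extra bandwidth in exchange for fewer retransmissions, can be seen in the simplest possible FEC code: one XOR parity packet per group of equal-length data packets. The sketch below is an illustration only. The abstract does not say which FEC code the paper uses, and the function names and sample packets are hypothetical.

```python
def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    size = len(packets[0])
    if any(len(p) != size for p in packets):
        raise ValueError("all packets must have the same length")
    parity = bytearray(size)
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the single lost packet (marked None) without a retransmission."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    rebuilt = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                rebuilt[i] ^= byte
    return missing[0], bytes(rebuilt)


packets = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(packets)

# Simulate the loss of the second packet in transit.
lost_index, rebuilt = recover_missing([packets[0], None, packets[2]], parity)
print(lost_index, rebuilt)

# Bandwidth overhead of this scheme: one extra packet per group.
overhead = 1 / len(packets)
```

With a group of three packets the overhead is one third extra bandwidth, and any single loss in the group is repaired locally. Two or more losses in the same group would still fall back to retransmission.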
Abstract:
In a clinical setting, pain is reported either through patient self-report or via an observer. Such measures are problematic because they 1) are subjective and 2) give no specific timing information. Coding pain as a series of facial action units (AUs) can avoid these issues as it can be used to gain an objective measure of pain on a frame-by-frame basis. Using video data from patients with shoulder injuries, in this paper, we describe an active appearance model (AAM)-based system that can automatically detect the frames in video in which a patient is in pain. This pain data set highlights the many challenges associated with spontaneous emotion detection, particularly that of expression and head movement due to the patient's reaction to pain. In this paper, we show that the AAM can deal with these movements and can achieve significant improvements in both the AU and pain detection performance compared to the current state-of-the-art approaches which utilize similarity-normalized appearance features only.