774 resultados para Scalable video coding
Resumo:
HEVC es el nuevo estándar de codificación de vídeo que está siendo desarrollado conjuntamente por las organizaciones ITU-T Video Coding Experts Group (VCEG) e ISO/IEC Moving Picture Experts Group (MPEG). Su objetivo principal es mejorar la compresión de vídeo, en relación a los actuales estándares. Es común hoy en día, debido a su flexibilidad para aplicaciones de bajo consumo, diseñar sistemas de descodificación de vídeo basados en un procesador digital de señal (DSP). En la mayoría de las veces, los diseños parten de un código creado para ser ejecutado en un ordenador personal y posteriormente se optimizan para tecnología DSP. El objetivo principal de este proyecto es caracterizar el rendimiento de un sistema basado en DSP que ejecute el código de un descodificador de video HEVC. ABSTRACT. HEVC is a new video coding standard which is being developed by both ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). Its main goal is to improve video compression, compared with the actual standards. It is common practice, because of the flexibility in low power applications, to design video decoding systems using digital signal processors (DSP). Most of the time, these designs start with a code suitable to be executed in personal computers and then it is optimized forDSP technology. The main goal in this final degree project is to characterize the performance of a DSP based system executing an HEVC video decoder.
Resumo:
We propose a new algorithm for the design of prediction structures with low delay and limited penalty in the rate-distortion performance for multiview video coding schemes. This algorithm constitutes one of the elements of a framework for the analysis and optimization of delay in multiview coding schemes that is based in graph theory. The objective of the algorithm is to find the best combination of prediction dependencies to prune from a multiview prediction structure, given a number of cuts. Taking into account the properties of the graph-based analysis of the encoding delay, the algorithm is able to find the best prediction dependencies to eliminate from an original prediction structure, while limiting the number of cut combinations to evaluate. We show that this algorithm obtains optimum results in the reduction of the encoding latency with a lower computational complexity than exhaustive search alternatives.
Resumo:
Este proyecto se basa en la integración de funciones optimizadas de OpenHEVC en el códec Reconfigurable Video Coding (RVC) - High Efficiency Video Coding (HEVC). RVC es un framework capaz de generar automáticamente el código que implementa cualquier estándar de video mediante el uso de librerías. Estas librerías contienen la definición de bloques funcionales de los que se componen los distintos estándares de video a implementar. Sin embargo, como desventaja a la facilidad de creación de estándares utilizando este framework, las librerías que utiliza no se encuentran optimizadas. Por ello se pretende que el códec RVC-HEVC sea capaz de realizar llamadas a funciones optimizadas, que para el estudio éstas se encontrarán en la librería OpenHEVC. Por otro lado, estos codificadores de video se pueden encontrar implementados tanto en PCs como en sistemas embebidos. Los Digital Signal Processors (DSPs) son unas plataformas especializadas en el procesamiento digital, teniendo una alta velocidad en el cómputo de operaciones matemáticas. Por ello, para este proyecto se integrará RVC-HEVC con las llamadas a OpenHEVC en una plataforma DSP como la TMS320C6678. Una vez completa la integración se efectuan medidas de eficiencia para ver cómo las llamadas a funciones optimizadas mejoran la velocidad en la decodificación de imágenes. ABSTRACT. This project is based in the integration of optimized functions from OpenHEVC in the RVC-HEVC (Reconfigurable Video Coding- High Efficiency Video Coding) codec. RVC is a framework capable of generating automatically any type of video standard with the use of libraries. Inside these libraries there are the definitions of the functional blocks which make up the different standards, in which for the case of study will be the HEVC standard. Nevertheless, as a downside for the simplicity in producing standards with the RVC tool, these libraries are not optimized. Thus, one of the goals for the project will be to make the RVC-HEVC call optimized functions, in which in this case they will be inside the OpenHEVC library. On the other hand, these video encoders can be implemented both in PCs and embedded systems. The DSPs (Digital Signal Processors) are platforms specialized in digital processing, being able to compute mathematical operations in a short period of time. Consequently, for this project the integration of the RVC-HEVC with calls to the OpenHEVC library will be done in a DSP platform such as a TMS320C6678. Once completed the integration, performance measures will be carried out to evaluate the improvement in the decoding speed obtained when optimized functions are used by the RVC-HEVC.
Resumo:
Se va a realizar un estudio de la codificación de imágenes sobre el estándar HEVC (high-effiency video coding). El proyecto se va a centrar en el codificador híbrido, más concretamente sobre la aplicación de la transformada inversa del coseno que se realiza tanto en codificador como en el descodificador. La necesidad de codificar vídeo surge por la aparición de la secuencia de imágenes como señales digitales. El problema principal que tiene el vídeo es la cantidad de bits que aparecen al realizar la codificación. Como consecuencia del aumento de la calidad de las imágenes, se produce un crecimiento exponencial de la cantidad de información a codificar. La utilización de las transformadas al procesamiento digital de imágenes ha aumentado a lo largo de los años. La transformada inversa del coseno se ha convertido en el método más utilizado en el campo de la codificación de imágenes y video. Las ventajas de la transformada inversa del coseno permiten obtener altos índices de compresión a muy bajo coste. La teoría de las transformadas ha mejorado el procesamiento de imágenes. En la codificación por transformada, una imagen se divide en bloques y se identifica cada imagen a un conjunto de coeficientes. Esta codificación se aprovecha de las dependencias estadísticas de las imágenes para reducir la cantidad de datos. El proyecto realiza un estudio de la evolución a lo largo de los años de los distintos estándares de codificación de video. Se analiza el codificador híbrido con más profundidad así como el estándar HEVC. El objetivo final que busca este proyecto fin de carrera es la realización del núcleo de un procesador específico para la ejecución de la transformada inversa del coseno en un descodificador de vídeo compatible con el estándar HEVC. Es objetivo se logra siguiendo una serie de etapas, en las que se va añadiendo requisitos. Este sistema permite al diseñador hardware ir adquiriendo una experiencia y un conocimiento más profundo de la arquitectura final. ABSTRACT. A study about the codification of images based on the standard HEVC (high-efficiency video coding) will be developed. The project will be based on the hybrid encoder, in particular, on the application of the inverse cosine transform, which is used for the encoder as well as for the decoder. The necessity of encoding video arises because of the appearance of the sequence of images as digital signals. The main problem that video faces is the amount of bits that appear when making the codification. As a consequence of the increase of the quality of the images, an exponential growth on the quantity of information that should be encoded happens. The usage of transforms to the digital processing of images has increased along the years. The inverse cosine transform has become the most used method in the field of codification of images and video. The advantages of the inverse cosine transform allow to obtain high levels of comprehension at a very low price. The theory of the transforms has improved the processing of images. In the codification by transform, an image is divided in blocks and each image is identified to a set of coefficients. This codification takes advantage of the statistic dependence of the images to reduce the amount of data. The project develops a study of the evolution along the years of the different standards in video codification. In addition, the hybrid encoder and the standard HEVC are analyzed more in depth. The final objective of this end of degree project is the realization of the nucleus from a specific processor for the execution of the inverse cosine transform in a decoder of video that is compatible with the standard HEVC. This objective is reached following a series of stages, in which requirements are added. This system allows the hardware designer to acquire a deeper experience and knowledge of the final architecture.
Análisis de las herramientas ORCC y Vivado HLS para la Síntesis de Modelos de Flujo de Datos RVC-CAL
Resumo:
En este Proyecto Fin de Grado se ha realizado un estudio de cómo generar, a partir de modelos de flujo de datos en RVC-CAL (Reconfigurable Video Coding – CAL Actor Language), modelos VHDL (Versatile Hardware Description Language) mediante Vivado HLS (Vivado High Level Synthesis), incluida en las herramientas disponibles en Vivado de Xilinx. Una vez conseguido el modelo VHDL resultante, la intención es que mediante las herramientas de Xilinx se programe en una FPGA (Field Programmable Gate Array) o el dispositivo Zynq también desarrollado por Xilinx. RVC-CAL es un lenguaje de flujo de datos que describe la funcionalidad de bloques funcionales, denominados actores. Las funcionalidades que desarrolla un actor se definen como acciones, las cuales pueden ser diferentes en un mismo actor. Los actores pueden comunicarse entre sí y formar una red de actores o network. Con Vivado HLS podemos obtener un diseño VHDL a partir de un modelo en lenguaje C. Por lo que la generación de modelos en VHDL a partir de otros en RVC-CAL, requiere una fase previa en la que los modelos en RVC-CAL serán compilados para conseguir su equivalente en lenguaje C. El compilador ORCC (Open RVC-CAL Compiler) es la herramienta que nos permite lograr diseños en lenguaje C partiendo de modelos en RVC-CAL. ORCC no crea directamente el código ejecutable, sino que genera un código fuente disponible para ser compilado por otra herramienta, en el caso de este proyecto, el compilador GCC (Gnu C Compiler) de Linux. En resumen en este proyecto nos encontramos con tres puntos de estudio bien diferenciados, los cuales son: 1. Partimos de modelos de flujo de datos en RVC-CAL, los cuales son compilados por ORCC para alcanzar su traducción en lenguaje C. 2. Una vez conseguidos los diseños equivalentes en lenguaje C, son sintetizados en Vivado HLS para conseguir los modelos en VHDL. 3. Los modelos VHDL resultantes serian manipulados por las herramientas de Xilinx para producir el bitstream que sea programado en una FPGA o en el dispositivo Zynq. En el estudio del segundo punto, nos encontramos con una serie de elementos conflictivos que afectan a la síntesis en Vivado HLS de los diseños en lenguaje C generados por ORCC. Estos elementos están relacionados con la manera que se encuentra estructurada la especificación en C generada por ORCC y que Vivado HLS no puede soportar en determinados momentos de la síntesis. De esta manera se ha propuesto una transformación “manual” de los diseños generados por ORCC que afecto lo menos posible a los modelos originales para poder realizar la síntesis con Vivado HLS y crear el fichero VHDL correcto. De esta forma este documento se estructura siguiendo el modelo de un trabajo de investigación. En primer lugar, se exponen las motivaciones y objetivos que apoyan y se esperan lograr en este trabajo. Seguidamente, se pone de manifiesto un análisis del estado del arte de los elementos necesarios para el desarrollo del mismo, proporcionando los conceptos básicos para la correcta comprensión y estudio del documento. Se realiza una descripción de los lenguajes RVC-CAL y VHDL, además de una introducción de las herramientas ORCC y Vivado, analizando las bondades y características principales de ambas. Una vez conocido el comportamiento de ambas herramientas, se describen las soluciones desarrolladas en nuestro estudio de la síntesis de modelos en RVC-CAL, poniéndose de manifiesto los puntos conflictivos anteriormente señalados que Vivado HLS no puede soportar en la síntesis de los diseños en lenguaje C generados por el compilador ORCC. A continuación se presentan las soluciones propuestas a estos errores acontecidos durante la síntesis, con las cuales se pretende alcanzar una especificación en C más óptima para una correcta síntesis en Vivado HLS y alcanzar de esta forma los modelos VHDL adecuados. Por último, como resultado final de este trabajo se extraen un conjunto de conclusiones sobre todos los análisis y desarrollos acontecidos en el mismo. Al mismo tiempo se proponen una serie de líneas futuras de trabajo con las que se podría continuar el estudio y completar la investigación desarrollada en este documento. ABSTRACT. In this Project it has made a study of how to generate, from data flow models in RVC-CAL (Reconfigurable Video Coding - Actor CAL Language), VHDL models (Versatile Hardware Description Language) by Vivado HLS (Vivado High Level Synthesis), included in the tools available in Vivado of Xilinx. Once achieved the resulting VHDL model, the intention is that by the Xilinx tools programmed in FPGA or Zynq device also developed by Xilinx. RVC-CAL is a dataflow language that describes the functionality of functional blocks, called actors. The functionalities developed by an actor are defined as actions, which may be different in the same actor. Actors can communicate with each other and form a network of actors. With Vivado HLS we can get a VHDL design from a model in C. So the generation of models in VHDL from others in RVC-CAL requires a preliminary phase in which the models RVC-CAL will be compiled to get its equivalent in C. The compiler ORCC (Open RVC-CAL Compiler) is the tool that allows us to achieve designs in C language models based on RVC-CAL. ORCC not directly create the executable code but generates an available source code to be compiled by another tool, in the case of this project, the GCC compiler (GNU C Compiler) of Linux. In short, in this project we find three well-defined points of study, which are: 1. We start from data flow models in RVC-CAL, which are compiled by ORCC to achieve its translation in C. 2. Once you realize the equivalent designs in C, they are synthesized in Vivado HLS for VHDL models. 3. The resulting models VHDL would be manipulated by Xilinx tools to produce the bitstream that is programmed into an FPGA or Zynq device. In the study of the second point, we find a number of conflicting elements that affect the synthesis Vivado HLS designs in C generated by ORCC. These elements are related to the way it is structured specification in C generated ORCC and Vivado HLS cannot hold at certain times of the synthesis. Thus it has proposed a "manual" transformation of designs generated by ORCC that affected as little as possible to the original in order to perform the synthesis Vivado HLS and create the correct file VHDL models. Thus this document is structured along the lines of a research. First, the motivations and objectives that support and hope to reach in this work are presented. Then it shows an analysis the state of the art of the elements necessary for its development, providing the basics for a correct understanding and study of the document. A description of the RVC-CAL and VHDL languages is made, in addition an introduction of the ORCC and Vivado tools, analyzing the advantages and main features of both. Once you know the behavior of both tools, the solutions developed in our study of the synthesis of RVC-CAL models, introducing the conflicting points mentioned above are described that Vivado HLS cannot stand in the synthesis of design in C language generated by ORCC compiler. Below the proposed solutions to these errors occurred during synthesis, with which it is intended to achieve optimum C specification for proper synthesis Vivado HLS and thus create the appropriate VHDL models are presented. Finally, as the end result of this work a set of conclusions on all analyzes and developments occurred in the same are removed. At the same time a series of future lines of work which could continue to study and complete the research developed in this document are proposed.
Resumo:
As digital systems move away from traditional desktop setups, new interaction paradigms are emerging that better integrate with users’ realworld surroundings, and better support users’ individual needs. While promising, these modern interaction paradigms also present new challenges, such as a lack of paradigm-specific tools to systematically evaluate and fully understand their use. This dissertation tackles this issue by framing empirical studies of three novel digital systems in embodied cognition – an exciting new perspective in cognitive science where the body and its interactions with the physical world take a central role in human cognition. This is achieved by first, focusing the design of all these systems on a contemporary interaction paradigm that emphasizes physical interaction on tangible interaction, a contemporary interaction paradigm; and second, by comprehensively studying user performance in these systems through a set of novel performance metrics grounded on epistemic actions, a relatively well established and studied construct in the literature on embodied cognition. The first system presented in this dissertation is an augmented Four-in-a-row board game. Three different versions of the game were developed, based on three different interaction paradigms (tangible, touch and mouse), and a repeated measures study involving 36 participants measured the occurrence of three simple epistemic actions across these three interfaces. The results highlight the relevance of epistemic actions in such a task and suggest that the different interaction paradigms afford instantiation of these actions in different ways. Additionally, the tangible version of the system supports the most rapid execution of these actions, providing novel quantitative insights into the real benefits of tangible systems. The second system presented in this dissertation is a tangible tabletop scheduling application. Two studies with single and paired users provide several insights into the impact of epistemic actions on the user experience when these are performed outside of a system’s sensing boundaries. These insights are clustered by the form, size and location of ideal interface areas for such offline epistemic actions to occur, as well as how can physical tokens be designed to better support them. Finally, and based on the results obtained to this point, the last study presented in this dissertation directly addresses the lack of empirical tools to formally evaluate tangible interaction. It presents a video-coding framework grounded on a systematic literature review of 78 papers, and evaluates its value as metric through a 60 participant study performed across three different research laboratories. The results highlight the usefulness and power of epistemic actions as a performance metric for tangible systems. In sum, through the use of such novel metrics in each of the three studies presented, this dissertation provides a better understanding of the real impact and benefits of designing and developing systems that feature tangible interaction.
Resumo:
Esta dissertação apresenta um trabalho sobre codificação de vídeo 3D compatível com vídeo 2D. Tem por base o desenvolvimento de um método para melhorar, no descodificador, a reconstrução de uma vista subamostrada resultante de uma transmissão simulcast usando a norma de codificação de vídeo H.265 (informalmente denominada de High Efficiency Video Coding (HEVC)). Apesar de manter a compatibilidade com vídeo 2D a transmissão simulcast normalmente requer uma taxa de transmissão elevada. Na ausência de ferramentas de codificação 3D adequadas é possível reduzir a taxa de transmissão utilizando compressão assimétrica do vídeo, onde a vista base é codificada com a resolução espacial original, enquanto que a vista auxiliar é codificada com uma resolução espacial menor, sendo sobreamostrada no descodificador. O método desenvolvido visa melhorar a vista auxiliar sobreamostrada no descodificador utilizando informação dos detalhes da vista base, ou seja, as componentes de alta frequência. Este processo depende de transformadas Afim para realizar um mapeamento geométrico entre a informação de alta frequência da vista base de resolução completa e a vista auxiliar de menor resolução. Adicionalmente, de modo a manter a continuidade do conteúdo da imagem entre regiões, evitando artefatos de blocos, o mapeamento utiliza uma malha de triangulação da vista auxiliar aplicado à imagem de detalhes obtida a partir da vista base. A técnica proposta é comparada com um método de estimação de disparidade por correspondência de blocos, sendo que os resultados mostram que para algumas sequências a técnica desenvolvida melhora não só a qualidade objetiva (PSNR) até 2.2 dB, mas também a qualidade subjetiva, para a mesma taxa de compressão global.
Resumo:
Medical imaging technology and applications are continuously evolving, dealing with images of increasing spatial and temporal resolutions, which allow easier and more accurate medical diagnosis. However, this increase in resolution demands a growing amount of data to be stored and transmitted. Despite the high coding efficiency achieved by the most recent image and video coding standards in lossy compression, they are not well suited for quality-critical medical image compression where either near-lossless or lossless coding is required. In this dissertation, two different approaches to improve lossless coding of volumetric medical images, such as Magnetic Resonance and Computed Tomography, were studied and implemented using the latest standard High Efficiency Video Encoder (HEVC). In a first approach, the use of geometric transformations to perform inter-slice prediction was investigated. For the second approach, a pixel-wise prediction technique, based on Least-Squares prediction, that exploits inter-slice redundancy was proposed to extend the current HEVC lossless tools. Experimental results show a bitrate reduction between 45% and 49%, when compared with DICOM recommended encoders, and 13.7% when compared with standard HEVC.
Resumo:
Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnoloigia, 2016.
Resumo:
Image and video compression play a major role in the world today, allowing the storage and transmission of large multimedia content volumes. However, the processing of this information requires high computational resources, hence the improvement of the computational performance of these compression algorithms is very important. The Multidimensional Multiscale Parser (MMP) is a pattern-matching-based compression algorithm for multimedia contents, namely images, achieving high compression ratios, maintaining good image quality, Rodrigues et al. [2008]. However, in comparison with other existing algorithms, this algorithm takes some time to execute. Therefore, two parallel implementations for GPUs were proposed by Ribeiro [2016] and Silva [2015] in CUDA and OpenCL-GPU, respectively. In this dissertation, to complement the referred work, we propose two parallel versions that run the MMP algorithm in CPU: one resorting to OpenMP and another that converts the existing OpenCL-GPU into OpenCL-CPU. The proposed solutions are able to improve the computational performance of MMP by 3 and 2:7 , respectively. The High Efficiency Video Coding (HEVC/H.265) is the most recent standard for compression of image and video. Its impressive compression performance, makes it a target for many adaptations, particularly for holoscopic image/video processing (or light field). Some of the proposed modifications to encode this new multimedia content are based on geometry-based disparity compensations (SS), developed by Conti et al. [2014], and a Geometric Transformations (GT) module, proposed by Monteiro et al. [2015]. These compression algorithms for holoscopic images based on HEVC present an implementation of specific search for similar micro-images that is more efficient than the one performed by HEVC, but its implementation is considerably slower than HEVC. In order to enable better execution times, we choose to use the OpenCL API as the GPU enabling language in order to increase the module performance. With its most costly setting, we are able to reduce the GT module execution time from 6.9 days to less then 4 hours, effectively attaining a speedup of 45 .
Resumo:
Automatic indexing and retrieval of digital data poses major challenges. The main problem arises from the ever increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions, or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. For a number of years research has been ongoing in the field of ontological engineering with the aim of using ontologies to add such (meta) knowledge to information. In this paper, we describe the architecture of a system (Dynamic REtrieval Analysis and semantic metadata Management (DREAM)) designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval. The DREAM Demonstrator has been evaluated as deployed in the film post-production phase to support the process of storage, indexing and retrieval of large data sets of special effects video clips as an exemplar application domain. This paper provides its performance and usability results and highlights the scope for future enhancements of the DREAM architecture which has proven successful in its first and possibly most challenging proving ground, namely film production, where it is already in routine use within our test bed Partners' creative processes. (C) 2009 Published by Elsevier B.V.
Resumo:
In free viewpoint applications, the images are captured by an array of cameras that acquire a scene of interest from different perspectives. Any intermediate viewpoint not included in the camera array can be virtually synthesized by the decoder, at a quality that depends on the distance between the virtual view and the camera views available at decoder. Hence, it is beneficial for any user to receive camera views that are close to each other for synthesis. This is however not always feasible in bandwidth-limited overlay networks, where every node may ask for different camera views. In this work, we propose an optimized delivery strategy for free viewpoint streaming over overlay networks. We introduce the concept of layered quality-of-experience (QoE), which describes the level of interactivity offered to clients. Based on these levels of QoE, camera views are organized into layered subsets. These subsets are then delivered to clients through a prioritized network coding streaming scheme, which accommodates for the network and clients heterogeneity and effectively exploit the resources of the overlay network. Simulation results show that, in a scenario with limited bandwidth or channel reliability, the proposed method outperforms baseline network coding approaches, where the different levels of QoE are not taken into account in the delivery strategy optimization.
Resumo:
A highly parallel and scalable Deblocking Filter (DF) hardware architecture for H.264/AVC and SVC video codecs is presented in this paper. The proposed architecture mainly consists on a coarse grain systolic array obtained by replicating a unique and homogeneous Functional Unit (FU), in which a whole Deblocking-Filter unit is implemented. The proposal is also based on a novel macroblock-level parallelization strategy of the filtering algorithm which improves the final performance by exploiting specific data dependences. This way communication overhead is reduced and a more intensive parallelism in comparison with the existing state-of-the-art solutions is obtained. Furthermore, the architecture is completely flexible, since the level of parallelism can be changed, according to the application requirements. The design has been implemented in a Virtex-5 FPGA, and it allows filtering 4CIF (704 × 576 pixels @30 fps) video sequences in real-time at frequencies lower than 10.16 Mhz.
Resumo:
In this paper we present a scalable software architecture for on-line multi-camera video processing, that guarantees a good trade off between computational power, scalability and flexibility. The software system is modular and its main blocks are the Processing Units (PUs), and the Central Unit. The Central Unit works as a supervisor of the running PUs and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application has been presented. In this paper, as case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions such as number of cameras and image sizes. The results show that the software architecture scales well with the number of camera and can easily works with different image formats respecting the real time constraints. Moreover, the parallelization approach can be used in order to speed up the processing tasks with a low level of overhead