950 results for 3D Video Telecommunication Multimedia
Abstract:
This dissertation presents a study and experimental research on asymmetric coding of stereoscopic video. A review of 3D technologies, video formats and coding is first presented, and then particular emphasis is given to asymmetric coding of 3D content and to methods, based on subjective measures, for evaluating the performance of asymmetric coding. The research objective was defined as an extension of the current concept of asymmetric coding for stereo video. To achieve this objective, the first step consists in defining regions in the spatial dimension of the auxiliary view with different perceptual relevance within the stereo pair, which are identified by a binary mask. These regions are then encoded with better quality (lower quantisation) for the most relevant ones and worse quality (higher quantisation) for those with lower perceptual relevance. The estimation of the relevance of a given region is based on a measure of disparity according to the absolute difference between views. To allow encoding of a stereo sequence using this method, a reference H.264/MVC encoder (JM) has been modified to accept additional configuration parameters and inputs. The final encoder is still standard compliant. In order to show the viability of the method, subjective assessment tests were performed over a wide range of objective qualities of the auxiliary view. The results of these tests support three main conclusions. First, the proposed method can be more efficient than traditional asymmetric coding when encoding stereo video at higher qualities/rates. Second, the method can be used to extend the threshold at which uniform asymmetric coding methods start to have an impact on the subjective quality perceived by the observers. Finally, the issue of eye dominance is addressed: results from stereo still images displayed over a short period of time showed that it has little or no impact on the proposed method.
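The region-based quantisation described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's modified JM encoder: the block size, the threshold, the QP values, the function names, and the assumption that a larger absolute difference between views marks a more relevant region are all hypothetical choices made for illustration.

```python
def relevance_mask(main, aux, block=2, threshold=10.0):
    """Per-block binary mask: 1 where the mean absolute difference
    between the two views exceeds `threshold` (assumed to mean
    'more relevant'), 0 elsewhere. Views are lists of rows of luma values."""
    rows, cols = len(aux), len(aux[0])
    mask = []
    for by in range(0, rows, block):
        mask_row = []
        for bx in range(0, cols, block):
            total = 0
            count = 0
            for y in range(by, min(by + block, rows)):
                for x in range(bx, min(bx + block, cols)):
                    total += abs(main[y][x] - aux[y][x])
                    count += 1
            mask_row.append(1 if total / count > threshold else 0)
        mask.append(mask_row)
    return mask


def qp_map(mask, qp_relevant=26, qp_other=38):
    """Lower QP (finer quantisation) for relevant blocks, higher QP elsewhere."""
    return [[qp_relevant if m else qp_other for m in row] for row in mask]


# Toy 2x4 views: the right half differs strongly between views.
main_view = [[100, 100, 50, 50], [100, 100, 50, 50]]
aux_view = [[100, 100, 90, 90], [100, 100, 90, 90]]
mask = relevance_mask(main_view, aux_view)
qps = qp_map(mask)
```

In this toy case the left block is identical in both views and receives the coarser quantiser, while the right block receives the finer one.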
Abstract:
Immersion and interaction have been identified as key factors influencing the quality of experience in stereoscopic video systems. The work presented here aims to create a new paradigm for 3D multimedia consumption that exploits these factors in order to increase user involvement. We use a 5-sided CAVE™ environment to support 3D panoramic video reproduction, real-time insertion of synthetic objects into the three-dimensional scene and real-time user interaction with the inserted elements. In this paper we describe our system requirements, functionalities, conceptual design and preliminary implementation results, emphasizing the most relevant challenges addressed. The focus is on three main issues: the generation of stereoscopic video panoramas; the synchronous reproduction of immersive 3D video across multiple screens; and the real-time insertion algorithm implemented for the integration of synthetic objects into the stereoscopic video. These results have been successfully integrated into the graphics engine managing the operation of the CAVE™ infrastructure.
Abstract:
Wireless Mesh Networks (WMNs) are expected to be one of the most important wireless technologies for providing last-mile access in future multimedia networks. They will allow thousands of fixed and mobile users to access, produce and share multimedia content ubiquitously. In this context, 3D video is expected to attract a growing share of the multimedia market, with the prospect of enhancing applications such as video surveillance, mission-critical control and entertainment. However, the challenge of dealing with the fluctuating bandwidth, scarce resources and time-varying error rates of these networks illustrates the need for more error-resilient 3D video transmission. Approaches such as Forward Error Correction (FEC) therefore become necessary to deliver video applications to wireless users with better guarantees of Quality of Service (QoS) and Quality of Experience (QoE). This dissertation presents an FEC-based mechanism with Unequal Error Protection (UEP) to improve 3D video transmission in WMNs, increasing user satisfaction and enabling better use of wireless resources. The benefits and impact of the proposed mechanism will be demonstrated through simulation, and the evaluation will be carried out using objective and subjective QoE metrics.
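The idea of unequal error protection can be sketched with a deliberately simple stand-in for a real FEC code: one XOR parity packet per group of packets, with smaller groups (more redundancy) for the packets assumed to matter most. The abstract does not say which code or which priority classes the dissertation uses, so the XOR scheme, the group sizes and all function names below are assumptions.

```python
def xor_parity(packets):
    """XOR equal-length byte strings into a single parity packet."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def protect(packets, group_size):
    """Split packets into groups and attach one parity packet to each group."""
    groups = []
    for start in range(0, len(packets), group_size):
        group = packets[start:start + group_size]
        groups.append((group, xor_parity(group)))
    return groups


def recover(group, parity, lost_index):
    """Rebuild the single lost packet of a group from the survivors and parity."""
    survivors = [p for i, p in enumerate(group) if i != lost_index]
    return xor_parity(survivors + [parity])


def uep_protect(important, ordinary, k_important=2, k_ordinary=4):
    """Unequal protection: important packets get parity every 2 packets
    (50% overhead), ordinary packets every 4 packets (25% overhead)."""
    return protect(important, k_important), protect(ordinary, k_ordinary)


important = [b"AAAA", b"BBBB"]                    # e.g. packets assumed critical
ordinary = [b"cccc", b"dddd", b"eeee", b"ffff"]   # e.g. less critical packets
hi_groups, lo_groups = uep_protect(important, ordinary)
group, parity = hi_groups[0]
rebuilt = recover(group, parity, lost_index=1)
```

A single parity packet can repair only one loss per group, which is why shrinking the group size for the important class raises its resilience at the cost of extra bandwidth.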
Abstract:
This paper introduces a database of freely available stereo-3D content designed to facilitate research in stereo post-production. It describes the structure and content of the database and provides some details about how the material was gathered. The database includes examples of many of the scenarios characteristic of broadcast footage. Material was gathered at different locations, including a studio with controlled lighting and both indoor and outdoor on-location sites with more restricted lighting control. The database also includes video sequences with accompanying 3D audio data recorded in an Ambisonics format. An intended consequence of gathering the material in this way is that the database contains examples of degradations that would commonly be present in real-world scenarios. This paper describes one such artefact, caused by uneven exposure in the stereo views, which leads to saturation in the over-exposed view. An algorithm for the restoration of this artefact is proposed in order to highlight the usefulness of the database.
Experimental Prototype Merging Stereo Panoramic Video and Interactive 3D Content in a 5-sided CAVE™
Abstract:
Immersion and interaction have been identified as key factors influencing the quality of experience in stereoscopic video systems. An experimental prototype designed to explore the influence of these factors in 3D video applications is described here. The focus is on the algorithm for real-time insertion of new 3D models into the original video streams. Using this algorithm, our prototype aims to explore a new interaction paradigm, similar to the augmented reality approach, for 3D video applications.
Abstract:
This thesis proposes a comprehensive approach to the monitoring and management of Quality of Experience (QoE) in multimedia delivery services over IP. It addresses the problem of preventing, detecting, measuring, and reacting to QoE degradations, under the constraints of a service provider: the solution must scale to a wide IP network delivering individual media streams to thousands of users simultaneously. The proposed monitoring solution is called QuEM (Qualitative Experience Monitoring). It is based on the detection of degradations in the network Quality of Service (packet losses, abrupt bandwidth drops...) and the mapping of each degradation event to a qualitative description of its effect on the perceived Quality of Experience (audio mutes, video artifacts...). This mapping is based on the analysis of the transport and Network Abstraction Layer information of the coded stream, and allows a good characterization of the most relevant defects found in this kind of service: screen freezing, macroblocking, audio mutes, video quality drops, delay issues, and service outages. The results have been validated by subjective quality assessment tests. The methodology used for those tests has also been designed to mimic as closely as possible the viewing conditions of a real user of those services: the impairments to be evaluated are introduced randomly in the middle of a continuous video stream.
Based on the monitoring solution, several applications have been proposed as well: an unequal error protection system that provides higher protection to the parts of the stream that are more critical for the QoE, a solution that applies the same principles to minimize the impact of incomplete segment downloads in HTTP Adaptive Streaming, and a selective scrambling algorithm that ciphers only the most sensitive parts of the media stream. A fast channel change application is also presented, as well as a discussion of how to apply the previous results and concepts in a 3D video scenario.
Abstract:
This paper presents the implementation of a high-quality real-time 3D video system intended for 3D videoconferencing. The system extracts depth information from a pair of images coming from a short-baseline camera setup. It is based on a variant of the adaptive support-weight algorithm designed to run on GPU-based architectures. The aim is to obtain real-time results without compromising accuracy, and to reduce costs by using commodity hardware. The complete system runs on the GStreamer multimedia software platform to make it more flexible. Moreover, an autostereoscopic display has been used as the end terminal for 3D content visualization.
Abstract:
Searching for multimedia is an important activity for users of Web search engines. Studying users' interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Web log analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) since 1997, image search has consistently been the dominant media type searched, followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for the design of Web multimedia search engines are discussed.
Abstract:
Real-time anomaly detection is a pressing need for security applications. In this article, we propose a real-time anomaly detection method for H.264 compressed video streams that uses pre-encoded motion vectors (MVs). The work is principally motivated by the observation that MVs exhibit distinct characteristics during an anomaly compared with usual behavior. Our observations show that H.264 MV magnitude and orientation contain relevant information that can be used to model the usual behavior (UB) effectively. This model is subsequently extended to detect abnormality/anomaly based on the probability of occurrence of a behavior. The performance of the proposed algorithm was evaluated and benchmarked on the UMN and Ped anomaly detection video datasets, with a detection rate of 70 frames per second resulting in 90x and 250x speedups, along with detection accuracy on par with state-of-the-art algorithms.
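One way to picture a usual-behavior model built from MV magnitude and orientation is a joint histogram learned from normal footage, with frames flagged when too many of their vectors fall into low-probability bins. This is only a sketch under assumed parameters: the bin sizes, the thresholds, and the function names are hypothetical, and the article's actual model may differ.

```python
import math
from collections import Counter


def mv_bin(dx, dy, mag_step=4.0, n_orient=8):
    """Quantise a motion vector into a (magnitude bin, orientation bin) pair."""
    magnitude = math.hypot(dx, dy)
    angle = math.atan2(dy, dx) % (2 * math.pi)
    orient = int(angle / (2 * math.pi) * n_orient) % n_orient
    return (int(magnitude // mag_step), orient)


def fit_usual_behavior(training_mvs):
    """Estimate the probability of each bin from MVs of normal footage."""
    counts = Counter(mv_bin(dx, dy) for dx, dy in training_mvs)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def frame_is_anomalous(model, frame_mvs, min_prob=0.01, max_rare_fraction=0.3):
    """Flag a frame when too many of its MVs are improbable under the model."""
    if not frame_mvs:
        return False
    rare = sum(1 for dx, dy in frame_mvs
               if model.get(mv_bin(dx, dy), 0.0) < min_prob)
    return rare / len(frame_mvs) > max_rare_fraction


# Normal footage: slow rightward motion only.
model = fit_usual_behavior([(1, 0)] * 95 + [(2, 0)] * 5)
calm = frame_is_anomalous(model, [(1, 0), (2, 0), (3, 0)])
panic = frame_is_anomalous(model, [(0, 20), (0, -20), (1, 0)])
```

Because the motion vectors are already present in the compressed stream, a model of this kind needs no pixel decoding, which is consistent with the speed advantage the abstract reports.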
Abstract:
An improved color video super-resolution technique using kernel regression and fuzzy enhancement is presented in this paper. A high-resolution frame is computed from a set of low-resolution video frames by kernel regression using an adaptive Gaussian kernel. A fuzzy smoothing filter is proposed to enhance the regression output. The proposed technique is a low-cost software solution for resolution enhancement of color video in multimedia applications. The performance of the proposed technique is evaluated using several color videos, and it is found to outperform other techniques in producing high-quality, high-resolution color videos.
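The regression step can be illustrated in one dimension with a zeroth-order (Nadaraya-Watson) estimator: samples from several low-resolution frames, each at its own sub-pixel position, are combined by Gaussian-weighted averaging onto a denser grid. The sketch uses a fixed bandwidth, whereas the paper describes an adaptive kernel; the function names and parameter values are assumptions, and the fuzzy enhancement stage is not covered.

```python
import math


def kernel_regress(samples, x, bandwidth=0.5):
    """Gaussian-weighted average of (position, value) samples at position x."""
    num = 0.0
    den = 0.0
    for xi, yi in samples:
        weight = math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2))
        num += weight * yi
        den += weight
    return num / den


def upsample(samples, factor=2, bandwidth=0.5):
    """Estimate values on a grid `factor` times denser than unit spacing."""
    lo = min(pos for pos, _ in samples)
    hi = max(pos for pos, _ in samples)
    n = int(round((hi - lo) * factor)) + 1
    return [kernel_regress(samples, lo + i / factor, bandwidth) for i in range(n)]


# Two low-resolution "frames" of the same ramp, offset by half a pixel.
frame_a = [(0.0, 0.0), (1.0, 10.0), (2.0, 20.0)]
frame_b = [(0.5, 5.0), (1.5, 15.0)]
high_res = upsample(frame_a + frame_b)
```

The half-pixel offset between the two frames is what supplies the extra information: each output position draws on nearby samples from both frames rather than interpolating a single one.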
Abstract:
Pervasive and ubiquitous computing has motivated research on multimedia adaptation, which aims at matching video quality to user needs and device restrictions. This technique has a high computational cost, which needs to be studied and estimated when designing architectures and applications. This paper presents an analytical model to quantify these video transcoding costs in a hardware-independent way. The model was used to analyze the impact of transcoding delays on end-to-end live video transmissions over LANs, MANs and WANs. Experiments confirm that the proposed model helps to define the best transcoding architecture for different scenarios.
Abstract:
3D video-fluoroscopy is an accurate but cumbersome technique for estimating natural or prosthetic human joint kinematics. This dissertation proposes innovative methodologies to improve the reliability and usability of 3D fluoroscopic analysis. Being based on direct radiographic imaging of the joint, and avoiding the soft tissue artefact that limits the accuracy of skin marker based techniques, fluoroscopic analysis has a potential accuracy of the order of mm/deg or better. It can provide fundamental information for clinical and methodological applications but, notwithstanding the number of methodological protocols proposed in the literature, time-consuming user interaction is required to obtain consistent results. This user-dependency has prevented a reliable quantification of the actual accuracy and precision of the methods and, consequently, slowed down the translation to clinical practice. The objective of the present work was to speed up this process by introducing methodological improvements in the analysis. In the thesis, the fluoroscopic analysis was characterized in depth, in order to evaluate its pros and cons and to provide reliable solutions to overcome its limitations. To this aim, an analytical approach was followed. The major sources of error were isolated with preliminary in-silico studies as: (a) geometric distortion and calibration errors, (b) 2D image and 3D model resolutions, (c) incorrect contour extraction, (d) bone model symmetries, (e) optimization algorithm limitations, (f) user errors. The effect of each issue was quantified and verified with a preliminary in-vivo study on the elbow joint. The dominant source of error was identified as the limited extent of the convergence domain of the local optimization algorithms, which forced the user to manually specify the starting pose for the estimation process.
To solve this problem, two different approaches were followed. To enlarge the convergence basin of the optimal pose, the local approach used sequential alignments of the 6 degrees of freedom in order of sensitivity, or a geometrical feature-based estimation of the initial conditions for the optimization; the global approach used an unsupervised memetic algorithm to optimally explore the search domain. The performance of the technique was evaluated with a series of in-silico studies and validated in-vitro with a phantom-based comparison against a radiostereometric gold standard. The accuracy of the method is joint-dependent. For the intact knee joint, the new unsupervised algorithm guaranteed a maximum error lower than 0.5 mm for in-plane translations, 10 mm for out-of-plane translation, and 3 deg for rotations in a mono-planar setup; and lower than 0.5 mm for translations and 1 deg for rotations in a bi-planar setup. The bi-planar setup is best suited when accurate results are needed, such as for methodological research studies. The mono-planar analysis may be sufficient for clinical applications in which analysis time and cost are an issue. A further reduction of user interaction was obtained for prosthetic joint kinematics. A mixed region-growing and level-set segmentation method was proposed, which halved the analysis time by delegating the computational burden to the machine. In-silico and in-vivo studies demonstrated that the reliability of the new semiautomatic method was comparable to that of a user-defined manual gold standard. The improved fluoroscopic analysis was finally applied to a first in-vivo methodological study on foot kinematics. Preliminary evaluations showed that the presented methodology represents a feasible gold standard for the validation of skin marker based foot kinematics protocols.