950 results for 3D Video Telecommunication Multimedia
Abstract:
We describe a user-assisted technique for 3D stereo conversion from 2D images. Our approach exploits the geometric structure of perspective images, including vanishing points. We allow a user to indicate lines, planes, and vanishing points in the input image, and directly employ these as constraints in an image warping framework to produce a stereo pair. By sidestepping explicit construction of a depth map, our approach is applicable to more general scenes and avoids potential artifacts of depth-image-based rendering. Our method is most suitable for scenes with large-scale structures such as buildings.
Abstract:
We present a novel framework for encoding latency analysis of arbitrary multiview video coding prediction structures. This framework avoids the need to consider a specific encoder architecture for encoding latency analysis by assuming unlimited processing capacity on the multiview encoder. Under this assumption, only the influence of the prediction structure and the processing times has to be considered, and the encoding latency is solved systematically by means of a graph model. The results obtained with this model are valid for a multiview encoder with sufficient processing capacity and serve as a lower bound otherwise. Furthermore, with the objective of low-latency encoder design with a low penalty on rate-distortion performance, the graph model allows us to identify the prediction relationships that add the most encoding latency to the encoder. Experimental results for JMVM prediction structures illustrate how low-latency prediction structures with a low rate-distortion penalty can be derived in a systematic manner using the new model.
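The graph idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes each frame is a node with a capture time and a processing time, that prediction references are edges, and that (with unlimited processing capacity) a frame starts encoding as soon as it is captured and all its references are encoded. The function name `encoding_latency` and the three-frame example are hypothetical.

```python
def encoding_latency(capture, proc, refs):
    """Per-frame encoding latency under an unlimited-capacity assumption.

    capture: dict frame -> capture time
    proc:    dict frame -> processing time
    refs:    dict frame -> list of frames it is predicted from
    (All three structures are illustrative assumptions.)
    """
    finish = {}

    def done(frame):
        # A frame can start once it is captured and every reference is encoded.
        if frame not in finish:
            ready = max([capture[frame]] + [done(r) for r in refs.get(frame, [])])
            finish[frame] = ready + proc[frame]
        return finish[frame]

    # Latency is the delay between capture and the end of encoding.
    return {f: done(f) - capture[f] for f in capture}


# Hypothetical single-view example: B1 is predicted from I0 and P2.
capture = {"I0": 0.0, "B1": 1.0, "P2": 2.0}
proc = {"I0": 0.5, "B1": 0.5, "P2": 0.5}
refs = {"B1": ["I0", "P2"], "P2": ["I0"]}
latencies = encoding_latency(capture, proc, refs)
worst = max(latencies.values())
```

In this toy case the backward reference from B1 to P2 is the relationship that dominates the latency, which is the kind of dependency such a model is meant to expose.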
Abstract:
There is an increasing need for easy and affordable technologies to automatically generate virtual 3D models from their real counterparts. In particular, 3D human reconstruction has driven the creation of many clever techniques, most of them based on the visual hull (VH) concept. Such techniques do not require expensive hardware; however, they tend to yield 3D humanoids with realistic bodies but mediocre faces, since VH cannot handle concavities. On the other hand, structured light projectors can capture very accurate depth data, and thus reconstruct realistic faces, but they are too expensive for several of them to be used together. We have developed a technique to merge a VH-based 3D mesh of a reconstructed humanoid with the depth data of its face, captured by a single structured light projector. By combining the advantages of both systems in a simple setting, we are able to reconstruct realistic 3D human models with believable faces.
Abstract:
We present an adaptive unequal error protection (UEP) strategy built on the 1-D interleaved parity Application Layer Forward Error Correction (AL-FEC) code for protecting the transmission of stereoscopic 3D video content encoded with Multiview Video Coding (MVC) through IP-based networks. Our scheme targets the minimization of quality degradation produced by packet losses during video transmission in time-sensitive application scenarios. To that end, based on a novel packet-level distortion model, it selects in real time the most suitable packets within each Group of Pictures (GOP) to be protected and the most convenient FEC technique parameters, i.e., the size of the FEC generator matrix. In order to make these decisions, it considers the relevance of the packet, the behavior of the channel, and the available bitrate for protection purposes. Simulation results validate both the distortion model introduced to estimate the importance of packets and the optimization of the FEC technique parameter values.
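As background for the 1-D interleaved parity code named in the abstract above, here is a minimal sketch of column-wise XOR parity. It assumes equal-length packets arranged row by row in a matrix with `L` columns, and it does not implement the adaptive packet selection or the distortion model described by the authors. The function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def column_parity(packets, L):
    """One parity packet per column: XOR of packets col, col+L, col+2L, ..."""
    size = len(packets[0])
    parity = []
    for col in range(L):
        p = bytes(size)  # all-zero packet is the XOR identity
        for pkt in packets[col::L]:
            p = xor_bytes(p, pkt)
        parity.append(p)
    return parity


def recover(received, parity, L):
    """Rebuild lost packets (None) in every column with exactly one loss."""
    out = list(received)
    for col in range(L):
        idxs = range(col, len(out), L)
        lost = [i for i in idxs if out[i] is None]
        if len(lost) == 1:
            p = parity[col]
            for i in idxs:
                if out[i] is not None:
                    p = xor_bytes(p, out[i])
            out[lost[0]] = p
    return out


# Hypothetical example: 12 packets, 4 columns, a burst loss of 3 packets.
packets = [bytes([i] * 4) for i in range(12)]
parity = column_parity(packets, 4)
burst = [None if i in (4, 5, 6) else p for i, p in enumerate(packets)]
repaired = recover(burst, parity, 4)

# Two losses in the same column (0 and 4) cannot be repaired by this code.
same_col = [None if i in (0, 4) else p for i, p in enumerate(packets)]
unrepaired = recover(same_col, parity, 4)
```

Because consecutive packets fall in different columns, a burst of up to `L` consecutive losses touches each column at most once, which is why interleaving helps against bursty channels.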
Abstract:
Automatic 2D-to-3D conversion is an important application for filling the gap between the increasing number of 3D displays and the still scant 3D content. However, existing approaches have an excessive computational cost that complicates their practical application. In this paper, a fast automatic 2D-to-3D conversion technique is proposed, which uses a machine learning framework to infer the 3D structure of a query color image from a training database with color and depth images. Assuming that photometrically similar images have analogous 3D structures, a depth map is estimated by searching for the most similar color images in the database and fusing the corresponding depth maps. Large databases are desirable to achieve better results, but the computational cost also increases. A clustering-based hierarchical search using compact SURF descriptors to characterize images is proposed to drastically reduce search times. A significant improvement in computation time has been obtained over other state-of-the-art approaches, while maintaining the quality of the results.
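The search-and-fuse step in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: it uses a brute-force nearest-neighbour search instead of the clustering-based hierarchical search, plain numeric vectors instead of SURF descriptors, and a per-pixel median as the fusion rule, which the abstract does not specify. The name `estimate_depth` and the toy database are hypothetical.

```python
import math
from statistics import median


def estimate_depth(query_desc, database, k=3):
    """Fuse the depth maps of the k database images closest to the query.

    database: list of (descriptor, depth_map) pairs, where depth_map is a
    list of rows and all maps share the same size (assumed layout).
    """
    # Brute-force ranking by Euclidean distance between descriptors.
    ranked = sorted(database, key=lambda entry: math.dist(query_desc, entry[0]))
    nearest = [depth for _, depth in ranked[:k]]
    rows, cols = len(nearest[0]), len(nearest[0][0])
    # Per-pixel median as an assumed fusion rule.
    return [
        [median(d[r][c] for d in nearest) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical toy database with 1x2 depth maps.
database = [
    ([0, 0], [[1, 2]]),
    ([1, 0], [[3, 4]]),
    ([0, 1], [[5, 6]]),
    ([10, 10], [[100, 100]]),
]
depth = estimate_depth([0, 0], database, k=3)
```

The distant entry with descriptor `[10, 10]` is excluded from the fusion, which mirrors the assumption that only photometrically similar images carry useful depth information.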
Abstract:
Objective: To evaluate the efficacy of treatment in 3 cases of intermittent exotropia (XT(i)) by means of vision therapy exercises, complementing the clinical examination with 3D video-oculography (VOG-3D), and to demonstrate the potential applicability of this technology for that purpose. Methods: We report the changes that occurred after vision therapy exercises in a 36-year-old woman with XT(i) of -25 prism diopters (pd) at distance and 18 pd at near; a 10-year-old boy with 8 pd of XT(i) in primary position, associated with +6 pd of left hypotropia; and a 63-year-old man with XT(i) of 6 pd in primary position, associated with +7 pd of right hypertropia. All patients had good corrected visual acuity in both eyes. The instability of the ocular deviation was demonstrated by VOG-3D analysis, which revealed vertical and torsional components. Vision therapy exercises were performed, including different types of vergence, accommodation, and diplopia-awareness exercises. Results: After vision therapy, excellent fusional vergence ranges and near point of convergence ("to the nose") were obtained. Examination with VOG-3D (SensoMotoric Instruments, Teltow, Germany) confirmed compensation of the deviation with stable ocular alignment. A significant improvement was observed after therapy in the vertical and torsional components, which became more stable. The patients were very satisfied with the results obtained. Conclusion: VOG-3D is a useful technique that provides an objective method for recording the compensation and stability of the ocular deviation after vision therapy exercises in cases of XT(i), offering a detailed analysis of the improvement in the vertical and torsional components.
Abstract:
This paper is concerned with long-term (20+ years) forecasting of broadband traffic in next-generation networks. Such a long-term approach requires going beyond extrapolations of past traffic data while facing high uncertainty in predicting future developments and the fact that, in 20 years, current network technologies and architectures will be obsolete. Thus, "order of magnitude" upper bounds of upstream and downstream traffic are deemed to be good enough to facilitate such long-term forecasting. These bounds can be obtained by evaluating the limits of human vision and assuming that these limits will be reached by future services or, alternatively, by considering the content transferred by bandwidth-demanding applications such as those using embedded interactive 3D video streaming. The traffic upper bounds are a good indication of the peak values and, subsequently, also of future network capacity demands. Furthermore, the main drivers of traffic growth, including multimedia as well as non-multimedia applications, are identified. New disruptive applications and services that can make good use of the large bandwidth provided by next-generation networks are explored. The results can be used to identify monetization opportunities of future services and to map potential revenues for network operators. © 2014 The Author(s).
Abstract:
This paper presents a comparison among different consumer 3D display technologies by means of a subjective assessment test. To this end, four 55-in displays have been considered: one autostereoscopic display, one stereoscopic display with polarized passive glasses, and two with active shutter glasses. In addition, a high-quality 3D video database has been used to show diverse material with both views in high definition. To carry out the test, standard recommendations have been followed, with some modifications aimed at a test environment closer to real home viewing conditions, with the objective of obtaining more representative conclusions. Moreover, several perceptual factors have been considered to study the performance of the displays, such as picture quality, depth perception, and visual discomfort. The results reveal interesting findings, such as the performance improvement of active shutter glasses technology, the high performance of polarized glasses technology in terms of quality and comfort, and the need to improve autostereoscopic displays so that they complement their visual comfort with an overall high-quality visual experience.
Abstract:
In 2009, QUT’s Office of Research and the Institute for Adult Learning Singapore funded a six-month pilot project that represented the first stage of a larger international comparative study. The study is the first of its kind to investigate to what extent and how digital content workers’ learning needs are being met by adult education and training in Australia and Singapore. The pilot project involved consolidating key theoretical literature, studies, policies, programs and statistical data relevant to the digital content industries in Australia and Singapore. This had not been done before, and represented new knowledge generation. Digital content workers include professionals within and beyond the creative industries as follows: Visual effects and animation (including virtual reality and 3D products); Interactive multimedia (e.g. websites, CD-ROMs) and software development; Computer and online games; and Digital film & TV production and film & TV post-production. In the last decade, the digital content industries have been recognised as an industry sector of strong and increasing significance. The project compared Australia and Singapore on aspects of the digital content industries’ labour market, skill requirements, human capital challenges, the role of adult education in building a workforce for the digital content industries, and innovation policies. The consolidated report generated from the project formed the basis of the proposal for an ARC Linkage Project application submitted in the May 2010 round.
Abstract:
A novel algorithm for Virtual View Synthesis based on Non-Local Means Filtering is presented in this paper. Apart from using the video frames from the nearby cameras and the corresponding per-pixel depth map, this algorithm also makes use of the previously synthesized frame. Simple and efficient, the algorithm can synthesize video at any given virtual viewpoint at a faster rate, without compromising the quality of the synthesized frame. Experimental results support this claim. The subjective and objective quality of the synthesized frames is comparable to that of existing algorithms.
Abstract:
Increasing the field of view of a holographic display while maintaining adequate image size is a difficult task. To address this problem, we designed a system that tessellates several sub-holograms into one large hologram at the output. The sub-holograms we generate are similar to kinoforms but are computed without the paraxial approximation. The sub-holograms are loaded onto a single spatial light modulator consecutively and relayed to the appropriate position at the output through a combination of optics and scanning reconstruction light. We review the method of computer-generated holography and describe the working principles of our system. Results from our proof-of-concept system show an improved field of view and reconstructed image size. ©2009 IEEE.
Abstract:
It is now possible to use powerful general purpose computer architectures to support post-production of both video and multimedia projects. By devising a suitable portable software architecture and using high-speed networking in an appropriate manner, a system has been constructed where editors are no longer tied to a specific location. New types of production, such as multi-threaded interactive video, are supported. Editors may also work remotely where very high speed network connection is not currently provided. An object-oriented database is used for the comprehensive cataloging of material and to support automatic audio/video object migration and replication. Copyright © 1997 by the Society of Motion Picture and Television Engineers, Inc.
Abstract:
Co-training is a semi-supervised learning method that is designed to take advantage of the redundancy that is present when the object to be identified has multiple descriptions. Co-training is known to work well when the multiple descriptions are conditionally independent given the class of the object. The presence of multiple descriptions of objects in the form of text, images, audio and video in multimedia applications appears to provide redundancy in a form that may be suitable for co-training. In this paper, we investigate the suitability of utilizing text and image data from the Web for co-training. We perform measurements to find indications of conditional independence in the texts and images obtained from the Web. Our measurements suggest that conditional independence is likely to be present in the data. Our experiments, conducted within a relevance feedback framework to test whether a method that exploits the conditional independence outperforms methods that do not, also indicate that better performance can indeed be obtained by designing algorithms that exploit this form of redundancy when it is present.
Abstract:
The creation of Wireless Personal Area Networks (WPANs) offers the Consumer Electronics industry a mechanism to truly unwire consumer products, leading to portability and ease of installation as never seen before. WPANs can offer data-rates exceeding those that are required to convey high quality broadcast video, thus users can easily connect to high quality video for multimedia presentations in education, libraries, advertising, or have a wireless connection at home. There have been many WPAN proposals, but this paper concentrates on ECMA-368 as this standard has the largest industrial and implementers' forum backing. This paper discusses the technology behind ECMA-368, the required numerical bandwidth, buffer memory requirements and implementation considerations while concentrating on supporting all the offered data-rates.
Abstract:
The creation of Wireless Personal Area Networks (WPANs) offers the Consumer Electronics industry a mechanism to truly unwire consumer products, leading to portability and ease of installation as never seen before. WPANs can offer data-rates exceeding those that are required to convey high quality broadcast video, thus users can easily connect to high quality video for multimedia presentations in education, libraries, advertising, or have a wireless connection at home. There have been many WPAN proposals, but this paper concentrates on ECMA-368 as this standard has the largest industrial and implementers' forum backing. With the aim of defining and creating cost-effective consumer electronic equipment, this paper discusses the technology behind the ECMA-368 physical layer, the available design freedom, the required processing, buffer memory requirements and implementation considerations while concentrating on supporting all the offered data-rates.