67 results for Video semantics
Abstract:
The use of new technologies in neurorehabilitation has led to higher-intensity rehabilitation processes, extending therapies in an economically sustainable way. Interactive Video (IV) technology allows therapists to work with virtual environments that reproduce real situations. In this way, patients deal with Activities of Daily Living (ADL) immersed within enhanced environments [1]. These rehabilitation exercises, which focus on relearning lost functions, aim to modulate neural plasticity processes [2]. This research presents a system in which an IV-based neurorehabilitation environment has been integrated with an eye-tracker device in order to monitor visual attention and to use it for interaction. While patients interact with the neurorehabilitation environment, their visual behavior is closely related to their cognitive state, which in turn mirrors the brain damage they have suffered [3] [4]. Patients' gaze data can provide knowledge about their attention focus and cognitive state, as well as about the validity of the proposed rehabilitation tasks [5].
Abstract:
The present work covers the first validation efforts of the EVA Tracking System for the assessment of minimally invasive surgery (MIS) psychomotor skills. Instrument movements were recorded for 42 surgeons (4 experts, 22 residents, 16 novice medical students) and analyzed for a box trainer peg transfer task. Construct validation was established for 7/9 motion analysis parameters (MAPs). Concurrent validation was determined for 8/9 MAPs against the TrEndo Tracking System. Finally, automatic determination of surgical proficiency based on the MAPs was sought through 3 different approaches to supervised classification (LDA, SVM, ANFIS), with accuracy results of 61.9%, 83.3% and 80.9% respectively. The results reflect not only on the validation of EVA for skills assessment, but also on the relevance of instrument motion analysis in determining surgical competence.
Abstract:
Background: Minimally invasive surgery creates two technological opportunities: (1) the development of better training and objective evaluation environments, and (2) the creation of image guided surgical systems.
Abstract:
Analysis of minimally invasive surgical videos is a powerful tool to drive new solutions for achieving reproducible training programs, objective and transparent assessment systems and navigation tools to assist surgeons and improve patient safety. This paper presents how video analysis contributes to the development of new cognitive and motor training and assessment programs as well as new paradigms for image-guided surgery.
Abstract:
This thesis presents a novel framework for the analysis and optimization of the encoding and decoding delay for multiview video. The objective of this framework is to provide a systematic methodology for analyzing the delay in multiview encoders and decoders, together with useful tools for designing multiview encoders/decoders for applications with low-delay requirements. The proposed framework first characterizes the elements that influence the delay performance: i) the multiview prediction structure, ii) the hardware model of the encoder/decoder, and iii) the frame processing times. Second, it provides algorithms for computing the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework is a methodology for analyzing the multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, complemented by a set of models that particularize this delay analysis to the characteristics of that hardware architecture. Among these models, those based on graph theory are especially relevant because they can decouple the influence of the different elements on the delay performance of the encoder/decoder by means of an abstraction of its processing capacity. To reveal possible applications of this framework, this thesis presents some examples of its use in design problems that affect multiview encoders and decoders. This application scenario covers the following cases: strategies for designing prediction structures that take delay requirements into consideration in addition to rate-distortion performance; design of the number of processors and analysis of processor speed requirements in multiview encoders/decoders given a target delay; and comparative analysis of the delay performance of multiview encoders with different processing capabilities and hardware implementations.
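The dependence of delay on the prediction structure can be illustrated with a minimal sketch. It is not the thesis's algorithm or any of its hardware models: it assumes an idealized encoder with unlimited parallel processors, so a frame is delayed only by its capture time and by the frames it is predicted from. The function name `encoding_latency`, the frame labels, and all timing values are hypothetical.

```python
def encoding_latency(capture, proc_time, refs):
    """Return per-frame completion times and the worst-case latency.

    capture[f]   : time at which frame f becomes available to the encoder
    proc_time[f] : time needed to encode frame f
    refs[f]      : frames that f is predicted from (must be encoded first)
    Assumption: unlimited parallel processors, so only the prediction
    dependencies (an acyclic graph) can delay a frame.
    """
    done = {}

    def finish(f):
        # A frame can start once it is captured and all references are done.
        if f not in done:
            ready = max([capture[f]] + [finish(r) for r in refs.get(f, ())])
            done[f] = ready + proc_time[f]
        return done[f]

    for f in capture:
        finish(f)
    latency = max(done[f] - capture[f] for f in capture)
    return done, latency


# Hypothetical two-view structure: view 0 uses a bidirectional frame (v0_1),
# and each frame of view 1 is predicted from the same-time frame of view 0.
capture = {"v0_0": 0, "v0_1": 40, "v0_2": 80, "v1_0": 0, "v1_1": 40, "v1_2": 80}
proc_time = {f: 10 for f in capture}
refs = {
    "v0_1": ["v0_0", "v0_2"],
    "v0_2": ["v0_0"],
    "v1_0": ["v0_0"],
    "v1_1": ["v0_1"],
    "v1_2": ["v0_2"],
}
done, latency = encoding_latency(capture, proc_time, refs)
print(done, latency)
```

In this toy structure the worst case is the frame `v1_1`, which has to wait for a bidirectionally predicted frame that itself waits for a future frame. Modeling a finite number of processors would require a scheduling step that this sketch deliberately omits.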
Abstract:
Using CMOS transistors for terahertz detection is currently a disruptive technology that offers the direct integration of a terahertz detector with video preamplifiers. The detectors are based on the resistive mixer concept and performance mainly depends on the following parameters: type of antenna, electrical parameters (gate to drain capacitor and channel length of the CMOS device) and foundry. Two different 300 GHz detectors are discussed: a single transistor detector with a broadband antenna and a differential pair driven by a resonant patch antenna.
Abstract:
In this work we focus on the methodology used in the Industrial Organization course, since it centers on the study of operations management, the topic of the IV Workshop OMTECH. We build on the studies of Dobro et al. (2011) and Dean & Jolly (2012) to propose a new classroom methodology for Operations Management in general, and for Industrial Organization in our case.
Abstract:
The increasing use of video editing software has resulted in a necessity for faster and more efficient editing tools. Here, we propose a lightweight high-quality video indexing tool that is suitable for video editing software.
Abstract:
A novel scheme for depth sequence compression, based on a perceptual coding algorithm, is proposed. A depth sequence describes the position of objects in the 3D scene and is used, in Free Viewpoint Video, for the generation of synthetic video sequences. In perceptual video coding, the characteristics of the human visual system are exploited to improve compression efficiency. Because depth sequences are never displayed, perceptual video coding assessed directly over them is not effective. The proposed algorithm is based on a novel perceptual rate-distortion optimization process, assessed over the perceptual distortion of the rendered views generated from the encoded depth sequences. The experimental results show the effectiveness of the proposed method, which obtains a very considerable improvement in the perceptual quality of the rendered views.
Abstract:
We present an adaptive unequal error protection (UEP) strategy built on the 1-D interleaved parity Application Layer Forward Error Correction (AL-FEC) code for protecting the transmission of stereoscopic 3D video content encoded with Multiview Video Coding (MVC) through IP-based networks. Our scheme targets the minimization of quality degradation produced by packet losses during video transmission in time-sensitive application scenarios. To that end, based on a novel packet-level distortion model, it selects in real time the most suitable packets within each Group of Pictures (GOP) to be protected and the most convenient FEC technique parameters, i.e., the size of the FEC generator matrix. In order to make these decisions, it considers the relevance of the packet, the behavior of the channel, and the available bitrate for protection purposes. Simulation results validate both the distortion model introduced to estimate the importance of packets and the optimization of the FEC technique parameter values.
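The parity code underlying this scheme can be sketched in a few lines. The sketch below assumes the usual reading of 1-D interleaved parity: packets are laid out row by row in a matrix and one XOR parity packet protects each column. It does not reproduce the abstract's distortion model or its adaptive packet and parameter selection, and the function names and the equal-size packet assumption are illustrative only.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def column_parity(packets, columns):
    """One parity packet per column; column c holds packets c, c+columns, ..."""
    size = len(packets[0])  # assumption: all packets have the same size
    parity = []
    for c in range(columns):
        p = bytes(size)
        for pkt in packets[c::columns]:
            p = xor_bytes(p, pkt)
        parity.append(p)
    return parity


def recover(received, parity, columns):
    """Fill in lost packets (None) in every column with exactly one loss."""
    out = list(received)
    for c in range(columns):
        idx = range(c, len(out), columns)
        lost = [i for i in idx if out[i] is None]
        if len(lost) == 1:
            p = parity[c]
            for i in idx:
                if out[i] is not None:
                    p = xor_bytes(p, out[i])
            out[lost[0]] = p
    return out


# Six toy packets in a 2-row by 3-column matrix.
packets = [bytes([i] * 4) for i in range(6)]
parity = column_parity(packets, columns=3)

# A burst of two consecutive losses falls in different columns: recoverable.
burst = [packets[0], None, None, packets[3], packets[4], packets[5]]
fixed = recover(burst, parity, columns=3)

# Two losses in the same column cannot be repaired by a single parity packet.
same_column = [packets[0], None, packets[2], packets[3], None, packets[5]]
unfixed = recover(same_column, parity, columns=3)
```

Because consecutive packets land in different columns, the number of columns bounds the burst length that can be repaired, which is one reason the matrix size is a natural parameter to tune.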
Abstract:
Transmission errors are the main cause of quality degradation in real broadcast video services. Therefore, knowing their impact on the quality of experience of end users is crucial. For instance, it would help to improve the performance of distribution systems and to develop monitoring tools that automatically estimate the quality perceived by end users. In this paper we validate a subjective evaluation approach specifically designed to obtain meaningful results on the effects of degradations caused by transmission errors. This methodology has already been used in our previous works with monoscopic and stereoscopic videos. The validation is done by comparing the subjective ratings obtained for typical transmission errors with the proposed methodology and with the standard Absolute Category Rating method. The results show that the proposed approach could provide more representative evaluations of the quality of experience perceived by end users of conventional and 3D broadcast video services.
Abstract:
Research in stereoscopic 3D coding, transmission and subjective assessment methodology depends largely on the availability of source content that can be used in cross-lab evaluations. While several studies have already been presented using proprietary content, comparisons between the studies are difficult because different content is used. Therefore, in this paper, a freely available dataset of high-quality Full-HD stereoscopic sequences shot with a semiprofessional 3D camera is introduced in detail. The content was designed to be suitable for a wide variety of applications, including high-quality studies. A set of depth maps was calculated from the stereoscopic pair. As an application example, a subjective assessment has been performed using coding and spatial degradations. The Absolute Category Rating with Hidden Reference method was used. The observers were instructed to vote on video quality only. Results of this experiment are also freely available and will be presented in this paper as a first step towards objective video quality measurement for 3DTV.
Abstract:
In this paper we propose an innovative method for the automatic detection and tracking of road traffic signs using an onboard stereo camera. It combines monocular and stereo analysis strategies to increase the reliability of the detections so that it can boost the performance of any traffic sign recognition scheme. First, an adaptive color- and appearance-based detection is applied at single-camera level to generate a set of traffic sign hypotheses. In turn, stereo information allows for sparse 3D reconstruction of potential traffic signs through a SURF-based matching strategy. Specifically, the plane that best fits the cloud of 3D points traced back from feature matches is estimated using a RANSAC-based approach to improve robustness to outliers. Temporal consistency of the 3D information is ensured through a Kalman-based tracking stage. This also allows for the generation of a predicted 3D traffic sign model, which is in turn used to enhance the previously mentioned color-based detector through a feedback loop, thus improving detection accuracy. The proposed solution has been tested with real sequences under several illumination conditions and in both urban areas and highways, achieving very high detection rates in challenging environments, including rapid motion and significant perspective distortion.
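The plane-fitting step can be illustrated with a generic RANSAC sketch. This is not the paper's implementation: the inlier threshold, iteration count, fixed random seed, and synthetic point cloud are all assumptions chosen for the example, and the SURF matching and Kalman tracking stages are not shown.

```python
import random


def plane_from_points(p, q, r):
    """Plane (unit normal n, offset d) with n.x + d = 0 through three points."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:  # the three points are (nearly) collinear
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Return the plane supported by the most points, plus those inliers."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [pt for pt in points
                   if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic cloud: 25 points on the plane z = 2 plus three gross outliers.
cloud = [(float(x), float(y), 2.0) for x in range(5) for y in range(5)]
cloud += [(0.0, 0.0, 5.0), (1.0, 3.0, -4.0), (2.0, 2.0, 9.0)]
(normal, offset), inliers = ransac_plane(cloud)
```

Because each candidate plane is scored only by how many points lie within the threshold, the three outliers never pull the estimate away from the dominant plane, which is the robustness property the abstract refers to.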
Abstract:
The increasing use of video editing software requires faster and more efficient editing tools. As a first step, these tools perform a temporal segmentation into shots, which later allows indexes describing the video content to be built. Here, we propose a novel real-time, high-quality shot detection strategy, suitable for the latest generation of video editing software, which requires both low computational cost and high-quality results. While abrupt transitions are detected through a very fast pixel-based analysis, gradual transitions are obtained from an efficient edge-based analysis. Both analyses are reinforced with a motion analysis that helps to detect and discard false detections. This motion analysis is carried out exclusively over a reduced set of candidate transitions, thus keeping the computational cost within the limits demanded by new applications.
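A pixel-based test for abrupt transitions can be sketched as follows. The abstract does not specify the actual measure or threshold, so this example assumes a simple mean absolute difference between consecutive grayscale frames and a hypothetical threshold of 30 gray levels; the edge-based and motion-based stages are not shown.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two frames given as flat pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_cuts(frames, threshold=30.0):
    """Return indices i where a cut is declared between frames i-1 and i."""
    return [i for i in range(1, len(frames))
            if mean_abs_diff(frames[i - 1], frames[i]) > threshold]


# Toy sequence of 4x4 grayscale frames: a dark shot followed by a bright shot.
frames = [[10] * 16, [12] * 16, [11] * 16, [200] * 16, [198] * 16]
cuts = detect_cuts(frames)
print(cuts)
```

A test this simple also fires on fast motion or flashes, which is why candidate transitions would then need a verification stage such as the motion analysis the abstract describes.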
Abstract:
Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze those factors which will influence the quality of experience in those 3D video services, focusing on effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.