940 results for Video Analysis
Abstract:
The pedicle screw insertion technique has revolutionized the surgical treatment of spinal fractures and spinal disorders. Although X-ray fluoroscopy-based navigation is popular, it carries a risk of prolonged exposure to X-ray radiation. Systems with a lower radiation risk are generally quite expensive. The position and orientation of the drill are clinically very important in pedicle screw fixation. In this paper, the position and orientation of the marker on the drill are determined using pattern-recognition-based methods that use geometric features obtained from the input video sequence taken by a CCD camera. After preprocessing, a search is performed on the video frames to obtain the exact position and orientation of the drill. An animated graphic showing the instantaneous position and orientation of the drill is then overlaid on the processed video for real-time drill control and navigation.
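One common way to get a position and an orientation from geometric features is to take the centroid and the principal axis of the segmented marker pixels, using their second-order central moments. The abstract does not say which features the authors use, so the sketch below is only an illustration under that assumption; `marker_pose` and its pixel-list input are hypothetical names.

```python
import math


def marker_pose(pixels):
    """Return (cx, cy, angle) for a list of (x, y) marker pixel coordinates.

    Assumption: the marker has already been segmented from the video frame.
    The angle is the principal-axis direction in radians, measured in the
    same coordinate system as the input pixels.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("no marker pixels")

    # Centroid gives the marker position.
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n

    # Second-order central moments describe the spread of the pixel cloud.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)

    # Direction of the principal axis of the pixel cloud.
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, angle
```

For a thin marker lying along the diagonal, `marker_pose([(0, 0), (1, 1), (2, 2)])` places the centroid at (1, 1) with a principal-axis angle of pi/4.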
Abstract:
In this paper we present a novel approach to detecting meetings between people. The proposed approach works by translating people's behaviour from trajectory information into semantic terms. Given a semantic model of the meeting behaviour, event detection is performed in the semantic domain. The model is learnt with a soft-computing clustering algorithm that combines trajectory information and motion semantic terms. A stable representation can be obtained from a series of examples. Results obtained on a series of videos with different types of meeting situations show that the proposed approach can learn a generic model that can be applied effectively to the recognition of meeting behaviour.
Abstract:
Primary voice production occurs in the larynx through vibrational movements carried out by vocal folds. However, many problems can affect this complex system resulting in voice disorders. In this context, time-frequency-shape analysis based on embedding phase space plots and nonlinear dynamics methods have been used to evaluate the vocal fold dynamics during phonation. For this purpose, the present work used high-speed video to record the vocal fold movements of three subjects and extract the glottal area time series using an image segmentation algorithm. This signal is used for an optimization method which combines genetic algorithms and a quasi-Newton method to optimize the parameters of a biomechanical model of vocal folds based on lumped elements (masses, springs and dampers). After optimization, this model is capable of simulating the dynamics of recorded vocal folds and their glottal pulse. Bifurcation diagrams and phase space analysis were used to evaluate the behavior of this deterministic system in different circumstances. The results showed that this methodology can be used to extract some physiological parameters of vocal folds and reproduce some complex behaviors of these structures contributing to the scientific and clinical evaluation of voice production. (C) 2010 Elsevier Inc. All rights reserved.
Abstract:
A main objective of human movement analysis is the quantitative description of joint kinematics and kinetics. This information has great potential to address clinical problems in both orthopaedics and motor rehabilitation. Previous studies have shown that the assessment of kinematics and kinetics from stereophotogrammetric data requires a setup phase, special equipment and expertise to operate. Moreover, this procedure may make subjects feel uneasy and may hinder their walking. The general aim of this thesis is the implementation and evaluation of new 2D markerless techniques, in order to contribute to the development of an alternative to traditional stereophotogrammetric techniques. At first, the study focused on estimating the kinematics of the ankle-foot complex during the stance phase of gait. Two particular cases were considered: barefoot subjects and subjects wearing ankle socks. The use of socks was investigated in view of the development of the hybrid method proposed in this work. Different algorithms were analyzed, evaluated and implemented in order to obtain a 2D markerless solution for estimating the kinematics in both cases. The proposed technique was validated against a traditional stereophotogrammetric system. Its implementation leads towards an alternative to the traditional stereophotogrammetric system that is easy to configure and more comfortable for the subject. Then, the technique was improved so that knee flexion/extension could also be measured with a 2D markerless technique. The main changes to the implementation concerned occlusion handling and background segmentation. With the additional constraints, the proposed technique was applied to the estimation of knee flexion/extension and compared with a traditional stereophotogrammetric system.
Results showed that the knee flexion/extension estimates from the traditional stereophotogrammetric system and the proposed markerless system were highly comparable, making the latter a potential alternative for clinical use. A contribution was also made to the estimation of lower limb kinematics in children with cerebral palsy (CP). For this purpose, a hybrid technique was proposed that uses high-cut underwear and ankle socks as “segmental markers” in combination with a markerless methodology. The proposed hybrid technique differs from the markerless technique described above in the algorithm chosen. Results showed that the proposed hybrid technique can become a simple and low-cost alternative to traditional stereophotogrammetric systems.
Abstract:
A reliable and robust routing service for Flying Ad-Hoc Networks (FANETs) must be able to adapt to topology changes and to recover the quality level of the multiple delivered video flows under dynamic network topologies. The user experience of watching live videos must also be satisfactory even in scenarios with network congestion, buffer overflow, and packet loss, as experienced in many FANET multimedia applications. In this paper, we perform a comparative simulation study to assess the robustness, reliability, and quality level of videos transmitted via well-known beaconless opportunistic routing protocols. Simulation results show that our protocol, XLinGO, achieves multimedia dissemination with Quality of Experience (QoE) support and robustness in multi-hop, multi-flow, mobile networks, as required in many multimedia FANET scenarios.
Abstract:
Growth codes are a subclass of Rateless codes that have found interesting applications in data dissemination problems. Compared to other Rateless and conventional channel codes, Growth codes show improved intermediate performance, which is particularly useful in applications where partial data has some utility. In this paper, we investigate the asymptotic performance of Growth codes using the Wormald method, which was proposed for studying the Peeling Decoder of LDPC and LDGM codes. Unlike in previous works, the Wormald differential equations are set from the nodes' perspective, which enables a numerical solution for the expected asymptotic decoding performance of Growth codes. Our framework is appropriate for any class of Rateless codes that does not include a precoding step. We further study the performance of Growth codes with moderate and large codeblock sizes through simulations, and we use the generalized logistic function to model the decoding probability. We then exploit the decoding probability model in an illustrative application of Growth codes to error-resilient video transmission. The video transmission problem is cast as a joint source and channel rate allocation problem that is shown to be convex with respect to the channel rate. This illustrative application highlights the main advantage of Growth codes, namely improved performance in the intermediate loss region.
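The generalized logistic function mentioned above is usually written in the Richards form, Y(x) = A + (K - A) / (C + Q e^{-B(x - M)})^{1/nu}. The abstract does not state the parametrization or the fitted values, so the sketch below simply assumes that standard form; the parameter names and default values are hypothetical and are not taken from the paper.

```python
import math


def generalized_logistic(x, lower=0.0, upper=1.0, growth=1.0,
                         midpoint=0.0, nu=1.0, q=1.0, c=1.0):
    """Richards (generalized logistic) curve evaluated at x.

    Assumed form: lower + (upper - lower) /
                  (c + q * exp(-growth * (x - midpoint))) ** (1 / nu)
    With lower=0 and upper=1 (and c=1) the value can be read as a
    probability, e.g. a decoding probability as a function of the amount
    of received data x.
    """
    denom = (c + q * math.exp(-growth * (x - midpoint))) ** (1.0 / nu)
    return lower + (upper - lower) / denom
```

With the default parameters the curve reduces to the ordinary logistic function, so it equals 0.5 at the midpoint and approaches 1 as `x` grows; `nu` and `q` skew the curve, which is what makes the form flexible enough for fitting simulated decoding curves.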
Abstract:
Introduction. Food frequency questionnaires (FFQ) are used to study the association between dietary intake and disease. An instructional video may offer a low-cost, practical method of dietary assessment training for participants, thereby reducing recall bias in FFQs. There is little evidence in the literature of the effect of using instructional videos on FFQ-based intake. Objective. This analysis compared the reported energy and macronutrient intake of two groups that were randomized either to watch an instructional video before completing an FFQ or to view the same instructional video after completing the same FFQ. Methods. In the parent study, a diverse group of students, faculty and staff from Houston Community College were randomized to two groups, stratified by ethnicity, and completed an FFQ. The "video before" group watched an instructional video about completing the FFQ prior to answering the FFQ. The "video after" group watched the instructional video after completing the FFQ. The two groups were compared on mean daily energy (kcal/day), fat (g/day), protein (g/day), carbohydrate (g/day) and fiber (g/day) intakes using descriptive statistics and one-way ANOVA. Demographic, height, and weight information was collected. Dietary intakes were adjusted for total energy intake before the comparative analysis. BMI and age were ruled out as potential confounders. Results. There were no significant differences between the two groups in mean daily dietary intakes of energy, total fat, protein, carbohydrates and fiber. However, a pattern of higher energy intake and lower fiber intake was reported in the group that viewed the instructional video before completing the FFQ compared to those who viewed the video after. Discussion. Analysis of the difference between reported intake of energy and macronutrients showed an overall pattern, albeit not statistically significant, of higher intake in the video before versus the video after group.
Application of instructional videos for dietary assessment may require further research to address the validity of reported dietary intakes in those who are randomized to watch an instructional video before reporting diet compared to a control group that does not view a video.
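The one-way ANOVA used to compare the two groups boils down to an F statistic: the between-group mean square divided by the within-group mean square. The following is a minimal sketch of that computation using only the standard library; the function name is hypothetical and the numbers in the usage note are made up for illustration, not data from the study.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within
```

For the toy groups `[1, 2, 3]` and `[2, 3, 4]`, `one_way_anova_f` returns 1.5. Turning F into a p-value needs the F distribution with (k - 1, n - k) degrees of freedom, which is outside this sketch.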
Abstract:
This thesis proposes a comprehensive approach to the monitoring and management of Quality of Experience (QoE) in multimedia delivery services over IP. It addresses the problem of preventing, detecting, measuring, and reacting to QoE degradations under the constraints of a service provider: the solution must scale to a wide IP network delivering individual media streams to thousands of users simultaneously. The proposed monitoring solution is called QuEM (Qualitative Experience Monitoring). It is based on detecting degradations in the network Quality of Service (packet losses, abrupt bandwidth drops...) and mapping each degradation event to a qualitative description of its effect on the perceived Quality of Experience (audio mutes, video artifacts...). This mapping is based on the analysis of the transport and Network Abstraction Layer information of the coded stream, and it allows a good characterization of the most relevant defects observed in this kind of service: screen freezing, macroblocking, audio mutes, video quality drops, delay issues, and service outages. The results have been validated by subjective quality assessment tests. The methodology used for those tests has also been designed to mimic as closely as possible the viewing conditions of a real user of those services: the impairments to be evaluated are introduced randomly in the middle of a continuous video stream. Based on the monitoring solution, several applications have been proposed as well: an unequal error protection system that provides higher protection to the parts of the stream that are more critical for the QoE, a solution that applies the same principles to minimize the impact of incomplete segment downloads in HTTP Adaptive Streaming, and a selective scrambling algorithm that ciphers only the most sensitive parts of the media stream. A fast channel change application is also presented, as well as a discussion of how to apply the previous results and concepts in a 3D video scenario.
Abstract:
This thesis presents a novel framework for the analysis and optimization of the encoding and decoding delay for multiview video. The objective of this framework is to provide a systematic methodology for the analysis of the delay in multiview encoders and decoders, together with useful tools for the design of multiview encoders/decoders for applications with low-delay requirements. The proposed framework first characterizes the elements that influence the delay performance: i) the multiview prediction structure, ii) the hardware model of the encoder/decoder, and iii) the frame processing times. Second, it provides algorithms for computing the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework is a methodology for the analysis of the multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, complemented by a set of models that particularize this delay analysis to the characteristics of the hardware architecture of the encoder/decoder. Among these models, the ones based on graph theory are especially relevant because they decouple the influence of the different elements on the delay performance of the encoder/decoder by means of an abstraction of its processing capacity. To show possible applications of this framework, this thesis presents some examples of its use in design problems that affect multiview encoders and decoders. This application scenario covers the following cases: strategies for the design of prediction structures that take delay requirements into consideration in addition to the rate-distortion performance; design of the number of processors and analysis of the processor speed requirements in multiview encoders/decoders given a target delay; and comparative analysis of the encoding delay performance of multiview encoders with different processing capabilities and hardware implementations.
Abstract:
We present a novel framework for the analysis and optimization of encoding latency for multiview video. First, we characterize the elements that influence the encoding latency performance: (i) the multiview prediction structure and (ii) the hardware encoder model. Then, we provide algorithms to find the encoding latency of any arbitrary multiview prediction structure. The proposed framework relies on the directed acyclic graph encoder latency (DAGEL) model, which provides an abstraction of the processing capacity of the encoder by considering an unbounded number of processors. Using graph-theoretic algorithms, the DAGEL model allows us to compute the encoding latency of a given prediction structure and to determine the contribution of the prediction dependencies to it. As an example of a DAGEL application, we propose an algorithm to reduce the encoding latency of a given multiview prediction structure down to a target value. In our approach, a minimum number of frame dependencies are pruned until the latency target value is achieved, thus minimizing the degradation of the rate-distortion performance caused by the removal of prediction dependencies. Finally, we analyze the latency performance of the DAGEL-derived prediction structures in multiview encoders with limited processing capacity.
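The graph idea can be illustrated with a small sketch. This is not the DAGEL algorithm itself, only a simplified model under stated assumptions: each frame has a capture time and a fixed processing time, processors are unbounded, and a frame starts encoding as soon as it is captured and all of its reference frames are encoded. The function name, frame labels and timing values are hypothetical.

```python
from graphlib import TopologicalSorter


def encoding_latency(deps, capture, proc):
    """Return (max latency, completion times) for a prediction structure.

    deps:    frame -> iterable of reference frames it depends on
    capture: frame -> capture time (must list every frame)
    proc:    frame -> processing time
    Assumption: unbounded processors, so only dependencies cause waiting.
    """
    sorter = TopologicalSorter()
    for frame in capture:
        sorter.add(frame, *deps.get(frame, ()))

    done = {}
    # Visit frames so that every reference is finished before its dependants.
    for frame in sorter.static_order():
        ready = max([capture[frame]] + [done[r] for r in deps.get(frame, ())])
        done[frame] = ready + proc[frame]

    # Latency of a frame: time from capture until its encoding completes.
    latency = max(done[f] - capture[f] for f in done)
    return latency, done


capture = {"I0": 0, "B1": 1, "P2": 2}
proc = {"I0": 1, "B1": 1, "P2": 1}
full = {"P2": ["I0"], "B1": ["I0", "P2"]}
pruned = {"P2": ["I0"], "B1": ["I0"]}  # backward dependency B1 -> P2 removed
```

In this toy structure, `encoding_latency(full, capture, proc)[0]` is 3 time units, because B1 must wait for the later frame P2. Pruning that one dependency lowers it to 1, which is the kind of trade-off against rate-distortion performance that the abstract describes.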
Abstract:
We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimate of the decoding delay is essential to achieve minimum communication latency. Unlike in single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views require a systematic analysis of this decoding delay, which we carry out using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation on general-purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder, and we provide an iterative algorithm to jointly compute frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
Abstract:
This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by the users of 3D video systems, analyzing the impact of the effects introduced by all the elements of the 3D video processing chain. To that end, various subjective assessment tests are presented, specifically designed to evaluate the systems under consideration and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is presented that evaluates typical degradations that may appear during content creation, for instance due to incorrect camera calibration or video processing algorithms (e.g., 2D to 3D conversion). Moreover, the generation of a high-quality dataset of 3D stereoscopic videos is described; the dataset is freely available to the research community and has already been widely used in different works related to 3D video. In addition, an inter-laboratory subjective study is presented that analyzes the impact of coding impairments and representation formats of stereoscopic video. Three subjective tests are also presented that study the effects of transmission events in Internet Protocol Television (IPTV) networks and adaptive streaming scenarios for 3D video. For these cases, a novel subjective evaluation methodology, called Content-Immersive Evaluation of Transmission Impairments (CIETI), was proposed. It was specifically designed to evaluate transmission events while simulating realistic home-viewing conditions, in order to obtain more representative conclusions about the visual experience of end users. Finally, two subjective experiments are presented that compare various current 3D displays available in the consumer market and evaluate perceptual factors of Super Multiview Video (SMV) systems, which are expected to be the future technology for consumer 3D displays thanks to a promising visualization of 3D content without specific glasses. The work presented in this thesis has made it possible to understand perceptual and technical factors related to the processing and visualization of 3D video content, which may be useful in the development of new technologies and approaches for QoE evaluation, both subjective methodologies and objective metrics.
Abstract:
"Report no. FHWA-IL-UI-278"--Technical report documentation page.