67 resultados para Video semantics

em Universidad Politécnica de Madrid


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article presents a probabilistic method for vehicle detection and tracking through the analysis of monocular images obtained from a vehicle-mounted camera. The method is designed to address the main shortcomings of traditional particle filtering approaches, namely Bayesian methods based on importance sampling, for use in traffic environments. These methods do not scale well when the dimensionality of the feature space grows, which creates significant limitations when tracking multiple objects. Alternatively, the proposed method is based on a Markov chain Monte Carlo (MCMC) approach, which allows efficient sampling of the feature space. The method involves important contributions in both the motion and the observation models of the tracker. Indeed, as opposed to particle filter-based tracking methods in the literature, which typically resort to observation models based on appearance or template matching, in this study a likelihood model that combines appearance analysis with information from motion parallax is introduced. Regarding the motion model, a new interaction treatment is defined based on Markov random fields (MRF) that allows for the handling of possible inter-dependencies in vehicle trajectories. As for vehicle detection, the method relies on a supervised classification stage using support vector machines (SVM). The contribution in this field is twofold. First, a new descriptor based on the analysis of gradient orientations in concentric rectangles is dened. This descriptor involves a much smaller feature space compared to traditional descriptors, which are too costly for real-time applications. Second, a new vehicle image database is generated to train the SVM and made public. The proposed vehicle detection and tracking method is proven to outperform existing methods and to successfully handle challenging situations in the test sequences.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Las tecnologías de vídeo en 3D han estado al alza en los últimos años, con abundantes avances en investigación unidos a una adopción generalizada por parte de la industria del cine, y una importancia creciente en la electrónica de consumo. Relacionado con esto, está el concepto de vídeo multivista, que abarca el vídeo 3D, y puede definirse como un flujo de vídeo compuesto de dos o más vistas. El vídeo multivista permite prestaciones avanzadas de vídeo, como el vídeo estereoscópico, el “free viewpoint video”, contacto visual mejorado mediante vistas virtuales, o entornos virtuales compartidos. El propósito de esta tesis es salvar un obstáculo considerable de cara al uso de vídeo multivista en sistemas de comunicación: la falta de soporte para esta tecnología por parte de los protocolos de señalización existentes, que hace imposible configurar una sesión con vídeo multivista mediante mecanismos estándar. Así pues, nuestro principal objetivo es la extensión del Protocolo de Inicio de Sesión (SIP) para soportar la negociación de sesiones multimedia con flujos de vídeo multivista. Nuestro trabajo se puede resumir en tres contribuciones principales. En primer lugar, hemos definido una extensión de señalización para configurar sesiones SIP con vídeo 3D. Esta extensión modifica el Protocolo de Descripción de Sesión (SDP) para introducir un nuevo atributo de nivel de medios, y un nuevo tipo de dependencia de descodificación, que contribuyen a describir los formatos de vídeo 3D que pueden emplearse en una sesión, así como la relación entre los flujos de vídeo que componen un flujo de vídeo 3D. La segunda contribución consiste en una extensión a SIP para manejar la señalización de videoconferencias con flujos de vídeo multivista. Se definen dos nuevos paquetes de eventos SIP para describir las capacidades y topología de los terminales de conferencia, por un lado, y la configuración espacial y mapeo de flujos de una conferencia, por el otro. También se describe un mecanismo para integrar el intercambio de esta información en el proceso de inicio de una conferencia SIP. Como tercera y última contribución, introducimos el concepto de espacio virtual de una conferencia, o un sistema de coordenadas que incluye todos los objetos relevantes de la conferencia (como dispositivos de captura, pantallas, y usuarios). Explicamos cómo el espacio virtual se relaciona con prestaciones de conferencia como el contacto visual, la escala de vídeo y la fidelidad espacial, y proporcionamos reglas para determinar las prestaciones de una conferencia a partir del análisis de su espacio virtual, y para generar espacios virtuales durante la configuración de conferencias.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In Video over IP services, perceived video quality heavily depends on parameters such as video coding and network Quality of Service. This paper proposes a model for the estimation of perceived video quality in video streaming and broadcasting services that combines the aforementioned parameters with other that depend mainly on the information contents of the video sequences. These fitting parameters are derived from the Spatial and Temporal Information contents of the sequences. This model does not require reference to the original video sequence so it can be used for online, real-time monitoring of perceived video quality in Video over IP services. Furthermore, this paper proposes a measurement workbench designed to acquire both training data for model fitting and test data for model validation. Preliminary results show good correlation between measured and predicted values.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper proposes a model for estimation of perceived video quality in IPTV, taking as input both video coding and network Quality of Service parameters. It includes some fitting parameters that depend mainly on the information contents of the video sequences. A method to derive them from the Spatial and Temporal Information contents of the sequences is proposed. The model may be used for near real-time monitoring of IPTV video quality.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A video-aware unequal loss protection (ULP) system for protecting RTP video streaming in bursty packet loss networks is proposed. Just considering the relevance of the frame, the state of the channel and the bitrate constraints of the protection bitstream, our algorithm selects in real time the most suitable frames to be protected through forward error correction (FEC) techniques. It benefits from a wise RTP encapsulation that allows working at a frame level without requiring any further process than that of parsing RTP headers, so it is perfectly suitable to be included in commercial transmitters. The simulation results show how our proposed ULP technique outperforms non-smart schemes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Real-time monitoring of multimedia Quality of Experience is a critical task for the providers of multimedia delivery services: from television broadcasters to IP content delivery networks or IPTV. For such scenarios, meaningful metrics are required which can generate useful information to the service providers that overcome the limitations of pure Quality of Service monitoring probes. However, most of objective multimedia quality estimators, aimed at modeling the Mean Opinion Score, are difficult to apply to massive quality monitoring. Thus we propose a lightweight and scalable monitoring architecture called Qualitative Experience Monitoring (QuEM), based on detecting identifiable impairment events such as the ones reported by the customers of those services. We also carried out a subjective assessment test to validate the approach and calibrate the metrics. Preliminary results of this test set support our approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a novel framework for encoding latency analysis of arbitrary multiview video coding prediction structures. This framework avoids the need to consider an specific encoder architecture for encoding latency analysis by assuming an unlimited processing capacity on the multiview encoder. Under this assumption, only the influence of the prediction structure and the processing times have to be considered, and the encoding latency is solved systematically by means of a graph model. The results obtained with this model are valid for a multiview encoder with sufficient processing capacity and serve as a lower bound otherwise. Furthermore, with the objective of low latency encoder design with low penalty on rate-distortion performance, the graph model allows us to identify the prediction relationships that add higher encoding latency to the encoder. Experimental results for JMVM prediction structures illustrate how low latency prediction structures with a low rate-distortion penalty can be derived in a systematic manner using the new model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Se muestra la necesidad de la creación de un estándar que facilite el intercambio de datos entre empresas productoras de vídeo y cadenas de distribución. Se muestra un posible modelo en la forma de transmisión, modelo de datos y procesado de datos.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A highly parallel and scalable Deblocking Filter (DF) hardware architecture for H.264/AVC and SVC video codecs is presented in this paper. The proposed architecture mainly consists on a coarse grain systolic array obtained by replicating a unique and homogeneous Functional Unit (FU), in which a whole Deblocking-Filter unit is implemented. The proposal is also based on a novel macroblock-level parallelization strategy of the filtering algorithm which improves the final performance by exploiting specific data dependences. This way communication overhead is reduced and a more intensive parallelism in comparison with the existing state-of-the-art solutions is obtained. Furthermore, the architecture is completely flexible, since the level of parallelism can be changed, according to the application requirements. The design has been implemented in a Virtex-5 FPGA, and it allows filtering 4CIF (704 × 576 pixels @30 fps) video sequences in real-time at frequencies lower than 10.16 Mhz.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Automatic analysis of minimally invasive surgical (MIS) video has the potential to drive new solutions that alleviate existing needs for safer surgeries: reproducible training programs, objective and transparent assessment systems and navigation tools to assist surgeons and improve patient safety. As an unobtrusive, always available source of information in the operating room (OR), this research proposes the use of surgical video for extracting useful information during surgical operations. Methodology proposed includes tools' tracking algorithm and 3D reconstruction of the surgical field. The motivation for these solutions is the augmentation of the laparoscopic view in order to provide orientation aids, optimal surgical path visualization, or preoperative virtual models overlay

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Systems relying on fixed hardware components with a static level of parallelism can suffer from an underuse of logical resources, since they have to be designed for the worst-case scenario. This problem is especially important in video applications due to the emergence of new flexible standards, like Scalable Video Coding (SVC), which offer several levels of scalability. In this paper, Dynamic and Partial Reconfiguration (DPR) of modern FPGAs is used to achieve run-time variable parallelism, by using scalable architectures where the size can be adapted at run-time. Based on this proposal, a scalable Deblocking Filter core (DF), compliant with the H.264/AVC and SVC standards has been designed. This scalable DF allows run-time addition or removal of computational units working in parallel. Scalability is offered together with a scalable parallelization strategy at the macroblock (MB) level, such that when the size of the architecture changes, MB filtering order is modified accordingly

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Automatic visual object counting and video surveillance have important applications for home and business environments, such as security and management of access points. However, in order to obtain a satisfactory performance these technologies need professional and expensive hardware, complex installations and setups, and the supervision of qualified workers. In this paper, an efficient visual detection and tracking framework is proposed for the tasks of object counting and surveillance, which meets the requirements of the consumer electronics: off-the-shelf equipment, easy installation and configuration, and unsupervised working conditions. This is accomplished by a novel Bayesian tracking model that can manage multimodal distributions without explicitly computing the association between tracked objects and detections. In addition, it is robust to erroneous, distorted and missing detections. The proposed algorithm is compared with a recent work, also focused on consumer electronics, proving its superior performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a scalable software architecture for on-line multi-camera video processing, that guarantees a good trade off between computational power, scalability and flexibility. The software system is modular and its main blocks are the Processing Units (PUs), and the Central Unit. The Central Unit works as a supervisor of the running PUs and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application has been presented. In this paper, as case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions such as number of cameras and image sizes. The results show that the software architecture scales well with the number of camera and can easily works with different image formats respecting the real time constraints. Moreover, the parallelization approach can be used in order to speed up the processing tasks with a low level of overhead

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We analyze the effect of packet losses in video sequences and propose a lightweight Unequal Error Protection strategy which, by choosing which packet is discarded, reduces strongly the Mean Square Error of the received sequence

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Multimedia distribution through wireless networks in the home environment presents a number of advantages which have fueled the interest of industry in recent years, such as simple connectivity and data delivery to a variety of devices. Together with High-Definition (HD) contents, multimedia wireless networks have been proposed for several applications, such as IPTV and Digital TV distribution for multiple devices in the home environment. For these scenarios, we propose a multicast distribution system for High-Definition video over 802.11 wireless networks based on rate-limited packet retransmission. We develop a limited rate ARQ system that retransmits packets according to the importance of their content (prioritization scheme) and according to their delay limitations (delay control). The performance of our proposed ARQ system is evaluated and compared with a similarly rate-limited ARQ algorithm. The results show a higher packet recovery rate and improvements in video quality for our proposed system.