940 results for video-technology
Abstract:
Background subtraction is a fundamental low-level processing task in numerous computer vision applications. The vast majority of algorithms process images on a pixel-by-pixel basis, where an independent decision is made for each pixel. A general limitation of such processing is that rich contextual information is not taken into account. We propose a block-based method capable of dealing with noise, illumination variations, and dynamic backgrounds, while still obtaining smooth contours of foreground objects. Specifically, image sequences are analyzed on an overlapping block-by-block basis. A low-dimensional texture descriptor obtained from each block is passed through an adaptive classifier cascade, where each stage handles a distinct problem. A probabilistic foreground mask generation approach then exploits block overlaps to integrate interim block-level decisions into final pixel-level foreground segmentation. Unlike many pixel-based methods, ad-hoc postprocessing of foreground masks is not required. Experiments on the difficult Wallflower and I2R datasets show that the proposed approach obtains on average better results (both qualitatively and quantitatively) than several prominent methods. We furthermore propose the use of tracking performance as an unbiased approach for assessing the practical usefulness of foreground segmentation methods, and show that the proposed approach leads to considerable improvements in tracking accuracy on the CAVIAR dataset.
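The step that turns overlapping block-level decisions into a pixel-level mask can be illustrated with a small sketch. The abstract does not give the actual probabilistic formulation, so the code below assumes a simple stand-in: each pixel's foreground probability is the fraction of the blocks covering it that were classified as foreground. The function names, the block layout, and the 0.5 threshold are all hypothetical.

```python
def block_origins(height, width, block_size, step):
    """Top-left corners of overlapping blocks (overlap occurs when step < block_size)."""
    return [(top, left)
            for top in range(0, height - block_size + 1, step)
            for left in range(0, width - block_size + 1, step)]


def pixel_foreground_probability(height, width, block_size, block_is_foreground):
    """Average the block-level decisions of every block that covers a pixel.

    block_is_foreground maps each block's (top, left) corner to a bool.
    This vote averaging is an assumed stand-in, not the paper's method.
    """
    votes = [[0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for (top, left), is_fg in block_is_foreground.items():
        for y in range(top, min(top + block_size, height)):
            for x in range(left, min(left + block_size, width)):
                counts[y][x] += 1
                if is_fg:
                    votes[y][x] += 1
    return [[votes[y][x] / counts[y][x] if counts[y][x] else 0.0
             for x in range(width)]
            for y in range(height)]


def foreground_mask(probabilities, threshold=0.5):
    """Threshold per-pixel probabilities into a binary foreground mask."""
    return [[p >= threshold for p in row] for row in probabilities]
```

Because neighbouring blocks overlap, pixels near an object boundary receive mixed votes, which is one plausible reason such schemes give smoother contours than a hard per-block decision.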
Abstract:
Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can efficiently detect events automatically. This paper proposes a novel algorithm for multi-person event detection. To ensure faster than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a “weakly supervised” method that allows the use of coarse temporal annotations, which are much simpler to obtain. This novel system is able to run at approximately ten times real-time, while preserving state-of-the-art detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
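A histogram of angles between trajectories can be sketched as follows. The abstract does not specify how the descriptor is built, so this is only an illustration under stated assumptions: each trajectory is summarised by its start-to-end displacement, angles are taken between every pair of displacements, and the bin count is arbitrary. All function names are hypothetical.

```python
import math
from itertools import combinations


def displacement(trajectory):
    """Start-to-end displacement of a trajectory given as a list of (x, y) points."""
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    return (x1 - x0, y1 - y0)


def angle_between(u, v):
    """Unsigned angle between two 2-D vectors, in the range [0, pi]."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(math.atan2(cross, dot))


def angle_histogram(trajectories, num_bins=8):
    """Normalised histogram of pairwise angles between trajectory displacements."""
    vectors = [displacement(t) for t in trajectories]
    vectors = [v for v in vectors if v != (0, 0)]  # stationary tracks have no direction
    hist = [0] * num_bins
    for u, v in combinations(vectors, 2):
        theta = angle_between(u, v)
        index = min(int(theta / math.pi * num_bins), num_bins - 1)
        hist[index] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * num_bins
```

Under these assumptions, people walking together would concentrate mass in the low-angle bins, while people approaching each other would populate the bins near pi.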
Abstract:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from an efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups but the latter has far less impact on recognition accuracy. Experiments on the 1,138-word vocabulary RM1 task and the 6,224-word vocabulary TIMIT task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication based approach leads to an overall speedup of 46% on the RM1 task and 115% on the TIMIT task. Our low-rank approximation methods provide a way of trading off recognition accuracy for a further increase in computational performance, extending overall speedups up to 61% for RM1 and 119% for TIMIT, for an increase in word error rate (WER) from 3.2% to 3.5% for RM1 and no increase in WER for TIMIT. We also express the pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication, leading to a saving in computational operations. In our experiments using an efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and an overall speedup of up to 3.25 for DTW.
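The idea of writing likelihood computation as a matrix product can be sketched for the diagonal-covariance case, which is an assumption here since the abstract does not state the covariance structure. Each feature vector x is augmented to [x², x, 1] and each Gaussian becomes a row [-1/(2σ²), μ/σ², c], so that their inner product equals the log-likelihood. The helper names are hypothetical, and the plain-Python product stands in for the optimised matrix routine a real system would use.

```python
import math


def augment(x):
    """Map a feature vector to [x_1^2..x_D^2, x_1..x_D, 1]."""
    return [v * v for v in x] + list(x) + [1.0]


def gaussian_row(mean, var):
    """Parameter row for one diagonal Gaussian, matching the layout of augment()."""
    const = -0.5 * sum(math.log(2 * math.pi * s) + m * m / s
                       for m, s in zip(mean, var))
    return [-0.5 / s for s in var] + [m / s for m, s in zip(mean, var)] + [const]


def matmul_transposed(a, b):
    """Return A @ B^T for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


def log_likelihoods(features, means, variances):
    """Log-likelihood of every frame under every Gaussian, as one matrix product."""
    feature_matrix = [augment(x) for x in features]
    parameter_matrix = [gaussian_row(m, s) for m, s in zip(means, variances)]
    return matmul_transposed(feature_matrix, parameter_matrix)


def log_likelihood_direct(x, mean, var):
    """Reference per-Gaussian evaluation, used only to check the matrix form."""
    return -0.5 * sum(math.log(2 * math.pi * s) + (v - m) ** 2 / s
                      for v, m, s in zip(x, mean, var))
```

Once the computation has this form, the parameter matrix is a natural target for the low-rank approximations the abstract describes, although those are not shown in this sketch.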
Abstract:
TV Maxambomba: Processes of Singularization is the result of an investigation into the potential of audiovisual language, especially in video production and popular communication, as appropriated by people who, in their differences, use the language and technology of video as a tool for expressing their culture, their reality, their creativity, and their inventiveness. Tracing the trajectory of TV Maxambomba, this research brought out the power involved in bringing together people and groups who use audiovisual technology and the language of video in their creative process as a mechanism for producing knowledge and subjectivation. Over its 15 years, TV Maxambomba has proved to be a potential laboratory of media invention, democratizing audiovisual language and making it possible for the post-media era to begin within a media era. Transgressing television norms and formats, tracing its lines of flight, bringing out the peculiarities of the communities and territories occupied by TV Maxambomba, and territorializing and deterritorializing its own language, it reveals itself as a space for the production of processes of singularization.
Abstract:
In this paper, we derive an EM algorithm for nonlinear state space models. We use it to estimate jointly the neural network weights, the model uncertainty and the noise in the data. In the E-step we apply a forward-backward Rauch-Tung-Striebel smoother to compute the network weights. For the M-step, we derive expressions to compute the model uncertainty and the measurement noise. We find that the method is intrinsically very powerful, simple and stable.
Abstract:
This article attempts to describe the theoretical bases and application of self-observation using video technology in teacher education. It begins with the earliest uses of visual self-observation in teacher education. Video self-observation is approached primarily as an important tool for changing self-esteem and teachers' professional performance. However, self-observation is only a first aid. The critical factor in bringing about changes in behaviour is the role of the tutor. In general, self-observation is a medium for changing the self-esteem of the observed person rather than a direct cause of behavioural change.
Abstract:
The main purpose of this work is to take a first step toward detecting and identifying the information needs and information behaviour of coaches in combat sports. To this end, a questionnaire was administered to instructors of aikido, boxing, fencing, judo, karate, kendo, lima lama, wrestling, and taekwondo, selected through non-probabilistic accidental sampling. In general, we found that the main topics of interest among instructors are training programmes, nutrition, and training diets. Coaches are most likely to rely on their own experience, the internet, and courses to obtain information. In contrast, libraries and books are little used.
Abstract:
This paper, chosen as a best paper from the 2004 SAMOS Workshop on Computer Systems, describes a novel, efficient methodology for automatically creating embedded DSP computer systems. The novelty is that embedded electronic signal processing systems, such as radar or sonar, can now be designed by anyone from the algorithm level, i.e. no low-level system design experience is required, whilst still achieving low, controllable implementation overheads and high real-time performance. In the chosen design example, a bank of Normalised Lattice Filter (NLF) components is created which achieves a four-fold reduction in the required processing resource with no performance decrease.
Abstract:
The technical challenges in the design and programming of signal processors for multimedia communication are discussed. The development of terminal equipment to meet such demand presents a significant technical challenge, considering that it is highly desirable that the equipment be cost effective, power efficient, versatile, and extensible for future upgrades. The main challenges in the design and programming of signal processors for multimedia communication are general-purpose signal processor design, application-specific signal processor design, operating systems and programming support, and application programming. The size of the FFT is programmable so that it can be used for various OFDM-based communication systems, such as digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T) and digital video broadcasting-handheld (DVB-H). The clustered architecture design and distributed ping-pong register files in the PAC DSP raise new challenges for code generation.
Abstract:
A methodology for rapid silicon design of biorthogonal wavelet transform systems has been developed. This is based on generic, scalable architectures for the forward and inverse wavelet filters. These architectures offer efficient hardware utilisation by combining the linear phase property of biorthogonal filters with decimation and interpolation. The resulting designs have been parameterised in terms of types of wavelet and wordlengths for data and coefficients. Control circuitry is embedded within these cores that allows them to be cascaded for any desired level of decomposition without any interface logic. The time to produce silicon designs for a biorthogonal wavelet system is only the time required to run synthesis and layout tools with no further design effort required. The resulting silicon cores produced are comparable in area and performance to hand-crafted designs. These designs are also portable across a range of foundries and are suitable for FPGA and PLD implementations.
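The saving that comes from combining linear phase with decimation can be illustrated in software. This is only a sketch of the general idea, not the architecture from the paper: a linear-phase filter has symmetric taps, so mirrored input samples can be added before multiplying, and with decimation by two only every second output needs to be computed. The example uses the LeGall 5/3 analysis low-pass taps as a familiar biorthogonal filter; the function names are hypothetical and boundary handling is omitted.

```python
# LeGall 5/3 analysis low-pass taps (symmetric, hence linear phase).
LEGALL_53_LOWPASS = [-0.125, 0.25, 0.75, 0.25, -0.125]


def filter_then_decimate(x, h):
    """Reference: compute every 'valid' output, then keep every second one."""
    n = len(h)
    full = [sum(h[j] * x[i + j] for j in range(n))
            for i in range(len(x) - n + 1)]
    return full[::2]


def folded_decimating_filter(x, h):
    """Compute only the kept outputs, pre-adding samples that share a tap value."""
    n = len(h)
    if n % 2 != 1 or any(h[j] != h[n - 1 - j] for j in range(n)):
        raise ValueError("expected an odd-length symmetric filter")
    half = n // 2
    out = []
    for i in range(0, len(x) - n + 1, 2):      # step of 2 performs the decimation
        acc = h[half] * x[i + half]            # centre tap
        for j in range(half):                  # one multiply per mirrored pair
            acc += h[j] * (x[i + j] + x[i + n - 1 - j])
        out.append(acc)
    return out
```

For a five-tap filter this uses three multiplications per kept output instead of five per computed output, which is the kind of saving that translates into reduced multiplier count in hardware.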