994 results for video sequence matching


Relevance:

40.00% 40.00%

Publisher:

Abstract:

In this paper, we describe a video tracking application using the dual-tree polar matching algorithm. The models are specified in a probabilistic setting, and a particle filter is used to perform the sequential inference. Computer simulations demonstrate the ability of the algorithm to track a simulated moving target in video of an urban environment with complete and partial occlusions. © The Institution of Engineering and Technology.
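The sequential inference step mentioned in the abstract can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions: the state is a 1D position, the motion model is a random walk, and a Gaussian likelihood stands in for the dual-tree polar matching score, which the abstract does not specify. The names `particle_filter_step` and `track` and all parameter values are hypothetical.

```python
import math
import random


def particle_filter_step(particles, observation, rng, motion_std=0.2, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: move every particle with an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Occlusion (assumption): no observation, so keep the prediction only.
    if observation is None:
        return predicted
    # Weight: assumed Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # fall back to uniform weights
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(observations, n_particles=500, seed=0):
    """Return the posterior-mean position estimate after each observation."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = particle_filter_step(particles, z, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Passing `None` for a frame models a complete occlusion: the particles keep diffusing under the motion model until the next observation arrives.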

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Hand signals are commonly used in applications such as giving instructions to a pilot for airplane takeoff or directing a crane operator from the ground. A new algorithm for recognizing hand signals from a single camera is proposed. Typically, tracked 2D feature positions of hand signals are matched to 2D training images. In contrast, our approach matches the 2D feature positions to an archive of 3D motion capture sequences. The method avoids explicit reconstruction of the 3D articulated motion from 2D image features. Instead, the matching between the 2D and 3D sequences is done by backprojecting the 3D motion capture data onto 2D. Experiments demonstrate the effectiveness of the approach in an example application: recognizing six classes of basketball referee hand signals in video.
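The idea of comparing a 2D track against 3D motion capture data by projecting the 3D data onto the image plane can be sketched as follows. The sketch assumes a pinhole camera with a known focal length, 3D points already expressed in the camera frame, and sequences that are aligned frame by frame; none of these details come from the abstract. The function names and the nearest-sequence classifier are hypothetical.

```python
import math


def project(points_3d, focal_length=1.0):
    """Pinhole projection of (x, y, z) points onto the image plane."""
    return [(focal_length * x / z, focal_length * y / z) for x, y, z in points_3d]


def sequence_distance(track_2d, mocap_3d, focal_length=1.0):
    """Mean Euclidean distance between a 2D track and a projected 3D sequence."""
    total, count = 0.0, 0
    for frame_2d, frame_3d in zip(track_2d, mocap_3d):
        for (u, v), (pu, pv) in zip(frame_2d, project(frame_3d, focal_length)):
            total += math.hypot(u - pu, v - pv)
            count += 1
    return total / count


def classify(track_2d, archive):
    """Return the label of the archived 3D sequence closest to the 2D track."""
    return min(archive, key=lambda label: sequence_distance(track_2d, archive[label]))
```

Each frame is a list of feature points, and `archive` maps a signal label to one 3D sequence; a practical system would also need to search over viewpoint and temporal alignment.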

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the LDA bag-of-words model for appearance. The encoded appearance is then used to establish probable matches across cameras. Preliminary experiments are conducted on a dataset of 20 individuals, and a comparison against Madden’s I-MCHR is reported.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Fractal video compression is a relatively new video compression method. Its attraction lies in its high compression ratio and simple decompression algorithm. However, its computational complexity is high, so parallel algorithms on high-performance machines offer one way out. In this study we partition the matching search, which accounts for the majority of the work in fractal video compression, into small tasks and implement them in two distributed computing environments, one using DCOM and the other using .NET Remoting, on a local area network consisting of loosely coupled PCs. Experimental results show that the parallel algorithm achieves a high speedup in these distributed environments.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

In this paper we report the degree of reliability of image sequences taken by off-the-shelf TV cameras for modeling camera rotation and reconstructing 3D structure using computer vision techniques. This is done in spite of the fact that computer vision systems usually use imaging devices that are specifically designed for human vision. Our scenario consists of a static scene and a mobile camera moving through the scene. The scene is any long axial building dominated by features along the three principal orientations and with at least one wall containing prominent repetitive planar features such as doors, windows, bricks, etc. The camera is an ordinary commercial camcorder moving along the axis of the scene and is allowed to rotate freely within the range +/- 10 degrees in all directions. This allows the camera to be held by a walking non-professional cameraman with a normal gait, or to be mounted on a mobile robot. The system has been tested successfully on image sequences of a variety of structured, but fairly cluttered, scenes taken by different walking cameramen. The potential application areas of the system include medicine, robotics and photogrammetry.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Most current work on video indexing concentrates on queries that operate over high-level semantic information, which must be entirely composed and entered manually. We propose an indexing system based on spatial information about key objects in a scene. These key objects may be detected automatically, with manual supervision, and tracked through a sequence using one of a number of recently developed techniques. This representation is highly compact and allows rapid resolution of queries specified by iconic example. A number of systems have been produced which use 2D string notations to index digital image libraries. Just as 2D strings provide a compact and tractable indexing notation for digital pictures, a sequence of 2D strings might provide an index for a video or image sequence. To improve further upon this, we reduce the representation to the 2D string pair representing the initial frame and a sequence of edits to these strings. This takes advantage of the continuity between frames to further reduce the size of the notation. By representing video sequences using string edits, a notation has been developed which is compact and allows querying on the spatial relationships of objects to be performed without rebuilding the majority of the scene. Calculating ranks of objects directly from the edit sequence allows matching with minimal calculation, thus greatly reducing search time. This paper presents the edit sequence notation and algorithms for evaluating queries over image sequences. A number of optimizations that represent a considerable saving in search time are demonstrated in the paper.
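To make the 2D string notation concrete, the sketch below builds a simplified 2D string pair for one frame from object centroids. It assumes only two relation operators ("<" for strictly before, "=" for the same coordinate) and a non-empty set of named objects; the edit-sequence encoding and query algorithms of the paper are not reproduced. The function name `two_d_string` is hypothetical.

```python
def two_d_string(objects):
    """Build a simplified 2D string pair (x-order, y-order) for one frame.

    `objects` maps an object name to its (x, y) centroid and must not be empty.
    """
    def axis_string(axis):
        # Order objects along the chosen axis; break ties by name for stability.
        ordered = sorted(objects.items(), key=lambda item: (item[1][axis], item[0]))
        parts = [ordered[0][0]]
        for (_, prev_pos), (name, pos) in zip(ordered, ordered[1:]):
            # "=" when two objects share the coordinate, "<" otherwise.
            parts.append("=" if pos[axis] == prev_pos[axis] else "<")
            parts.append(name)
        return "".join(parts)

    return axis_string(0), axis_string(1)
```

For a frame with A at (1, 5), B at (3, 2) and C at (3, 9), the pair is `("A<B=C", "B<A<C")`. Consecutive frames of a video usually differ in only a few positions of these strings, which is what makes storing edits rather than full strings attractive.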

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper presents different application scenarios for which the registration of sub-sequence reconstructions or multi-camera reconstructions is essential for successful camera motion estimation and 3D reconstruction from video. The registration is achieved by merging unconnected feature point tracks between the reconstructions. One application is drift removal for sequential camera motion estimation of long sequences. The state-of-the-art in drift removal is to apply a RANSAC approach to find unconnected feature point tracks. In this paper an alternative spectral algorithm for pairwise matching of unconnected feature point tracks is used. It is then shown that the algorithms can be combined and applied to novel scenarios where independent camera motion estimations must be registered into a common global coordinate system. In the first scenario multiple moving cameras, which capture the same scene simultaneously, are registered. A second new scenario occurs in situations where the tracking of feature points during sequential camera motion estimation fails completely, e.g., due to large occluding objects in the foreground, and the unconnected tracks of the independent reconstructions must be merged. In the third scenario image sequences of the same scene, which are captured under different illuminations, are registered. Several experiments with challenging real video sequences demonstrate that the presented techniques work in practice.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper presents an empirical study of affine invariant feature detectors for matching on video sequences of people with non-rigid surface deformation. Recent advances in feature detection and wide baseline matching have focused on static scenes. Video frames of human movement capture highly non-rigid deformation such as loose hair, cloth creases, skin stretching and free-flowing clothing. This study evaluates the performance of six widely used feature detectors for sparse temporal correspondence on single-view and multiple-view video sequences. Quantitative evaluation is performed of both the number of features detected and their temporal matching, with and without ground-truth correspondence. Recall-accuracy analysis of feature matching is reported for temporal correspondence on single-view and multiple-view sequences of people with variation in clothing and movement. This analysis shows that existing feature detection and matching algorithms are unreliable for fast movement with common clothing.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Storing and recalling spiking sequences is a general problem the brain needs to solve. It is, however, unclear what type of biologically plausible learning rule is suited to learn a wide class of spatiotemporal activity patterns in a robust way. Here we consider a recurrent network of stochastic spiking neurons composed of both visible and hidden neurons. We derive a generic learning rule that is matched to the neural dynamics by minimizing an upper bound on the Kullback–Leibler divergence from the target distribution to the model distribution. The derived learning rule is consistent with spike-timing dependent plasticity in that a presynaptic spike preceding a postsynaptic spike elicits potentiation while otherwise depression emerges. Furthermore, the learning rule for synapses that target visible neurons can be matched to the recently proposed voltage-triplet rule. The learning rule for synapses that target hidden neurons is modulated by a global factor, which shares properties with astrocytes and gives rise to testable predictions.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A real-time large scale part-to-part video matching algorithm, based on the cross correlation of the intensity of motion curves, is proposed with a view to originality recognition, video database cleansing, copyright enforcement, video tagging or video result re-ranking. Moreover, it is suggested how the most representative hashes and distance functions - strada, discrete cosine transformation, Marr-Hildreth and radial - should be integrated in order for the matching algorithm to be invariant against blur, compression and rotation distortions: (R, σ) ∈ [1, 20] × [1, 8], from 512×512 to 32×32 pixels², and from 10° to 180°. The DCT hash is invariant against blur and compression up to 64×64 pixels². Nevertheless, although its performance against rotation is the best, with a success rate of up to 70%, it should be combined with the Marr-Hildreth distance function. With the latter, the image selected by the DCT hash should be at a distance lower than 1.15 times the Marr-Hildreth minimum distance.
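Matching by cross correlation of motion-intensity curves can be illustrated with a small sliding-window search. This sketch assumes each video has already been reduced to one motion-intensity value per frame and uses a zero-mean normalized cross correlation as the similarity score; the actual curve extraction, hashing and thresholds of the paper are not shown. The function names are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized cross correlation of two equal-length curves."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat curve carries no motion pattern to match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_offset(query, reference):
    """Slide the query curve over the reference; return (offset, score)."""
    best_position, best_score = -1, -2.0
    for offset in range(len(reference) - len(query) + 1):
        window = reference[offset:offset + len(query)]
        score = normalized_correlation(query, window)
        if score > best_score:
            best_position, best_score = offset, score
    return best_position, best_score
```

A score close to 1 at some offset suggests that the query clip appears as a part of the reference video starting at that frame.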

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Molecular and fragment ion data of intact 8- to 43-kDa proteins from electrospray Fourier-transform tandem mass spectrometry are matched against the corresponding data in sequence data bases. Extending the sequence tag concept of Mann and Wilm for matching peptides, a partial amino acid sequence in the unknown is first identified from the mass differences of a series of fragment ions, and the mass position of this sequence is defined from molecular weight and the fragment ion masses. For three studied proteins, a single sequence tag retrieved only the correct protein from the data base; a fourth protein required the input of two sequence tags. However, three of the data base proteins differed by having an extra methionine or by missing an acetyl or heme substitution. The positions of these modifications in the protein examined were greatly restricted by the mass differences of its molecular and fragment ions versus those of the data base. To characterize the primary structure of an unknown represented in the data base, this method is fast and specific and does not require prior enzymatic or chemical degradation.
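The step of reading a partial sequence from the mass differences of a series of fragment ions can be sketched as below. The sketch assumes monoisotopic residue masses for a small illustrative subset of amino acids, a fixed mass tolerance, and fragment masses that all belong to one ion series; the read direction relative to the protein termini depends on the ion type and is not handled. The table `RESIDUE_MASSES` and the function `sequence_tag` are hypothetical names.

```python
# Monoisotopic residue masses in Da for an illustrative subset of amino acids.
RESIDUE_MASSES = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "T": 101.04768,
}


def sequence_tag(fragment_masses, tolerance=0.01):
    """Read a residue tag from the differences of consecutive fragment masses."""
    ordered = sorted(fragment_masses)
    tag = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        difference = heavier - lighter
        matches = [residue for residue, mass in RESIDUE_MASSES.items()
                   if abs(difference - mass) <= tolerance]
        # "?" marks a gap that matches no residue in the table.
        tag.append(matches[0] if matches else "?")
    return "".join(tag)
```

For fragments at 1000.0, 1057.02146, 1128.05857 and 1215.0906 Da, the differences correspond to G, A and S, so the tag is "GAS"; such a tag, together with its mass position, is what would be searched against a sequence database.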

Relevance:

40.00% 40.00%

Publisher: