957 results for Video-art


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Hand signals are commonly used in applications such as giving instructions to a pilot for airplane take off or direction of a crane operator by a foreman on the ground. A new algorithm for recognizing hand signals from a single camera is proposed. Typically, tracked 2D feature positions of hand signals are matched to 2D training images. In contrast, our approach matches the 2D feature positions to an archive of 3D motion capture sequences. The method avoids explicit reconstruction of the 3D articulated motion from 2D image features. Instead, the matching between the 2D and 3D sequence is done by backprojecting the 3D motion capture data onto 2D. Experiments demonstrate the effectiveness of the approach in an example application: recognizing six classes of basketball referee hand signals in video.
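The matching idea in this abstract, comparing tracked 2D features against 3D motion capture data projected into the image, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the pinhole camera with a known focal length, the pre-aligned sequences of equal length, the mean Euclidean distance, and the names `project`, `sequence_distance`, and `classify` do not come from the paper, which presumably also handles viewpoint and timing differences.

```python
import math

def project(points3d, focal=1.0):
    """Pinhole projection of (x, y, z) points with z > 0 onto the image plane."""
    return [(focal * x / z, focal * y / z) for x, y, z in points3d]

def sequence_distance(observed2d, mocap3d, focal=1.0):
    """Mean Euclidean distance between observed 2D features and projected 3D frames.

    Assumes both sequences have the same number of frames and features (hypothetical).
    """
    total, count = 0.0, 0
    for frame2d, frame3d in zip(observed2d, mocap3d):
        for (u, v), (pu, pv) in zip(frame2d, project(frame3d, focal)):
            total += math.hypot(u - pu, v - pv)
            count += 1
    return total / count

def classify(observed2d, archive, focal=1.0):
    """Return the label of the archived 3D sequence whose projection is closest."""
    return min(archive, key=lambda label: sequence_distance(observed2d, archive[label], focal))

# Toy archive: two signal classes, two frames each, two tracked features per frame.
archive = {
    "stop": [[(0.0, 1.0, 2.0), (0.5, 1.0, 2.0)], [(0.0, 1.2, 2.0), (0.5, 1.2, 2.0)]],
    "travel": [[(0.0, 0.0, 2.0), (0.5, 0.0, 2.0)], [(0.3, 0.0, 2.0), (0.8, 0.0, 2.0)]],
}
# Noisy 2D observation of the "stop" sequence.
observed = [[(0.01, 0.5), (0.25, 0.49)], [(0.0, 0.61), (0.26, 0.6)]]
best = classify(observed, archive)
```

Note that no 3D pose is ever reconstructed from the image: only the archived 3D data is mapped into 2D, which is the property the abstract emphasizes.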

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Locating hands in sign language video is challenging due to a number of factors. Hand appearance varies widely across signers due to anthropometric variations and varying levels of signer proficiency. Video can be captured under varying illumination, camera resolutions, and levels of scene clutter, e.g., high-res video captured in a studio vs. low-res video gathered by a web cam in a user’s home. Moreover, the signers’ clothing varies, e.g., skin-toned clothing vs. contrasting clothing, short-sleeved vs. long-sleeved shirts, etc. In this work, the hand detection problem is addressed in an appearance matching framework. The Histogram of Oriented Gradient (HOG) based matching score function is reformulated to allow non-rigid alignment between pairs of images to account for hand shape variation. The resulting alignment score is used within a Support Vector Machine hand/not-hand classifier for hand detection. The new matching score function yields improved performance (in ROC area and hand detection rate) over the Vocabulary Guided Pyramid Match Kernel (VGPMK) and the traditional, rigid HOG distance on American Sign Language video gestured by expert signers. The proposed match score function is computationally less expensive (for training and testing), has fewer parameters and is less sensitive to parameter settings than VGPMK. The proposed detector works well on test sequences from an inexpert signer in a non-studio setting with cluttered background.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the LDA bag-of-words model for appearance. The encoded appearance is then used to establish probable matching across cameras. Preliminary experiments are conducted on a dataset of 20 individuals and comparison against Madden’s I-MCHR is reported.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Dynamic service aggregation techniques can exploit skewed access popularity patterns to reduce the costs of building interactive VoD systems. These schemes seek to cluster and merge users into single streams by bridging the temporal skew between them, thus improving server and network utilization. Rate adaptation and secondary content insertion are two such schemes. In this paper, we present and evaluate an optimal scheduling algorithm for inserting secondary content in this scenario. The algorithm runs in polynomial time, and is optimal with respect to the total bandwidth usage over the merging interval. We present constraints on content insertion which make the overall QoS of the delivered stream acceptable, and show how our algorithm can satisfy these constraints. We report simulation results which quantify the excellent gains due to content insertion. We discuss dynamic scenarios with user arrivals and interactions, and show that content insertion reduces the channel bandwidth requirement to almost half. We also discuss differentiated service techniques, such as N-VoD and premium no-advertisement service, and show how our algorithm can support these as well.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We introduce a view-point invariant representation of moving object trajectories that can be used in video database applications. It is assumed that trajectories lie on a surface that can be locally approximated with a plane. Raw trajectory data is first locally approximated with a cubic spline via least squares fitting. For each sampled point of the obtained curve, a projective invariant feature is computed using a small number of points in its neighborhood. The resulting sequence of invariant features computed along the entire trajectory forms the view invariant descriptor of the trajectory itself. Time parametrization has been exploited to compute cross ratios without ambiguity due to point ordering. Similarity between descriptors of different trajectories is measured with a distance that takes into account the statistical properties of the cross ratio, and its symmetry with respect to the point at infinity. In experiments, an overall correct classification rate of about 95% has been obtained on a dataset of 58 trajectories of players in soccer video, and an overall correct classification rate of about 80% has been obtained on matching partial segments of trajectories collected from two overlapping views of outdoor scenes with moving people and cars.
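The projective invariant underlying this descriptor is the cross ratio. As a hedged illustration, the sketch below checks the simplest case: the cross ratio of four collinear points is unchanged by a 1D projective map. The function names and the example map are assumptions; the planar construction the paper builds from neighboring trajectory points is not specified in the abstract and is not reproduced here.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four distinct collinear points given by scalar coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def projective_map(x, h):
    """Apply the 1D projective map x -> (p*x + q) / (r*x + s), with h = ((p, q), (r, s))."""
    (p, q), (r, s) = h
    return (p * x + q) / (r * x + s)

points = [0.0, 1.0, 2.0, 3.0]
h = ((2.0, 1.0), (1.0, 3.0))  # hypothetical change of viewpoint
mapped = [projective_map(x, h) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
```

Here `before` and `after` agree up to floating-point rounding (both equal 4/3), even though the individual point coordinates change. The value does depend on the order of the four points, which is why the abstract notes that time parametrization is used to fix the point ordering.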

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Moving cameras are needed for a wide range of applications in robotics, vehicle systems, surveillance, etc. However, many foreground object segmentation methods reported in the literature are unsuitable for such settings; these methods assume that the camera is fixed and the background changes slowly, and are inadequate for segmenting objects in video if there is significant motion of the camera or background. To address this shortcoming, a new method for segmenting foreground objects is proposed that utilizes binocular video. The method is demonstrated in the application of tracking and segmenting people in video who are approximately facing the binocular camera rig. Given a stereo image pair, the system first tries to find faces. Starting at each face, the region containing the person is grown by merging regions from an over-segmented color image. The disparity map is used to guide this merging process. The system has been implemented on a consumer-grade PC, and tested on video sequences of people indoors obtained from a moving camera rig. As can be expected, the proposed method works well in situations where other foreground-background segmentation methods typically fail. We believe that this superior performance is partly due to the use of object detection to guide region merging in disparity/color foreground segmentation, and partly due to the use of disparity information available with a binocular rig, in contrast with most previous methods that assumed monocular sequences.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this project we design and implement a centralized hashing table in the snBench sensor network environment. We discuss the feasibility of this approach and compare and contrast with the distributed hashing architecture, with particular discussion regarding the conditions under which a centralized architecture makes sense. There are numerous computational tasks that require persistence of data in a sensor network environment. To help motivate the need for data storage in snBench we demonstrate a practical application of the technology whereby a video camera can monitor a room to detect the presence of a person and send an alert to the appropriate authorities.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

ACT is compared with a particular type of connectionist model that cannot handle symbols and uses non-biological operations that cannot learn in real time. This focus continues an unfortunate trend of straw man "debates" in cognitive science. Adaptive Resonance Theory, or ART, neural models of cognition can handle both symbols and sub-symbolic representations, and meet the Newell criteria at least as well as these models.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Memories in Adaptive Resonance Theory (ART) networks are based on matched patterns that focus attention on those portions of bottom-up inputs that match active top-down expectations. While this learning strategy has proved successful for both brain models and applications, computational examples show that attention to early critical features may later distort memory representations during online fast learning. For supervised learning, biased ARTMAP (bARTMAP) solves the problem of over-emphasis on early critical features by directing attention away from previously attended features after the system makes a predictive error. Small-scale, hand-computed analog and binary examples illustrate key model dynamics. Two-dimensional simulation examples demonstrate the evolution of bARTMAP memories as they are learned online. Benchmark simulations show that featural biasing also improves performance on large-scale examples. One example, which predicts movie genres and is based, in part, on the Netflix Prize database, was developed for this project. Both first principles and consistent performance improvements on all simulation studies suggest that featural biasing should be incorporated by default in all ARTMAP systems. Benchmark datasets and bARTMAP code are available from the CNS Technology Lab Website: http://techlab.bu.edu/bART/.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we introduce the Generalized Equality Classifier (GEC) for use as an unsupervised clustering algorithm in categorizing analog data. GEC is based on a formal definition of inexact equality originally developed for voting in fault tolerant software applications. GEC is defined using a metric space framework. The only parameter in GEC is a scalar threshold which defines the approximate equality of two patterns. Here, we compare the characteristics of GEC to the ART2-A algorithm (Carpenter, Grossberg, and Rosen, 1991). In particular, we show that GEC with the Hamming distance performs the same optimization as ART2. Moreover, GEC has lower computational requirements than ART2 on serial machines.
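The abstract does not spell out the GEC procedure, so the following is only a rough sketch of the general idea it describes: clustering driven by a single scalar threshold that decides when two patterns count as approximately equal under a metric. The leader-style assignment rule, the choice of the first stored pattern as a cluster representative, and the names `hamming` and `threshold_cluster` are all assumptions, not the authors' algorithm.

```python
def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def threshold_cluster(patterns, threshold, metric=hamming):
    """Assign each pattern to the first cluster whose representative is within `threshold`.

    A pattern that is not approximately equal to any representative starts a new cluster.
    This leader-style rule is a hypothetical stand-in for GEC.
    """
    representatives = []
    labels = []
    for pattern in patterns:
        for index, rep in enumerate(representatives):
            if metric(pattern, rep) <= threshold:
                labels.append(index)
                break
        else:
            representatives.append(pattern)
            labels.append(len(representatives) - 1)
    return labels, representatives

patterns = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
labels, representatives = threshold_cluster(patterns, threshold=1)
```

With a threshold of 1, the first two patterns share a cluster and the last two share another. Because any metric can be passed in, the same sketch applies to analog data with, for example, a Euclidean distance.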

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper introduces ART-EMAP, a neural architecture that uses spatial and temporal evidence accumulation to extend the capabilities of fuzzy ARTMAP. ART-EMAP combines supervised and unsupervised learning and a medium-term memory process to accomplish stable pattern category recognition in a noisy input environment. The ART-EMAP system features (i) distributed pattern registration at a view category field; (ii) a decision criterion for mapping between view and object categories which can delay categorization of ambiguous objects and trigger an evidence accumulation process when faced with a low confidence prediction; (iii) a process that accumulates evidence at a medium-term memory (MTM) field; and (iv) an unsupervised learning algorithm to fine-tune performance after a limited initial period of supervised network training. ART-EMAP dynamics are illustrated with a benchmark simulation example. Applications include 3-D object recognition from a series of ambiguous 2-D views.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A model which extends the adaptive resonance theory model to sequential memory is presented. This new model learns sequences of events and recalls a sequence when presented with parts of the sequence. A sequence can have repeated events and different sequences can share events. The ART model is modified by creating interconnected sublayers within ART's F2 layer. Nodes within F2 learn temporal patterns by forming recency gradients within LTM. Versions of the ART model like ART 1, ART 2, and fuzzy ART can be used.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning. Applications include spatio-temporal image understanding and prediction and 3-D object recognition from a series of ambiguous 2-D views. The architecture, called ART-EMAP, achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP). ART-EMAP extends the capabilities of fuzzy ARTMAP in four incremental stages. Stage 1 introduces distributed pattern representation at a view category field. Stage 2 adds a decision criterion to the mapping between view and object categories, delaying identification of ambiguous objects when faced with a low confidence prediction. Stage 3 augments the system with a field where evidence accumulates in medium-term memory (MTM). Stage 4 adds an unsupervised learning process to fine-tune performance after the limited initial period of supervised network training. Each ART-EMAP stage is illustrated with a benchmark simulation example, using both noisy and noise-free data. A concluding set of simulations demonstrates ART-EMAP performance on a difficult 3-D object recognition problem.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART and supervised fuzzy ARTMAP synthesize fuzzy logic and ART networks by exploiting the formal similarity between the computations of fuzzy subsethood and the dynamics of ART category choice, search, and learning. Fuzzy ART self-organizes stable recognition categories in response to arbitrary sequences of analog or binary input patterns. It generalizes the binary ART 1 model, replacing the set-theoretic intersection (∩) with the fuzzy intersection (∧), or component-wise minimum. A normalization procedure called complement coding leads to a symmetric theory in which the fuzzy intersection and the fuzzy union (∨), or component-wise maximum, play complementary roles. Complement coding preserves individual feature amplitudes while normalizing the input vector, and prevents a potential category proliferation problem. Adaptive weights start equal to one and can only decrease in time. A geometric interpretation of fuzzy ART represents each category as a box that increases in size as weights decrease. A matching criterion controls search, determining how close an input and a learned representation must be for a category to accept the input as a new exemplar. A vigilance parameter (ρ) sets the matching criterion and determines how finely or coarsely an ART system will partition inputs. High vigilance creates fine categories, represented by small boxes. Learning stops when boxes cover the input space. With fast learning, fixed vigilance, and an arbitrary input set, learning stabilizes after just one presentation of each input. A fast-commit slow-recode option allows rapid learning of rare events yet buffers memories against recoding by noisy inputs. Fuzzy ARTMAP unites two fuzzy ART networks to solve supervised learning and prediction problems.
A Minimax Learning Rule controls ARTMAP category structure, conjointly minimizing predictive error and maximizing code compression. Low vigilance maximizes compression but may therefore cause very different inputs to make the same prediction. When this coarse grouping strategy causes a predictive error, an internal match tracking control process increases vigilance just enough to correct the error. ARTMAP automatically constructs a minimal number of recognition categories, or "hidden units," to meet accuracy criteria. An ARTMAP voting strategy improves prediction by training the system several times using different orderings of the input set. Voting assigns confidence estimates to competing predictions given small, noisy, or incomplete training sets. ARPA benchmark simulations illustrate fuzzy ARTMAP dynamics. The chapter also compares fuzzy ARTMAP to Salzberg's Nested Generalized Exemplar (NGE) and to Simpson's Fuzzy Min-Max Classifier (FMMC); and concludes with a summary of ART and ARTMAP applications.
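The unsupervised fuzzy ART mechanics summarized in this abstract (complement coding, the fuzzy intersection as component-wise minimum, vigilance-controlled matching, and weights that start at one and only decrease) can be sketched as follows. This is a minimal illustration under assumptions: the choice parameter `alpha`, the learning rate `beta`, and the choice function |I ∧ w| / (alpha + |w|) follow the commonly published fuzzy ART formulation rather than anything stated in the abstract, and the class and method names are hypothetical. ARTMAP match tracking and voting are not covered.

```python
def complement_code(a):
    """Complement coding: map a in [0, 1]^M to I = (a, 1 - a), so that |I| = M."""
    return list(a) + [1.0 - x for x in a]

def fuzzy_and(x, y):
    """Fuzzy intersection: component-wise minimum."""
    return [min(p, q) for p, q in zip(x, y)]

class FuzzyART:
    def __init__(self, vigilance, alpha=0.001, beta=1.0):
        self.rho = vigilance    # matching criterion (vigilance parameter)
        self.alpha = alpha      # choice parameter (assumed symbol)
        self.beta = beta        # learning rate; beta = 1 is fast learning
        self.weights = []       # one weight vector per committed category

    def _choice(self, i, w):
        return sum(fuzzy_and(i, w)) / (self.alpha + sum(w))

    def train(self, a):
        """Present one input and return the index of the category that learns it."""
        i = complement_code(a)
        # Search categories in order of decreasing choice value.
        order = sorted(range(len(self.weights)),
                       key=lambda j: self._choice(i, self.weights[j]),
                       reverse=True)
        for j in order:
            w = self.weights[j]
            overlap = fuzzy_and(i, w)
            if sum(overlap) / sum(i) >= self.rho:  # vigilance test passed
                self.weights[j] = [self.beta * o + (1.0 - self.beta) * wk
                                   for o, wk in zip(overlap, w)]
                return j
        # No category matched: commit a new one whose weights start at one.
        self.weights.append([self.beta * x + (1.0 - self.beta) for x in i])
        return len(self.weights) - 1

net = FuzzyART(vigilance=0.75)
labels = [net.train(a) for a in [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9)]]
```

In this toy run the two nearby inputs fall into the same category and the distant input commits a second one. Raising `vigilance` makes the match test stricter and yields finer categories, consistent with the abstract's description of small boxes at high vigilance.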

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART and supervised fuzzy ARTMAP networks synthesize fuzzy logic and ART by exploiting the formal similarity between the computations of fuzzy subsethood and the dynamics of ART category choice, search, and learning. Fuzzy ART self-organizes stable recognition categories in response to arbitrary sequences of analog or binary input patterns. It generalizes the binary ART 1 model, replacing the set-theoretic intersection (∩) with the fuzzy intersection (∧), or component-wise minimum. A normalization procedure called complement coding leads to a symmetric theory in which the fuzzy intersection and the fuzzy union (∨), or component-wise maximum, play complementary roles. A geometric interpretation of fuzzy ART represents each category as a box that increases in size as weights decrease. This paper analyzes fuzzy ART models that employ various choice functions for category selection. One such function minimizes total weight change during learning. Benchmark simulations compare performance of fuzzy ARTMAP systems that use different choice functions.