5 results for change detection, visione stereo, background difference

in Boston University Digital Common


Relevance:

40.00%

Publisher:

Abstract:

Moving cameras are needed for a wide range of applications in robotics, vehicle systems, surveillance, etc. However, many foreground object segmentation methods reported in the literature are unsuitable for such settings; these methods assume that the camera is fixed and the background changes slowly, and are inadequate for segmenting objects in video if there is significant motion of the camera or background. To address this shortcoming, a new method for segmenting foreground objects is proposed that utilizes binocular video. The method is demonstrated in the application of tracking and segmenting people in video who are approximately facing the binocular camera rig. Given a stereo image pair, the system first tries to find faces. Starting at each face, the region containing the person is grown by merging regions from an over-segmented color image. The disparity map is used to guide this merging process. The system has been implemented on a consumer-grade PC, and tested on video sequences of people indoors obtained from a moving camera rig. As can be expected, the proposed method works well in situations where other foreground-background segmentation methods typically fail. We believe that this superior performance is partly due to the use of object detection to guide region merging in disparity/color foreground segmentation, and partly due to the use of disparity information available with a binocular rig, in contrast with most previous methods that assumed monocular sequences.
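
The disparity-guided merging step described above can be illustrated with a minimal sketch. It assumes the over-segmentation, the face detection, and the per-segment mean disparities are already available; the function name `grow_person_region`, the threshold `tol`, and the toy segment names are hypothetical and are not taken from the paper.

```python
from collections import deque

def grow_person_region(seed, mean_disparity, neighbors, tol=2.0):
    """Grow a region from `seed` by merging adjacent segments of similar disparity.

    seed           -- id of the segment containing the detected face (assumed given)
    mean_disparity -- dict: segment id -> mean disparity of that segment
    neighbors      -- dict: segment id -> iterable of adjacent segment ids
    tol            -- hypothetical threshold on the disparity difference to the seed
    """
    reference = mean_disparity[seed]
    merged = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt in merged:
                continue
            # Merge only segments at roughly the same depth as the face.
            if abs(mean_disparity[nxt] - reference) <= tol:
                merged.add(nxt)
                queue.append(nxt)
    return merged

# Toy scene: a person (large disparity, close to the rig) in front of a wall and floor.
mean_disparity = {"face": 30.0, "torso": 29.0, "legs": 28.5, "wall": 8.0, "floor": 12.0}
neighbors = {
    "face": ["torso", "wall"],
    "torso": ["face", "legs", "wall"],
    "legs": ["torso", "floor"],
    "wall": ["face", "torso", "floor"],
    "floor": ["legs", "wall"],
}
person = grow_person_region("face", mean_disparity, neighbors)
```

Because the merge criterion compares depth rather than appearance over time, it does not depend on the background being static, which is the property the abstract emphasizes for moving camera rigs.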

Relevance:

30.00%

Publisher:

Abstract:

This dissertation describes a model for acoustic propagation in inhomogeneous fluids, and explores the focusing by arrays onto targets under various conditions. The work explores the use of arrays, in particular the time reversal array, for underwater and biomedical applications. Aspects of propagation and phasing which can lead to reduced focusing effectiveness are described. An acoustic wave equation was derived for the propagation of finite-amplitude waves in lossy time-varying inhomogeneous fluid media. The equation was solved numerically in both Cartesian and cylindrical geometries using the finite-difference time-domain (FDTD) method. It was found that time reversal arrays are sensitive to several debilitating factors. Focusing ability was determined to be adequate in the presence of temporal jitter in the time reversed signal only up to about one-sixth of a period. Thermoviscous absorption also had a debilitating effect on focal pressure for both linear and nonlinear propagation. It was also found that nonlinearity leads to degradation of focal pressure through amplification of the received signal at the array, and enhanced absorption in the shocked waveforms. This dissertation also examined the heating effects of focused ultrasound in a tissue-like medium. The application considered is therapeutic heating for hyperthermia. The acoustic model and a thermal model for tissue were coupled to solve for transient and steady temperature profiles in tissue-like media. The Pennes bioheat equation was solved using the FDTD method to calculate the temperature fields in tissue-like media from focused acoustic sources. It was found that the temperature-dependence of the medium's background properties can play an important role in the temperature predictions. Finite-amplitude effects contributed excess heat when source conditions were provided for nonlinear effects to manifest themselves.
The effect of medium heterogeneity was also found to be important in redistributing the acoustic and temperature fields, creating regions with hotter and colder temperatures than the mean by local scattering and lensing action. These temperature excursions from the mean were found to increase monotonically with increasing contrast in the medium's properties.
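
As a rough illustration of the thermal part of the model, the sketch below advances a one-dimensional Pennes bioheat equation, rho*c*dT/dt = k*d2T/dx2 - wb_cb*(T - T_a) + Q, with an explicit finite-difference time step. It is a simplified sketch, not the dissertation's solver: the 1-D geometry, the fixed-temperature boundaries, the constant material values, and the point heat source standing in for the acoustic focus are all assumptions chosen for illustration.

```python
def pennes_step(T, Q, dt, dx, k=0.5, rho=1000.0, c=4000.0, wb_cb=2000.0, T_a=37.0):
    """Advance the 1-D Pennes bioheat equation by one explicit time step.

    T      -- list of node temperatures (deg C)
    Q      -- list of volumetric heat sources (W/m^3), e.g. absorbed acoustic power
    k      -- thermal conductivity (W/m/K), illustrative value
    rho, c -- tissue density (kg/m^3) and specific heat (J/kg/K), illustrative values
    wb_cb  -- blood perfusion rate times blood specific heat (W/m^3/K), illustrative
    T_a    -- arterial blood temperature (deg C)
    The two boundary nodes are held at their current values.
    """
    alpha = k / (rho * c)
    # Diffusion stability limit of the explicit scheme (perfusion term ignored).
    if dt > dx * dx / (2.0 * alpha):
        raise ValueError("time step too large for the explicit scheme")
    new_T = list(T)
    for i in range(1, len(T) - 1):
        laplacian = (T[i - 1] - 2.0 * T[i] + T[i + 1]) / (dx * dx)
        dTdt = (k * laplacian - wb_cb * (T[i] - T_a) + Q[i]) / (rho * c)
        new_T[i] = T[i] + dt * dTdt
    return new_T

def simulate(n=21, steps=100, dt=0.1, dx=1e-3, q_focus=1e6):
    """Heat a tissue-like slab with a single-node source at its centre."""
    T = [37.0] * n
    Q = [0.0] * n
    Q[n // 2] = q_focus  # hypothetical stand-in for heating at the acoustic focus
    for _ in range(steps):
        T = pennes_step(T, Q, dt, dx)
    return T

profile = simulate()
```

In a coupled simulation of the kind the abstract describes, `Q` would come from the acoustic field and the material values could vary with temperature and position; here they are held constant to keep the sketch short.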

Relevance:

30.00%

Publisher:

Abstract:

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. Detector training can be accomplished via standard SVM learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the foreground parameters are provided in training, the detectors can also produce parameter estimates. When the foreground object masks are provided in training, the detectors can also produce object segmentations. The advantages of our method over past methods are demonstrated on data sets of human hands and vehicles.
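
The multiplicative form can be sketched as a product of two kernels, one over image features and one over the within-class parameter. The example below is only an illustration: the choice of Gaussian (RBF) kernels for both factors, the bandwidths `gamma_x` and `gamma_theta`, and the representation of a sample as a `(features, parameters)` pair are assumptions, not details given in the abstract.

```python
import math

def rbf(u, v, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def multiplicative_kernel(sample_a, sample_b, gamma_x=0.5, gamma_theta=2.0):
    """Product of a feature kernel and a within-class (e.g. pose) kernel.

    Each sample is a pair (x, theta): x is an image feature vector and theta is
    the foreground parameter vector. Both bandwidths are hypothetical values.
    """
    (x_a, theta_a), (x_b, theta_b) = sample_a, sample_b
    return rbf(x_a, x_b, gamma_x) * rbf(theta_a, theta_b, gamma_theta)

def gram_matrix(samples):
    """Gram matrix that a standard SVM solver could take as a precomputed kernel."""
    return [[multiplicative_kernel(a, b) for b in samples] for a in samples]

samples = [
    ([1.0, 2.0], [0.0]),   # same appearance, pose 0.0
    ([1.0, 2.0], [1.0]),   # same appearance, different pose
    ([3.0, 0.0], [0.0]),   # different appearance, pose 0.0
]
K = gram_matrix(samples)
```

A product of two valid kernels is itself a valid kernel, which is why training can reuse a standard SVM solver; for instance, scikit-learn's `SVC` accepts such a matrix when constructed with `kernel="precomputed"`.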

Relevance:

30.00%

Publisher:

Abstract:

Locating hands in sign language video is challenging due to a number of factors. Hand appearance varies widely across signers due to anthropometric variations and varying levels of signer proficiency. Video can be captured under varying illumination, camera resolutions, and levels of scene clutter, e.g., high-res video captured in a studio vs. low-res video gathered by a web cam in a user’s home. Moreover, the signers’ clothing varies, e.g., skin-toned clothing vs. contrasting clothing, short-sleeved vs. long-sleeved shirts, etc. In this work, the hand detection problem is addressed in an appearance matching framework. The Histogram of Oriented Gradient (HOG) based matching score function is reformulated to allow non-rigid alignment between pairs of images to account for hand shape variation. The resulting alignment score is used within a Support Vector Machine hand/not-hand classifier for hand detection. The new matching score function yields improved performance (in ROC area and hand detection rate) over the Vocabulary Guided Pyramid Match Kernel (VGPMK) and the traditional, rigid HOG distance on American Sign Language video gestured by expert signers. The proposed match score function is computationally less expensive (for training and testing), has fewer parameters and is less sensitive to parameter settings than VGPMK. The proposed detector works well on test sequences from an inexpert signer in a non-studio setting with cluttered background.
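
The abstract does not spell out how the non-rigid alignment is formulated, so the sketch below shows only one simple way to let a HOG-style distance tolerate local deformation: each cell histogram in the first image is matched to its best counterpart within a small window in the second image. The function names, the `radius` parameter, and the squared-difference cell cost are assumptions for illustration and should not be read as the authors' score function.

```python
def cell_distance(h1, h2):
    """Squared Euclidean distance between two cell histograms."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def deformable_hog_distance(grid_a, grid_b, radius=1):
    """Sum over cells of the best match found within `radius` cells.

    grid_a, grid_b -- equally sized 2-D grids (lists of rows) of cell histograms
    radius         -- hypothetical displacement allowed per cell; 0 gives the
                      rigid cell-by-cell distance
    """
    rows, cols = len(grid_a), len(grid_a[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            best = None
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        d = cell_distance(grid_a[r][c], grid_b[rr][cc])
                        if best is None or d < best:
                            best = d
            total += best
    return total

# Two toy 2x2 grids of 2-bin histograms; grid_b swaps the two top cells of grid_a.
grid_a = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]
grid_b = [[[0, 1], [1, 0]], [[1, 1], [0, 0]]]
rigid = deformable_hog_distance(grid_a, grid_b, radius=0)
flexible = deformable_hog_distance(grid_a, grid_b, radius=1)
```

In the toy example the rigid distance penalizes the swapped cells while the flexible distance does not, which mirrors the motivation of accounting for hand shape variation. A score of this kind could then feed a hand/not-hand SVM classifier, as the abstract describes.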

Relevance:

30.00%

Publisher:

Abstract:

A method for deformable shape detection and recognition is described. Deformable shape templates are used to partition the image into a globally consistent interpretation, determined in part by the minimum description length principle. Statistical shape models enforce the prior probabilities on global, parametric deformations for each object class. Once trained, the system autonomously segments deformed shapes from the background, while not merging them with adjacent objects or shadows. The formulation can be used to group image regions based on any image homogeneity predicate; e.g., texture, color, or motion. The recovered shape models can be used directly in object recognition. Experiments with color imagery are reported.