69 resultados para visual representation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Our everyday visual experience frequently involves searching for objects in clutter. Why are some searches easy and others hard? It is generally believed that the time taken to find a target increases as it becomes similar to its surrounding distractors. Here, I show that while this is qualitatively true, the exact relationship is in fact not linear. In a simple search experiment, when subjects searched for a bar differing in orientation from its distractors, search time was inversely proportional to the angular difference in orientation. Thus, rather than taking search reaction time (RT) to be a measure of target-distractor similarity, we can literally turn search time on its head (i.e. take its reciprocal 1/RT) to obtain a measure of search dissimilarity that varies linearly over a large range of target-distractor differences. I show that this dissimilarity measure has the properties of a distance metric, and report two interesting insights come from this measure: First, for a large number of searches, search asymmetries are relatively rare and when they do occur, differ by a fixed distance. Second, search distances can be used to elucidate object representations that underlie search - for example, these representations are roughly invariant to three-dimensional view. Finally, search distance has a straightforward interpretation in the context of accumulator models of search, where it is proportional to the discriminative signal that is integrated to produce a response. This is consistent with recent studies that have linked this distance to neuronal discriminability in visual cortex. Thus, while search time remains the more direct measure of visual search, its reciprocal also has the potential for interesting and novel insights. (C) 2012 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Bhutani N, Ray S, Murthy A. Is saccade averaging determined by visual processing or movement planning? J Neurophysiol 108: 3161-3171, 2012. First published September 26, 2012; doi:10.1152/jn.00344.2012.-Saccadic averaging that causes subjects' gaze to land between the location of two targets when faced with simultaneously or sequentially presented stimuli has been often used as a probe to investigate the nature of computations that transform sensory representations into an oculomotor plan. Since saccadic movements involve at least two processing stages-a visual stage that selects a target and a movement stage that prepares the response-saccade averaging can either occur due to interference in visual processing or movement planning. By having human subjects perform two versions of a saccadic double-step task, in which the stimuli remained the same, but different instructions were provided (REDIRECT gaze to the later-appearing target vs. FOLLOW the sequence of targets in their order of appearance), we tested two alternative hypotheses. If saccade averaging were due to visual processing alone, the pattern of saccade averaging is expected to remain the same across task conditions. However, whereas subjects produced averaged saccades between two targets in the FOLLOW condition, they produced hypometric saccades in the direction of the initial target in the REDIRECT condition, suggesting that the interaction between competing movement plans produces saccade averaging.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider a visual search problem studied by Sripati and Olson where the objective is to identify an oddball image embedded among multiple distractor images as quickly as possible. We model this visual search task as an active sequential hypothesis testing problem (ASHT problem). Chernoff in 1959 proposed a policy in which the expected delay to decision is asymptotically optimal. The asymptotics is under vanishing error probabilities. We first prove a stronger property on the moments of the delay until a decision, under the same asymptotics. Applying the result to the visual search problem, we then propose a ``neuronal metric'' on the measured neuronal responses that captures the discriminability between images. From empirical study we obtain a remarkable correlation (r = 0.90) between the proposed neuronal metric and speed of discrimination between the images. Although this correlation is lower than with the L-1 metric used by Sripati and Olson, this metric has the advantage of being firmly grounded in formal decision theory.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of human detection is challenging, more so, when faced with adverse conditions such as occlusion and background clutter. This paper addresses the problem of human detection by representing an extracted feature of an image using a sparse linear combination of chosen dictionary atoms. The detection along with the scale finding, is done by using the coefficients obtained from sparse representation. This is of particular interest as we address the problem of scale using a scale-embedded dictionary where the conventional methods detect the object by running the detection window at all scales.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a unified framework using the unit cube for measurement, representation and usage of the range of motion (ROM) of body joints with multiple degrees of freedom (d.o.f) to be used for digital human models (DHM). Traditional goniometry needs skill and kn owledge; it is intrusive and has limited applicability for multi-d.o.f. joints. Measurements using motion capture systems often involve complicated mathematics which itself need validation. In this paper we use change of orientation as the measure of rotation; this definition does not require the identification of any fixed axis of rotation. A two-d.o.f. joint ROM can be represented as a Gaussian map. Spherical polygon representation of ROM, though popular, remains inaccurate, vulnerable due to singularities on parametric sphere and difficult to use for point classification. The unit cube representation overcomes these difficulties. In the work presented here, electromagnetic trackers have been effectively used for measuring the relative orientation of a body segment of interest with respect to another body segment. The orientation is then mapped on a surface gridded cube. As the body segment is moved, the grid cells visited are identified and visualized. Using the visual display as a feedback, the subject is instructed to cover as many grid cells as he can. In this way we get a connected patch of contiguous grid cells. The boundary of this patch represents the active ROM of the concerned joint. The tracker data is converted into the motion of a direction aligned with the axis of the segment and a rotation about this axis later on. The direction identifies the grid cells on the cube and rotation about the axis is represented as a range and visualized using color codes. Thus the present methodology provides a simple, intuitive and accura te determination and representation of up to 3 d.o.f. joints. Basic results are presented for the shoulder. The measurement scheme to be used for wrist and neck, and approach for estimation of the statistical distribution of ROM for a given population are also discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the problem of extracting a signature representation of similar entities employing covariance descriptors. Covariance descriptors can efficiently represent objects and are robust to scale and pose changes. We posit that covariance descriptors corresponding to similar objects share a common geometrical structure which can be extracted through joint diagonalization. We term this diagonalizing matrix as the Covariance Profile (CP). CP can be used to measure the distance of a novel object to an object set through the diagonality measure. We demonstrate how CP can be employed on images as well as for videos, for applications such as face recognition and object-track clustering.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Real-time object tracking is a critical task in many computer vision applications. Achieving rapid and robust tracking while handling changes in object pose and size, varying illumination and partial occlusion, is a challenging task given the limited amount of computational resources. In this paper we propose a real-time object tracker in l(1) framework addressing these issues. In the proposed approach, dictionaries containing templates of overlapping object fragments are created. The candidate fragments are sparsely represented in the dictionary fragment space by solving the l(1) regularized least squares problem. The non zero coefficients indicate the relative motion between the target and candidate fragments along with a fidelity measure. The final object motion is obtained by fusing the reliable motion information. The dictionary is updated based on the object likelihood map. The proposed tracking algorithm is tested on various challenging videos and found to outperform earlier approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The contour tree is a topological abstraction of a scalar field that captures evolution in level set connectivity. It is an effective representation for visual exploration and analysis of scientific data. We describe a work-efficient, output sensitive, and scalable parallel algorithm for computing the contour tree of a scalar field defined on a domain that is represented using either an unstructured mesh or a structured grid. A hybrid implementation of the algorithm using the GPU and multi-core CPU can compute the contour tree of an input containing 16 million vertices in less than ten seconds with a speedup factor of upto 13. Experiments based on an implementation in a multi-core CPU environment show near-linear speedup for large data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents classification, representation and extraction of deformation features in sheet-metal parts. The thickness is constant for these shape features and hence these are also referred to as constant thickness features. The deformation feature is represented as a set of faces with a characteristic arrangement among the faces. Deformation of the base-sheet or forming of material creates Bends and Walls with respect to a base-sheet or a reference plane. These are referred to as Basic Deformation Features (BDFs). Compound deformation features having two or more BDFs are defined as characteristic combinations of Bends and Walls and represented as a graph called Basic Deformation Features Graph (BDFG). The graph, therefore, represents a compound deformation feature uniquely. The characteristic arrangement of the faces and type of bends belonging to the feature decide the type and nature of the deformation feature. Algorithms have been developed to extract and identify deformation features from a CAD model of sheet-metal parts. The proposed algorithm does not require folding and unfolding of the part as intermediate steps to recognize deformation features. Representations of typical features are illustrated and results of extracting these deformation features from typical sheet metal parts are presented and discussed. (C) 2013 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There is a strong relation between sparse signal recovery and error control coding. It is known that burst errors are block sparse in nature. So, here we attempt to solve burst error correction problem using block sparse signal recovery methods. We construct partial Fourier based encoding and decoding matrices using results on difference sets. These constructions offer guaranteed and efficient error correction when used in conjunction with reconstruction algorithms which exploit block sparsity.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sparse representation based classification (SRC) is one of the most successful methods that has been developed in recent times for face recognition. Optimal projection for Sparse representation based classification (OPSRC)1] provides a dimensionality reduction map that is supposed to give optimum performance for SRC framework. However, the computational complexity involved in this method is too high. Here, we propose a new projection technique using the data scatter matrix which is computationally superior to the optimal projection method with comparable classification accuracy with respect OPSRC. The performance of the proposed approach is benchmarked with various publicly available face database.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Single features such as line orientation and length are known to guide visual search, but relatively little is known about how multiple features combine in search. To address this question, we investigated how search for targets differing in multiple features ( intensity, length, orientation) from the distracters is related to searches for targets differing in each of the individual features. We tested race models (based on reaction times) and coactivation models ( based on reciprocal of reaction times) for their ability to predict multiple feature searches. Multiple feature searches were best accounted for by a co-activation model in which feature information combined linearly (r = 0.95). This result agrees with the classic finding that these features are separable i.e., subjective dissimilarity ratings sum linearly. We then replicated the classical finding that the length and width of a rectangle are integral features-in other words, they combine nonlinearly in visual search. However, to our surprise, upon including aspect ratio as an additional feature, length and width combined linearly and this model outperformed all other models. Thus, length and width of a rectangle became separable when considered together with aspect ratio. This finding predicts that searches involving shapes with identical aspect ratio should be more difficult than searches where shapes differ in aspect ratio. We confirmed this prediction on a variety of shapes. We conclude that features in visual search co-activate linearly and demonstrate for the first time that aspect ratio is a novel feature that guides visual search.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Regions in video streams attracting human interest contribute significantly to human understanding of the video. Being able to predict salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions and smart insertion of dialog (closed-caption text) into the video stream can significantly be improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency into yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical-and algorithm-based method gaze buffering is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Designing a robust algorithm for visual object tracking has been a challenging task since many years. There are trackers in the literature that are reasonably accurate for many tracking scenarios but most of them are computationally expensive. This narrows down their applicability as many tracking applications demand real time response. In this paper, we present a tracker based on random ferns. Tracking is posed as a classification problem and classification is done using ferns. We used ferns as they rely on binary features and are extremely fast at both training and classification as compared to other classification algorithms. Our experiments show that the proposed tracker performs well on some of the most challenging tracking datasets and executes much faster than one of the state-of-the-art trackers, without much difference in tracking accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The concurrent planning of sequential saccades offers a simple model to study the nature of visuomotor transformations since the second saccade vector needs to be remapped to foveate the second target following the first saccade. Remapping is thought to occur through egocentric mechanisms involving an efference copy of the first saccade that is available around the time of its onset. In contrast, an exocentric representation of the second target relative to the first target, if available, can be used to directly code the second saccade vector. While human volunteers performed a modified double-step task, we examined the role of exocentric encoding in concurrent saccade planning by shifting the first target location well before the efference copy could be used by the oculomotor system. The impact of the first target shift on concurrent processing was tested by examining the end-points of second saccades following a shift of the second target during the first saccade. The frequency of second saccades to the old versus new location of the second target, as well as the propagation of first saccade localization errors, both indices of concurrent processing, were found to be significantly reduced in trials with the first target shift compared to those without it. A similar decrease in concurrent processing was obtained when we shifted the first target but kept constant the second saccade vector. Overall, these results suggest that the brain can use relatively stable visual landmarks, independent of efference copy-based egocentric mechanisms, for concurrent planning of sequential saccades.