28 resultados para decoupled image-based visual servoing


Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we introduce a novel high-level visual content descriptor devised for performing semantic-based image classification and retrieval. The work can be treated as an attempt for bridging the so called "semantic gap". The proposed image feature vector model is fundamentally underpinned by an automatic image labelling framework, called Collaterally Cued Labelling (CCL), which incorporates the collateral knowledge extracted from the collateral texts accompanying the images with the state-of-the-art low-level visual feature extraction techniques for automatically assigning textual keywords to image regions. A subset of the Corel image collection was used for evaluating the proposed method. The experimental results indicate that our semantic-level visual content descriptors outperform both conventional visual and textual image feature models.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, a forward-looking infrared (FLIR) video surveillance system is presented for collision avoidance of moving ships to bridge piers. An image preprocessing algorithm is proposed to reduce clutter background by multi-scale fractal analysis, in which the blanket method is used for fractal feature computation. Then, the moving ship detection algorithm is developed from image differentials of the fractal feature in the region of surveillance between regularly interval frames. When the moving ships are detected in region of surveillance, the device for safety alert is triggered. Experimental results have shown that the approach is feasible and effective. It has achieved real-time and reliable alert to avoid collisions of moving ships to bridge piers.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a new image data fusion scheme by combining median filtering with self-organizing feature map (SOFM) neural networks. The scheme consists of three steps: (1) pre-processing of the images, where weighted median filtering removes part of the noise components corrupting the image, (2) pixel clustering for each image using self-organizing feature map neural networks, and (3) fusion of the images obtained in Step (2), which suppresses the residual noise components and thus further improves the image quality. It proves that such a three-step combination offers an impressive effectiveness and performance improvement, which is confirmed by simulations involving three image sensors (each of which has a different noise structure).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

View-based and Cartesian representations provide rival accounts of visual navigation in humans, and here we explore possible models for the view-based case. A visual “homing” experiment was undertaken by human participants in immersive virtual reality. The distributions of end-point errors on the ground plane differed significantly in shape and extent depending on visual landmark configuration and relative goal location. A model based on simple visual cues captures important characteristics of these distributions. Augmenting visual features to include 3D elements such as stereo and motion parallax result in a set of models that describe the data accurately, demonstrating the effectiveness of a view-based approach.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Threat detection is a challenging problem, because threats appear in many variations and differences to normal behaviour can be very subtle. In this paper, we consider threats on a parking lot, where theft of a truck’s cargo occurs. The threats range from explicit, e.g. a person attacking the truck driver, to implicit, e.g. somebody loitering and then fiddling with the exterior of the truck in order to open it. Our goal is a system that is able to recognize a threat instantaneously as they develop. Typical observables of the threats are a person’s activity, presence in a particular zone and the trajectory. The novelty of this paper is an encoding of these threat observables in a semantic, intermediate-level representation, based on low-level visual features that have no intrinsic semantic meaning themselves. The aim of this representation was to bridge the semantic gap between the low-level tracks and motion and the higher-level notion of threats. In our experiments, we demonstrate that our semantic representation is more descriptive for threat detection than directly using low-level features. We find that a person’s activities are the most important elements of this semantic representation, followed by the person’s trajectory. The proposed threat detection system is very accurate: 96.6 % of the tracks are correctly interpreted, when considering the temporal context.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Objective. Interferences from spatially adjacent non-target stimuli are known to evoke event-related potentials (ERPs) during non-target flashes and, therefore, lead to false positives. This phenomenon was commonly seen in visual attention-based brain–computer interfaces (BCIs) using conspicuous stimuli and is known to adversely affect the performance of BCI systems. Although users try to focus on the target stimulus, they cannot help but be affected by conspicuous changes of the stimuli (such as flashes or presenting images) which were adjacent to the target stimulus. Furthermore, subjects have reported that conspicuous stimuli made them tired and annoyed. In view of this, the aim of this study was to reduce adjacent interference, annoyance and fatigue using a new stimulus presentation pattern based upon facial expression changes. Our goal was not to design a new pattern which could evoke larger ERPs than the face pattern, but to design a new pattern which could reduce adjacent interference, annoyance and fatigue, and evoke ERPs as good as those observed during the face pattern. Approach. Positive facial expressions could be changed to negative facial expressions by minor changes to the original facial image. Although the changes are minor, the contrast is big enough to evoke strong ERPs. In this paper, a facial expression change pattern between positive and negative facial expressions was used to attempt to minimize interference effects. This was compared against two different conditions, a shuffled pattern containing the same shapes and colours as the facial expression change pattern, but without the semantic content associated with a change in expression, and a face versus no face pattern. Comparisons were made in terms of classification accuracy and information transfer rate as well as user supplied subjective measures. Main results. The results showed that interferences from adjacent stimuli, annoyance and the fatigue experienced by the subjects could be reduced significantly (p < 0.05) by using the facial expression change patterns in comparison with the face pattern. The offline results show that the classification accuracy of the facial expression change pattern was significantly better than that of the shuffled pattern (p < 0.05) and the face pattern (p < 0.05). Significance. The facial expression change pattern presented in this paper reduced interference from adjacent stimuli and decreased the fatigue and annoyance experienced by BCI users significantly (p < 0.05) compared to the face pattern.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

OBJECTIVE: Interferences from spatially adjacent non-target stimuli are known to evoke event-related potentials (ERPs) during non-target flashes and, therefore, lead to false positives. This phenomenon was commonly seen in visual attention-based brain-computer interfaces (BCIs) using conspicuous stimuli and is known to adversely affect the performance of BCI systems. Although users try to focus on the target stimulus, they cannot help but be affected by conspicuous changes of the stimuli (such as flashes or presenting images) which were adjacent to the target stimulus. Furthermore, subjects have reported that conspicuous stimuli made them tired and annoyed. In view of this, the aim of this study was to reduce adjacent interference, annoyance and fatigue using a new stimulus presentation pattern based upon facial expression changes. Our goal was not to design a new pattern which could evoke larger ERPs than the face pattern, but to design a new pattern which could reduce adjacent interference, annoyance and fatigue, and evoke ERPs as good as those observed during the face pattern. APPROACH: Positive facial expressions could be changed to negative facial expressions by minor changes to the original facial image. Although the changes are minor, the contrast is big enough to evoke strong ERPs. In this paper, a facial expression change pattern between positive and negative facial expressions was used to attempt to minimize interference effects. This was compared against two different conditions, a shuffled pattern containing the same shapes and colours as the facial expression change pattern, but without the semantic content associated with a change in expression, and a face versus no face pattern. Comparisons were made in terms of classification accuracy and information transfer rate as well as user supplied subjective measures. MAIN RESULTS: The results showed that interferences from adjacent stimuli, annoyance and the fatigue experienced by the subjects could be reduced significantly (p < 0.05) by using the facial expression change patterns in comparison with the face pattern. The offline results show that the classification accuracy of the facial expression change pattern was significantly better than that of the shuffled pattern (p < 0.05) and the face pattern (p < 0.05). SIGNIFICANCE: The facial expression change pattern presented in this paper reduced interference from adjacent stimuli and decreased the fatigue and annoyance experienced by BCI users significantly (p < 0.05) compared to the face pattern.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Human observers exhibit large systematic distance-dependent biases when estimating the three-dimensional (3D) shape of objects defined by binocular image disparities. This has led some to question the utility of disparity as a cue to 3D shape and whether accurate estimation of 3D shape is at all possible. Others have argued that accurate perception is possible, but only with large continuous perspective transformations of an object. Using a stimulus that is known to elicit large distance-dependent perceptual bias (random dot stereograms of elliptical cylinders) we show that contrary to these findings the simple adoption of a more naturalistic viewing angle completely eliminates this bias. Using behavioural psychophysics, coupled with a novel surface-based reverse correlation methodology, we show that it is binocular edge and contour information that allows for accurate and precise perception and that observers actively exploit and sample this information when it is available.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The challenge of moving past the classic Window Icons Menus Pointer (WIMP) interface, i.e. by turning it ‘3D’, has resulted in much research and development. To evaluate the impact of 3D on the ‘finding a target picture in a folder’ task, we built a 3D WIMP interface that allowed the systematic manipulation of visual depth, visual aides, semantic category distribution of targets versus non-targets; and the detailed measurement of lower-level stimuli features. Across two separate experiments, one large sample web-based experiment, to understand associations, and one controlled lab environment, using eye tracking to understand user focus, we investigated how visual depth, use of visual aides, use of semantic categories, and lower-level stimuli features (i.e. contrast, colour and luminance) impact how successfully participants are able to search for, and detect, the target image. Moreover in the lab-based experiment, we captured pupillometry measurements to allow consideration of the influence of increasing cognitive load as a result of either an increasing number of items on the screen, or due to the inclusion of visual depth. Our findings showed that increasing the visible layers of depth, and inclusion of converging lines, did not impact target detection times, errors, or failure rates. Low-level features, including colour, luminance, and number of edges, did correlate with differences in target detection times, errors, and failure rates. Our results also revealed that semantic sorting algorithms significantly decreased target detection times. Increased semantic contrasts between a target and its neighbours correlated with an increase in detection errors. Finally, pupillometric data did not provide evidence of any correlation between the number of visible layers of depth and pupil size, however, using structural equation modelling, we demonstrated that cognitive load does influence detection failure rates when there is luminance contrasts between the target and its surrounding neighbours. Results suggest that WIMP interaction designers should consider stimulus-driven factors, which were shown to influence the efficiency with which a target icon can be found in a 3D WIMP interface.