28 results for decoupled image-based visual servoing

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00%

Publisher:

Abstract:

In this paper, we introduce a novel high-level visual content descriptor which is devised for performing semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so-called “semantic gap”. The proposed image feature vector model is fundamentally underpinned by the image labelling framework, called Collaterally Confirmed Labelling (CCL), which incorporates the collateral knowledge extracted from the collateral texts of the images with the state-of-the-art low-level image processing and visual feature extraction techniques for automatically assigning linguistic keywords to image regions. Two different high-level image feature vector models are developed based on the CCL labelling results for the purposes of image data clustering and retrieval respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that our proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.

Relevance:

100.00%

Publisher:

Abstract:

A novel framework referred to as collaterally confirmed labelling (CCL) is proposed, aiming at localising the visual semantics to regions of interest in images with textual keywords. Both the primary image and collateral textual modalities are exploited in a mutually co-referencing and complementary fashion. The collateral content and context-based knowledge is used to bias the mapping from the low-level region-based visual primitives to the high-level visual concepts defined in a visual vocabulary. We introduce the notion of collateral context, which is represented as a co-occurrence matrix of the visual keywords. A collaborative mapping scheme is devised using statistical methods like Gaussian distribution or Euclidean distance together with a collateral content and context-driven inference mechanism. We introduce a novel high-level visual content descriptor that is devised for performing semantic-based image classification and retrieval. The proposed image feature vector model is fundamentally underpinned by the CCL framework. Two different high-level image feature vector models are developed based on the CCL labelling results for the purposes of image data clustering and retrieval, respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that the proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models. © 2007 Elsevier B.V. All rights reserved.
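The collateral context in this abstract is represented as a co-occurrence matrix of visual keywords. The following is a minimal sketch of that idea, assuming each image already carries a set of keywords. The function names, the sample labels, and the row normalisation are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_matrix(labelled_images):
    """Count how often each pair of keywords labels the same image.

    labelled_images: iterable of keyword collections, one per image.
    Returns a nested dict counts[a][b] (symmetric, diagonal not stored).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for keywords in labelled_images:
        # A sorted set visits each unordered pair exactly once per image.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def row_normalise(counts):
    """Scale each row to sum to 1, giving relative co-occurrence weights."""
    weights = {}
    for a, row in counts.items():
        total = sum(row.values())
        weights[a] = {b: n / total for b, n in row.items()}
    return weights


# Hypothetical keyword sets, invented purely for illustration.
images = [
    {"sky", "sea", "sand"},
    {"sky", "sea"},
    {"sky", "grass"},
]
counts = cooccurrence_matrix(images)
weights = row_normalise(counts)
print(counts["sky"]["sea"], weights["sky"]["sea"])
```

Weights of this kind could be used to favour keyword assignments that agree with labels already given to neighbouring regions, which is one plausible reading of "biasing the mapping" in the abstract.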

Relevance:

100.00%

Publisher:

Abstract:

A novel framework for multimodal semantic-associative collateral image labelling, aiming at associating image regions with textual keywords, is described. Both the primary image and collateral textual modalities are exploited in a cooperative and complementary fashion. The collateral content and context-based knowledge is used to bias the mapping from the low-level region-based visual primitives to the high-level visual concepts defined in a visual vocabulary. We introduce the notion of collateral context, which is represented as a co-occurrence matrix of the visual keywords. A collaborative mapping scheme is devised using statistical methods like Gaussian distribution or Euclidean distance together with a collateral content and context-driven inference mechanism. Finally, we use Self Organising Maps to examine the classification and retrieval effectiveness of the proposed high-level image feature vector model, which is constructed based on the image labelling results.

Relevance:

100.00%

Publisher:

Abstract:

A large volume of visual content is inaccessible until effective and efficient indexing and retrieval of such data is achieved. In this paper, we introduce the DREAM system, which is a knowledge-assisted semantic-driven context-aware visual information retrieval system applied in the film post-production domain. We mainly focus on the automatic labelling and topic map related aspects of the framework. The use of the context-related collateral knowledge, represented by a novel probabilistic visual keyword co-occurrence matrix, has been proven effective via the experiments conducted during system evaluation. The automatically generated semantic labels were fed into the Topic Map Engine, which can automatically construct ontological networks using Topic Maps technology, dramatically enhancing the indexing and retrieval performance of the system towards an even higher semantic level.

Relevance:

100.00%

Publisher:

Abstract:

1. Jerdon's courser Rhinoptilus bitorquatus is a nocturnally active cursorial bird that is only known to occur in a small area of scrub jungle in Andhra Pradesh, India, and is listed as critically endangered by the IUCN. Information on its habitat requirements is needed urgently to underpin conservation measures. We quantified the habitat features that correlated with the use of different areas of scrub jungle by Jerdon's coursers, and developed a model to map potentially suitable habitat over large areas from satellite imagery and facilitate the design of surveys of Jerdon's courser distribution. 2. We used 11 arrays of 5-m long tracking strips consisting of smoothed fine soil to detect the footprints of Jerdon's coursers, and measured tracking rates (tracking events per strip night). We counted the number of bushes and trees, and described other attributes of vegetation and substrate in a 10-m square plot centred on each strip. We obtained reflectance data from Landsat 7 satellite imagery for the pixel within which each strip lay. 3. We used logistic regression models to describe the relationship between tracking rate by Jerdon's coursers and characteristics of the habitat around the strips, using ground-based survey data and satellite imagery. 4. Jerdon's coursers were most likely to occur where the density of large (>2 m tall) bushes was in the range 300-700 ha⁻¹ and where the density of smaller bushes was less than 1000 ha⁻¹. This habitat was detectable using satellite imagery. 5. Synthesis and applications. The occurrence of Jerdon's courser is strongly correlated with the density of bushes and trees, and is in turn affected by grazing with domestic livestock, woodcutting and mechanical clearance of bushes to create pasture, orchards and farmland. It is likely that there is an optimal level of grazing and woodcutting that would maintain or create suitable conditions for the species.
Knowledge of the species' distribution is incomplete and there is considerable pressure from human use of apparently suitable habitats. Hence, distribution mapping is a high conservation priority. A two-step procedure is proposed, involving the use of ground surveys of bush density to calibrate satellite image-based mapping of potential habitat. These maps could then be used to select priority areas for Jerdon's courser surveys. The use of tracking strips to study habitat selection and distribution has potential in studies of other scarce and secretive species.

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we address issues in segmentation of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by airborne laser scanner, contain 2.5-dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field) from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first- and second-order statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, the Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.
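The pipeline in this abstract (Gabor filtering of a grey-level height image, then first- and second-order statistics per window) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' method: the kernel parameters, the non-overlapping windows, the use of mean and variance as the two statistics, and the toy height map are all hypothetical.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2-D Gabor kernel (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the cosine carrier varies along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def filter_image(image, kernel):
    """Correlate image with kernel over the fully overlapping region."""
    k = len(kernel)
    out = []
    for i in range(len(image) - k + 1):
        out_row = []
        for j in range(len(image[0]) - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    acc += image[i + u][j + v] * kernel[u][v]
            out_row.append(acc)
        out.append(out_row)
    return out


def window_stats(response, win):
    """(mean, variance) of the response in non-overlapping win x win windows."""
    stats = []
    for i in range(0, len(response) - win + 1, win):
        stats_row = []
        for j in range(0, len(response[0]) - win + 1, win):
            values = [response[i + u][j + v]
                      for u in range(win) for v in range(win)]
            mean = sum(values) / len(values)
            var = sum((x - mean) ** 2 for x in values) / len(values)
            stats_row.append((mean, var))
        stats.append(stats_row)
    return stats


# Toy 12 x 16 height map: flat on the left, column-wise ripples on the right.
image = [[1.0 if c < 8 else (4.0 if c % 2 else 0.0) for c in range(16)]
         for _ in range(12)]
kernel = gabor_kernel(size=5, wavelength=2.0, theta=0.0, sigma=2.0)
stats = window_stats(filter_image(image, kernel), win=4)
flat_var, rough_var = stats[0][0][1], stats[0][2][1]
print(flat_var, rough_var)
```

In this toy case the window over the flat region has near-zero response variance, while the window over the rippled region has a large one, so a threshold on the variance would separate the two. A real system would use a bank of orientations and scales.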

Relevance:

100.00%

Publisher:

Abstract:

The goal was to quantitatively estimate and compare the fidelity of images acquired with a digital imaging system (ADAR 5500) and generated through scanning of color infrared aerial photographs (SCIRAP) using image-based metrics. Images were collected nearly simultaneously in two repetitive flights to generate multi-temporal datasets. Spatial fidelity of ADAR was lower than that of SCIRAP images. Radiometric noise was higher for SCIRAP than for ADAR images, even though noise from misregistration effects was lower. These results suggest that with careful control of film scanning, the overall fidelity of SCIRAP imagery can be comparable to that of digital multispectral camera data. Therefore, SCIRAP images can likely be used in conjunction with digital metric camera imagery in long-term landcover change analyses.

Relevance:

100.00%

Publisher:

Abstract:

Periocular recognition has recently become an active topic in biometrics. Typically it uses 2D image data of the periocular region. This paper is the first description of combining 3D shape structure with 2D texture. A simple and effective technique using iterative closest point (ICP) was applied for 3D periocular region matching. It proved its strength for relatively unconstrained eye region capture, and does not require any training. Local binary patterns (LBP) were applied for 2D image-based periocular matching. The two modalities were combined at the score level. This approach was evaluated using the Bosphorus 3D face database, which contains large variations in facial expressions, head poses and occlusions. The rank-1 accuracy achieved from the 3D data (80%) was better than that for 2D (58%), and the best accuracy (83%) was achieved by fusing the two types of data. This suggests that significant improvements to periocular recognition systems could be achieved using the 3D structure information that is now available from small and inexpensive sensors.
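The abstract says the 3D (ICP) and 2D (LBP) matchers were combined at the score level but does not state the fusion rule. The sketch below assumes one common choice, min-max normalisation followed by a weighted sum. The function names, the weight, and the scores are hypothetical and are not taken from the paper.

```python
def min_max_normalise(scores):
    """Map scores linearly onto [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(sim_3d, sim_2d, weight_3d=0.5):
    """Weighted sum of normalised similarity scores, one per gallery entry."""
    n3 = min_max_normalise(sim_3d)
    n2 = min_max_normalise(sim_2d)
    return [weight_3d * a + (1.0 - weight_3d) * b for a, b in zip(n3, n2)]


def rank1_match(fused):
    """Index of the gallery entry with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for one probe against three gallery subjects.
icp_distances = [0.2, 0.9, 0.5]            # lower means a better 3D match
sim_3d = [-d for d in icp_distances]       # negate so that higher is better
sim_2d = [0.4, 0.7, 0.6]                   # LBP histogram similarity

fused = fuse_scores(sim_3d, sim_2d)
best = rank1_match(fused)
print(fused, best)
```

With these made-up numbers the 3D matcher alone prefers subject 0 and the 2D matcher alone prefers subject 1, while the fused score prefers subject 2, which shows how fusion can change the rank-1 decision.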

Relevance:

60.00%

Publisher:

Abstract:

Scene classification based on latent Dirichlet allocation (LDA) is a more general modeling method known as a bag of visual words, in which the construction of a visual vocabulary is a crucial quantization process to ensure success of the classification. A framework is developed using the following new aspects: Gaussian mixture clustering for the quantization process, the use of an integrated visual vocabulary (IVV), which is built as the union of all centroids obtained from the separate quantization process of each class, and the use of several features, including edge orientation histogram, CIELab color moments, and gray-level co-occurrence matrix (GLCM). The experiments are conducted on IKONOS images with six semantic classes (tree, grassland, residential, commercial/industrial, road, and water). The results show that the use of an IVV increases the overall accuracy (OA) by 11 to 12% and 6% when it is implemented on the selected and all features, respectively. The selected features of CIELab color moments and GLCM provide a better OA than the implementation over CIELab color moments or GLCM individually. The latter increases the OA by only ∼2 to 3%. Moreover, the results show that the OA of LDA outperforms the OA of C4.5 and naive Bayes tree by ∼20%. © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE) [DOI: 10.1117/1.JRS.8.083690]

Relevance:

50.00%

Publisher:

Abstract:

Event-related desynchronization (ERD) of the electroencephalogram (EEG) from the motor cortex is associated with execution, observation, and mental imagery of motor tasks. Generation of ERD by motor imagery (MI) has been widely used for brain-computer interfaces (BCIs) linked to neuroprosthetics and other motor assistance devices. Control of MI-based BCIs can be acquired by neurofeedback training to reliably induce MI-associated ERD. To develop more effective training conditions, we investigated the effect of static and dynamic visual representations of target movements (a picture of forearms or a video clip of hand grasping movements) during the BCI training. After 4 consecutive training days, the group that performed MI while viewing the video showed significant improvement in generating MI-associated ERD compared with the group that viewed the static image. This result suggests that passively observing the target movement during MI would improve the associated mental imagery and enhance MI-based BCI skills.

Relevance:

50.00%

Publisher:

Abstract:

Interference from spatially adjacent non-target stimuli evokes ERPs during non-target sub-trials and leads to false positives. This phenomenon is commonly seen in visual attention-based BCIs and affects the performance of the BCI system. Although users or subjects tried to focus on the target stimulus, they still could not help being affected by conspicuous changes of the stimuli (flashes or presenting images) which were adjacent to the target stimulus. In view of this, the aim of this study is to reduce the adjacent interference using a new stimulus presentation pattern based on facial expression changes. Positive facial expressions can be changed to negative facial expressions by minor changes to the original facial image. Although the changes are minor, the contrast will be big enough to evoke strong ERPs. In this paper, two different conditions (Pattern_1, Pattern_2) were compared across objective measures such as classification accuracy and information transfer rate as well as subjective measures. Pattern_1 was a “flash-only” pattern and Pattern_2 was a facial expression change of a dummy face. In the facial expression change patterns, the background is a positive facial expression and the stimulus is a negative facial expression. The results showed that the interference from adjacent stimuli could be reduced significantly (p<0.05) by using the facial expression change patterns. The online performance of the BCI system using the facial expression change patterns was significantly better than that using the “flash-only” patterns in terms of classification accuracy (p<0.01), bit rate (p<0.01), and practical bit rate (p<0.01). Subjects reported that the annoyance and fatigue could be significantly decreased (p<0.05) using the new stimulus presentation pattern presented in this paper.
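The abstract reports information transfer rate and bit rate without giving a formula. A widely used definition in BCI work is the Wolpaw bit rate, B = log2(N) + P·log2(P) + (1 − P)·log2((1 − P)/(N − 1)) bits per selection, for N targets and accuracy P. The sketch below assumes that definition. Whether the paper uses exactly this formula is not stated in the abstract, and the example numbers are invented.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Wolpaw bit rate per selection for n_targets equally likely targets."""
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_targets)
    # Skip terms whose limit is zero to avoid evaluating log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def bits_per_minute(n_targets, accuracy, selections_per_minute):
    """Information transfer rate in bits per minute."""
    return bits_per_selection(n_targets, accuracy) * selections_per_minute


# Hypothetical speller: 36 targets, 90% accuracy, 5 selections per minute.
itr = bits_per_minute(36, 0.90, 5.0)
print(round(itr, 2))
```

Under this definition a perfectly accurate 4-target system carries 2 bits per selection, and a 2-target system at chance level (50%) carries none.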

Relevance:

40.00%

Publisher:

Abstract:

This workshop paper reports recent developments to a vision system for traffic interpretation which relies extensively on the use of geometrical and scene context. Firstly, a new approach to pose refinement is reported, based on forces derived from prominent image derivatives found close to an initial hypothesis. Secondly, a parameterised vehicle model is reported, able to represent different vehicle classes. This general vehicle model has been fitted to sample data, and subjected to a Principal Component Analysis to create a deformable model of common car types having 6 parameters. We show that the new pose recovery technique is also able to operate on the PCA model, to allow the structure of an initial vehicle hypothesis to be adapted to fit the prevailing context. We report initial experiments with the model, which demonstrate significant improvements to pose recovery.

Relevance:

40.00%

Publisher:

Abstract:

A driver controls a car by turning the steering wheel or by pressing on the accelerator or the brake. These actions are modelled by Gaussian processes, leading to a stochastic model for the motion of the car. The stochastic model is the basis of a new filter for tracking and predicting the motion of the car, using measurements obtained by fitting a rigid 3D model to a monocular sequence of video images. Experiments show that the filter easily outperforms traditional filters.

Relevance:

40.00%

Publisher:

Abstract:

In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3-D model of the scene based on known baseline information from interocular separation or proprioception as the observer walks. An alternative is that observers use view-based methods to guide their actions and to represent the spatial layout of the scene. In this case, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk. We describe the way in which the eye movement strategy of animals simplifies motion processing if their goal is to move towards a desired image and discuss dorsal and ventral stream processing of moving images in that context. Although many questions about view-based approaches to scene representation remain unanswered, the solutions are likely to be highly relevant to understanding biological 3-D vision.