985 results for visual selection
Abstract:
Recovering position from sensor information is an important problem in mobile robotics, known as localisation. Localisation requires a map or some other description of the environment to provide the robot with a context in which to interpret sensor data. The mobile robot system under discussion uses an artificial neural representation of position. Building a geometrical map of the environment with a single camera and artificial neural networks is difficult. Instead, it would be simpler to learn position as a function of the visual input. Usually when learning images, an intermediate representation is employed. An appropriate starting point for biologically plausible image representation is the complex cells of the visual cortex, which have invariance properties that appear useful for localisation. The effectiveness for localisation of two different complex cell models is evaluated. Finally, the ability of a simple neural network with single-shot learning to recognise these representations and localise a robot is examined.
Abstract:
RatSLAM is a vision-based SLAM system based on extended models of the rodent hippocampus. RatSLAM creates environment representations that can be processed by the experience mapping algorithm to produce maps suitable for goal recall. The experience mapping algorithm also allows RatSLAM to map environments many times larger than could be achieved with a one-to-one correspondence between the map and environment, by reusing the RatSLAM maps to represent multiple sections of the environment. This paper describes experiments investigating the effects of the environment-representation size ratio and visual ambiguity on mapping and goal navigation performance. The experiments demonstrate that system performance is weakly dependent on either parameter in isolation, but strongly dependent on their joint values.
Abstract:
The Simultaneous Localisation And Mapping (SLAM) problem is one of the major challenges in mobile robotics. Probabilistic techniques using high-end range-finding devices are well established in the field, but recent work has investigated vision-only approaches. We present an alternative approach to the leading existing techniques, which extracts approximate rotational and translational velocity information from a vehicle-mounted consumer camera, without tracking landmarks. When coupled with an existing SLAM system, the vision module is able to map a 45 metre long indoor loop and a 1.6 km long outdoor road loop, without any parameter or system adjustment between tests. The work serves as a promising pilot study into ground-based vision-only SLAM, with minimal geometric interpretation of the environment.
Abstract:
This paper investigates the use of the FAB-MAP appearance-only SLAM algorithm as a method for performing visual data association for RatSLAM, a semi-metric full SLAM system. While both systems have shown the ability to map large (60-70 km) outdoor locations of approximately the same scale, for either larger areas or across longer time periods both algorithms encounter difficulties with false positive matches. By combining these algorithms using a mapping between appearance and pose space, both false positives and false negatives generated by FAB-MAP are significantly reduced during outdoor mapping using a forward-facing camera. The hybrid FAB-MAP-RatSLAM system developed demonstrates the potential for successful SLAM over long periods of time.
Abstract:
Acoustically, car cabins are extremely noisy and, as a consequence, audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy for circumventing this problem through audio visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that can accurately locate and track the driver's face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver's lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.
Abstract:
This paper presents a vision-based method of vehicle localisation that has been developed and tested on a large forklift-type robotic vehicle which operates in a mainly outdoor industrial setting. The localiser uses a sparse 3D edge map of the environment and a particle filter to estimate the pose of the vehicle. The vehicle operates in dynamic and non-uniform outdoor lighting conditions, an issue that is addressed by using knowledge of the scene to intelligently adjust the camera exposure and hence improve the quality of the information in the image. Results from the industrial vehicle are shown and compared to another laser-based localiser which acts as a ground truth. An improved likelihood metric, using per-edge calculation, is presented and has been shown to be 40% more accurate in estimating rotation. Visual localisation results from the vehicle driving an arbitrary 1.5 km path during a bright sunny period show an average position error of 0.44 m and rotation error of 0.62 deg.
Abstract:
This paper illustrates a method for finding useful visual landmarks for performing simultaneous localization and mapping (SLAM). The method is based loosely on biological principles, using layers of filtering and pooling to create learned templates that correspond to different views of the environment. Rather than using a set of landmarks and reporting range and bearing to the landmark, this system maps views to poses. The challenge is to produce a system that produces the same view for small changes in robot pose, but provides different views for larger changes in pose. The method has been developed to interface with the RatSLAM system, a biologically inspired method of SLAM. The paper describes the method of learning and recalling visual landmarks in detail, and shows the performance of the visual system in real robot tests.
Abstract:
Various piezoelectric polymers based on polyvinylidene fluoride (PVDF) are of interest for large aperture space-based telescopes. Dimensional adjustments of adaptive polymer films depend on charge deposition and require a detailed understanding of the piezoelectric material responses, which are expected to deteriorate owing to strong vacuum UV, γ-, X-ray, energetic particle and atomic oxygen exposure. We have investigated the degradation of PVDF and its copolymers under various stress environments detrimental to reliable operation in space. Initial radiation aging studies have shown complex material changes with lowered Curie temperatures and melting points, morphological transformations and significant crosslinking, but little influence on piezoelectric d33 constants. Complex aging processes have also been observed in accelerated temperature environments inducing annealing phenomena and cyclic stresses. The results suggest that poling and chain orientation are negatively affected by radiation and temperature exposure. A framework for dealing with these complex material qualification issues and overall system survivability predictions in low Earth orbit conditions has been established. It allows for improved material selection, feedback for manufacturing and processing, and material optimization/stabilization strategies, and provides guidance on any alternative materials.
Abstract:
Acoustically, car cabins are extremely noisy and, as a consequence, existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Automatic Speech Recognition (AVASR). Research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVASR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is a publicly available in-car database, and we show that the use of visual speech in conjunction with the audio modality is a better approach to improve the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), analysing each stage of the cascade and quantifying the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database, and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.
Abstract:
Web service composition is an important problem in web service based systems. It is about how to build a new value-added web service using existing web services. A web service may have many implementations, all of which have the same functionality but may have different QoS values. Thus, a significant research problem in web service composition is how to select a web service implementation for each of the web services such that the composite web service gives the best overall performance. This is the so-called optimal web service selection problem. There may be mutual constraints between some web service implementations. Sometimes when an implementation is selected for one web service, a particular implementation for another web service must be selected. This is a so-called dependency constraint. Sometimes when an implementation for one web service is selected, a set of implementations for another web service must be excluded from the web service composition. This is a so-called conflict constraint. Thus, the optimal web service selection problem is a typical constrained combinatorial optimization problem from the computational point of view. This paper proposes a new hybrid genetic algorithm for the optimal web service selection problem. The hybrid genetic algorithm has been implemented and evaluated. The evaluation results have shown that the hybrid genetic algorithm outperforms two other existing genetic algorithms when the number of web services and the number of constraints are large.
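The dependency and conflict constraints described in this abstract can be illustrated with a small sketch. It is not the hybrid genetic algorithm from the paper: it uses exhaustive search, which is only practical for tiny instances. It also assumes a single additive QoS cost per implementation (lower is better). The names `is_feasible`, `total_cost` and `best_selection`, and the data layout, are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Choice = Tuple[str, str]                        # (service, implementation)
Dependency = Tuple[Choice, Choice]              # choosing the first forces the second
Conflict = Tuple[Choice, Tuple[str, Set[str]]]  # choosing the first excludes a set of
                                                # implementations of another service


def is_feasible(selection: Dict[str, str],
                dependencies: List[Dependency],
                conflicts: List[Conflict]) -> bool:
    """Return True if the selection violates no dependency or conflict constraint."""
    for (svc, impl), (dep_svc, dep_impl) in dependencies:
        if selection.get(svc) == impl and selection.get(dep_svc) != dep_impl:
            return False
    for (svc, impl), (other_svc, excluded) in conflicts:
        if selection.get(svc) == impl and selection.get(other_svc) in excluded:
            return False
    return True


def total_cost(selection: Dict[str, str],
               costs: Dict[str, Dict[str, float]]) -> float:
    """Assumed aggregate QoS: the sum of per-implementation costs."""
    return sum(costs[svc][impl] for svc, impl in selection.items())


def best_selection(costs: Dict[str, Dict[str, float]],
                   dependencies: List[Dependency],
                   conflicts: List[Conflict]) -> Optional[Dict[str, str]]:
    """Exhaustively search all selections; return the cheapest feasible one."""
    services = sorted(costs)
    best, best_cost = None, float("inf")
    for impls in product(*(sorted(costs[s]) for s in services)):
        selection = dict(zip(services, impls))
        if not is_feasible(selection, dependencies, conflicts):
            continue
        cost = total_cost(selection, costs)
        if cost < best_cost:
            best, best_cost = selection, cost
    return best


# Hypothetical two-service example.
costs = {"pay": {"p1": 1.0, "p2": 3.0}, "ship": {"s1": 2.0, "s2": 1.0}}
dependencies = [(("pay", "p1"), ("ship", "s1"))]   # p1 requires s1
conflicts = [(("pay", "p2"), ("ship", {"s2"}))]    # p2 excludes s2
print(best_selection(costs, dependencies, conflicts))
```

The unconstrained optimum here (`p1` with `s2`) is infeasible, so the search returns `p1` with `s1`. The number of candidate selections grows exponentially with the number of services, which is why a heuristic such as a genetic algorithm is attractive for large instances.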
Abstract:
This paper presents a formulation of image-based visual servoing (IBVS) for a spherical camera where coordinates are parameterized in terms of colatitude and longitude: IBVSSph. The image Jacobian is derived and simulation results are presented for canonical rotational, translational as well as general motion. Problems with large rotations that affect the planar perspective form of IBVS are not present on the sphere, whereas the desirable robustness properties of IBVS are shown to be retained. We also describe a structure from motion (SfM) system based on camera-centric spherical coordinates and show how a recursive estimator can be used to recover structure. The spherical formulations for IBVS and SfM are particularly suitable for platforms, such as aerial and underwater robots, that move in SE(3).
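As a rough illustration of the colatitude and longitude parameterization mentioned in this abstract, the sketch below maps a 3D point in the camera frame to spherical image coordinates and back to a unit vector. The axis convention (colatitude measured from the camera z-axis, longitude measured in the x-y plane from the x-axis) is an assumption, and the function names `to_sphere` and `from_sphere` are hypothetical. It does not reproduce the paper's image Jacobian.

```python
import math
from typing import Tuple


def to_sphere(x: float, y: float, z: float) -> Tuple[float, float]:
    """Map a camera-frame point to (colatitude, longitude) in radians.

    Assumed convention: colatitude in [0, pi] from the +z axis,
    longitude in [-pi, pi] from the +x axis.
    """
    r = math.hypot(x, y)
    if r == 0.0 and z == 0.0:
        raise ValueError("the camera origin has no direction")
    colatitude = math.atan2(r, z)
    longitude = math.atan2(y, x)
    return colatitude, longitude


def from_sphere(colatitude: float, longitude: float) -> Tuple[float, float, float]:
    """Return the unit vector corresponding to (colatitude, longitude)."""
    s = math.sin(colatitude)
    return (s * math.cos(longitude), s * math.sin(longitude), math.cos(colatitude))


# A point behind the camera (negative z) still has valid coordinates.
print(to_sphere(1.0, 0.0, -1.0))
```

Unlike a planar perspective projection, this mapping is defined for every direction, including points beside or behind the camera.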
Abstract:
Wide-angle images exhibit significant distortion for which existing scale-space detectors such as the scale-invariant feature transform (SIFT) are inappropriate. The required scale-space images for feature detection are correctly obtained through the convolution of the image, mapped to the sphere, with the spherical Gaussian. A new visual key-point detector, based on this principle, is developed and several computational approaches to the convolution are investigated in both the spatial and frequency domain. In particular, a close approximation is developed that has comparable computation time to conventional SIFT but with improved matching performance. Results are presented for monocular wide-angle outdoor image sequences obtained using fisheye and equiangular catadioptric cameras. We evaluate the overall matching performance (recall versus 1-precision) of these methods compared to conventional SIFT. We also demonstrate the use of the technique for variable frame-rate visual odometry and its application to place recognition.
Abstract:
In this paper, a generic decoupled image-based control scheme for calibrated cameras obeying the unified projection model is proposed. The proposed decoupled scheme is based on the surface of object projections onto the unit sphere. Such features are invariant to rotational motions. This allows the control of translational motion independently of the rotational motion. Finally, the proposed results are validated with experiments using a classical perspective camera as well as a fisheye camera mounted on a 6-DOF robot platform.
Abstract:
This paper demonstrates some interesting connections between the hitherto disparate fields of mobile robot navigation and image-based visual servoing. A planar formulation of the well-known image-based visual servoing method leads to a bearing-only navigation system that requires no explicit localization and directly yields desired velocity. The well-known benefits of image-based visual servoing, such as robustness, apply also to the planar case. Simulation results are presented.