673 resultados para Visual mosaic systems
Resumo:
This paper demonstrates some interesting connections between the hitherto disparate fields of mobile robot navigation and image-based visual servoing. A planar formulation of the well-known image-based visual servoing method leads to a bearing-only navigation system that requires no explicit localization and directly yields desired velocity. The well known benefits of image-based visual servoing such as robustness apply also to the planar case. Simulation results are presented.
Resumo:
The application of high-speed machine vision for close-loop position control, or visual servoing, of a robot manipulator. It provides a comprehensive coverage of all aspects of the visual servoing problem: robotics, vision, control, technology and implementation issues. While much of the discussion is quite general the experimental work described is based on the use of a high-speed binary vision system with a monocular "eye-in-hand" camera.
Resumo:
This present paper reviews the reliability and validity of visual analogue scales (VAS) in terms of (1) their ability to predict feeding behaviour, (2) their sensitivity to experimental manipulations, and (3) their reproducibility. VAS correlate with, but do not reliably predict, energy intake to the extent that they could be used as a proxy of energy intake. They do predict meal initiation in subjects eating their normal diets in their normal environment. Under laboratory conditions, subjectively rated motivation to eat using VAS is sensitive to experimental manipulations and has been found to be reproducible in relation to those experimental regimens. Other work has found them not to be reproducible in relation to repeated protocols. On balance, it would appear, in as much as it is possible to quantify, that VAS exhibit a good degree of within-subject reliability and validity in that they predict with reasonable certainty, meal initiation and amount eaten, and are sensitive to experimental manipulations. This reliability and validity appears more pronounced under the controlled (but more arti®cial) conditions of the laboratory where the signal : noise ratio in experiments appears to be elevated relative to real life. It appears that VAS are best used in within-subject, repeated-measures designs where the effect of different treatments can be compared under similar circumstances. They are best used in conjunction with other measures (e.g. feeding behaviour, changes in plasma metabolites) rather than as proxies for these variables. New hand-held electronic appetite rating systems (EARS) have been developed to increase reliability of data capture and decrease investigator workload. Recent studies have compared these with traditional pen and paper (P&P) VAS. The EARS have been found to be sensitive to experimental manipulations and reproducible relative to P&P. However, subjects appear to exhibit a signi®cantly more constrained use of the scale when using the EARS relative to the P&P. For this reason it is recommended that the two techniques are not used interchangeably
Resumo:
The cascading appearance-based (CAB) feature extraction technique has established itself as the state-of-the-art in extracting dynamic visual speech features for speech recognition. In this paper, we will focus on investigating the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade we will demonstrate that the same steps taken to reduce static speaker and environmental information for the visual speech recognition application also provide similar improvements for visual speaker recognition. A further study is conducted comparing synchronous HMM (SHMM) based fusion of CAB visual features and traditional perceptual linear predictive (PLP) acoustic features to show that higher complexity inherit in the SHMM approach does not appear to provide any improvement in the final audio-visual speaker verification system over simpler utterance level score fusion.
Resumo:
This paper proposes a generic decoupled imagebased control scheme for cameras obeying the unified projection model. The scheme is based on the spherical projection model. Invariants to rotational motion are computed from this projection and used to control the translational degrees of freedom. Importantly we form invariants which decrease the sensitivity of the interaction matrix to object depth variation. Finally, the proposed results are validated with experiments using a classical perspective camera as well as a fisheye camera mounted on a 6-DOF robotic platform.
Resumo:
We present a novel method for integrating GPS position estimates with position and attitude estimates derived from visual odometry using a scheme similar to a classic loosely-coupled GPS/INS integration. Under such an arrangement, we derive the error dynamics of the system and develop a Kalman Filter for estimating the errors in position and attitude. Using a control-based approach to observability, we show that the errors in both position and attitude (including yaw) are fully observable when there is a component of acceleration perpendicular to the velocity vector in the navigation frame. Numerical simulations are performed to confirm the observability analysis.
Resumo:
Interacting with technology within a vehicle environment using a voice interface can greatly reduce the effects of driver distraction. Most current approaches to this problem only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to circumvent this is to use the visual modality in addition. However, capturing, storing and distributing audio-visual data in a vehicle environment is very costly and difficult. One current dataset available for such research is the AVICAR [1] database. Unfortunately this database is largely unusable due to timing mismatch between the two streams and in addition, no protocol is available. We have overcome this problem by re-synchronising the streams on the phone-number portion of the dataset and established a protocol for further research. This paper presents the first audio-visual results on this dataset for speaker-independent speech recognition. We hope this will serve as a catalyst for future research in this area.
Resumo:
It is possible for the visual attention characteristics of a person to be exploited as a biometric for authentication or identification of individual viewers. The visual attention characteristics of a person can be easily monitored by tracking the gaze of a viewer during the presentation of a known or unknown visual scene. The positions and sequences of gaze locations during viewing may be determined by overt (conscious) or covert (sub-conscious) viewing behaviour. This paper presents a method to authenticate individuals using their covert viewing behaviour, thus yielding a unique behavioural biometric. A method to quantify the spatial and temporal patterns established by the viewer for their covert behaviour is proposed utilsing a principal component analysis technique called `eigenGaze'. Experimental results suggest that it is possible to capture the unique visual attention characteristics of a person to provide a simple behavioural biometric.
Resumo:
Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure product quality and reliability. This paper proposes two inspection modules for an automatic solder joint classification system. The “front-end” inspection system includes illumination normalisation, localisation and segmentation. The “back-end” inspection involves the classification of solder joints using the Log Gabor filter and classifier fusion. Five different levels of solder quality with respect to the amount of solder paste have been defined. The Log Gabor filter has been demonstrated to achieve high recognition rates and is resistant to misalignment. This proposed system does not need any special illumination system, and the images are acquired by an ordinary digital camera. This system could contribute to the development of automated non-contact, non-destructive and low cost solder joint quality inspection systems.
Resumo:
In this paper, I would like to outline the approach we have taken to mapping and assessing integrity systems and how this has led us to see integrity systems in a new light. Indeed, it has led us to a new visual metaphor for integrity systems – a bird’s nest rather than a Greek temple. This was the result of a pair of major research projects completed in partnership with Transparency International (TI). One worked on refining and extending the measurement of corruption. This, the second, looked at what was then the emerging institutional means for reducing corruption – ‘national integrity systems’
Resumo:
Visual noise insensitivity is important to audio visual speech recognition (AVSR). Visual noise can take on a number of forms such as varying frame rate, occlusion, lighting or speaker variabilities. The use of a high dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Preliminary results are presented demonstrating performance above the catastrophic fusion boundary for our confidence measure irrespective of the type of visual noise presented to it. Our experiments were restricted to small vocabulary applications.
Resumo:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction, or multi-channel beamforming is not known. This is an important question to be answered especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset and results show that synchronous HMM-based audio-visual fusion can outperform traditional single as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.