946 resultados para visual method


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acoustically, car cabins are extremely noisy and as a consequence, existing audio-only speech recognition systems, for voice-based control of vehicle functions such as the GPS based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (eg: lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Continuous research in AVASR field has been ongoing for the past twenty-five years with notable progress being made. However, the practical deployment of AVASR systems for use in a variety of real-world applications has not yet emerged. The main reason is due to most research to date neglecting to address variabilities in the visual domain such as illumination and viewpoint in the design of the visual front-end of the AVSR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is publicly available in-car database and we show that the use of visual speech conjunction with the audio modality is a better approach to improve the robustness and effectiveness of voice-only recognition systems in car cabin environments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences as the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A`) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A` of 96.11%. This result suggests that small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), and analyse each stage of the cascade and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The previous investigations have shown that the modal strain energy correlation method, MSEC, could successfully identify the damage of truss bridge structures. However, it has to incorporate the sensitivity matrix to estimate damage and is not reliable in certain damage detection cases. This paper presents an improved MSEC method where the prediction of modal strain energy change vector is differently obtained by running the eigensolutions on-line in optimisation iterations. The particular trail damage treatment group maximising the fitness function close to unity is identified as the detected damage location. This improvement is then compared with the original MSEC method along with other typical correlation-based methods on the finite element model of a simple truss bridge. The contributions to damage detection accuracy of each considered mode is also weighed and discussed. The iterative searching process is operated by using genetic algorithm. The results demonstrate that the improved MSEC method suffices the demand in detecting the damage of truss bridge structures, even when noised measurement is considered.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Aim. The paper is a report of a study to demonstrate how the use of schematics can provide procedural clarity and promote rigour in the conduct of case study research. Background. Case study research is a methodologically flexible approach to research design that focuses on a particular case – whether an individual, a collective or a phenomenon of interest. It is known as the 'study of the particular' for its thorough investigation of particular, real-life situations and is gaining increased attention in nursing and social research. However, the methodological flexibility it offers can leave the novice researcher uncertain of suitable procedural steps required to ensure methodological rigour. Method. This article provides a real example of a case study research design that utilizes schematic representation drawn from a doctoral study of the integration of health promotion principles and practices into a palliative care organization. Discussion. The issues discussed are: (1) the definition and application of case study research design; (2) the application of schematics in research; (3) the procedural steps and their contribution to the maintenance of rigour; and (4) the benefits and risks of schematics in case study research. Conclusion. The inclusion of visual representations of design with accompanying explanatory text is recommended in reporting case study research methods.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Masks are widely used in different industries, for example, traditional metal industry, hospitals or semiconductor industry. Quality is a critical issue in mask industry as it is related to public health and safety. Traditional quality practices for manufacturing process have some limitations in implementing them in mask industries. This paper aims to investigate the suitability of Six Sigma quality control method for the manufacturing process in the mask industry to provide high quality products, enhancing the process capacity, reducing the defects and the returned goods arising in a selected mask manufacturing company. This paper suggests that modifications necessary in Six Sigma method for effective implementation in mask industry.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a formulation of image-based visual servoing (IBVS) for a spherical camera where coordinates are parameterized in terms of colatitude and longitude: IBVSSph. The image Jacobian is derived and simulation results are presented for canonical rotational, translational as well as general motion. Problems with large rotations that affect the planar perspective form of IBVS are not present on the sphere, whereas the desirable robustness properties of IBVS are shown to be retained. We also describe a structure from motion (SfM) system based on camera-centric spherical coordinates and show how a recursive estimator can be used to recover structure. The spherical formulations for IBVS and SfM are particularly suitable for platforms, such as aerial and underwater robots, that move in SE(3).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Wide-angle images exhibit significant distortion for which existing scale-space detectors such as the scale-invariant feature transform (SIFT) are inappropriate. The required scale-space images for feature detection are correctly obtained through the convolution of the image, mapped to the sphere, with the spherical Gaussian. A new visual key-point detector, based on this principle, is developed and several computational approaches to the convolution are investigated in both the spatial and frequency domain. In particular, a close approximation is developed that has comparable computation time to conventional SIFT but with improved matching performance. Results are presented for monocular wide-angle outdoor image sequences obtained using fisheye and equiangular catadioptric cameras. We evaluate the overall matching performance (recall versus 1-precision) of these methods compared to conventional SIFT. We also demonstrate the use of the technique for variable frame-rate visual odometry and its application to place recognition.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper a generic decoupled imaged-based control scheme for calibrated cameras obeying the unified projection model is proposed. The proposed decoupled scheme is based on the surface of object projections onto the unit sphere. Such features are invariant to rotational motions. This allows the control of translational motion independently from the rotational motion. Finally, the proposed results are validated with experiments using a classical perspective camera as well as a fisheye camera mounted on a 6 dofs robot platform.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes a novel experiment in which two very different methods of underwater robot localization are compared. The first method is based on a geometric approach in which a mobile node moves within a field of static nodes, and all nodes are capable of estimating the range to their neighbours acoustically. The second method uses visual odometry, from stereo cameras, by integrating scaled optical flow. The fundamental algorithmic principles of each localization technique is described. We also present experimental results comparing acoustic localization with GPS for surface operation, and a comparison of acoustic and visual methods for underwater operation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a tutorial introduction to two important senses for biological and robotic systems — inertial and visual perception. We discuss the fundamentals of these two sensing modalities from a biological and an engineering perspective. Digital camera chips and micro-machined accelerometers and gyroscopes are now commodities, and when combined with today's available computing can provide robust estimates of self-motion as well 3D scene structure, without external infrastructure. We discuss the complementarity of these sensors, describe some fundamental approaches to fusing their outputs and survey the field.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper considers the question of designing a fully image-based visual servo control for a class of dynamic systems. The work is motivated by the ongoing development of image-based visual servo control of small aerial robotic vehicles. The kinematics and dynamics of a rigid-body dynamical system (such as a vehicle airframe) maneuvering over a flat target plane with observable features are expressed in terms of an unnormalized spherical centroid and an optic flow measurement. The image-plane dynamics with respect to force input are dependent on the height of the camera above the target plane. This dependence is compensated by introducing virtual height dynamics and adaptive estimation in the proposed control. A fully nonlinear adaptive control design is provided that ensures asymptotic stability of the closed-loop system for all feasible initial conditions. The choice of control gains is based on an analysis of the asymptotic dynamics of the system. Results from a realistic simulation are presented that demonstrate the performance of the closed-loop system. To the author's knowledge, this paper documents the first time that an image-based visual servo control has been proposed for a dynamic system using vision measurement for both position and velocity.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Performing reliable localisation and navigation within highly unstructured underwater coral reef environments is a difficult task at the best of times. Typical research and commercial underwater vehicles use expensive acoustic positioning and sonar systems which require significant external infrastructure to operate effectively. This paper is focused on the development of a robust vision-based motion estimation technique using low-cost sensors for performing real-time autonomous and untethered environmental monitoring tasks in the Great Barrier Reef without the use of acoustic positioning. The technique is experimentally shown to provide accurate odometry and terrain profile information suitable for input into the vehicle controller to perform a range of environmental monitoring tasks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Performing reliable localisation and navigation within highly unstructured underwater coral reef environments is a difficult task at the best of times. Typical research and commercial underwater vehicles use expensive acoustic positioning and sonar systems which require significant external infrastructure to operate effectively. This paper is focused on the development of a robust vision-based motion estimation technique using low-cost sensors for performing real-time autonomous and untethered environmental monitoring tasks in the Great Barrier Reef without the use of acoustic positioning. The technique is experimentally shown to provide accurate odometry and terrain profile information suitable for input into the vehicle controller to perform a range of environmental monitoring tasks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we outline the sensing system used for the visual pose control of our experimental car-like vehicle, the autonomous tractor. The sensing system consists of a magnetic compass, an omnidirectional camera and a low-resolution odometry system. In this work, information from these sensors is fused using complementary filters. Complementary filters provide a means of fusing information from sensors with different characteristics in order to produce a more reliable estimate of the desired variable. Here, the range and bearing of landmarks observed by the vision system are fused with odometry information and a vehicle model, providing a more reliable estimate of these states. We also present a method of combining a compass sensor with odometry and a vehicle model to improve the heading estimate.