968 results for image fusion
Abstract:
A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants
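As a rough illustration of the bispectral step described in this abstract, the sketch below forms the triple product X(f1) X(f2) X*(f1 + f2) for a 1D signal, sums it along radial lines f2 = a * f1, and keeps the phase of each sum. The function name `bispectral_phases`, the chosen slopes, and the nearest-bin sampling of f2 are assumptions made for this example, not details from the paper, and the Radon projection and rotation-invariance stages are omitted.

```python
import numpy as np

def bispectral_phases(signal, slopes=(0.25, 0.5, 0.75, 1.0)):
    """Phase of the deterministic bispectrum summed along radial lines.

    Hypothetical helper: f2 is sampled at the nearest DFT bin to a * f1.
    """
    x = np.asarray(signal, dtype=float)
    X = np.fft.fft(x)
    n = len(x)
    phases = []
    for a in slopes:
        # Keep f1 + f2 below the Nyquist bin.
        k1_max = int((n // 2 - 1) / (1.0 + a))
        k1 = np.arange(1, k1_max + 1)
        k2 = np.rint(a * k1).astype(int)
        # Triple product of Fourier coefficients (deterministic bispectrum).
        triple = X[k1] * X[k2] * np.conj(X[k1 + k2])
        phases.append(np.angle(triple.sum()))
    return np.array(phases)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
p_original = bispectral_phases(x)
p_shifted = bispectral_phases(np.roll(x, 7))   # circular translation
p_scaled = bispectral_phases(3.0 * x)          # positive amplitude gain
```

For a circular shift, the linear phase terms of the three coefficients cancel in the triple product, so `p_original` and `p_shifted` agree up to rounding error. The sketch does not test invariance to dilation of the signal.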
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
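The abstract does not say how the modality weighting is determined, so the following is only a generic sketch of weighted score fusion, not the authors' technique. It assumes each subsystem outputs one score per trial, fuses them linearly, and picks the weight on a small labelled development set by grid search over the sum of false acceptance and false rejection rates. All names, the threshold of 0, and the toy scores are hypothetical.

```python
import numpy as np

def fuse_scores(speech, lip, w):
    """Linear fusion: w weights the speech score, (1 - w) the lip score."""
    return w * np.asarray(speech, float) + (1.0 - w) * np.asarray(lip, float)

def error_rates(scores, is_client, threshold=0.0):
    """Return (false acceptance rate, false rejection rate) at a threshold."""
    scores = np.asarray(scores, float)
    is_client = np.asarray(is_client, bool)
    accept = scores >= threshold
    far = np.mean(accept[~is_client])    # impostors wrongly accepted
    frr = np.mean(~accept[is_client])    # clients wrongly rejected
    return far, frr

def pick_weight(speech, lip, is_client, grid=np.linspace(0.0, 1.0, 11)):
    """Grid search for the weight that minimises FAR + FRR (assumed criterion)."""
    costs = [sum(error_rates(fuse_scores(speech, lip, w), is_client)) for w in grid]
    return float(grid[int(np.argmin(costs))])

# Toy development set: speech scores are uninformative (as under heavy noise),
# lip scores separate clients from impostors.
is_client = [True, True, False, False]
speech = [1.0, -1.0, 1.0, -1.0]
lip = [1.0, 1.0, -1.0, -1.0]
best_w = pick_weight(speech, lip, is_client)
```

On this toy data the search puts all the weight on the lip stream, which mirrors the abstract's point that lip information helps most when the audio is degraded.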
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. Temporal lip information has been used for speaker identification previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
Investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
Abstract:
Texture analysis and textural cues have been applied for image classification, segmentation and pattern recognition. Dominant texture descriptors include directionality, coarseness and line-likeness. In this dissertation a class of textures known as particulate textures, which are predominantly coarse or blob-like, is defined. The set of features that characterise particulate textures differs from those that characterise classical textures. These features are micro-texture, macro-texture, size, shape and compaction. Classical texture analysis techniques do not adequately capture particulate texture features. This gap is identified and new methods for analysing particulate textures are proposed. The levels of complexity in particulate textures are also presented, ranging from the simplest images, where blob-like particles are easily isolated from their background, to the more complex images, where the particles and the background are not easily separable or the particles are occluded. Simple particulate images can be analysed for particle shapes and sizes. Complex particulate texture images, on the other hand, often permit only the estimation of particle dimensions. Real-life applications of particulate textures are reviewed, including applications to sedimentology, granulometry and road surface texture analysis. A new framework for computation of particulate shape is proposed. A granulometric approach for particle size estimation based on edge detection is developed, which can be adapted to the gray level of the images by varying its parameters. This study binds visual texture analysis and road surface macrotexture in a theoretical framework, thus making it possible to apply monocular imaging techniques to road surface texture analysis.
Results from the application of the developed algorithm to road surface macrotexture are compared with results based on Fourier spectra, the autocorrelation function and wavelet decomposition, indicating the superior performance of the proposed technique. The influence of image acquisition conditions such as illumination and camera angle on the results was systematically analysed. Experimental data was collected from over 5 km of road in Brisbane and the estimated coarseness along the road was compared with laser profilometer measurements. A coefficient of determination (R²) exceeding 0.9 was obtained when correlating the proposed imaging technique with the state-of-the-art Sensor Measured Texture Depth (SMTD) obtained using laser profilometers.
Abstract:
Trees, shrubs and other vegetation are of continued importance to the environment and our daily life. They provide shade around our roads and houses, offer a habitat for birds and wildlife, and absorb air pollutants. However, vegetation touching power lines is a risk to public safety and the environment, and one of the main causes of power supply problems. Vegetation management, which includes tree trimming and vegetation control, is a significant cost component of the maintenance of electrical infrastructure. For example, Ergon Energy, the Australian energy distributor with the largest geographic footprint, currently spends over $80 million a year inspecting and managing vegetation that encroaches on power line assets. Currently, most vegetation management programs for distribution systems are calendar-based ground patrols. However, calendar-based inspection by linesmen is labour-intensive, time-consuming and expensive. It also results in some zones being trimmed more frequently than needed and others not cut often enough. Moreover, it’s seldom practicable to measure all the plants around power line corridors by field methods. Remote sensing data captured from airborne sensors has great potential in assisting vegetation management in power line corridors. This thesis presented a comprehensive study on using spiking neural networks in a specific image analysis application: power line corridor monitoring. Theoretically, the thesis focuses on a biologically inspired spiking cortical model: the pulse coupled neural network (PCNN). The original PCNN model was simplified in order to better analyze the pulse dynamics and control the performance. Some new and effective algorithms were developed based on the proposed spiking cortical model for object detection, image segmentation and invariant feature extraction. The developed algorithms were evaluated in a number of experiments using real image data collected from our flight trials.
The experimental results demonstrated the effectiveness and advantages of spiking neural networks in image processing tasks. Operationally, the knowledge gained from this research project offers a good reference to our industry partner (i.e. Ergon Energy) and other energy utilities that want to improve their vegetation management activities. The novel approaches described in this thesis showed the potential of using cutting-edge sensor technologies and intelligent computing techniques to improve power line corridor monitoring. The lessons learnt from this project are also expected to increase the confidence of energy companies to move from traditional vegetation management strategies to a more automated, accurate and cost-effective solution using aerial remote sensing techniques.
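The abstract does not give the equations of the simplified PCNN, so the sketch below uses a common textbook-style simplification as an assumption: the feeding input is the pixel intensity, the linking input is the sum of the 8 neighbours' previous pulses, and each neuron fires when its internal activity exceeds a decaying dynamic threshold. The function names and the parameter values (`beta`, `decay`, `v_theta`) are illustrative choices, not values from the thesis.

```python
import numpy as np

def neighbour_sum(y):
    """Sum of the 8-connected neighbours of each pixel (zero padding)."""
    rows, cols = y.shape
    padded = np.pad(y, 1)
    total = np.zeros_like(y, dtype=float)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            total += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return total

def simplified_pcnn(image, steps=10, beta=0.2, decay=0.7, v_theta=20.0):
    """Return the iteration at which each pixel first fires (-1 if never)."""
    s = np.asarray(image, dtype=float)
    y = np.zeros_like(s)                 # pulse output of the previous step
    theta = np.ones_like(s)              # dynamic threshold
    fire_time = np.full(s.shape, -1)
    for n in range(steps):
        link = neighbour_sum(y)          # linking input from neighbour pulses
        u = s * (1.0 + beta * link)      # internal activity
        y = (u > theta).astype(float)    # pulse where activity beats threshold
        theta = decay * theta + v_theta * y   # decay, then boost fired neurons
        fire_time[(y > 0) & (fire_time < 0)] = n
    return fire_time

# Toy image: bright left half, dark right half.
img = np.full((4, 4), 0.2)
img[:, :2] = 0.9
first_fire = simplified_pcnn(img)
```

With these parameters the bright half fires at iteration 1 and the dark half at iteration 5, so grouping pixels by first firing time gives a crude segmentation of the two regions.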
Abstract:
This study investigated the Kinaesthetic Fusion Effect (KFE) first described by Craske and Kenny in 1981. The current study did not replicate these findings following a change in the reporting method used by participants. Participants did not perceive any reduction in the sagittal separation of a button pressed by the index finger of one arm and a probe touching the other, following repeated exposure to the tactile stimuli present on both unseen arms. This study’s failure to replicate the widely-cited KFE as described by Craske et al. (1984) suggests that it may be contingent on several aspects of visual information, especially the availability of a specific visual reference, the role of instructions regarding gaze direction, and the potential use of a line of sight strategy when referring felt positions to an interposed surface. In addition, a foreshortening effect was found; this may result from a line-of-sight judgment and represent a feature of the reporting method used. Finally, this research will benefit future studies that require participants to report the perceived locations of the unseen limbs.
Abstract:
We have developed a digital image registration program for an MC 68000 based fundus image processing system (FIPS). Not only is FIPS capable of executing typical image processing algorithms in the spatial as well as the Fourier domain, but the execution time for many operations has also been reduced considerably by using a hybrid of "C", Fortran and MC 68000 assembly languages.
Abstract:
This paper describes the feasibility of the application of an Imputer in a multiple choice answer sheet marking system based on image processing techniques.
Abstract:
This study investigated the Kinaesthetic Fusion Effect (KFE) first described by Craske and Kenny in 1981. In Experiment 1, the study did not replicate these findings following a change in the reporting method used by participants. Participants did not perceive any reduction in the sagittal separation of a button pressed by the index finger of one arm and a probe touching the other, following repeated exposure to the tactile stimuli present on both unseen arms. This study’s failure to replicate the widely cited KFE as described by Craske et al. (1984) suggests that it may be contingent on several aspects of visual information, especially the availability of a specific visual reference, the role of instructions regarding gaze direction, and the potential use of a line-of-sight strategy when referring felt positions to an interposed surface. In addition, a foreshortening effect was found; this may result from a line-of-sight judgment and represent a feature of the reporting method used. Finally, this research will benefit future studies that require participants to report the perceived locations of the unseen limbs. Experiment 2 investigated the KFE when the visual reference was removed and participants reported the touched position while blindfolded. A number of interesting outcomes arose from this change and may help to clarify the phenomenon.