871 results for Machine vision and image processing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This research investigated the prevalence of vision disorders in Queensland Indigenous primary school children, creating the first comprehensive visual profile of Indigenous children. Findings showed that reduced convergence ability and reduced visual information processing skills were more common in Indigenous than in non-Indigenous children. Reduced visual information processing skills were also associated with reduced reading outcomes in both groups of children. As early detection of visual disorders is important, the research also reviewed the delivery of screening programs across Queensland and proposed a model for improved coordination and service delivery of vision screening to Queensland school children.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.
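
The projection idea in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel on matrix logarithms (a log-Euclidean kernel), the construction of each hyperplane as a random combination of training points in the RKHS, and the names `spd_log`, `log_euclidean_kernel` and `project`.

```python
import numpy as np


def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_kernel(logs_a, logs_b, gamma):
    """Gaussian kernel on matrix logarithms (assumed kernel, for illustration)."""
    diff = logs_a[:, None] - logs_b[None, :]      # shape (na, nb, d, d)
    sq_dist = np.sum(diff ** 2, axis=(2, 3))      # squared Frobenius distances
    return np.exp(-gamma * sq_dist)


def project(spd_batch, train_logs, alphas, gamma):
    """Represent each SPD matrix as a vector of projection coefficients.

    Each row of `alphas` defines one hyperplane normal in the RKHS as a
    random linear combination of the training points (an assumption).
    """
    logs = np.stack([spd_log(m) for m in spd_batch])
    gram = log_euclidean_kernel(logs, train_logs, gamma)  # (n, n_train)
    return gram @ alphas.T                                # (n, n_planes)


rng = np.random.default_rng(0)
# Synthetic SPD matrices standing in for image descriptors.
train = [a @ a.T + 3.0 * np.eye(3) for a in rng.standard_normal((6, 3, 3))]
train_logs = np.stack([spd_log(m) for m in train])
alphas = rng.standard_normal((4, len(train)))             # 4 random hyperplanes
features = project(train, train_logs, alphas, gamma=0.1)
```

The resulting rows of `features` are ordinary vectors, so any unmodified Euclidean classifier could be trained on them.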

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a novel approach to video summarisation that makes use of a Bag-of-visual-Textures (BoT) approach. Two systems are proposed, one based solely on the BoT approach and another which exploits both colour information and BoT features. On 50 short-term videos from the Open Video Project we show that our BoT and fusion systems both achieve state-of-the-art performance, obtaining an average F-measure of 0.83 and 0.86 respectively, a relative improvement of 9% and 13% when compared to the previous state-of-the-art. When applied to a new underwater surveillance dataset containing 33 long-term videos, the proposed system reduces the amount of footage by a factor of 27, with only minor degradation in the information content. This order of magnitude reduction in video data represents significant savings in terms of time and potential labour cost when manually reviewing such footage.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g. illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g. a frontal face being compared to a non-frontal face). To address the above problem, we propose a novel approach to enhance nearest points based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to resemble the clusters in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on the Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis.
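
As background for the single-model baseline mentioned above, the distance between the affine hulls of two image sets can be sketched as a least-squares problem. This is a generic illustration of the nearest-points idea only; it is not the proposed ARC method, it ignores the convex-hull constraints, and the function name `affine_hull_distance` is hypothetical.

```python
import numpy as np


def affine_hull_distance(set_a, set_b):
    """Smallest distance between the affine hulls of two sample sets.

    Rows are samples. Each hull is written as mean + directions @ coeffs,
    and the gap between the two hulls is minimised by least squares.
    """
    set_a = np.asarray(set_a, dtype=float)
    set_b = np.asarray(set_b, dtype=float)
    mu_a, mu_b = set_a.mean(axis=0), set_b.mean(axis=0)
    dirs_a = (set_a - mu_a).T                  # (dim, n_a) spanning directions
    dirs_b = (set_b - mu_b).T                  # (dim, n_b)
    basis = np.hstack([dirs_a, -dirs_b])
    target = mu_b - mu_a
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return float(np.linalg.norm(basis @ coeffs - target))


# Two parallel lines in the plane, two units apart.
parallel = affine_hull_distance([[0, 0], [1, 0]], [[0, 2], [3, 2]])
# Two lines that cross, so their affine hulls touch.
crossing = affine_hull_distance([[0, 0], [1, 0]], [[0, -1], [0, 1]])
```

Because an affine hull extends without bound, sets with large intra-class variation can produce hull points far from any real sample, which is the kind of artificial feature the abstract refers to.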

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to resemble the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on the Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
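
To give a feel for a Frobenius-norm distance between subspaces, the sketch below uses the standard projection metric on the Grassmann manifold. This is an assumption chosen for illustration: the abstract's distance is defined through reconstruction error and may differ, and the helper names are hypothetical.

```python
import numpy as np


def orthonormal_basis(samples, dim):
    """Orthonormal basis (n_features, dim) of the span of the sample rows."""
    u, _, _ = np.linalg.svd(np.asarray(samples, dtype=float).T,
                            full_matrices=False)
    return u[:, :dim]


def projection_distance(basis_a, basis_b):
    """Projection metric: Frobenius norm between the two projectors, scaled."""
    proj_a = basis_a @ basis_a.T
    proj_b = basis_b @ basis_b.T
    return float(np.linalg.norm(proj_a - proj_b, "fro") / np.sqrt(2.0))


line_x = orthonormal_basis([[1.0, 0.0, 0.0]], dim=1)
line_x_scaled = orthonormal_basis([[2.0, 0.0, 0.0]], dim=1)
line_y = orthonormal_basis([[0.0, 1.0, 0.0]], dim=1)

same = projection_distance(line_x, line_x_scaled)   # identical subspaces
orthogonal = projection_distance(line_x, line_y)    # perpendicular lines
```

The distance depends only on the subspaces, not on the particular basis or the scale of the samples used to build them.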

Relevance:

100.00% 100.00%

Publisher:

Abstract:

While formal definitions and security proofs are well established in some fields like cryptography and steganography, they are not as evident in digital watermarking research. A systematic development of watermarking schemes is desirable, but at present their development is usually informal, ad hoc, and omits the complete realization of application scenarios. This practice not only hinders the choice and use of a suitable scheme for a watermarking application, but also leads to debate about the state-of-the-art for different watermarking applications. With a view to the systematic development of watermarking schemes, we present a formal generic model for digital image watermarking. Considering possible inputs, outputs, and component functions, the initial construction of a basic watermarking model is developed further to incorporate the use of keys. On the basis of our proposed model, fundamental watermarking properties are defined and their importance exemplified for different image applications. We also define a set of possible attacks using our model showing different winning scenarios depending on the adversary capabilities. It is envisaged that with a proper consideration of watermarking properties and adversary actions in different image applications, use of the proposed model would allow a unified treatment of all practically meaningful variants of watermarking schemes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

While existing multi-biometric Dempster-Shafer theory fusion approaches have demonstrated promising performance, they do not model the uncertainty appropriately, suggesting that further improvement can be achieved. This research seeks to develop a unified framework for multimodal biometric fusion to take advantage of the uncertainty concept of Dempster-Shafer theory, improving the performance of multi-biometric authentication systems. Modeling uncertainty as a function of uncertainty factors affecting the recognition performance of the biometric systems helps to address the uncertainty of the data and the confidence of the fusion outcome. A weighted combination of quality measures and classifier performance (Equal Error Rate) is proposed to encode the uncertainty concept to improve the fusion. We also found that quality measures contribute unequally to the recognition performance; thus, selecting only significant factors and fusing them with a Dempster-Shafer approach to generate an overall quality score plays an important role in the success of uncertainty modeling. The proposed approach achieved competitive performance (approximately 1% EER) in comparison with other Dempster-Shafer based approaches and other conventional fusion approaches.
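
For readers unfamiliar with Dempster-Shafer fusion, the sketch below shows Dempster's rule of combination for two matchers on a two-class frame. The mass values, the class labels and the function name `combine` are illustrative assumptions; the abstract's weighting of quality measures and Equal Error Rate is not reproduced here.

```python
from itertools import product

GENUINE = frozenset({"genuine"})
IMPOSTOR = frozenset({"impostor"})
UNCERTAIN = GENUINE | IMPOSTOR   # mass on the whole frame expresses uncertainty


def combine(mass_a, mass_b):
    """Dempster's rule: multiply masses, pool by intersection, renormalise."""
    pooled = {}
    conflict = 0.0
    for (set_a, w_a), (set_b, w_b) in product(mass_a.items(), mass_b.items()):
        overlap = set_a & set_b
        if overlap:
            pooled[overlap] = pooled.get(overlap, 0.0) + w_a * w_b
        else:
            conflict += w_a * w_b      # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {subset: w / (1.0 - conflict) for subset, w in pooled.items()}


# Hypothetical belief masses from two biometric matchers.
face = {GENUINE: 0.6, IMPOSTOR: 0.1, UNCERTAIN: 0.3}
voice = {GENUINE: 0.5, IMPOSTOR: 0.2, UNCERTAIN: 0.3}
fused = combine(face, voice)
```

A less reliable matcher would be given a larger mass on `UNCERTAIN`, which reduces its influence on the fused decision.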

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Texture enhancement is an important component of image processing that finds extensive application in science and engineering. The quality of medical images, quantified using the imaging texture, plays a significant role in the routine diagnosis performed by medical practitioners. Most image texture enhancement is performed using classical integer-order differential mask operators. Recently, first order fractional differential operators were used to enhance images. Experimentation with these methods led to the conclusion that fractional differential operators not only maintain the low frequency contour features in the smooth areas of the image, but also nonlinearly enhance edges and textures corresponding to high frequency image components. However, whilst these methods perform well in particular cases, they are not routinely useful across all applications. To this end, we apply the second order Riesz fractional differential operator to improve upon existing approaches to texture enhancement. Compared with the classical integer-order differential mask operators and other first order fractional differential operators, we find that our new algorithms provide higher signal-to-noise values and superior image quality.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step in order to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In recent years, more and more complex humanoid robots have been developed. On the other hand, programming these systems has become more difficult. There is a clear need for such robots to be able to adapt and perform certain tasks autonomously, or even learn by themselves how to act. An important issue to tackle is the closing of the sensorimotor loop. Especially for humanoids, the tight integration of perception with actions will allow for improved behaviours, embedding adaptation at the lower level of the system.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Visual information processing in the brain proceeds in both serial and parallel fashion throughout various functionally distinct, hierarchically organised cortical areas. Feedforward signals from the retina and hierarchically lower cortical levels are the major activators of visual neurons, but top-down and feedback signals from higher-level cortical areas have a modulating effect on neural processing. My work concentrates on visual encoding in hierarchically low-level cortical visual areas in the human brain and examines neural processing especially in the cortical representation of the visual field periphery. I use magnetoencephalography and functional magnetic resonance imaging to measure neuromagnetic and hemodynamic responses during visual stimulation and oculomotor and cognitive tasks in healthy volunteers. My thesis comprises six publications. The visual cortex poses a great challenge for the modeling of neuromagnetic sources. My work shows that a priori information about source locations is needed for modeling neuromagnetic sources in the visual cortex. In addition, my work examines other potential confounding factors in vision studies, such as light scatter inside the eye, which may result in erroneous responses in cortex outside the representation of the stimulated region, and eye movements and attention. I mapped cortical representations of the peripheral visual field and identified a putative human homologue of functional area V6 of the macaque in the posterior bank of the parieto-occipital sulcus. My work shows that human V6 activates during eye movements and that it responds to visual motion at short latencies. These findings suggest that human V6, like its monkey homologue, is related to fast processing of visual stimuli and visually guided movements. I demonstrate that peripheral vision is functionally related to eye movements and connected to a rapid stream of functional areas that process visual motion.
In addition, my work shows two different forms of top-down modulation of neural processing in the hierarchically lowest cortical levels: one that is related to dorsal stream activation and may reflect motor processing or resetting signals that prepare the visual cortex for a change in the environment, and another, a local signal enhancement at the attended region, that reflects a local feedback signal and may perceptually increase stimulus saliency.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing and hand-waving. Since a human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using dynamic time warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be reused for the latter. We have also proposed a simple shadow detection technique to extract a good silhouette, which is necessary for good accuracy of an action recognition technique.
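
The role of dynamic time warping in matching feature trajectories can be sketched as follows. This is the textbook DTW recurrence with a nearest-template rule, used here as an assumption about how trajectories might be compared; the toy one-dimensional sequences and the names `dtw_distance` and `nearest_action` are hypothetical, and no Gabor features are computed.

```python
import numpy as np


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two feature trajectories."""
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]                 # treat scalars as 1-D feature vectors
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i, j] = step + min(cost[i - 1, j],
                                    cost[i, j - 1],
                                    cost[i - 1, j - 1])
    return float(cost[n, m])


def nearest_action(query, templates):
    """Label of the template trajectory with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy trajectories standing in for per-frame feature values.
templates = {"walking": [0, 1, 2, 1, 0], "boxing": [0, 3, 0, 3, 0]}
label = nearest_action([0, 1, 1, 2, 2, 1, 0], templates)
```

DTW tolerates differences in execution speed, so a slower repetition of the same action still aligns with its template at low cost.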

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we present a growing and pruning radial basis function (GAP-RBF) based no-reference (NR) image quality model for JPEG-coded images. The quality of the images is estimated without referring to their original images. The features for predicting the perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity and background luminance. Image quality estimation involves computing the functional relationship between HVS features and subjective test scores. Here, the problem of quality estimation is transformed into a function approximation problem and solved using a GAP-RBF network. The GAP-RBF network uses a sequential learning algorithm to approximate the functional relationship. The computational complexity and memory requirements of the GAP-RBF algorithm are lower than those of batch learning algorithms. Also, the GAP-RBF algorithm finds a compact image quality model and does not require retraining when new image samples are presented. Experimental results show that the GAP-RBF image quality model does emulate the mean opinion score (MOS). The subjective test results of the proposed metric are compared with the JPEG no-reference image quality index as well as the full-reference structural similarity image quality index, and it is observed to outperform both.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose two texture-based approaches, one involving Gabor filters and the other employing log-polar wavelets, for separating text from non-text elements in a document image. Both proposed algorithms compute local energy at information-rich points, which are marked by Harris' corner detector. The advantage of this approach is that the algorithm calculates the local energy only at selected points and not throughout the image, thus saving considerable computation time. The algorithm has been tested on a large set of scanned text pages, and the results were better than those of existing algorithms. Of the proposed schemes, the Gabor filter based scheme marginally outperforms the wavelet based scheme.
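
The saving from evaluating local energy only at selected points can be illustrated with a small sketch. The kernel parameters, the synthetic striped image and the function names are assumptions for illustration, and the interest points are supplied by hand here where the abstract uses Harris' corner detector.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    along = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * along / wavelength)


def local_energy(image, points, kernel):
    """Squared filter response magnitude at the given (row, col) points only."""
    half = kernel.shape[0] // 2
    energies = []
    for row, col in points:
        patch = image[row - half:row + half + 1, col - half:col + half + 1]
        if patch.shape != kernel.shape:
            energies.append(0.0)       # point too close to the image border
            continue
        energies.append(float(np.abs(np.sum(patch * kernel)) ** 2))
    return energies


# Synthetic texture: stripes that vary along the horizontal axis.
cols = np.arange(32)
image = np.tile(np.sin(2.0 * np.pi * cols / 8.0), (32, 1))
points = [(16, 16)]                    # stand-in for detected corner points

matched = local_energy(image, points, gabor_kernel(21, 8.0, 0.0, 4.0))[0]
mismatched = local_energy(image, points, gabor_kernel(21, 8.0, np.pi / 2, 4.0))[0]
```

Each point costs one patch-kernel product per filter, so the total work scales with the number of selected points and not with the number of pixels.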

Relevance:

100.00% 100.00%

Publisher:

Abstract:

3D face recognition has been an active area of research for the past several years. For a 3D face recognition system, one would like to have an accurate as well as low-cost setup for constructing a 3D face model. In this paper, we use a profilometry approach to obtain a 3D face model. This method gives a low-cost solution to the problem of acquiring 3D data, and the 3D face models generated by this method are sufficiently accurate. We also develop an algorithm that can use the 3D face model generated by the above method for recognition.