871 results for Machine vision and image processing
Abstract:
The mining environment, being complex, irregular, and time-varying, presents a challenging prospect for stereo vision. For this application, speed, reliability, and the ability to produce a dense depth map are of foremost importance. This paper evaluates a number of matching techniques for possible use in a stereo vision sensor for mining automation applications. Area-based techniques have been investigated because they have the potential to yield dense maps, are amenable to fast hardware implementation, and are suited to textured scenes. In addition, two nonparametric transforms, namely, rank and census, have been investigated. Matching algorithms using these transforms were found to have a number of clear advantages, including reliability in the presence of radiometric distortion, low computational complexity, and amenability to hardware implementation.
Abstract:
The mining environment, being complex, irregular and time-varying, presents a challenging prospect for stereo vision. The objective is to produce a stereo vision sensor suited to close-range scenes consisting primarily of rocks. This sensor should be able to produce a dense depth map within real-time constraints. Speed and robustness are of foremost importance for this investigation. A number of area-based matching metrics have been implemented, including the SAD, SSD, NCC, and their zero-meaned versions. The NCC and the zero-meaned SAD and SSD were found to produce the disparity maps with the highest proportion of valid matches. The plain SAD and SSD were the least computationally expensive, since all of their operations take place in integer arithmetic; however, they were extremely sensitive to radiometric distortion. Non-parametric techniques for matching, in particular the rank and the census transform, have also been investigated. The rank and census transforms were found to be robust with respect to radiometric distortion, as well as able to produce disparity maps with a high proportion of valid matches. An additional advantage of both the rank and the census transform is their amenability to fast hardware implementation.
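To make the area-based metrics above concrete, here is a small illustrative sketch in Python of SAD and zero-mean SAD block matching with a winner-takes-all disparity search; the window size, search range and synthetic images are assumptions for illustration, not values used in the paper.

```python
import numpy as np

def sad(left_win, right_win):
    """Sum of absolute differences between two equally sized windows."""
    return np.abs(left_win.astype(np.int32) - right_win.astype(np.int32)).sum()

def zsad(left_win, right_win):
    """Zero-mean SAD: subtract each window's mean before differencing,
    which removes a constant radiometric offset between the images."""
    l = left_win.astype(np.float64) - left_win.mean()
    r = right_win.astype(np.float64) - right_win.mean()
    return np.abs(l - r).sum()

def best_disparity(left, right, row, col, half=3, max_disp=32, metric=sad):
    """Winner-takes-all search along the epipolar line (same row)."""
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    costs = []
    for d in range(max_disp + 1):
        c = col - d
        if c - half < 0:
            break
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        costs.append(metric(ref, cand))
    return int(np.argmin(costs))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    right = rng.integers(0, 256, size=(100, 120), dtype=np.uint8)
    left = np.roll(right, 5, axis=1)          # synthetic 5-pixel disparity
    print(best_disparity(left, right, row=50, col=60, metric=zsad))
```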
Abstract:
The mining environment presents a challenging prospect for stereo vision. Our objective is to produce a stereo vision sensor suited to close-range scenes consisting mostly of rocks. This sensor should produce a dense depth map within real-time constraints. Speed and robustness are of foremost importance for this application. This paper compares a number of stereo matching algorithms in terms of robustness and suitability to fast implementation. These include traditional area-based algorithms, and algorithms based on non-parametric transforms, notably the rank and census transforms. Our experimental results show that the rank and census transforms are robust with respect to radiometric distortion and introduce less computational complexity than conventional area-based matching techniques.
Abstract:
A frame-rate stereo vision system, based on non-parametric matching metrics, is described. Traditional metrics, such as normalized cross-correlation, are expensive in terms of logic. Non-parametric measures require only simple, parallelizable functions such as comparators, counters and exclusive-or, and are thus very well suited to implementation in reprogrammable logic.
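As a rough software counterpart to the hardware-oriented design described above, the sketch below implements a 3x3 census transform and Hamming-distance comparison; the window size and the NumPy formulation are illustrative assumptions (a hardware version would use comparators, XOR and popcount logic directly).

```python
import numpy as np

def census_3x3(img):
    """Census transform: each pixel is encoded as an 8-bit code recording
    whether each of its 8 neighbours is darker than the centre pixel."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.int32)
    bit = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # Neighbour image; borders wrap around, which is fine for a sketch.
            neighbour = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
            out |= (neighbour < img).astype(np.int32) << bit
            bit += 1
    return out

def hamming(a, b):
    """Hamming distance between two census codes (XOR followed by popcount)."""
    return bin(int(a) ^ int(b)).count("1")

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    img = rng.integers(0, 256, size=(8, 8)).astype(np.int32)
    offset = img + 40                       # constant radiometric offset, no saturation
    c1, c2 = census_3x3(img), census_3x3(offset)
    print(hamming(c1[4, 4], c2[4, 4]))      # 0: the census code ignores the offset
```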
Abstract:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
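The k-mer-plus-SVM idea can be sketched as follows; the toy sequences, k = 4 and the linear SVM from scikit-learn are placeholders, not the paper's data, feature length or classifier settings.

```python
from itertools import product
import numpy as np
from sklearn.svm import LinearSVC

K = 4  # assumed k-mer length for illustration
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}

def kmer_vector(seq):
    """Normalised k-mer frequency vector for one read or contig."""
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        if idx is not None:          # skip k-mers containing ambiguous bases
            v[idx] += 1
    total = v.sum()
    return v / total if total else v

if __name__ == "__main__":
    # Toy sequences standing in for S. aureus vs. other-staphylococci reads.
    rng = np.random.default_rng(0)
    def random_seq(bias):
        probs = np.array(bias) / sum(bias)
        return "".join(rng.choice(list("ACGT"), size=300, p=probs))
    X = [kmer_vector(random_seq((4, 1, 1, 4))) for _ in range(50)] + \
        [kmer_vector(random_seq((1, 4, 4, 1))) for _ in range(50)]
    y = [1] * 50 + [0] * 50
    clf = LinearSVC().fit(np.array(X), y)
    print(clf.score(np.array(X), y))
```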
Abstract:
In the field of face recognition, Sparse Representation (SR) has received considerable attention during the past few years. Most of the relevant literature focuses on holistic descriptors in closed-set identification applications. The underlying assumption in SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such an assumption is easily violated in the more challenging face verification scenario, where an algorithm is required to determine if two faces (where one or both have not been seen before) belong to the same person. In this paper, we first discuss why previous attempts with SR might not be applicable to verification problems. We then propose an alternative approach to face verification via SR. Specifically, we propose to use explicit SR encoding on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which are then concatenated to form an overall face descriptor. Due to the deliberate loss of spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment and various image deformations. Within the proposed framework, we evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN), and an implicit probabilistic technique based on Gaussian Mixture Models. Thorough experiments on the AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the proposed local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, in both verification and closed-set identification problems. The experiments also show that l1-minimisation based encoding has a considerably higher computational cost than the other techniques, but leads to higher recognition rates.
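A minimal sketch of the local SR encoding and average-pooling scheme described above, using scikit-learn's l1 (lasso_lars) sparse coder on random data; the dictionary, patch size and 2x2 region grid are illustrative assumptions rather than the configuration used in the paper.

```python
import numpy as np
from sklearn.decomposition import SparseCoder
from sklearn.feature_extraction.image import extract_patches_2d

def region_descriptor(region, dictionary, patch_size=(8, 8)):
    """Sparse-code every patch in a region, then average-pool the codes."""
    patches = extract_patches_2d(region, patch_size)
    X = patches.reshape(len(patches), -1).astype(np.float64)
    X -= X.mean(axis=1, keepdims=True)                 # remove each patch's DC component
    coder = SparseCoder(dictionary=dictionary,
                        transform_algorithm="lasso_lars",
                        transform_alpha=0.1)
    codes = coder.transform(X)
    return np.abs(codes).mean(axis=0)                  # average pooling discards patch positions

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms = rng.standard_normal((64, 64))              # 64 atoms of dimension 8*8 (placeholder dictionary)
    atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
    face = rng.standard_normal((32, 32))               # placeholder face image
    # Split the face into 2x2 regions and concatenate the pooled codes.
    regions = [face[r:r + 16, c:c + 16] for r in (0, 16) for c in (0, 16)]
    descriptor = np.concatenate([region_descriptor(reg, atoms) for reg in regions])
    print(descriptor.shape)   # (256,) = 4 regions x 64 atoms
```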
Abstract:
Texture enhancement is an important component of image processing, with extensive application in science and engineering. The quality of medical images, quantified using the texture of the images, plays a significant role in the routine diagnosis performed by medical practitioners. Previously, image texture enhancement was performed using classical integral order differential mask operators. Recently, first order fractional differential operators were implemented to enhance images. Experiments show that the use of the fractional differential not only maintains the low frequency contour features in the smooth areas of the image, but also nonlinearly enhances edges and textures corresponding to high-frequency image components. However, whilst these methods perform well in particular cases, they are not routinely useful across all applications. To this end, we applied the second order Riesz fractional differential operator to improve upon existing approaches to texture enhancement. Compared with the classical integral order differential mask operators and other fractional differential operators, our new algorithms provide higher signal-to-noise values, which leads to superior image quality.
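For reference, one common definition of the Riesz fractional derivative of order α, written in terms of the left- and right-sided Riemann-Liouville derivatives, is given below; the paper's particular second-order mask construction is not reproduced here.

```latex
\frac{\partial^{\alpha} f(x)}{\partial |x|^{\alpha}}
  = -\frac{1}{2\cos(\pi\alpha/2)}
    \left( {}_{a}D_{x}^{\alpha} f(x) + {}_{x}D_{b}^{\alpha} f(x) \right),
\qquad \alpha \neq 1 .
```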
Abstract:
This paper presents practical vision-based collision avoidance for objects approximating a single point feature. Using a spherical camera model, a visual predictive control scheme guides the aircraft around the object along a conical spiral trajectory. Visibility, state and control constraints are considered explicitly in the controller design by combining image and vehicle dynamics in the process model, and solving the nonlinear optimization problem over the resulting state space. Importantly, range is not required. Instead, the principles of conical spiral motion are used to design an objective function that simultaneously guides the aircraft along the avoidance trajectory, whilst providing an indication of the appropriate point to stop the spiral behaviour. Our approach is aimed at providing a potential solution to the See and Avoid problem for unmanned aircraft and is demonstrated through a series.
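A tiny sketch of the spherical-camera measurement underlying the controller: a point feature expressed in the camera/body frame is mapped to bearing angles on the unit sphere, so range never enters the measurement; the frame convention used here is an assumption for illustration.

```python
import numpy as np

def spherical_projection(p_body):
    """Project a 3D point (assumed frame: x forward, y right, z down) onto the
    unit sphere and return (azimuth, elevation) in radians.
    The range ||p_body|| cancels out of both angles."""
    x, y, z = p_body
    azimuth = np.arctan2(y, x)
    elevation = np.arctan2(-z, np.hypot(x, y))
    return azimuth, elevation

if __name__ == "__main__":
    near = np.array([10.0, 5.0, -2.0])
    far = near * 7.0                      # same bearing, seven times the range
    print(spherical_projection(near))
    print(spherical_projection(far))      # identical angles: range-free measurement
```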
Abstract:
Active Appearance Models (AAMs) employ a paradigm of inverting a synthesis model of how an object can vary in terms of shape and appearance. As a result, the ability of AAMs to register an unseen object image is intrinsically linked to two factors. First, how well the synthesis model can reconstruct the object image. Second, the degrees of freedom in the model. Fewer degrees of freedom yield a higher likelihood of good fitting performance. In this paper we look at how these seemingly contrasting factors can complement one another for the problem of AAM fitting of an ensemble of images stemming from a constrained set (e.g. an ensemble of face images of the same person).
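For reference, the linear synthesis model that a standard AAM inverts can be written as follows, where the bars denote the mean shape and appearance, the columns of the Phi matrices are the retained modes, W(x; p) is the shape-controlled warp, and the lengths of p and lambda are the degrees of freedom discussed above (generic notation, not the paper's).

```latex
s = \bar{s} + \Phi_s\, p,
\qquad
A(\mathbf{x}) = \bar{A}(\mathbf{x}) + \Phi_a(\mathbf{x})\,\lambda,
\qquad
\min_{p,\,\lambda} \sum_{\mathbf{x}}
  \big[ \bar{A}(\mathbf{x}) + \Phi_a(\mathbf{x})\,\lambda - I\big(W(\mathbf{x}; p)\big) \big]^2 .
```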
Abstract:
There are several methods for determining the proteoglycan content of cartilage in biomechanics experiments. Many of these are assay-based methods, such as histochemistry or spectrophotometry protocols in which quantification is determined biochemically. More recently, a method for quantifying proteoglycan content has emerged that uses image processing algorithms, e.g., in ImageJ, to process histological micrographs, with advantages including time saving and low cost. However, it is unknown whether this image analysis method produces results that are comparable to those obtained from the biochemical methodology. This paper compares the results of a well-established chemical method with those obtained using image analysis to determine the proteoglycan content of visually normal cartilage samples (n=33) and their progressively degraded counterparts. The results reveal a strong linear relationship, with a regression coefficient (R²) of 0.9928, leading to the conclusion that the image analysis methodology is a viable alternative to spectrophotometry.
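A schematic example of the image-analysis route: threshold a micrograph to estimate the stained-area fraction, then regress that fraction against the biochemical assay values; the threshold, stain model and synthetic data below are placeholders rather than the published protocol.

```python
import numpy as np

def stained_fraction(gray, threshold=100):
    """Fraction of pixels darker than the threshold, used as a proxy
    for stain uptake (an ImageJ-style global threshold)."""
    return float((gray < threshold).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    assay = np.linspace(5, 50, 10)                     # stand-in biochemical values
    images = []
    for a in assay:
        img = rng.integers(120, 220, size=(64, 64))    # unstained background
        n_dark = int(a * 40)                           # stain roughly proportional to content
        idx = rng.choice(img.size, size=n_dark, replace=False)
        img.flat[idx] = rng.integers(20, 80, size=n_dark)
        images.append(img)
    fractions = np.array([stained_fraction(im) for im in images])
    slope, intercept = np.polyfit(fractions, assay, 1)
    pred = slope * fractions + intercept
    r2 = 1 - ((assay - pred) ** 2).sum() / ((assay - assay.mean()) ** 2).sum()
    print(round(r2, 4))                                # strong linear relationship expected
```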
Abstract:
The huge amount of CCTV footage available makes it very burdensome to process these videos manually through human operators. This has made automated processing of video footage through computer vision technologies necessary. During the past several years, there has been a large effort to detect abnormal activities through computer vision techniques. Typically, the problem is formulated as a novelty detection task, where the system is trained on normal data and is required to detect events which do not fit the learned ‘normal’ model. There is no precise and exact definition of an abnormal activity; it depends on the context of the scene. Hence there is a requirement for different feature sets to detect different kinds of abnormal activities. In this work we evaluate the performance of different state-of-the-art features for detecting the presence of abnormal objects in the scene. These include optical flow vectors to detect motion-related anomalies, and textures of optical flow and image textures to detect the presence of abnormal objects. These extracted features, in different combinations, are modeled using different state-of-the-art models, such as the Gaussian Mixture Model (GMM) and the semi-2D Hidden Markov Model (HMM), to analyse their performance. Further, we apply perspective normalization to the extracted features to compensate for perspective distortion due to the distance between the camera and the objects of consideration. The proposed approach is evaluated using the publicly available UCSD datasets, and we demonstrate improved performance compared to other state-of-the-art methods.
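The GMM-based novelty-detection formulation can be sketched as below: fit a mixture to 'normal' feature vectors (standing in for per-block optical-flow statistics) and flag low-likelihood test vectors as abnormal; the feature dimensionality, component count and threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for per-block features such as optical-flow magnitude/orientation statistics.
    normal_train = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
    normal_test  = rng.normal(loc=0.0, scale=1.0, size=(50, 4))
    abnormal     = rng.normal(loc=5.0, scale=1.0, size=(50, 4))   # e.g. a fast-moving object

    gmm = GaussianMixture(n_components=3, random_state=0).fit(normal_train)

    # Threshold on the log-likelihood under the learned 'normal' model.
    threshold = np.percentile(gmm.score_samples(normal_train), 1)
    def is_abnormal(x):
        return gmm.score_samples(x) < threshold

    print(is_abnormal(normal_test).mean())   # near 0: mostly accepted as normal
    print(is_abnormal(abnormal).mean())      # near 1: flagged as abnormal
```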
Abstract:
The selection of optimal camera configurations (camera locations, orientations, etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large-scale networks. To tackle this, we propose a statistical framework for the problem, together with a trans-dimensional simulated annealing algorithm to deal with it effectively. We compare our approach with a state-of-the-art method based on binary integer programming (BIP) and show that our approach offers similar performance on small-scale problems. However, we also demonstrate the capability of our approach in dealing with large-scale problems and show that it produces better results than two alternative heuristics designed to address the scalability issue of BIP. Lastly, we show the versatility of our approach using a number of specific scenarios.
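A toy simulated-annealing loop for camera placement on a 2D region, maximising point coverage, is sketched below; the fixed number of cameras (no trans-dimensional moves), the circular coverage model and the cooling schedule are all simplifying assumptions relative to the paper.

```python
import numpy as np

RADIUS = 3.0     # assumed circular field of view, for illustration only

def coverage(cameras, points):
    """Number of points within RADIUS of at least one camera."""
    d = np.linalg.norm(points[:, None, :] - cameras[None, :, :], axis=2)
    return int((d.min(axis=1) <= RADIUS).sum())

def anneal(points, n_cams=3, iters=2000, t0=5.0, seed=0):
    rng = np.random.default_rng(seed)
    cams = rng.uniform(0, 10, size=(n_cams, 2))
    best, best_score = cams.copy(), coverage(cams, points)
    score = best_score
    for i in range(iters):
        temp = t0 * (1 - i / iters) + 1e-6
        cand = cams.copy()
        cand[rng.integers(n_cams)] += rng.normal(scale=1.0, size=2)   # perturb one camera
        cand_score = coverage(cand, points)
        # Metropolis acceptance: always take improvements, sometimes accept worse moves.
        if cand_score >= score or rng.random() < np.exp((cand_score - score) / temp):
            cams, score = cand, cand_score
            if score > best_score:
                best, best_score = cams.copy(), score
    return best, best_score

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pts = rng.uniform(0, 10, size=(200, 2))
    _, covered = anneal(pts)
    print(covered, "of", len(pts), "points covered")
```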
Abstract:
Reliable robotic perception and planning are critical to performing autonomous actions in uncertain, unstructured environments. In field robotic systems, automation is achieved by interpreting exteroceptive sensor information to infer something about the world. This is then mapped to provide a consistent spatial context, so that actions can be planned around the predicted future interaction of the robot and the world. The whole system is as reliable as the weakest link in this chain. In this paper, the term mapping is used broadly to describe the transformation of range-based exteroceptive sensor data (such as LIDAR or stereo vision) to a fixed navigation frame, so that it can be used to form an internal representation of the environment. The coordinate transformation from the sensor frame to the navigation frame is analyzed to produce a spatial error model that captures the dominant geometric and temporal sources of mapping error. This allows the mapping accuracy to be calculated at run time. A generic extrinsic calibration method for exteroceptive range-based sensors is then presented to determine the sensor location and orientation. This allows systematic errors in individual sensors to be minimized, and when multiple sensors are used, it minimizes the systematic contradiction between them to enable reliable multisensor data fusion. The mathematical derivations at the core of this model are not particularly novel or complicated, but the rigorous analysis and application to field robotics seems to be largely absent from the literature to date. The techniques in this paper are simple to implement, and they offer a significant improvement to the accuracy, precision, and integrity of mapped information. Consequently, they should be employed whenever maps are formed from range-based exteroceptive sensor data. © 2009 Wiley Periodicals, Inc.
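The mapping step described above amounts to chaining the extrinsic calibration (sensor to body frame) and the vehicle pose (body to navigation frame); a bare-bones version with made-up calibration and pose values is sketched below.

```python
import numpy as np

def rot_z(yaw):
    """Rotation about the z-axis (a stand-in for a full roll/pitch/yaw matrix)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def sensor_to_nav(p_sensor, R_bs, t_bs, R_nb, t_nb):
    """Map a range measurement from the sensor frame to the navigation frame:
    first the extrinsic calibration (sensor -> body), then the vehicle pose
    (body -> navigation). Errors in either transform map directly into the map."""
    p_body = R_bs @ p_sensor + t_bs
    return R_nb @ p_body + t_nb

if __name__ == "__main__":
    p_sensor = np.array([10.0, 0.0, 0.0])                                # a return 10 m ahead of the sensor
    R_bs, t_bs = rot_z(np.deg2rad(2.0)), np.array([1.0, 0.0, 0.5])       # illustrative extrinsic calibration
    R_nb, t_nb = rot_z(np.deg2rad(45.0)), np.array([100.0, 200.0, 0.0])  # illustrative vehicle pose
    print(sensor_to_nav(p_sensor, R_bs, t_bs, R_nb, t_nb))
```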
Abstract:
This work aims to promote integrity in autonomous perceptual systems, with a focus on outdoor unmanned ground vehicles equipped with a camera and a 2D laser range finder. A method to check for inconsistencies between the data provided by these two heterogeneous sensors is proposed and discussed. First, uncertainties in the estimated transformation between the laser and camera frames are evaluated and propagated up to the projection of the laser points onto the image. Then, for each acquired laser scan and camera image pair, the information at corners of the laser scan is compared with the content of the image, resulting in a likelihood of correspondence. The result of this process is then used to validate segments of the laser scan that are found to be consistent with the image, while inconsistent segments are rejected. Experimental results illustrate how this technique can improve the reliability of perception in challenging environmental conditions, such as in the presence of airborne dust.
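A rough sketch of the projection step underlying the consistency check: a laser point expressed in the camera frame is projected with a pinhole model, and its covariance is propagated through the projection Jacobian to give a pixel-level uncertainty; the intrinsics and covariance values are illustrative assumptions.

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])          # assumed pinhole intrinsics

def project(p_cam):
    """Pinhole projection of a 3D point in the camera frame to pixel coordinates."""
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

def projection_jacobian(p_cam):
    """Jacobian of the pinhole projection with respect to the 3D point."""
    x, y, z = p_cam
    fx, fy = K[0, 0], K[1, 1]
    return np.array([[fx / z, 0.0, -fx * x / z**2],
                     [0.0, fy / z, -fy * y / z**2]])

if __name__ == "__main__":
    p_cam = np.array([0.5, -0.2, 8.0])       # laser point expressed in the camera frame
    cov_3d = np.diag([0.02, 0.02, 0.05])     # assumed uncertainty after the laser->camera transform
    J = projection_jacobian(p_cam)
    cov_px = J @ cov_3d @ J.T                # first-order propagation to pixel coordinates
    print(project(p_cam))
    print(cov_px)
```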