981 resultados para local image features


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Affine covariant local image features are a powerful tool for many applications, including matching and calibrating wide baseline images. Local feature extractors that use a saliency map to locate features require adaptation processes in order to extract affine covariant features. The most effective extractors make use of the second moment matrix (SMM) to iteratively estimate the affine shape of local image regions. This paper shows that the Hessian matrix can be used to estimate local affine shape in a similar fashion to the SMM. The Hessian matrix requires significantly less computation effort than the SMM, allowing more efficient affine adaptation. Experimental results indicate that using the Hessian matrix in conjunction with a feature extractor that selects features in regions with high second order gradients delivers equivalent quality correspondences in less than 17% of the processing time, compared to the same extractor using the SMM.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When we actively explore the visual environment, our gaze preferentially selects regions characterized by high contrast and high density of edges, suggesting that the guidance of eye movements during visual exploration is driven to a significant degree by perceptual characteristics of a scene. Converging findings suggest that the selection of the visual target for the upcoming saccade critically depends on a covert shift of spatial attention. However, it is unclear whether attention selects the location of the next fixation uniquely on the basis of global scene structure or additionally on local perceptual information. To investigate the role of spatial attention in scene processing, we examined eye fixation patterns of patients with spatial neglect during unconstrained exploration of natural images and compared these to healthy and brain-injured control participants. We computed luminance, colour, contrast, and edge information contained in image patches surrounding each fixation and evaluated whether they differed from randomly selected image patches. At the global level, neglect patients showed the characteristic ipsilesional shift of the distribution of their fixations. At the local level, patients with neglect and control participants fixated image regions in ipsilesional space that were closely similar with respect to their local feature content. In contrast, when directing their gaze to contralesional (impaired) space neglect patients fixated regions of significantly higher local luminance and lower edge content than controls. These results suggest that intact spatial attention is necessary for the active sampling of local feature content during scene perception.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While the primary purpose of edge detection schemes is to be able to produce an edge map of a given image, the ability to distinguish between different feature types is also of importance. In this paper we examine feature classification based on local energy detection and show that local energy measures are intrinsically capable of making this classification because of the use of odd and even filters. The advantage of feature classification is that it allows for the elimination of certain feature types from the edge map, thus simplifying the task of object recognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Over the last decade, a plethora of computer-aided diagnosis (CAD) systems have been proposed aiming to improve the accuracy of the physicians in the diagnosis of interstitial lung diseases (ILD). In this study, we propose a scheme for the classification of HRCT image patches with ILD abnormalities as a basic component towards the quantification of the various ILD patterns in the lung. The feature extraction method relies on local spectral analysis using a DCT-based filter bank. After convolving the image with the filter bank, q-quantiles are computed for describing the distribution of local frequencies that characterize image texture. Then, the gray-level histogram values of the original image are added forming the final feature vector. The classification of the already described patches is done by a random forest (RF) classifier. The experimental results prove the superior performance and efficiency of the proposed approach compared against the state-of-the-art.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper is concerned with choosing image features for image based visual servo control and how this choice influences the closed-loop dynamics of the system. In prior work, image features tend to be chosen on the basis of image processing simplicity and noise sensitivity. In this paper we show that the choice of feature directly influences the closed-loop dynamics in task-space. We focus on the depth axis control of a visual servo system and compare analytically various approaches that have been reported recently in the literature. The theoretical predictions are verified by experiment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Local image feature extractors that select local maxima of the determinant of Hessian function have been shown to perform well and are widely used. This paper introduces the negative local minima of the determinant of Hessian function for local feature extraction. The properties and scale-space behaviour of these features are examined and found to be desirable for feature extraction. It is shown how this new feature type can be implemented along with the existing local maxima approach at negligible extra processing cost. Applications to affine covariant feature extraction and sub-pixel precise corner extraction are demonstrated. Experimental results indicate that the new corner detector is more robust to image blur and noise than existing methods. It is also accurate for a broader range of corner geometries. An affine covariant feature extractor is implemented by combining the minima of the determinant of Hessian with existing scale and shape adaptation methods. This extractor can be implemented along side the existing Hessian maxima extractor simply by finding both minima and maxima during the initial extraction stage. The minima features increase the number of correspondences by two to four fold. The additional minima features are very distinct from the maxima features in descriptor space and do not make the matching process more ambiguous.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Falls are one of the greatest threats to elderly health in their daily living routines and activities. Therefore, it is very important to detect falls of an elderly in a timely and accurate manner, so that immediate response and proper care can be provided, by sending fall alarms to caregivers. Radar is an effective non-intrusive sensing modality which is well suited for this purpose, which can detect human motions in all types of environments, penetrate walls and fabrics, preserve privacy, and is insensitive to lighting conditions. Micro-Doppler features are utilized in radar signal corresponding to human body motions and gait to detect falls using a narrowband pulse-Doppler radar. Human motions cause time-varying Doppler signatures, which are analyzed using time-frequency representations and matching pursuit decomposition (MPD) for feature extraction and fall detection. The extracted features include MPD features and the principal components of the time-frequency signal representations. To analyze the sequential characteristics of typical falls, the extracted features are used for training and testing hidden Markov models (HMM) in different falling scenarios. Experimental results demonstrate that the proposed algorithm and method achieve fast and accurate fall detections. The risk of falls increases sharply when the elderly or patients try to exit beds. Thus, if a bed exit can be detected at an early stage of this motion, the related injuries can be prevented with a high probability. To detect bed exit for fall prevention, the trajectory of head movements is used for recognize such human motion. A head detector is trained using the histogram of oriented gradient (HOG) features of the head and shoulder areas from recorded bed exit images. A data association algorithm is applied on the head detection results to eliminate head detection false alarms. Then the three dimensional (3D) head trajectories are constructed by matching scale-invariant feature transform (SIFT) keypoints in the detected head areas from both the left and right stereo images. The extracted 3D head trajectories are used for training and testing an HMM based classifier for recognizing bed exit activities. The results of the classifier are presented and discussed in the thesis, which demonstrates the effectiveness of the proposed stereo vision based bed exit detection approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The relationship between multiple cameras viewing the same scene may be discovered automatically by finding corresponding points in the two views and then solving for the camera geometry. In camera networks with sparsely placed cameras, low resolution cameras or in scenes with few distinguishable features it may be difficult to find a sufficient number of reliable correspondences from which to compute geometry. This paper presents a method for extracting a larger number of correspondences from an initial set of putative correspondences without any knowledge of the scene or camera geometry. The method may be used to increase the number of correspondences and make geometry computations possible in cases where existing methods have produced insufficient correspondences.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Camera calibration information is required in order for multiple camera networks to deliver more than the sum of many single camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide base-line scenario. Finding sufficient correspondences for camera calibration in wide base-line scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view covariant local feature extractors. The second area involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over the existing methods. It allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency. Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: An affine region alignment algorithm that ensures accurate alignment between matched features; A method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; An algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improves the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide baseline matching algorithms are insufficient for computing the scene geometry.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study investigates face recognition with partial occlusion, illumination variation and their combination, assuming no prior information about the mismatch, and limited training data for each person. The authors extend their previous posterior union model (PUM) to give a new method capable of dealing with all these problems. PUM is an approach for selecting the optimal local image features for recognition to improve robustness to partial occlusion. The extension is in two stages. First, authors extend PUM from a probability-based formulation to a similarity-based formulation, so that it operates with as little as one single training sample to offer robustness to partial occlusion. Second, they extend this new formulation to make it robust to illumination variation, and to combined illumination variation and partial occlusion, by a novel combination of multicondition relighting and optimal feature selection. To evaluate the new methods, a number of databases with various simulated and realistic occlusion/illumination mismatches have been used. The results have demonstrated the improved robustness of the new methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The perception of global form requires integration of local visual cues across space and is the foundation for object recognition. Here we used magnetoencephalography (MEG) to study the location and time course of neuronal activity associated with the perception of global structure from local image features. To minimize neuronal activity to low-level stimulus properties, such as luminance and contrast, the local image features were held constant during all phases of the MEG recording. This allowed us to assess the relative importance of striate (V1) versus extrastriate cortex in global form perception.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In public venues, crowd size is a key indicator of crowd safety and stability. Crowding levels can be detected using holistic image features, however this requires a large amount of training data to capture the wide variations in crowd distribution. If a crowd counting algorithm is to be deployed across a large number of cameras, such a large and burdensome training requirement is far from ideal. In this paper we propose an approach that uses local features to count the number of people in each foreground blob segment, so that the total crowd estimate is the sum of the group sizes. This results in an approach that is scalable to crowd volumes not seen in the training data, and can be trained on a very small data set. As a local approach is used, the proposed algorithm can easily be used to estimate crowd density throughout different regions of the scene and be used in a multi-camera environment. A unique localised approach to ground truth annotation reduces the required training data is also presented, as a localised approach to crowd counting has different training requirements to a holistic one. Testing on a large pedestrian database compares the proposed technique to existing holistic techniques and demonstrates improved accuracy, and superior performance when test conditions are unseen in the training set, or a minimal training set is used.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Existing crowd counting algorithms rely on holistic, local or histogram based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram based methods, and to compare various image features and regression models. A K-fold cross validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central datasets. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram based features; optimal performance is observed using all image features except for textures; and that GPR outperforms linear, KNN and NN regression

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis examines the feasibility of a forest inventory method based on two-phase sampling in estimating forest attributes at the stand or substand levels for forest management purposes. The method is based on multi-source forest inventory combining auxiliary data consisting of remote sensing imagery or other geographic information and field measurements. Auxiliary data are utilized as first-phase data for covering all inventory units. Various methods were examined for improving the accuracy of the forest estimates. Pre-processing of auxiliary data in the form of correcting the spectral properties of aerial imagery was examined (I), as was the selection of aerial image features for estimating forest attributes (II). Various spatial units were compared for extracting image features in a remote sensing aided forest inventory utilizing very high resolution imagery (III). A number of data sources were combined and different weighting procedures were tested in estimating forest attributes (IV, V). Correction of the spectral properties of aerial images proved to be a straightforward and advantageous method for improving the correlation between the image features and the measured forest attributes. Testing different image features that can be extracted from aerial photographs (and other very high resolution images) showed that the images contain a wealth of relevant information that can be extracted only by utilizing the spatial organization of the image pixel values. Furthermore, careful selection of image features for the inventory task generally gives better results than inputting all extractable features to the estimation procedure. When the spatial units for extracting very high resolution image features were examined, an approach based on image segmentation generally showed advantages compared with a traditional sample plot-based approach. Combining several data sources resulted in more accurate estimates than any of the individual data sources alone. The best combined estimate can be derived by weighting the estimates produced by the individual data sources by the inverse values of their mean square errors. Despite the fact that the plot-level estimation accuracy in two-phase sampling inventory can be improved in many ways, the accuracy of forest estimates based mainly on single-view satellite and aerial imagery is a relatively poor basis for making stand-level management decisions.