171 resultados para Vector representation
Resumo:
Age estimation from facial images is increasingly receiving attention to solve age-based access control, age-adaptive targeted marketing, amongst other applications. Since even humans can be induced in error due to the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation which is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may happen in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on FG-NET Aging and MORPH Album 2 datasets and a mean absolute error (MAE) of 4.50 and 5.86 years was achieved respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets and a MAE of 5.20 years was achieved. Furthermore, we have also compared the age estimation made by humans with the proposed approach and it has shown that the machine outperforms humans. The proposed approach is competitive with current state-of-the-art and it provides an additional robustness to blur, lighting and expression variance brought about by the local phase features.
Resumo:
This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.
Resumo:
This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular, but simple bag-of-words approach. In our framework, first multiple instance SVM (mi-SVM) is used to identify positive features for each action category and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the base-line bag-of-feature method.
Resumo:
While many measures of viewpoint goodness have been proposed in computer graphics, none have been evaluated for ribbon representations of protein secondary structure. To fill this gap, we conducted a user study on Amazon’s Mechanical Turk platform, collecting human viewpoint preferences from 65 participants for 4 representative su- perfamilies of protein domains. In particular, we evaluated viewpoint entropy, which was previously shown to be a good predictor for human viewpoint preference of other, mostly non-abstract objects. In a second study, we asked 7 molecular biology experts to find the best viewpoint of the same protein domains and compared their choices with viewpoint entropy. Our results show that viewpoint entropy overall is a significant predictor of human viewpoint preference for ribbon representations of protein secondary structure. However, the accuracy is highly dependent on the complexity of the structure: while most participants agree on good viewpoints for small, non-globular structures with few secondary structure elements, viewpoint preference varies considerably for complex structures. Finally, experts tend to choose viewpoints of both low and high viewpoint entropy to emphasize different aspects of the respective structure.
Resumo:
This paper addresses the challenges of flood mapping using multispectral images. Quantitative flood mapping is critical for flood damage assessment and management. Remote sensing images obtained from various satellite or airborne sensors provide valuable data for this application, from which the information on the extent of flood can be extracted. However the great challenge involved in the data interpretation is to achieve more reliable flood extent mapping including both the fully inundated areas and the 'wet' areas where trees and houses are partly covered by water. This is a typical combined pure pixel and mixed pixel problem. In this paper, an extended Support Vector Machines method for spectral unmixing developed recently has been applied to generate an integrated map showing both pure pixels (fully inundated areas) and mixed pixels (trees and houses partly covered by water). The outputs were compared with the conventional mean based linear spectral mixture model, and better performance was demonstrated with a subset of Landsat ETM+ data recorded at the Daly River Basin, NT, Australia, on 3rd March, 2008, after a flood event.
Resumo:
The most difficult operation in the flood inundation mapping using optical flood images is to separate fully inundated areas from the ‘wet’ areas where trees and houses are partly covered by water. This can be referred as a typical problem the presence of mixed pixels in the images. A number of automatic information extraction image classification algorithms have been developed over the years for flood mapping using optical remote sensing images. Most classification algorithms generally, help in selecting a pixel in a particular class label with the greatest likelihood. However, these hard classification methods often fail to generate a reliable flood inundation mapping because the presence of mixed pixels in the images. To solve the mixed pixel problem advanced image processing techniques are adopted and Linear Spectral unmixing method is one of the most popular soft classification technique used for mixed pixel analysis. The good performance of linear spectral unmixing depends on two important issues, those are, the method of selecting endmembers and the method to model the endmembers for unmixing. This paper presents an improvement in the adaptive selection of endmember subset for each pixel in spectral unmixing method for reliable flood mapping. Using a fixed set of endmembers for spectral unmixing all pixels in an entire image might cause over estimation of the endmember spectra residing in a mixed pixel and hence cause reducing the performance level of spectral unmixing. Compared to this, application of estimated adaptive subset of endmembers for each pixel can decrease the residual error in unmixing results and provide a reliable output. In this current paper, it has also been proved that this proposed method can improve the accuracy of conventional linear unmixing methods and also easy to apply. Three different linear spectral unmixing methods were applied to test the improvement in unmixing results. Experiments were conducted in three different sets of Landsat-5 TM images of three different flood events in Australia to examine the method on different flooding conditions and achieved satisfactory outcomes in flood mapping.