977 resultados para feature representation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a robust stochastic framework for the incorporation of visual observations into conventional estimation, data fusion, navigation and control algorithms. The representation combines Isomap, a non-linear dimensionality reduction algorithm, with expectation maximization, a statistical learning scheme. The joint probability distribution of this representation is computed offline based on existing training data. The training phase of the algorithm results in a nonlinear and non-Gaussian likelihood model of natural features conditioned on the underlying visual states. This generative model can be used online to instantiate likelihoods corresponding to observed visual features in real-time. The instantiated likelihoods are expressed as a Gaussian mixture model and are conveniently integrated within existing non-linear filtering algorithms. Example applications based on real visual data from heterogenous, unstructured environments demonstrate the versatility of the generative models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a robust stochastic model for the incorporation of natural features within data fusion algorithms. The representation combines Isomap, a non-linear manifold learning algorithm, with Expectation Maximization, a statistical learning scheme. The representation is computed offline and results in a non-linear, non-Gaussian likelihood model relating visual observations such as color and texture to the underlying visual states. The likelihood model can be used online to instantiate likelihoods corresponding to observed visual features in real-time. The likelihoods are expressed as a Gaussian Mixture Model so as to permit convenient integration within existing nonlinear filtering algorithms. The resulting compactness of the representation is especially suitable to decentralized sensor networks. Real visual data consisting of natural imagery acquired from an Unmanned Aerial Vehicle is used to demonstrate the versatility of the feature representation.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Spatio-Temporal interest points are the most popular feature representation in the field of action recognition. A variety of methods have been proposed to detect and describe local patches in video with several techniques reporting state of the art performance for action recognition. However, the reported results are obtained under different experimental settings with different datasets, making it difficult to compare the various approaches. As a result of this, we seek to comprehensively evaluate state of the art spatio- temporal features under a common evaluation framework with popular benchmark datasets (KTH, Weizmann) and more challenging datasets such as Hollywood2. The purpose of this work is to provide guidance for researchers, when selecting features for different applications with different environmental conditions. In this work we evaluate four popular descriptors (HOG, HOF, HOG/HOF, HOG3D) using a popular bag of visual features representation, and Support Vector Machines (SVM)for classification. Moreover, we provide an in-depth analysis of local feature descriptors and optimize the codebook sizes for different datasets with different descriptors. In this paper, we demonstrate that motion based features offer better performance than those that rely solely on spatial information, while features that combine both types of data are more consistent across a variety of conditions, but typically require a larger codebook for optimal performance.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular, but simple bag-of-words approach. In our framework, first multiple instance SVM (mi-SVM) is used to identify positive features for each action category and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the base-line bag-of-feature method.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents a new, dynamic feature representation method for high value parts consisting of complex and intersecting features. The method first extracts features from the CAD model of a complex part. Then the dynamic status of each feature is established between various operations to be carried out during the whole manufacturing process. Each manufacturing and verification operation can be planned and optimized using the real conditions of a feature, thus enhancing accuracy, traceability and process control. The dynamic feature representation is complementary to the design models used as underlining basis in current CAD/CAM and decision support systems. © 2012 CIRP.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The computation of compact and meaningful representations of high dimensional sensor data has recently been addressed through the development of Nonlinear Dimensional Reduction (NLDR) algorithms. The numerical implementation of spectral NLDR techniques typically leads to a symmetric eigenvalue problem that is solved by traditional batch eigensolution algorithms. The application of such algorithms in real-time systems necessitates the development of sequential algorithms that perform feature extraction online. This paper presents an efficient online NLDR scheme, Sequential-Isomap, based on incremental singular value decomposition (SVD) and the Isomap method. Example simulations demonstrate the validity and significant potential of this technique in real-time applications such as autonomous systems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

utomatic pain monitoring has the potential to greatly improve patient diagnosis and outcomes by providing a continuous objective measure. One of the most promising methods is to do this via automatically detecting facial expressions. However, current approaches have failed due to their inability to: 1) integrate the rigid and non-rigid head motion into a single feature representation, and 2) incorporate the salient temporal patterns into the classification stage. In this paper, we tackle the first problem by developing a “histogram of facial action units” representation using Active Appearance Model (AAM) face features, and then utilize a Hidden Conditional Random Field (HCRF) to overcome the second issue. We show that both of these methods improve the performance on the task of pain detection in sequence level compared to current state-of-the-art-methods on the UNBC-McMaster Shoulder Pain Archive.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step in order to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Age estimation from facial images is increasingly receiving attention to solve age-based access control, age-adaptive targeted marketing, amongst other applications. Since even humans can be induced in error due to the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation which is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may happen in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on FG-NET Aging and MORPH Album 2 datasets and a mean absolute error (MAE) of 4.50 and 5.86 years was achieved respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets and a MAE of 5.20 years was achieved. Furthermore, we have also compared the age estimation made by humans with the proposed approach and it has shown that the machine outperforms humans. The proposed approach is competitive with current state-of-the-art and it provides an additional robustness to blur, lighting and expression variance brought about by the local phase features.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This PhD research has proposed new machine learning techniques to improve human action recognition based on local features. Several novel video representation and classification techniques have been proposed to increase the performance with lower computational complexity. The major contributions are the construction of new feature representation techniques, based on advanced machine learning techniques such as multiple instance dictionary learning, Latent Dirichlet Allocation (LDA) and Sparse coding. A Binary-tree based classification technique was also proposed to deal with large amounts of action categories. These techniques are not only improving the classification accuracy with constrained computational resources but are also robust to challenging environmental conditions. These developed techniques can be easily extended to a wide range of video applications to provide near real-time performance.