960 results for local features


Relevance:

60.00%

Publisher:

Abstract:

This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification play a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however, the common multi-class SVM approaches suffer from limitations, including easily confused classes and computational inefficiency. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We propose constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every newly created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to the popular 'one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrate performance comparable to the baseline bag-of-features method.
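
The tree structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: splitting classes around the two most distant class means stands in for the GMM-based grouping, and a nearest-centroid rule stands in for the SVM learned at each internal node. All function names and the toy class means are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def split(labels, means):
    """Split class labels into two groups around the two most distant class means.
    Stand-in for GMM clustering; assumes all class means are distinct."""
    a, b = max(((p, q) for p in labels for q in labels if p < q),
               key=lambda pq: dist2(means[pq[0]], means[pq[1]]))
    left = [c for c in labels if dist2(means[c], means[a]) <= dist2(means[c], means[b])]
    right = [c for c in labels if c not in left]
    return left, right

def build(labels, means):
    """Recursively allocate classes to sub-nodes until each leaf holds one class."""
    if len(labels) == 1:
        return labels[0]  # leaf: a single activity label
    left, right = split(labels, means)
    return {"left": build(left, means), "right": build(right, means),
            # Stand-in for the per-node SVM: one centroid per branch.
            "cl": centroid([means[c] for c in left]),
            "cr": centroid([means[c] for c in right])}

def predict(node, x):
    """Walk from the root to a leaf; return the label and the number of decisions made."""
    steps = 0
    while isinstance(node, dict):
        go_left = dist2(x, node["cl"]) <= dist2(x, node["cr"])
        node = node["left"] if go_left else node["right"]
        steps += 1
    return node, steps

# Toy 2-D class means for four hypothetical activities.
means = {"run": [0.0, 0.0], "walk": [1.0, 0.0], "kiss": [10.0, 10.0], "hug": [11.0, 10.0]}
tree = build(sorted(means), means)
label, steps = predict(tree, [0.2, 0.1])
```

In this toy tree a test sample passes through two binary decisions to reach one of four classes, whereas a one-against-the-rest scheme would evaluate one classifier per class.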

Relevance:

60.00%

Publisher:

Abstract:

This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking and 3D scene understanding. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real-world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular but simple bag-of-words approach. In our framework, multiple instance SVM (mi-SVM) is first used to identify positive features for each action category, and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the baseline bag-of-features method.

Relevance:

60.00%

Publisher:

Abstract:

This PhD research has proposed new machine learning techniques to improve human action recognition based on local features. Several novel video representation and classification techniques have been proposed to increase performance with lower computational complexity. The major contributions are the construction of new feature representation techniques, based on advanced machine learning techniques such as multiple instance dictionary learning, Latent Dirichlet Allocation (LDA) and sparse coding. A binary-tree-based classification technique was also proposed to deal with large numbers of action categories. These techniques not only improve the classification accuracy with constrained computational resources but are also robust to challenging environmental conditions. The developed techniques can be easily extended to a wide range of video applications to provide near real-time performance.

Relevance:

60.00%

Publisher:

Abstract:

The structural landscape of acid-pyridine cocrystals is explored by adopting a combinatorial matrix method with 4-substituted benzoic acids and 4-substituted pyridines. The choice of the system restricts the primary synthon to the robust acid-pyridine entity. This methodology accordingly provides hints toward the formation of secondary synthons. The pK(a) rule is validated in the landscape by taking all components of the matrix together and exploring it as a whole. Along with the global features, the exploration of landscapes reveals some local features. Apart from the identification of secondary synthons, it also sheds light on the propensity of hydration in cocrystals, synthon competition, and certain topological similarities. The method described here combines two approaches, namely, database analysis and high throughput crystallography, to extract more information with minimal extra experimental effort.

Relevance:

60.00%

Publisher:

Abstract:

This paper presents a GPU implementation of normalized cuts for the road extraction problem using panchromatic satellite imagery. The roads are extracted in three stages, namely pre-processing, image segmentation and post-processing. Initially, the image is pre-processed to improve the tolerance by reducing the clutter (which mostly represents buildings, vegetation, and fallow regions). The road regions are then extracted using the normalized cuts algorithm. The normalized cuts algorithm is a graph-based partitioning approach whose focus lies in extracting the global impression (perceptual grouping) of an image rather than local features. For the segmented image, post-processing is carried out using the morphological operations of erosion and dilation. Finally, the road-extracted image is overlaid on the original image. Here, a GPGPU (General Purpose Graphical Processing Unit) approach has been adopted to implement the same algorithm on the GPU for fast processing. A performance comparison of the proposed GPU implementation of the normalized cuts algorithm with the earlier algorithm (CPU implementation) is presented. From the results, we conclude that the proposed GPU implementation improves computation time as the size of the image increases. A qualitative and quantitative assessment of the segmentation results is also presented.
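
The post-processing step named in this abstract, erosion followed by dilation, can be sketched on a small binary mask. The 3x3 square structuring element, the zero padding at the image border and the order of the two operations (a morphological opening) are assumptions made for illustration; the abstract does not specify them.

```python
NEIGHBOURHOOD = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def erode(img):
    """Binary erosion with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in NEIGHBOURHOOD))
             for x in range(w)] for y in range(h)]

def dilate(img):
    """Binary dilation with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in NEIGHBOURHOOD))
             for x in range(w)] for y in range(h)]

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the structuring element."""
    return dilate(erode(img))

# Toy segmentation mask: a 3x3 region plus one isolated noise pixel (top right).
mask = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cleaned = opening(mask)
```

On this toy mask the opening keeps the 3x3 region and drops the isolated pixel, which is the kind of clean-up morphological post-processing is typically used for.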

Relevance:

60.00%

Publisher:

Abstract:

Representing images and videos in the form of compact codes has emerged as an important research interest in the vision community, in the context of web-scale image/video search. The recently proposed Vector of Locally Aggregated Descriptors (VLAD) has been shown to outperform existing retrieval techniques while giving the desired compact representation. VLAD aggregates the local features of an image in the feature space. In this paper, we propose to represent the local features extracted from an image as sparse codes over an over-complete dictionary, which is obtained with the K-SVD dictionary training algorithm. The proposed VLAD aggregates the residuals in the space of these sparse codes to obtain a compact representation for the image. Experiments are performed on the 'Holidays' database using SIFT features. The performance of the proposed method is compared with the original VLAD. The 4% increase in mean average precision (mAP) indicates the better retrieval performance of the proposed sparse-coding-based VLAD.
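
For orientation, the original VLAD aggregation that this abstract builds on can be sketched as follows. This is the plain descriptor-space version, not the sparse-code variant the paper proposes; the function name, the toy codebook and the final L2 normalisation step are illustrative assumptions.

```python
from math import sqrt

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one vector of summed residuals per centroid."""
    dim = len(centroids[0])
    acc = [[0.0] * dim for _ in centroids]  # one residual accumulator per centroid
    for d in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean distance).
        k = min(range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centroids[i])))
        # Accumulate the residual between the descriptor and that centroid.
        for j in range(dim):
            acc[k][j] += d[j] - centroids[k][j]
    flat = [v for row in acc for v in row]  # concatenate per-centroid sums
    norm = sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy example: a 2-word codebook and three 2-D "descriptors".
codebook = [[0.0, 0.0], [10.0, 10.0]]
features = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
code = vlad(features, codebook)
```

The output length is the codebook size times the descriptor dimension, regardless of how many local features the image contains, which is what makes the representation compact.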

Relevance:

60.00%

Publisher:

Abstract:

We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes.

Relevance:

60.00%

Publisher:

Abstract:

Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost of matching features from one shape to the features of the other often reveals how similar the two shapes are. However, due to the complexity of computing the exact minimum cost matching, previous algorithms could only run efficiently when using a limited number of features per shape, and could not scale to perform retrievals from large databases. We present a contour matching algorithm that quickly computes the minimum weight matching between sets of descriptive local features using a recently introduced low-distortion embedding of the Earth Mover's Distance (EMD) into a normed space. Given a novel embedded contour, the nearest neighbors in a database of embedded contours are retrieved in sublinear time via approximate nearest neighbors search. We demonstrate our shape matching method on databases of 10,000 images of human figures and 60,000 images of handwritten digits.

Relevance:

60.00%

Publisher:

Abstract:

This thesis addresses the problem of recognizing solid objects in the three-dimensional world, using two-dimensional shape information extracted from a single image. Objects can be partly occluded and can occur in cluttered scenes. A model-based approach is taken, where stored models are matched to an image. The matching problem is separated into two stages, which employ different representations of objects. The first stage uses the smallest possible number of local features to find transformations from a model to an image. This minimizes the amount of search required in recognition. The second stage uses the entire edge contour of an object to verify each transformation. This reduces the chance of finding false matches.

Relevance:

60.00%

Publisher:

Abstract:

This paper provides a summary of our studies on robust speech recognition based on a new statistical approach, the probabilistic union model. We consider speech recognition given that part of the acoustic features may be corrupted by noise. The union model is a method for basing the recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. To this end, the union model is similar to the missing feature method. However, the two methods achieve this end through different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated the applications of the union model to speech recognition involving unknown partial corruption in frequency band, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied, as a means of dealing with a mixture of known or trainable noise and unknown unexpected noise. In this paper, a unified review of each of these applications is provided in the context of dealing with unknown partial feature corruption, giving the appropriate theory and implementation algorithms, along with an experimental evaluation.

Relevance:

60.00%

Publisher:

Abstract:

Face recognition with unknown, partial distortion and occlusion is a practical problem, and has a wide range of applications, including security and multimedia information retrieval. The authors present a new approach to face recognition subject to unknown, partial distortion and occlusion. The new approach is based on a probabilistic decision-based neural network, enhanced by a statistical method called the posterior union model (PUM). PUM is an approach for ignoring severely mismatched local features and focusing the recognition mainly on the reliable local features. It thereby improves the robustness while assuming no prior information about the corruption. We call the new approach the posterior union decision-based neural network (PUDBNN). The new PUDBNN model has been evaluated on three face image databases (XM2VTS, AT&T and AR) using testing images subjected to various types of simulated and realistic partial distortion and occlusion. The new system has been compared to other approaches and has demonstrated improved performance.

Relevance:

60.00%

Publisher:

Abstract:

Segregation measures have been applied in the study of many societies, and traditionally such measures have been used to assess the degree of division between social and cultural groups across urban areas, wider regions, or perhaps national areas. The degree of segregation can vary substantially from place to place even within very small areas. In this paper the substantive concern is with religious/political segregation in Northern Ireland—particularly the proportion of Protestants (often taken as an indicator of those who wish to retain the union with Britain) to Catholics (often taken as an indicator of those who favour union with the Republic of Ireland). Traditionally, segregation is measured globally—that is, across all units in a given area. A recent trend in spatial data analysis generally, and in segregation analysis specifically, is to assess local features of spatial datasets. The rationale behind such approaches is that global methods may obscure important spatial variations in the property of interest, and thus prevent full use of the data. In this paper the utility of local measures of residential segregation is assessed with reference to the religious/political composition of Northern Ireland. The paper demonstrates marked spatial variations in the degree and nature of residential segregation across Northern Ireland. It is argued that local measures provide highly useful information in addition to that provided in maps of the raw variables and in standard global segregation measures.

Relevance:

60.00%

Publisher:

Abstract:

Prediction of biotic responses to future climate change in tropical Africa tends to be based on two modelling approaches: bioclimatic species envelope models and dynamic vegetation models. Another complementary but underused approach is to examine biotic responses to similar climatic changes in the past as evidenced in fossil and historical records. This paper reviews these records and highlights the information that they provide in terms of understanding the local- and regional-scale responses of African vegetation to future climate change. A key point that emerges is that a move to warmer and wetter conditions in the past resulted in a large increase in biomass and a range distribution of woody plants up to 400–500 km north of its present location, the so-called greening of the Sahara. By contrast, a transition to warmer and drier conditions resulted in a reduction in woody vegetation in many regions and an increase in grass/savanna-dominated landscapes. The rapid rate of climate warming coming into the current interglacial resulted in a dramatic increase in community turnover, but there is little evidence for widespread extinctions. However, huge variation in biotic response in both space and time is apparent with, in some cases, totally different responses to the same climatic driver. This highlights the importance of local features such as soils, topography and also internal biotic factors in determining responses and resilience of the African biota to climate change, information that is difficult to obtain from modelling but is abundant in palaeoecological records.

Relevance:

60.00%

Publisher:

Abstract:

This chapter describes an experimental system for the recognition of human faces from surveillance video. In surveillance applications, the system must be robust to changes in illumination, scale, pose and expression. The system must also be able to perform detection and recognition rapidly in real time. Our system detects faces using the Viola-Jones face detector, then extracts local features to build a shape-based feature vector. The feature vector is constructed from ratios of lengths and differences in tangents of angles, so as to be robust to changes in scale and rotations in-plane and out-of-plane. Consideration was given to improving the performance and accuracy of both the detection and recognition steps.
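
Why ratios of lengths give robustness to scale can be seen in a small sketch. The landmark coordinates, the choice of reference length and the helper names are hypothetical; the sketch covers only uniform scaling, in-plane rotation and translation, not the out-of-plane case or the tangent-difference features mentioned in the abstract.

```python
from math import cos, hypot, sin

def length_ratios(points):
    """Ratios of every pairwise distance to the first pairwise distance."""
    d = [hypot(p[0] - q[0], p[1] - q[1])
         for i, p in enumerate(points) for q in points[i + 1:]]
    return [v / d[0] for v in d[1:]]

def similarity_transform(points, scale, angle, tx, ty):
    """Apply uniform scaling, in-plane rotation (radians) and translation to 2-D points."""
    c, s = cos(angle), sin(angle)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in points]

# Hypothetical facial landmarks (for example two eye corners and a nose tip), in pixels.
landmarks = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
moved = similarity_transform(landmarks, scale=2.5, angle=0.7, tx=30.0, ty=-12.0)

original_features = length_ratios(landmarks)
moved_features = length_ratios(moved)
```

Because every distance is multiplied by the same scale factor and rotations preserve distances, the two feature lists agree up to floating-point error.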

Relevance:

60.00%

Publisher:

Abstract:

Bottom-hinged oscillating wave surge converters are known to be an efficient method of extracting power from ocean waves. The present work deals with experimental and numerical studies of wave interactions with an oscillating wave surge converter. It focuses on two aspects: (1) viscous effects on device performance under normal operating conditions; and (2) effects of slamming on device survivability under extreme conditions. Part I deals with the viscous effects, while the extreme sea conditions will be presented in Part II. The numerical simulations are performed using the commercial CFD package ANSYS FLUENT. The comparison between numerical results and experimental measurements shows excellent agreement in terms of capturing local features of the flow as well as the dynamics of the device. A series of simulations is conducted with various wave conditions, flap configurations and model scales to investigate the viscous and scaling effects on the device. It is found that the diffraction/radiation effects dominate the device motion and that the viscous effects are negligible for wide flaps.