10 results for Sift

in Deakin Research Online - Australia


Relevance:

20.00%

Publisher:

Abstract:

Recognizing human actions efficiently and effectively in videos captured by modern cameras is a challenge in real applications. Traditional methods that rely on professional analysts face a bottleneck because of their shortcomings. To overcome this disadvantage, methods based on computer vision techniques, requiring little or no human intervention, have been proposed to analyse human actions in videos automatically. This paper provides a method combining the three-dimensional Scale Invariant Feature Transform (SIFT) detector and the Latent Dirichlet Allocation (LDA) model for human motion analysis. To represent videos effectively and robustly, we extract the 3D SIFT descriptor around each interest point, which is sampled densely from 3D space-time video volumes. After obtaining the representation of each video frame, the LDA model is adopted to discover the underlying structure, that is, the categorization of human actions in the collection of videos. Publicly available standard datasets are used to test our method. The concluding part discusses the research challenges and future directions.

Relevance:

20.00%

Publisher:

Abstract:

The paper presents the Visual Mouse (VM), a novel and simple system for interaction with displays via hand gestures. Our method includes detecting bare hands using the fast SIFT (Scale-Invariant Feature Transform) algorithm, which avoids the long training time of the AdaBoost algorithm; tracking hands based on the CAMShift algorithm; recognizing hand gestures against cluttered backgrounds via Principal Component Analysis (PCA) without extracting a clear-cut hand contour; and defining simple and robustly interpretable vocabularies of hand gestures, which are subsequently used to control a computer mouse. The system provides a fast and simple interaction experience without the need for more expensive hardware and software.

Relevance:

10.00%

Publisher:

Abstract:

The blue glass is always the hardest to find. On the beach you catch the waves bringing back the glass from forgotten tossed bottles, frosted green, clear, or mottled pale brown. But the blue glass - that's the real thing. I search for days without finding any. Sometimes there are slivers; other days, small chunks. Like a beachcomber, I comb the sands for it. I take the glass home and make some into jewellery and touchstones for people to hang on to; pour essential oils on others so the scents waft heavenward and meld together with the glass to form a bond. Words are like that. They can fuse with each other and ignite, or just quietly combine. On sunny days, I take my books with me to the beach. I toss words back and forth in my mind, like churning waves. I cobble them together. A phrase here. A sentence there. The water. The sun. The sand. The glass. The words. The paper. The Connection. I find myself enveloped in it all. The glass is from bottles tossed into the surf by unthinking people - picnickers, vacationers, those who don't have to return here and live with the remnants of their actions. Over time, the broken glass is ground and moulded by the action of the waves; the sharp edges are softened and etched by the sand and water. The sea glass is washed up on shore and picked up by beachcombers. Some recycle it for other uses like me; others just keep it as a reminder of a day at the beach. The words I sift through as I sit on the sand are measured in the sea glass. I pick each word up and look through it to see how much light shines through. What use do I have for it? A poem? An essay? A fragment of a sentence, for something to be said in the future? I watch the sun rest uneasily on its bed of water and slide slowly, farther down. I know the hot summer is coming to a close and I am loath to let go of the closeness I feel with nature. I live to find the blue glass, and sometimes it just happens.
My search for Indian migrant women was like my quest for the blue glass. It was not an easy task. It became a process of rummaging through other people's lives, searching for fragments and relics. Eventually I was able to fit pieces together to form a mosaic of their lives in that other time, that other place. And also in this present time, in this place they now call home, Australia.

Relevance:

10.00%

Publisher:

Abstract:

Automatic human action recognition has been a challenging issue in the field of machine vision. Some high-level features such as SIFT, although they show promising performance for action recognition, are somewhat computationally complex. To deal with this problem, we construct features based on the Distance Transform of body contours, which is relatively simple and computationally efficient, to represent human action in the video. After extracting the features from videos, we adopt the Conditional Random Field for modeling the temporal action sequences. The proposed method is tested with an available standard dataset. We also test the robustness of our method under various realistic conditions, such as body occlusion or intersection.
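
As a rough illustration of the kind of feature this abstract mentions, the sketch below computes a distance transform of a binary contour image: each cell receives its distance to the nearest contour cell. The function name `distance_transform`, the list-of-lists grid encoding, and the city-block metric are assumptions made for illustration; the abstract does not say which metric or implementation the authors use.

```python
from collections import deque

def distance_transform(mask):
    """Return the city-block distance from every cell to the nearest contour cell.

    `mask` is a list of rows; a truthy value marks a contour cell.
    Cells stay None if the mask contains no contour cell at all.
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every contour cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search over the 4-connected grid.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

# Hypothetical toy contour: two contour cells in the middle row.
contour = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
dt = distance_transform(contour)
```

The resulting grid could then be flattened or pooled into a per-frame feature vector before any temporal model is applied; how the authors do this is not stated in the abstract.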

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we present a novel person detection system for public transport buses tackling the problem of changing illumination conditions. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modeling mechanism with a human shape model into a weighted Bayesian framework to detect passengers on board buses. SIFT background modeling extracts local stable features on the pre-annotated background seat areas and tracks these features over time to build a global statistical background model for each seat. Since SIFT features are partially invariant to lighting, this background model can be used robustly to detect the seat occupancy status even under severe lighting changes. The human shape model further confirms the existence of a passenger when a seat is occupied. This constructs a robust passenger monitoring system which is resilient to illumination changes. We evaluate the performance of our proposed system on a number of challenging video datasets obtained from bus cameras, and the experimental results show that it is superior to state-of-the-art people detection systems.

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we present a system for pedestrian detection involving scenes captured by mobile bus surveillance cameras in busy city streets. Our approach integrates scene localization, foreground and background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarities, and the second stage further clusters these aligned frames in terms of lighting. This produces clusters of images that are differentiated by viewpoint and lighting. A kernel density estimation (KDE) method for colour and gradient foreground-background separation is then used to construct a background model for each image cluster, which is subsequently used to detect all foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be identified. We have tested our system on a set of real bus video datasets, and the experimental results verify that our system works well in practice.
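
To make the kernel density estimation step in this abstract more concrete, here is a minimal sketch for a single pixel and a single intensity channel. It assumes a Gaussian kernel, and the names `kde_density` and `is_foreground`, the bandwidth, and the threshold are all hypothetical choices; the abstract mentions colour and gradient cues but gives no parameters.

```python
import math

def kde_density(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    )

def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag a pixel as foreground when it is unlikely under the background model."""
    return kde_density(value, samples, bandwidth) < threshold

# Hypothetical intensities observed at one pixel across the frames of a cluster.
background_samples = [100, 102, 98, 101, 99]

close_to_background = is_foreground(100, background_samples)
far_from_background = is_foreground(200, background_samples)
```

In this toy case an intensity of 100 sits inside the observed background range and is kept as background, while 200 has negligible density under the model and is flagged as foreground.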

Relevance:

10.00%

Publisher:

Abstract:

This paper describes our first attempt at tackling a pilot task in TRECVID: video summarization of rushes data [3]. Our method is based on the tight clustering produced via SIFT matching. In this first attempt, we try to examine how our approach performs without complex implementation in terms of concept detection and excerpt assembly (i.e., no picture-in-picture, split screen and special transitions). Although we do not perform very well in terms of concept inclusion, we rank very well in terms of the summary being easy to understand and the relevance of included segments.
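
One simple way to turn pairwise match counts into clusters is sketched below. It groups frames whose number of feature matches reaches a threshold, using a union-find structure. This is an assumption-laden illustration: the function `cluster_by_matches`, the threshold, the match counts, and the single-linkage grouping rule are all hypothetical, and the abstract does not describe how the authors' "tight clustering" is actually computed.

```python
def cluster_by_matches(n_frames, match_counts, min_matches):
    """Group frame indices linked by at least `min_matches` feature matches."""
    parent = list(range(n_frames))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), count in match_counts.items():
        if count >= min_matches:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for i in range(n_frames):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical match counts between pairs of keyframes.
match_counts = {(0, 1): 40, (1, 2): 35, (2, 3): 3, (3, 4): 50}
clusters = cluster_by_matches(5, match_counts, min_matches=20)
```

With these made-up counts, frames 0 to 2 form one group and frames 3 and 4 another, because the weak link between frames 2 and 3 falls below the threshold.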

Relevance:

10.00%

Publisher:

Abstract:

The identification of useful structures in home video is difficult because this class of video is distinguished from other video sources by its unrestricted, unedited content and the absence of a regulated storyline. In addition, home videos contain a lot of motion and erratic camera movements, with shots of the same character being captured from various angles and viewpoints. In this paper, we present a solution to the challenging problem of clustering shots and faces in home videos, based on the use of SIFT features. SIFT features have been known to be robust for object recognition; however, in dealing with the complexities of the home video setting, the matching process needs to be augmented and adapted. This paper describes various techniques that can improve the number of matches returned as well as the correctness of matches. For example, existing methods for verification of matches are inadequate for cases when only a small number of matches is returned, a common situation in home videos. We address this by constructing a robust classifier that works on matching sets instead of individual matches, allowing the exploitation of the geometric constraints between matches. Finally, we propose techniques for robustly extracting target clusters from individual feature matches.

Relevance:

10.00%

Publisher:

Abstract:

The problem of 3D object recognition is of immense practical importance, with the last decade witnessing a number of breakthroughs in the state of the art. Most of the previous work has focused on the matching of textured objects using local appearance descriptors extracted around salient image points. The recently proposed bag of boundaries method was the first to address directly the problem of matching smooth objects using boundary features. However, no previous work has attempted to achieve a holistic treatment of the problem by jointly using textural and shape features, which is what we describe herein. Due to the complementarity of the two modalities, we fuse the corresponding matching scores and learn their relative weighting in a data-specific manner by optimizing discriminative performance on synthetically distorted data. For the textural description of an object we adopt a representation in the form of a histogram of SIFT-based visual words. Similarly, the apparent shape of an object is represented by a histogram of discretized features capturing local shape. On a large public database of a diverse set of objects, the proposed method is shown to outperform significantly both purely textural and purely shape-based approaches for matching across viewpoint variation.
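
The fusion of textural and shape matching scores described in this abstract can be sketched as a weighted sum of two histogram similarities. The example assumes histogram intersection as the similarity measure and a fixed weight `w`; both are illustrative choices, and the function names are hypothetical. The abstract states that the weighting is learned from synthetically distorted data, which this sketch does not attempt.

```python
def normalise(hist):
    """Scale a histogram so its bins sum to 1 (left unchanged if it is empty)."""
    total = float(sum(hist))
    return [x / total for x in hist] if total else list(hist)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two L1-normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def fused_score(texture_a, texture_b, shape_a, shape_b, w):
    """Weighted combination of texture and shape similarity, with 0 <= w <= 1."""
    texture_sim = histogram_intersection(normalise(texture_a), normalise(texture_b))
    shape_sim = histogram_intersection(normalise(shape_a), normalise(shape_b))
    return w * texture_sim + (1.0 - w) * shape_sim

# Hypothetical two-bin histograms: textures partly agree, shapes agree fully.
score = fused_score([2, 2], [4, 0], [1, 3], [1, 3], w=0.5)
```

Here the texture similarity is 0.5 and the shape similarity is 1.0, so an equal weighting gives a fused score of 0.75; shifting `w` towards 1 would make the result depend more on texture.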

Relevance:

10.00%

Publisher:

Abstract:

The analysis of human crowds has widespread uses from law enforcement to urban engineering and traffic management. All of these require a crowd to first be detected, which is the problem addressed in this paper. Given an image, the algorithm we propose segments it into crowd and non-crowd regions. The main idea is to capture two key properties of crowds: (i) on a narrow scale, a crowd's basic element should look like a human (only weakly so, due to low resolution, occlusion, clothing variation etc.), while (ii) on a larger scale, a crowd inherently contains repetitive appearance elements. Our method exploits this by building a pyramid of sliding windows and quantifying how “crowd-like” each level of the pyramid is using an underlying statistical model based on quantized SIFT features. The two aforementioned crowd properties are captured by the resulting feature vector of window responses, describing the degree of crowd-like appearance around an image location as the surrounding spatial extent is increased.
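
The idea of describing a location by window responses at increasing spatial extent can be sketched as follows. The per-cell `scores` grid is a hypothetical stand-in for the output of the statistical model over quantized SIFT features, which is not specified in the abstract; the mean pooling, the function names, and the window sizes are likewise assumptions.

```python
def window_mean(scores, row, col, half):
    """Mean score inside a square window centred on (row, col), clipped to the grid."""
    rows, cols = len(scores), len(scores[0])
    r0, r1 = max(0, row - half), min(rows, row + half + 1)
    c0, c1 = max(0, col - half), min(cols, col + half + 1)
    cells = [scores[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(cells) / len(cells)

def multiscale_responses(scores, row, col, half_widths=(0, 1, 2)):
    """Feature vector of window responses as the window around (row, col) grows."""
    return [window_mean(scores, row, col, h) for h in half_widths]

# Hypothetical "crowd-likeness" scores: a 3x3 crowd patch on an empty background.
scores = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
features = multiscale_responses(scores, 2, 2)
```

For the centre cell the response stays at 1.0 for the two smallest windows and drops to 0.36 once the window covers the empty border, so the vector encodes how far the crowd-like region extends around that location.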