1000 resultados para video indexing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

To sustain an ongoing rapid growth of video information, there is an emerging demand for a sophisticated content-based video indexing system. However, current video indexing solutions are still immature and lack of any standard. This doctoral consists of a research work based on an integrated multi-modal approach for sports video indexing and retrieval. By combining specific features extractable from multiple audio-visual modalities, generic structure and specific events can be detected and classified. During browsing and retrieval, users will benefit from the integration of high-level semantic and some descriptive mid-level features such as whistle and close-up view of player(s).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To sustain an ongoing rapid growth of video information, there is an emerging demand for a sophisticated content-based video indexing system. However, current video indexing solutions are still immature and lack of any standard. This doctoral consists of a research work based on an integrated multi-modal approach for sports video indexing and retrieval. By combining specific features extractable from multiple audio-visual modalities, generic structure and specific events can be detected and classified. During browsing and retrieval, users will benefit from the integration of high-level semantic and some descriptive mid-level features such as whistle and close-up view of player(s).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis presents a research work based on an integrated multi-modal approach for sports video indexing and retrieval. By combining specific features extractable from multiple (audio-visual) modalities, generic structure and specific events can be detected and classified. During browsing and retrieval, users will benefit from the integration of high-level semantic and some descriptive mid-level features such as whistle and close-up view of player(s). The main objective is to contribute to the three major components of sports video indexing systems. The first component is a set of powerful techniques to extract audio-visual features and semantic contents automatically. The main purposes are to reduce manual annotations and to summarize the lengthy contents into a compact, meaningful and more enjoyable presentation. The second component is an expressive and flexible indexing technique that supports gradual index construction. Indexing scheme is essential to determine the methods by which users can access a video database. The third and last component is a query language that can generate dynamic video summaries for smart browsing and support user-oriented retrievals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While the largest common subgraph (LCSG) between a query and a database of models can provide an elegant and intuitive measure of similarity for many applications, it is computationally expensive to compute. Recently developed algorithms for subgraph isomorphism detection take advantage of prior knowledge of a database of models to improve the speed of on-line matching. This paper presents a new algorithm based on similar principles to solve the largest common subgraph problem. The new algorithm significantly reduces the computational complexity of detection of the LCSG between a known database of models, and a query given on-line.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent approaches to video indexing and retrieval are either pixel-oriented or object-oriented. While the former approaches focus on motion and changes thereto, the latter focus on spatial relations among objects in the scene. In this paper, a spatial knowledge representation technique combining both approaches is proposed. This representation supplements the spatial knowledge of visual objects with information about their pixel positions in the video frame. It provides a practical way to construct video indices, enabling searching for and retrieval of video sequences that contain motion as well as sparsely disjoint objects

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many tasks in computer vision can be expressed as graph problems. This allows the task to be solved using a well studied algorithm, however many of these algorithms are of exponential complexity. This is a disadvantage when considered in the context of searching a database of images or videos for similarity. Work by Mesaner and Bunke (1995) has suggested a new class of graph matching algorithms which uses a priori knowledge about a database of models to reduce the time taken during online classification. This paper presents a new algorithm which extends the earlier work to detection of the largest common subgraph.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining its importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

With the availability of a huge amount of video data on various sources, efficient video retrieval tools are increasingly in demand. Video being a multi-modal data, the perceptions of ``relevance'' between the user provided query video (in case of Query-By-Example type of video search) and retrieved video clips are subjective in nature. We present an efficient video retrieval method that takes user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple, but powerful feature weight optimization method based on Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built for demonstrating the performance of the proposed method. The query and database videos are indexed using the conventional video features like color, texture, etc. However, we use the comprehensive and novel methods of feature representations, and a spatio-temporal distance measure to retrieve the top M videos that are similar to the query. In feedback phase, the user activated iterative on the previously retrieved videos is used to reformulate the QFV weights (measure of importance) that reflect the user's preference, automatically. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA based RF for user-oriented feature weights optimization makes the proposed method to be distinct from the existing ones. The experimental results show that the proposed RF based video retrieval exhibit good performance.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Sport video data is growing rapidly as a result of the maturing digital technologies that support digital video capture, faster data processing, and large storage. However, (1) semi-automatic content extraction and annotation, (2) scalable indexing model, and (3) effective retrieval and browsing, still pose the most challenging problems for maximizing the usage of large video databases. This article will present the findings from a comprehensive work that proposes a scalable and extensible sports video retrieval system with two major contributions in the area of sports video indexing and retrieval. The first contribution is a new sports video indexing model that utilizes semi-schema-based indexing scheme on top of an Object-Relationship approach. This indexing model is scalable and extensible as it enables gradual index construction which is supported by ongoing development of future content extraction algorithms. The second contribution is a set of novel queries which are based on XQuery to generate dynamic and user-oriented summaries and event structures. The proposed sports video retrieval system has been fully implemented and populated with soccer, tennis, swimming, and diving video. The system has been evaluated against 20 users to demonstrate and confirm its feasibility and benefits. The experimental sports genres were specifically selected to represent the four main categories of sports domain: period-, set-point-, time (race)-, and performance-based sports. Thus, the proposed system should be generic and robust for all types of sports.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We propose a simple technique for extracting camera motion parameters from a sequence of images. The method can estimate qualitatively camera pan, tilt, zoom, roll, and horizontal and vertical tracking. Unlike most other comparable techniques, the present method can distinguish pan from horizontal tracking, and tilt from vertical tracking. The technique can be applied to the automated indexing of video and film sequences.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Most current work on video indexing concentrates on queries which operate over high level semantic information which must be entirely composed and entered manually. We propose an indexing system which is based on spatial information about key objects in a scene. These key objects may be detected automatically, with manual supervision, and tracked through a sequence using one of a number of recently developed techniques. This representation is highly compact and allows rapid resolution of queries specified by iconic example. A number of systems have been produced which use 2D string notations to index digital image libraries. Just as 2D strings provide a compact and tractable indexing notation for digital pictures, a sequence of 2D strings might provide an index for a video or image sequence. To improve further upon this we reduce the representation to the 2D string pair representing the initial frame, and a sequence of edits to these strings. This takes advantage of the continuity between frames to further reduce the size of the notation. By representing video sequences using string edits, a notation has been developed which is compact, and allows querying on the spatial relationships of objects to be performed without rebuilding the majority of the scene. Calculating ranks of objects directly from the edit sequence allows matching with minimal calculation, thus greatly reducing search time. This paper presents the edit sequence notation and algorithms for evaluating queries over image sequences. A number of optimizations which represent a considerably saving in search time is demonstrated in the paper.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The increasing use of video editing software has resulted in a necessity for faster and more efficient editing tools. Here, we propose a lightweight high-quality video indexing tool that is suitable for video editing software.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Visual tracking is an important task in various computer vision applications including visual surveillance, human computer interaction, event detection, video indexing and retrieval. Recent state of the art sparse representation (SR) based trackers show better robustness than many of the other existing trackers. One of the issues with these SR trackers is low execution speed. The particle filter framework is one of the major aspects responsible for slow execution, and is common to most of the existing SR trackers. In this paper,(1) we propose a robust interest point based tracker in l(1) minimization framework that runs at real-time with performance comparable to the state of the art trackers. In the proposed tracker, the target dictionary is obtained from the patches around target interest points. Next, the interest points from the candidate window of the current frame are obtained. The correspondence between target and candidate points is obtained via solving the proposed l(1) minimization problem. In order to prune the noisy matches, a robust matching criterion is proposed, where only the reliable candidate points that mutually match with target and candidate dictionary elements are considered for tracking. The object is localized by measuring the displacement of these interest points. The reliable candidate patches are used for updating the target dictionary. The performance and accuracy of the proposed tracker is benchmarked with several complex video sequences. The tracker is found to be considerably fast as compared to the reported state of the art trackers. The proposed tracker is further evaluated for various local patch sizes, number of interest points and regularization parameters. The performance of the tracker for various challenges including illumination change, occlusion, and background clutter has been quantified with a benchmark dataset containing 50 videos. (C) 2014 Elsevier B.V. All rights reserved.