994 resultados para Multimedia retrieval


Relevância:

40.00% 40.00%

Publicador:

Resumo:

The utilization of massive multimedia documents collections, such as multimedia documents in the global Internet, needs search engines which can rank using both text and image evidence. Massive size and (dynamic) nature of collection can make manual indexing prohibitively expensive in such situations. Traditional search engines utilize only text components of multimedia documents. But there are information needs, which require the utilization of image evidence. In this paper, we investigate image-feature for large and heterogeneous collections. Both the nature and complexities of information needs are key elements for an effective retrieval. Retrieval needs that depend on perceptual similarities (as found in art galleries, building architecture) require the utilization of visual cues. In such situations, the retrieval of multimedia document based on image ranking can provide higher effectiveness. Experimental results show that effectiveness of ranking based on image feature can be higher where perceptual similarities are key elements for retrieval than the retrieval effectiveness of algorithms based on text ranking algorithms

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we compare ranking effectiveness of heterogeneous multimedia document retrieval when different image organizations are used for formulating queries. The quality of image queries depends on the organization of images used to make queries which in turn significantly impacts retrieval precision. CBIR (content based information retrieval) needs an effective and efficient organization of images including user interface which must be part of the configuration parameters of image retrieval research.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of the multimedia data, the Multimedia Database management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of a MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to address the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Searching for multimedia is an important activity for users of Web search engines. Studying user's interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Weblog analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) Since 1997, image search consistently is the dominant media type searched followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for design of Web multimedia search engines are discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mobile devices are becoming indispensable personal assistants in people's daily life as these devices support work, study, play and socializing activities. The multi-modal sensors and rich features of smartphones can capture abundant information about users' life experience, such as taking photos or videos on what they see and hear, and organizing their tasks and activities using calendar, to-do lists, and notes. Such vast information can become useful to help users recalling episodic memories and reminisce about meaningful experiences. In this paper, we propose to apply autobiographical memory framework to provide an effective mechanism to structure mobile life-log data. The proposed model is an attempt towards a more complete personal life-log indexing model, which will support long term capture, organization, and retrieval. To demonstrate the benefits of the proposed model, we propose some design solutions for enabling users-driven capture, annotation, and retrieval of autobiographical multimedia chronicles tools.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we describe the approaches adopted to generate the runs submitted to ImageCLEFPhoto 2009 with an aim to promote document diversity in the rankings. Four of our runs are text based approaches that employ textual statistics extracted from the captions of images, i.e. MMR [1] as a state of the art method for result diversification, two approaches that combine relevance information and clustering techniques, and an instantiation of Quantum Probability Ranking Principle. The fifth run exploits visual features of the provided images to re-rank the initial results by means of Factor Analysis. The results reveal that our methods based on only text captions consistently improve the performance of the respective baselines, while the approach that combines visual features with textual statistics shows lower levels of improvements.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote the species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using short-time Fourier transform, then a ridge detection method is applied to the spectrogram for detecting points of interest. Based on the detected points, a new region representation are explored for describing various bird vocalisations and a local descriptor including temporal entropy, frequency bin entropy and histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experiment results show that our proposed feature set can achieve 0.71 in term of retrieval success rate which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the availability of a huge amount of video data on various sources, efficient video retrieval tools are increasingly in demand. Video being a multi-modal data, the perceptions of ``relevance'' between the user provided query video (in case of Query-By-Example type of video search) and retrieved video clips are subjective in nature. We present an efficient video retrieval method that takes user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple, but powerful feature weight optimization method based on Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built for demonstrating the performance of the proposed method. The query and database videos are indexed using the conventional video features like color, texture, etc. However, we use the comprehensive and novel methods of feature representations, and a spatio-temporal distance measure to retrieve the top M videos that are similar to the query. In feedback phase, the user activated iterative on the previously retrieved videos is used to reformulate the QFV weights (measure of importance) that reflect the user's preference, automatically. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA based RF for user-oriented feature weights optimization makes the proposed method to be distinct from the existing ones. The experimental results show that the proposed RF based video retrieval exhibit good performance.

Relevância:

30.00% 30.00%

Publicador: