882 resultados para Image retrieval


Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper a new method for image retrieval using high level color semantic features is proposed. It is based on extraction of low level color characteristics and their conversion into high level semantic features using Johannes Itten theory of color, Dempster-Shafer theory of evidence and fuzzy production rules.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this chapter we provide a comprehensive overview of the emerging field of visualising and browsing image databases. We start with a brief introduction to content-based image retrieval and the traditional query-by-example search paradigm that many retrieval systems employ. We specify the problems associated with this type of interface, such as users not being able to formulate a query due to not having a target image or concept in mind. The idea of browsing systems is then introduced as a means to combat these issues, harnessing the cognitive power of the human mind in order to speed up image retrieval.We detail common methods in which the often high-dimensional feature data extracted from images can be used to visualise image databases in an intuitive way. Systems using dimensionality reduction techniques, such as multi-dimensional scaling, are reviewed along with those that cluster images using either divisive or agglomerative techniques as well as graph-based visualisations. While visualisation of an image collection is useful for providing an overview of the contained images, it forms only part of an image database navigation system. We therefore also present various methods provided by these systems to allow for interactive browsing of these datasets. A further area we explore are user studies of systems and visualisations where we look at the different evaluations undertaken in order to test usability and compare systems, and highlight the key findings from these studies. We conclude the chapter with several recommendations for future work in this area. © 2011 Springer-Verlag Berlin Heidelberg.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^

Relevância:

70.00% 70.00%

Publicador:

Resumo:

With the rise of smart phones, lifelogging devices (e.g. Google Glass) and popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their life online producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation making the process of building retrieval systems difficult as one must firstly understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos, however, these techniques are laborious and as a result have been poorly adopted; Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with < 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a) where one attempts to bridge the semantic gap between an image’s appearance and meaning e.g. the objects present. Despite two decades of research the semantic gap still largely exists and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images e.g. that NY and timessquare co-exist in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined often resulting in poor PTR accuracy e.g. does NY refer to New York or New Year? This thesis proposes the exploitation of an image’s context which, unlike textual evidences, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidences in various photo tag recommendation and retrieval scenarios. In part II, we combine text, content-based (e.g. # of faces present) and contextual (e.g. day-of-the-week taken) signals for tag recommendation purposes, achieving up to a 75% improvement to precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to (slower moving) Flickr in order to build recommendation models on, showing that similar accuracy could be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a rank of relevant and diverse images for various news events by (i) removing irrelevant images such memes and visual duplicates (ii) before semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from images, we are able to predict if it will become “popular” (or not) with 74% accuracy, using an SVM classifier. Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis show an important step towards building effective image retrieval models when there lacks sufficient textual content (i.e. a cold start).

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dissertação de Mestrado, Processamento de Linguagem Natural e Indústrias da Língua, Faculdade de Ciências Humanas e Sociais, Universidade do Algarve, 2014

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Conceptual interpretation of languages has gathered peak interest in the world of artificial intelligence. The challenge in modeling various complications involved in a language is the main motivation behind our work. Our main focus in this work is to develop conceptual graphical representation for image captions. We have used discourse representation structure to gain semantic information which is further modeled into a graphical structure. The effectiveness of the model is evaluated by a caption based image retrieval system. The image retrieval is performed by computing subgraph based similarity measures. Best retrievals were given an average rating of . ± . out of 4 by a group of 25 human judges. The experiments were performed on a subset of the SBU Captioned Photo Dataset. This purpose of this work is to establish the cognitive sensibility of the approach to caption representations

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Conceptual interpretation of languages has gathered peak interest in the world of artificial intelligence. The challenge in modeling various complications involved in a language is the main motivation behind our work. Our main focus in this work is to develop conceptual graphical representation for image captions. We have used discourse representation structure to gain semantic information which is further modeled into a graphical structure. The effectiveness of the model is evaluated by a caption based image retrieval system. The image retrieval is performed by computing subgraph based similarity measures. Best retrievals were given an average rating of . ± . out of 4 by a group of 25 human judges. The experiments were performed on a subset of the SBU Captioned Photo Dataset. This purpose of this work is to establish the cognitive sensibility of the approach to caption representations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

While multimedia data, image data in particular, is an integral part of most websites and web documents, our quest for information so far is still restricted to text based search. To explore the World Wide Web more effectively, especially its rich repository of truly multimedia information, we are facing a number of challenging problems. Firstly, we face the ambiguous and highly subjective nature of defining image semantics and similarity. Secondly, multimedia data could come from highly diversified sources, as a result of automatic image capturing and generation processes. Finally, multimedia information exists in decentralised sources over the Web, making it difficult to use conventional content-based image retrieval (CBIR) techniques for effective and efficient search. In this special issue, we present a collection of five papers on visual and multimedia information management and retrieval topics, addressing some aspects of these challenges. These papers have been selected from the conference proceedings (Kluwer Academic Publishers, ISBN: 1-4020- 7060-8) of the Sixth IFIP 2.6 Working Conference on Visual Database Systems (VDB6), held in Brisbane, Australia, on 29–31 May 2002.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Relevant past events can be remembered when visualizing related pictures. The main difficulty is how to find these photos in a large personal collection. Query definition and image annotation are key issues to overcome this problem. The former is relevant due to the diversity of the clues provided by our memory when recovering a past moment and the later because images need to be annotated with information regarding those clues to be retrieved. Consequently, tools to recover past memories should deal carefully with these two tasks. This paper describes a user interface designed to explore pictures from personal memories. Users can query the media collection in several ways and for this reason an iconic visual language to define queries is proposed. Automatic and semi-automatic annotation is also performed using the image content and the audio information obtained when users show their images to others. The paper also presents the user interface evaluation based on tests with 58 participants.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to make an efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings up to 35% without any impact in the performance efficiency of the image retrieval task. © 2014 EURASIP.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Existe actualmente um crescente desenvolvimento de sistemas de armazenamento e pesquisa de imagens. Uma aproximação adoptada nesses sistemas é a recuperação de imagens baseada em conteúdo (CBIR, Content-Based Image Retrieval). No âmbito destas aplicações existem utilizadores que pretendem utilizar imagens clip art para os seus trabalhos e apresentações. Existem muitas imagens clip art espalhadas por diversas bases de dados em sítios na Internet ou em colecções vendidas em dispositivos ópticos. A pesquisa de imagens nestas bases de dados leva os utilizadores a percorrem várias listas de imagens manualmente ou por métodos de pesquisa por texto, muitas vezes ineficientes. Essas bases de dados de clip arts são representadas por imagens vectoriais e imagens raster. Existem várias tecnologias de pesquisa e recuperação de ambos os tipos de imagens clip art, raster e vectoriais, contudo, a investigação tem sido realizada em separado sem retirar partido das duas áreas de investigação em conjunto, no problema de recuperar e explorar colecções de clip arts. O objectivo deste trabalho é implementar um motor de busca para encontrar clip arts em base de dados compostas por imagens vectoriais e imagens raster. O trabalho envolve um conversor de imagens raster em vectoriais, a extracção de características das imagens raster e vectoriais e a avaliação do sistema de recuperação de clip arts.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Shape model, deformable models, structural models, biometry, content based image retrieval, sketches

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent ";topics"; using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Shape complexity has recently received attention from different fields, such as computer vision and psychology. In this paper, integral geometry and information theory tools are applied to quantify the shape complexity from two different perspectives: from the inside of the object, we evaluate its degree of structure or correlation between its surfaces (inner complexity), and from the outside, we compute its degree of interaction with the circumscribing sphere (outer complexity). Our shape complexity measures are based on the following two facts: uniformly distributed global lines crossing an object define a continuous information channel and the continuous mutual information of this channel is independent of the object discretisation and invariant to translations, rotations, and changes of scale. The measures introduced in this paper can be potentially used as shape descriptors for object recognition, image retrieval, object localisation, tumour analysis, and protein docking, among others