871 results for Content-Base Image Retrieval
Abstract:
The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encode high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries, and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases to which new images are added frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes through an imbalance penalty to obtain higher-quality binary codes. We learn hash functions with an efficient algorithm in which the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications such as adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions.
Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
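To make the efficiency argument concrete, the following is a minimal sketch of how retrieval over compact binary codes can work once hash functions have produced them. It is not the thesis's learning procedure: the codes are hard-coded, and the names `hamming_distance` and `hamming_search` are hypothetical helpers introduced only for illustration.

```python
def hamming_distance(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def hamming_search(query, database, k=3):
    """Return the indices of the k database codes closest to the query."""
    order = sorted(range(len(database)),
                   key=lambda i: hamming_distance(query, database[i]))
    return order[:k]


# Assumption: 8-bit codes already produced by some hash function.
codes = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
nearest = hamming_search(0b10110110, codes, k=2)
print(nearest)
```

Each comparison is one XOR and a bit count, which is why binary codes are cheap to store and to search compared with high-dimensional real-valued descriptors.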
Abstract:
Multimedia mining primarily involves information analysis and retrieval based on implicit knowledge. The ever-increasing number of digital image databases on the Internet has created a need to apply multimedia mining to these databases for effective and efficient retrieval of images. The contents of an image can be expressed through different features such as Shape, Texture and Intensity-distribution (STI). Content Based Image Retrieval (CBIR) is the efficient retrieval of relevant images from large databases based on features extracted from the image. Most existing systems concentrate either on a single representation of all features or on a linear combination of these features. The paper proposes a CBIR system named STIRF (Shape, Texture, Intensity-distribution with Relevance Feedback) that uses a neural network for nonlinear combination of the heterogeneous STI features. Further, the system adapts itself to different applications and users based on relevance feedback. Prior to retrieval of relevant images, each feature is first clustered independently of the others in its own space, which helps in matching similar images. Testing the system on a database of images with varied contents and intensive backgrounds showed good results, with the most relevant images being retrieved for an image query. The system showed better and more robust performance compared to existing CBIR systems.
Abstract:
142 p.
Abstract:
Grey Level Co-occurrence Matrices (GLCM) are one of the earliest techniques used for image texture analysis. In this paper we define a new feature, called trace, extracted from the GLCM, and discuss its implications for texture analysis in the context of Content Based Image Retrieval (CBIR). The theoretical extension of the GLCM to n-dimensional gray scale images is also discussed. The results indicate that trace features outperform Haralick features when applied to CBIR.
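The abstract does not give the exact definition of the trace feature, so the sketch below rests on an assumption: it takes "trace" to mean the ordinary matrix trace (sum of the diagonal) of a normalized GLCM, that is, the fraction of pixel pairs at a given offset that share the same gray level. The function names `glcm` and `glcm_trace` are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of gray levels at offset (dx, dy)."""
    matrix = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[v / total for v in row] for row in matrix]


def glcm_trace(matrix):
    """Assumed definition: sum of the diagonal of the normalized GLCM."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Toy 4x4 image with 4 gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)      # horizontal neighbor pairs
trace = glcm_trace(p)
print(round(trace, 4))
```

Under this assumed definition, a smooth texture yields a trace close to 1 and a busy texture yields a value close to 0.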
Abstract:
This paper reports a novel region-based shape descriptor based on orthogonal Legendre moments. The preprocessing steps for invariance improvement of the proposed Improved Legendre Moment Descriptor (ILMD) are discussed. The performance of the ILMD is compared to the MPEG-7 approved region shape descriptor, the angular radial transformation descriptor (ARTD), and the widely used Zernike moment descriptor (ZMD). Set B of the MPEG-7 CE-1 contour database and all the datasets of the MPEG-7 CE-2 region database were used for experimental validation. The average normalized modified retrieval rate (ANMRR) and the precision-recall pair were employed for benchmarking the performance of the candidate descriptors. The ILMD has lower ANMRR values than ARTD for most of the datasets, and ARTD has a lower value than ZMD. This indicates that the overall performance of the ILMD is better than that of ARTD and ZMD. This result is confirmed by the precision-recall test, where ILMD was found to have better precision rates for most of the datasets tested. Besides retrieval accuracy, ILMD is more compact than ARTD and ZMD. The proposed descriptor is useful as a generic shape descriptor for content-based image retrieval (CBIR) applications.
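As a reminder of what the precision-recall pair measures, here is a minimal sketch for a single ranked result list. It does not implement ANMRR, and the identifiers, the cutoff `k` and the function name `precision_recall` are illustrative assumptions rather than details from the paper.

```python
def precision_recall(ranked_ids, relevant_ids, k):
    """Precision and recall over the first k items of a ranked result list."""
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    precision = hits / k                 # fraction of retrieved items that are relevant
    recall = hits / len(relevant_ids)    # fraction of relevant items that were retrieved
    return precision, recall


# Hypothetical shape identifiers: a ranking and its ground-truth relevant set.
ranked = ["s3", "s7", "s1", "s9", "s4"]
relevant = {"s3", "s1", "s8"}
p, r = precision_recall(ranked, relevant, k=4)
print(p, r)
```

Sweeping `k` from 1 to the length of the ranking gives the points of a precision-recall curve for one query; benchmark curves average these over many queries.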
Abstract:
With the rapid growth of databases of various types (text, multimedia, etc.), there is a need for methods to order, access and retrieve data simply and quickly. Image databases, in addition to these needs, require a representation of the images that takes the characteristics of their semantic content into account. Accordingly, several proposals have been made, such as retrieval based on textual annotations. In the annotation approach, retrieval is based on comparing the textual description that a user gives of an image with the descriptions of the images stored in the database. Among its drawbacks, the textual description depends heavily on the observer, and describing all the images in the database requires considerable computational effort. Another approach is content-based image retrieval (CBIR), where each image is represented by low-level features such as color, shape and texture. Results in the area of CBIR have been very promising. However, representing image semantics with low-level features remains an open problem. New algorithms for feature extraction, as well as new indexing methods, have been proposed in the literature, but these algorithms are becoming increasingly complex. It is therefore natural to ask whether there is a relationship between semantics and the low-level features extracted from an image and, if there is, which descriptors better represent the semantics. This leads to a further question: how should descriptors be used to represent the content of images? The work presented in this thesis proposes a method to analyze the relationship between low-level descriptors and semantics in an attempt to answer these questions.
It was also observed that there are three ways of indexing images: using composed feature vectors, using parallel and independent index structures (one for each descriptor or set of descriptors), and using feature vectors arranged in sequential order. The first two have been widely studied and applied in the literature, but no record was found of the third having been explored. This thesis therefore also proposes indexing with a sequential structure of descriptors, in which the order of the descriptors is based on the relationship between each descriptor and the semantics of the users. Finally, the proposed index performed better than the traditional approaches, and experiments showed that the order in this sequence is important and that there is a direct relationship between this order and the relationship of the low-level descriptors to the semantics of the users.
Abstract:
In this paper we propose a novel method for shape analysis called HTS (Hough Transform Statistics), which uses statistics from the Hough Transform space to characterize the shape of objects in digital images. Experimental results showed that the HTS descriptor is robust and achieves better accuracy than some traditional shape description methods. Furthermore, the HTS algorithm has linear complexity, which is an important requirement for content-based image retrieval from large databases. © 2013 IEEE.
Abstract:
Graduate Program in Computer Science - IBILCE
Abstract:
With the widespread proliferation of computers, many human activities entail the use of automatic image analysis. The basic features used for image analysis include color, texture, and shape. In this paper, we propose a new shape description method, called Hough Transform Statistics (HTS), which uses statistics from the Hough space to characterize the shape of objects or regions in digital images. A modified version of this method, called Hough Transform Statistics neighborhood (HTSn), is also presented. Experiments carried out on three popular public image databases showed that the HTS and HTSn descriptors are robust, since they presented precision-recall results much better than several other well-known shape description methods. When compared to the Beam Angle Statistics (BAS) method, a shape description method that inspired their development, both the HTS and the HTSn methods presented inferior results regarding the precision-recall criterion, but superior results in the processing time and multiscale separability criteria. The linear complexity of the HTS and the HTSn algorithms, in contrast to BAS, makes them more appropriate for shape analysis in high-resolution image retrieval tasks when very large databases are used, which are very common nowadays. (C) 2014 Elsevier Inc. All rights reserved.
Abstract:
The multiple-instance learning (MIL) model has been successful in areas such as drug discovery and content-based image retrieval. Recently, this model was generalized and a corresponding kernel was introduced to learn generalized MIL concepts with a support vector machine. While this kernel enjoyed empirical success, it has limitations in its representation. We extend this kernel by enriching its representation and empirically evaluate our new kernel on data from content-based image retrieval, biological sequence analysis, and drug discovery. We found that our new kernel generalized noticeably better than the old one in content-based image retrieval and biological sequence analysis and was slightly better than or even with the old kernel in the other applications, showing that an SVM using this kernel does not overfit despite its richer representation.
Abstract:
Content-based image retrieval is still a challenging issue due to the inherent complexity of images and the choice of the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to boost accuracy in the retrieval process, but many issues, such as the introduction of pattern recognition tasks and deeper user intervention to assist the process of choosing the most discriminant features, still remain unaddressed. In this paper, we present a novel framework for CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval will be attained. Results have shown that the proposed approach outperforms existing methods, turning out to be a very attractive alternative for managing image data sets.
Abstract:
In this paper, we present a novel approach to performing similarity queries over medical images while maintaining the semantics of a given query posed by the user. Content-based image retrieval systems relying on relevance feedback techniques usually ask users to label images as relevant or irrelevant. We therefore present a highly effective strategy to survey user profiles, taking advantage of such labeling to implicitly gather the user's perceptual similarity. The profiles maintain the settings desired for each user, allowing the similarity assessment to be tuned, which includes dynamically changing the distance function employed through an interactive process. Experiments on medical images show that the method is effective and can improve the decision-making process during analysis.
Abstract:
This study aims to build an application for Android devices that lets users visit art cities and tourist sites in person through a role-playing game structured as a treasure hunt. Using the app, end users can play, create and share treasure hunts based on finding buildings, monuments, and places of artistic, historical or tourist interest; in particular, to complete each stage of a treasure hunt, the player must take a photograph of the monument or building described in the objective of that hunt. Using data from the GPS and the gyroscope (if the device has one) together with an instance recognition algorithm, the software can determine whether the photo taken is the correct answer to the stage's question. The GeoPhotoHunt application is not only a recreational tool for visiting tourist cities or, more generally, places of interest: as its original contribution, the study proposes the implementation on a mobile platform of a Content Based Image Retrieval (CBIR) system that is entirely independent of server support. Specifically, the application's server is merely a supporting tool through which members of the GeoPhotoHunt "community" can publish the treasure hunts they have created and share the scores they have obtained by taking part in a treasure hunt. In this way, once a user has downloaded the data for a treasure hunt to their smartphone, they can start the adventure even without an internet connection. The study was divided into several phases, each of which corresponds to a specific section of the thesis that follows.
First, research was carried out, mainly on the web, to identify other applications that implement the treasure hunt idea on a mobile platform, or applications that implement instance recognition algorithms directly on a smartphone. Second, the literature was surveyed for the most widespread and most studied image recognition algorithms, to obtain an overview of the methods to test before choosing the algorithm best suited to the case study. The GeoPhotoHunt application itself was then developed, both the front-end app for Android devices and the back-end server. Finally, image recognition algorithms were tested in order to gather enough experimental data to choose the algorithm best suited to the case study. At the end of the testing phase, it was decided to implement on Android an algorithm based on the distance between color histograms built in the HSV color space; although this method is not robust to variations in brightness and contrast, it is a good compromise between performance and computational complexity, so as to make the user experience as engaging as possible.
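The chosen technique, a distance between HSV color histograms, can be sketched as follows. This is an illustration under stated assumptions and not the GeoPhotoHunt implementation: the bin counts, the L1 distance and the function names `hsv_histogram` and `histogram_distance` are all hypothetical choices, and images are given as non-empty lists of 8-bit RGB tuples.

```python
import colorsys


def hsv_histogram(pixels, h_bins=8, s_bins=4, v_bins=4):
    """Normalized joint HSV histogram of a non-empty list of (r, g, b) pixels."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
    n = len(pixels)
    return [count / n for count in hist]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Tiny synthetic "images": pure red, slightly different red, pure blue.
red = hsv_histogram([(255, 0, 0)] * 10)
reddish = hsv_histogram([(250, 5, 5)] * 10)
blue = hsv_histogram([(0, 0, 255)] * 10)
print(histogram_distance(red, reddish), histogram_distance(red, blue))
```

A photo would be accepted as matching a stage when its distance to the reference histogram falls below some threshold; the sensitivity to brightness and contrast noted above comes from shifts in the V and S channels moving pixels into different bins.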
Abstract:
The article considers the problem of the semantic gap between multimedia content and its manually defined textual description. A combined approach to representing multimedia semantics is proposed, based on grouping multimedia items that are similar in content and textual description into classes containing generalized descriptions of objects, of the relationships between them, and of keywords of the textual metadata drawn from a thesaurus. Hierarchical clustering and machine learning operations are used to form these classes. This approach broadens the scope of multimedia search and navigation by bringing in media data with similar content and textual description.
Abstract:
This article presents the principal results of the Ph.D. thesis A Novel Method for Content-Based Image Retrieval in Art Image Collections Utilizing Colour Semantics by Krassimira Ivanova (Institute of Mathematics and Informatics, BAS), successfully defended at Hasselt University in Belgium, Faculty of Science, on 15 November 2011.