911 resultados para IMAGE RETRIEVAL
Resumo:
In order to bridge the “Semantic gap”, a number of relevance feedback (RF) mechanisms have been applied to content-based image retrieval (CBIR). However current RF techniques in most existing CBIR systems still lack satisfactory user interaction although some work has been done to improve the interaction as well as the search accuracy. In this paper, we propose a four-factor user interaction model and investigate its effects on CBIR by an empirical evaluation. Whilst the model was developed for our research purposes, we believe the model could be adapted to any content-based search system.
Resumo:
This paper presents an interactive content-based image retrieval framework—uInteract, for delivering a novel four-factor user interaction model visually. The four-factor user interaction model is an interactive relevance feedback mechanism that we proposed, aiming to improve the interaction between users and the CBIR system and in turn users overall search experience. In this paper, we present how the framework is developed to deliver the four-factor user interaction model, and how the visual interface is designed to support user interaction activities. From our preliminary user evaluation result on the ease of use and usefulness of the proposed framework, we have learnt what the users like about the framework and the aspects we could improve in future studies. Whilst the framework is developed for our research purposes, we believe the functionalities could be adapted to any content-based image search framework.
Resumo:
Dissimilarity measurement plays a crucial role in content-based image retrieval, where data objects and queries are represented as vectors in high-dimensional content feature spaces. Given the large number of dissimilarity measures that exist in many fields, a crucial research question arises: Is there a dependency, if yes, what is the dependency, of a dissimilarity measure’s retrieval performance, on different feature spaces? In this paper, we summarize fourteen core dissimilarity measures and classify them into three categories. A systematic performance comparison is carried out to test the effectiveness of these dissimilarity measures with six different feature spaces and some of their combinations on the Corel image collection. From our experimental results, we have drawn a number of observations and insights on dissimilarity measurement in content-based image retrieval, which will lay a foundation for developing more effective image search technologies.
Resumo:
Due to the rapid growth of the number of digital media elements like image, video, audio, graphics on Internet, there is an increasing demand for effective search and retrieval techniques. Recently, many search engines have made image search as an option like Google, AlltheWeb, AltaVista, Freenet. In addition to this, Ditto, Picsearch, can search only the images on Internet. There are also other domain specific search engines available for graphics and clip art, audio, video, educational images, artwork, stock photos, science and nature [www.faganfinder.com/img]. These entire search engines are directory based. They crawls the entire Internet and index all the images in certain categories. They do not display the images in any particular order with respect to the time and context. With the availability of MPEG-7, a standard for describing multimedia content, it is now possible to store the images with its metadata in a structured format. This helps in searching and retrieving the images. The MPEG-7 standard uses XML to describe the content of multimedia information objects. These objects will have metadata information in the form of MPEG-7 or any other similar format associated with them. It can be used in different ways to search the objects. In this paper we propose a system, which can do content based image retrieval on the World Wide Web. It displays the result in user-defined order.
Resumo:
As the volume of image data and the need of using it in various applications is growing significantly in the last days it brings a necessity of retrieval efficiency and effectiveness. Unfortunately, existing indexing methods are not applicable to a wide range of problem-oriented fields due to their operating time limitations and strong dependency on the traditional descriptors extracted from the image. To meet higher requirements, a novel distance-based indexing method for region-based image retrieval has been proposed and investigated. The method creates premises for considering embedded partitions of images to carry out the search with different refinement or roughening level and so to seek the image meaningful content.
Resumo:
An approach to building a CBIR-system for searching computer tomography images using the methods of wavelet-analysis is presented in this work. The index vectors are constructed on the basis of the local features of the image and on their positions. The purpose of the proposed system is to extract visually similar data from the individual personal records and from analogous analysis of other patients.
Resumo:
Efficient and effective approaches of dealing with the vast amount of visual information available nowadays are highly sought after. This is particularly the case for image collections, both personal and commercial. Due to the magnitude of these ever expanding image repositories, annotation of all images images is infeasible, and search in such an image collection therefore becomes inherently difficult. Although content-based image retrieval techniques have shown much potential, such approaches also suffer from various problems making it difficult to adopt them in practice. In this paper, we follow a different approach, namely that of browsing image databases for image retrieval. In our Honeycomb Image Browser, large image databases are visualised on a hexagonal lattice with image thumbnails occupying hexagons. Arranged in a space filling manner, visually similar images are located close together enabling large image datasets to be navigated in a hierarchical manner. Various browsing tools are incorporated to allow for interactive exploration of the database. Experimental results confirm that our approach affords efficient image retrieval. © 2010 IEEE.
Resumo:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve performance by individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low level features and attributes. A low rank attribute embedding is joint learned within the multi-task learning formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
Resumo:
The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
Resumo:
Traditional content-based image retrieval (CBIR) systems use low-level features such as colors, shapes, and textures of images. Although, users make queries based on semantics, which are not easily related to such low-level characteristics. Recent works on CBIR confirm that researchers have been trying to map visual low-level characteristics and high-level semantics. The relation between low-level characteristics and image textual information has motivated this article which proposes a model for automatic classification and categorization of words associated to images. This proposal considers a self-organizing neural network architecture, which classifies textual information without previous learning. Experimental results compare the performance results of the text-based approach to an image retrieval system based on low-level features. (c) 2008 Wiley Periodicals, Inc.
Resumo:
Continuing advances in digital image capture and storage are resulting in a proliferation of imagery and associated problems of information overload in image domains. In this work we present a framework that supports image management using an interactive approach that captures and reuses task-based contextual information. Our framework models the relationship between images and domain tasks they support by monitoring the interactive manipulation and annotation of task-relevant imagery. During image analysis, interactions are captured and a task context is dynamically constructed so that human expertise, proficiency and knowledge can be leveraged to support other users in carrying out similar domain tasks using case-based reasoning techniques. In this article we present our framework for capturing task context and describe how we have implemented the framework as two image retrieval applications in the geo-spatial and medical domains. We present an evaluation that tests the efficiency of our algorithms for retrieving image context information and the effectiveness of the framework for carrying out goal-directed image tasks. © 2010 Springer Science+Business Media, LLC.
Resumo:
Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision.
Resumo:
With regard to the long-standing problem of the semantic gap between low-level image features and high-level human knowledge, the image retrieval community has recently shifted its emphasis from low-level features analysis to high-level image semantics extrac- tion. User studies reveal that users tend to seek information using high-level semantics. Therefore, image semantics extraction is of great importance to content-based image retrieval because it allows the users to freely express what images they want. Semantic content annotation is the basis for semantic content retrieval. The aim of image anno- tation is to automatically obtain keywords that can be used to represent the content of images. The major research challenges in image semantic annotation are: what is the basic unit of semantic representation? how can the semantic unit be linked to high-level image knowledge? how can the contextual information be stored and utilized for image annotation? In this thesis, the Semantic Web technology (i.e. ontology) is introduced to the image semantic annotation problem. Semantic Web, the next generation web, aims at mak- ing the content of whatever type of media not only understandable to humans but also to machines. Due to the large amounts of multimedia data prevalent on the Web, re- searchers and industries are beginning to pay more attention to the Multimedia Semantic Web. The Semantic Web technology provides a new opportunity for multimedia-based applications, but the research in this area is still in its infancy. Whether ontology can be used to improve image annotation and how to best use ontology in semantic repre- sentation and extraction is still a worth-while investigation. This thesis deals with the problem of image semantic annotation using ontology and machine learning techniques in four phases as below. 1) Salient object extraction. A salient object servers as the basic unit in image semantic extraction as it captures the common visual property of the objects. Image segmen- tation is often used as the �rst step for detecting salient objects, but most segmenta- tion algorithms often fail to generate meaningful regions due to over-segmentation and under-segmentation. We develop a new salient object detection algorithm by combining multiple homogeneity criteria in a region merging framework. 2) Ontology construction. Since real-world objects tend to exist in a context within their environment, contextual information has been increasingly used for improving object recognition. In the ontology construction phase, visual-contextual ontologies are built from a large set of fully segmented and annotated images. The ontologies are composed of several types of concepts (i.e. mid-level and high-level concepts), and domain contextual knowledge. The visual-contextual ontologies stand as a user-friendly interface between low-level features and high-level concepts. 3) Image objects annotation. In this phase, each object is labelled with a mid-level concept in ontologies. First, a set of candidate labels are obtained by training Support Vectors Machines with features extracted from salient objects. After that, contextual knowledge contained in ontologies is used to obtain the �nal labels by removing the ambiguity concepts. 4) Scene semantic annotation. The scene semantic extraction phase is to get the scene type by using both mid-level concepts and domain contextual knowledge in ontologies. Domain contextual knowledge is used to create scene con�guration that describes which objects co-exist with which scene type more frequently. The scene con�guration is represented in a probabilistic graph model, and probabilistic inference is employed to calculate the scene type given an annotated image. To evaluate the proposed methods, a series of experiments have been conducted in a large set of fully annotated outdoor scene images. These include a subset of the Corel database, a subset of the LabelMe dataset, the evaluation dataset of localized semantics in images, the spatial context evaluation dataset, and the segmented and annotated IAPR TC-12 benchmark.
Resumo:
The amount of original imaging information produced yearly during the last decade has experienced a tremendous growth in all industries due to the technological breakthroughs in digital imaging and electronic storage capabilities. This trend is affecting the construction industry as well, where digital cameras and image databases are gradually replacing traditional photography. Owners demand complete site photograph logs and engineers store thousands of images for each project to use in a number of construction management tasks like monitoring an activity's progress and keeping evidence of the "as built" in case any disputes arise. So far, retrieval methodologies are done manually with the user being responsible for imaging classification according to specific rules that serve a limited number of construction management tasks. New methods that, with the guidance of the user, can automatically classify and retrieve construction site images are being developed and promise to remove the heavy burden of manually indexing images. In this paper, both the existing methods and a novel image retrieval method developed by the authors for the classification and retrieval of construction site images are described and compared. Specifically a number of examples are deployed in order to present their advantages and limitations. The results from this comparison demonstrates that the content based image retrieval method developed by the authors can reduce the overall time spent for the classification and retrieval of construction images while providing the user with the flexibility to retrieve images according different classification schemes.
Resumo:
Digital photographs of construction site activities are gradually replacing their traditional paper based counterparts. Existing digital imaging technologies in hardware and software make it easy for site engineers to take numerous photographs of “interesting” processes and activities on a daily basis. The resulting photographic data are evidence of the “as-built” project, and can therefore be used in a number of project life cycle tasks. However, the task of retrieving the relevant photographs needed in these tasks is often burdened by the sheer volume of photographs accumulating in project databases over time and the numerous objects present in each photograph. To solve this problem, the writers have recently developed a number of complementary techniques that can automatically classify and retrieve construction site images according to a variety of criteria (materials, time, date, location, etc.). This paper presents a novel complementary technique that can automatically identify linear (i.e., beam, column) and nonlinear (i.e., wall, slab) construction objects within the image content and use that information to enhance the performance of the writers’ existing construction site image retrieval approach.