939 results for content-based retrieval


Relevance:

100.00% 100.00%

Publisher:

Abstract:

With regard to the long-standing problem of the semantic gap between low-level image features and high-level human knowledge, the image retrieval community has recently shifted its emphasis from low-level feature analysis to high-level image semantics extraction. User studies reveal that users tend to seek information using high-level semantics. Therefore, image semantics extraction is of great importance to content-based image retrieval because it allows users to freely express what images they want. Semantic content annotation is the basis for semantic content retrieval. The aim of image annotation is to automatically obtain keywords that can be used to represent the content of images. The major research challenges in image semantic annotation are: What is the basic unit of semantic representation? How can the semantic unit be linked to high-level image knowledge? How can contextual information be stored and utilized for image annotation? In this thesis, Semantic Web technology (i.e. ontology) is introduced to the image semantic annotation problem. The Semantic Web, the next generation web, aims at making content of whatever type of media understandable not only to humans but also to machines. Due to the large amounts of multimedia data prevalent on the Web, researchers and industries are beginning to pay more attention to the Multimedia Semantic Web. Semantic Web technology provides a new opportunity for multimedia-based applications, but the research in this area is still in its infancy. Whether ontology can be used to improve image annotation, and how best to use ontology in semantic representation and extraction, are still worth investigating. This thesis deals with the problem of image semantic annotation using ontology and machine learning techniques in four phases, as below.
1) Salient object extraction. A salient object serves as the basic unit in image semantic extraction as it captures the common visual property of the objects. Image segmentation is often used as the first step for detecting salient objects, but most segmentation algorithms often fail to generate meaningful regions due to over-segmentation and under-segmentation. We develop a new salient object detection algorithm by combining multiple homogeneity criteria in a region merging framework.
2) Ontology construction. Since real-world objects tend to exist in a context within their environment, contextual information has been increasingly used for improving object recognition. In the ontology construction phase, visual-contextual ontologies are built from a large set of fully segmented and annotated images. The ontologies are composed of several types of concepts (i.e. mid-level and high-level concepts) and domain contextual knowledge. The visual-contextual ontologies stand as a user-friendly interface between low-level features and high-level concepts.
3) Image objects annotation. In this phase, each object is labelled with a mid-level concept in the ontologies. First, a set of candidate labels is obtained by training Support Vector Machines with features extracted from salient objects. After that, contextual knowledge contained in the ontologies is used to obtain the final labels by removing the ambiguous concepts.
4) Scene semantic annotation. The scene semantic extraction phase obtains the scene type by using both mid-level concepts and domain contextual knowledge in the ontologies. Domain contextual knowledge is used to create a scene configuration that describes which objects co-exist with which scene type more frequently. The scene configuration is represented in a probabilistic graph model, and probabilistic inference is employed to calculate the scene type given an annotated image.
To evaluate the proposed methods, a series of experiments has been conducted on a large set of fully annotated outdoor scene images. These include a subset of the Corel database, a subset of the LabelMe dataset, the evaluation dataset of localized semantics in images, the spatial context evaluation dataset, and the segmented and annotated IAPR TC-12 benchmark.
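
The scene annotation phase can be illustrated with a much simpler stand-in than the probabilistic graph model the abstract mentions. The sketch below assumes a naive Bayes model over object labels, with co-occurrence counts playing the role of the scene configuration; the function names, scene types, and training data are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def train_scene_config(annotated_images):
    """Count how often each object label co-occurs with each scene type."""
    scene_counts = Counter()
    object_counts = {}
    vocabulary = set()
    for scene, objects in annotated_images:
        scene_counts[scene] += 1
        object_counts.setdefault(scene, Counter()).update(objects)
        vocabulary.update(objects)
    return scene_counts, object_counts, vocabulary

def infer_scene(objects, scene_counts, object_counts, vocabulary):
    """Return the scene type with the highest naive Bayes log-posterior."""
    total_images = sum(scene_counts.values())
    best_scene, best_score = None, -math.inf
    for scene, n_images in scene_counts.items():
        counts = object_counts[scene]
        total_objects = sum(counts.values())
        score = math.log(n_images / total_images)  # log prior of the scene
        for obj in objects:
            # Laplace smoothing so an unseen object does not rule out a scene
            score += math.log((counts[obj] + 1) / (total_objects + len(vocabulary)))
        if score > best_score:
            best_scene, best_score = scene, score
    return best_scene

# Hypothetical annotated training images: (scene type, object labels)
training = [
    ("beach", ["sky", "sea", "sand"]),
    ("beach", ["sky", "sea", "person"]),
    ("street", ["building", "road", "car"]),
    ("street", ["sky", "building", "road"]),
]
model = train_scene_config(training)
print(infer_scene(["sea", "sand", "sky"], *model))
```

A naive Bayes model treats object labels as independent given the scene, which is a strong simplification; a graphical model can also encode dependencies between objects.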

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step in order to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Bioacoustic data can be used for monitoring animal species diversity. The deployment of acoustic sensors enables acoustic monitoring at large temporal and spatial scales. We describe a content-based birdcall retrieval algorithm for the exploration of large databases of acoustic recordings. In the algorithm, an event-based searching scheme and compact features are developed. In detail, ridge events are detected from audio files using event detection on spectral ridges. Then event alignment is used to search through audio files to locate candidate instances. A similarity measure is then applied to dimension-reduced spectral ridge feature vectors. The event-based searching method processes a smaller list of instances for faster retrieval. The experimental results demonstrate that our features achieve a better success rate than existing methods and that the feature dimension is greatly reduced.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The usual task in music information retrieval (MIR) is to find occurrences of a monophonic query pattern within a music database, which can contain both monophonic and polyphonic content. The so-called query-by-humming systems are a famous instance of content-based MIR. In such a system, the user's hummed query is converted into symbolic form to perform search operations in a similarly encoded database. The symbolic representation (e.g., textual, MIDI or vector data) is typically a quantized and simplified version of the sampled audio data, yielding faster search algorithms and space requirements that can be met in real-life situations. In this thesis, we investigate geometric approaches to MIR. We first study some musicological properties often needed in MIR algorithms, and then give a literature review on traditional (e.g., string-matching-based) MIR algorithms and novel techniques based on geometry. We also introduce some concepts from digital image processing, namely mathematical morphology, which we use to develop and implement four algorithms for geometric music retrieval. The symbolic representation in the case of our algorithms is a binary 2-D image. We use various morphological pre- and post-processing operations on the query and the database images to perform template matching / pattern recognition for the images. The algorithms are basically extensions to classic image correlation and hit-or-miss transformation techniques used widely in template matching applications. They aim to be a future extension to the retrieval engine of C-BRAHMS, which is a research project of the Department of Computer Science at the University of Helsinki.
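
The template-matching idea can be sketched on a tiny binary "piano roll". This is a minimal illustration of plain binary erosion and the hit-or-miss transform, not one of the four algorithms from the thesis; the layout (rows as pitches, columns as time steps), the function name `match_offsets`, and the data are assumptions made for the example.

```python
def match_offsets(image, pattern, require_background=False):
    """Return every (row, col) offset at which the pattern fits the image.

    With require_background=False this is binary erosion: every 'on' pixel
    of the pattern must land on an 'on' pixel of the image, extra notes are
    tolerated. With require_background=True it behaves like a hit-or-miss
    transform: the 'off' pixels of the pattern must also be 'off' in the image.
    """
    rows, cols = len(image), len(image[0])
    p_rows, p_cols = len(pattern), len(pattern[0])
    matches = []
    for dr in range(rows - p_rows + 1):
        for dc in range(cols - p_cols + 1):
            ok = True
            for r in range(p_rows):
                for c in range(p_cols):
                    pixel = image[dr + r][dc + c]
                    if pattern[r][c] and not pixel:
                        ok = False
                    elif require_background and not pattern[r][c] and pixel:
                        ok = False
            if ok:
                matches.append((dr, dc))
    return matches

# Hypothetical database image: rows are pitches, columns are time steps
database = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
# Three-note query pattern
query = [
    [1, 0, 1],
    [0, 1, 0],
]
print(match_offsets(database, query))                           # erosion
print(match_offsets(database, query, require_background=True))  # hit-or-miss
```

Because a match is reported as a (row, column) offset, the search is invariant to both transposition and time shift. Erosion finds the pattern even when other notes sound at the same time, which suits polyphonic databases, while the hit-or-miss variant rejects occurrences that have extra notes inside the pattern window.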

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multimedia mining primarily involves information analysis and retrieval based on implicit knowledge. The ever-increasing number of digital image databases on the Internet has created a need for using multimedia mining on these databases for effective and efficient retrieval of images. Contents of an image can be expressed in different features such as Shape, Texture and Intensity-distribution (STI). Content-Based Image Retrieval (CBIR) is the efficient retrieval of relevant images from large databases based on features extracted from the image. Most of the existing systems concentrate either on a single representation of all features or on a linear combination of these features. The paper proposes a CBIR system named STIRF (Shape, Texture, Intensity-distribution with Relevance Feedback) that uses a neural network for nonlinear combination of the heterogeneous STI features. Further, the system is self-adaptable to different applications and users based upon relevance feedback. Prior to retrieval of relevant images, each feature is first clustered independently of the others in its own space, and this helps in matching similar images. Testing the system on a database of images with varied contents and intensive backgrounds showed good results, with most relevant images being retrieved for an image query. The system showed better and more robust performance compared to existing CBIR systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The amount of original imaging information produced yearly during the last decade has grown tremendously in all industries due to technological breakthroughs in digital imaging and electronic storage capabilities. This trend is affecting the construction industry as well, where digital cameras and image databases are gradually replacing traditional photography. Owners demand complete site photograph logs, and engineers store thousands of images for each project to use in a number of construction management tasks, such as monitoring an activity's progress and keeping evidence of the "as built" in case any disputes arise. So far, retrieval has been done manually, with the user being responsible for image classification according to specific rules that serve a limited number of construction management tasks. New methods that, with the guidance of the user, can automatically classify and retrieve construction site images are being developed and promise to remove the heavy burden of manually indexing images. In this paper, both the existing methods and a novel image retrieval method developed by the authors for the classification and retrieval of construction site images are described and compared. Specifically, a number of examples are used to present their advantages and limitations. The results of this comparison demonstrate that the content-based image retrieval method developed by the authors can reduce the overall time spent on the classification and retrieval of construction images while providing the user with the flexibility to retrieve images according to different classification schemes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the emergence of the digital all-sky imager (ASI) in aurora research, millions of images are captured annually. However, only a fraction of them can actually be used. To address the problem caused by inefficient manual processing, an integrated image analysis and retrieval system is developed. To represent aurora images precisely, macroscopic and microscopic features are combined to describe aurora texture. To reduce the feature dimensionality of the huge dataset, a modified local binary pattern (LBP) called ALBP is proposed to depict the microscopic texture, and scale-invariant Gabor and orientation-invariant Gabor are employed to extract the macroscopic texture. A physical property of aurora is introduced as a region feature to bridge the gap between the low-level visual features and high-level semantic description. The experimental results demonstrate that the ALBP method achieves a high classification rate and low computational complexity. The retrieval simulation results show that the developed retrieval system is efficient for huge datasets. (c) 2010 Elsevier Inc. All rights reserved.
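
For orientation, the basic local binary pattern on which such texture descriptors build can be sketched as follows. This is the plain 8-neighbour LBP, not the ALBP variant proposed in the paper (its modification is not described in the abstract); the neighbour ordering and the function names are assumptions.

```python
def lbp_code(image, r, c):
    """8-neighbour LBP code of the pixel at (r, c) in a grey-level image."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical 3x3 grey-level patch
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch, 1, 1))
```

The histogram of codes, rather than the codes themselves, is what is typically used as the texture feature vector.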

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reports a novel region-based shape descriptor based on orthogonal Legendre moments. The preprocessing steps for invariance improvement of the proposed Improved Legendre Moment Descriptor (ILMD) are discussed. The performance of the ILMD is compared to the MPEG-7 approved region shape descriptor, angular radial transformation descriptor (ARTD), and the widely used Zernike moment descriptor (ZMD). Set B of the MPEG-7 CE-1 contour database and all the datasets of the MPEG-7 CE-2 region database were used for experimental validation. The average normalized modified retrieval rate (ANMRR) and precision-recall pair were employed for benchmarking the performance of the candidate descriptors. The ILMD has lower ANMRR values than ARTD for most of the datasets, and ARTD has a lower value compared to ZMD. This indicates that the overall performance of the ILMD is better than that of ARTD and ZMD. This result is confirmed by the precision-recall test, where ILMD was found to have better precision rates for most of the datasets tested. Besides retrieval accuracy, ILMD is more compact than ARTD and ZMD. The proposed descriptor is useful as a generic shape descriptor for content-based image retrieval (CBIR) applications.
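
Since lower ANMRR means better retrieval, it helps to see how the score is formed. The sketch below follows the definition commonly given in the MPEG-7 evaluation literature (rank cut-off K = min(4·NG, 2·GTM), penalty rank 1.25·K for missed items); the function names and input layout are assumptions, and the paper's exact evaluation protocol may differ in detail.

```python
def nmrr(ranks, gtm):
    """Normalized modified retrieval rank for one query.

    ranks: 1-based retrieval rank of each ground-truth item of the query
    gtm:   size of the largest ground-truth set over all queries
    """
    ng = len(ranks)
    k = min(4 * ng, 2 * gtm)
    # Items ranked beyond K count as missed and receive a fixed penalty rank
    penalised = [rank if rank <= k else 1.25 * k for rank in ranks]
    avr = sum(penalised) / ng
    mrr = avr - 0.5 * (1 + ng)
    return mrr / (1.25 * k - 0.5 * (1 + ng))

def anmrr(all_ranks):
    """Average NMRR over all queries; 0 is perfect, 1 means nothing found."""
    gtm = max(len(ranks) for ranks in all_ranks)
    return sum(nmrr(ranks, gtm) for ranks in all_ranks) / len(all_ranks)

# Hypothetical results: one perfect query and one complete miss
print(anmrr([[1, 2, 3], [100, 200, 300]]))
```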

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Traditional content-based image retrieval (CBIR) systems use low-level features such as colors, shapes, and textures of images. However, users make queries based on semantics, which are not easily related to such low-level characteristics. Recent works on CBIR confirm that researchers have been trying to map visual low-level characteristics to high-level semantics. The relation between low-level characteristics and image textual information has motivated this article, which proposes a model for automatic classification and categorization of words associated with images. This proposal considers a self-organizing neural network architecture, which classifies textual information without previous learning. Experimental results compare the performance of the text-based approach to that of an image retrieval system based on low-level features. (c) 2008 Wiley Periodicals, Inc.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In content-based image retrieval, learning from users’ feedback can be considered as a one-class classification problem. However, the OCIB method proposed in [1] suffers from the problem that it is only a one-mode method which cannot deal with multiple interest regions. In addition, it requires a pre-specified radius which is usually unavailable in real-world applications. This paper overcomes these two problems by introducing ensemble learning into the OCIB method: by Bagging, we can construct a group of one-class classifiers which emphasize various parts of the data set; this is followed by a rank aggregation step in which results from different parameter settings are incorporated into a single final ranking list. The experimental results show that the proposed I-OCIB method outperforms the OCIB for image retrieval applications.
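
The combination of Bagging and rank aggregation can be sketched with deliberately simple stand-ins. The code below does not implement OCIB or I-OCIB: each "one-class classifier" is just the centroid of a bootstrap sample of the positive feedback, and the rankings are merged with a Borda count, which is one common aggregation rule that the abstract does not name. All identifiers and data are hypothetical.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def bagged_rankings(positives, database, n_models=5, seed=0):
    """Rank the database once per bootstrap sample of the positive examples."""
    rng = random.Random(seed)
    rankings = []
    for _ in range(n_models):
        # Bootstrap: sample the positives with replacement
        sample = [rng.choice(positives) for _ in positives]
        center = centroid(sample)
        ranking = sorted(database,
                         key=lambda name: math.dist(database[name], center))
        rankings.append(ranking)
    return rankings

def borda_aggregate(rankings):
    """Merge rankings: an item earns more points the higher it is ranked."""
    scores = {}
    for ranking in rankings:
        for position, name in enumerate(ranking):
            scores[name] = scores.get(name, 0) + len(ranking) - position
    return sorted(scores, key=lambda name: -scores[name])

# Hypothetical feature vectors of images the user marked as relevant
positives = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
# Hypothetical image database: name -> feature vector
database = {"a": (0.1, 0.1), "b": (5.0, 5.0), "c": (1.0, 1.0)}

final_ranking = borda_aggregate(bagged_rankings(positives, database))
print(final_ranking)
```

Aggregating ranks rather than raw scores avoids having to calibrate the outputs of the individual classifiers against each other.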

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the content-based image retrieval techniques is the shape-based technique, which allows users to ask for objects similar in shape to a query object. Sajjanhar and Lu proposed a method for shape representation and similarity measure called the grid-based method [1]. They have shown that the method is effective for the retrieval of segmented objects based on shape. In this paper, we describe a system which uses the grid-based method for retrieval of images with multiple objects. We perform experiments on the prototype system to compare the performance of the grid-based method with the Fourier descriptors method [2]. Preliminary results have been presented.
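
A rough sketch of the grid idea: overlay a grid on a binary shape image, record one bit per cell, and compare shapes by counting differing bits. The coverage rule used here (a cell is set if it contains any shape pixel) and the omission of normalisation for scale and rotation are simplifying assumptions, and the function names are hypothetical; this is not the published method in full.

```python
def grid_signature(shape, cell):
    """Binary signature: one bit per grid cell, set if the cell touches the shape."""
    rows, cols = len(shape), len(shape[0])
    bits = []
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            covered = any(
                shape[r][c]
                for r in range(r0, min(r0 + cell, rows))
                for c in range(c0, min(c0 + cell, cols))
            )
            bits.append(1 if covered else 0)
    return bits

def grid_distance(sig_a, sig_b):
    """Number of grid cells in which the two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))

# Hypothetical 4x4 binary shapes
square = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
bar = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(grid_distance(grid_signature(square, 2), grid_signature(bar, 2)))
```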

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Previously, we proposed the concept of connectivity to obtain discriminating shape descriptors. In this paper, we use connectivity to obtain superior distance histograms for multi-scale images. Experiments are performed to evaluate the distance histograms, based on connectivity, for shape-based retrieval of multi-scale images. Item S8 within the MPEG-7 still images content set is used for performing experiments. Experimental results show that the proposed method enhances retrieval performance significantly.
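
As background, a plain distance histogram for a shape can be sketched as below: distances from the centroid to boundary points are normalised by the largest distance, which makes the histogram independent of scale. The connectivity concept from the paper is not modelled here, and the bin count, function name, and sample shapes are assumptions.

```python
import math

def distance_histogram(points, bins=4):
    """Normalised histogram of centroid-to-boundary-point distances."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    longest = max(distances)
    hist = [0] * bins
    for d in distances:
        # Dividing by the longest distance removes the effect of scale
        ratio = d / longest if longest > 0 else 0.0
        hist[min(int(ratio * bins), bins - 1)] += 1
    return [count / len(points) for count in hist]

# Hypothetical boundary points of a square and of the same square scaled by 3
small = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
large = [(3 * x, 3 * y) for x, y in small]
print(distance_histogram(small))
print(distance_histogram(large))
```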

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, color information and keyword information are combined for image retrieval. In detail, each image is divided into several blocks and then the color histogram of each block is derived. Users can provide feedback in the form of keyword annotations. The keywords may then propagate through the image database so that color-based and keyword-based retrieval can be used together. A prototype system shows that the proposed method is effective and efficient in performing image retrieval tasks.
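
The block-wise color part of this approach can be sketched as follows; keyword feedback and propagation are not covered. The sketch assumes images given as rows of (R, G, B) tuples with dimensions divisible by the number of blocks per side, a coarse quantisation of each channel, and histogram intersection as the similarity measure, none of which is specified in the abstract.

```python
def block_histograms(image, blocks_per_side=2, levels=2):
    """Split the image into blocks and compute a quantised color histogram per block."""
    rows, cols = len(image), len(image[0])
    step_r, step_c = rows // blocks_per_side, cols // blocks_per_side
    histograms = []
    for br in range(blocks_per_side):
        for bc in range(blocks_per_side):
            hist = [0] * (levels ** 3)
            for r in range(br * step_r, (br + 1) * step_r):
                for c in range(bc * step_c, (bc + 1) * step_c):
                    # Quantise each 0..255 channel into `levels` bins
                    red, green, blue = (ch * levels // 256 for ch in image[r][c])
                    hist[(red * levels + green) * levels + blue] += 1
            total = step_r * step_c
            histograms.append([count / total for count in hist])
    return histograms

def similarity(hists_a, hists_b):
    """Mean histogram intersection over corresponding blocks (1.0 = identical)."""
    per_block = [
        sum(min(a, b) for a, b in zip(ha, hb))
        for ha, hb in zip(hists_a, hists_b)
    ]
    return sum(per_block) / len(per_block)

# Hypothetical 2x2 images
RED, BLUE = (255, 0, 0), (0, 0, 255)
all_red = [[RED, RED], [RED, RED]]
half_blue = [[RED, RED], [BLUE, BLUE]]
print(similarity(block_histograms(all_red), block_histograms(half_blue)))
```

Comparing block by block keeps some spatial layout information that a single global histogram would lose.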