985 resultados para image databases


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this article, techniques have been presented for faster evolution of wavelet lifting coefficients for fingerprint image compression (FIC). In addition to increasing the computational speed by 81.35%, the coefficients performed much better than the reported coefficients in literature. Generally, full-size images are used for evolving wavelet coefficients, which is time consuming. To overcome this, in this work, wavelets were evolved with resized, cropped, resized-average and cropped-average images. On comparing the peak- signal-to-noise-ratios (PSNR) offered by the evolved wavelets, it was found that the cropped images excelled the resized images and is in par with the results reported till date. Wavelet lifting coefficients evolved from an average of four 256 256 centre-cropped images took less than 1/5th the evolution time reported in literature. It produced an improvement of 1.009 dB in average PSNR. Improvement in average PSNR was observed for other compression ratios (CR) and degraded images as well. The proposed technique gave better PSNR for various bit rates, with set partitioning in hierarchical trees (SPIHT) coder. These coefficients performed well with other fingerprint databases as well.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Textural image classification technologies have been extensively explored and widely applied in many areas. It is advantageous to combine both the occurrence and spatial distribution of local patterns to describe a texture. However, most existing state-of-the-art approaches for textural image classification only employ the occurrence histogram of local patterns to describe textures, without considering their co-occurrence information. And they are usually very time-consuming because of the vector quantization involved. Moreover, those feature extraction paradigms are implemented at a single scale. In this paper we propose a novel multi-scale local pattern co-occurrence matrix (MS_LPCM) descriptor to characterize textural images through four major steps. Firstly, Gaussian filtering pyramid preprocessing is employed to obtain multi-scale images; secondly, a local binary pattern (LBP) operator is applied on each textural image to create a LBP image; thirdly, the gray-level co-occurrence matrix (GLCM) is utilized to extract local pattern co-occurrence matrix (LPCM) from LBP images as the features; finally, all LPCM features from the same textural image at different scales are concatenated as the final feature vectors for classification. The experimental results on three benchmark databases in this study have shown a higher classification accuracy and lower computing cost as compared with other state-of-the-art algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this study, an interactive Content-Based Image Retrieval (CBIR) system that allows searching and retrieving images from databases is designed and developed. Based on the fuzzy c-means clustering algorithm, the CBIR system fuses color and texture features in image segmentation. A technique to form compound queries based on the combined features of different images is devised. This technique allows users to have a better control on the search criteria, thus a higher retrieval performance can be achieved. A database consisting of skin cancer imagery is used to demonstrate the applicability of the CBIR system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Automatic face recognition is an area with immense practical potential which includes a wide range of commercial and law enforcement applications. Hence it is unsurprising that it continues to be one of the most active research areas of computer vision. Even after over three decades of intense research, the state-of-the-art in face recognition continues to improve, benefitting from advances in a range of different research fields such as image processing, pattern recognition, computer graphics, and physiology. Systems based on visible spectrum images, the most researched face recognition modality, have reached a significant level of maturity with some practical success. However, they continue to face challenges in the presence of illumination, pose and expression changes, as well as facial disguises, all of which can significantly decrease recognition accuracy. Amongst various approaches which have been proposed in an attempt to overcome these limitations, the use of infrared (IR) imaging has emerged as a particularly promising research direction. This paper presents a comprehensive and timely review of the literature on this subject. Our key contributions are (i) a summary of the inherent properties of infrared imaging which makes this modality promising in the context of face recognition; (ii) a systematic review of the most influential approaches, with a focus on emerging common trends as well as key differences between alternative methodologies; (iii) a description of the main databases of infrared facial images available to the researcher; and lastly (iv) a discussion of the most promising avenues for future research. © 2014 Elsevier Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Given a large image set, in which very few images have labels, how to guess labels for the remaining majority? How to spot images that need brand new labels different from the predefined ones? How to summarize these data to route the user’s attention to what really matters? Here we answer all these questions. Specifically, we propose QuMinS, a fast, scalable solution to two problems: (i) Low-labor labeling (LLL) – given an image set, very few images have labels, find the most appropriate labels for the rest; and (ii) Mining and attention routing – in the same setting, find clusters, the top-'N IND.O' outlier images, and the 'N IND.R' images that best represent the data. Experiments on satellite images spanning up to 2.25 GB show that, contrasting to the state-of-the-art labeling techniques, QuMinS scales linearly on the data size, being up to 40 times faster than top competitors (GCap), still achieving better or equal accuracy, it spots images that potentially require unpredicted labels, and it works even with tiny initial label sets, i.e., nearly five examples. We also report a case study of our method’s practical usage to show that QuMinS is a viable tool for automatic coffee crop detection from remote sensing images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

[EN]In this paper, we address the challenge of gender classi - cation using large databases of images with two goals. The rst objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine if the classi er that provides the best classi cation rate for one database, improves the classi cation results for other databases, that is, the cross-database performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Spatial data mining recently emerges from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, road traffic accident analysis, etc. It demands for efficient solutions for many new, expensive, and complicated problems. In this paper, we investigate the problem of evaluating the top k distinguished “features” for a “cluster” based on weighted proximity relationships between the cluster and features. We measure proximity in an average fashion to address possible nonuniform data distribution in a cluster. Combining a standard multi-step paradigm with new lower and upper proximity bounds, we presented an efficient algorithm to solve the problem. The algorithm is implemented in several different modes. Our experiment results not only give a comparison among them but also illustrate the efficiency of the algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This report presents and evaluates a novel idea for scalable lossy colour image coding with Matching Pursuit (MP) performed in a transform domain. The benefits of the idea of MP performed in the transform domain are analysed in detail. The main contribution of this work is extending MP with wavelets to colour coding and proposing a coding method. We exploit correlations between image subbands after wavelet transformation in RGB colour space. Then, a new and simple quantisation and coding scheme of colour MP decomposition based on Run Length Encoding (RLE), inspired by the idea of coding indexes in relational databases, is applied. As a final coding step arithmetic coding is used assuming uniform distributions of MP atom parameters. The target application is compression at low and medium bit-rates. Coding performance is compared to JPEG 2000 showing the potential to outperform the latter with more sophisticated than uniform data models for arithmetic coder. The results are presented for grayscale and colour coding of 12 standard test images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A variety of content-based image retrieval systems exist which enable users to perform image retrieval based on colour content - i.e., colour-based image retrieval. For the production of media for use in television and film, colour-based image retrieval is useful for retrieving specifically coloured animations, graphics or videos from large databases (by comparing user queries to the colour content of extracted key frames). It is also useful to graphic artists creating realistic computer-generated imagery (CGI). Unfortunately, current methods for evaluating colour-based image retrieval systems have 2 major drawbacks. Firstly, the relevance of images retrieved during the task cannot be measured reliably. Secondly, existing methods do not account for the creative design activity known as reflection-in-action. Consequently, the development and application of novel and potentially more effective colour-based image retrieval approaches, better supporting the large number of users creating media for use in television and film productions, is not possible as their efficacy cannot be reliably measured and compared to existing technologies. As a solution to the problem, this paper introduces the Mosaic Test. The Mosaic Test is a user-based evaluation approach in which participants complete an image mosaic of a predetermined target image, using the colour-based image retrieval system that is being evaluated. In this paper, we introduce the Mosaic Test and report on a user evaluation. The findings of the study reveal that the Mosaic Test overcomes the 2 major drawbacks associated with existing evaluation methods and does not require expert participants. © 2012 Springer Science+Business Media, LLC.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

More and more researchers have realized that ontologies will play a critical role in the development of the Semantic Web, the next generation Web in which content is not only consumable by humans, but also by software agents. The development of tools to support ontology management including creation, visualization, annotation, database storage, and retrieval is thus extremely important. We have developed ImageSpace, an image ontology creation and annotation tool that features (1) full support for the standard web ontology language DAML+OIL; (2) image ontology creation, visualization, image annotation and display in one integrated framework; (3) ontology consistency assurance; and (4) storing ontologies and annotations in relational databases. It is expected that the availability of such a tool will greatly facilitate the creation of image repositories as islands of the Semantic Web.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

ACM Computing Classification System (1998): I.7, I.7.5.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class­ based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state­ of ­the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state­ of­ the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.