968 results for Binary Image Representation


Relevance:

90.00%

Publisher:

Abstract:

Local spatio-temporal features combined with a bag-of-visual-words model are a popular approach to human action recognition. Bag-of-features methods face several challenges, such as extracting appropriate appearance and motion features from videos, converting the extracted features into a form appropriate for classification, and designing a suitable classification framework. In this paper we address the problem of efficiently representing the extracted features for classification to improve the overall performance. We introduce two generative supervised topic models, maximum entropy discrimination LDA (MedLDA) and class-specific simplex LDA (css-LDA), to encode the raw features in a form suitable for discriminative SVM-based classification. Unsupervised LDA models disconnect topic discovery from the classification task and hence yield poor results compared to the baseline bag-of-words framework. Supervised LDA techniques, on the other hand, learn the topic structure by considering the class labels and improve the recognition accuracy significantly. MedLDA maximizes likelihood and within-class margins using max-margin techniques and yields a sparse, highly discriminative topic structure, while css-LDA learns separate class-specific topics instead of a common set of topics across the entire dataset. In our representation, topics are learned first and each video is then represented as a topic proportion vector, which is comparable to a histogram of topics. Finally, SVM classification is performed on the learned topic proportion vectors. We demonstrate the efficiency of these two representation techniques through experiments carried out on two popular datasets. The experimental results show significantly improved performance compared to the baseline bag-of-features framework, which uses k-means to construct a histogram of words from the feature vectors.
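The baseline mentioned at the end of this abstract, a histogram of visual words built from k-means centroids, can be illustrated with a minimal sketch. It assumes the centroids have already been learned; the function names and the toy data are hypothetical and do not come from the paper. The topic models themselves (MedLDA, css-LDA) are not shown.

```python
import math


def nearest_centroid(feature, centroids):
    """Return the index of the centroid closest to `feature` (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda k: math.dist(feature, centroids[k]))


def bag_of_words_histogram(features, centroids):
    """Normalised histogram of visual-word counts for one video.

    `features` are the local descriptors of a single video and
    `centroids` are assumed to be k-means cluster centres (the codebook).
    """
    counts = [0] * len(centroids)
    for f in features:
        counts[nearest_centroid(f, centroids)] += 1
    total = sum(counts)
    # An empty video yields an all-zero histogram instead of dividing by 0.
    return [c / total for c in counts] if total else [0.0] * len(centroids)


# Toy example: a 2-word codebook and four 2-D descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
video_features = [(1.0, 1.0), (0.0, 0.0), (9.0, 9.0), (1.0, 0.0)]
histogram = bag_of_words_histogram(video_features, codebook)
```

A topic proportion vector plays the same role as this histogram: a fixed-length vector per video that can be passed to an SVM.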

Relevance:

90.00%

Publisher:

Abstract:

The increased availability of image capturing devices has enabled collections of digital images to expand rapidly in both size and diversity. This has created a constantly growing need for efficient and effective image browsing, searching, and retrieval tools. Pseudo-relevance feedback (PRF) has proven to be an effective mechanism for improving retrieval accuracy. An original, simple yet effective rank-based PRF mechanism (RB-PRF) that takes into account the initial rank order of each image to improve retrieval accuracy is proposed. This RB-PRF mechanism innovates by using binary image signatures to improve retrieval precision, promoting images similar to highly ranked images and demoting images similar to lower-ranked images. Empirical evaluations based on standard benchmarks, namely the Wang, Oliva & Torralba, and Corel datasets, demonstrate the effectiveness of the proposed RB-PRF mechanism in image retrieval.
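The promote/demote idea can be sketched as follows. This is not the RB-PRF formula from the paper, which the abstract does not give. It is a minimal illustration that assumes each image has a fixed-length binary signature stored as an integer, and scores an image by its Hamming similarity to the top-ranked images minus its similarity to the bottom-ranked ones. All names and parameters are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded signatures."""
    return bin(a ^ b).count("1")


def mean_similarity(img, others, signatures, bits):
    """Average number of matching bits between `img` and each of `others`."""
    if not others:
        return 0.0
    return sum(bits - hamming(signatures[img], signatures[o])
               for o in others) / len(others)


def rerank(ranked_ids, signatures, bits, top_k=1, bottom_k=1):
    """Re-order an initial ranking using pseudo-relevance feedback.

    Images close to the top_k results are promoted; images close to the
    bottom_k results are demoted. Ties keep their initial order because
    Python's sort is stable.
    """
    top = ranked_ids[:top_k]
    bottom = ranked_ids[max(0, len(ranked_ids) - bottom_k):]

    def score(img):
        return (mean_similarity(img, top, signatures, bits)
                - mean_similarity(img, bottom, signatures, bits))

    return sorted(ranked_ids, key=score, reverse=True)


# Toy example with 4-bit signatures.
sigs = {"a": 0b1111, "b": 0b0000, "c": 0b1110, "d": 0b0111, "e": 0b0000}
new_order = rerank(["a", "b", "c", "d", "e"], sigs, bits=4)
```

In this toy run, image "b" starts in second place but shares its signature with the last-ranked image, so it drops below "c" and "d".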

Relevance:

90.00%

Publisher:

Abstract:

Binary image classification is a problem that has received much attention in recent years. In this paper we evaluate a selection of popular techniques in an effort to find a feature set/classifier combination which generalizes well to full-resolution image data. We then apply that system to images at one-half through one-sixteenth resolution and consider the corresponding error rates. In addition, we observe how generalization performance depends on the number of training images and, lastly, compare the system's best error rates to those of a human performing an identical classification task on the same set of test images.

Relevance:

90.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2013

Relevance:

90.00%

Publisher:

Abstract:

In this paper, spherical harmonics are proposed as shape descriptors for 2D images. We introduce the concept of connectivity; 2D images are decomposed using connectivity, which is followed by 3D model construction. Spherical harmonics are obtained for the 3D models and used as descriptors for the underlying 2D shapes. The difference between two images is computed as the Euclidean distance between their spherical harmonics descriptors. Experiments are performed to test the effectiveness of spherical harmonics for retrieval of 2D images. Item S8 within the MPEG-7 still images content set is used for the experiments; this dataset consists of 3621 still images. Experimental results show that the proposed descriptors for 2D images are effective.

Relevance:

90.00%

Publisher:

Abstract:

A powerful image editing system called OVIE is described, which provides fast and accurate creation, composition, rendering and other manipulation of image contents. Flexibility and convenience are achieved by including two modules, image decomposition and image vectorization, to understand and represent an image respectively. To understand an image comprehensively, we propose to integrate image segmentation, shape completion and image completion techniques to ensure seamless image editing. For image representation, the array of pixels is replaced by vector data with geometric editability, since geometrically based editing has physical meaning and is therefore more natural and intuitive for users. Compared to existing work, our system is more convenient and can generate effects of higher quality. © 2012 IEEE.

Relevance:

90.00%

Publisher:

Abstract:

How to learn an overcomplete dictionary for sparse representations of images is an important topic in machine learning, sparse coding, blind source separation, etc. The so-called K-singular value decomposition (K-SVD) method [3] is powerful for this purpose; however, it is too time-consuming to apply. Recently, an adaptive orthogonal sparsifying transform (AOST) method has been developed that learns the dictionary faster. However, the corresponding coefficient matrix may not be as sparse as that of K-SVD. To solve this problem, this paper proposes a non-orthogonal iterative match method to learn the dictionary. By sequentially extracting columns of the stacked image blocks, the non-orthogonal atoms of the dictionary are learned adaptively, and the resultant coefficient matrix is sparser. Experimental results show that the proposed method can yield effective dictionaries and that the resulting image representation is sparser than that of AOST.

Relevance:

90.00%

Publisher:

Abstract:

Determination of the soil coverage by crop residues after ploughing is a fundamental element of Conservation Agriculture. This paper presents the application of genetic algorithms employed during the fine tuning of the segmentation process of a digital image with the aim of automatically quantifying the residue coverage. In other words, the objective is to achieve a segmentation that would permit the discrimination of the texture of the residue so that the output of the segmentation process is a binary image in which residue zones are isolated from the rest. The RGB images used come from a sample of images in which sections of terrain were photographed with a conventional camera positioned in zenith orientation atop a tripod. The images were taken outdoors under uncontrolled lighting conditions. Up to 92% similarity was achieved between the images obtained by the segmentation process proposed in this paper and the templates made by an elaborate manual tracing process. In addition to the proposed segmentation procedure and the fine tuning procedure that was developed, a global quantification of the soil coverage by residues for the sampled area was achieved that differed by only 0.85% from the quantification obtained using template images. Moreover, the proposed method does not depend on the type of residue present in the image. The study was conducted at the experimental farm “El Encín” in Alcalá de Henares (Madrid, Spain).
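The two figures quoted in this abstract, similarity to a hand-traced template and global residue coverage, can be illustrated with a minimal sketch. The abstract does not define its similarity measure, so plain pixel agreement is assumed here. The function names and toy masks are hypothetical, and the genetic-algorithm tuning of the segmentation is not shown.

```python
def coverage_percent(mask):
    """Percentage of pixels marked as residue (1) in a binary mask."""
    total = sum(len(row) for row in mask)
    residue = sum(sum(row) for row in mask)
    return 100.0 * residue / total


def similarity_percent(mask, template):
    """Percentage of pixels on which two equal-sized binary masks agree.

    Pixel agreement is an assumed measure; the paper may use another one.
    """
    pairs = [(m, t)
             for mask_row, tmpl_row in zip(mask, template)
             for m, t in zip(mask_row, tmpl_row)]
    agree = sum(1 for m, t in pairs if m == t)
    return 100.0 * agree / len(pairs)


# Toy 2x2 masks: 1 = residue, 0 = bare soil.
segmented = [[1, 0],
             [1, 1]]
template = [[1, 0],
            [0, 1]]

coverage_difference = abs(coverage_percent(segmented)
                          - coverage_percent(template))
```

Here the masks agree on three of four pixels, and the coverage estimates differ by 25 percentage points (75% against 50%).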

Relevance:

90.00%

Publisher:

Abstract:

The objective of this paper is to develop a method to hide information inside a binary image. An algorithm to embed data in scanned text or figures is proposed, based on the detection of suitable pixels, which must satisfy certain conditions so that the changes go undetected. In broad terms, the algorithm locates pixels placed at the contours of the figures or in areas where some scattering of the two colors can be found. The hidden information is independent of the values of the pixels in which it is embedded. Note that, depending on the sequence of bits to be hidden, around half of the pixels used to hold data bits will not be modified. The other basic characteristic of the proposed scheme is that the modified bits must be taken into consideration in order to recover the information, which consists of recovering the sequence of bits placed in the proper positions. An application to the banking sector is proposed for hiding information in signatures.
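The remark that around half of the carrier pixels stay unmodified can be illustrated with a minimal sketch. It is not the algorithm from the paper. It assumes the suitable pixel positions (contour or scattered-area pixels) have already been selected and are known to both sender and receiver, and it simply overwrites each selected pixel with a payload bit. All names are hypothetical.

```python
def embed(image, positions, bits):
    """Return a copy of `image` with bits[i] written at positions[i].

    `positions` is an assumed, pre-computed list of (row, col) carrier
    pixels; how they are detected is outside this sketch.
    """
    if len(bits) > len(positions):
        raise ValueError("not enough suitable pixels for the payload")
    stego = [row[:] for row in image]
    for (r, c), bit in zip(positions, bits):
        stego[r][c] = bit
    return stego


def extract(stego, positions, n_bits):
    """Read back the first n_bits payload bits from the carrier pixels."""
    return [stego[r][c] for r, c in positions[:n_bits]]


def count_modified(image, stego):
    """Number of pixels whose value differs between the two images."""
    return sum(a != b
               for row_a, row_b in zip(image, stego)
               for a, b in zip(row_a, row_b))


# Toy example: three carrier pixels, three payload bits.
cover = [[0, 1, 0],
         [1, 1, 0]]
carriers = [(0, 0), (0, 1), (1, 2)]
payload = [1, 1, 0]
stego = embed(cover, carriers, payload)
```

A carrier pixel changes only when the payload bit differs from the pixel's current value. In this example one of the three carriers is flipped, and for a random payload about half would be expected to change.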

Relevance:

90.00%

Publisher:

Abstract:

In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to efficiently and effectively retrieve WWW images based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model called Weight ChainNet to capture the semantics of the image content. A new formula, called list space model, for computing semantic similarities is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to cluster images with similar semantics into the same partition. Each cluster has a summary representative and all clusters' representatives are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Compared with some recently proposed methods, our results show that ICICLE provides better recall and precision. Our clustering technique ICC facilitates speedy retrieval of images without sacrificing recall and precision significantly.

Relevance:

90.00%

Publisher:

Abstract:

Recovering position from sensor information is an important problem in mobile robotics, known as localisation. Localisation requires a map or some other description of the environment to provide the robot with a context in which to interpret sensor data. The mobile robot system under discussion uses an artificial neural representation of position. Building a geometrical map of the environment with a single camera and artificial neural networks is difficult. Instead, it would be simpler to learn position as a function of the visual input. Usually, when learning images, an intermediate representation is employed. An appropriate starting point for a biologically plausible image representation is the complex cells of the visual cortex, which have invariance properties that appear useful for localisation. The effectiveness of two different complex cell models for localisation is evaluated. Finally, the ability of a simple neural network with single-shot learning to recognise these representations and localise a robot is examined.

Relevance:

90.00%

Publisher:

Abstract:

Purpose – This paper aims to evaluate critically the conventional binary hierarchical representation of the formal/informal economy dualism which reads informal employment as a residual and marginal sphere that has largely negative consequences for economic development and needs to be deterred. Design/methodology/approach – To contest this depiction, the results of 600 household interviews conducted in Ukraine during 2005/2006 on the extent and nature of their informal employment are reported. Findings – Informal employment is revealed to be an extensively used form of work and, through a richer and more textured understanding of the multiple roles that different forms of informal employment play, a form of work that positively contributes to economic and social development, acting both as an important seedbed for enterprise creation and development and as a primary vehicle through which community self-help is delivered in contemporary Ukraine. Research limitations/implications – This survey reveals that depicting informal employment as a hindrance to development and deterring engagement in this sphere results in state authorities destroying the entrepreneurial endeavour and active citizenship that other public policies are seeking to nurture. The paper concludes by addressing how this public policy paradox might start to be resolved. Originality/value – This paper is one of the first to document the role of informal employment in nurturing enterprise creation and development as well as community exchange.

Relevance:

90.00%

Publisher:

Abstract:

We present and evaluate a novel idea for scalable lossy colour image coding with Matching Pursuit (MP) performed in a transform domain. The idea is to exploit correlations in RGB colour space between image subbands after wavelet transformation rather than in the spatial domain. We propose a simple quantisation and coding scheme of colour MP decomposition based on Run Length Encoding (RLE) which can achieve comparable performance to JPEG 2000 even though the latter utilises careful data modelling at the coding stage. Thus, the obtained image representation has the potential to outperform JPEG 2000 with a more sophisticated coding algorithm.
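Run Length Encoding, named in this abstract as the basis of the coding scheme, can be illustrated with a generic sketch. The abstract does not describe how the MP decomposition is mapped to symbols, so the input here is just an assumed list of quantised values, and the function names are hypothetical.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


# Toy example: quantised coefficients with long runs of zeros.
quantised = [0, 0, 0, 5, 0, 0, -2, -2]
encoded = rle_encode(quantised)
```

RLE pays off when the quantised data contains long runs of identical values, which is typical of sparse decompositions where most coefficients are zero.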

Relevance:

90.00%

Publisher:

Abstract:

This thesis considers sparse approximation of still images as the basis of a lossy compression system. The Matching Pursuit (MP) algorithm is presented as a method particularly suited for application in lossy scalable image coding. Its multichannel extension, capable of exploiting inter-channel correlations, is found to be an efficient way to represent colour data in RGB colour space. Known problems with MP, namely the high computational complexity of encoding and dictionary design, are tackled by finding an appropriate partitioning of an image. The idea of performing MP in the spatio-frequency domain after a transform such as the Discrete Wavelet Transform (DWT) is explored. The main challenge, though, is to encode the image representation obtained after MP into a bit-stream. Novel approaches for encoding the atomic decomposition of a signal and for quantising colour amplitudes are proposed and evaluated. The image codec that has been built is capable of competing with scalable coders such as JPEG 2000 and SPIHT in terms of compression ratio.
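The basic single-channel Matching Pursuit loop can be sketched as follows, assuming a dictionary of unit-norm atoms stored as plain lists. This is the textbook greedy algorithm, not the multichannel or wavelet-domain variant developed in the thesis, and the names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy sparse approximation of `signal`.

    Each iteration picks the atom most correlated with the residual,
    records (atom_index, coefficient), and subtracts its contribution.
    Atoms are assumed to have unit norm.
    """
    residual = list(signal)
    decomposition = []
    for _ in range(n_iter):
        correlations = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        coef = correlations[k]
        if coef == 0:
            break                      # residual is orthogonal to all atoms
        decomposition.append((k, coef))
        residual = [r - coef * a for r, a in zip(residual, dictionary[k])]
    return decomposition, residual


# Toy example: the standard basis of R^3 as the dictionary.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
decomposition, residual = matching_pursuit([3.0, -4.0, 1.0], atoms, n_iter=2)
```

After two iterations the two largest components have been captured and only the smallest one remains in the residual. The list of (index, coefficient) pairs is the atomic decomposition that a codec would then have to quantise and encode.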

Relevance:

90.00%

Publisher:

Abstract:

Purpose: To evaluate and compare the performance of the Ripplet Type-1 transform and the directional discrete cosine transform (DDCT), and their combinations, for improved representation of MRI images while preserving their fine features such as edges along smooth curves and textures. Methods: In a novel image representation method based on fusion of Ripplet Type-1 and conventional/directional DCT transforms, source images were enhanced in terms of visual quality using Ripplet and DDCT and their various combinations. The enhancement achieved was quantified on the basis of peak signal to noise ratio (PSNR), mean square error (MSE), structural content (SC), average difference (AD), maximum difference (MD), normalized cross correlation (NCC), and normalized absolute error (NAE). To determine the attributes of both transforms, the transforms were also combined to represent the entire image. All possible combinations were tested to present a complete study of combinations of the transforms, and the contrasts among all the combinations were evaluated. Results: Using the direct combining method (DDCT) first and then the Ripplet method, a PSNR value of 32.3512 was obtained, which is higher than the PSNR values of the other combinations. This newly designed technique gives a PSNR value approximately equal to the PSNRs of the parent techniques. In addition, it was able to preserve edge information, texture information and various other directional image features. The fusion of DDCT followed by the Ripplet reproduced the best images. Conclusion: The transformation of images using Ripplet followed by DDCT ensures a more efficient method for the representation of images with preservation of their fine details like edges and textures.
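Two of the quality measures listed in this abstract, MSE and PSNR, follow standard definitions and can be computed as in the sketch below. It assumes 8-bit greyscale images stored as nested lists, so the peak value defaults to 255; the abstract does not state the bit depth of the MRI data. The function names are hypothetical.

```python
import math


def mse(reference, test):
    """Mean square error between two equal-sized greyscale images."""
    squared = [(a - b) ** 2
               for ref_row, test_row in zip(reference, test)
               for a, b in zip(ref_row, test_row)]
    return sum(squared) / len(squared)


def psnr(reference, test, peak=255.0):
    """Peak signal to noise ratio in dB; infinite for identical images.

    `peak` is the maximum possible pixel value (255 assumes 8-bit data).
    """
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Toy example: one pixel differs by 2 grey levels in a 2x2 image.
original = [[0, 0],
            [0, 0]]
reconstructed = [[0, 0],
                 [0, 2]]
quality_db = psnr(original, reconstructed)
```

A higher PSNR means the reconstructed image is closer to the reference, which is how the combinations of transforms are ranked in the results above.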