916 resultados para Digital processing image
Resumo:
Video coding technologies have played a major role in the explosion of large market digital video applications and services. In this context, the very popular MPEG-x and H-26x video coding standards adopted a predictive coding paradigm, where complex encoders exploit the data redundancy and irrelevancy to 'control' much simpler decoders. This codec paradigm fits well applications and services such as digital television and video storage where the decoder complexity is critical, but does not match well the requirements of emerging applications such as visual sensor networks where the encoder complexity is more critical. The Slepian Wolf and Wyner-Ziv theorems brought the possibility to develop the so-called Wyner-Ziv video codecs, following a different coding paradigm where it is the task of the decoder, and not anymore of the encoder, to (fully or partly) exploit the video redundancy. Theoretically, Wyner-Ziv video coding does not incur in any compression performance penalty regarding the more traditional predictive coding paradigm (at least for certain conditions). In the context of Wyner-Ziv video codecs, the so-called side information, which is a decoder estimate of the original frame to code, plays a critical role in the overall compression performance. For this reason, much research effort has been invested in the past decade to develop increasingly more efficient side information creation methods. This paper has the main objective to review and evaluate the available side information methods after proposing a classification taxonomy to guide this review, allowing to achieve more solid conclusions and better identify the next relevant research challenges. After classifying the side information creation methods into four classes, notably guess, try, hint and learn, the review of the most important techniques in each class and the evaluation of some of them leads to the important conclusion that the side information creation methods provide better rate-distortion (RD) performance depending on the amount of temporal correlation in each video sequence. It became also clear that the best available Wyner-Ziv video coding solutions are almost systematically based on the learn approach. The best solutions are already able to systematically outperform the H.264/AVC Intra, and also the H.264/AVC zero-motion standard solutions for specific types of content. (C) 2013 Elsevier B.V. All rights reserved.
Resumo:
L'objectiu d'aquest TFC és implementar un sistema senzill de protecció del copyright per a imatges de format BMP mitjançant un esquema de marcatge basat en codis duals de Hamming, que permeten recuperar la marca quan la imatge marcada ha estat sotmesa a certs tipus de processament digital i també en el cas de confabulació de dos compradors.
Resumo:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent ";topics"; using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos
Resumo:
A common problem in video surveys in very shallow waters is the presence of strong light fluctuations, due to sun light refraction. Refracted sunlight casts fast moving patterns, which can significantly degrade the quality of the acquired data. Motivated by the growing need to improve the quality of shallow water imagery, we propose a method to remove sunlight patterns in video sequences. The method exploits the fact that video sequences allow several observations of the same area of the sea floor, over time. It is based on computing the image difference between a given reference frame and the temporal median of a registered set of neighboring images. A key observation is that this difference will have two components with separable spectral content. One is related to the illumination field (lower spatial frequencies) and the other to the registration error (higher frequencies). The illumination field, recovered by lowpass filtering, is used to correct the reference image. In addition to removing the sunflickering patterns, an important advantage of the approach is the ability to preserve the sharpness in corrected image, even in the presence of registration inaccuracies. The effectiveness of the method is illustrated in image sets acquired under strong camera motion containing non-rigid benthic structures. The results testify the good performance and generality of the approach
Resumo:
The standard data fusion methods may not be satisfactory to merge a high-resolution panchromatic image and a low-resolution multispectral image because they can distort the spectral characteristics of the multispectral data. The authors developed a technique, based on multiresolution wavelet decomposition, for the merging and data fusion of such images. The method presented consists of adding the wavelet coefficients of the high-resolution image to the multispectral (low-resolution) data. They have studied several possibilities concluding that the method which produces the best results consists in adding the high order coefficients of the wavelet transform of the panchromatic image to the intensity component (defined as L=(R+G+B)/3) of the multispectral image. The method is, thus, an improvement on standard intensity-hue-saturation (IHS or LHS) mergers. They used the ¿a trous¿ algorithm which allows the use of a dyadic wavelet to merge nondyadic data in a simple and efficient scheme. They used the method to merge SPOT and LANDSATTM images. The technique presented is clearly better than the IHS and LHS mergers in preserving both spectral and spatial information.
Resumo:
Usual image fusion methods inject features from a high spatial resolution panchromatic sensor into every low spatial resolution multispectral band trying to preserve spectral signatures and improve spatial resolution to that of the panchromatic sensor. The objective is to obtain the image that would be observed by a sensor with the same spectral response (i.e., spectral sensitivity and quantum efficiency) as the multispectral sensors and the spatial resolution of the panchromatic sensor. But in these methods, features from electromagnetic spectrum regions not covered by multispectral sensors are injected into them, and physical spectral responses of the sensors are not considered during this process. This produces some undesirable effects, such as resolution overinjection images and slightly modified spectral signatures in some features. The authors present a technique which takes into account the physical electromagnetic spectrum responses of sensors during the fusion process, which produces images closer to the image obtained by the ideal sensor than those obtained by usual wavelet-based image fusion methods. This technique is used to define a new wavelet-based image fusion method.
Resumo:
The problem of understanding how humans perceive the quality of a reproduced image is of interest to researchers of many fields related to vision science and engineering: optics and material physics, image processing (compression and transfer), printing and media technology, and psychology. A measure for visual quality cannot be defined without ambiguity because it is ultimately the subjective opinion of an “end-user” observing the product. The purpose of this thesis is to devise computational methods to estimate the overall visual quality of prints, i.e. a numerical value that combines all the relevant attributes of the perceived image quality. The problem is limited to consider the perceived quality of printed photographs from the viewpoint of a consumer, and moreover, the study focuses only on digital printing methods, such as inkjet and electrophotography. The main contributions of this thesis are two novel methods to estimate the overall visual quality of prints. In the first method, the quality is computed as a visible difference between the reproduced image and the original digital (reference) image, which is assumed to have an ideal quality. The second method utilises instrumental print quality measures, such as colour densities, measured from printed technical test fields, and connects the instrumental measures to the overall quality via subjective attributes, i.e. attributes that directly contribute to the perceived quality, using a Bayesian network. Both approaches were evaluated and verified with real data, and shown to predict well the subjective evaluation results.
Resumo:
The work is intended to study the following important aspects of document image processing and develop new methods. (1) Segmentation ofdocument images using adaptive interval valued neuro-fuzzy method. (2) Improving the segmentation procedure using Simulated Annealing technique. (3) Development of optimized compression algorithms using Genetic Algorithm and parallel Genetic Algorithm (4) Feature extraction of document images (5) Development of IV fuzzy rules. This work also helps for feature extraction and foreground and background identification. The proposed work incorporates Evolutionary and hybrid methods for segmentation and compression of document images. A study of different neural networks used in image processing, the study of developments in the area of fuzzy logic etc is carried out in this work
Resumo:
Super Resolution problem is an inverse problem and refers to the process of producing a High resolution (HR) image, making use of one or more Low Resolution (LR) observations. It includes up sampling the image, thereby, increasing the maximum spatial frequency and removing degradations that arise during the image capture namely aliasing and blurring. The work presented in this thesis is based on learning based single image super-resolution. In learning based super-resolution algorithms, a training set or database of available HR images are used to construct the HR image of an image captured using a LR camera. In the training set, images are stored as patches or coefficients of feature representations like wavelet transform, DCT, etc. Single frame image super-resolution can be used in applications where database of HR images are available. The advantage of this method is that by skilfully creating a database of suitable training images, one can improve the quality of the super-resolved image. A new super resolution method based on wavelet transform is developed and it is better than conventional wavelet transform based methods and standard interpolation methods. Super-resolution techniques based on skewed anisotropic transform called directionlet transform are developed to convert a low resolution image which is of small size into a high resolution image of large size. Super-resolution algorithm not only increases the size, but also reduces the degradations occurred during the process of capturing image. This method outperforms the standard interpolation methods and the wavelet methods, both visually and in terms of SNR values. Artifacts like aliasing and ringing effects are also eliminated in this method. The super-resolution methods are implemented using, both critically sampled and over sampled directionlets. The conventional directionlet transform is computationally complex. Hence lifting scheme is used for implementation of directionlets. The new single image super-resolution method based on lifting scheme reduces computational complexity and thereby reduces computation time. The quality of the super resolved image depends on the type of wavelet basis used. A study is conducted to find the effect of different wavelets on the single image super-resolution method. Finally this new method implemented on grey images is extended to colour images and noisy images
Resumo:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent ";topics"; using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos
Resumo:
A common problem in video surveys in very shallow waters is the presence of strong light fluctuations, due to sun light refraction. Refracted sunlight casts fast moving patterns, which can significantly degrade the quality of the acquired data. Motivated by the growing need to improve the quality of shallow water imagery, we propose a method to remove sunlight patterns in video sequences. The method exploits the fact that video sequences allow several observations of the same area of the sea floor, over time. It is based on computing the image difference between a given reference frame and the temporal median of a registered set of neighboring images. A key observation is that this difference will have two components with separable spectral content. One is related to the illumination field (lower spatial frequencies) and the other to the registration error (higher frequencies). The illumination field, recovered by lowpass filtering, is used to correct the reference image. In addition to removing the sunflickering patterns, an important advantage of the approach is the ability to preserve the sharpness in corrected image, even in the presence of registration inaccuracies. The effectiveness of the method is illustrated in image sets acquired under strong camera motion containing non-rigid benthic structures. The results testify the good performance and generality of the approach
Resumo:
In this paper a novel rank estimation technique for trajectories motion segmentation within the Local Subspace Affinity (LSA) framework is presented. This technique, called Enhanced Model Selection (EMS), is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built by LSA. The results on synthetic and real data show that without any a priori knowledge, EMS automatically provides an accurate and robust rank estimation, improving the accuracy of the final motion segmentation
Resumo:
Automatic indexing and retrieval of digital data poses major challenges. The main problem arises from the ever increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions, or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. For a number of years research has been ongoing in the field of ontological engineering with the aim of using ontologies to add such (meta) knowledge to information. In this paper, we describe the architecture of a system (Dynamic REtrieval Analysis and semantic metadata Management (DREAM)) designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval. The DREAM Demonstrator has been evaluated as deployed in the film post-production phase to support the process of storage, indexing and retrieval of large data sets of special effects video clips as an exemplar application domain. This paper provides its performance and usability results and highlights the scope for future enhancements of the DREAM architecture which has proven successful in its first and possibly most challenging proving ground, namely film production, where it is already in routine use within our test bed Partners' creative processes. (C) 2009 Published by Elsevier B.V.
Resumo:
This thesis is about new digital moving image recording technologies and how they augment the distribution of creativity and the flexibility in moving image production systems, but also impose constraints on how images flow through the production system. The central concept developed in this thesis is ‘creative space’ which links quality and efficiency in moving image production to time for creative work, capacity of digital tools, user skills and the constitution of digital moving image material. The empirical evidence of this thesis is primarily based on semi-structured interviews conducted with Swedish film and TV production representatives.This thesis highlights the importance of pre-production technical planning and proposes a design management support tool (MI-FLOW) as a way to leverage functional workflows that is a prerequisite for efficient and cost effective moving image production.
Resumo:
The use of the maps obtained from remote sensing orbital images submitted to digital processing became fundamental to optimize conservation and monitoring actions of the coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column that degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of the traditional methods based on conventional statistical techniques to solve the problems related to the inter-classes took the search of alternative strategies in the area of the Computational Intelligence. In this work an ensemble classifiers was built based on the combination of Support Vector Machines and Minimum Distance Classifier with the objective of classifying remotely sensed images of coral reefs ecosystem. The system is composed by three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were revalued in the subsequent stage. The prediction non ambiguous for all the data happened through the reduction or elimination of the false positive. The images were classified into five bottom-types: deep water; under-water corals; inter-tidal corals; algal and sandy bottom. The highest overall accuracy (89%) was obtained from SVM with polynomial kernel. The accuracy of the classified image was compared through the use of error matrix to the results obtained by the application of other classification methods based on a single classifier (neural network and the k-means algorithm). In the final, the comparison of results achieved demonstrated the potential of the ensemble classifiers as a tool of classification of images from submerged areas subject to the noise caused by atmospheric effects and the water column