850 resultados para Text-Based Image Retrieval
Resumo:
In this paper we propose a cryptographic transformation based on matrix manipulations for image encryption. Substitution and diffusion operations, based on the matrix, facilitate fast conversion of plaintext and images into ciphertext and cipher images. The paper describes the encryption algorithm, discusses the simulation results and compares with results obtained from Advanced Encryption Standard (AES). It is shown that the proposed algorithm is capable of encrypting images eight times faster than AES.
Resumo:
The standard separable two dimensional wavelet transform has achieved a great success in image denoising applications due to its sparse representation of images. However it fails to capture efficiently the anisotropic geometric structures like edges and contours in images as they intersect too many wavelet basis functions and lead to a non-sparse representation. In this paper a novel de-noising scheme based on multi directional and anisotropic wavelet transform called directionlet is presented. The image denoising in wavelet domain has been extended to the directionlet domain to make the image features to concentrate on fewer coefficients so that more effective thresholding is possible. The image is first segmented and the dominant direction of each segment is identified to make a directional map. Then according to the directional map, the directionlet transform is taken along the dominant direction of the selected segment. The decomposed images with directional energy are used for scale dependent subband adaptive optimal threshold computation based on SURE risk. This threshold is then applied to the sub-bands except the LLL subband. The threshold corrected sub-bands with the unprocessed first sub-band (LLL) are given as input to the inverse directionlet algorithm for getting the de-noised image. Experimental results show that the proposed method outperforms the standard wavelet-based denoising methods in terms of numeric and visual quality
Resumo:
In this paper, we describe an interdisciplinary project in which visualization techniques were developed for and applied to scholarly work from literary studies. The aim was to bring Christof Schöch's electronic edition of Bérardier de Bataut's Essai sur le récit (1776) to the web. This edition is based on the Text Encoding Initiative's XML-based encoding scheme (TEI P5, subset TEI-Lite). This now de facto standard applies to machine-readable texts used chiefly in the humanities and social sciences. The intention of this edition is to make the edited text freely available on the web, to allow for alternative text views (here original and modern/corrected text), to ensure reader-friendly annotation and navigation, to permit on-line collaboration in encoding and annotation as well as user comments, all in an open source, generically usable, lightweight package. These aims were attained by relying on a GPL-based, public domain CMS (Drupal) and combining it with XSL-Stylesheets and Java Script.
Resumo:
We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
Resumo:
We present a new method for rendering novel images of flexible 3D objects from a small number of example images in correspondence. The strength of the method is the ability to synthesize images whose viewing position is significantly far away from the viewing cone of the example images ("view extrapolation"), yet without ever modeling the 3D structure of the scene. The method relies on synthesizing a chain of "trilinear tensors" that governs the warping function from the example images to the novel image, together with a multi-dimensional interpolation function that synthesizes the non-rigid motions of the viewed object from the virtual camera position. We show that two closely spaced example images alone are sufficient in practice to synthesize a significant viewing cone, thus demonstrating the ability of representing an object by a relatively small number of model images --- for the purpose of cheap and fast viewers that can run on standard hardware.
Resumo:
This paper presents an image-based rendering system using algebraic relations between different views of an object. The system uses pictures of an object taken from known positions. Given three such images it can generate "virtual'' ones as the object would look from any position near the ones that the two input images were taken from. The extrapolation from the example images can be up to about 60 degrees of rotation. The system is based on the trilinear constraints that bind any three view so fan object. As a side result, we propose two new methods for camera calibration. We developed and used one of them. We implemented the system and tested it on real images of objects and faces. We also show experimentally that even when only two images taken from unknown positions are given, the system can be used to render the object from other view points as long as we have a good estimate of the internal parameters of the camera used and we are able to find good correspondence between the example images. In addition, we present the relation between these algebraic constraints and a factorization method for shape and motion estimation. As a result we propose a method for motion estimation in the special case of orthographic projection.
Resumo:
The image comparison operation ??sessing how well one image matches another ??rms a critical component of many image analysis systems and models of human visual processing. Two norms used commonly for this purpose are L1 and L2, which are specific instances of the Minkowski metric. However, there is often not a principled reason for selecting one norm over the other. One way to address this problem is by examining whether one metric better captures the perceptual notion of image similarity than the other. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created via vector quantization. In both conditions the subjects showed a consistent preference for images matched using the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity.
Resumo:
In this paper we face the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it gets parallel to a planar object. Such problem has been treated for a long time in visual servoing. Our approach is based on linking to the camera several laser pointers so that its configuration is aimed to produce a suitable set of visual features. The aim of using structured light is not only for easing the image processing and to allow low-textured objects to be treated, but also for producing a control scheme with nice properties like decoupling, stability, well conditioning and good camera trajectory
Resumo:
Consumer reviews, opinions and shared experiences in the use of a product is a powerful source of information about consumer preferences that can be used in recommender systems. Despite the importance and value of such information, there is no comprehensive mechanism that formalizes the opinions selection and retrieval process and the utilization of retrieved opinions due to the difficulty of extracting information from text data. In this paper, a new recommender system that is built on consumer product reviews is proposed. A prioritizing mechanism is developed for the system. The proposed approach is illustrated using the case study of a recommender system for digital cameras
Resumo:
One of the key aspects in 3D-image registration is the computation of the joint intensity histogram. We propose a new approach to compute this histogram using uniformly distributed random lines to sample stochastically the overlapping volume between two 3D-images. The intensity values are captured from the lines at evenly spaced positions, taking an initial random offset different for each line. This method provides us with an accurate, robust and fast mutual information-based registration. The interpolation effects are drastically reduced, due to the stochastic nature of the line generation, and the alignment process is also accelerated. The results obtained show a better performance of the introduced method than the classic computation of the joint histogram
Resumo:
El test de circuits és una fase del procés de producció que cada vegada pren més importància quan es desenvolupa un nou producte. Les tècniques de test i diagnosi per a circuits digitals han estat desenvolupades i automatitzades amb èxit, mentre que aquest no és encara el cas dels circuits analògics. D'entre tots els mètodes proposats per diagnosticar circuits analògics els més utilitzats són els diccionaris de falles. En aquesta tesi se'n descriuen alguns, tot analitzant-ne els seus avantatges i inconvenients. Durant aquests últims anys, les tècniques d'Intel·ligència Artificial han esdevingut un dels camps de recerca més importants per a la diagnosi de falles. Aquesta tesi desenvolupa dues d'aquestes tècniques per tal de cobrir algunes de les mancances que presenten els diccionaris de falles. La primera proposta es basa en construir un sistema fuzzy com a eina per identificar. Els resultats obtinguts son força bons, ja que s'aconsegueix localitzar la falla en un elevat tant percent dels casos. Per altra banda, el percentatge d'encerts no és prou bo quan a més a més s'intenta esbrinar la desviació. Com que els diccionaris de falles es poden veure com una aproximació simplificada al Raonament Basat en Casos (CBR), la segona proposta fa una extensió dels diccionaris de falles cap a un sistema CBR. El propòsit no és donar una solució general del problema sinó contribuir amb una nova metodologia. Aquesta consisteix en millorar la diagnosis dels diccionaris de falles mitjançant l'addició i l'adaptació dels nous casos per tal d'esdevenir un sistema de Raonament Basat en Casos. Es descriu l'estructura de la base de casos així com les tasques d'extracció, de reutilització, de revisió i de retenció, fent èmfasi al procés d'aprenentatge. En el transcurs del text s'utilitzen diversos circuits per mostrar exemples dels mètodes de test descrits, però en particular el filtre biquadràtic és l'utilitzat per provar les metodologies plantejades, ja que és un dels benchmarks proposats en el context dels circuits analògics. Les falles considerades son paramètriques, permanents, independents i simples, encara que la metodologia pot ser fàcilment extrapolable per a la diagnosi de falles múltiples i catastròfiques. El mètode es centra en el test dels components passius, encara que també es podria extendre per a falles en els actius.
Resumo:
La tesis se centra en la Visión por Computador y, más concretamente, en la segmentación de imágenes, la cual es una de las etapas básicas en el análisis de imágenes y consiste en la división de la imagen en un conjunto de regiones visualmente distintas y uniformes considerando su intensidad, color o textura. Se propone una estrategia basada en el uso complementario de la información de región y de frontera durante el proceso de segmentación, integración que permite paliar algunos de los problemas básicos de la segmentación tradicional. La información de frontera permite inicialmente identificar el número de regiones presentes en la imagen y colocar en el interior de cada una de ellas una semilla, con el objetivo de modelar estadísticamente las características de las regiones y definir de esta forma la información de región. Esta información, conjuntamente con la información de frontera, es utilizada en la definición de una función de energía que expresa las propiedades requeridas a la segmentación deseada: uniformidad en el interior de las regiones y contraste con las regiones vecinas en los límites. Un conjunto de regiones activas inician entonces su crecimiento, compitiendo por los píxeles de la imagen, con el objetivo de optimizar la función de energía o, en otras palabras, encontrar la segmentación que mejor se adecua a los requerimientos exprsados en dicha función. Finalmente, todo esta proceso ha sido considerado en una estructura piramidal, lo que nos permite refinar progresivamente el resultado de la segmentación y mejorar su coste computacional. La estrategia ha sido extendida al problema de segmentación de texturas, lo que implica algunas consideraciones básicas como el modelaje de las regiones a partir de un conjunto de características de textura y la extracción de la información de frontera cuando la textura es presente en la imagen. Finalmente, se ha llevado a cabo la extensión a la segmentación de imágenes teniendo en cuenta las propiedades de color y textura. En este sentido, el uso conjunto de técnicas no-paramétricas de estimación de la función de densidad para la descripción del color, y de características textuales basadas en la matriz de co-ocurrencia, ha sido propuesto para modelar adecuadamente y de forma completa las regiones de la imagen. La propuesta ha sido evaluada de forma objetiva y comparada con distintas técnicas de integración utilizando imágenes sintéticas. Además, se han incluido experimentos con imágenes reales con resultados muy positivos.
Resumo:
The s–x model of microwave emission from soil and vegetation layers is widely used to estimate soil moisture content from passive microwave observations. Its application to prospective satellite-based observations aggregating several thousand square kilometres requires understanding of the effects of scene heterogeneity. The effects of heterogeneity in soil surface roughness, soil moisture, water area and vegetation density on the retrieval of soil moisture from simulated single- and multi-angle observing systems were tested. Uncertainty in water area proved the most serious problem for both systems, causing errors of a few percent in soil moisture retrieval. Single-angle retrieval was largely unaffected by the other factors studied here. Multiple-angle retrievals errors around one percent arose from heterogeneity in either soil roughness or soil moisture. Errors of a few percent were caused by vegetation heterogeneity. A simple extension of the model vegetation representation was shown to reduce this error substantially for scenes containing a range of vegetation types.
Resumo:
In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3-D model of the scene based on known baseline information from interocular separation or proprioception as the observer walks. An alternative is that observers use view-based methods to guide their actions and to represent the spatial layout of the scene. In this case, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk. We describe the way in which the eye movement strategy of animals simplifies motion processing if their goal is to move towards a desired image and discuss dorsal and ventral stream processing of moving images in that context. Although many questions about view-based approaches to scene representation remain unanswered, the solutions are likely to be highly relevant to understanding biological 3-D vision.
Resumo:
The potential of the τ-ω model for retrieving the volumetric moisture content of bare and vegetated soil from dual polarisation passive microwave data acquired at single and multiple angles is tested. Measurement error and several additional sources of uncertainty will affect the theoretical retrieval accuracy. These include uncertainty in the soil temperature, the vegetation structure and consequently its microwave singlescattering albedo, and uncertainty in soil microwave emissivity based on its roughness. To test the effects of these uncertainties for simple homogeneous scenes, we attempt to retrieve soil moisture from a number of simulated microwave brightness temperature datasets generated using the τ-ω model. The uncertainties for each influence are estimated and applied to curves generated for typical scenarios, and an inverse model used to retrieve the soil moisture content, vegetation optical depth and soil temperature. The effect of each influence on the theoretical soil moisture retrieval limit is explored, the likelihood of each sensor configuration meeting user requirements is assessed, and the most effective means of improving moisture retrieval indicated.