916 results for Morphing Alteration Detection Image Warping
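Abstract:
This paper introduces wavelet transform theory, discusses the criteria for selecting the basic wavelet function and the algorithms for computing the wavelet transform, and analyzes how the wavelet transform can be combined with artificial intelligence and other methods, along with the characteristics of such combinations. Through application examples in transient signal analysis, image edge detection, image denoising, pattern recognition, data compression, and fractal signal analysis, it discusses the advantages of the wavelet transform for processing non-stationary signals and complex images. Finally, the development of wavelet transform theory and its application prospects are described.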
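As a rough illustration of the denoising application mentioned in this abstract, the sketch below runs one level of the Haar wavelet transform and zeroes small detail coefficients before reconstructing. The choice of the Haar wavelet, the hard-threshold rule, and all function names are assumptions made for this example; the abstract does not say which wavelet or algorithm the paper uses.

```python
import math

SCALE = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """One level of the Haar transform; the input length must be even."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out


def hard_threshold(coeffs, t):
    """Zero every coefficient whose magnitude is at most t (assumed rule)."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def denoise(signal, t):
    """Suppress small detail coefficients, keep large ones (edges, transients)."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, hard_threshold(detail, t))
```

With `denoise([4.0, 4.2, 9.0, 1.0], 0.5)`, the small fluctuation in the first pair is averaged out while the sharp jump in the second pair survives, which is the behavior that makes wavelets attractive for non-stationary signals.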
Abstract:
The ability to detect faces in images is of critical ecological significance. It is a pre-requisite for other important face perception tasks such as person identification, gender classification and affect analysis. Here we address the question of how the visual system classifies images into face and non-face patterns. We focus on face detection in impoverished images, which allow us to explore information thresholds required for different levels of performance. Our experimental results provide lower bounds on image resolution needed for reliable discrimination between face and non-face patterns and help characterize the nature of facial representations used by the visual system under degraded viewing conditions. Specifically, they enable an evaluation of the contribution of luminance contrast, image orientation and local context on face-detection performance.
Abstract:
Several Lamb wave modes can be coupled to a particular structure, depending on its geometry and the transducer used to generate the guided waves. Each Lamb mode interacts in a particular way with different types of defects, such as notches, delaminations and surface defects, yielding different information that can be used to improve damage detection and characterization. An image compounding technique that uses the information obtained from different propagation modes of Lamb waves for non-destructive testing of plate-like structures is proposed. A linear array consisting of 16 piezoelectric elements is attached to a 1 mm thick aluminum plate, coupling the fundamental A0 and S0 modes at the frequencies of 100 kHz and 360 kHz, respectively. For each mode two images are obtained from amplitude and phase information: one image using the Total Focusing Method (TFM) and one phase image obtained from the Sign Coherence Factor (SCF). Each TFM image is multiplied by the SCF image of the respective mode to improve contrast and reduce side and grating lobe effects. The highly dispersive characteristic of the A0 mode is compensated for adequate defect detection. The information in the SCF images is used to select one of the TFM mode images, at each pixel, to obtain the compounded image. As a result, the dead zone is reduced and resolution and contrast are improved, enhancing damage detection when compared to the use of only one mode. © 2013 Elsevier Ltd.
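The per-pixel compounding step described in this abstract can be sketched as follows. Images are plain nested lists here, and the selection rule (keep the mode whose SCF value is larger at that pixel) is an assumption made for illustration; the abstract only says that the SCF images drive the selection, not how. The function name `compound` is hypothetical.

```python
def compound(tfm_a0, scf_a0, tfm_s0, scf_s0):
    """Compound two mode images pixel by pixel.

    Each TFM image is first weighted by the SCF image of its own mode,
    as the abstract describes. The per-pixel choice between modes uses
    the larger SCF value, which is an assumed rule for this sketch.
    """
    out = []
    for ta_row, ca_row, ts_row, cs_row in zip(tfm_a0, scf_a0, tfm_s0, scf_s0):
        row = []
        for ta, ca, ts, cs in zip(ta_row, ca_row, ts_row, cs_row):
            weighted_a0 = ta * ca  # TFM x SCF for the A0 mode
            weighted_s0 = ts * cs  # TFM x SCF for the S0 mode
            row.append(weighted_a0 if ca >= cs else weighted_s0)
        out.append(row)
    return out
```

For a one-row example with `tfm_a0=[[1.0, 0.5]]`, `scf_a0=[[0.9, 0.1]]`, `tfm_s0=[[0.2, 0.8]]`, `scf_s0=[[0.3, 0.7]]`, the first pixel comes from the A0 image and the second from the S0 image.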
Abstract:
In this paper we present a hybrid technique for correcting distortions that appear when projecting images onto geometrically complex, colored and textured surfaces. It analyzes the optical flow that results from perspective distortions during motions of the observer and tries to use this information for computing the correct image warping. If this fails due to an unreliable optical flow, an accurate, but slower and visible, structured light projection is automatically triggered. Together with an appropriate radiometric compensation, view-dependent content can be projected onto arbitrary everyday surfaces. An implementation mainly on the GPU ensures fast frame rates.
Abstract:
This thesis covers a broad part of the field of computational photography, including video stabilization and image warping techniques, introductions to light field photography and the conversion of monocular images and videos into stereoscopic 3D content. We present a user assisted technique for stereoscopic 3D conversion from 2D images. Our approach exploits the geometric structure of perspective images including vanishing points. We allow a user to indicate lines, planes, and vanishing points in the input image, and directly employ these as guides of an image warp that produces a stereo image pair. Our method is most suitable for scenes with large scale structures such as buildings and is able to skip the step of constructing a depth map. Further, we propose a method to acquire 3D light fields using a hand-held camera, and describe several computational photography applications facilitated by our approach. As the input we take an image sequence from a camera translating along an approximately linear path with limited camera rotations. Users can acquire such data easily in a few seconds by moving a hand-held camera. We convert the input into a regularly sampled 3D light field by resampling and aligning them in the spatio-temporal domain. We also present a novel technique for high-quality disparity estimation from light fields. Finally, we show applications including digital refocusing and synthetic aperture blur, foreground removal, selective colorization, and others.
Abstract:
This final-year project (Proyecto Fin de Carrera) deals with the recognition and identification of license plate characters. Systems of this kind are known worldwide as ANPR ("Automatic Number Plate Recognition") or LPR ("License Plate Recognition") systems. The large number of vehicles and volume of logistics moving around the planet every second makes it necessary to register them for processing and control. It is therefore necessary to implement a system that can correctly identify these resources for further processing, thus building a useful, agile and dynamic tool. The work is structured in several parts. The first sets out the objectives and motivations of the project. The second covers the theoretical, technical and mathematical processes that make up a typical ANPR system, with the aim of implementing a practical application that demonstrates their usefulness in any situation. The third develops the practical part, which builds on the theoretical basis of the work: it describes and develops the algorithms created to study and verify everything proposed up to that point and to observe their behavior. Several processes characteristic of character and pattern recognition are implemented, such as area or pattern detection, image rotation and transformation, edge detection, character and pattern segmentation, thresholding and normalization, feature and pattern extraction, neural networks, and finally optical character recognition (OCR). The last part presents the results obtained with the character recognition system implemented for this work and the conclusions drawn from it. Finally, future lines of improvement, development and research are proposed in order to build a more efficient and comprehensive system.
Abstract:
With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
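The slice-ranking search described in this abstract can be sketched roughly as below. This is not the OVA-File itself: the sketch stores exact vectors instead of VA-file approximations, represents a slice as a hypothetical `(center, vectors)` pair, and uses Euclidean distance, all of which are assumptions made for illustration. Only the control flow (rank slices by center distance, visit the first `delta` of them) follows the abstract.

```python
import heapq
import math


def approx_knn(slices, query, k, delta):
    """Approximate kNN over pre-partitioned slices.

    slices: list of (center, vectors) pairs (hypothetical layout).
    delta:  number of slices to visit; larger values cost more but
            can only improve the result quality.
    """
    # Rank slices by the distance between their center and the query.
    ranked = sorted(slices, key=lambda s: math.dist(s[0], query))
    # Visit every vector in the delta closest slices only.
    candidates = [v for _, vectors in ranked[:delta] for v in vectors]
    return heapq.nsmallest(k, candidates, key=lambda v: math.dist(v, query))


slices = [
    ((0.5, 0.0), [(0.0, 0.0), (1.0, 0.0)]),
    ((9.5, 9.5), [(9.0, 9.0), (10.0, 10.0)]),
    ((3.0, 3.0), [(1.5, 1.5), (4.5, 4.5)]),
]
fast = approx_knn(slices, (1.0, 1.0), k=2, delta=1)
better = approx_knn(slices, (1.0, 1.0), k=2, delta=2)
```

In this toy data, `delta=1` misses the true nearest neighbor `(1.5, 1.5)` because it lives in the second-ranked slice, while `delta=2` finds it, which illustrates the cost versus quality trade-off the abstract attributes to delta.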
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Object recognition has long been a core problem in computer vision. To improve object spatial support and speed up object localization for object recognition, generating high-quality category-independent object proposals as the input for object recognition systems has drawn attention recently. Given an image, we generate a limited number of high-quality, category-independent object proposals in advance, which can be used as inputs for many computer vision tasks. We present an efficient dictionary-based model for the image classification task. We further extend the work to a discriminative dictionary learning method for tensor sparse coding. In the first part, a multi-scale greedy-based object proposal generation approach is presented. Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation. We first identify the representative and diverse exemplar clusters within each scale. Object proposals are obtained by selecting a subset from the multi-scale segment pool via maximizing a submodular objective function, which consists of a weighted coverage term, a single-scale diversity term and a multi-scale reward term. The weighted coverage term forces the selected set of object proposals to be representative and compact; the single-scale diversity term encourages choosing segments from different exemplar clusters so that they will cover as many object patterns as possible; the multi-scale reward term encourages the selected proposals to be discriminative and selected from multiple layers generated by the hierarchical image segmentation. The experimental results on the Berkeley Segmentation Dataset and PASCAL VOC2012 segmentation dataset demonstrate the accuracy and efficiency of our object proposal model. Additionally, we validate our object proposals in simultaneous segmentation and detection and outperform the state of the art.
To classify the object in the image, we design a discriminative, structural low-rank framework for image classification. We use a supervised learning method to construct a discriminative and reconstructive dictionary. By introducing an ideal regularization term, we perform low-rank matrix recovery for contaminated training data from all categories simultaneously without losing structural information. A discriminative low-rank representation for images with respect to the constructed dictionary is obtained. With semantic structure information and strong identification capability, this representation is good for classification tasks even using a simple linear multi-classifier.
Abstract:
Road feature extraction from remotely sensed imagery has been a topic of great interest within the photogrammetry and remote sensing communities for over three decades. The majority of the early work focused only on linear feature detection approaches, with restrictive assumptions on image resolution and road appearance. The wide availability of high-resolution digital aerial images makes it possible to extract sub-road features, e.g. road pavement markings. In this paper, we focus on the automatic extraction of road lane markings, which are required by various lane-based vehicle applications, such as autonomous vehicle navigation and lane departure warning. The proposed approach consists of three phases: i) road centerline extraction from a low-resolution image, ii) road surface detection in the original image, and iii) pavement marking extraction on the generated road surface. The proposed method was tested on the aerial imagery dataset of the Bruce Highway, Queensland, and the results demonstrate the efficiency of our approach.
Abstract:
Robust texture recognition in underwater image sequences for marine pest population control, such as that of the Crown-Of-Thorns Starfish (COTS), is a relatively unexplored area of research. Typically, humans count COTS by laboriously processing individual images taken during surveys. Being able to autonomously collect and process images of reef habitat and segment out the various marine biota holds the promise of allowing researchers to gain a greater understanding of the marine ecosystem and evaluate the impact of different environmental variables. This research applies and extends the use of Local Binary Patterns (LBP) as a method for texture-based identification of COTS from survey images. The performance and accuracy of the algorithms are evaluated on an image data set taken on the Great Barrier Reef.
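For readers unfamiliar with Local Binary Patterns, the sketch below computes the basic 3x3 LBP code and a 256-bin histogram that could serve as a texture descriptor. It shows only the standard baseline operator, not the extensions this research proposes; the neighbor ordering and function names are assumptions made for the example.

```python
# Neighbor offsets, clockwise from the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-neighbor LBP code of the interior pixel at (r, c).

    A bit is set when the neighbor is at least as bright as the center.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Because each code depends only on whether neighbors are brighter than the center, the descriptor is unchanged by monotonic brightness shifts, a property that is useful under variable underwater lighting.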
Abstract:
This paper presents an alternative approach to image segmentation that uses the spatial distribution of edge pixels rather than pixel intensities. The segmentation is achieved by a multi-layered approach and is intended to find suitable landing areas for an aircraft emergency landing. We combine standard techniques (edge detectors) with newly developed algorithms (line expansion and geometry test) to design an original segmentation algorithm. Our approach removes the dependency on environmental factors that traditionally influence lighting conditions, which in turn have a negative impact on pixel-based segmentation techniques. We present test outcomes on realistic visual data collected from an aircraft, reporting preliminary results on detection performance. We demonstrate consistent performance, with a detection rate above 97%.
Abstract:
In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred in this regard due to their scalability and efficiency during estimation and evaluation. A popular approach is to represent a sequence using a bag of words (BOW) representation due to (i) its fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics in the sequence. A drawback of the BOW representation, however, is the intrinsic destruction of the temporal ordering information. In this paper we propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, is of a fixed dimensionality, making it easy to integrate with a linear detection function. Extensive experiments on the CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
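The two BOW properties and the drawback named in this abstract can be seen in a few lines. The sketch below assumes each frame has already been quantized to a codeword index; the function names, vocabulary size, and weights are hypothetical, and the paper's proposed order-preserving representation is not reproduced here.

```python
def bow_histogram(codewords, vocab_size):
    """Normalized bag-of-words histogram of a codeword sequence.

    The output length is vocab_size regardless of the sequence length.
    """
    hist = [0] * vocab_size
    for w in codewords:
        hist[w] += 1
    total = len(codewords)
    return [h / total for h in hist]


def linear_score(weights, bias, features):
    """Linear detection function: a dot product plus a bias."""
    return sum(w * f for w, f in zip(weights, features)) + bias


onset = bow_histogram([0, 1, 1, 2], vocab_size=3)   # e.g. neutral to peak
offset = bow_histogram([2, 1, 1, 0], vocab_size=3)  # same frames, reversed
same = onset == offset
```

Here `same` is true: the reversed sequence yields an identical histogram, so any linear detector applied to BOW features must score the two events equally, which is the loss of temporal ordering the abstract refers to.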
Abstract:
Usually digital image forgeries are created by copy-pasting a portion of an image onto some other image. While doing so, it is often necessary to resize the pasted portion of the image to suit the sampling grid of the host image. The resampling operation changes certain characteristics of the pasted portion, which when detected serves as a clue of tampering. In this paper, we present deterministic techniques to detect resampling, and localize the portion of the image that has been tampered with. Two of the techniques are in pixel domain and two others in frequency domain. We study the efficacy of our techniques against JPEG compression and subsequent resampling of the entire tampered image.