977 results for 2-D image inputs


Relevance: 100.00%
Publisher:
Abstract:

A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure, thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants.
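
A minimal sketch of the two steps the abstract describes (a radial FFT slice as the projection spectrum, then the deterministic bispectrum summed along radial lines in bifrequency space). The slicing grid, slope set, and toy image are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def radial_slice_fft(image, angle, n_samples=129):
    """Fourier slice theorem: the 1-D spectrum of the Radon projection at
    `angle` is a radial slice through the centred 2-D FFT of the image."""
    F = np.fft.fftshift(np.fft.fft2(image))
    cy, cx = np.array(image.shape) // 2
    r_max = min(cx, cy) - 1
    radii = np.linspace(-r_max, r_max, n_samples)
    xs = np.round(cx + radii * np.cos(angle)).astype(int)
    ys = np.round(cy + radii * np.sin(angle)).astype(int)
    return F[ys, xs]

def integrated_bispectrum_phases(X, slopes=np.linspace(0.05, 1.0, 20)):
    """Deterministic bispectrum B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2)),
    summed along radial lines f2 = a * f1; the phase of each integral
    is taken as the feature."""
    n = len(X)
    feats = []
    for a in slopes:
        acc = 0j
        for f1 in range(1, n):
            f2 = int(round(a * f1))
            if f1 + f2 >= n:
                break
            acc += X[f1] * X[f2] * np.conj(X[f1 + f2])
        feats.append(np.angle(acc))
    return np.array(feats)

img = np.zeros((64, 64)); img[20:40, 25:45] = 1.0   # toy binary image
X = radial_slice_fft(img, angle=0.3)[64:]           # non-negative frequencies of one projection
print(integrated_bispectrum_phases(X))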

Relevance: 100.00%
Publisher:
Abstract:

The recognition of 3-D objects from sequences of their 2-D views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs are combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may also be used for scene understanding by using a preprocessor and classifier that can determine both What objects are in a scene and Where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the classifier, a supervised learning system based on the fuzzy ARTMAP algorithm. Fuzzy ARTMAP learns 2-D view categories that are invariant under 2-D image translation, rotation, and dilation as well as 3-D image transformations that do not cause a predictive error. Evidence from sequences of 2-D view categories converges at 3-D object nodes that generate a response invariant under changes of 2-D view. These 3-D object nodes input to a working memory that accumulates evidence over time to improve object recognition. In the simplest working memory, each occurrence (nonoccurrence) of a 2-D view category increases (decreases) the corresponding node's activity in working memory. The maximally active node is used to predict the 3-D object. Recognition is studied with noisy and clean images using slow and fast learning. Slow learning at the fuzzy ARTMAP map field is adapted to learn the conditional probability of the 3-D object given the selected 2-D view category. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 128x128 2-D views of aircraft with and without additive noise. A recognition rate of up to 90% is achieved with one 2-D view and of up to 98.5% with three 2-D views. The properties of 2-D view and 3-D object category nodes are compared with those of cells in monkey inferotemporal cortex.
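
A minimal sketch of the simplest working memory described above: each observed 2-D view category increases the activity of its node, non-observed nodes decay, and the maximally active 3-D object node gives the prediction. The view-to-object table, decay constant, and increment are illustrative assumptions, not VIEWNET's learned parameters:

```python
import numpy as np

# Hypothetical association of 2-D view categories with 3-D object labels
view_to_object = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}
objects = sorted(set(view_to_object.values()))

def predict_object(view_sequence, decay=0.9, increment=1.0):
    """Accumulate evidence over a sequence of 2-D view category indices."""
    activity = np.zeros(len(view_to_object))      # working-memory activity per 2-D view node
    for view in view_sequence:
        activity *= decay                         # nonoccurrence: activity decreases
        activity[view] += increment               # occurrence: activity increases
    # 3-D object nodes pool evidence from their 2-D view category nodes
    evidence = {obj: sum(activity[v] for v, o in view_to_object.items() if o == obj)
                for obj in objects}
    return max(evidence, key=evidence.get), evidence

print(predict_object([0, 2, 0, 1]))               # views 0, 1 dominate -> object "A"
```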

Relevance: 100.00%
Publisher:
Abstract:

This thesis is an outcome of the investigations carried out on the development of an Artificial Neural Network (ANN) model to implement the 2-D DFT at high speed. A new definition of the 2-D DFT relation is presented. This new definition enables DFT computation organized in stages involving only real additions, except at the final stage of computation. The number of stages is always fixed at 4. Two different strategies are proposed: 1) a visual representation of 2-D DFT coefficients, and 2) a neural network approach. The visual representation scheme can be used to compute, analyze and manipulate 2-D signals such as images in the frequency domain in terms of symbols derived from the 2x2 DFT. This, in turn, can be represented in terms of real data. This approach can help analyze signals in the frequency domain even without computing the DFT coefficients. A hierarchical neural network model is developed to implement the 2-D DFT. Presently, this model is capable of implementing the 2-D DFT for a particular order N such that ((N))_4 = 2, i.e., N mod 4 = 2. The model can be developed into one that can implement the 2-D DFT for any order N up to a set maximum limited by hardware constraints. The reported method shows potential for implementing the 2-D DFT in hardware as a VLSI/ASIC.
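
The 2x2 DFT building block mentioned above needs no multiplications at all: for N = 2 every twiddle factor is +1 or -1, so all four coefficients of a real 2x2 block follow from real additions and subtractions alone. A minimal sketch (the check against numpy's FFT is only a sanity test, not part of the thesis' scheme):

```python
import numpy as np

def dft2x2(block):
    """2x2 DFT of a real block using only real additions/subtractions."""
    a, b = block[0]
    c, d = block[1]
    return np.array([[a + b + c + d, a - b + c - d],
                     [a + b - c - d, a - b - c + d]], dtype=float)

block = np.array([[1.0, 2.0], [3.0, 4.0]])
# For N = 2 the DFT of a real block is purely real, so this matches fft2 exactly
assert np.allclose(dft2x2(block), np.fft.fft2(block).real)
print(dft2x2(block))
```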

Relevance: 100.00%
Publisher:
Abstract:

The thesis explores the area of still image compression. Image compression techniques can be broadly classified into lossless and lossy compression. The most common lossy compression techniques are based on transform coding, vector quantization and fractals. Transform coding is the simplest of these and generally employs reversible transforms such as the DCT and DWT. The Mapped Real Transform (MRT) is an evolving integer transform based on real additions alone. The present research work aims at developing new image compression techniques based on MRT. Most transform coding techniques employ fixed block size image segmentation, usually 8x8. Hence, a fixed block size transform coder is implemented using MRT and its merits and demerits are analyzed for both 8x8 and 4x4 blocks. The N^2 unique MRT coefficients for each block are computed using templates. Considering the merits and demerits of fixed block size transform coding, a hybrid form of these techniques is implemented to improve compression performance. The performance of the hybrid coder is found to be better than that of the fixed block size coders; thus, if the block size is made adaptive, the performance can be further improved. In adaptive block size coding, the block size may vary from the size of the image down to 2x2, which makes the computation of MRT using templates impractical due to memory requirements. So, an adaptive transform coder based on the Unique MRT (UMRT), a compact form of MRT, is implemented to obtain better performance in terms of PSNR and HVS. The suitability of MRT for vector quantization of images is then examined, and a UMRT-based Classified Vector Quantization (CVQ) scheme is implemented subsequently. The edges in the images are identified and classified by employing a UMRT-based criterion. Based on the above experiments, a new technique named "MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ)" is developed. Its performance is evaluated and compared against existing techniques. A comparison with standard JPEG and the well-known Shapiro's Embedded Zerotree Wavelet (EZW) coder shows that the proposed technique gives better performance for the majority of images.
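
The MRT itself is not defined in the abstract, so the following is only a generic fixed-block-size transform-coding sketch with an 8x8 DCT standing in for the MRT; the quantization step, test image, and PSNR check are illustrative assumptions, not the thesis' MATC-CVQ coder:

```python
import numpy as np
from scipy.fft import dctn, idctn

def block_transform_code(image, block=8, q=20.0):
    """Fixed 8x8 block transform coding: transform, uniform-quantize,
    dequantize, inverse-transform each block (DCT as a stand-in transform)."""
    h, w = image.shape
    out = np.zeros_like(image, dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            coeffs = dctn(image[i:i+block, j:j+block], norm='ortho')
            coeffs = np.round(coeffs / q) * q           # uniform quantization
            out[i:i+block, j:j+block] = idctn(coeffs, norm='ortho')
    return out

def psnr(orig, rec):
    mse = np.mean((orig.astype(float) - rec) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

img = np.tile(np.linspace(0, 255, 64), (64, 1))         # synthetic ramp image
print("PSNR: %.2f dB" % psnr(img, block_transform_code(img)))
```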

Relevance: 100.00%
Publisher:
Abstract:

The transformation technique is a tool for designing 2-D FIR filters, useful for the design of specially shaped filters with passband/stopband regions not centred around the origin. The authors extend this technique to design two types of filters. A notch filter has a stopband centred about a small region in the 2-D frequency plane; the authors propose an extension of the transformation technique, combined with the windowing concept, to achieve the design of notch filters. A directional filter has a passband extending fully along a straight line passing through the origin; the transformation technique is further extended to yield such directional filters. Design and application examples for both these filters are also presented.
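
For context, a minimal sketch of the basic transformation (McClellan-type) technique that the paper extends: a 1-D zero-phase prototype H(w) = sum_n a_n cos(n w) is mapped to a 2-D response by substituting cos(w) -> F(w1, w2) via Chebyshev polynomials. The prototype cutoff and the standard nearly-circular kernel are illustrative choices; the notch and directional extensions of the paper are not reproduced:

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from scipy.signal import firwin

# 1-D linear-phase prototype h[n] of length 2M+1: H(w) = a0 + 2*sum_{n>=1} h[M+n] cos(n w)
h = firwin(numtaps=21, cutoff=0.4)            # symmetric lowpass FIR, cutoff in Nyquist units
M = len(h) // 2
a = np.concatenate(([h[M]], 2 * h[M + 1:]))   # cosine-series coefficients a_0 .. a_M

# Standard transformation kernel: cos(w) -> F(w1, w2), nearly circular contours
w1, w2 = np.meshgrid(np.linspace(-np.pi, np.pi, 256), np.linspace(-np.pi, np.pi, 256))
F = -0.5 + 0.5 * np.cos(w1) + 0.5 * np.cos(w2) + 0.5 * np.cos(w1) * np.cos(w2)

# H2D(w1, w2) = sum_n a_n T_n(F), using cos(n w) = T_n(cos w)
H2D = C.chebval(F, a)
print(H2D.shape, round(float(H2D.min()), 3), round(float(H2D.max()), 3))
```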

Relevance: 100.00%
Publisher:
Abstract:

In this paper, we propose a new approach to construct a 2-dimensional (2-D) directional filter bank (DFB) by cascading a 2-D nonseparable checkerboard-shaped filter pair with a 2-D separable cosine modulated filter bank (CMFB). Similar to the diagonal subbands in 2-D separable wavelets, most of the subbands in 2-D separable CMFBs, which are tensor products of two 1-D CMFBs, are poor in directional selectivity because the frequency supports of most of the subband filters are concentrated along two different directions. To improve the directional selectivity, we propose a new DFB to realize the subband decomposition. First, a checkerboard-shaped filter pair is used to decompose an input image into two images containing different directional information of the original image. Next, a 2-D separable CMFB is applied to each of the two images for directional decomposition. The new DFB is easy to design and has the merits of a low redundancy ratio and fine directional-frequency tiling. As an application, the BLS-GSM algorithm for image denoising is extended to use the new DFBs. Experimental results show that the proposed DFB achieves better denoising performance than methods using other DFBs for images with abundant textures.
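
A minimal sketch of the pre-split idea: a frequency-domain filter pair separates an image into two components carrying different directional content before a separable decomposition is applied. The ideal hourglass-shaped masks below are a simplification standing in for the paper's nonseparable checkerboard-shaped pair:

```python
import numpy as np

def directional_split(image):
    """Split an image into two components whose spectra occupy |w1| >= |w2|
    and |w1| < |w2| (ideal masks; a stand-in for a designed filter pair)."""
    F = np.fft.fft2(image)
    w1 = np.fft.fftfreq(image.shape[0])[:, None]     # vertical frequency
    w2 = np.fft.fftfreq(image.shape[1])[None, :]     # horizontal frequency
    mask = np.abs(w1) >= np.abs(w2)
    comp_a = np.fft.ifft2(F * mask).real
    comp_b = np.fft.ifft2(F * ~mask).real
    return comp_a, comp_b                            # the two masks tile the spectrum

img = np.random.rand(64, 64)
a, b = directional_split(img)
assert np.allclose(a + b, img)                       # the split is perfectly reconstructing
```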

Relevance: 100.00%
Publisher:
Abstract:

We discuss a strategy for visual recognition that forms groups of salient image features and then uses these groups to index into a database to find all matching groups of model features. We discuss the most space-efficient possible method of representing 3-D models for indexing from 2-D data, and show how to account for sensing error when indexing. We also present a convex grouping method that is robust and efficient, both theoretically and in practice. Finally, we combine these modules into a complete recognition system and test its performance on many real images.
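
A minimal geometric-hashing-style sketch of the group-then-index idea: a group of image features defines a basis, the remaining features get basis-invariant coordinates, and those coordinates index a table built offline from model groups. The 2-D similarity-invariant encoding, quantization step, and toy models are illustrative assumptions; the paper's 3-D-from-2-D representation and error analysis are not reproduced:

```python
import numpy as np
from collections import defaultdict

def invariant_coords(p, basis, quant=0.1):
    """Quantized coordinates of p in the frame of a 2-point basis:
    invariant to 2-D translation, rotation, and scale."""
    b0, b1 = np.asarray(basis, float)
    e1 = b1 - b0
    e2 = np.array([-e1[1], e1[0]])                   # perpendicular axis
    uv = np.linalg.solve(np.array([e1, e2]).T, np.asarray(p, float) - b0)
    return tuple(np.round(uv / quant).astype(int))

def build_index(models):
    """Offline: hash every model point against every ordered basis pair."""
    table = defaultdict(set)
    for name, pts in models.items():
        pts = np.asarray(pts, float)
        for i in range(len(pts)):
            for j in range(len(pts)):
                if i != j:
                    for p in pts:
                        table[invariant_coords(p, (pts[i], pts[j]))].add(name)
    return table

models = {"wedge": [(0, 0), (2, 0), (2, 1)], "bar": [(0, 0), (4, 0), (4, 0.5)]}
index = build_index(models)

# Online: a salient group in the image supplies the basis, points cast votes
scene = np.array([(1, 1), (1, 5), (-1, 5)], float)   # "wedge" rotated, scaled, translated
votes = defaultdict(int)
for p in scene:
    for name in index.get(invariant_coords(p, (scene[0], scene[1])), ()):
        votes[name] += 1
print(dict(votes))                                   # "wedge" should collect the most votes
```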

Relevance: 100.00%
Publisher:
Abstract:

Techniques, suitable for parallel implementation, for robust 2-D model-based object recognition in the presence of sensor error are studied. Models and scene data are represented as local geometric features, and the robust hypothesizing of feature matches and transformations is considered. Bounds on the error in the image feature geometry are assumed, constraining the possible matches and transformations. Transformation sampling is introduced as a simple, robust, polynomial-time, and highly parallel method of searching the space of transformations to hypothesize feature matches. Key to the approach is that error in image feature measurement is explicitly accounted for. A Connection Machine implementation and experiments on real images are presented.
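
A minimal sequential sketch of transformation sampling under bounded error: candidate similarity transforms are sampled on a grid, model features are projected into the image, and a hypothesis scores one vote per model feature landing within the error bound of some scene feature. The grid resolution, error bound, and toy data are illustrative assumptions, not the paper's Connection Machine implementation:

```python
import numpy as np
from itertools import product

def score(transform, model, scene, eps):
    """Count model points mapped to within eps of some scene point."""
    theta, s, tx, ty = transform
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    mapped = s * model @ R.T + np.array([tx, ty])
    d = np.linalg.norm(mapped[:, None, :] - scene[None, :, :], axis=2)
    return int(np.sum(d.min(axis=1) <= eps))

model = np.array([[0, 0], [1, 0], [1, 1], [0, 2]], float)
# scene = model rotated 90 degrees and translated by (2, 1), plus two clutter points
scene = np.array([[2, 1], [2, 2], [1, 2], [0, 1], [3.1, 0.4], [-1.0, 3.0]], float)

grid = product(np.linspace(0, 2 * np.pi, 16, endpoint=False),   # sampled rotations
               np.arange(-1.0, 4.0, 0.25),                      # sampled x-translations
               np.arange(-1.0, 4.0, 0.25))                      # sampled y-translations
best_score, best_t = max(((score((th, 1.0, tx, ty), model, scene, eps=0.15),
                           (th, 1.0, tx, ty)) for th, tx, ty in grid),
                         key=lambda c: c[0])
print("matched features:", best_score, "transform:", np.round(best_t, 2))
```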

Relevance: 100.00%
Publisher:
Abstract:

In this paper we present a new method for simultaneously determining the three-dimensional (3-D) shape and motion of a non-rigid object from uncalibrated two-dimensional (2-D) images, without assuming the distribution characteristics. A non-rigid motion can be treated as a combination of a rigid rotation and a non-rigid deformation. To seek accurate recovery of deformable structures, we estimate the probability distribution function of the corresponding features through random sampling, incorporating an established probabilistic model. The fit between the observations and the projection of the estimated 3-D structure is evaluated using a Markov chain Monte Carlo based expectation maximisation algorithm. Applications of the proposed method to both synthetic and real image sequences are demonstrated with promising results.
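
A drastically simplified stand-in for the sampling idea: a Metropolis-Hastings chain explores candidate 3-D structures, scored by the misfit between the observed 2-D features and the projection of the candidate. The known rotations, orthographic projection, noise level, and proposal step are illustrative assumptions; the paper's deformation model and full MCMC-EM procedure are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ground truth: 5 3-D points seen in 3 frames under known rotations
X_true = rng.normal(size=(5, 3))
def rot_y(a):
    return np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
Rs = [rot_y(a) for a in (0.0, 0.3, 0.6)]
obs = [(X_true @ R.T)[:, :2] + 0.01 * rng.normal(size=(5, 2)) for R in Rs]

def energy(X, sigma=0.05):
    """Reprojection misfit (negative log-likelihood up to a constant)."""
    return sum(np.sum(((X @ R.T)[:, :2] - o) ** 2) for R, o in zip(Rs, obs)) / (2 * sigma**2)

# Random-walk Metropolis-Hastings over the 3-D structure
X = rng.normal(size=(5, 3))
e0 = e = energy(X)
for _ in range(10000):
    Xp = X + 0.05 * rng.normal(size=X.shape)
    ep = energy(Xp)
    if np.log(rng.random()) < e - ep:        # accept with probability min(1, exp(e - ep))
        X, e = Xp, ep
print("misfit: initial %.1f -> final %.1f" % (e0, e))
```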

Relevance: 100.00%
Publisher:
Abstract:

The inclusion of the Discrete Wavelet Transform in the JPEG-2000 standard has added impetus to research on hardware architectures for the two-dimensional wavelet transform. In this paper, a VLSI architecture for performing the symmetrically extended two-dimensional transform is presented. This architecture conforms to the JPEG-2000 standard and is capable of near-optimal performance when dealing with the image boundaries. The architecture also achieves efficient processor utilization. Implementation results based on a Xilinx Virtex-2 FPGA device are included.
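
For reference, a minimal software sketch of the kind of transform such an architecture implements: one level of the JPEG-2000 reversible 5/3 lifting DWT applied along rows and then columns, with whole-sample symmetric extension at the image boundaries. The lifting formulation is standard; the toy image is an illustrative assumption, and none of the paper's hardware scheduling is reflected here:

```python
import numpy as np

def dwt53_1d(x):
    """One level of the reversible 5/3 lifting transform on an even-length
    integer signal, using whole-sample symmetric extension at the ends."""
    x = np.asarray(x, dtype=np.int64)
    xe = np.pad(x, 1, mode='reflect')             # x[-1] = x[1], x[N] = x[N-2]
    even, odd, right = xe[1:-2:2], xe[2::2], xe[3::2]
    d = odd - (even + right) // 2                 # predict step: high-pass coefficients
    d_left = np.concatenate(([d[0]], d[:-1]))     # d[-1] = d[0] by symmetry
    s = even + (d_left + d + 2) // 4              # update step: low-pass coefficients
    return s, d

def dwt53_2d(img):
    """Separable 2-D transform: rows first, then columns; the result holds the
    four subband quadrants of one decomposition level (low-pass at top-left)."""
    rows = np.array([np.concatenate(dwt53_1d(r)) for r in img])
    return np.array([np.concatenate(dwt53_1d(c)) for c in rows.T]).T

img = (np.arange(64).reshape(8, 8) * 7) % 251     # small synthetic integer image
print(dwt53_2d(img))
```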

Relevance: 100.00%
Publisher:
Abstract:

Precision edge feature extraction is a very important step in vision. Researchers mainly use step edges to model an edge at the subpixel level. In this paper we describe a new technique for two-dimensional edge feature extraction to subpixel accuracy using a general edge model. Using six basic edge types to model edges, the edge parameters at the subpixel level are extracted by fitting the model to the image signal using a least-squared-error fitting technique.
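
A minimal 1-D illustration of model-based subpixel edge localization: a smoothed step-edge model is fitted to a sampled intensity profile by least squares, and the fitted offset gives the edge position to subpixel accuracy. The erf-based step model and synthetic profile are illustrative assumptions; the paper uses a general 2-D model with six basic edge types:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def step_edge(x, position, amplitude, offset, blur):
    """Gaussian-blurred step-edge model."""
    return offset + 0.5 * amplitude * (1 + erf((x - position) / (np.sqrt(2) * blur)))

# Synthetic profile: true edge at 10.3 pixels, plus sensor noise
x = np.arange(20, dtype=float)
rng = np.random.default_rng(1)
profile = step_edge(x, 10.3, 80.0, 40.0, 1.2) + rng.normal(0, 1.0, x.size)

# Least-squared-error fit of the edge model to the sampled intensities
p0 = [x.size / 2, profile.max() - profile.min(), profile.min(), 1.0]
params, _ = curve_fit(step_edge, x, profile, p0=p0)
print("subpixel edge position: %.2f" % params[0])     # close to 10.3
```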

Relevance: 100.00%
Publisher:
Abstract:

Land seismic data are affected by irregularities in the measurement surface, e.g. topography. Hence, to obtain a high-resolution seismic image, these irregularities must be corrected using seismic processing techniques, e.g. field and residual static corrections. The Common-Reflection-Surface (CRS) stacking method is a new processing technique for simulating zero-offset (ZO) seismic sections from multi-coverage seismic data. The method is based on a second-order hyperbolic paraxial traveltime approximation referred to the normal (central) ray. For a planar measurement surface, the CRS stacking operator depends on three parameters, namely the emergence angle of the normal ray, the curvature of the Normal-Incidence-Point (NIP) wave, and the curvature of the Normal (N) wave. In this article, the 2-D ZO CRS stacking method is modified to handle a measurement surface with smooth topography, still as a function of these parameters. With this new CRS formalism, we obtain a high-resolution ZO seismic section without applying static corrections, in which the three relevant parameters of the CRS stacking process are estimated at each point of the section.
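
For reference, a minimal sketch of the planar-surface 2-D CRS stacking operator described above, i.e., the hyperbolic paraxial traveltime as a function of midpoint displacement and half-offset for a given emergence angle and NIP- and N-wave curvatures. The parameter values are illustrative assumptions, and the article's smooth-topography modification is not reproduced:

```python
import numpy as np

def crs_traveltime(xm, h, x0, t0, v0, alpha, K_NIP, K_N):
    """Hyperbolic 2-D CRS stacking operator for a planar measurement surface:
    t^2 = (t0 + 2 sin(alpha) (xm - x0) / v0)^2
          + (2 t0 cos(alpha)^2 / v0) * (K_N (xm - x0)^2 + K_NIP h^2),
    with midpoint xm, half-offset h, central point x0, ZO time t0, near-surface
    velocity v0, emergence angle alpha, and wavefront curvatures K_NIP and K_N."""
    dx = xm - x0
    t_sq = (t0 + 2.0 * np.sin(alpha) * dx / v0) ** 2 \
           + (2.0 * t0 * np.cos(alpha) ** 2 / v0) * (K_N * dx ** 2 + K_NIP * h ** 2)
    return np.sqrt(t_sq)

# Illustrative values only (not taken from the article)
xm, h = np.meshgrid(np.linspace(-500, 500, 101), np.linspace(0, 250, 26))
t = crs_traveltime(xm, h, x0=0.0, t0=1.2, v0=1800.0,
                   alpha=np.radians(8), K_NIP=4e-4, K_N=1e-4)
print(t.shape, round(float(t.min()), 3), round(float(t.max()), 3))
```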