912 resultados para Efficient image processing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

With security and surveillance, there is an increasing need to process image data efficiently and effectively either at source or in a large data network. Whilst a Field-Programmable Gate Array (FPGA) has been seen as a key technology for enabling this, the design process has been viewed as problematic in terms of the time and effort needed for implementation and verification. The work here proposes a different approach of using optimized FPGA-based soft-core processors which allows the user to exploit the task and data level parallelism to achieve the quality of dedicated FPGA implementations whilst reducing design time. The paper also reports some preliminary
progress on the design flow to program the structure. An implementation for a Histogram of Gradients algorithm is also reported which shows that a performance of 328 fps can be achieved with this design approach, whilst avoiding the long design time, verification and debugging steps associated with conventional FPGA implementations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Coupled map lattices (CML) can describe many relaxation and optimization algorithms currently used in image processing. We recently introduced the ‘‘plastic‐CML’’ as a paradigm to extract (segment) objects in an image. Here, the image is applied by a set of forces to a metal sheet which is allowed to undergo plastic deformation parallel to the applied forces. In this paper we present an analysis of our ‘‘plastic‐CML’’ in one and two dimensions, deriving the nature and stability of its stationary solutions. We also detail how to use the CML in image processing, how to set the system parameters and present examples of it at work. We conclude that the plastic‐CML is able to segment images with large amounts of noise and large dynamic range of pixel values, and is suitable for a very large scale integration(VLSI) implementation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class­ based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state­ of ­the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state­ of­ the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Radio Simultaneous Location and Mapping (SLAM) consists of the simultaneous tracking of the target and estimation of the surrounding environment, to build a map and estimate the target movements within it. It is an increasingly exploited technique for automotive applications, in order to improve the localization of obstacles and the target relative movement with respect to them, for emergency situations, for example when it is necessary to explore (with a drone or a robot) environments with a limited visibility, or for personal radar applications, thanks to its versatility and cheapness. Until today, these systems were based on light detection and ranging (lidar) or visual cameras, high-accuracy and expensive approaches that are limited to specific environments and weather conditions. Instead, in case of smoke, fog or simply darkness, radar-based systems can operate exactly in the same way. In this thesis activity, the Fourier-Mellin algorithm is analyzed and implemented, to verify the applicability to Radio SLAM, in which the radar frames can be treated as images and the radar motion between consecutive frames can be covered with registration. Furthermore, a simplified version of that algorithm is proposed, in order to solve the problems of the Fourier-Mellin algorithm when working with real radar images and improve the performance. The INRAS RBK2, a MIMO 2x16 mmWave radar, is used for experimental acquisitions, consisting of multiple tests performed in Lab-E of the Cesena Campus, University of Bologna. The different performances of Fourier-Mellin and its simplified version are compared also with the MatchScan algorithm, a classic algorithm for SLAM systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The integration of CMOS cameras with embedded processors and wireless communication devices has enabled the development of distributed wireless vision systems. Wireless Vision Sensor Networks (WVSNs), which consist of wirelessly connected embedded systems with vision and sensing capabilities, provide wide variety of application areas that have not been possible to realize with the wall-powered vision systems with wired links or scalar-data based wireless sensor networks. In this paper, the design of a middleware for a wireless vision sensor node is presented for the realization of WVSNs. The implemented wireless vision sensor node is tested through a simple vision application to study and analyze its capabilities, and determine the challenges in distributed vision applications through a wireless network of low-power embedded devices. The results of this paper highlight the practical concerns for the development of efficient image processing and communication solutions for WVSNs and emphasize the need for cross-layer solutions that unify these two so-far-independent research areas.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ultrasound imaging is widely used in medical diagnostics as it is the fastest, least invasive, and least expensive imaging modality. However, ultrasound images are intrinsically difficult to be interpreted. In this scenario, Computer Aided Detection (CAD) systems can be used to support physicians during diagnosis providing them a second opinion. This thesis discusses efficient ultrasound processing techniques for computer aided medical diagnostics, focusing on two major topics: (i) Ultrasound Tissue Characterization (UTC), aimed at characterizing and differentiating between healthy and diseased tissue; (ii) Ultrasound Image Segmentation (UIS), aimed at detecting the boundaries of anatomical structures to automatically measure organ dimensions and compute clinically relevant functional indices. Research on UTC produced a CAD tool for Prostate Cancer detection to improve the biopsy protocol. In particular, this thesis contributes with: (i) the development of a robust classification system; (ii) the exploitation of parallel computing on GPU for real-time performance; (iii) the introduction of both an innovative Semi-Supervised Learning algorithm and a novel supervised/semi-supervised learning scheme for CAD system training that improve system performance reducing data collection effort and avoiding collected data wasting. The tool provides physicians a risk map highlighting suspect tissue areas, allowing them to perform a lesion-directed biopsy. Clinical validation demonstrated the system validity as a diagnostic support tool and its effectiveness at reducing the number of biopsy cores requested for an accurate diagnosis. For UIS the research developed a heart disease diagnostic tool based on Real-Time 3D Echocardiography. Thesis contributions to this application are: (i) the development of an automated GPU based level-set segmentation framework for 3D images; (ii) the application of this framework to the myocardium segmentation. Experimental results showed the high efficiency and flexibility of the proposed framework. Its effectiveness as a tool for quantitative analysis of 3D cardiac morphology and function was demonstrated through clinical validation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mining in the Iberian Pyrite Belt (IPB), the biggest VMS metallogenetic province known in the world to date, has to face a deep crisis in spite of the huge reserves still known after ≈5 000 years of production. This is due to several factors, as the difficult processing of complex Cu-Pb-Zn-Ag- Au ores, the exhaustion of the oxidation zone orebodies (the richest for gold, in gossan), the scarce demand for sulphuric acid in the world market, and harder environmental regulations. Of these factors, only the first and the last mentioned can be addressed by local ore geologists. A reactivation of mining can therefore only be achieved by an improved and more efficient ore processing, under the constraint of strict environmental controls. Digital image analysis of the ores, coupled to reflected light microscopy, provides a quantified and reliable mineralogical and textural characterization of the ores. The automation of the procedure for the first time furnishes the process engineers with real-time information, to improve the process and to preclude or control pollution; it can be applied to metallurgical tailings as well. This is shown by some examples of the IPB.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Functional brain imaging techniques such as functional MRI (fMRI) that allow the in vivo investigation of the human brain have been exponentially employed to address the neurophysiological substrates of emotional processing. Despite the growing number of fMRI studies in the field, when taken separately these individual imaging studies demonstrate contrasting findings and variable pictures, and are unable to definitively characterize the neural networks underlying each specific emotional condition. Different imaging packages, as well as the statistical approaches for image processing and analysis, probably have a detrimental role by increasing the heterogeneity of findings. In particular, it is unclear to what extent the observed neurofunctional response of the brain cortex during emotional processing depends on the fMRI package used in the analysis. In this pilot study, we performed a double analysis of an fMRI dataset using emotional faces. The Statistical Parametric Mapping (SPM) version 2.6 (Wellcome Department of Cognitive Neurology, London, UK) and the XBAM 3.4 (Brain Imaging Analysis Unit, Institute of Psychiatry, Kings College London, UK) programs, which use parametric and non-parametric analysis, respectively, were used to assess our results. Both packages revealed that processing of emotional faces was associated with an increased activation in the brain`s visual areas (occipital, fusiform and lingual gyri), in the cerebellum, in the parietal cortex, in the cingulate cortex (anterior and posterior cingulate), and in the dorsolateral and ventrolateral prefrontal cortex. However, blood oxygenation level-dependent (BOLD) response in the temporal regions, insula and putamen was evident in the XBAM analysis but not in the SPM analysis. Overall, SPM and XBAM analyses revealed comparable whole-group brain responses. Further Studies are needed to explore the between-group compatibility of the different imaging packages in other cognitive and emotional processing domains. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An efficient representation method for arbitrarily shaped image segments is proposed. This method includes a smart way to select wavelet basis to approximate the given image segment, with improved image quality and reduced computational load.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The standard data fusion methods may not be satisfactory to merge a high-resolution panchromatic image and a low-resolution multispectral image because they can distort the spectral characteristics of the multispectral data. The authors developed a technique, based on multiresolution wavelet decomposition, for the merging and data fusion of such images. The method presented consists of adding the wavelet coefficients of the high-resolution image to the multispectral (low-resolution) data. They have studied several possibilities concluding that the method which produces the best results consists in adding the high order coefficients of the wavelet transform of the panchromatic image to the intensity component (defined as L=(R+G+B)/3) of the multispectral image. The method is, thus, an improvement on standard intensity-hue-saturation (IHS or LHS) mergers. They used the ¿a trous¿ algorithm which allows the use of a dyadic wavelet to merge nondyadic data in a simple and efficient scheme. They used the method to merge SPOT and LANDSATTM images. The technique presented is clearly better than the IHS and LHS mergers in preserving both spectral and spatial information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present an efficient numerical scheme for the recently introduced geodesic active fields (GAF) framework for geometric image registration. This framework considers the registration task as a weighted minimal surface problem. Hence, the data-term and the regularization-term are combined through multiplication in a single, parametrization invariant and geometric cost functional. The multiplicative coupling provides an intrinsic, spatially varying and data-dependent tuning of the regularization strength, and the parametrization invariance allows working with images of nonflat geometry, generally defined on any smoothly parametrizable manifold. The resulting energy-minimizing flow, however, has poor numerical properties. Here, we provide an efficient numerical scheme that uses a splitting approach; data and regularity terms are optimized over two distinct deformation fields that are constrained to be equal via an augmented Lagrangian approach. Our approach is more flexible than standard Gaussian regularization, since one can interpolate freely between isotropic Gaussian and anisotropic TV-like smoothing. In this paper, we compare the geodesic active fields method with the popular Demons method and three more recent state-of-the-art algorithms: NL-optical flow, MRF image registration, and landmark-enhanced large displacement optical flow. Thus, we can show the advantages of the proposed FastGAF method. It compares favorably against Demons, both in terms of registration speed and quality. Over the range of example applications, it also consistently produces results not far from more dedicated state-of-the-art methods, illustrating the flexibility of the proposed framework.