799 resultados para Content Based Image Retrieval (CBIR)


Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article presents a probabilistic method for vehicle detection and tracking through the analysis of monocular images obtained from a vehicle-mounted camera. The method is designed to address the main shortcomings of traditional particle filtering approaches, namely Bayesian methods based on importance sampling, for use in traffic environments. These methods do not scale well when the dimensionality of the feature space grows, which creates significant limitations when tracking multiple objects. Alternatively, the proposed method is based on a Markov chain Monte Carlo (MCMC) approach, which allows efficient sampling of the feature space. The method involves important contributions in both the motion and the observation models of the tracker. Indeed, as opposed to particle filter-based tracking methods in the literature, which typically resort to observation models based on appearance or template matching, in this study a likelihood model that combines appearance analysis with information from motion parallax is introduced. Regarding the motion model, a new interaction treatment is defined based on Markov random fields (MRF) that allows for the handling of possible inter-dependencies in vehicle trajectories. As for vehicle detection, the method relies on a supervised classification stage using support vector machines (SVM). The contribution in this field is twofold. First, a new descriptor based on the analysis of gradient orientations in concentric rectangles is dened. This descriptor involves a much smaller feature space compared to traditional descriptors, which are too costly for real-time applications. Second, a new vehicle image database is generated to train the SVM and made public. The proposed vehicle detection and tracking method is proven to outperform existing methods and to successfully handle challenging situations in the test sequences.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis deals with the problem of efficiently tracking 3D objects in sequences of images. We tackle the efficient 3D tracking problem by using direct image registration. This problem is posed as an iterative optimization procedure that minimizes a brightness error norm. We review the most popular iterative methods for image registration in the literature, turning our attention to those algorithms that use efficient optimization techniques. Two forms of efficient registration algorithms are investigated. The first type comprises the additive registration algorithms: these algorithms incrementally compute the motion parameters by linearly approximating the brightness error function. We centre our attention on Hager and Belhumeur’s factorization-based algorithm for image registration. We propose a fundamental requirement that factorization-based algorithms must satisfy to guarantee good convergence, and introduce a systematic procedure that automatically computes the factorization. Finally, we also bring out two warp functions to register rigid and nonrigid 3D targets that satisfy the requirement. The second type comprises the compositional registration algorithms, where the brightness function error is written by using function composition. We study the current approaches to compositional image alignment, and we emphasize the importance of the Inverse Compositional method, which is known to be the most efficient image registration algorithm. We introduce a new algorithm, the Efficient Forward Compositional image registration: this algorithm avoids the necessity of inverting the warping function, and provides a new interpretation of the working mechanisms of the inverse compositional alignment. By using this information, we propose two fundamental requirements that guarantee the convergence of compositional image registration methods. Finally, we support our claims by using extensive experimental testing with synthetic and real-world data. We propose a distinction between image registration and tracking when using efficient algorithms. We show that, depending whether the fundamental requirements are hold, some efficient algorithms are eligible for image registration but not for tracking.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A generic bio-inspired adaptive architecture for image compression suitable to be implemented in embedded systems is presented. The architecture allows the system to be tuned during its calibration phase. An evolutionary algorithm is responsible of making the system evolve towards the required performance. A prototype has been implemented in a Xilinx Virtex-5 FPGA featuring an adaptive wavelet transform core directed at improving image compression for specific types of images. An Evolution Strategy has been chosen as the search algorithm and its typical genetic operators adapted to allow for a hardware friendly implementation. HW/SW partitioning issues are also considered after a high level description of the algorithm is profiled which validates the proposed resource allocation in the device fabric. To check the robustness of the system and its adaptation capabilities, different types of images have been selected as validation patterns. A direct application of such a system is its deployment in an unknown environment during design time, letting the calibration phase adjust the system parameters so that it performs efcient image compression. Also, this prototype implementation may serve as an accelerator for the automatic design of evolved transform coefficients which are later on synthesized and implemented in a non-adaptive system in the final implementation device, whether it is a HW or SW based computing device. The architecture has been built in a modular way so that it can be easily extended to adapt other types of image processing cores. Details on this pluggable component point of view are also given in the paper.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches make use of features to estimate motion. Conversely, the strategy we propose is based on a MR (Multi-Resolution) implementation of an image registration technique (Inverse Compositional Image Alignment ICIA) using direct methods. An on-board camera in a downwards-looking configuration, and the assumption of planar scenes, are the bases of the algorithm. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around the 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the previous known estimation of the vehicles' state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimations are based only on the vision-based motion estimations. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, being two of those stages considered critical: take-off and landing. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimations. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, that demonstrate that the visual estimation can be used to provide a good approximation of the vehicle's state when it is required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimation of the vehicle's state for more than one minute, at real-time frame rates based, only on visual information.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions, and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters, and assign the pixels to these clusters using the pheromone gradient. We applied ASCA to detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters needs not to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper describes the participation of DAEDALUS at ImageCLEF 2011 Medical Retrieval task. We have focused on multimodal (or mixed) experiments that combine textual and visual retrieval. The main objective of our research has been to evaluate the effect on the medical retrieval process of the existence of an extended corpus that is annotated with the image type, associated to both the image itself and also to its textual description. For this purpose, an image classifier has been developed to tag each document with its class (1st level of the hierarchy: Radiology, Microscopy, Photograph, Graphic, Other) and subclass (2nd level: AN, CT, MR, etc.). For the textual-based experiments, several runs using different semantic expansion techniques have been performed. For the visual-based retrieval, different runs are defined by the corpus used in the retrieval process and the strategy for obtaining the class and/or subclass. The best results are achieved in runs that make use of the image subclass based on the classification of the sample images. Although different multimodal strategies have been submitted, none of them has shown to be able to provide results that are at least comparable to the ones achieved by the textual retrieval alone. We believe that we have been unable to find a metric for the assessment of the relevance of the results provided by the visual and textual processes

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Determination of the soil coverage by crop residues after ploughing is a fundamental element of Conservation Agriculture. This paper presents the application of genetic algorithms employed during the fine tuning of the segmentation process of a digital image with the aim of automatically quantifying the residue coverage. In other words, the objective is to achieve a segmentation that would permit the discrimination of the texture of the residue so that the output of the segmentation process is a binary image in which residue zones are isolated from the rest. The RGB images used come from a sample of images in which sections of terrain were photographed with a conventional camera positioned in zenith orientation atop a tripod. The images were taken outdoors under uncontrolled lighting conditions. Up to 92% similarity was achieved between the images obtained by the segmentation process proposed in this paper and the templates made by an elaborate manual tracing process. In addition to the proposed segmentation procedure and the fine tuning procedure that was developed, a global quantification of the soil coverage by residues for the sampled area was achieved that differed by only 0.85% from the quantification obtained using template images. Moreover, the proposed method does not depend on the type of residue present in the image. The study was conducted at the experimental farm “El Encín” in Alcalá de Henares (Madrid, Spain).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

These slides present several 3-D reconstruction methods to obtain the geometric structure of a scene that is viewed by multiple cameras. We focus on the combination of the geometric modeling in the image formation process with the use of standard optimization tools to estimate the characteristic parameters that describe the geometry of the 3-D scene. In particular, linear, non-linear and robust methods to estimate the monocular and epipolar geometry are introduced as cornerstones to generate 3-D reconstructions with multiple cameras. Some examples of systems that use this constructive strategy are Bundler, PhotoSynth, VideoSurfing, etc., which are able to obtain 3-D reconstructions with several hundreds or thousands of cameras. En esta presentación se tratan varios métodos de reconstrucción 3-D para la obtención de la estructura geométrica de una escena que es visualizada por varias cámaras. Se enfatiza la combinación de modelado geométrico del proceso de formación de la imagen con el uso de herramientas estándar de optimización para estimar los parámetros característicos que describen la geometría de la escena 3-D. En concreto, se presentan métodos de estimación lineales, no lineales y robustos de las geometrías monocular y epipolar como punto de partida para generar reconstrucciones con tres o más cámaras. Algunos ejemplos de sistemas que utilizan este enfoque constructivo son Bundler, PhotoSynth, VideoSurfing, etc., los cuales, en la práctica pueden llegar a reconstruir una escena con varios cientos o miles de cámaras.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Stereo video techniques are effective for estimating the space–time wave dynamics over an area of the ocean. Indeed, a stereo camera view allows retrieval of both spatial and temporal data whose statistical content is richer than that of time series data retrieved from point wave probes. We present an application of the Wave Acquisition Stereo System (WASS) for the analysis of offshore video measurements of gravity waves in the Northern Adriatic Sea and near the southern seashore of the Crimean peninsula, in the Black Sea. We use classical epipolar techniques to reconstruct the sea surface from the stereo pairs sequentially in time, viz. a sequence of spatial snapshots. We also present a variational approach that exploits the entire data image set providing a global space–time imaging of the sea surface, viz. simultaneous reconstruction of several spatial snapshots of the surface in order to guarantee continuity of the sea surface both in space and time. Analysis of the WASS measurements show that the sea surface can be accurately estimated in space and time together, yielding associated directional spectra and wave statistics at a point in time that agrees well with probabilistic models. In particular, WASS stereo imaging is able to capture typical features of the wave surface, especially the crest-to-trough asymmetry due to second order nonlinearities, and the observed shape of large waves are fairly described by theoretical models based on the theory of quasi-determinism (Boccotti, 2000). Further, we investigate space–time extremes of the observed stationary sea states, viz. the largest surface wave heights expected over a given area during the sea state duration. The WASS analysis provides the first experimental proof that a space–time extreme is generally larger than that observed in time via point measurements, in agreement with the predictions based on stochastic theories for global maxima of Gaussian fields.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

ATM, SDH or satellite have been used in the last century as the contribution network of Broadcasters. However the attractive price of IP networks is changing the infrastructure of these networks in the last decade. Nowadays, IP networks are widely used, but their characteristics do not offer the level of performance required to carry high quality video under certain circumstances. Data transmission is always subject to errors on line. In the case of streaming, correction is attempted at destination, while on transfer of files, retransmissions of information are conducted and a reliable copy of the file is obtained. In the latter case, reception time is penalized because of the low priority this type of traffic on the networks usually has. While in streaming, image quality is adapted to line speed, and line errors result in a decrease of quality at destination, in the file copy the difference between coding speed vs line speed and errors in transmission are reflected in an increase of transmission time. The way news or audiovisual programs are transferred from a remote office to the production centre depends on the time window and the type of line available; in many cases, it must be done in real time (streaming), with the resulting image degradation. The main purpose of this work is the workflow optimization and the image quality maximization, for that reason a transmission model for multimedia files adapted to JPEG2000, is described based on the combination of advantages of file transmission and those of streaming transmission, putting aside the disadvantages that these models have. The method is based on two patents and consists of the safe transfer of the headers and data considered to be vital for reproduction. Aside, the rest of the data is sent by streaming, being able to carry out recuperation operations and error concealment. Using this model, image quality is maximized according to the time window. In this paper, we will first give a briefest overview of the broadcasters requirements and the solutions with IP networks. We will then focus on a different solution for video file transfer. We will take the example of a broadcast center with mobile units (unidirectional video link) and regional headends (bidirectional link), and we will also present a video file transfer file method that satisfies the broadcaster requirements.