850 resultados para Text-Based Image Retrieval


Relevância:

40.00% 40.00%

Publicador:

Resumo:

These slides present several 3-D reconstruction methods to obtain the geometric structure of a scene that is viewed by multiple cameras. We focus on the combination of the geometric modeling in the image formation process with the use of standard optimization tools to estimate the characteristic parameters that describe the geometry of the 3-D scene. In particular, linear, non-linear and robust methods to estimate the monocular and epipolar geometry are introduced as cornerstones to generate 3-D reconstructions with multiple cameras. Some examples of systems that use this constructive strategy are Bundler, PhotoSynth, VideoSurfing, etc., which are able to obtain 3-D reconstructions with several hundreds or thousands of cameras. En esta presentación se tratan varios métodos de reconstrucción 3-D para la obtención de la estructura geométrica de una escena que es visualizada por varias cámaras. Se enfatiza la combinación de modelado geométrico del proceso de formación de la imagen con el uso de herramientas estándar de optimización para estimar los parámetros característicos que describen la geometría de la escena 3-D. En concreto, se presentan métodos de estimación lineales, no lineales y robustos de las geometrías monocular y epipolar como punto de partida para generar reconstrucciones con tres o más cámaras. Algunos ejemplos de sistemas que utilizan este enfoque constructivo son Bundler, PhotoSynth, VideoSurfing, etc., los cuales, en la práctica pueden llegar a reconstruir una escena con varios cientos o miles de cámaras.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

ATM, SDH or satellite have been used in the last century as the contribution network of Broadcasters. However the attractive price of IP networks is changing the infrastructure of these networks in the last decade. Nowadays, IP networks are widely used, but their characteristics do not offer the level of performance required to carry high quality video under certain circumstances. Data transmission is always subject to errors on line. In the case of streaming, correction is attempted at destination, while on transfer of files, retransmissions of information are conducted and a reliable copy of the file is obtained. In the latter case, reception time is penalized because of the low priority this type of traffic on the networks usually has. While in streaming, image quality is adapted to line speed, and line errors result in a decrease of quality at destination, in the file copy the difference between coding speed vs line speed and errors in transmission are reflected in an increase of transmission time. The way news or audiovisual programs are transferred from a remote office to the production centre depends on the time window and the type of line available; in many cases, it must be done in real time (streaming), with the resulting image degradation. The main purpose of this work is the workflow optimization and the image quality maximization, for that reason a transmission model for multimedia files adapted to JPEG2000, is described based on the combination of advantages of file transmission and those of streaming transmission, putting aside the disadvantages that these models have. The method is based on two patents and consists of the safe transfer of the headers and data considered to be vital for reproduction. Aside, the rest of the data is sent by streaming, being able to carry out recuperation operations and error concealment. Using this model, image quality is maximized according to the time window. In this paper, we will first give a briefest overview of the broadcasters requirements and the solutions with IP networks. We will then focus on a different solution for video file transfer. We will take the example of a broadcast center with mobile units (unidirectional video link) and regional headends (bidirectional link), and we will also present a video file transfer file method that satisfies the broadcaster requirements.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we describe new results and improvements to a lan-guage identification (LID) system based on PPRLM previously introduced in [1] and [2]. In this case, we use as parallel phone recognizers the ones provided by the Brno University of Technology for Czech, Hungarian, and Russian lan-guages, and instead of using traditional n-gram language models we use a lan-guage model that is created using a ranking with the most frequent and discrim-inative n-grams. In this language model approach, the distance between the ranking for the input sentence and the ranking for each language is computed, based on the difference in relative positions for each n-gram. This approach is able to model reliably longer span information than in traditional language models obtaining more reliable estimations. We also describe the modifications that we have being introducing along the time to the original ranking technique, e.g., different discriminative formulas to establish the ranking, variations of the template size, the suppression of repeated consecutive phones, and a new clus-tering technique for the ranking scores. Results show that this technique pro-vides a 12.9% relative improvement over PPRLM. Finally, we also describe re-sults where the traditional PPRLM and our ranking technique are combined.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper describes a low complexity strategy for detecting and recognizing text signs automatically. Traditional approaches use large image algorithms for detecting the text sign, followed by the application of an Optical Character Recognition (OCR) algorithm in the previously identified areas. This paper proposes a new architecture that applies the OCR to a whole lightly treated image and then carries out the text detection process of the OCR output. The strategy presented in this paper significantly reduces the processing time required for text localization in an image, while guaranteeing a high recognition rate. This strategy will facilitate the incorporation of video processing-based applications into the automatic detection of text sign similar to that of a smartphone. These applications will increase the autonomy of visually impaired people in their daily life.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A first study in order to construct a simple model of the mammalian retina is reported. The basic elements for this model are Optical Programmable Logic Cells, OPLCs, previously employed as a functional element for Optical Computing. The same type of circuit simulates the five types of neurons present in the retina. Different responses are obtained by modifying either internal or external connections. Two types of behaviors are reported: symmetrical and non-symmetrical with respect to light position. Some other higher functions, as the possibility to differentiate between symmetric and non-symmetric light images, are performed by another simulation of the first layers of the visual cortex. The possibility to apply these models to image processing is reported.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A proposal for a model of the primary visual cortex is reported. It is structured with the basis of a simple unit cell able to perform fourteen pairs of different boolean functions corresponding to the two possible inputs. As a first step, a model of the retina is presented. Different types of responses, according to the different possibilities of interconnecting the building blocks, have been obtained. These responses constitute the basis for an initial configuration of the mammalian primary visual cortex. Some qualitative functions, as symmetry or size of an optical input, have been obtained. A proposal to extend this model to some higher functions, concludes the paper.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Evolvable Hardware (EH) is a technique that consists of using reconfigurable hardware devices whose configuration is controlled by an Evolutionary Algorithm (EA). Our system consists of a fully-FPGA implemented scalable EH platform, where the Reconfigurable processing Core (RC) can adaptively increase or decrease in size. Figure 1 shows the architecture of the proposed System-on-Programmable-Chip (SoPC), consisting of a MicroBlaze processor responsible of controlling the whole system operation, a Reconfiguration Engine (RE), and a Reconfigurable processing Core which is able to change its size in both height and width. This system is used to implement image filters, which are generated autonomously thanks to the evolutionary process. The system is complemented with a camera that enables the usage of the platform for real time applications.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a vision based autonomous landing control approach for unmanned aerial vehicles (UAV). The 3D position of an unmanned helicopter is estimated based on the homographies estimated of a known landmark. The translation and altitude estimation of the helicopter against the helipad position are the only information that is used to control the longitudinal, lateral and descend speeds of the vehicle. The control system approach consists in three Fuzzy controllers to manage the speeds of each 3D axis of the aircraft s coordinate system. The 3D position estimation was proven rst, comparing it with the GPS + IMU data with very good results. The robust of the vision algorithm against occlusions was also tested. The excellent behavior of the Fuzzy control approach using the 3D position estimation based in homographies was proved in an outdoors test using a real unmanned helicopter.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This work proposes an optimization of a semi-supervised Change Detection methodology based on a combination of Change Indices (CI) derived from an image multitemporal data set. For this purpose, SPOT 5 Panchromatic images with 2.5 m spatial resolution have been used, from which three Change Indices have been calculated. Two of them are usually known indices; however the third one has been derived considering the Kullbak-Leibler divergence. Then, these three indices have been combined forming a multiband image that has been used in as input for a Support Vector Machine (SVM) classifier where four different discriminant functions have been tested in order to differentiate between change and no_change categories. The performance of the suggested procedure has been assessed applying different quality measures, reaching in each case highly satisfactory values. These results have demonstrated that the simultaneous combination of basic change indices with others more sophisticated like the Kullback-Leibler distance, and the application of non-parametric discriminant functions like those employees in the SVM method, allows solving efficiently a change detection problem.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Current fusion devices consist of multiple diagnostics and hundreds or even thousands of signals. This situation forces on multiple occasions to use distributed data acquisition systems as the best approach. In this type of distributed systems, one of the most important issues is the synchronization between signals, so that it is possible to have a temporal correlation as accurate as possible between the acquired samples of all channels. In last decades, many fusion devices use different types of video cameras to provide inside views of the vessel during operations and to monitor plasma behavior. The synchronization between each video frame and the rest of the different signals acquired from any other diagnostics is essential in order to know correctly the plasma evolution, since it is possible to analyze jointly all the information having accurate knowledge of their temporal correlation. The developed system described in this paper allows timestamping image frames in a real-time acquisition and processing system using 1588 clock distribution. The system has been implemented using FPGA based devices together with a 1588 synchronized timing card (see Fig.1). The solution is based on a previous system [1] that allows image acquisition and real-time image processing based on PXIe technology. This architecture is fully compatible with the ITER Fast Controllers [2] and offers integration with EPICS to control and monitor the entire system. However, this set-up is not able to timestamp the frames acquired since the frame grabber module does not present any type of timing input (IRIG-B, GPS, PTP). To solve this lack, an IEEE1588 PXI timing device its used to provide an accurate way to synchronize distributed data acquisition systems using the Precision Time Protocol (PTP) IEEE 1588 2008 standard. This local timing device can be connected to a master clock device for global synchronization. The timing device has a buffer timestamp for each PXI trigger line and requires tha- a software application assigns each frame the corresponding timestamp. The previous action is critical and cannot be achieved if the frame rate is high. To solve this problem, it has been designed a solution that distributes the clock from the IEEE 1588 timing card to all FlexRIO devices [3]. This solution uses two PXI trigger lines that provide the capacity to assign timestamps to every frame acquired and register events by hardware in a deterministic way. The system provides a solution for timestamping frames to synchronize them with the rest of the different signals.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Video analytics play a critical role in most recent traffic monitoring and driver assistance systems. In this context, the correct detection and classification of surrounding vehicles through image analysis has been the focus of extensive research in the last years. Most of the pieces of work reported for image-based vehicle verification make use of supervised classification approaches and resort to techniques, such as histograms of oriented gradients (HOG), principal component analysis (PCA), and Gabor filters, among others. Unfortunately, existing approaches are lacking in two respects: first, comparison between methods using a common body of work has not been addressed; second, no study of the combination potentiality of popular features for vehicle classification has been reported. In this study the performance of the different techniques is first reviewed and compared using a common public database. Then, the combination capabilities of these techniques are explored and a methodology is presented for the fusion of classifiers built upon them, taking into account also the vehicle pose. The study unveils the limitations of single-feature based classification and makes clear that fusion of classifiers is highly beneficial for vehicle verification.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The conformational space annealing (CSA) method for global optimization has been applied to the 10-55 fragment of the B-domain of staphylococcal protein A (protein A) and to a 75-residue protein, apo calbindin D9K (PDB ID code 1CLB), by using the UNRES off-lattice united-residue force field. Although the potential was not calibrated with these two proteins, the native-like structures were found among the low-energy conformations, without the use of threading or secondary-structure predictions. This is because the CSA method can find many distinct families of low-energy conformations. Starting from random conformations, the CSA method found that there are two families of low-energy conformations for each of the two proteins, the native-like fold and its mirror image. The CSA method converged to the same low-energy folds in all cases studied, as opposed to other optimization methods. It appears that the CSA method with the UNRES force field, which is based on the thermodynamic hypothesis, can be used in prediction of protein structures in real time.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Remembering an event involves not only what happened, but also where and when it occurred. We measured regional cerebral blood flow by positron emission tomography during initial encoding and subsequent retrieval of item, location, and time information. Multivariate image analysis showed that left frontal brain regions were always activated during encoding, and right superior frontal regions were always activated at retrieval. Pairwise image subtraction analyses revealed information-specific activations at (i) encoding, item information in left hippocampal, location information in right parietal, and time information in left fusiform regions; and (ii) retrieval, item in right inferior frontal and temporal, location in left frontal, and time in anterior cingulate cortices. These results point to the existence of general encoding and retrieval networks of episodic memory whose operations are augmented by unique brain areas recruited for processing specific aspects of remembered events.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Theories of image segmentation suggest that the human visual system may use two distinct processes to segregate figure from background: a local process that uses local feature contrasts to mark borders of coherent regions and a global process that groups similar features over a larger spatial scale. We performed psychophysical experiments to determine whether and to what extent the global similarity process contributes to image segmentation by motion and color. Our results show that for color, as well as for motion, segmentation occurs first by an integrative process on a coarse spatial scale, demonstrating that for both modalities the global process is faster than one based on local feature contrasts. Segmentation by motion builds up over time, whereas segmentation by color does not, indicating a fundamental difference between the modalities. Our data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals. This global segmentation process occurs faster than the detection of absolute motion, providing further evidence for the existence of two motion processes with distinct dynamic properties.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a controlled image smoothing and enhancement method based on a curvature flow interpretation of the geometric heat equation. Compared to existing techniques, the model has several distinct advantages. (i) It contains just one enhancement parameter. (ii) The scheme naturally inherits a stopping criterion from the image; continued application of the scheme produces no further change. (iii) The method is one of the fastest possible schemes based on a curvature-controlled approach.