55 results for Image Based Visual Servoing
Abstract:
Classification of a high-resolution "Quickbird" image using object-based image analysis
Abstract:
Multi-view microscopy techniques such as Light-Sheet Fluorescence Microscopy (LSFM) are powerful tools for 3D + time studies of live embryos in developmental biology. The sample is imaged from several points of view, acquiring a set of 3D views that are then combined, or fused, to overcome their individual limitations. View fusion is still an open problem despite recent contributions in the field. We developed a wavelet-based multi-view fusion method that, owing to the properties of wavelet decomposition, is able to combine the complementary directional information from all available views into a single volume. Our method is demonstrated on LSFM acquisitions from live sea urchin and zebrafish embryos. The fusion results show improved overall contrast and detail compared with any of the acquired volumes. The proposed method does not require knowledge of the system's point spread function (PSF) and performs better than other existing PSF-independent fusion methods.
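As a rough illustration of wavelet-domain fusion, the sketch below fuses 1-D signals with a one-level Haar transform. The abstract does not state the fusion rule or the wavelet used, so the choices here are assumptions made only for illustration: low-frequency coefficients are averaged across views, and for each detail coefficient the value with the largest magnitude is kept. All function names are hypothetical, and the real method works on 3D volumes.

```python
def haar_forward(signal):
    """One-level Haar transform of an even-length list: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward: rebuild the signal from approximation and detail."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def fuse_views(views):
    """Fuse equally sized, even-length 1-D views in the wavelet domain.

    Assumed rule (not taken from the abstract): average the approximations,
    keep the detail coefficient with the largest absolute value.
    """
    transformed = [haar_forward(v) for v in views]
    n = len(transformed[0][0])
    approx = [sum(t[0][i] for t in transformed) / len(views) for i in range(n)]
    detail = [max((t[1][i] for t in transformed), key=abs) for i in range(n)]
    return haar_inverse(approx, detail)
```

Fusing a view with itself returns the view unchanged, which is a quick sanity check that the transform pair is consistent.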
Abstract:
This paper addresses initial efforts to develop a navigation system for ground vehicles supported by visual feedback from a mini aerial vehicle. A vision-based algorithm computes the ground vehicle pose in the world frame, as well as possible obstacles in the ground vehicle's path. Relying on that information, a navigation and obstacle avoidance system re-plans the ground vehicle trajectory, ensuring an optimal detour. Finally, some experiments are presented employing an unmanned ground vehicle (UGV) and a low-cost mini unmanned aerial vehicle (UAV).
Abstract:
ATM, SDH, and satellite links served as broadcasters' contribution networks during the last century. However, over the last decade the attractive price of IP networks has been changing this infrastructure. IP networks are now widely used, but under certain circumstances their characteristics do not offer the level of performance required to carry high-quality video. Data transmission is always subject to line errors. In streaming, correction is attempted at the destination, whereas in file transfer, information is retransmitted until a reliable copy of the file is obtained. In the latter case, reception time is penalized because this type of traffic usually has low priority on the network. In streaming, image quality is adapted to line speed, and line errors reduce quality at the destination; in file copying, the difference between coding speed and line speed, together with transmission errors, translates into longer transmission times. How news or audiovisual programs are transferred from a remote office to the production centre depends on the time window and the type of line available; in many cases the transfer must be done in real time (streaming), with the resulting image degradation. The main purpose of this work is to optimize the workflow and maximize image quality. To that end, it describes a transmission model for multimedia files adapted to JPEG2000 that combines the advantages of file transfer and streaming while avoiding their disadvantages. The method is based on two patents and consists of the reliable transfer of the headers and of the data considered vital for reproduction. The rest of the data is sent by streaming, which allows recovery operations and error concealment. With this model, image quality is maximized for the available time window.
In this paper, we first give a brief overview of broadcasters' requirements and of the solutions available with IP networks. We then focus on a different solution for video file transfer. Taking the example of a broadcast center with mobile units (unidirectional video link) and regional headends (bidirectional link), we present a video file transfer method that satisfies the broadcasters' requirements.
Abstract:
This paper proposes a new method, oriented to real-time image processing, for identifying crop rows in images of maize fields. The vision system is designed to be installed onboard a mobile agricultural vehicle, and is therefore subjected to turns, vibrations, and undesired movements. The images are captured under perspective projection and are affected by these undesired effects. The image processing consists of two main processes: image segmentation and crop row detection. The first applies a threshold to separate green plants or pixels (crops and weeds) from the rest (soil, stones, and others). It is based on a fuzzy clustering process, which provides the threshold to be applied during normal operation. Crop row detection applies a method based on image perspective projection that searches for the maximum accumulation of segmented green pixels along straight alignments. These alignments determine the expected crop lines in the images. The method is robust enough to work under the above-mentioned undesired effects. It compares favorably against the well-tested Hough transform for line detection.
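The two stages described above (threshold-based green segmentation, then a search for the straight line that accumulates the most green pixels) can be sketched as follows. This is a minimal illustration, not the authors' method: the excess-green index 2G - R - B, the fixed threshold (standing in for the fuzzy-clustering step), and the exhaustive search over lines joining the top and bottom image rows are all assumptions, and the function names are hypothetical.

```python
def segment_green(image, threshold):
    """Binary mask: 1 where the excess-green index 2G - R - B exceeds threshold.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[1 if 2 * g - r - b > threshold else 0 for (r, g, b) in row]
            for row in image]


def line_score(mask, x_top, x_bottom):
    """Count green pixels along the line from (x_top, first row) to (x_bottom, last row)."""
    height = len(mask)
    width = len(mask[0])
    score = 0
    for y in range(height):
        t = y / (height - 1) if height > 1 else 0.0
        x = round(x_top + t * (x_bottom - x_top))
        if 0 <= x < width:
            score += mask[y][x]
    return score


def best_row(mask):
    """Return the (x_top, x_bottom) pair whose line accumulates the most green pixels."""
    width = len(mask[0])
    candidates = [(xt, xb) for xt in range(width) for xb in range(width)]
    return max(candidates, key=lambda c: line_score(mask, *c))
```

Parameterizing each candidate by its end points on the top and bottom rows lets slanted lines, as produced by perspective, compete with vertical ones in the same search.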
Abstract:
Sensing systems in living bodies offer a large variety of configurations and philosophies that can be emulated in artificial sensing systems. Motion detection is one area where different animals adopt different solutions, and in most cases these solutions are very sophisticated. One of them, the mammalian visual system, presents several advantages over artificial ones. The main objective of this paper is to present a system, based on this biological structure, that is able to detect motion, its direction, and its characteristics. The configuration adopted follows the internal structure of the mammalian retina, where just five types of cells arranged in five layers are able to differentiate a large number of characteristics of the image impinging on it. Its main advantage is that the detection of these properties is based purely on hardware. A simple unit, based on an optical logic cell previously employed in optical computing, is the basis for emulating the different behaviors of the biological neurons. No software is present, so no external interference can affect the final behavior. Once the internal configuration is implemented, this type of structure is able to work without any further attention. The architecture presented offers several possibilities: detection of motion, of its direction, and of its intensity. Moreover, some other characteristics, such as symmetry, may be obtained.
Abstract:
The use of new technologies in neurorehabilitation has led to higher-intensity rehabilitation processes, extending therapies in an economically sustainable way. Interactive Video (IV) technology allows therapists to work with virtual environments that reproduce real situations. In this way, patients deal with Activities of Daily Living (ADL) immersed within enhanced environments [1]. These rehabilitation exercises, which focus on relearning lost functions, try to modulate the neural plasticity processes [2]. This research presents a system in which an IV-based neurorehabilitation environment has been integrated with an eye-tracker device in order to monitor visual attention and to interact through it. While patients are interacting with the neurorehabilitation environment, their visual behavior is closely related to their cognitive state, which in turn mirrors the brain damage they have suffered [3] [4]. Patients' gaze data can provide knowledge about their attention focus and their cognitive state, as well as about the validity of the rehabilitation tasks proposed [5].
Abstract:
Optical signal processing in any living being is more complex than that achieved in artificial systems. Cortex architecture, although only partly known, offers some useful ideas that can be employed in communications. The objective of this paper is to analyze some of these structures. One of the main possibilities reported is handling signals in parallel. As shown, each signal impinging on a single input may be routed to a different output according to its characteristics. At the same time, identical signals arriving at different inputs may be routed to the same output without internal conflicts. This is possible because some of their characteristics change as they pass through the intermediate levels. The simulation of this architecture is based on simple logic cells. The basis for the proposed architecture is the five layers of the mammalian retina and the first levels of the visual cortex.
Abstract:
Background: Gray-scale images make up the bulk of data in biomedical image analysis, and hence the main focus of many image processing tasks lies in the processing of these monochrome images. With ever-improving acquisition devices, spatial and temporal image resolution increases, and data sets become very large. Various image processing frameworks exist that make the development of new algorithms easy by using high-level programming languages or visual programming. These frameworks are also accessible to researchers who have little or no background in software development, because they take care of otherwise complex tasks. Specifically, the management of working memory is handled automatically, usually at the price of requiring more of it. As a result, processing large data sets with these tools becomes increasingly difficult on workstation-class computers. One alternative to these high-level processing tools is to develop new algorithms in a language like C++, which gives the developer full control over how memory is handled; however, the resulting workflow for prototyping new algorithms is rather time-intensive and is also not appropriate for a researcher with little or no knowledge of software development. Another alternative is to use command-line tools that run image processing tasks, use the hard disk to store intermediate results, and provide automation through shell scripts. Although not as convenient as, e.g., visual programming, this approach is still accessible to researchers without a background in computer science. However, only a few tools exist that provide this kind of processing interface; they are usually quite task-specific and do not provide a clear approach when one wants to shape a new command-line tool from a prototype shell script.
Results: The proposed framework, MIA, provides a combination of command-line tools, plug-ins, and libraries that make it possible to run image processing tasks interactively in a command shell and to prototype using the corresponding shell scripting language. Since the hard disk serves as temporary storage, memory management is usually a non-issue in the prototyping phase. By using string-based descriptions for filters, optimizers, and the like, the transition from shell scripts to full-fledged programs implemented in C++ is also made easy. In addition, its design, based on atomic plug-ins and single-task command-line tools, makes it easy to extend MIA, usually without the need to touch or recompile existing code.
Conclusion: In this article, we describe the general design of MIA, a general-purpose framework for gray-scale image processing. We demonstrate the applicability of the software with example applications from three different research scenarios, namely motion compensation in myocardial perfusion imaging, the processing of high-resolution image data that arises in virtual anthropology, and retrospective analysis of treatment outcome in orthognathic surgery. With MIA, prototyping algorithms with shell scripts that combine small, single-task command-line tools is a viable alternative to the use of high-level languages, an approach that is especially useful when large data sets need to be processed.
Abstract:
The emergence of cloud datacenters enhances the capability of online data storage. Since massive amounts of data are stored in datacenters, it is necessary to effectively locate and access data of interest in such a distributed system. However, traditional search techniques only allow users to search images by exact-match keywords through a centralized index. These techniques cannot satisfy the requirements of content-based image retrieval (CBIR). In this paper, we propose a scalable image retrieval framework that can efficiently support content similarity search and semantic search in a distributed environment. Its key idea is to integrate image feature vectors into distributed hash tables (DHTs) by exploiting the properties of locality-sensitive hashing (LSH). Thus, images with similar content are most likely gathered into the same node without any global information. For searching semantically close images, relevance feedback is adopted in our system to bridge the gap between low-level features and high-level semantics. We show that our approach yields a high recall rate with good load balance and requires only a small number of hops.
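The idea of using LSH to place similar feature vectors on the same DHT node can be sketched as below. This is an illustration under stated assumptions, not the paper's scheme: it uses random-hyperplane (sign) hashing as the LSH family and a plain modulo to map the resulting key to a node, whereas a real DHT would map keys onto its own identifier space. The names `make_hyperplanes`, `lsh_key`, and `node_for` are hypothetical.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes (normal vectors) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def lsh_key(vector, hyperplanes):
    """Build an integer key with one bit per hyperplane: 1 if the vector lies on its positive side."""
    key = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def node_for(vector, hyperplanes, num_nodes):
    """Assumed placement rule: map the LSH key to a node index by modulo."""
    return lsh_key(vector, hyperplanes) % num_nodes
```

Because each bit depends only on the side of a hyperplane, vectors that point in nearly the same direction tend to share most or all bits and therefore tend to land on the same node, with no global index involved.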
Abstract:
Evolvable Hardware (EH) is a technique that consists of using reconfigurable hardware devices whose configuration is controlled by an Evolutionary Algorithm (EA). Our system consists of a scalable EH platform implemented entirely on an FPGA, where the Reconfigurable processing Core (RC) can adaptively increase or decrease in size. Figure 1 shows the architecture of the proposed System-on-Programmable-Chip (SoPC), consisting of a MicroBlaze processor responsible for controlling the whole system operation, a Reconfiguration Engine (RE), and a Reconfigurable processing Core that is able to change its size in both height and width. This system is used to implement image filters, which are generated autonomously by the evolutionary process. The system is complemented with a camera that enables the use of the platform for real-time applications.
Abstract:
Traditionally, the use of data analysis techniques has been one of the main ways of discovering knowledge hidden in large amounts of data collected by experts in different domains. Visualization techniques have also been used to enhance and facilitate this process. However, there are serious limitations in the process of knowledge acquisition, as it is often slow, tedious, and in many cases fruitless, owing to the difficulty human beings have in understanding large datasets. Another major drawback, rarely considered by experts who analyze large datasets, is the involuntary degradation to which they subject the data during analysis tasks, before the final conclusions are reached. Degradation means that data can lose part of their original properties. It is usually caused by improper data reduction, which alters the original nature of the data and often leads to erroneous interpretations and conclusions that could have serious implications. This fact becomes critically important when the data belong to the medical or biological domain and people's lives depend on the final decision-making, which is sometimes conducted improperly. This is the motivation of this thesis, which proposes a new visual framework, called MedVir, that combines the power of advanced visualization techniques and data mining to try to solve these major problems in the process of discovering valid information. The main objective is to make the process of knowledge acquisition that experts face when working with large datasets in different domains easier, more understandable, more intuitive, and faster.
To achieve this, first, the size of the data is strongly reduced in order to make the data easier for the expert to manage, while preserving, as far as possible, their original properties. Then, effective visualization techniques are used to represent the resulting data, allowing the expert to interact easily and intuitively with the data, to carry out different data analysis tasks, and thus to stimulate their comprehension visually. The underlying objective is therefore to abstract the expert, as far as possible, from the complexity of the original data and to present them with a more understandable version, thus facilitating and accelerating the task of knowledge discovery. MedVir has been successfully applied to, among others, the field of magnetoencephalography (MEG), in a task consisting of predicting the outcome of Traumatic Brain Injury (TBI) rehabilitation. The results obtained demonstrate the effectiveness of the framework in accelerating and facilitating the process of knowledge discovery on real-world datasets.
Abstract:
One of the main challenges for intelligent vehicles is the capability of detecting other vehicles in their environment, which constitute the main source of accidents. Specifically, many methods have been proposed in the literature for video-based vehicle detection. Most of them perform supervised classification using some appearance-related feature; in particular, symmetry has been extensively utilized. However, an in-depth analysis of the classification power of this feature is missing. As a first contribution of this paper, a thorough study of the classification performance of symmetry is presented within a Bayesian decision framework. This study reveals that the performance of symmetry-based classification is very limited. Therefore, as a second contribution, a new gradient-based descriptor is proposed for vehicle detection. This descriptor exploits the known rectangular structure of vehicle rears within a Histogram of Oriented Gradients (HOG)-based framework. Experiments show that the proposed descriptor largely outperforms symmetry as a feature for vehicle verification, achieving classification rates over 90%.
Abstract:
We present an adaptive unequal error protection (UEP) strategy built on the 1-D interleaved parity Application Layer Forward Error Correction (AL-FEC) code for protecting the transmission of stereoscopic 3D video content encoded with Multiview Video Coding (MVC) through IP-based networks. Our scheme targets the minimization of quality degradation produced by packet losses during video transmission in time-sensitive application scenarios. To that end, based on a novel packet-level distortion model, it selects in real time the most suitable packets within each Group of Pictures (GOP) to be protected and the most convenient FEC technique parameters, i.e., the size of the FEC generator matrix. In order to make these decisions, it considers the relevance of the packet, the behavior of the channel, and the available bitrate for protection purposes. Simulation results validate both the distortion model introduced to estimate the importance of packets and the optimization of the FEC technique parameter values.
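To make the underlying code concrete, the sketch below shows a 1-D interleaved parity scheme in its simplest form: packets are assigned to columns in round-robin order, and one XOR parity packet protects each column, so a burst of consecutive losses is spread over different columns. This is a generic illustration only. It assumes equal-length packets, it does not implement the adaptive packet selection or the distortion model of the abstract, and the function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets, columns):
    """Return one parity packet per column; column c protects packets c, c + columns, ..."""
    size = len(packets[0])
    parity = []
    for c in range(columns):
        acc = bytes(size)  # all-zero packet, the XOR identity
        for packet in packets[c::columns]:
            acc = xor_bytes(acc, packet)
        parity.append(acc)
    return parity


def recover(received, parity, columns):
    """Repair every column that has exactly one lost packet (marked as None)."""
    repaired = list(received)
    for c in range(columns):
        indices = list(range(c, len(received), columns))
        lost = [i for i in indices if received[i] is None]
        if len(lost) == 1:
            acc = parity[c]
            for i in indices:
                if received[i] is not None:
                    acc = xor_bytes(acc, received[i])
            repaired[lost[0]] = acc
    return repaired
```

With three columns, losing two consecutive packets is repairable because they fall in different columns, while losing two packets of the same column is not; this trade-off is what the number of columns controls.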
Abstract:
The subject of this thesis is heritage valuation. It argues that heritage is a cultural process that is inherited, transmitted, and transformed by individuals who are interested in negotiating, creating, and recreating memories, values, and cultural meanings. The view of heritage as a process is currently being consolidated in the research literature, although the idea that heritage is a 'thing' remains dominant in the international debate and is supported by the policies and practices of UNESCO. Seeing heritage as a process enables a critical view that underscores significance. That is, it addresses what is involved in defining something as 'heritage', or in gradually turning it into heritage. This view of the concept makes it possible to understand not only what has been valued, but also what has been forgotten and why. The main objective of this research is to explore the characteristics of a visual reasoning process in order to apply it to heritage valuation. The process presented involves the creation of visual representations and their relationships; its goal is not to produce an environment that is indistinguishable from physical reality. Instead, the process aims to make it possible to communicate the 'polyhedral' dimension of heritage. For this new process to be viable and sustainable, it is necessary to consider what is to be achieved: heritage valuation. It is important to note that it is a process in which the dynamics of learning, behavior, and exploration of heritage are directly related to its valuation. Therefore, we need to know how this valuation takes place in order to be able to develop a process that is adapted to these dynamics.
The hypothesis of this thesis is that a visual reasoning process for heritage valuation allows the people involved to initiate an interaction with a heritage element and its mental image in order to reach certain conclusions regarding its value and meaning. The thesis describes the methodology that gives rise to the visual reasoning process for heritage valuation, which has been built on descriptive process modeling and characterizes three levels: meta, analysis, and operational. The agents, along with heritage, are the protagonists of the process. The proposed approach is not only about heritage but about the complex relationship between people and heritage. Human agents give value to the testimonies of past life and imbue them with meaning. Therefore, this visual reasoning approach serves to detect changes in the value of heritage, as well as its polyhedral dimension in spatial and temporal terms. A new typology of heritage, required to support a visual reasoning process for heritage valuation, has also been proposed. This typology is based on the usability of heritage and covers the following types: accessible, captive, contextualized, decontextualized, original, and vicarious. The development of a visual reasoning process for heritage is innovative because it integrates the process for its valuation, considering the polyhedral dimension of heritage and exploiting the potential of visual reasoning. In addition, potential users of the proposed process will interact directly with heritage and indirectly with the information about it, such as the metadata. Therefore, the proposed process enables potential users to be actively involved in heritage valuation themselves.