944 resultados para stereo vision,stereo matching,cuda,lisp,connection machine
Resumo:
La informática teórica es una disciplina básica ya que la mayoría de los avances en informática se sustentan en un sólido resultado de esa materia. En los últimos a~nos debido tanto al incremento de la potencia de los ordenadores, como a la cercanía del límite físico en la miniaturización de los componentes electrónicos, resurge el interés por modelos formales de computación alternativos a la arquitectura clásica de von Neumann. Muchos de estos modelos se inspiran en la forma en la que la naturaleza resuelve eficientemente problemas muy complejos. La mayoría son computacionalmente completos e intrínsecamente paralelos. Por este motivo se les está llegando a considerar como nuevos paradigmas de computación (computación natural). Se dispone, por tanto, de un abanico de arquitecturas abstractas tan potentes como los computadores convencionales y, a veces, más eficientes: alguna de ellas mejora el rendimiento, al menos temporal, de problemas NPcompletos proporcionando costes no exponenciales. La representación formal de las redes de procesadores evolutivos requiere de construcciones, tanto independientes, como dependientes del contexto, dicho de otro modo, en general una representación formal completa de un NEP implica restricciones, tanto sintácticas, como semánticas, es decir, que muchas representaciones aparentemente (sintácticamente) correctas de casos particulares de estos dispositivos no tendrían sentido porque podrían no cumplir otras restricciones semánticas. La aplicación de evolución gramatical semántica a los NEPs pasa por la elección de un subconjunto de ellos entre los que buscar los que solucionen un problema concreto. En este trabajo se ha realizado un estudio sobre un modelo inspirado en la biología celular denominado redes de procesadores evolutivos [55, 53], esto es, redes cuyos nodos son procesadores muy simples capaces de realizar únicamente un tipo de mutación puntual (inserción, borrado o sustitución de un símbolo). Estos nodos están asociados con un filtro que está definido por alguna condición de contexto aleatorio o de pertenencia. Las redes están formadas a lo sumo de seis nodos y, teniendo los filtros definidos por una pertenencia a lenguajes regulares, son capaces de generar todos los lenguajes enumerables recursivos independientemente del grafo subyacente. Este resultado no es sorprendente ya que semejantes resultados han sido documentados en la literatura. Si se consideran redes con nodos y filtros definidos por contextos aleatorios {que parecen estar más cerca a las implementaciones biológicas{ entonces se pueden generar lenguajes más complejos como los lenguajes no independientes del contexto. Sin embargo, estos mecanismos tan simples son capaces de resolver problemas complejos en tiempo polinomial. Se ha presentado una solución lineal para un problema NP-completo, el problema de los 3-colores. Como primer aporte significativo se ha propuesto una nueva dinámica de las redes de procesadores evolutivos con un comportamiento no determinista y masivamente paralelo [55], y por tanto todo el trabajo de investigación en el área de la redes de procesadores se puede trasladar a las redes masivamente paralelas. Por ejemplo, las redes masivamente paralelas se pueden modificar de acuerdo a determinadas reglas para mover los filtros hacia las conexiones. Cada conexión se ve como un canal bidireccional de manera que los filtros de entrada y salida coinciden. A pesar de esto, estas redes son computacionalmente completas. Se pueden también implementar otro tipo de reglas para extender este modelo computacional. Se reemplazan las mutaciones puntuales asociadas a cada nodo por la operación de splicing. Este nuevo tipo de procesador se denomina procesador splicing. Este modelo computacional de Red de procesadores con splicing ANSP es semejante en cierto modo a los sistemas distribuidos en tubos de ensayo basados en splicing. Además, se ha definido un nuevo modelo [56] {Redes de procesadores evolutivos con filtros en las conexiones{ , en el cual los procesadores tan solo tienen reglas y los filtros se han trasladado a las conexiones. Dicho modelo es equivalente, bajo determinadas circunstancias, a las redes de procesadores evolutivos clásicas. Sin dichas restricciones el modelo propuesto es un superconjunto de los NEPs clásicos. La principal ventaja de mover los filtros a las conexiones radica en la simplicidad de la modelización. Otras aportaciones de este trabajo ha sido el dise~no de un simulador en Java [54, 52] para las redes de procesadores evolutivos propuestas en esta Tesis. Sobre el término "procesador evolutivo" empleado en esta Tesis, el proceso computacional descrito aquí no es exactamente un proceso evolutivo en el sentido Darwiniano. Pero las operaciones de reescritura que se han considerado pueden interpretarse como mutaciones y los procesos de filtrado se podrían ver como procesos de selección. Además, este trabajo no abarca la posible implementación biológica de estas redes, a pesar de ser de gran importancia. A lo largo de esta tesis se ha tomado como definición de la medida de complejidad para los ANSP, una que denotaremos como tama~no (considerando tama~no como el número de nodos del grafo subyacente). Se ha mostrado que cualquier lenguaje enumerable recursivo L puede ser aceptado por un ANSP en el cual el número de procesadores está linealmente acotado por la cardinalidad del alfabeto de la cinta de una máquina de Turing que reconoce dicho lenguaje L. Siguiendo el concepto de ANSP universales introducido por Manea [65], se ha demostrado que un ANSP con una estructura de grafo fija puede aceptar cualquier lenguaje enumerable recursivo. Un ANSP se puede considerar como un ente capaz de resolver problemas, además de tener otra propiedad relevante desde el punto de vista práctico: Se puede definir un ANSP universal como una subred, donde solo una cantidad limitada de parámetros es dependiente del lenguaje. La anterior característica se puede interpretar como un método para resolver cualquier problema NP en tiempo polinomial empleando un ANSP de tama~no constante, concretamente treinta y uno. Esto significa que la solución de cualquier problema NP es uniforme en el sentido de que la red, exceptuando la subred universal, se puede ver como un programa; adaptándolo a la instancia del problema a resolver, se escogerín los filtros y las reglas que no pertenecen a la subred universal. Un problema interesante desde nuestro punto de vista es el que hace referencia a como elegir el tama~no optimo de esta red.---ABSTRACT---This thesis deals with the recent research works in the area of Natural Computing {bio-inspired models{, more precisely Networks of Evolutionary Processors first developed by Victor Mitrana and they are based on P Systems whose father is Georghe Paun. In these models, they are a set of processors connected in an underlying undirected graph, such processors have an object multiset (strings) and a set of rules, named evolution rules, that transform objects inside processors[55, 53],. These objects can be sent/received using graph connections provided they accomplish constraints defined at input and output filters processors have. This symbolic model, non deterministic one (processors are not synchronized) and massive parallel one[55] (all rules can be applied in one computational step) has some important properties regarding solution of NP-problems in lineal time and of course, lineal resources. There are a great number of variants such as hybrid networks, splicing processors, etc. that provide the model a computational power equivalent to Turing machines. The origin of networks of evolutionary processors (NEP for short) is a basic architecture for parallel and distributed symbolic processing, related to the Connection Machine as well as the Logic Flow paradigm, which consists of several processors, each of them being placed in a node of a virtual complete graph, which are able to handle data associated with the respective node. All the nodes send simultaneously their data and the receiving nodes handle also simultaneously all the arriving messages, according to some strategies. In a series of papers one considers that each node may be viewed as a cell having genetic information encoded in DNA sequences which may evolve by local evolutionary events, that is point mutations. Each node is specialized just for one of these evolutionary operations. Furthermore, the data in each node is organized in the form of multisets of words (each word appears in an arbitrarily large number of copies), and all the copies are processed in parallel such that all the possible events that can take place do actually take place. Obviously, the computational process just described is not exactly an evolutionary process in the Darwinian sense. But the rewriting operations we have considered might be interpreted as mutations and the filtering process might be viewed as a selection process. Recombination is missing but it was asserted that evolutionary and functional relationships between genes can be captured by taking only local mutations into consideration. It is clear that filters associated with each node allow a strong control of the computation. Indeed, every node has an input and output filter; two nodes can exchange data if it passes the output filter of the sender and the input filter of the receiver. Moreover, if some data is sent out by some node and not able to enter any node, then it is lost. In this paper we simplify the ANSP model considered in by moving the filters from the nodes to the edges. Each edge is viewed as a two-way channel such that the input and output filters coincide. Clearly, the possibility of controlling the computation in such networks seems to be diminished. For instance, there is no possibility to loose data during the communication steps. In spite of this and of the fact that splicing is not a powerful operation (remember that splicing systems generates only regular languages) we prove here that these devices are computationally complete. As a consequence, we propose characterizations of two complexity classes, namely NP and PSPACE, in terms of accepting networks of restricted splicing processors with filtered connections. We proposed a uniform linear time solution to SAT based on ANSPFCs with linearly bounded resources. This solution should be understood correctly: we do not solve SAT in linear time and space. Since any word and auxiliary word appears in an arbitrarily large number of copies, one can generate in linear time, by parallelism and communication, an exponential number of words each of them having an exponential number of copies. However, this does not seem to be a major drawback since by PCR (Polymerase Chain Reaction) one can generate an exponential number of identical DNA molecules in a linear number of reactions. It is worth mentioning that the ANSPFC constructed above remains unchanged for any instance with the same number of variables. Therefore, the solution is uniform in the sense that the network, excepting the input and output nodes, may be viewed as a program according to the number of variables, we choose the filters, the splicing words and the rules, then we assign all possible values to the variables, and compute the formula.We proved that ANSP are computationally complete. Do the ANSPFC remain still computationally complete? If this is not the case, what other problems can be eficiently solved by these ANSPFCs? Moreover, the complexity class NP is exactly the class of all languages decided by ANSP in polynomial time. Can NP be characterized in a similar way with ANSPFCs?
Resumo:
The purpose of this work is to demonstrate and to assess a simple algorithm for automatic estimation of the most salient region in an image, that have possible application in computer vision. The algorithm uses the connection between color dissimilarities in the image and the image’s most salient region. The algorithm also avoids using image priors. Pixel dissimilarity is an informal function of the distance of a specific pixel’s color to other pixels’ colors in an image. We examine the relation between pixel color dissimilarity and salient region detection on the MSRA1K image dataset. We propose a simple algorithm for salient region detection through random pixel color dissimilarity. We define dissimilarity by accumulating the distance between each pixel and a sample of n other random pixels, in the CIELAB color space. An important result is that random dissimilarity between each pixel and just another pixel (n = 1) is enough to create adequate saliency maps when combined with median filter, with competitive average performance if compared with other related methods in the saliency detection research field. The assessment was performed by means of precision-recall curves. This idea is inspired on the human attention mechanism that is able to choose few specific regions to focus on, a biological system that the computer vision community aims to emulate. We also review some of the history on this topic of selective attention.
Resumo:
This work aims to develop a neurogeometric model of stereo vision, based on cortical architectures involved in the problem of 3D perception and neural mechanisms generated by retinal disparities. First, we provide a sub-Riemannian geometry for stereo vision, inspired by the work on the stereo problem by Zucker (2006), and using sub-Riemannian tools introduced by Citti-Sarti (2006) for monocular vision. We present a mathematical interpretation of the neural mechanisms underlying the behavior of binocular cells, that integrate monocular inputs. The natural compatibility between stereo geometry and neurophysiological models shows that these binocular cells are sensitive to position and orientation. Therefore, we model their action in the space R3xS2 equipped with a sub-Riemannian metric. Integral curves of the sub-Riemannian structure model neural connectivity and can be related to the 3D analog of the psychophysical association fields for the 3D process of regular contour formation. Then, we identify 3D perceptual units in the visual scene: they emerge as a consequence of the random cortico-cortical connection of binocular cells. Considering an opportune stochastic version of the integral curves, we generate a family of kernels. These kernels represent the probability of interaction between binocular cells, and they are implemented as facilitation patterns to define the evolution in time of neural population activity at a point. This activity is usually modeled through a mean field equation: steady stable solutions lead to consider the associated eigenvalue problem. We show that three-dimensional perceptual units naturally arise from the discrete version of the eigenvalue problem associated to the integro-differential equation of the population activity.
Resumo:
To use a world model, a mobile robot must be able to determine its own position in the world. To support truly autonomous navigation, I present MARVEL, a system that builds and maintains its own models of world locations and uses these models to recognize its world position from stereo vision input. MARVEL is designed to be robust with respect to input errors and to respond to a gradually changing world by updating its world location models. I present results from real-world tests of the system that demonstrate its reliability. MARVEL fits into a world modeling system under development.
Resumo:
La percepció per visió es millorada quan es pot gaudir d'un camp de visió ampli. Aquesta tesi es concentra en la percepció visual de la profunditat amb l'ajuda de càmeres omnidireccionals. La percepció 3D s'obté generalment en la visió per computadora utilitzant configuracions estèreo amb el desavantatge del cost computacional elevat a l'hora de buscar els elements visuals comuns entre les imatges. La solució que ofereix aquesta tesi és l'ús de la llum estructurada per resoldre el problema de relacionar les correspondències. S'ha realitzat un estudi sobre els sistemes de visió omnidireccional. S'han avaluat vàries configuracions estèreo i s'ha escollit la millor. Els paràmetres del model són difícils de mesurar directament i, en conseqüència, s'ha desenvolupat una sèrie de mètodes de calibració. Els resultats obtinguts són prometedors i demostren que el sensor pot ésser utilitzat en aplicacions per a la percepció de la profunditat com serien el modelatge de l'escena, la inspecció de canonades, navegació de robots, etc.
Resumo:
In an immersive virtual reality environment, subjects fail to notice when a scene expands or contracts around them, despite correct and consistent information from binocular stereopsis and motion parallax, resulting in gross failures of size constancy (A. Glennerster, L. Tcheang, S. J. Gilson, A. W. Fitzgibbon, & A. J. Parker, 2006). We determined whether the integration of stereopsis/motion parallax cues with texture-based cues could be modified through feedback. Subjects compared the size of two objects, each visible when the room was of a different size. As the subject walked, the room expanded or contracted, although subjects failed to notice any change. Subjects were given feedback about the accuracy of their size judgments, where the “correct” size setting was defined either by texture-based cues or (in a separate experiment) by stereo/motion parallax cues. Because of feedback, observers were able to adjust responses such that fewer errors were made. For texture-based feedback, the pattern of responses was consistent with observers weighting texture cues more heavily. However, for stereo/motion parallax feedback, performance in many conditions became worse such that, paradoxically, biases moved away from the point reinforced by the feedback. This can be explained by assuming that subjects remap the relationship between stereo/motion parallax cues and perceived size or that they develop strategies to change their criterion for a size match on different trials. In either case, subjects appear not to have direct access to stereo/motion parallax cues.
Resumo:
L'analisi di un'immagine con strumenti automatici si è sviluppata in quella che oggi viene chiamata "computer vision", la materia di studio proveniente dal mondo informatico che si occupa, letteralmente, di "vedere oltre", di estrarre da una figura una serie di aspetti strutturali, sotto forma di dati numerici. Tra le tante aree di ricerca che ne derivano, una in particolare è dedicata alla comprensione di un dettaglio estremamente interessante, che si presta ad applicazioni di molteplici tipologie: la profondità. L'idea di poter recuperare ciò che, apparentemente, si era perso fermando una scena ed imprimendone l'istante in un piano a due dimensioni poteva sembrare, fino a non troppi anni fa, qualcosa di impossibile. Grazie alla cosiddetta "visione stereo", invece, oggi possiamo godere della "terza dimensione" in diversi ambiti, legati ad attività professionali piuttosto che di svago. Inoltre, si presta ad utilizzi ancora più interessanti quando gli strumenti possono vantare caratteristiche tecniche accessibili, come dimensioni ridotte e facilità d'uso. Proprio quest'ultimo aspetto ha catturato l'attenzione di un gruppo di lavoro, dal quale è nata l'idea di sviluppare una soluzione, chiamata "SuperStereo", capace di permettere la stereo vision usando uno strumento estremamente diffuso nel mercato tecnologico globale: uno smartphone e, più in generale, qualsiasi dispositivo mobile appartenente a questa categoria.
Resumo:
In this work we solve the uncalibrated photometric stereo problem with lights placed near the scene. We investigate different image formation models and find the one that best fits our observations. Although the devised model is more complex than its far-light counterpart, we show that under a global linear ambiguity the reconstruction is possible up to a rotation and scaling, which can be easily fixed. We also propose a solution for reconstructing the normal map, the albedo, the light positions and the light intensities of a scene given only a sequence of near-light images. This is done in an alternating minimization framework which first estimates both the normals and the albedo, and then the light positions and intensities. We validate our method on real world experiments and show that a near-light model leads to a significant improvement in the surface reconstruction compared to the classic distant illumination case.
Resumo:
In recent years, remote sensing imaging systems for the measurement of oceanic sea states have attracted renovated attention. Imaging technology is economical, non-invasive and enables a better understanding of the space-time dynamics of ocean waves over an area rather than at selected point locations of previous monitoring methods (buoys, wave gauges, etc.). We present recent progress in space-time measurement of ocean waves using stereo vision systems on offshore platforms, which focus on sea states with wavelengths in the range of 0.01 m to 10 m. Classical epipolar techniques and modern variational methods are reviewed to reconstruct the sea surface from the stereo pairs sequentially in time. The statistical and spectral properties of the resulting observed waves are analyzed. Current improvements of the variational methods are discussed as future lines of research.
Resumo:
In recent years, remote sensing imaging systems for the measurement of oceanic sea states have attracted renovated attention. Imaging technology is economical, non-invasive and enables a better understanding of the space-time dynamics of ocean waves over an area rather than at selected point locations of previous monitoring methods (buoys, wave gauges, etc.). We present recent progress in space-time measurement of ocean waves using stereo vision systems on offshore platforms, which focus on sea states with wavelengths in the range of 0.01 m to 1 m. Both traditional disparity-based systems and modern elevation-based ones are presented in a variational optimization framework: the main idea is to pose the stereoscopic reconstruction problem of the surface of the ocean in a variational setting and design an energy functional whose minimizer is the desired temporal sequence of wave heights. The functional combines photometric observations as well as spatial and temporal smoothness priors. Disparity methods estimate the disparity between images as an intermediate step toward retrieving the depth of the waves with respect to the cameras, whereas elevation methods estimate the ocean surface displacements directly in 3-D space. Both techniques are used to measure ocean waves from real data collected at offshore platforms in the Black Sea (Crimean Peninsula, Ukraine) and the Northern Adriatic Sea (Venice coast, Italy). Then, the statistical and spectral properties of the resulting observed waves are analyzed. We show the advantages and disadvantages of the presented stereo vision systems and discuss future lines of research to improve their performance in critical issues such as the robustness of the camera calibration in spite of undesired variations of the camera parameters or the processing time that it takes to retrieve ocean wave measurements from the stereo videos, which are very large datasets that need to be processed efficiently to be of practical usage. Multiresolution and short-time approaches would improve efficiency and scalability of the techniques so that wave displacements are obtained in feasible times.
Resumo:
Remote sensing imaging systems for the measurement of oceanic sea states have recently attracted renovated attention. Imaging technology is economical, non-invasive and enables a better understanding of the space-time dynamics of ocean waves over an area rather than at selected point locations of previous monitoring methods (buoys, wave gauges, etc.). We present recent progress in space-time measurement of ocean waves using stereo vision systems on offshore platforms. Both traditional disparity-based systems and modern elevation-based ones are presented in a variational optimization framework: the main idea is to pose the stereoscopic reconstruction problem of the surface of the ocean in a variational setting and design an energy functional whose minimizer is the desired temporal sequence of wave heights. The functional combines photometric observations as well as spatial and temporal smoothness priors. Disparity methods estimate the disparity between images as an intermediate step toward retrieving the depth of the waves with respect to the cameras, whereas elevation methods estimate the ocean surface displacements directly in 3-D space. Both techniques are used to measure ocean waves from real data collected at offshore platforms in the Black Sea (Crimean Peninsula, Ukraine) and the Northern Adriatic Sea (Venice coast, Italy). Then, the statistical and spectral properties of the resulting observed waves are analyzed. We show the advantages and disadvantages of the presented stereo vision systems and discuss the improvement of their performance in critical issues such as the robustness of the camera calibration in spite of undesired variations of the camera parameters.
Resumo:
In this paper we propose an innovative method for the automatic detection and tracking of road traffic signs using an onboard stereo camera. It involves a combination of monocular and stereo analysis strategies to increase the reliability of the detections such that it can boost the performance of any traffic sign recognition scheme. Firstly, an adaptive color and appearance based detection is applied at single camera level to generate a set of traffic sign hypotheses. In turn, stereo information allows for sparse 3D reconstruction of potential traffic signs through a SURF-based matching strategy. Namely, the plane that best fits the cloud of 3D points traced back from feature matches is estimated using a RANSAC based approach to improve robustness to outliers. Temporal consistency of the 3D information is ensured through a Kalman-based tracking stage. This also allows for the generation of a predicted 3D traffic sign model, which is in turn used to enhance the previously mentioned color-based detector through a feedback loop, thus improving detection accuracy. The proposed solution has been tested with real sequences under several illumination conditions and in both urban areas and highways, achieving very high detection rates in challenging environments, including rapid motion and significant perspective distortion
Resumo:
Falls are one of the greatest threats to elderly health in their daily living routines and activities. Therefore, it is very important to detect falls of an elderly in a timely and accurate manner, so that immediate response and proper care can be provided, by sending fall alarms to caregivers. Radar is an effective non-intrusive sensing modality which is well suited for this purpose, which can detect human motions in all types of environments, penetrate walls and fabrics, preserve privacy, and is insensitive to lighting conditions. Micro-Doppler features are utilized in radar signal corresponding to human body motions and gait to detect falls using a narrowband pulse-Doppler radar. Human motions cause time-varying Doppler signatures, which are analyzed using time-frequency representations and matching pursuit decomposition (MPD) for feature extraction and fall detection. The extracted features include MPD features and the principal components of the time-frequency signal representations. To analyze the sequential characteristics of typical falls, the extracted features are used for training and testing hidden Markov models (HMM) in different falling scenarios. Experimental results demonstrate that the proposed algorithm and method achieve fast and accurate fall detections. The risk of falls increases sharply when the elderly or patients try to exit beds. Thus, if a bed exit can be detected at an early stage of this motion, the related injuries can be prevented with a high probability. To detect bed exit for fall prevention, the trajectory of head movements is used for recognize such human motion. A head detector is trained using the histogram of oriented gradient (HOG) features of the head and shoulder areas from recorded bed exit images. A data association algorithm is applied on the head detection results to eliminate head detection false alarms. Then the three dimensional (3D) head trajectories are constructed by matching scale-invariant feature transform (SIFT) keypoints in the detected head areas from both the left and right stereo images. The extracted 3D head trajectories are used for training and testing an HMM based classifier for recognizing bed exit activities. The results of the classifier are presented and discussed in the thesis, which demonstrates the effectiveness of the proposed stereo vision based bed exit detection approach.
Resumo:
Clouds are important in weather prediction, climate studies and aviation safety. Important parameters include cloud height, type and cover percentage. In this paper, the recent improvements in the development of a low-cost cloud height measurement setup are described. It is based on stereo vision with consumer digital cameras. The cameras positioning is calibrated using the position of stars in the night sky. An experimental uncertainty analysis of the calibration parameters is performed. Cloud height measurement results are presented and compared with LIDAR measurements.
Resumo:
A presente dissertação endereça o desenvolvimento de um sistema de visão stereo ativo para os robôs de futebol robótico da equipa ISePorto do ISEP, de modo a que estes tirem o máximo partido das câmaras rotativas neles existentes. Este trabalho surge da necessidade de melhorar a capacidade de perceção do ambiente por parte dos robôs, principalmente da perceção da bola quando não está no plano do campo e dos robôs adversários. Esta necessidade surge devido ao aumento da dinâmica que se tem vindo a veri car ultimamente nas competições. Para tal, foram estudados algumas trabalhos relacionados no que diz respeito a sistemas de visão stereo com baselines variáveis e eixos de rotação em ambas as câmaras, bem como fundamentos de visão stereo. Foi proposta uma arquitetura para o sistema de visão ativo de modo a ser aplicado em qualquer robô da equipa MSL (Middle Size League). Para tornar possível a implementação desta arquitetura foi desenvolvido um procedimento para a calibração e determinação em tempo real dos parâmetros extrínsecos do par stereo em função da posição angular dos eixos rotativos do robô. O sistema de visão foi também dotado de capacidade de sincronismo e foram implementadas funcionalidades ao nível de software que possibilitam a deteção de objetos na imagem, a correspondência de objetos presentes nas imagens de ambas as câmaras e consequentemente a determinação das posições tridimensionais desses objetos relativamente ao robô. O sistema desenvolvido foi testado e validado em cenário MSL ao nível de perceção da bola, robôs adversários e linhas do campo. Os resultados obtidos apresentam uma melhoria signi cativa, face à implementação atual dos robôs, na perceção tridimensional da bola quando não está no plano do campo, e dos robôs adversários.