Biblioteca Digital

45 resultados para visual surveillance system

Monitoring visual attention on a neurorehabilitation environment based on interactive video

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The use of new technologies in neurorehabilitation has led to higher intensity rehabilitation processes, extending therapies in an economically sustainable way. Interactive Video (IV) technology allows therapists to work with virtual environments that reproduce real situations. In this way, patients deal with Activities of the Daily Living (ADL) immersed within enhanced environments [1]. These rehabilitation exercises, which focus in re-learning lost functions, will try to modulate the neural plasticity processes [2]. This research presents a system where a neurorehabilitation IV-based environment has been integrated with an eye-tracker device in order to monitor and to interact using visual attention. While patients are interacting with the neurorehabilitation environment, their visual behavior is closely related with their cognitive state, which in turn mirrors the brain damage condition suffered by them [3] [4]. Patients’ gaze data can provide knowledge on their attention focus and their cognitive state, as well as on the validity of the rehabilitation tasks proposed [5].

Dynamic gamma frequency feedback coupling between higher and lower order visual cortices underlies perceptual completion in humans

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To perceive a coherent environment, incomplete or overlapping visual forms must be integrated into meaningful coherent percepts, a process referred to as ?Gestalt? formation or perceptual completion. Increasing evidence suggests that this process engages oscillatory neuronal activity in a distributed neuronal assembly. A separate line of evidence suggests that Gestalt formation requires top-down feedback from higher order brain regions to early visual cortex. Here we combine magnetoencephalography (MEG) and effective connectivity analysis in the frequency domain to specifically address the effective coupling between sources of oscillatory brain activity during Gestalt formation. We demonstrate that perceptual completion of two-tone ?Mooney? faces induces increased gamma frequency band power (55?71 Hz) in human early visual, fusiform and parietal cortices. Within this distributed neuronal assembly fusiform and parietal gamma oscillators are coupled by forward and backward connectivity during Mooney face perception, indicating reciprocal influences of gamma activity between these higher order visual brain regions. Critically, gamma band oscillations in early visual cortex are modulated by top-down feedback connectivity from both fusiform and parietal cortices. Thus, we provide a mechanistic account of Gestalt perception in which gamma oscillations in feature sensitive and spatial attention-relevant brain regions reciprocally drive one another and convey global stimulus aspects to local processing units at low levels of the sensory hierarchy by top-down feedback. Our data therefore support the notion of inverse hierarchical processing within the visual system underlying awareness of coherent percepts.

A comparative analysis of visual strategy in elite and amateur handball goalkeepers

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The aim of the present study was to analyze the visual strategies prior to a throw from 7 metres in elite and amateur handball goalkeepers. To this end we analyzed the visual fixations in number and order of 10 goalkeepers (29.7±5.4 years; 14.7±8.6 years of experience), 3 elite and 7 amateurs, during the life size projection of 14 different throws, made by different players. During each throw the movement of the eyeballs, the dilation of the pupil (pupillometry) and the subject?s blinking were recorded thanks to a technological system which permitted eye tracking with high speed cameras, and the subsequent presentation of the visual data for each action studied. The elite goalkeepers performed a greater number of visual fixations than the amateur goalkeepers, revealing large and significant differences. Equally the priority zones observed were differed, with the amateur goalkeepers fixating more on the thrower?s face, and the elite goalkeepers paying more attention to the area of the arm/ball. It can therefore be inferred that elite goalkeepers have a greater perceptive capacity and also use different visual strategies from the amateur goalkeepers.

Some relations between visual perception and non-linear photonic structures

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The main objective of this work is to present a way to emulate some functions of the mammalian visual system and a model to analyze subjective sensations and visual illusions

Bayesian scene analysis for multi-camera 3D tracking and camera positioning

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Los sistemas de seguimiento mono-cámara han demostrado su notable capacidad para el análisis de trajectorias de objectos móviles y para monitorización de escenas de interés; sin embargo, tanto su robustez como sus posibilidades en cuanto a comprensión semántica de la escena están fuertemente limitadas por su naturaleza local y monocular, lo que los hace insuficientes para aplicaciones realistas de videovigilancia. El objetivo de esta tesis es la extensión de las posibilidades de los sistemas de seguimiento de objetos móviles para lograr un mayor grado de robustez y comprensión de la escena. La extensión propuesta se divide en dos direcciones separadas. La primera puede considerarse local, ya que está orientada a la mejora y enriquecimiento de las posiciones estimadas para los objetos móviles observados directamente por las cámaras del sistema; dicha extensión se logra mediante el desarrollo de un sistema multi-cámara de seguimiento 3D, capaz de proporcionar consistentemente las posiciones 3D de múltiples objetos a partir de las observaciones capturadas por un conjunto de sensores calibrados y con campos de visión solapados. La segunda extensión puede considerarse global, dado que su objetivo consiste en proporcionar un contexto global para relacionar las observaciones locales realizadas por una cámara con una escena de mucho mayor tamaño; para ello se propone un sistema automático de localización de cámaras basado en las trayectorias observadas de varios objetos móviles y en un mapa esquemático de la escena global monitorizada. Ambas líneas de investigación se tratan utilizando, como marco común, técnicas de estimación bayesiana: esta elección está justificada por la versatilidad y flexibilidad proporcionada por dicho marco estadístico, que permite la combinación natural de múltiples fuentes de información sobre los parámetros a estimar, así como un tratamiento riguroso de la incertidumbre asociada a las mismas mediante la inclusión de modelos de observación específicamente diseñados. Además, el marco seleccionado abre grandes posibilidades operacionales, puesto que permite la creación de diferentes métodos numéricos adaptados a las necesidades y características específicas de distintos problemas tratados. El sistema de seguimiento 3D con múltiples cámaras propuesto está específicamente diseñado para permitir descripciones esquemáticas de las medidas realizadas individualmente por cada una de las cámaras del sistema: esta elección de diseño, por tanto, no asume ningún algoritmo específico de detección o seguimiento 2D en ninguno de los sensores de la red, y hace que el sistema propuesto sea aplicable a redes reales de vigilancia con capacidades limitadas tanto en términos de procesamiento como de transmision. La combinación robusta de las observaciones capturadas individualmente por las cámaras, ruidosas, incompletas y probablemente contaminadas por falsas detecciones, se basa en un metodo de asociación bayesiana basado en geometría y color: los resultados de dicha asociación permiten el seguimiento 3D de los objetos de la escena mediante el uso de un filtro de partículas. El sistema de fusión de observaciones propuesto tiene, como principales características, una gran precisión en términos de localización 3D de objetos, y una destacable capacidad de recuperación tras eventuales errores debidos a un número insuficiente de datos de entrada. El sistema automático de localización de cámaras se basa en la observación de múltiples objetos móviles y un mapa esquemático de las áreas transitables del entorno monitorizado para inferir la posición absoluta de dicho sensor. Para este propósito, se propone un novedoso marco bayesiano que combina modelos dinámicos inducidos por el mapa en los objetos móviles presentes en la escena con las trayectorias observadas por la cámara, lo que representa un enfoque nunca utilizado en la literatura existente. El sistema de localización se divide en dos sub-tareas diferenciadas, debido a que cada una de estas tareas requiere del diseño de algoritmos específicos de muestreo para explotar en profundidad las características del marco desarrollado: por un lado, análisis de la ambigüedad del caso específicamente tratado y estimación aproximada de la localización de la cámara, y por otro, refinado de la localización de la cámara. El sistema completo, diseñado y probado para el caso específico de localización de cámaras en entornos de tráfico urbano, podría tener aplicación también en otros entornos y sensores de diferentes modalidades tras ciertas adaptaciones. ABSTRACT Mono-camera tracking systems have proved their capabilities for moving object trajectory analysis and scene monitoring, but their robustness and semantic possibilities are strongly limited by their local and monocular nature and are often insufficient for realistic surveillance applications. This thesis is aimed at extending the possibilities of moving object tracking systems to a higher level of scene understanding. The proposed extension comprises two separate directions. The first one is local, since is aimed at enriching the inferred positions of the moving objects within the area of the monitored scene directly covered by the cameras of the system; this task is achieved through the development of a multi-camera system for robust 3D tracking, able to provide 3D tracking information of multiple simultaneous moving objects from the observations reported by a set of calibrated cameras with semi-overlapping fields of view. The second extension is global, as is aimed at providing local observations performed within the field of view of one camera with a global context relating them to a much larger scene; to this end, an automatic camera positioning system relying only on observed object trajectories and a scene map is designed. The two lines of research in this thesis are addressed using Bayesian estimation as a general unifying framework. Its suitability for these two applications is justified by the flexibility and versatility of that stochastic framework, which allows the combination of multiple sources of information about the parameters to estimate in a natural and elegant way, addressing at the same time the uncertainty associated to those sources through the inclusion of models designed to this end. In addition, it opens multiple possibilities for the creation of different numerical methods for achieving satisfactory and efficient practical solutions to each addressed application. The proposed multi-camera 3D tracking method is specifically designed to work on schematic descriptions of the observations performed by each camera of the system: this choice allows the use of unspecific off-the-shelf 2D detection and/or tracking subsystems running independently at each sensor, and makes the proposal suitable for real surveillance networks with moderate computational and transmission capabilities. The robust combination of such noisy, incomplete and possibly unreliable schematic descriptors relies on a Bayesian association method, based on geometry and color, whose results allow the tracking of the targets in the scene with a particle filter. The main features exhibited by the proposal are, first, a remarkable accuracy in terms of target 3D positioning, and second, a great recovery ability after tracking losses due to insufficient input data. The proposed system for visual-based camera self-positioning uses the observations of moving objects and a schematic map of the passable areas of the environment to infer the absolute sensor position. To this end, a new Bayesian framework combining trajectory observations and map-induced dynamic models for moving objects is designed, which represents an approach to camera positioning never addressed before in the literature. This task is divided into two different sub-tasks, setting ambiguity analysis and approximate position estimation, on the one hand, and position refining, on the other, since they require the design of specific sampling algorithms to correctly exploit the discriminative features of the developed framework. This system, designed for camera positioning and demonstrated in urban traffic environments, can also be applied to different environments and sensors of other modalities after certain required adaptations.

Editor web de conexiones para la creación visual de aplicaciones composicionales mediante mashup

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A lo largo de este documento se describe el trabajo llevado a cabo para el logro de todos objetivos de este Trabajo Fin de Grado, el cual tiene como objetivo principal la mejora de la herramienta de edición de las conexiones internas de un mashup proporcionada actualmente por la plataforma web WireCloud. WireCloud es una plataforma web centrada en la construcción visual de mashups de aplicaciones a partir de la interconexión de pequeñas aplicaciones web denominadas widgets. Los principales inconvenientes presentes en el actual editor de conexiones que incluye esta plataforma afectan principalmente a sus usuarios con poca experiencia en diseño web. Estos usuarios tienen dificultades a la hora de interpretar el esquema de conexiones de un mashup ajeno y también, de crear el esquema de conexiones de un mashup propio. La mejora realizada supone un cambio en la metáfora utilizada para la creación de las conexiones, que ahora se organiza en torno a unidades conceptuales denominadas comportamientos que representan subconjuntos cohesionados de conexiones con significado (representando por si mismos comportamientos relevantes del mashup). Con este cambio se logra solventar los inconvenientes que presenta el actual sistema, principalmente la necesidad de crear (y visualizar) simultáneamente todas las conexiones requeridas por el mashup, lo cual supone que: a) es difícil identificar con qué propósito se ha creado cada conexión y qué relación guardan unas conexiones con otras. b) existe un riesgo de olvidar alguna conexión, fundamentalmente por ser difícil la interpretación del propósito de cada conexión y por la imposibilidad de identificar y nombrar conjuntos de conexiones que tienen un propósito determinado. Antes de implementar el código fuente se realizó un estudio pormenorizado de las tecnologías que actualmente utiliza WireCloud, además de un estudio en profundidad de la situación del anterior editor y de las nuevas características y ventajas buscadas en este nuevo editor. Con este último propósito se definieron varios casos de estudio que ayudaron a concretar qué se ha de entender por un comportamiento en el diseño de las conexiones de un mashup y también, ayudaron a definir la mejor organización visual del nuevo editor en torno al concepto de comportamiento. El resto del trabajo consistió en la implementación del nuevo editor y en la elaboración de toda la documentación relacionada: principalmente el manual de uso del nuevo editor que estará disponible como parte de la documentación online de WireCloud.---ABSTRACT---This document describes the work carried out for the achievement of all targets of this Final Project, which has as its main objective the improvement of the edition tool of the mashup’s internal connections that is currently provided by WireCloud. WireCloud is a web platform focused on building visual of web mashups from the interconnection of web applications called web widgets. The main drawbacks present in the current web editor of connections, includes on this platform, mainly affect to users with little experience in web design. These users have difficulties for interpreting the wiring diagram of any web mashup and also, creating the wiring diagram of an own web mashup. The improvement made a change in the metaphor used for creating connections, now managing a conceptual units called behaviors that represent subsets of meaningful connections. This change overcomes the drawbacks of the current system, mainly the need to create and display simultaneously all connections required by the web mashup, which means that: a) Difficultly identify what purpose is created each connection and how they relate to each other connections. b) There is the risk of forgetting some connection, mainly for being difficult to interpret the purpose of each connection and the inability to identify and name sets of connections that have a specific purpose. Before deploying the source code, the study of the technologies currently used WireCloud was performed, plus the study of the situation of the previous editor and new features and advantages searched for this new editor was performed too. For the latter purpose was defined several case studies that helped to specify what understood by behavior in the design of the connections of a web mashup and also that helped to define the best visual organization of the new behavior-oriented wiring editor. The rest of the work involved in implementing the new wiring web editor and in the preparation of all documentation related: mainly user manual for using the new wiring web editor that will be available as part of the WireCloud online documentation.

Constructed Fashion: The Catwalk System as an Architectural Project

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Teniendo en cuenta que no hay nada que se escape de la moda 1, y extendiendonos más allá de esta manida discusión sobre intersecciones formales, esta investigación propone la pasarela como un lugar real de mediación entre moda y arquitectura. Asumiendo esta condición, la pasarela encarna nuevos modos de producción apropiándose de su espacio y estructura, y convierténdose en una máquina capaz de generar múltiples y más bien infinitos significados. La moda es sin duda un proyecto creativo, que ha venido utilizando la pasarela como un marco para la reordenación de su narrativa visual, renovándose asi mismo como fenómeno social. Este proyecto de investigación plantea, que contrariamente las tipologías actuales de las pasarelas no nos facilitan la comprensión de una colección – que suele ser el objetivo principal. Presentan en cambio un entorno en el que se acoplan diferentes formatos visuales, -con varias capas-, conviéndolo en una compleja construcción y provocando nunerosas fricciones con el espacio-tiempo-acción durante el proceso de creación de otros territorios. Partiendo de la idea de la pasarela como un sistema, en el que sus numerosas variables pueden producir diversas combinaciones, esta investigación plantea la hipótesis por la cual un nuevo sistema de pasarela se estaría formando enteramente con capas de información. Este escenario nos conduciría a la inmersión final de la moda en los tejidos de la virtualidad. Si bien el debate sobre la relevancia de los desfiles de moda se ha vuelto más evidente hoy en día, esta investigación especula con la posibilidad del pensamiento arquitectónico y como este puede introducir metodologías de análisis en el marco de estos desfiles de moda, proponiendo una lectura de la pasarela como un sistema de procedimientos específicos inherente a los proyectos/procesos de la arquitectura. Este enfoque enlaza ambas prácticas en un territorio común donde el espacio, el diseño, el comportamiento, el movimiento, y los cuerpos son ordenados/organizados en la creación de estas nuevas posibilidades visuales, y donde las interacciones activan la generación de la novedad y los mensajes. PALABRAS CLAVES moda, sistema, virtual, información, arquitectura Considering that there is nothing left untouched by fashion2, and going beyond the already exhausted discussion about formal intersections, this research introduces the catwalk as the real arena of mediation between fashion and architecture. By assuming this condition, the catwalk embodies new modes of production that appropriates its space and turns it into a machine for generating multiple if not infinite meanings. Fashion, as a creative project, has utilized the catwalk as a frame for rearranging its visual narrative and renewing itself as social phenomena. This research disputes, however, that the current typologies of catwalks do not facilitate the understanding of the collection – as its primary goal - but, instead, present an environment composed of multi-layered visual formats, becoming a complex construct that collides space-time-action in the creation of other territories. Departing from the analysis of the catwalk as a system and how its many variables can produce diverse combinations, this research presents the hypothesis that a new system is being formed entirely built out of information. Such scenario indicates fashion´s final immersion into the fabrics of virtuality. While the discussion about the relevance of fashion shows has become more evident today, this research serves as an introductory speculation on how architectural thinking can introduce methodologies of analysis within the framework of the fashion shows, by proposing a reading of the catwalk as a system through specific procedures that are inherent to architectural projects. Such approach intertwines both practices into a common territory where space, design, behaviour, movement, and bodies are organized for the creation of visual possibilities, and where interactions are triggered in the making of novelty and messages. KEYWORDS fashion, system, virtual, information, architectural

Bayesian scene analysis for multi-camera 3D tracking and camera positioning

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Los sistemas de seguimiento mono-cámara han demostrado su notable capacidad para el análisis de trajectorias de objectos móviles y para monitorización de escenas de interés; sin embargo, tanto su robustez como sus posibilidades en cuanto a comprensión semántica de la escena están fuertemente limitadas por su naturaleza local y monocular, lo que los hace insuficientes para aplicaciones realistas de videovigilancia. El objetivo de esta tesis es la extensión de las posibilidades de los sistemas de seguimiento de objetos móviles para lograr un mayor grado de robustez y comprensión de la escena. La extensión propuesta se divide en dos direcciones separadas. La primera puede considerarse local, ya que está orientada a la mejora y enriquecimiento de las posiciones estimadas para los objetos móviles observados directamente por las cámaras del sistema; dicha extensión se logra mediante el desarrollo de un sistema multi-cámara de seguimiento 3D, capaz de proporcionar consistentemente las posiciones 3D de múltiples objetos a partir de las observaciones capturadas por un conjunto de sensores calibrados y con campos de visión solapados. La segunda extensión puede considerarse global, dado que su objetivo consiste en proporcionar un contexto global para relacionar las observaciones locales realizadas por una cámara con una escena de mucho mayor tamaño; para ello se propone un sistema automático de localización de cámaras basado en las trayectorias observadas de varios objetos móviles y en un mapa esquemático de la escena global monitorizada. Ambas líneas de investigación se tratan utilizando, como marco común, técnicas de estimación bayesiana: esta elección está justificada por la versatilidad y flexibilidad proporcionada por dicho marco estadístico, que permite la combinación natural de múltiples fuentes de información sobre los parámetros a estimar, así como un tratamiento riguroso de la incertidumbre asociada a las mismas mediante la inclusión de modelos de observación específicamente diseñados. Además, el marco seleccionado abre grandes posibilidades operacionales, puesto que permite la creación de diferentes métodos numéricos adaptados a las necesidades y características específicas de distintos problemas tratados. El sistema de seguimiento 3D con múltiples cámaras propuesto está específicamente diseñado para permitir descripciones esquemáticas de las medidas realizadas individualmente por cada una de las cámaras del sistema: esta elección de diseño, por tanto, no asume ningún algoritmo específico de detección o seguimiento 2D en ninguno de los sensores de la red, y hace que el sistema propuesto sea aplicable a redes reales de vigilancia con capacidades limitadas tanto en términos de procesamiento como de transmision. La combinación robusta de las observaciones capturadas individualmente por las cámaras, ruidosas, incompletas y probablemente contaminadas por falsas detecciones, se basa en un metodo de asociación bayesiana basado en geometría y color: los resultados de dicha asociación permiten el seguimiento 3D de los objetos de la escena mediante el uso de un filtro de partículas. El sistema de fusión de observaciones propuesto tiene, como principales características, una gran precisión en términos de localización 3D de objetos, y una destacable capacidad de recuperación tras eventuales errores debidos a un número insuficiente de datos de entrada. El sistema automático de localización de cámaras se basa en la observación de múltiples objetos móviles y un mapa esquemático de las áreas transitables del entorno monitorizado para inferir la posición absoluta de dicho sensor. Para este propósito, se propone un novedoso marco bayesiano que combina modelos dinámicos inducidos por el mapa en los objetos móviles presentes en la escena con las trayectorias observadas por la cámara, lo que representa un enfoque nunca utilizado en la literatura existente. El sistema de localización se divide en dos sub-tareas diferenciadas, debido a que cada una de estas tareas requiere del diseño de algoritmos específicos de muestreo para explotar en profundidad las características del marco desarrollado: por un lado, análisis de la ambigüedad del caso específicamente tratado y estimación aproximada de la localización de la cámara, y por otro, refinado de la localización de la cámara. El sistema completo, diseñado y probado para el caso específico de localización de cámaras en entornos de tráfico urbano, podría tener aplicación también en otros entornos y sensores de diferentes modalidades tras ciertas adaptaciones. ABSTRACT Mono-camera tracking systems have proved their capabilities for moving object trajectory analysis and scene monitoring, but their robustness and semantic possibilities are strongly limited by their local and monocular nature and are often insufficient for realistic surveillance applications. This thesis is aimed at extending the possibilities of moving object tracking systems to a higher level of scene understanding. The proposed extension comprises two separate directions. The first one is local, since is aimed at enriching the inferred positions of the moving objects within the area of the monitored scene directly covered by the cameras of the system; this task is achieved through the development of a multi-camera system for robust 3D tracking, able to provide 3D tracking information of multiple simultaneous moving objects from the observations reported by a set of calibrated cameras with semi-overlapping fields of view. The second extension is global, as is aimed at providing local observations performed within the field of view of one camera with a global context relating them to a much larger scene; to this end, an automatic camera positioning system relying only on observed object trajectories and a scene map is designed. The two lines of research in this thesis are addressed using Bayesian estimation as a general unifying framework. Its suitability for these two applications is justified by the flexibility and versatility of that stochastic framework, which allows the combination of multiple sources of information about the parameters to estimate in a natural and elegant way, addressing at the same time the uncertainty associated to those sources through the inclusion of models designed to this end. In addition, it opens multiple possibilities for the creation of different numerical methods for achieving satisfactory and efficient practical solutions to each addressed application. The proposed multi-camera 3D tracking method is specifically designed to work on schematic descriptions of the observations performed by each camera of the system: this choice allows the use of unspecific off-the-shelf 2D detection and/or tracking subsystems running independently at each sensor, and makes the proposal suitable for real surveillance networks with moderate computational and transmission capabilities. The robust combination of such noisy, incomplete and possibly unreliable schematic descriptors relies on a Bayesian association method, based on geometry and color, whose results allow the tracking of the targets in the scene with a particle filter. The main features exhibited by the proposal are, first, a remarkable accuracy in terms of target 3D positioning, and second, a great recovery ability after tracking losses due to insufficient input data. The proposed system for visual-based camera self-positioning uses the observations of moving objects and a schematic map of the passable areas of the environment to infer the absolute sensor position. To this end, a new Bayesian framework combining trajectory observations and map-induced dynamic models for moving objects is designed, which represents an approach to camera positioning never addressed before in the literature. This task is divided into two different sub-tasks, setting ambiguity analysis and approximate position estimation, on the one hand, and position refining, on the other, since they require the design of specific sampling algorithms to correctly exploit the discriminative features of the developed framework. This system, designed for camera positioning and demonstrated in urban traffic environments, can also be applied to different environments and sensors of other modalities after certain required adaptations.

Multisensor data fusion for accurate modelling of mobile objects

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the last decade, multi-sensor data fusion has become a broadly demanded discipline to achieve advanced solutions that can be applied in many real world situations, either civil or military. In Defence,accurate detection of all target objects is fundamental to maintaining situational awareness, to locating threats in the battlefield and to identifying and protecting strategically own forces. Civil applications, such as traffic monitoring, have similar requirements in terms of object detection and reliable identification of incidents in order to ensure safety of road users. Thanks to the appropriate data fusion technique, we can give these systems the power to exploit automatically all relevant information from multiple sources to face for instance mission needs or assess daily supervision operations. This paper focuses on its application to active vehicle monitoring in a particular area of high density traffic, and how it is redirecting the research activities being carried out in the computer vision, signal processing and machine learning fields for improving the effectiveness of detection and tracking in ground surveillance scenarios in general. Specifically, our system proposes fusion of data at a feature level which is extracted from a video camera and a laser scanner. In addition, a stochastic-based tracking which introduces some particle filters into the model to deal with uncertainty due to occlusions and improve the previous detection output is presented in this paper. It has been shown that this computer vision tracker contributes to detect objects even under poor visual information. Finally, in the same way that humans are able to analyze both temporal and spatial relations among items in the scene to associate them a meaning, once the targets objects have been correctly detected and tracked, it is desired that machines can provide a trustworthy description of what is happening in the scene under surveillance. Accomplishing so ambitious task requires a machine learning-based hierarchic architecture able to extract and analyse behaviours at different abstraction levels. A real experimental testbed has been implemented for the evaluation of the proposed modular system. Such scenario is a closed circuit where real traffic situations can be simulated. First results have shown the strength of the proposed system.

Low-Cost Radar-Based Target Identification Prototype using an Expert System

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Smart and green cities are hot topics in current research because people are becoming more conscious about their impact on the environment and the sustainability of their cities as the population increases. Many researchers are searching for mechanisms that can reduce power consumption and pollution in the city environment. This paper addresses the issue of public lighting and how it can be improved in order to achieve a more energy efficient city. This work is focused on making the process of turning the streetlights on and off more intelligent so that they consume less power and cause less light pollution. The proposed solution is comprised of a radar device and an expert system implemented on a low-cost platform based on a DSP. By analyzing the radar echo in both the frequency and time domains, the system is able to detect and identify objects moving in front of it. This information is used to decide whether or not the streetlight should be turned on. Experimental results show that the proposed system can provide hit rates over 80%, promising a good performance. In addition, the proposed solution could be useful in kind of other applications such as intelligent security and surveillance systems and home automation.

Foreground segmentation in depth imagery using depth and spatial dynamic models for video surveillance applications

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Low-cost systems that can obtain a high-quality foreground segmentation almostindependently of the existing illumination conditions for indoor environments are verydesirable, especially for security and surveillance applications. In this paper, a novelforeground segmentation algorithm that uses only a Kinect depth sensor is proposedto satisfy the aforementioned system characteristics. This is achieved by combininga mixture of Gaussians-based background subtraction algorithm with a new Bayesiannetwork that robustly predicts the foreground/background regions between consecutivetime steps. The Bayesian network explicitly exploits the intrinsic characteristics ofthe depth data by means of two dynamic models that estimate the spatial and depthevolution of the foreground/background regions. The most remarkable contribution is thedepth-based dynamic model that predicts the changes in the foreground depth distributionbetween consecutive time steps. This is a key difference with regard to visible imagery,where the color/gray distribution of the foreground is typically assumed to be constant.Experiments carried out on two different depth-based databases demonstrate that theproposed combination of algorithms is able to obtain a more accurate segmentation of theforeground/background than other state-of-the art approaches.

Toward Visual Autonomous Ship Board Landing of a VTOL UAV

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we tackle the problem of landing a helicopter autonomously on a ship deck, using as the main sensor, an on-board colour camera. To create a test-bed, we first adequately simulate the movement of a ship landing platform on the Sea, for different Sea States, for different ships, randomly and realistically enough. We use a commercial parallel robot to get this movement. Once we had this, we developed an accurate and robust computer vision system to measure the pose of the helipad with respect to the on-board camera. To deal with the noise and the possible fails of the computer vision, a state estimator was created. With all of this, we are now able to develop and test a controller that closes the loop and finish the autonomous landing task.

Visual attention and perception models for assessing quality in 2D and 3D stereoscopic video

Relevância:

30.00% 30.00%

Publicador:

Resumo:

La medida de calidad de vídeo sigue siendo necesaria para definir los criterios que caracterizan una señal que cumpla los requisitos de visionado impuestos por el usuario. Las nuevas tecnologías, como el vídeo 3D estereoscópico o formatos más allá de la alta definición, imponen nuevos criterios que deben ser analizadas para obtener la mayor satisfacción posible del usuario. Entre los problemas detectados durante el desarrollo de esta tesis doctoral se han determinado fenómenos que afectan a distintas fases de la cadena de producción audiovisual y tipo de contenido variado. En primer lugar, el proceso de generación de contenidos debe encontrarse controlado mediante parámetros que eviten que se produzca el disconfort visual y, consecuentemente, fatiga visual, especialmente en lo relativo a contenidos de 3D estereoscópico, tanto de animación como de acción real. Por otro lado, la medida de calidad relativa a la fase de compresión de vídeo emplea métricas que en ocasiones no se encuentran adaptadas a la percepción del usuario. El empleo de modelos psicovisuales y diagramas de atención visual permitirían ponderar las áreas de la imagen de manera que se preste mayor importancia a los píxeles que el usuario enfocará con mayor probabilidad. Estos dos bloques se relacionan a través de la definición del término saliencia. Saliencia es la capacidad del sistema visual para caracterizar una imagen visualizada ponderando las áreas que más atractivas resultan al ojo humano. La saliencia en generación de contenidos estereoscópicos se refiere principalmente a la profundidad simulada mediante la ilusión óptica, medida en términos de distancia del objeto virtual al ojo humano. Sin embargo, en vídeo bidimensional, la saliencia no se basa en la profundidad, sino en otros elementos adicionales, como el movimiento, el nivel de detalle, la posición de los píxeles o la aparición de caras, que serán los factores básicos que compondrán el modelo de atención visual desarrollado. Con el objetivo de detectar las características de una secuencia de vídeo estereoscópico que, con mayor probabilidad, pueden generar disconfort visual, se consultó la extensa literatura relativa a este tema y se realizaron unas pruebas subjetivas preliminares con usuarios. De esta forma, se llegó a la conclusión de que se producía disconfort en los casos en que se producía un cambio abrupto en la distribución de profundidades simuladas de la imagen, aparte de otras degradaciones como la denominada “violación de ventana”. A través de nuevas pruebas subjetivas centradas en analizar estos efectos con diferentes distribuciones de profundidades, se trataron de concretar los parámetros que definían esta imagen. Los resultados de las pruebas demuestran que los cambios abruptos en imágenes se producen en entornos con movimientos y disparidades negativas elevadas que producen interferencias en los procesos de acomodación y vergencia del ojo humano, así como una necesidad en el aumento de los tiempos de enfoque del cristalino. En la mejora de las métricas de calidad a través de modelos que se adaptan al sistema visual humano, se realizaron también pruebas subjetivas que ayudaron a determinar la importancia de cada uno de los factores a la hora de enmascarar una determinada degradación. Los resultados demuestran una ligera mejora en los resultados obtenidos al aplicar máscaras de ponderación y atención visual, los cuales aproximan los parámetros de calidad objetiva a la respuesta del ojo humano. ABSTRACT Video quality assessment is still a necessary tool for defining the criteria to characterize a signal with the viewing requirements imposed by the final user. New technologies, such as 3D stereoscopic video and formats of HD and beyond HD oblige to develop new analysis of video features for obtaining the highest user’s satisfaction. Among the problems detected during the process of this doctoral thesis, it has been determined that some phenomena affect to different phases in the audiovisual production chain, apart from the type of content. On first instance, the generation of contents process should be enough controlled through parameters that avoid the occurrence of visual discomfort in observer’s eye, and consequently, visual fatigue. It is especially necessary controlling sequences of stereoscopic 3D, with both animation and live-action contents. On the other hand, video quality assessment, related to compression processes, should be improved because some objective metrics are adapted to user’s perception. The use of psychovisual models and visual attention diagrams allow the weighting of image regions of interest, giving more importance to the areas which the user will focus most probably. These two work fields are related together through the definition of the term saliency. Saliency is the capacity of human visual system for characterizing an image, highlighting the areas which result more attractive to the human eye. Saliency in generation of 3DTV contents refers mainly to the simulated depth of the optic illusion, i.e. the distance from the virtual object to the human eye. On the other hand, saliency is not based on virtual depth, but on other features, such as motion, level of detail, position of pixels in the frame or face detection, which are the basic features that are part of the developed visual attention model, as demonstrated with tests. Extensive literature involving visual comfort assessment was looked up, and the development of new preliminary subjective assessment with users was performed, in order to detect the features that increase the probability of discomfort to occur. With this methodology, the conclusions drawn confirmed that one common source of visual discomfort was when an abrupt change of disparity happened in video transitions, apart from other degradations, such as window violation. New quality assessment was performed to quantify the distribution of disparities over different sequences. The results confirmed that abrupt changes in negative parallax environment produce accommodation-vergence mismatches derived from the increasing time for human crystalline to focus the virtual objects. On the other side, for developing metrics that adapt to human visual system, additional subjective tests were developed to determine the importance of each factor, which masks a concrete distortion. Results demonstrated slight improvement after applying visual attention to objective metrics. This process of weighing pixels approximates the quality results to human eye’s response.

Human-computer interaction based on visual hand-gesture recognition using volumetric spatiograms of local binary patterns

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A more natural, intuitive, user-friendly, and less intrusive Human–Computer interface for controlling an application by executing hand gestures is presented. For this purpose, a robust vision-based hand-gesture recognition system has been developed, and a new database has been created to test it. The system is divided into three stages: detection, tracking, and recognition. The detection stage searches in every frame of a video sequence potential hand poses using a binary Support Vector Machine classifier and Local Binary Patterns as feature vectors. These detections are employed as input of a tracker to generate a spatio-temporal trajectory of hand poses. Finally, the recognition stage segments a spatio-temporal volume of data using the obtained trajectories, and compute a video descriptor called Volumetric Spatiograms of Local Binary Patterns (VS-LBP), which is delivered to a bank of SVM classifiers to perform the gesture recognition. The VS-LBP is a novel video descriptor that constitutes one of the most important contributions of the paper, which is able to provide much richer spatio-temporal information than other existing approaches in the state of the art with a manageable computational cost. Excellent results have been obtained outperforming other approaches of the state of the art.

Development of a hand-gesture recognition system for human-computer interaction

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The aim of this Master Thesis is the analysis, design and development of a robust and reliable Human-Computer Interaction interface, based on visual hand-gesture recognition. The implementation of the required functions is oriented to the simulation of a classical hardware interaction device: the mouse, by recognizing a specific hand-gesture vocabulary in color video sequences. For this purpose, a prototype of a hand-gesture recognition system has been designed and implemented, which is composed of three stages: detection, tracking and recognition. This system is based on machine learning methods and pattern recognition techniques, which have been integrated together with other image processing approaches to get a high recognition accuracy and a low computational cost. Regarding pattern recongition techniques, several algorithms and strategies have been designed and implemented, which are applicable to color images and video sequences. The design of these algorithms has the purpose of extracting spatial and spatio-temporal features from static and dynamic hand gestures, in order to identify them in a robust and reliable way. Finally, a visual database containing the necessary vocabulary of gestures for interacting with the computer has been created.

«
1
2
3
»