920 resultados para decoupled image-based visual servoing
Resumo:
El principal objetivo de este trabajo es proporcionar una solución en tiempo real basada en visión estéreo o monocular precisa y robusta para que un vehículo aéreo no tripulado (UAV) sea autónomo en varios tipos de aplicaciones UAV, especialmente en entornos abarrotados sin señal GPS. Este trabajo principalmente consiste en tres temas de investigación de UAV basados en técnicas de visión por computador: (I) visual tracking, proporciona soluciones efectivas para localizar visualmente objetos de interés estáticos o en movimiento durante el tiempo que dura el vuelo del UAV mediante una aproximación adaptativa online y una estrategia de múltiple resolución, de este modo superamos los problemas generados por las diferentes situaciones desafiantes, tales como cambios significativos de aspecto, iluminación del entorno variante, fondo del tracking embarullado, oclusión parcial o total de objetos, variaciones rápidas de posición y vibraciones mecánicas a bordo. La solución ha sido utilizada en aterrizajes autónomos, inspección de plataformas mar adentro o tracking de aviones en pleno vuelo para su detección y evasión; (II) odometría visual: proporciona una solución eficiente al UAV para estimar la posición con 6 grados de libertad (6D) usando únicamente la entrada de una cámara estéreo a bordo del UAV. Un método Semi-Global Blocking Matching (SGBM) eficiente basado en una estrategia grueso-a-fino ha sido implementada para una rápida y profunda estimación del plano. Además, la solución toma provecho eficazmente de la información 2D y 3D para estimar la posición 6D, resolviendo de esta manera la limitación de un punto de referencia fijo en la cámara estéreo. Una robusta aproximación volumétrica de mapping basada en el framework Octomap ha sido utilizada para reconstruir entornos cerrados y al aire libre bastante abarrotados en 3D con memoria y errores correlacionados espacialmente o temporalmente; (III) visual control, ofrece soluciones de control prácticas para la navegación de un UAV usando Fuzzy Logic Controller (FLC) con la estimación visual. Y el framework de Cross-Entropy Optimization (CEO) ha sido usado para optimizar el factor de escala y la función de pertenencia en FLC. Todas las soluciones basadas en visión en este trabajo han sido probadas en test reales. Y los conjuntos de datos de imágenes reales grabados en estos test o disponibles para la comunidad pública han sido utilizados para evaluar el rendimiento de estas soluciones basadas en visión con ground truth. Además, las soluciones de visión presentadas han sido comparadas con algoritmos de visión del estado del arte. Los test reales y los resultados de evaluación muestran que las soluciones basadas en visión proporcionadas han obtenido rendimientos en tiempo real precisos y robustos, o han alcanzado un mejor rendimiento que aquellos algoritmos del estado del arte. La estimación basada en visión ha ganado un rol muy importante en controlar un UAV típico para alcanzar autonomía en aplicaciones UAV. ABSTRACT The main objective of this dissertation is providing real-time accurate robust monocular or stereo vision-based solution for Unmanned Aerial Vehicle (UAV) to achieve the autonomy in various types of UAV applications, especially in GPS-denied dynamic cluttered environments. This dissertation mainly consists of three UAV research topics based on computer vision technique: (I) visual tracking, it supplys effective solutions to visually locate interesting static or moving object over time during UAV flight with on-line adaptivity approach and multiple-resolution strategy, thereby overcoming the problems generated by the different challenging situations, such as significant appearance change, variant surrounding illumination, cluttered tracking background, partial or full object occlusion, rapid pose variation and onboard mechanical vibration. The solutions have been utilized in autonomous landing, offshore floating platform inspection and midair aircraft tracking for sense-and-avoid; (II) visual odometry: it provides the efficient solution for UAV to estimate the 6 Degree-of-freedom (6D) pose using only the input of stereo camera onboard UAV. An efficient Semi-Global Blocking Matching (SGBM) method based on a coarse-to-fine strategy has been implemented for fast depth map estimation. In addition, the solution effectively takes advantage of both 2D and 3D information to estimate the 6D pose, thereby solving the limitation of a fixed small baseline in the stereo camera. A robust volumetric occupancy mapping approach based on the Octomap framework has been utilized to reconstruct indoor and outdoor large-scale cluttered environments in 3D with less temporally or spatially correlated measurement errors and memory; (III) visual control, it offers practical control solutions to navigate UAV using Fuzzy Logic Controller (FLC) with the visual estimation. And the Cross-Entropy Optimization (CEO) framework has been used to optimize the scaling factor and the membership function in FLC. All the vision-based solutions in this dissertation have been tested in real tests. And the real image datasets recorded from these tests or available from public community have been utilized to evaluate the performance of these vision-based solutions with ground truth. Additionally, the presented vision solutions have compared with the state-of-art visual algorithms. Real tests and evaluation results show that the provided vision-based solutions have obtained real-time accurate robust performances, or gained better performance than those state-of-art visual algorithms. The vision-based estimation has played a critically important role for controlling a typical UAV to achieve autonomy in the UAV application.
Resumo:
La medida de calidad de vídeo sigue siendo necesaria para definir los criterios que caracterizan una señal que cumpla los requisitos de visionado impuestos por el usuario. Las nuevas tecnologías, como el vídeo 3D estereoscópico o formatos más allá de la alta definición, imponen nuevos criterios que deben ser analizadas para obtener la mayor satisfacción posible del usuario. Entre los problemas detectados durante el desarrollo de esta tesis doctoral se han determinado fenómenos que afectan a distintas fases de la cadena de producción audiovisual y tipo de contenido variado. En primer lugar, el proceso de generación de contenidos debe encontrarse controlado mediante parámetros que eviten que se produzca el disconfort visual y, consecuentemente, fatiga visual, especialmente en lo relativo a contenidos de 3D estereoscópico, tanto de animación como de acción real. Por otro lado, la medida de calidad relativa a la fase de compresión de vídeo emplea métricas que en ocasiones no se encuentran adaptadas a la percepción del usuario. El empleo de modelos psicovisuales y diagramas de atención visual permitirían ponderar las áreas de la imagen de manera que se preste mayor importancia a los píxeles que el usuario enfocará con mayor probabilidad. Estos dos bloques se relacionan a través de la definición del término saliencia. Saliencia es la capacidad del sistema visual para caracterizar una imagen visualizada ponderando las áreas que más atractivas resultan al ojo humano. La saliencia en generación de contenidos estereoscópicos se refiere principalmente a la profundidad simulada mediante la ilusión óptica, medida en términos de distancia del objeto virtual al ojo humano. Sin embargo, en vídeo bidimensional, la saliencia no se basa en la profundidad, sino en otros elementos adicionales, como el movimiento, el nivel de detalle, la posición de los píxeles o la aparición de caras, que serán los factores básicos que compondrán el modelo de atención visual desarrollado. Con el objetivo de detectar las características de una secuencia de vídeo estereoscópico que, con mayor probabilidad, pueden generar disconfort visual, se consultó la extensa literatura relativa a este tema y se realizaron unas pruebas subjetivas preliminares con usuarios. De esta forma, se llegó a la conclusión de que se producía disconfort en los casos en que se producía un cambio abrupto en la distribución de profundidades simuladas de la imagen, aparte de otras degradaciones como la denominada “violación de ventana”. A través de nuevas pruebas subjetivas centradas en analizar estos efectos con diferentes distribuciones de profundidades, se trataron de concretar los parámetros que definían esta imagen. Los resultados de las pruebas demuestran que los cambios abruptos en imágenes se producen en entornos con movimientos y disparidades negativas elevadas que producen interferencias en los procesos de acomodación y vergencia del ojo humano, así como una necesidad en el aumento de los tiempos de enfoque del cristalino. En la mejora de las métricas de calidad a través de modelos que se adaptan al sistema visual humano, se realizaron también pruebas subjetivas que ayudaron a determinar la importancia de cada uno de los factores a la hora de enmascarar una determinada degradación. Los resultados demuestran una ligera mejora en los resultados obtenidos al aplicar máscaras de ponderación y atención visual, los cuales aproximan los parámetros de calidad objetiva a la respuesta del ojo humano. ABSTRACT Video quality assessment is still a necessary tool for defining the criteria to characterize a signal with the viewing requirements imposed by the final user. New technologies, such as 3D stereoscopic video and formats of HD and beyond HD oblige to develop new analysis of video features for obtaining the highest user’s satisfaction. Among the problems detected during the process of this doctoral thesis, it has been determined that some phenomena affect to different phases in the audiovisual production chain, apart from the type of content. On first instance, the generation of contents process should be enough controlled through parameters that avoid the occurrence of visual discomfort in observer’s eye, and consequently, visual fatigue. It is especially necessary controlling sequences of stereoscopic 3D, with both animation and live-action contents. On the other hand, video quality assessment, related to compression processes, should be improved because some objective metrics are adapted to user’s perception. The use of psychovisual models and visual attention diagrams allow the weighting of image regions of interest, giving more importance to the areas which the user will focus most probably. These two work fields are related together through the definition of the term saliency. Saliency is the capacity of human visual system for characterizing an image, highlighting the areas which result more attractive to the human eye. Saliency in generation of 3DTV contents refers mainly to the simulated depth of the optic illusion, i.e. the distance from the virtual object to the human eye. On the other hand, saliency is not based on virtual depth, but on other features, such as motion, level of detail, position of pixels in the frame or face detection, which are the basic features that are part of the developed visual attention model, as demonstrated with tests. Extensive literature involving visual comfort assessment was looked up, and the development of new preliminary subjective assessment with users was performed, in order to detect the features that increase the probability of discomfort to occur. With this methodology, the conclusions drawn confirmed that one common source of visual discomfort was when an abrupt change of disparity happened in video transitions, apart from other degradations, such as window violation. New quality assessment was performed to quantify the distribution of disparities over different sequences. The results confirmed that abrupt changes in negative parallax environment produce accommodation-vergence mismatches derived from the increasing time for human crystalline to focus the virtual objects. On the other side, for developing metrics that adapt to human visual system, additional subjective tests were developed to determine the importance of each factor, which masks a concrete distortion. Results demonstrated slight improvement after applying visual attention to objective metrics. This process of weighing pixels approximates the quality results to human eye’s response.
Resumo:
La computación ubicua está extendiendo su aplicación desde entornos específicos hacia el uso cotidiano; el Internet de las cosas (IoT, en inglés) es el ejemplo más brillante de su aplicación y de la complejidad intrínseca que tiene, en comparación con el clásico desarrollo de aplicaciones. La principal característica que diferencia la computación ubicua de los otros tipos está en como se emplea la información de contexto. Las aplicaciones clásicas no usan en absoluto la información de contexto o usan sólo una pequeña parte de ella, integrándola de una forma ad hoc con una implementación específica para la aplicación. La motivación de este tratamiento particular se tiene que buscar en la dificultad de compartir el contexto con otras aplicaciones. En realidad lo que es información de contexto depende del tipo de aplicación: por poner un ejemplo, para un editor de imágenes, la imagen es la información y sus metadatos, tales como la hora de grabación o los ajustes de la cámara, son el contexto, mientras que para el sistema de ficheros la imagen junto con los ajustes de cámara son la información, y el contexto es representado por los metadatos externos al fichero como la fecha de modificación o la de último acceso. Esto significa que es difícil compartir la información de contexto, y la presencia de un middleware de comunicación que soporte el contexto de forma explícita simplifica el desarrollo de aplicaciones para computación ubicua. Al mismo tiempo el uso del contexto no tiene que ser obligatorio, porque si no se perdería la compatibilidad con las aplicaciones que no lo usan, convirtiendo así dicho middleware en un middleware de contexto. SilboPS, que es nuestra implementación de un sistema publicador/subscriptor basado en contenido e inspirado en SIENA [11, 9], resuelve dicho problema extendiendo el paradigma con dos elementos: el Contexto y la Función de Contexto. El contexto representa la información contextual propiamente dicha del mensaje por enviar o aquella requerida por el subscriptor para recibir notificaciones, mientras la función de contexto se evalúa usando el contexto del publicador y del subscriptor. Esto permite desacoplar la lógica de gestión del contexto de aquella de la función de contexto, incrementando de esta forma la flexibilidad de la comunicación entre varias aplicaciones. De hecho, al utilizar por defecto un contexto vacío, las aplicaciones clásicas y las que manejan el contexto pueden usar el mismo SilboPS, resolviendo de esta forma la incompatibilidad entre las dos categorías. En cualquier caso la posible incompatibilidad semántica sigue existiendo ya que depende de la interpretación que cada aplicación hace de los datos y no puede ser solucionada por una tercera parte agnóstica. El entorno IoT conlleva retos no sólo de contexto, sino también de escalabilidad. La cantidad de sensores, el volumen de datos que producen y la cantidad de aplicaciones que podrían estar interesadas en manipular esos datos está en continuo aumento. Hoy en día la respuesta a esa necesidad es la computación en la nube, pero requiere que las aplicaciones sean no sólo capaces de escalar, sino de hacerlo de forma elástica [22]. Desgraciadamente no hay ninguna primitiva de sistema distribuido de slicing que soporte un particionamiento del estado interno [33] junto con un cambio en caliente, además de que los sistemas cloud actuales como OpenStack u OpenNebula no ofrecen directamente una monitorización elástica. Esto implica que hay un problema bilateral: cómo puede una aplicación escalar de forma elástica y cómo monitorizar esa aplicación para saber cuándo escalarla horizontalmente. E-SilboPS es la versión elástica de SilboPS y se adapta perfectamente como solución para el problema de monitorización, gracias al paradigma publicador/subscriptor basado en contenido y, a diferencia de otras soluciones [5], permite escalar eficientemente, para cumplir con la carga de trabajo sin sobre-provisionar o sub-provisionar recursos. Además está basado en un algoritmo recientemente diseñado que muestra como añadir elasticidad a una aplicación con distintas restricciones sobre el estado: sin estado, estado aislado con coordinación externa y estado compartido con coordinación general. Su evaluación enseña como se pueden conseguir notables speedups, siendo el nivel de red el principal factor limitante: de hecho la eficiencia calculada (ver Figura 5.8) demuestra cómo se comporta cada configuración en comparación con las adyacentes. Esto permite conocer la tendencia actual de todo el sistema, para saber si la siguiente configuración compensará el coste que tiene con la ganancia que lleva en el throughput de notificaciones. Se tiene que prestar especial atención en la evaluación de los despliegues con igual coste, para ver cuál es la mejor solución en relación a una carga de trabajo dada. Como último análisis se ha estimado el overhead introducido por las distintas configuraciones a fin de identificar el principal factor limitante del throughput. Esto ayuda a determinar la parte secuencial y el overhead de base [26] en un despliegue óptimo en comparación con uno subóptimo. Efectivamente, según el tipo de carga de trabajo, la estimación puede ser tan baja como el 10 % para un óptimo local o tan alta como el 60 %: esto ocurre cuando se despliega una configuración sobredimensionada para la carga de trabajo. Esta estimación de la métrica de Karp-Flatt es importante para el sistema de gestión porque le permite conocer en que dirección (ampliar o reducir) es necesario cambiar el despliegue para mejorar sus prestaciones, en lugar que usar simplemente una política de ampliación. ABSTRACT The application of pervasive computing is extending from field-specific to everyday use. The Internet of Things (IoT) is the shiniest example of its application and of its intrinsic complexity compared with classical application development. The main characteristic that differentiates pervasive from other forms of computing lies in the use of contextual information. Some classical applications do not use any contextual information whatsoever. Others, on the other hand, use only part of the contextual information, which is integrated in an ad hoc fashion using an application-specific implementation. This information is handled in a one-off manner because of the difficulty of sharing context across applications. As a matter of fact, the application type determines what the contextual information is. For instance, for an imaging editor, the image is the information and its meta-data, like the time of the shot or camera settings, are the context, whereas, for a file-system application, the image, including its camera settings, is the information and the meta-data external to the file, like the modification date or the last accessed timestamps, constitute the context. This means that contextual information is hard to share. A communication middleware that supports context decidedly eases application development in pervasive computing. However, the use of context should not be mandatory; otherwise, the communication middleware would be reduced to a context middleware and no longer be compatible with non-context-aware applications. SilboPS, our implementation of content-based publish/subscribe inspired by SIENA [11, 9], solves this problem by adding two new elements to the paradigm: the context and the context function. Context represents the actual contextual information specific to the message to be sent or that needs to be notified to the subscriber, whereas the context function is evaluated using the publisher’s context and the subscriber’s context to decide whether the current message and context are useful for the subscriber. In this manner, context logic management is decoupled from context management, increasing the flexibility of communication and usage across different applications. Since the default context is empty, context-aware and classical applications can use the same SilboPS, resolving the syntactic mismatch that there is between the two categories. In any case, the possible semantic mismatch is still present because it depends on how each application interprets the data, and it cannot be resolved by an agnostic third party. The IoT environment introduces not only context but scaling challenges too. The number of sensors, the volume of the data that they produce and the number of applications that could be interested in harvesting such data are growing all the time. Today’s response to the above need is cloud computing. However, cloud computing applications need to be able to scale elastically [22]. Unfortunately there is no slicing, as distributed system primitives that support internal state partitioning [33] and hot swapping and current cloud systems like OpenStack or OpenNebula do not provide elastic monitoring out of the box. This means there is a two-sided problem: 1) how to scale an application elastically and 2) how to monitor the application and know when it should scale in or out. E-SilboPS is the elastic version of SilboPS. I t is the solution for the monitoring problem thanks to its content-based publish/subscribe nature and, unlike other solutions [5], it scales efficiently so as to meet workload demand without overprovisioning or underprovisioning. Additionally, it is based on a newly designed algorithm that shows how to add elasticity in an application with different state constraints: stateless, isolated stateful with external coordination and shared stateful with general coordination. Its evaluation shows that it is able to achieve remarkable speedups where the network layer is the main limiting factor: the calculated efficiency (see Figure 5.8) shows how each configuration performs with respect to adjacent configurations. This provides insight into the actual trending of the whole system in order to predict if the next configuration would offset its cost against the resulting gain in notification throughput. Particular attention has been paid to the evaluation of same-cost deployments in order to find out which one is the best for the given workload demand. Finally, the overhead introduced by the different configurations has been estimated to identify the primary limiting factor for throughput. This helps to determine the intrinsic sequential part and base overhead [26] of an optimal versus a suboptimal deployment. Depending on the type of workload, this can be as low as 10% in a local optimum or as high as 60% when an overprovisioned configuration is deployed for a given workload demand. This Karp-Flatt metric estimation is important for system management because it indicates the direction (scale in or out) in which the deployment has to be changed in order to improve its performance instead of simply using a scale-out policy.
Luz industrial e imagen tecnificada: de Moholy Nagy al C.A.V.S. (Center for Advanced Visual Studies)
Resumo:
El desarrollo de la tecnología de la luz implicará la transformación de la vida social, cultural y económica. Tanto las consideraciones espaciales del Movimiento Moderno, como los efectos producidos por la segunda Guerra Mundial, tendrán efectos visibles en las nuevas configuraciones espaciales y en la relación simbiótica y recíproca que se dará entre ideología y tecnología. La transformación en la comprensión de la articulación espacial, asociada al desarrollo tecnológico, afectará al modo en que este espacio es experimentado y percibido. El espacio expositivo y el espacio escénico se convertirán en laboratorio práctico donde desarrollar y hacer comprensible todo el potencial ilusorio de la luz, la proyección y la imagen, como parámetros modificadores y dinamizadores del espacio arquitectónico. Esta experimentación espacial estará precedida por la investigación y creación conceptual en el mundo plástico, donde los nuevos medios mecánicos serán responsables de la construcción de una nueva mirada moderna mediatizada por los elementos técnicos. La experimentación óptica, a través de la fotografía, el cine, o el movimiento de la luz y su percepción, vinculada a nuevos modos de representación y comunicación, se convertirá en elemento fundamental en la configuración espacial. Este ámbito de experimentación se hará patente en la Escuela de la Bauhaus, de la mano de Gropius, Schlemmer o Moholy Nagy entre otros; tanto en reflexiones teóricas como en el desarrollo de proyectos expositivos, arquitectónicos o teatrales, que evolucionarán en base a la tecnología y la modificación de la relación con el espectador. El espacio expositivo y el espacio escénico se tomarán como oportunidad de investigación espacial y de análisis de los modos de percepción, convirtiéndose en lugares de experimentación básicos para el aprendizaje. El teatro se postula como punto de encuentro entre el arte y la técnica, cobrando especial importancia la intersección con otras disciplinas en la definición espacial. Las múltiples innovaciones técnicas ligadas a los nuevos fundamentos teatrales en la modificación de la relación con la escena, que se producen a principios del siglo XX, tendrán como consecuencia la transformación del espacio en un espacio dinámico, tanto física como perceptivamente, que dará lugar a nuevas concepciones espaciales, muchas de ellas utópicas. La luz, la proyección y la creación de ilusión en base a estímulos visuales y sonoros, aparecen como elementos proyectuales efímeros e inmateriales, que tendrán una gran incidencia en el espacio y su modo de ser experimentado. La implicación de la tecnología en el arte conllevará modificaciones en la visualización, así como en la configuración espacial de los espacios destinados a esta. Destacaremos como propuesta el Teatro Total de Walter Gropius, en cuyo desarrollo se recogen de algún modo las experiencias espaciales y las investigaciones desarrolladas sobre la estructura formal de la percepción realizadas por Moholy Nagy, además de los conceptos acerca del espacio escénico desarrollados en el taller de Teatro de la Bauhaus por Oskar Schlemmer. En el Teatro Total, Gropius incorporará su propia visión de cuestiones que pertenecen a la tradición de la arquitectura teatral y las innovaciones conceptuales que estaban teniendo lugar desde finales del s.XIX, tales como la participación activa del público o la superación entre escena y auditorio, estableciendo en el proyecto una nueva relación perceptual entre sala, espectáculo y espectador; aumentando la sensación de inmersión, a través del uso de la física, la óptica, y la acústica, creando una energía concéntrica capaz de extenderse en todas direcciones. El Teatro Total será uno de los primeros ejemplos en los que desde el punto de partida del proyecto, se conjuga la imagen como elemento comunicativo con la configuración espacial. Las nuevas configuraciones escénicas tendrán como premisa de desarrollo la capacidad de transformación tanto perceptiva, como física. En la segunda mitad del s.XX, la creación de centros de investigación como el CAVS (The Center for Advanced Visual Studies,1967), o el EAT (Experiments in Art and Technology, 1966), favorecerán la colaboración interdisciplinar entre arte y ciencia, implicando a empresas de carácter tecnológico, como Siemens, HP, IBM o Philips, facilitando soporte técnico y económico para el desarrollo de nuevos sistemas. Esta colaboración interdisciplinar dará lugar a una serie de intervenciones espaciales que tendrán su mayor visibilidad en algunas Exposiciones Universales. El resultado será, en la mayoría de los casos, la creación de espacios de carácter inmersivo, donde se establecerá una relación simbiótica entre espacio, imagen, sonido, y espectador. La colocación del espectador en el centro de la escena y la disposición dinámica de imagen y sonido, crearán una particular narrativa espacial no lineal, concebida para la experiencia. Desde las primeras proyecciones de cine a la pantalla múltiple de los Eames, las técnicas espaciales de difusión del sonido en Stockhausen, o los experimentos con el movimiento físico interactivo, la imagen, la luz en movimiento y el sonido, quedan inevitablemente convertidos en material arquitectónico. ABSTRACT. Light technology development would lead to a social, cultural and economic transformation. Both spatial consideration of “Modern Movement” and Second World War effects on technology, would have a visible aftereffect on spatial configuration and on the symbiotic and mutual relationship between ideology & technology. Comprehension adjustment on the articulation of space together with technology development, would impact on how space is perceived and felt. Exhibition space and scenic space would turn into a laboratory where developing and making comprehensive all illusory potential of light, projection and image. These new parameters would modify and revitalize the architectonic space. as modifying and revitalizing parameters of architectonic space. Spatial experimentation would be preceded by conceptual creation and investigation on the sculptural field, where new mechanic media would be responsible for a fresh and modern look influenced by technical elements. Optical experimentation, through photography, cinema or light movement and its perception, would turn into essential components for spatial arrangement linked to new ways of performance and communication. This experimentation sphere would be clear at The Bauhaus School, by the hand of Gropius, Schlemmer or Moholy Nag among others; in theoretical, theatrical or architectural performance’s projects, that would evolve based on technology and also based on the transformation of the relationship with the observer. Exhibition and perfor-mance areas would be taken as opportunities of spatial investigation and for the analysis of the different ways of perception, thus becoming key places for learning. Theater is postulated as a meeting point between art and technique, taking on a new significance at its intersection with other disciplines working with spatial definition too. The multiple innovation techniques linked to the new foundations for the theater regarding stage relation, would have as a consequence the regeneration of the space. Space would turn dynamic, both physically and perceptibly, bringing innovative spatial conceptions, many of them unrealistic. Light, projection and illusory creation based on sound and visual stimulus would appear as intangible and momentary design components, which would have a great impact on the space and on the way it is experienced. Implication of technology in art would bring changes on the observer as well as on the spatial configuration of the art spaces2. It would stand out as a proposal Walter Groupis Total Theater, whose development would include somehow the spatial experiments and studies about formal structure of perception accomplished by Moholy Nagy besides the concepts regarding stage space enhanced at the Bauhaus Theater Studio by Oskar Schlemmer. Within Total Theater, Groupis would incorporate his own view about traditional theatric architecture and conceptual innovations that were taking place since the end of the nineteenth century, such as active audience participation or the diffusing limits between scene and audience, establishing a new perception relationship between auditorium, performance and audience, improving the feeling of immersion through the use of physics, optics and acoustics, creating a concentric energy capable of spreading in all directions. Total Theater would be one of the first example in which, from the beginning of the Project, image is combined as a communicating element with the spatial configuration. As a premise of development, new stage arrangement would have the capacity of transformation, both perceptive and physically. During the second half or the twentieth century, the creation of investigation centers such as CAVS (Center for Advanced Visual Studies, 1967) or EAT (Experiments in Art and Technology, 1966), would help to the interdisciplinary collaboration between art and science, involving technology companies like Siemens, HP, IBM or Philips, providing technical and economic support to the development of new systems. This interdisciplinary collaboration would give room to a series of spatial interventions which would have visibility in some Universal Exhibitions. The result would be, in most cases, the creation of immersive character spaces, where a symbiotic relationship would be stablished between space, image, sound and audience. The new location of the audience in the middle of the display, together with the dynamic arrangement of sound and image would create a particular, no lineal narrative conceived to be experienced. Since the first cinema projections, the multiple screen of Eames, the spatial techniques for sound dissemination at Stockhausen or the interactive physical movement experimentation, image, motion light and sound would turn inevitably into architectural material.
Resumo:
Objective: To estimate the magnitude of serious eye disorders and of visual impairment in a defined elderly population of a typical metropolitan area in England, and to assess the frequency they were in touch with, or known to, the eye care services.
Resumo:
Theories of image segmentation suggest that the human visual system may use two distinct processes to segregate figure from background: a local process that uses local feature contrasts to mark borders of coherent regions and a global process that groups similar features over a larger spatial scale. We performed psychophysical experiments to determine whether and to what extent the global similarity process contributes to image segmentation by motion and color. Our results show that for color, as well as for motion, segmentation occurs first by an integrative process on a coarse spatial scale, demonstrating that for both modalities the global process is faster than one based on local feature contrasts. Segmentation by motion builds up over time, whereas segmentation by color does not, indicating a fundamental difference between the modalities. Our data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals. This global segmentation process occurs faster than the detection of absolute motion, providing further evidence for the existence of two motion processes with distinct dynamic properties.
Resumo:
The visual responses of neurons in the cerebral cortex were first adequately characterized in the 1960s by D. H. Hubel and T. N. Wiesel [(1962) J. Physiol. (London) 160, 106-154; (1968) J. Physiol. (London) 195, 215-243] using qualitative analyses based on simple geometric visual targets. Over the past 30 years, it has become common to consider the properties of these neurons by attempting to make formal descriptions of these transformations they execute on the visual image. Most such models have their roots in linear-systems approaches pioneered in the retina by C. Enroth-Cugell and J. R. Robson [(1966) J. Physiol. (London) 187, 517-552], but it is clear that purely linear models of cortical neurons are inadequate. We present two related models: one designed to account for the responses of simple cells in primary visual cortex (V1) and one designed to account for the responses of pattern direction selective cells in MT (or V5), an extrastriate visual area thought to be involved in the analysis of visual motion. These models share a common structure that operates in the same way on different kinds of input, and instantiate the widely held view that computational strategies are similar throughout the cerebral cortex. Implementations of these models for Macintosh microcomputers are available and can be used to explore the models' properties.
Resumo:
Single photon emission with computed tomography (SPECT) hexamethylphenylethyleneamineoxime technetium-99 images were analyzed by an optimal interpolative neural network (OINN) algorithm to determine whether the network could discriminate among clinically diagnosed groups of elderly normal, Alzheimer disease (AD), and vascular dementia (VD) subjects. After initial image preprocessing and registration, image features were obtained that were representative of the mean regional tissue uptake. These features were extracted from a given image by averaging the intensities over various regions defined by suitable masks. After training, the network classified independent trials of patients whose clinical diagnoses conformed to published criteria for probable AD or probable/possible VD. For the SPECT data used in the current tests, the OINN agreement was 80 and 86% for probable AD and probable/possible VD, respectively. These results suggest that artificial neural network methods offer potential in diagnoses from brain images and possibly in other areas of scientific research where complex patterns of data may have scientifically meaningful groupings that are not easily identifiable by the researcher.
Resumo:
Falls are one of the greatest threats to elderly health in their daily living routines and activities. Therefore, it is very important to detect falls of an elderly in a timely and accurate manner, so that immediate response and proper care can be provided, by sending fall alarms to caregivers. Radar is an effective non-intrusive sensing modality which is well suited for this purpose, which can detect human motions in all types of environments, penetrate walls and fabrics, preserve privacy, and is insensitive to lighting conditions. Micro-Doppler features are utilized in radar signal corresponding to human body motions and gait to detect falls using a narrowband pulse-Doppler radar. Human motions cause time-varying Doppler signatures, which are analyzed using time-frequency representations and matching pursuit decomposition (MPD) for feature extraction and fall detection. The extracted features include MPD features and the principal components of the time-frequency signal representations. To analyze the sequential characteristics of typical falls, the extracted features are used for training and testing hidden Markov models (HMM) in different falling scenarios. Experimental results demonstrate that the proposed algorithm and method achieve fast and accurate fall detections. The risk of falls increases sharply when the elderly or patients try to exit beds. Thus, if a bed exit can be detected at an early stage of this motion, the related injuries can be prevented with a high probability. To detect bed exit for fall prevention, the trajectory of head movements is used for recognize such human motion. A head detector is trained using the histogram of oriented gradient (HOG) features of the head and shoulder areas from recorded bed exit images. A data association algorithm is applied on the head detection results to eliminate head detection false alarms. Then the three dimensional (3D) head trajectories are constructed by matching scale-invariant feature transform (SIFT) keypoints in the detected head areas from both the left and right stereo images. The extracted 3D head trajectories are used for training and testing an HMM based classifier for recognizing bed exit activities. The results of the classifier are presented and discussed in the thesis, which demonstrates the effectiveness of the proposed stereo vision based bed exit detection approach.
Resumo:
Póster presentado en SPIE Photonics Europe, Brussels, 16-19 April 2012.
Resumo:
Retinal image quality is commonly analyzed through parameters inherited from instrumental optics. These parameters are defined for ‘good optics’ so they are hard to translate into visual quality metrics. Instead of using point or artificial functions, we propose a quality index that takes into account properties of natural images. These images usually show strong local correlations that help to interpret the image. Our aim is to derive an objective index that quantifies the quality of vision by taking into account the local structure of the scene, instead of focusing on a particular aberration. As we show, this index highly correlates with visual acuity and allows inter-comparison of natural images around the retina. The usefulness of the index is proven through the analysis of real eyes before and after undergoing corneal surgery, which usually are hard to analyze with standard metrics.
Resumo:
New low cost sensors and open free libraries for 3D image processing are making important advances in robot vision applications possible, such as three-dimensional object recognition, semantic mapping, navigation and localization of robots, human detection and/or gesture recognition for human-machine interaction. In this paper, a novel method for recognizing and tracking the fingers of a human hand is presented. This method is based on point clouds from range images captured by a RGBD sensor. It works in real time and it does not require visual marks, camera calibration or previous knowledge of the environment. Moreover, it works successfully even when multiple objects appear in the scene or when the ambient light is changed. Furthermore, this method was designed to develop a human interface to control domestic or industrial devices, remotely. In this paper, the method was tested by operating a robotic hand. Firstly, the human hand was recognized and the fingers were detected. Secondly, the movement of the fingers was analysed and mapped to be imitated by a robotic hand.
Resumo:
Grady distinguishes two main types of metaphor in order to provide a solution in the controversies stemming from the conceptual theory of metaphor: correlation-based metaphors and resemblance metaphors. In “correlation-based metaphors”, the source domain is sensory-motor, while the target domain is not. On the contrary, “resemblance metaphors” are originated by a physical or conceptual perception which is common in both domains, by the association of concepts with common features. Primary metaphors are the minimal units of correlation-based metaphors; they are inherent in human nature and the result of the nature of our brain, our body and the world that we inhabit. We acquire them automatically and we cannot avoid them. Furthermore, as corporal experiences are universal, so are primary metaphors. In this paper, I will argue that primary metaphors manifest themselves visually through scene-setting techniques such as composition, framing, camera movement or lighting. Film-makers can use the different aspects of mise-en-scène metaphorically in order to express abstract notions like evil, importance, control, relationship or confusion. Such visual manifestations, as also occurs with their verbal equivalents, frequently go unnoticed or have been used so often that they have become clichés. But the important thing to bear in mind is that their origin lies in a primary metaphor and due to this origin these kinds of film-making strategies have been so expressively successful.
Resumo:
Mathematical morphology has been an area of intensive research over the last few years. Although many remarkable advances have been achieved throughout these years, there is still a great interest in accelerating morphological operations in order for them to be implemented in real-time systems. In this work, we present a new model for computing mathematical morphology operations, the so-called morphological trajectory model (MTM), in which a morphological filter will be divided into a sequence of basic operations. Then, a trajectory-based morphological operation (such as dilation, and erosion) is defined as the set of points resulting from the ordered application of the instant basic operations. The MTM approach allows working with different structuring elements, such as disks, and from the experiments, it can be extracted that our method is independent of the structuring element size and can be easily applied to industrial systems and high-resolution images.