36 resultados para SLAM RGB-D SlamDunk Android 3D mobile
Resumo:
An innovative background modeling technique that is able to accurately segment foreground regions in RGB-D imagery (RGB plus depth) has been presented in this paper. The technique is based on a Bayesian framework that efficiently fuses different sources of information to segment the foreground. In particular, the final segmentation is obtained by considering a prediction of the foreground regions, carried out by a novel Bayesian Network with a depth-based dynamic model, and, by considering two independent depth and color-based mixture of Gaussians background models. The efficient Bayesian combination of all these data reduces the noise and uncertainties introduced by the color and depth features and the corresponding models. As a result, more compact segmentations, and refined foreground object silhouettes are obtained. Experimental results with different databases suggest that the proposed technique outperforms existing state-of-the-art algorithms.
Resumo:
En esta tesis se presenta un análisis en profundidad de cómo se deben utilizar dos tipos de métodos directos, Lucas-Kanade e Inverse Compositional, en imágenes RGB-D y se analiza la capacidad y precisión de los mismos en una serie de experimentos sintéticos. Estos simulan imágenes RGB, imágenes de profundidad (D) e imágenes RGB-D para comprobar cómo se comportan en cada una de las combinaciones. Además, se analizan estos métodos sin ninguna técnica adicional que modifique el algoritmo original ni que lo apoye en su tarea de optimización tal y como sucede en la mayoría de los artículos encontrados en la literatura. Esto se hace con el fin de poder entender cuándo y por qué los métodos convergen o divergen para que así en el futuro cualquier interesado pueda aplicar los conocimientos adquiridos en esta tesis de forma práctica. Esta tesis debería ayudar al futuro interesado a decidir qué algoritmo conviene más en una determinada situación y debería también ayudarle a entender qué problemas le pueden dar estos algoritmos para poder poner el remedio más apropiado. Las técnicas adicionales que sirven de remedio para estos problemas quedan fuera de los contenidos que abarca esta tesis, sin embargo, sí se hace una revisión sobre ellas.---ABSTRACT---This thesis presents an in-depth analysis about how direct methods such as Lucas- Kanade and Inverse Compositional can be applied in RGB-D images. The capability and accuracy of these methods is also analyzed employing a series of synthetic experiments. These simulate the efects produced by RGB images, depth images and RGB-D images so that diferent combinations can be evaluated. Moreover, these methods are analyzed without using any additional technique that modifies the original algorithm or that aids the algorithm in its search for a global optima unlike most of the articles found in the literature. Our goal is to understand when and why do these methods converge or diverge so that in the future, the knowledge extracted from the results presented here can efectively help a potential implementer. After reading this thesis, the implementer should be able to decide which algorithm fits best for a particular task and should also know which are the problems that have to be addressed in each algorithm so that an appropriate correction is implemented using additional techniques. These additional techniques are outside the scope of this thesis, however, they are reviewed from the literature.
Resumo:
Low cost RGB-D cameras such as the Microsoft’s Kinect or the Asus’s Xtion Pro are completely changing the computer vision world, as they are being successfully used in several applications and research areas. Depth data are particularly attractive and suitable for applications based on moving objects detection through foreground/background segmentation approaches; the RGB-D applications proposed in literature employ, in general, state of the art foreground/background segmentation techniques based on the depth information without taking into account the color information. The novel approach that we propose is based on a combination of classifiers that allows improving background subtraction accuracy with respect to state of the art algorithms by jointly considering color and depth data. In particular, the combination of classifiers is based on a weighted average that allows to adaptively modifying the support of each classifier in the ensemble by considering foreground detections in the previous frames and the depth and color edges. In this way, it is possible to reduce false detections due to critical issues that can not be tackled by the individual classifiers such as: shadows and illumination changes, color and depth camouflage, moved background objects and noisy depth measurements. Moreover, we propose, for the best of the author’s knowledge, the first publicly available RGB-D benchmark dataset with hand-labeled ground truth of several challenging scenarios to test background/foreground segmentation algorithms.
Resumo:
Esta tesis se centra en desarrollo de tecnologías para la interacción hombre-robot en entornos nucleares de fusión. La problemática principal del sector de fusión nuclear radica en las condiciones ambientales tan extremas que hay en el interior del reactor, y la necesidad de que los equipos cumplan requisitos muy restrictivos para poder aguantar esos niveles de radiación, magnetismo, ultravacío, temperatura... Como no es viable la ejecución de tareas directamente por parte de humanos, habrá que utilizar dispositivos de manipulación remota para llevar a cabo los procesos de operación y mantenimiento. En las instalaciones de ITER es obligatorio tener un entorno controlado de extrema seguridad, que necesita de estándares validados. La definición y uso de protocolos es indispensable para regir su buen funcionamiento. Si nos centramos en la telemanipulación con algo grado de escalado, surge la necesidad de definir protocolos para sistemas abiertos que permitan la interacción entre equipos y dispositivos de diversa índole. En este contexto se plantea la definición del Protocolo de Teleoperación que permita la interconexión entre dispositivos maestros y esclavos de distinta tipología, pudiéndose comunicar bilateralmente entre sí y utilizar distintos algoritmos de control según la tarea a desempeñar. Este protocolo y su interconectividad se han puesto a prueba en la Plataforma Abierta de Teleoperación (P.A.T.) que se ha desarrollado e integrado en la ETSII UPM como una herramienta que permita probar, validar y realizar experimentos de telerrobótica. Actualmente, este Protocolo de Teleoperación se ha propuesto a través de AENOR al grupo ISO de Telerobotics como una solución válida al problema existente y se encuentra bajo revisión. Con el diseño de dicho protocolo se ha conseguido enlazar maestro y esclavo, sin embargo con los niveles de radiación tan altos que hay en ITER la electrónica del controlador no puede entrar dentro del tokamak. Por ello se propone que a través de una mínima electrónica convenientemente protegida se puedan multiplexar las señales de control que van a través del cableado umbilical desde el controlador hasta la base del robot. En este ejercicio teórico se demuestra la utilidad y viabilidad de utilizar este tipo de solución para reducir el volumen y peso del cableado umbilical en cifras aproximadas de un 90%, para ello hay que desarrollar una electrónica específica y con certificación RadHard para soportar los enormes niveles de radiación de ITER. Para este manipulador de tipo genérico y con ayuda de la Plataforma Abierta de Teleoperación, se ha desarrollado un algoritmo que mediante un sensor de fuerza/par y una IMU colocados en la muñeca del robot, y convenientemente protegidos ante la radiación, permiten calcular las fuerzas e inercias que produce la carga, esto es necesario para poder transmitirle al operador unas fuerzas escaladas, y que pueda sentir la carga que manipula, y no otras fuerzas que puedan influir en el esclavo remoto, como ocurre con otras técnicas de estimación de fuerzas. Como el blindaje de los sensores no debe ser grande ni pesado, habrá que destinar este tipo de tecnología a las tareas de mantenimiento de las paradas programadas de ITER, que es cuando los niveles de radiación están en sus valores mínimos. Por otro lado para que el operador sienta lo más fielmente posible la fuerza de carga se ha desarrollado una electrónica que mediante el control en corriente de los motores permita realizar un control en fuerza a partir de la caracterización de los motores del maestro. Además para aumentar la percepción del operador se han realizado unos experimentos que demuestran que al aplicar estímulos multimodales (visuales, auditivos y hápticos) aumenta su inmersión y el rendimiento en la consecución de la tarea puesto que influyen directamente en su capacidad de respuesta. Finalmente, y en referencia a la realimentación visual del operador, en ITER se trabaja con cámaras situadas en localizaciones estratégicas, si bien el humano cuando manipula objetos hace uso de su visión binocular cambiando constantemente el punto de vista adecuándose a las necesidades visuales de cada momento durante el desarrollo de la tarea. Por ello, se ha realizado una reconstrucción tridimensional del espacio de la tarea a partir de una cámara-sensor RGB-D, lo cual nos permite obtener un punto de vista binocular virtual móvil a partir de una cámara situada en un punto fijo que se puede proyectar en un dispositivo de visualización 3D para que el operador pueda variar el punto de vista estereoscópico según sus preferencias. La correcta integración de estas tecnologías para la interacción hombre-robot en la P.A.T. ha permitido validar mediante pruebas y experimentos para verificar su utilidad en la aplicación práctica de la telemanipulación con alto grado de escalado en entornos nucleares de fusión. Abstract This thesis focuses on developing technologies for human-robot interaction in nuclear fusion environments. The main problem of nuclear fusion sector resides in such extreme environmental conditions existing in the hot-cell, leading to very restrictive requirements for equipment in order to deal with these high levels of radiation, magnetism, ultravacuum, temperature... Since it is not feasible to carry out tasks directly by humans, we must use remote handling devices for accomplishing operation and maintenance processes. In ITER facilities it is mandatory to have a controlled environment of extreme safety and security with validated standards. The definition and use of protocols is essential to govern its operation. Focusing on Remote Handling with some degree of escalation, protocols must be defined for open systems to allow interaction among different kind of equipment and several multifunctional devices. In this context, a Teleoperation Protocol definition enables interconnection between master and slave devices from different typologies, being able to communicate bilaterally one each other and using different control algorithms depending on the task to perform. This protocol and its interconnectivity have been tested in the Teleoperation Open Platform (T.O.P.) that has been developed and integrated in the ETSII UPM as a tool to test, validate and conduct experiments in Telerobotics. Currently, this protocol has been proposed for Teleoperation through AENOR to the ISO Telerobotics group as a valid solution to the existing problem, and it is under review. Master and slave connection has been achieved with this protocol design, however with such high radiation levels in ITER, the controller electronics cannot enter inside the tokamak. Therefore it is proposed a multiplexed electronic board, that through suitable and RadHard protection processes, to transmit control signals through an umbilical cable from the controller to the robot base. In this theoretical exercise the utility and feasibility of using this type of solution reduce the volume and weight of the umbilical wiring approximate 90% less, although it is necessary to develop specific electronic hardware and validate in RadHard qualifications in order to handle huge levels of ITER radiation. Using generic manipulators does not allow to implement regular sensors for force feedback in ITER conditions. In this line of research, an algorithm to calculate the forces and inertia produced by the load has been developed using a force/torque sensor and IMU, both conveniently protected against radiation and placed on the robot wrist. Scaled forces should be transmitted to the operator, feeling load forces but not other undesirable forces in slave system as those resulting from other force estimation techniques. Since shielding of the sensors should not be large and heavy, it will be necessary to allocate this type of technology for programmed maintenance periods of ITER, when radiation levels are at their lowest levels. Moreover, the operator perception needs to feel load forces as accurate as possible, so some current control electronics were developed to perform a force control of master joint motors going through a correct motor characterization. In addition to increase the perception of the operator, some experiments were conducted to demonstrate applying multimodal stimuli (visual, auditory and haptic) increases immersion and performance in achieving the task since it is directly correlated with response time. Finally, referring to the visual feedback to the operator in ITER, it is usual to work with 2D cameras in strategic locations, while humans use binocular vision in direct object manipulation, constantly changing the point of view adapting it to the visual needs for performing manipulation during task procedures. In this line a three-dimensional reconstruction of non-structured scenarios has been developed using RGB-D sensor instead of cameras in the remote environment. Thus a mobile virtual binocular point of view could be generated from a camera at a fixed point, projecting stereoscopic images in 3D display device according to operator preferences. The successful integration of these technologies for human-robot interaction in the T.O.P., and validating them through tests and experiments, verify its usefulness in practical application of high scaling remote handling at nuclear fusion environments.
Resumo:
Este trabajo esta orientado a resolver el problema de la caracterización de la copa de arboles frutales para la aplicacion localizada de fitosanitarios. Esta propuesta utiliza un mapa de profundidad (Depth image) y una imagen RGB combinadas (RGB-D), proporcionados por el sensor Kinect de Microsoft, para aplicar pesticidas de forma localizada. A través del mapa de profundidad se puede estimar la densidad de la copa y a partir de esta información determinar qué boquillas se deben abrir en cada momento. Se desarrollaron algoritmos implementados en Matlab que permiten además de la adquisición de las imágenes RGB-D, aplicar plaguicidas sólo a hojas y/o frutos según se desee. Estos algoritmos fueron implementados en un software que se comunica con el entorno de desarrollo "Kinect Windows SDK", encargado de extraer las imágenes desde el sensor Kinect. Por otra parte, para identificar hojas, se implementaron algoritmos de clasificación e identificación. Los algoritmos de clasificación utilizados fueron "Fuzzy C-Means con Gustafson Kessel" (FCM-GK) y "K-Means". Los centroides o prototipos de cada clase generados por FCM-GK fueron usados como semilla para K-Means, para acelerar la convergencia del algoritmo y mantener la coherencia temporal en los grupos generados por K-Means. Los algoritmos de clasificación fueron aplicados sobre las imágenes transformadas al espacio de color L*a*b*; específicamente se emplearon los canales a*, b* (canales cromáticos) con el fin de reducir el efecto de la luz sobre los colores. Los algoritmos de clasificación fueron configurados para buscar cuatro grupos: hojas, porosidad, frutas y tronco. Una vez que el clasificador genera los prototipos de los grupos, un clasificador denominado Máquina de Soporte Vectorial, que utiliza como núcleo una función Gaussiana base radial, identifica la clase de interés (hojas). La combinación de estos algoritmos ha mostrado bajos errores de clasificación, rendimiento del 4% de error en la identificación de hojas. Además, estos algoritmos de procesamiento de hasta 8.4 imágenes por segundo, lo que permite su aplicación en tiempo real. Los resultados demuestran la viabilidad de utilizar el sensor "Kinect" para determinar dónde y cuándo aplicar pesticidas. Por otra parte, también muestran que existen limitaciones en su uso, impuesta por las condiciones de luz. En otras palabras, es posible usar "Kinect" en exteriores, pero durante días nublados, temprano en la mañana o en la noche con iluminación artificial, o añadiendo un parasol en condiciones de luz intensa.
Resumo:
In the recent years, the computer vision community has shown great interest on depth-based applications thanks to the performance and flexibility of the new generation of RGB-D imagery. In this paper, we present an efficient background subtraction algorithm based on the fusion of multiple region-based classifiers that processes depth and color data provided by RGB-D cameras. Foreground objects are detected by combining a region-based foreground prediction (based on depth data) with different background models (based on a Mixture of Gaussian algorithm) providing color and depth descriptions of the scene at pixel and region level. The information given by these modules is fused in a mixture of experts fashion to improve the foreground detection accuracy. The main contributions of the paper are the region-based models of both background and foreground, built from the depth and color data. The obtained results using different database sequences demonstrate that the proposed approach leads to a higher detection accuracy with respect to existing state-of-the-art techniques.
Resumo:
The Web of Data currently comprises ? 62 billion triples from more than 2,000 different datasets covering many fields of knowledge3. This volume of structured Linked Data can be seen as a particular case of Big Data, referred to as Big Semantic Data [4]. Obviously, powerful computational configurations are tradi- tionally required to deal with the scalability problems arising to Big Semantic Data. It is not surprising that this ?data revolution? has competed in parallel with the growth of mobile computing. Smartphones and tablets are massively used at the expense of traditional computers but, to date, mobile devices have more limited computation resources. Therefore, one question that we may ask ourselves would be: can (potentially large) semantic datasets be consumed natively on mobile devices? Currently, only a few mobile apps (e.g., [1, 9, 2, 8]) make use of semantic data that they store in the mobile devices, while many others access existing SPARQL endpoints or Linked Data directly. Two main reasons can be considered for this fact. On the one hand, in spite of some initial approaches [6, 3], there are no well-established triplestores for mobile devices. This is an important limitation because any po- tential app must assume both RDF storage and SPARQL resolution. On the other hand, the particular features of these devices (little storage space, less computational power or more limited bandwidths) limit the adoption of seman- tic data for different uses and purposes. This paper introduces our HDTourist mobile application prototype. It con- sumes urban data from DBpedia4 to help tourists visiting a foreign city. Although it is a simple app, its functionality allows illustrating how semantic data can be stored and queried with limited resources. Our prototype is implemented for An- droid, but its foundations, explained in Section 2, can be deployed in any other platform. The app is described in Section 3, and Section 4 concludes about our current achievements and devises the future work.
Resumo:
Exploiting the full potential of telemedical systems means using platform based solutions: data are recovered from biomedical sensors, hospital information systems, care-givers, as well as patients themselves, and are processed and redistributed in an either centralized or, more probably, decentralized way. The integration of all these different devices, and interfaces, as well as the automated analysis and representation of all the pieces of information are current key challenges in telemedicine. Mobile phone technology has just begun to offer great opportunities of using this diverse information for guiding, warning, and educating patients, thus increasing their autonomy and adherence to their prescriptions. However, most of these existing mobile solutions are not based on platform systems and therefore represent limited, isolated applications. This article depicts how telemedical systems, based on integrated health data platforms, can maximize prescription adherence in chronic patients through mobile feedback. The application described here has been developed in an EU-funded R&D project called METABO, dedicated to patients with type 1 or type 2 Diabetes Mellitus
Resumo:
Nowadays, a wide offer of mobile augmented reality (mAR) applications is available at the market, and the user base of mobile AR-capable devices -smartphones- is rapidly increasing. Nevertheless, likewise to what happens in other mobile segments, business models to put mAR in value are not clearly defined yet. In this paper, we focus on sketching the big picture of the commercial offer of mAR applications, in order to inspire a posterior analysis of business models that may successfully support the evolution of mAR. We have gathered more than 400 mAR applications from Android Market, and analyzed the offer as a whole, taking into account some technology aspects, pricing schemes and user adoption factors. Results show, for example, that application providers are not expecting to generate revenues per direct download, although they are producing high-quality applications, well rated by the users.
Resumo:
La mayor parte de los entornos diseñados por el hombre presentan características geométricas específicas. En ellos es frecuente encontrar formas poligonales, rectangulares, circulares . . . con una serie de relaciones típicas entre distintos elementos del entorno. Introducir este tipo de conocimiento en el proceso de construcción de mapas de un robot móvil puede mejorar notablemente la calidad y la precisión de los mapas resultantes. También puede hacerlos más útiles de cara a un razonamiento de más alto nivel. Cuando la construcción de mapas se formula en un marco probabilístico Bayesiano, una especificación completa del problema requiere considerar cierta información a priori sobre el tipo de entorno. El conocimiento previo puede aplicarse de varias maneras, en esta tesis se presentan dos marcos diferentes: uno basado en el uso de primitivas geométricas y otro que emplea un método de representación cercano al espacio de las medidas brutas. Un enfoque basado en características geométricas supone implícitamente imponer un cierto modelo a priori para el entorno. En este sentido, el desarrollo de una solución al problema SLAM mediante la optimización de un grafo de características geométricas constituye un primer paso hacia nuevos métodos de construcción de mapas en entornos estructurados. En el primero de los dos marcos propuestos, el sistema deduce la información a priori a aplicar en cada caso en base a una extensa colección de posibles modelos geométricos genéricos, siguiendo un método de Maximización de la Esperanza para hallar la estructura y el mapa más probables. La representación de la estructura del entorno se basa en un enfoque jerárquico, con diferentes niveles de abstracción para los distintos elementos geométricos que puedan describirlo. Se llevaron a cabo diversos experimentos para mostrar la versatilidad y el buen funcionamiento del método propuesto. En el segundo marco, el usuario puede definir diferentes modelos de estructura para el entorno mediante grupos de restricciones y energías locales entre puntos vecinos de un conjunto de datos del mismo. El grupo de restricciones que se aplica a cada grupo de puntos depende de la topología, que es inferida por el propio sistema. De este modo, se pueden incorporar nuevos modelos genéricos de estructura para el entorno con gran flexibilidad y facilidad. Se realizaron distintos experimentos para demostrar la flexibilidad y los buenos resultados del enfoque propuesto. Abstract Most human designed environments present specific geometrical characteristics. In them, it is easy to find polygonal, rectangular and circular shapes, with a series of typical relations between different elements of the environment. Introducing this kind of knowledge in the mapping process of mobile robots can notably improve the quality and accuracy of the resulting maps. It can also make them more suitable for higher level reasoning applications. When mapping is formulated in a Bayesian probabilistic framework, a complete specification of the problem requires considering a prior for the environment. The prior over the structure of the environment can be applied in several ways; this dissertation presents two different frameworks, one using a feature based approach and another one employing a dense representation close to the measurements space. A feature based approach implicitly imposes a prior for the environment. In this sense, feature based graph SLAM was a first step towards a new mapping solution for structured scenarios. In the first framework, the prior is inferred by the system from a wide collection of feature based priors, following an Expectation-Maximization approach to obtain the most probable structure and the most probable map. The representation of the structure of the environment is based on a hierarchical model with different levels of abstraction for the geometrical elements describing it. Various experiments were conducted to show the versatility and the good performance of the proposed method. In the second framework, different priors can be defined by the user as sets of local constraints and energies for consecutive points in a range scan from a given environment. The set of constraints applied to each group of points depends on the topology, which is inferred by the system. This way, flexible and generic priors can be incorporated very easily. Several tests were carried out to demonstrate the flexibility and the good results of the proposed approach.
Resumo:
Les jeux sur mobile sont un exemple majeur à la fois d'une application réussie sur les mobiles et du nombre croissant de plates-formes pour les médias et les industries de loisirs. Explorant cette convergence, l'article analyse les caractéristiques principales du marché des jeux sur mobile et de son écosystème industriel, ses activités et acteurs principaux. L'article se concentre sur le rôle des différentes plates-formes de logiciels et sur les défis et opportunités futures pour les développeurs de jeux sur mobile dans un nouveau scénario dominé par les plates-formes de mobile.
Resumo:
This paper employs a 3D hp self-adaptive grid-refinement finite element strategy for the solution of a particular electromagnetic waveguide structure known as Magic-T. This structure is utilized as a power divider/combiner in communication systems as well as in other applications. It often incorporates dielectrics, metallic screws, round corners, and so on, which may facilitate its construction or improve its design, but significantly difficult its modeling when employing semi-analytical techniques. The hp-adaptive finite element method enables accurate modeling of a Magic-T structure even in the presence of these undesired materials/geometries. Numerical results demonstrate the suitability of the hp-adaptive method for modeling a Magic-T rectangular waveguide structure, delivering errors below 0.5% with a limited number of unknowns. Solutions of waveguide problems delivered by the self-adaptive hp-FEM are comparable to those obtained with semi-analytical techniques such as the Mode Matching method, for problems where the latest methods can be applied. At the same time, the hp-adaptive FEM enables accurate modeling of more complex waveguide structures.
Resumo:
This article proposes an innovative biometric technique based on the idea of authenticating a person on a mobile device by gesture recognition. To accomplish this aim, a user is prompted to be recognized by a gesture he/she performs moving his/her hand while holding a mobile device with an accelerometer embedded. As users are not able to repeat a gesture exactly in the air, an algorithm based on sequence alignment is developed to correct slight differences between repetitions of the same gesture. The robustness of this biometric technique has been studied within 2 different tests analyzing a database of 100 users with real falsifications. Equal Error Rates of 2.01 and 4.82% have been obtained in a zero-effort and an active impostor attack, respectively. A permanence evaluation is also presented from the analysis of the repetition of the gestures of 25 users in 10 sessions over a month. Furthermore, two different gesture databases have been developed: one made up of 100 genuine identifying 3-D hand gestures and 3 impostors trying to falsify each of them and another with 25 volunteers repeating their identifying 3- D hand gesture in 10 sessions over a month. These databases are the most extensive in published studies, to the best of our knowledge.
Resumo:
This paper presents a multiprotocol mobile application for building automation which supports and enables the integration of the most representative control technologies such as KNX, LonWorks and X-10. The application includes a real-time monitoring service. Finally, advanced control functionalities based on gestures recognition and predefined scenes have been implemented. This application has been developed and tested in the Energy Efficiency Research Facility located at CeDInt-UPM, where electrical loads, blinds and HVAC and lighting systems can be controlled.
Resumo:
Mobile activity recognition focuses on inferring the current activities of a mobile user by leveraging the sensory data that is available on today’s smart phones. The state of the art in mobile activity recognition uses traditional classification learning techniques. Thus, the learning process typically involves: i) collection of labelled sensory data that is transferred and collated in a centralised repository; ii) model building where the classification model is trained and tested using the collected data; iii) a model deployment stage where the learnt model is deployed on-board a mobile device for identifying activities based on new sensory data. In this paper, we demonstrate the Mobile Activity Recognition System (MARS) where for the first time the model is built and continuously updated on-board the mobile device itself using data stream mining. The advantages of the on-board approach are that it allows model personalisation and increased privacy as the data is not sent to any external site. Furthermore, when the user or its activity profile changes MARS enables promptly adaptation. MARS has been implemented on the Android platform to demonstrate that it can achieve accurate mobile activity recognition. Moreover, we can show in practise that MARS quickly adapts to user profile changes while at the same time being scalable and efficient in terms of consumption of the device resources.