875 results for Computer Vision for Robotics and Automation
Abstract:
Several studies have shown that people with disabilities benefit substantially from access to a means of independent mobility and assistive technology. Researchers are using technology originally developed for mobile robots to create easier-to-use wheelchairs. With this kind of technology, people with disabilities can gain a degree of independence in performing daily-life activities. In this work, a computer vision system is presented that is able to drive a wheelchair with a minimum number of finger commands. The user's hand is detected and segmented with the use of a Kinect camera, and fingertips are extracted from the depth information and used as wheelchair commands.
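The abstract gives no implementation details, but a minimal sketch of one common way to extract fingertips from a depth frame with OpenCV is shown below; the distance band, defect threshold and function names are illustrative assumptions, not the authors' actual pipeline.

```python
import cv2
import numpy as np

def fingertips_from_depth(depth_mm, near=400, far=800):
    """Segment the closest blob (assumed to be the hand) in a depth
    frame and return candidate fingertip pixels via convexity defects.
    depth_mm: HxW uint16 depth image in millimetres (e.g. from a Kinect).
    The near/far band and defect threshold are illustrative only."""
    # Keep only pixels within the expected hand distance band.
    mask = cv2.inRange(depth_mm, near, far)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)       # largest blob = hand

    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return []
    # Start points of deep convexity defects approximate fingertips.
    tips = []
    for s, e, f, depth in defects[:, 0]:
        if depth > 10000:                           # fixed-point depth, ~39 px
            tips.append(tuple(hand[s][0]))
    return tips
```

Mapping the number or position of detected fingertips to motion commands (e.g. one finger for "forward") would then be application-specific.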
Abstract:
Hand gesture recognition, being a natural way of human-computer interaction, is an area of active research in computer vision and machine learning. It is an area with many different possible applications, giving users a simpler and more natural way to communicate with robots and system interfaces, without the need for extra devices. The primary goal of gesture recognition research is thus to create systems that can identify specific human gestures and use them to convey information or to control devices. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. In this study we try to identify hand features that, in isolation, respond best in various human-computer interaction situations. The extracted features are used to train a set of classifiers with the help of RapidMiner in order to find the best learner. A dataset with our own gesture vocabulary, consisting of 10 gestures recorded from 20 users, was created for later processing. Experimental results show that the radial signature and the centroid distance are the features that, when used separately, obtain the best results, with accuracies of 91% and 90.1% respectively, obtained with a neural network classifier. These two methods also have the advantage of low computational complexity, which makes them good candidates for real-time hand gesture recognition.
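As an illustration of the winning feature, below is a minimal sketch of the centroid distance signature (distances from resampled contour points to the hand centroid); the resampling length and normalisation are my assumptions, not the paper's exact parameters.

```python
import cv2
import numpy as np

def centroid_distance_signature(contour, n_samples=128):
    """Centroid distance shape signature: distance from each (resampled)
    contour point to the contour centroid, normalised for scale
    invariance. `contour` is an OpenCV contour (N x 1 x 2 int array)."""
    pts = contour[:, 0, :].astype(np.float64)
    m = cv2.moments(contour)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]   # centroid

    # Resample the contour to a fixed number of points so every gesture
    # yields a feature vector of the same length.
    idx = np.linspace(0, len(pts) - 1, n_samples).astype(int)
    d = np.hypot(pts[idx, 0] - cx, pts[idx, 1] - cy)

    return d / d.max()            # scale invariance
```

The radial signature is computed in a similar spirit, sampling distances from the centroid along fixed angular directions; either fixed-length vector can be fed directly to a neural network or SVM.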
Abstract:
Vision-based hand gesture recognition is an area of active current research in computer vision and machine learning. Being a natural way of human interaction, it is an area in which many researchers are working, with the goal of making human-computer interaction (HCI) easier and more natural, without the need for any extra devices. The primary goal of gesture recognition research is thus to create systems that can identify specific human gestures and use them, for example, to convey information. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. Hand gestures are a powerful human communication modality with many potential applications, and in this context we have sign language recognition, the communication method of deaf people. Sign languages are not standard and universal, and their grammars differ from country to country. In this paper, a real-time system able to interpret Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real time, with an accuracy of 99.4% with one dataset of features and an accuracy of 99.6% with a second dataset of features. Although the implemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, providing a solid foundation for the development of any vision-based sign language recognition user interface.
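The abstract does not describe the run-time loop, but a hypothetical sketch of how per-frame classification might be made stable enough for real-time use is shown below: classify each frame and report the majority label over a sliding window. `extract_features`, the window size, and the 0.8 stability threshold are all assumptions for illustration.

```python
from collections import Counter, deque

import numpy as np

def classify_stream(frames, extract_features, clf, window=15):
    """Hypothetical real-time loop: classify each frame's hand features
    and report the majority label over a sliding window, which damps
    single-frame misclassifications. `clf` is any trained scikit-learn
    classifier; `extract_features` maps a frame to a 1-D feature vector."""
    recent = deque(maxlen=window)
    for frame in frames:
        feats = extract_features(frame)
        label = clf.predict(np.asarray(feats).reshape(1, -1))[0]
        recent.append(label)
        # Emit a decision once the window is full and stable enough.
        if len(recent) == window:
            top, count = Counter(recent).most_common(1)[0]
            if count >= window * 0.8:
                yield top
```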
Abstract:
Hand gestures are a powerful means of human communication, with many potential applications in the area of human-computer interaction. Vision-based hand gesture recognition techniques have many proven advantages compared with traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture based on computer vision and machine learning, able to be used with any interface for human-computer interaction. The proposed solution is composed mainly of three modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems can be the same for all applications, which facilitates implementation. In order to test the proposed solution, three prototypes were implemented. For hand posture recognition, an SVM model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM model was trained for each gesture the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic, with the trained models able to work in real time, allowing its application in a wide range of human-machine applications.
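For the dynamic gesture module, a common realisation of the one-HMM-per-gesture design is sketched below using the hmmlearn library (a library choice assumed here, not stated by the authors); recognition picks the model with the highest log-likelihood.

```python
import numpy as np
from hmmlearn import hmm   # assumed library choice, not the authors'

def train_gesture_models(sequences_by_gesture, n_states=5):
    """Train one Gaussian HMM per gesture. sequences_by_gesture maps a
    gesture name to a list of (T_i x D) observation sequences, e.g.
    hand-trajectory features extracted frame by frame."""
    models = {}
    for name, seqs in sequences_by_gesture.items():
        X = np.vstack(seqs)                 # concatenated observations
        lengths = [len(s) for s in seqs]    # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=100)
        m.fit(X, lengths)
        models[name] = m
    return models

def recognise(models, seq):
    """Classify a new sequence as the gesture whose HMM scores it highest."""
    return max(models, key=lambda name: models[name].score(seq))
```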
Abstract:
Master's dissertation in Industrial Electronics and Computers Engineering (area of specialization in Robotics)
Abstract:
Report for the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants in our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as many difficulties must be dealt with. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag-of-words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently used in the robot localization part of the COGNIRON project. This report is divided into two main sections. The first section briefly reviews the currently used object recognition system, the Lowe approach, and brings to light the drawbacks found for object recognition in the context of indoor mobile robot navigation; the proposed improvements to the algorithm are also described. In the second section the alternative bag-of-words method is reviewed, together with several experiments conducted to evaluate its performance on our own object databases. Furthermore, some modifications to the original algorithm to make it suitable for object detection in unsegmented images are proposed.
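As background, the bag-of-words idea reduces to clustering local descriptors into a visual vocabulary and describing each image by a histogram of word occurrences. Nistér and Stewénius actually use a hierarchical vocabulary tree for speed; the flat k-means sketch below only illustrates the principle.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(descriptor_sets, n_words=500):
    """Cluster local descriptors (e.g. SIFT, one (N_i x 128) array per
    training image) into a flat visual vocabulary of n_words clusters."""
    all_desc = np.vstack(descriptor_sets)
    return KMeans(n_clusters=n_words, n_init=10).fit(all_desc)

def bow_histogram(vocab, descriptors):
    """Represent one image as a normalised histogram of visual-word
    occurrences; images are then compared by histogram distance."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()
```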
Abstract:
The present study investigates the short- and long-term outcomes of a computer-assisted cognitive remediation (CACR) program in adolescents with psychosis or at high risk. 32 adolescents participated in a blinded 8-week randomized controlled trial of CACR treatment compared to computer games (CG). Clinical and neuropsychological evaluations were undertaken at baseline, at the end of the program and at 6 months. At the end of the program (n = 28), results indicated that visuospatial abilities (Repeatable Battery for the Assessment of Neuropsychological Status, RBANS; P = .005) improved significantly more in the CACR group than in the CG group. Furthermore, other cognitive functions (RBANS), psychotic symptoms (Positive and Negative Symptom Scale) and psychosocial functioning (Social and Occupational Functioning Assessment Scale) improved significantly, but at similar rates, in the two groups. At long term (n = 22), cognitive abilities did not demonstrate any amelioration in the control group while, in the CACR group, significant long-term improvements in inhibition (Stroop; P = .040) and reasoning (Block Design Test; P = .005) were observed. In addition, symptom severity (Clinical Global Improvement) decreased significantly in the control group (P = .046) and marginally in the CACR group (P = .088). To sum up, CACR can be successfully administered in this population. CACR proved to be effective over and above CG for the most intensively trained cognitive ability. Finally, in the long term, enhanced reasoning and inhibition abilities, which are necessary to execute higher-order goals or to adapt behavior to an ever-changing environment, were observed in adolescents benefiting from CACR.
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
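The building block of such a piecewise-planar representation is a least-squares plane fit to a cluster of reconstructed 3-D points; a minimal sketch (not the paper's implementation) follows.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a set of 3-D points (K x 3).
    Returns (centroid, unit normal); the best-fit normal is the right
    singular vector of the centred points with the smallest singular
    value."""
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)
    return c, vt[-1]

def plane_residual(points, c, n):
    """RMS point-to-plane distance; a large value means the patch is not
    well approximated as planar and should be subdivided."""
    return np.sqrt(np.mean(((points - c) @ n) ** 2))
```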
Abstract:
We present a computer vision system that combines omnidirectional vision with structured light, with the aim of obtaining depth information over a 360-degree field of view. The approach proposed in this article combines an omnidirectional camera with a panoramic laser projector. The article shows how the sensor is modelled, and its accuracy is demonstrated by means of experimental results. The proposed sensor provides useful information for robot navigation applications, pipe inspection, 3D scene modelling, etc.
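The paper's omnidirectional sensor model is more involved, but the underlying camera-plus-laser triangulation principle can be sketched for the simple planar pinhole case (all parameter names below are illustrative).

```python
import math

def triangulate(u, u0, f, baseline, alpha):
    """Camera/laser triangulation, planar pinhole version.
    u: image column of the detected laser stripe (px), u0: principal
    point (px), f: focal length (px), baseline: camera-to-projector
    distance (m), alpha: angle between the laser ray and the baseline
    (rad). Returns depth perpendicular to the baseline. The paper's
    sensor uses an omnidirectional (mirror) model; this only shows
    the principle."""
    # Angle of the camera ray with respect to the baseline.
    theta = math.pi / 2 - math.atan2(u - u0, f)
    # Law of sines in the camera-projector-point triangle.
    r = baseline * math.sin(alpha) / math.sin(alpha + theta)
    return r * math.sin(theta)
```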
Abstract:
This paper presents the use of a mobile robot platform as an innovative educational tool to promote and integrate different curriculum knowledge. To this end, the experience acquired within a summer course named "Applied Mobile Robotics" is presented. The main aim of the course is to integrate different subjects, such as electronics, programming, architecture, perception systems, communications, control and trajectory planning, by using the educational open mobile robot platform PRIM. The summer course is addressed to a wide range of student profiles; however, it is of special interest to students of electrical and computer engineering around their final academic year. The summer course consists of theoretical and laboratory sessions related to the following topics: design and programming of electronic devices, modelling and control systems, trajectory planning and control, and computer vision systems. The clues for achieving a renewed path of progress in robotics are therefore the integration of several fields of knowledge, such as computing, communications, and control sciences, in order to perform higher-level reasoning and use decision tools with a strong theoretical base.
Abstract:
In this paper we face the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it becomes parallel to a planar object. This problem has been treated for a long time in visual servoing. Our approach is based on attaching several laser pointers to the camera, with their configuration aimed at producing a suitable set of visual features. The aim of using structured light is not only to ease the image processing and to allow low-textured objects to be treated, but also to produce a control scheme with nice properties such as decoupling, stability, good conditioning and a good camera trajectory.
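This builds on classical image-based visual servoing, where the camera velocity is computed from the error between current and desired visual features; the generic law is sketched below. The paper's contribution lies in designing laser-based features that make the interaction matrix well conditioned and decoupled, which this sketch does not reproduce.

```python
import numpy as np

def ibvs_velocity(s, s_star, L_hat, lam=0.5):
    """Classical image-based visual servoing law:
        v = -lambda * pinv(L_hat) @ (s - s_star)
    where s is the current feature vector, s_star the desired one, and
    L_hat an estimate of the interaction (image Jacobian) matrix.
    Returns the 6-vector camera velocity (translation, rotation)."""
    error = np.asarray(s, float) - np.asarray(s_star, float)
    return -lam * np.linalg.pinv(L_hat) @ error
```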
Abstract:
In a search for new sensor systems and new methods for underwater vehicle positioning based on visual observation, this paper presents a computer vision system based on coded light projection, with which 3D information is extracted from an underwater scene. This information is used to test obstacle avoidance behaviour. In addition, the main ideas for achieving stabilisation of the vehicle in front of an object are presented.
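As an illustration only, below is a sketch of the kind of safety-envelope test an obstacle avoidance behaviour might run over the reconstructed 3-D points; the vehicle frame convention and thresholds are assumptions.

```python
import numpy as np

def obstacle_ahead(points, max_range=2.0, half_width=0.5, half_height=0.5):
    """Return True if any reconstructed 3-D point (N x 3, vehicle frame
    assumed x forward / y left / z up) lies inside a box-shaped safety
    envelope in front of the vehicle. Thresholds are illustrative."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    inside = (x > 0) & (x < max_range) & \
             (np.abs(y) < half_width) & (np.abs(z) < half_height)
    return bool(inside.any())
```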
Abstract:
The absolute necessity of obtaining 3D information about structured and unknown environments in autonomous navigation considerably reduces the set of sensors that can be used. Knowing, at each instant, the position of the mobile robot with respect to the scene is indispensable, and this information must be obtained in the least computing time. Stereo vision is an attractive and widely used method, but it is rather ill-suited to building fast 3D surface maps, due to the correspondence problem. The spatial and temporal correspondence among images can be alleviated by using a method based on structured light: the relationship can be found directly by codifying the projected light, so that each imaged region of the projected pattern carries the information needed to solve the correspondence problem. We present the most significant coded structured light techniques used in recent years.
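A concrete example of codifying the projected light is the classic Gray-code scheme: each projector column receives a unique bit pattern across a sequence of binary frames, so any camera pixel can be decoded back to its projector column directly, solving correspondence without search. A minimal sketch (pattern width and bit count are illustrative):

```python
import numpy as np

def gray_code_patterns(width, n_bits):
    """Binary Gray-code column patterns: pattern k is a width-long row
    of 0/1 stripes. Projecting all n_bits patterns in sequence gives
    every projector column a unique codeword."""
    cols = np.arange(width)
    gray = cols ^ (cols >> 1)                    # binary -> Gray code
    return [((gray >> (n_bits - 1 - k)) & 1).astype(np.uint8)
            for k in range(n_bits)]

def decode_column(bits):
    """Invert the code at one camera pixel: `bits` is the list of 0/1
    values observed across the pattern sequence (MSB first). Returns
    the projector column that illuminated this pixel."""
    gray = 0
    for b in bits:
        gray = (gray << 1) | b
    col = 0
    while gray:                                  # Gray -> binary
        col ^= gray
        gray >>= 1
    return col
```

With the projector column known at each pixel, depth follows from triangulation between the camera ray and the corresponding projector plane.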
Abstract:
Given a set of images of scenes containing different object categories (e.g. grass, roads), our objective is to discover these objects in each image and to use these object occurrences to perform scene classification (e.g. beach scene, mountain scene). We achieve this by using a supervised learning algorithm able to learn from few images, to ease the user's task. We use a probabilistic model to recognise the objects, and we then classify the scene based on their occurrences. Experimental results are shown and evaluated to demonstrate the validity of our proposal. Object recognition performance is compared to the approaches of He et al. (2004) and Marti et al. (2001) using their own datasets. Furthermore, an unsupervised method is implemented in order to evaluate the advantages and disadvantages of our supervised classification approach versus an unsupervised one.
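Once objects are recognised, each image reduces to an object-occurrence vector, and the scene label can be inferred probabilistically from it. The sketch below uses a multinomial naive Bayes classifier on made-up occurrence counts purely to illustrate this second stage; it is not the paper's actual probabilistic model.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical data: rows are images, columns count occurrences of each
# discovered object category, e.g. [grass, road, sand, water].
X_train = np.array([[8, 0, 5, 6],    # beach-like images
                    [9, 1, 4, 7],
                    [7, 3, 0, 1],    # mountain-like images
                    [6, 4, 0, 0]])
y_train = ["beach", "beach", "mountain", "mountain"]

clf = MultinomialNB().fit(X_train, y_train)
print(clf.predict([[7, 1, 6, 5]]))   # -> ['beach']
```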
Abstract:
Introduction: Neuroimaging of the self has focused on high-level mechanisms such as language, memory or imagery of the self. Recent evidence suggests that low-level mechanisms of multisensory and sensorimotor integration may play a fundamental role in encoding self-location and the first-person perspective (Blanke and Metzinger, 2009). Neurological patients with out-of-body experiences (OBE) suffer from abnormal self-location and first-person perspective due to damage in the temporo-parietal junction (Blanke et al., 2004). Although self-location and the first-person perspective can be studied experimentally (Lenggenhager et al., 2009), the neural underpinnings of self-location have yet to be investigated. To investigate the brain network involved in self-location and first-person perspective we used visuo-tactile multisensory conflict, magnetic resonance (MR)-compatible robotics, and fMRI in study 1, and lesion analysis in a sample of 9 patients with OBE due to focal brain damage in study 2. Methods: Twenty-two participants saw a video showing either a person's back or an empty room being stroked (visual stimuli) while the MR-compatible robotic device stroked their back (tactile stimulation). Direction and speed of the seen stroking could either correspond (synchronous) or not (asynchronous) to those of the felt stroking. Each run comprised the four conditions according to a 2x2 factorial design with Object (Body, No-Body) and Synchrony (Synchronous, Asynchronous) as main factors. Self-location was estimated using the mental ball dropping task (MBD; Lenggenhager et al., 2009). After the fMRI session participants completed a 6-item questionnaire adapted from the original questionnaire created by Botvinick and Cohen (1998) and based on questions and data obtained by Lenggenhager et al. (2007, 2009). They were also asked to complete a questionnaire to disclose the perspective they adopted during the illusion. Response times (RTs) for the MBD and fMRI data were analyzed with a 3-way mixed-model ANOVA with the between-subjects factor Perspective (up, down) and the two within-subjects factors Object (body, no-body) and Stroking (synchronous, asynchronous). Quantitative lesion analysis was performed using MRIcron (Rorden et al., 2007). We compared the distributions of brain lesions confirmed by multimodality imaging (Knowlton, 2004) in patients with OBE with those of patients showing complex visual hallucinations involving people or faces, but without any disturbance of self-location and first-person perspective. Nine patients with OBE were investigated; the control group comprised 8 patients. Structural imaging data were available for normalization and co-registration in all the patients. Normalization of each patient's lesion into the common MNI (Montreal Neurological Institute) reference space permitted simple, voxel-wise, algebraic comparisons to be made. Results: Even though in the scanner all participants were lying on their backs facing upwards, the analysis of perspective showed that half of the participants had the impression of looking down at the virtual human body below them, despite the cues about their actual body position (Down-group). The other participants had the impression of looking up at the virtual body above them (Up-group). Analysis of Q3 ("How strong was the feeling that the body you saw was you?") indicated stronger self-identification with the virtual body during the synchronous stroking. RTs in the MBD task confirmed these subjective data (significant 3-way interaction between perspective, object and stroking).
fMRI results showed eight cortical regions where the BOLD signal was significantly different during at least one of the conditions resulting from the combination of Object and Stroking, relative to baseline: the right and left temporo-parietal junction, the right extrastriate body area (EBA), the left middle occipito-temporal gyrus, the left postcentral gyrus, the right medial parietal lobe, and the bilateral medial occipital lobe (Fig. 1). The activation patterns in the right and left temporo-parietal junction and the right EBA reflected changes in self-location and perspective, as revealed by statistical analysis performed on the percentage of BOLD change with respect to the baseline. Statistical lesion overlap comparison (using nonparametric voxel-based lesion-symptom mapping) with respect to the control group revealed the right temporo-parietal junction, centered at the angular gyrus (Talairach coordinates x = 54, y = -52, z = 26; p < 0.05, FDR corrected). Conclusions: The present questionnaire and behavioural results show that, despite the noisy and constraining MR environment, our participants had predictable changes in self-location, self-identification, and first-person perspective when the robotic tactile stroking was applied synchronously with the visual stroking. fMRI data in healthy participants and lesion data in patients with abnormal self-location and first-person perspective jointly revealed that the temporo-parietal cortex, especially in the right hemisphere, encodes these conscious experiences. We argue that temporo-parietal activity reflects the experience of the conscious "I" as embodied and localized within bodily space.