327 results for Object vision
Abstract:
In recent years, increasingly complex humanoid robots have been developed, and programming these systems has become correspondingly more difficult. There is a clear need for such robots to adapt and perform certain tasks autonomously, or even to learn by themselves how to act. An important issue to tackle is the closing of the sensorimotor loop. Especially for humanoids, tight integration of perception with action allows for improved behaviours, embedding adaptation at the lower levels of the system.
Abstract:
There is increasing interest worldwide in the use of Unmanned Aerial Vehicles (UAVs) for wildlife and feral animal monitoring. This paper describes a novel system in which a predictive dynamic application places the UAV ahead of a user; a low-cost thermal camera and a small onboard computer identify heat signatures of a target animal from a predetermined altitude and transmit the target's GPS coordinates. A map is generated and various data sets and graphs are displayed through a GUI designed for ease of use. The paper describes the hardware and software architecture and the probabilistic model used with the downward-facing camera for the detection of an animal. The behavioral dynamics of target movement inform the design of a Kalman filter and Markov-model-based prediction algorithm used to place the UAV ahead of the user. Geometrical concepts and the Haversine formula are applied to the maximum-likelihood case to predict a future state of the user, thus delivering a new waypoint for autonomous navigation. Results show that the system is capable of autonomously locating animals from a predetermined height and generating a map showing the locations of the animals ahead of the user.
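To illustrate the waypoint-prediction step, the sketch below computes a great-circle distance with the Haversine formula and projects a destination waypoint along a bearing, on a spherical-Earth assumption. This is a minimal Python sketch: the function names, example coordinates and 50 m look-ahead distance are illustrative rather than taken from the paper, whose full system couples this geometry with the Kalman/Markov prediction described above.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def project_waypoint(lat, lon, bearing_deg, distance_m):
    """Destination reached by travelling distance_m from (lat, lon) along
    bearing_deg on a spherical Earth (standard direct geodesic formula)."""
    delta = distance_m / EARTH_RADIUS_M
    theta = math.radians(bearing_deg)
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    phi2 = math.asin(math.sin(phi1) * math.cos(delta) +
                     math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lmb2)

# Example: place the UAV 50 m ahead of a user heading north-east.
print(project_waypoint(-27.4698, 153.0251, 45.0, 50.0))
```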
Abstract:
This paper introduces a machine-learning-based system for controlling a robotic manipulator using visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images, without any prior knowledge of the robot's configuration, is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Naively transferring the network to real hardware and real observations failed, but experiments show that the network works when camera images are replaced with synthetic images.
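To make the learning signal concrete: a DQN is trained by regressing its Q-values towards Bellman targets computed from a slower-moving target network. The sketch below shows only that target computation, with a toy linear Q-function standing in for the paper's convolutional network; all names, dimensions and values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_network(states, weights):
    """Toy linear Q-function mapping a batch of state vectors to Q-values
    per discrete action (a stand-in for a convolutional network)."""
    return states @ weights

def dqn_targets(rewards, next_states, target_weights, gamma=0.99, terminal=None):
    """Bellman targets y = r + gamma * max_a' Q_target(s', a'),
    with bootstrapping suppressed on terminal transitions."""
    next_q = q_network(next_states, target_weights).max(axis=1)
    if terminal is not None:
        next_q = np.where(terminal, 0.0, next_q)
    return rewards + gamma * next_q

# Batch of 4 transitions with 8-dimensional states and 3 discrete actions.
weights = rng.normal(size=(8, 3))
targets = dqn_targets(rewards=np.array([0.0, 0.0, 1.0, 0.0]),
                      next_states=rng.normal(size=(4, 8)),
                      target_weights=weights,
                      terminal=np.array([False, False, True, False]))
print(targets)  # regression targets for the Q-network update
```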
Abstract:
Recent interest in affect and the body has mobilised a contemporary review of aesthetics and phenomenology within architecture to unpack how environments affect spatial experience. Emerging spatial studies within the neurosciences, and their implications for architectural research as raised by architectural theorists, have been well supported by a raft of scientists and institutions. Although there has been some headway in spatial studies of the vision impaired (Cattaneo et al., 2011) to understand the role of their non-visual systems in assisting navigation and location, little is discussed in terms of their other abilities in sensing particular qualities of space that impinge upon emotion and wellbeing. This research explores, through published studies and constructed spatial interviews, the affective perception of the vision impaired and how further interplay between this research and the architectural field can contribute new knowledge regarding space and affect. The research aims to provide a background on current and potential cross-disciplinary research and to highlight the role wearable technologies can play in enhancing knowledge of affective spatial experience.
Abstract:
Evidence has accumulated that rod activation at mesopic and scotopic light levels alters visual perception and performance. Here we review the most recent developments in the measurement of rod and cone contributions to mesopic color perception and temporal processing, with a focus on data measured using the four-primary photostimulator method, which independently controls rod and cone excitations. We discuss the findings in the context of rod inputs to the three primary retinogeniculate pathways to understand rod contributions to mesopic vision. Additionally, we present evidence that hue perception is possible under scotopic, purely rod-mediated conditions, and that it involves cortical mechanisms.
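Independent control of rod and cone excitations in a four-primary photostimulator reduces to inverting a linear mixing model: each primary drives each photoreceptor class by a known amount, so the four primary intensities that realise a desired (rod, L, M, S) excitation solve a 4x4 linear system. The sketch below illustrates only this idea; the matrix entries are hypothetical, whereas in practice each is obtained by integrating the photoreceptor's spectral sensitivity against the primary's spectrum.

```python
import numpy as np

# Rows: rod, L-, M-, S-cone; columns: the four primaries.
# Hypothetical values for illustration only.
A = np.array([
    [0.90, 0.40, 0.20, 0.10],  # rod
    [0.30, 0.80, 0.50, 0.10],  # L-cone
    [0.20, 0.60, 0.70, 0.20],  # M-cone
    [0.05, 0.10, 0.30, 0.90],  # S-cone
])

def primaries_for(rod, L, M, S):
    """Primary intensities producing the requested photoreceptor
    excitations, by inverting the linear mixing model A @ p = e."""
    return np.linalg.solve(A, np.array([rod, L, M, S]))

# Raise rod excitation by 20% while holding all three cone classes fixed.
baseline = primaries_for(1.0, 1.0, 1.0, 1.0)
rod_step = primaries_for(1.2, 1.0, 1.0, 1.0)
print(baseline, rod_step)
```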
Abstract:
To develop and test a custom-built instrument to simultaneously assess tear film surface quality (TFSQ) and subjective vision score (SVS).
Abstract:
Surveying threatened and invasive species to obtain accurate population estimates is an important but challenging task that requires a considerable investment in time and resources. Estimates using existing ground-based monitoring techniques, such as camera traps and surveys performed on foot, are known to be resource intensive, potentially inaccurate and imprecise, and difficult to validate. Recent developments in unmanned aerial vehicles (UAVs), artificial intelligence and miniaturized thermal imaging systems represent a new opportunity for wildlife experts to inexpensively survey relatively large areas. The system presented in this paper includes thermal image acquisition as well as a video-processing pipeline that performs object detection, classification and tracking of wildlife in forested or open areas. The system is tested on thermal video data from ground-based recordings and test-flight footage, and is found to be able to detect all the target wildlife located in the surveyed area. The system is flexible in that the user can readily define the types of objects to classify and the object characteristics that should be considered during classification.
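A minimal version of the detection stage of such a pipeline can be written as thresholding warm pixels and extracting connected components. The sketch below assumes a radiometrically calibrated frame of per-pixel temperatures; the threshold, minimum blob area and frame size are illustrative placeholders rather than values from the paper, whose system adds classification and tracking on top.

```python
import numpy as np
from scipy import ndimage

def detect_hotspots(thermal_frame, threshold_c=30.0, min_area=20):
    """Return centroids and areas of warm blobs in a thermal frame.

    thermal_frame: 2-D array of per-pixel temperatures (degrees C).
    threshold_c / min_area are illustrative; real values depend on the
    species, camera calibration and flight altitude.
    """
    mask = thermal_frame > threshold_c        # hot pixels
    labels, n = ndimage.label(mask)           # connected components
    detections = []
    for region in range(1, n + 1):
        blob = labels == region
        area = int(blob.sum())
        if area >= min_area:                  # reject sensor noise
            cy, cx = ndimage.center_of_mass(blob)
            detections.append({"centroid": (cy, cx), "area": area})
    return detections

# Synthetic 120x160 frame with one warm 'animal' on a cool background.
frame = np.full((120, 160), 15.0)
frame[40:52, 70:84] = 35.0
print(detect_hotspots(frame))
```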
Abstract:
Online groups rely on contributions from their members to flourish, but in the context of behaviour change individuals are typically reluctant to participate actively before they have changed successfully. We took inspiration from CSCW research on objects to address this problem by shifting the focus of online participation from the exchange of personal experiences to more incidental interactions mediated by objects that offer support for change. In this article we describe how we designed, deployed and studied a smartphone application that uses different objects, called distractions and tips, to facilitate social interaction amongst people trying to quit smoking. A field study with 18 smokers revealed different forms of interaction: purely instrumental interactions with the objects, subtle engagement with other users through receptive and covert interactions, as well as explicit interaction with other users through disclosure and mutual support. The distraction objects offered a stepping-stone into interaction, whereas the tips encouraged interaction with the people behind the objects. This understanding of interaction through objects complements existing frameworks of online participation and adds to the current discourse on object-centred sociality. Furthermore, it provides an alternative approach to the design of online support groups, one that offers users enhanced control over the information they share with other users. We conclude by discussing how researchers and practitioners can apply the ideas of interaction around objects to other domains where individuals may have a simultaneous desire and reluctance to interact.
Abstract:
Detect and Avoid (DAA) technology is widely acknowledged as a critical enabler for unsegregated Remotely Piloted Aircraft (RPA) operations, particularly Beyond Visual Line of Sight (BVLOS). Image-based DAA in the visible spectrum is a promising technological option for addressing the challenges DAA presents. Two impediments to progress for this approach are the scarcity of available video footage to train and test algorithms, and the lack of testing regimes and specifications that facilitate repeatable, statistically valid performance assessment. This paper makes three key contributions to address these impediments. First, we detail our progress towards the creation of a large hybrid collision and near-collision encounter database. Second, we explore the suitability of techniques employed by the biometric research community (Speaker Verification and Language Identification) for DAA performance optimisation and assessment. These techniques include Detection Error Trade-off (DET) curves, Equal Error Rates (EER), and the Detection Cost Function (DCF). Finally, the hybrid database and the speech-based techniques are combined and employed in the assessment of a contemporary image-based DAA system, which includes stabilisation, morphological filtering and a Hidden Markov Model (HMM) temporal filter.
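For readers unfamiliar with these speaker-verification metrics, the Equal Error Rate is the operating point on a DET curve where the miss rate equals the false-alarm rate. Below is a minimal sketch of reading it off from two score sets; the Gaussian scores are synthetic stand-ins for detector outputs on true aircraft tracks and on clutter, not data from the paper.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep a decision threshold over all observed scores and return the
    operating point where miss rate and false-alarm rate are closest:
    the Equal Error Rate (EER) read off a DET curve."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, eer, best_t = np.inf, None, None
    for t in thresholds:
        miss = np.mean(genuine_scores < t)           # targets rejected
        false_alarm = np.mean(impostor_scores >= t)  # non-targets accepted
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer, best_t = gap, (miss + false_alarm) / 2, t
    return eer, best_t

rng = np.random.default_rng(1)
genuine = rng.normal(2.0, 1.0, 500)   # synthetic scores on aircraft tracks
impostor = rng.normal(0.0, 1.0, 500)  # synthetic scores on clutter
print(equal_error_rate(genuine, impostor))
```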
Abstract:
Scene understanding has been investigated mainly from a visual-information point of view. Recently, depth sensing has provided an extra wealth of information, allowing more geometric knowledge to be fused into scene understanding. Yet to form a holistic view, especially in robotic applications, one can create even more data by interacting with the world. In fact, humans, when growing up, seem to investigate the world around them heavily through haptic exploration. We show an application of haptic exploration on a humanoid robot in cooperation with a learning method for object segmentation. The performed actions progressively improve the segmentation of objects in the scene.
Abstract:
In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks. The system is made available to the community as a ROS module.
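The Bayesian-filter framework mentioned above can be pictured as a discrete Bayes filter over place classes: a transition model encodes the prior that the robot's surroundings rarely change between frames, and each classifier output updates the belief. The sketch below is a generic discrete Bayes filter under that assumption; the three classes, the 0.9 self-transition probability and the likelihood values are illustrative, not the paper's.

```python
import numpy as np

def bayes_filter_step(belief, likelihood, transition):
    """One recursive update of a discrete Bayes filter over place classes.

    belief:     prior probability per class, shape (K,)
    likelihood: per-class scores from the classifiers for the current frame
    transition: K x K matrix, transition[i, j] = P(class j now | class i before),
                encoding temporal coherence.
    """
    predicted = belief @ transition      # prediction step
    posterior = predicted * likelihood   # measurement update
    return posterior / posterior.sum()   # normalise

# Three illustrative classes: corridor, office, kitchen.
K = 3
stay = 0.9  # assumed self-transition probability
transition = np.full((K, K), (1 - stay) / (K - 1))
np.fill_diagonal(transition, stay)

belief = np.ones(K) / K
for likelihood in ([0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]):
    belief = bayes_filter_step(belief, np.array(likelihood), transition)
    print(belief)
```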
Abstract:
Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One advantage of DCNNs is the robustness of their representations to object location, which is useful for object recognition tasks. However, this also discards spatial information, which is useful when dealing with the topological information of an image (e.g. scene labeling, face recognition). In this paper, we propose a deeper and wider network architecture to tackle the scene labeling task. The depth is achieved by incorporating predictions from multiple early layers of the DCNN. The width is achieved by combining multiple outputs of the network. We then further refine the parsing results by adopting graphical models (GMs) as a post-processing step to incorporate spatial and contextual information. This strategy of a deeper, wider convolutional network coupled with graphical models has shown promising results on the PASCAL-Context dataset.
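One way to picture the "deeper and wider" fusion is as bringing per-layer class-score maps to a common resolution and summing them before taking per-pixel labels, with the graphical-model refinement then operating on the result. The sketch below shows only that fusion idea under nearest-neighbour upsampling, with random score maps and hypothetical shapes; it is not the paper's architecture.

```python
import numpy as np

def upsample(scores, out_h, out_w):
    """Nearest-neighbour upsampling of an (h, w, classes) score map."""
    h, w, _ = scores.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return scores[rows][:, cols]

def fuse_layer_predictions(layer_scores, out_h, out_w):
    """Sum per-pixel class scores predicted from several network layers,
    after bringing them to a common spatial resolution."""
    fused = np.zeros((out_h, out_w, layer_scores[0].shape[-1]))
    for scores in layer_scores:
        fused += upsample(scores, out_h, out_w)
    return fused.argmax(axis=-1)  # per-pixel label map

rng = np.random.default_rng(2)
# Score maps from three layers at decreasing resolution, 5 classes each.
layers = [rng.random((32, 32, 5)), rng.random((16, 16, 5)), rng.random((8, 8, 5))]
labels = fuse_layer_predictions(layers, 64, 64)
print(labels.shape)  # (64, 64)
```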