840 results for Vision.
Abstract:
In recent years, increasingly complex humanoid robots have been developed. At the same time, programming these systems has become more difficult. There is a clear need for such robots to adapt and perform certain tasks autonomously, or even to learn by themselves how to act. An important issue to tackle is closing the sensorimotor loop. Especially for humanoids, the tight integration of perception with action will allow for improved behaviours, embedding adaptation at the lower levels of the system.
Abstract:
There is increasing interest in the use of Unmanned Aerial Vehicles (UAVs) for wildlife and feral animal monitoring around the world. This paper describes a novel system that uses a predictive dynamic application to place the UAV ahead of a user, together with a low-cost thermal camera and a small onboard computer that identifies heat signatures of a target animal from a predetermined altitude and transmits that target’s GPS coordinates. A map is generated, and various data sets and graphs are displayed in a GUI designed for ease of use. The paper describes the hardware and software architecture and the probabilistic model of the downward-facing camera for detecting an animal. Behavioral dynamics of target movement are used to design a Kalman filter and Markov model based prediction algorithm that places the UAV ahead of the user. Geometrical concepts and the Haversine formula are applied to the maximum likelihood case to predict a future state of the user, thus delivering a new waypoint for autonomous navigation. Results show that the system is capable of autonomously locating animals from a predetermined height and generating a map showing the location of the animals ahead of the user.
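The Haversine step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean Earth radius constant, and the use of the spherical destination-point formula to produce a waypoint are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def waypoint_ahead(lat, lon, bearing_deg, distance_m):
    """Hypothetical helper: point reached from (lat, lon) along a bearing (degrees
    clockwise from north) after distance_m metres, on a spherical Earth."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lmb2)
```

In a setup like the one described, the predicted heading and travel distance of the user would come from the prediction algorithm; here they are simply arguments to `waypoint_ahead`.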
Abstract:
This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Transferring the network to real hardware and real observation in a naive approach failed, but experiments show that the network works when replacing camera images with synthetic images.
Abstract:
Recent interest in affect and the body has mobilised a contemporary review of aesthetics and phenomenology within architecture to unpack how environments affect spatial experience. Emerging spatial studies within the neurosciences, and their implications for architectural research as raised by architectural theorists, have been well supported by a raft of scientists and institutions. Although there has been some headway in spatial studies of the vision impaired (Cattaneo et al., 2011) to understand the role of their non-visual systems in assisting navigation and location, little has been discussed about their other abilities to sense particular qualities of space that impinge upon emotion and wellbeing. This research explores, through published studies and constructed spatial interviews, the affective perception of the vision impaired and how further interplay between this research and the architectural field can contribute new knowledge regarding space and affect. The research aims to provide background on current and potential cross-disciplinary research and to highlight the role wearable technologies can play in enhancing knowledge of affective spatial experience.
Abstract:
Evidence has accumulated that rod activation under mesopic and scotopic light levels alters visual perception and performance. Here we review the most recent developments in the measurement of rod and cone contributions to mesopic color perception and temporal processing, with a focus on data measured using the four-primary photostimulator method that independently controls rod and cone excitations. We discuss the findings in the context of rod inputs to the three primary retinogeniculate pathways to understand rod contributions to mesopic vision. Additionally, we present evidence that hue perception is possible under scotopic, purely rod-mediated conditions and that it involves cortical mechanisms.
Abstract:
To develop and test a custom-built instrument to simultaneously assess tear film surface quality (TFSQ) and subjective vision score (SVS).
Abstract:
The use of UAVs for remote sensing tasks, e.g. agriculture and search and rescue, is increasing. The ability of UAVs to autonomously find a target and perform on-board decision making, such as descending to a new altitude or landing next to a target, is a desired capability. Computer-vision functionality allows the Unmanned Aerial Vehicle (UAV) to follow a designated flight plan, detect an object of interest, and change its planned path. In this paper we describe a low-cost, open-source system in which all image processing is performed on board the UAV using a Raspberry Pi 2 microprocessor interfaced with a camera. The Raspberry Pi and the autopilot are physically connected through a serial link and communicate via MAVProxy. The Raspberry Pi continuously monitors the flight path in real time through a USB camera module. The algorithm checks whether the target has been captured. If the target is detected, the position of the object in the frame is represented in Cartesian coordinates and converted into estimated GPS coordinates. In parallel, the autopilot receives the approximate GPS location of the target and makes a decision to guide the UAV to a new location. This system also has potential uses in precision agriculture, such as detecting plant pests and disease outbreaks, which cause severe financial damage to crop yields if not detected early. Results show that the algorithm detects 99% of objects of interest and that the UAV is capable of navigation and on-board decision making.
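The conversion from a position in the frame to estimated GPS coordinates can be sketched as follows. This is an illustration under stated assumptions, not the method used in the paper: it assumes a camera pointing straight down, flat ground, known altitude and field of view, image top aligned with the UAV's heading, and a small-offset flat-Earth approximation. The function name and all parameters are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def pixel_to_gps(px, py, img_w, img_h, uav_lat, uav_lon,
                 altitude_m, hfov_deg, vfov_deg, heading_deg=0.0):
    """Estimate the (lat, lon) of the ground point seen at pixel (px, py)."""
    # Size of the ground footprint covered by the image at this altitude
    ground_w = 2 * altitude_m * math.tan(math.radians(hfov_deg) / 2)
    ground_h = 2 * altitude_m * math.tan(math.radians(vfov_deg) / 2)
    # Cartesian offset from the image centre in metres (x to the right, y forward)
    x_right = (px - img_w / 2) / img_w * ground_w
    y_fwd = (img_h / 2 - py) / img_h * ground_h
    # Rotate from the UAV frame into north/east using the heading (clockwise from north)
    h = math.radians(heading_deg)
    north = y_fwd * math.cos(h) - x_right * math.sin(h)
    east = y_fwd * math.sin(h) + x_right * math.cos(h)
    # Convert metre offsets to degree offsets (valid only for small distances)
    lat = uav_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = uav_lon + math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(uav_lat))))
    return lat, lon
```

A target at the image centre maps back to the UAV's own position; a target at the top edge maps to a point ahead of the UAV along its heading.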
Abstract:
Robotic vision is limited by line of sight and onboard camera capabilities. Robots can acquire video or images from remote cameras, but processing the additional data imposes a computational burden. This paper applies the Distributed Robotic Vision Service (DRVS) to robot path planning using data outside the robot's line of sight. DRVS implements a distributed visual object detection service that distributes the computation to remote camera nodes with processing capabilities. Robots request task-specific object detection from DRVS by specifying a geographic region of interest and an object type. The remote camera nodes perform the visual processing and send the high-level object information to the robot. Additionally, DRVS relieves robots of sensor discovery by dynamically distributing object detection requests to remote camera nodes. Tested on two different indoor path planning tasks, DRVS showed a dramatic reduction in mobile robot compute load and wireless network utilization.
Abstract:
Visual information processing in the brain proceeds in both serial and parallel fashion throughout various functionally distinct, hierarchically organised cortical areas. Feedforward signals from the retina and hierarchically lower cortical levels are the major activators of visual neurons, but top-down and feedback signals from higher-level cortical areas have a modulating effect on neural processing. My work concentrates on visual encoding in hierarchically low-level cortical visual areas in the human brain and examines neural processing especially in the cortical representation of the visual field periphery. I use magnetoencephalography and functional magnetic resonance imaging to measure neuromagnetic and hemodynamic responses from healthy volunteers during visual stimulation and oculomotor and cognitive tasks. My thesis comprises six publications. The visual cortex poses a great challenge for the modeling of neuromagnetic sources. My work shows that a priori information about source locations is needed for modeling neuromagnetic sources in the visual cortex. In addition, my work examines other potential confounding factors in vision studies, such as light scatter inside the eye, which may result in erroneous responses in cortex outside the representation of the stimulated region, as well as eye movements and attention. I mapped cortical representations of the peripheral visual field and identified a putative human homologue of functional area V6 of the macaque in the posterior bank of the parieto-occipital sulcus. My work shows that human V6 activates during eye movements and that it responds to visual motion at short latencies. These findings suggest that human V6, like its monkey homologue, is related to fast processing of visual stimuli and visually guided movements. I demonstrate that peripheral vision is functionally related to eye movements and connected to a rapid stream of functional areas that process visual motion.
In addition, my work shows two different forms of top-down modulation of neural processing in the hierarchically lowest cortical levels: one that is related to dorsal stream activation and may reflect motor processing or resetting signals that prepare the visual cortex for change in the environment, and another, a local signal enhancement at the attended region, that reflects a local feedback signal and may perceptually increase stimulus saliency.
Abstract:
Detect and Avoid (DAA) technology is widely acknowledged as a critical enabler for unsegregated Remotely Piloted Aircraft (RPA) operations, particularly Beyond Visual Line of Sight (BVLOS). Image-based DAA in the visible spectrum is a promising technological option for addressing the challenges DAA presents. Two impediments to progress for this approach are the scarcity of available video footage to train and test algorithms and the scarcity of testing regimes and specifications that facilitate repeatable, statistically valid performance assessment. This paper makes three key contributions to address these impediments. First, we detail our progress towards the creation of a large hybrid collision and near-collision encounter database. Second, we explore the suitability of techniques employed by the biometric research community (Speaker Verification and Language Identification) for DAA performance optimisation and assessment. These techniques include Detection Error Trade-off (DET) curves, Equal Error Rates (EER), and the Detection Cost Function (DCF). Finally, the hybrid database and the speech-based techniques are combined and employed in the assessment of a contemporary image-based DAA system. This system includes stabilisation, morphological filtering and a Hidden Markov Model (HMM) temporal filter.
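The assessment quantities named in the abstract above (DET curve points, EER, DCF) can be sketched in a few lines. This is a generic illustration, not the paper's evaluation code: the score lists are hypothetical, higher scores are assumed to mean "more likely a target", and the default cost and prior values are placeholders.

```python
def det_points(target_scores, nontarget_scores):
    """Return (threshold, miss_rate, false_alarm_rate) for every candidate threshold.
    A detection is declared when score >= threshold."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, miss, false_alarm))
    return points


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the operating point where miss and false-alarm rates are closest."""
    _, miss, false_alarm = min(
        det_points(target_scores, nontarget_scores),
        key=lambda p: abs(p[1] - p[2]),
    )
    return (miss + false_alarm) / 2


def detection_cost(miss, false_alarm, p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Detection Cost Function: expected cost of misses and false alarms
    for an assumed target prior and assumed error costs."""
    return c_miss * miss * p_target + c_fa * false_alarm * (1 - p_target)
```

Plotting the miss rate against the false-alarm rate from `det_points` (conventionally on normal-deviate axes) gives the DET curve; the EER and DCF each summarise it as a single number.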
Abstract:
This paper presents an SIMD machine which has been tuned to execute low-level vision algorithms employing the relaxation labeling paradigm. Novel features of the design include: (1) a communication scheme capable of window accessing under a single instruction; (2) flexible I/O instructions to load overlapped data segments; and (3) data-conditional instructions which can be nested to an arbitrary degree. A time analysis of the stereo correspondence problem, as implemented on a simulated version of the machine using the probabilistic relaxation technique, shows a speed-up of almost N² for an N × N array of PEs.
Abstract:
This thesis analyses the work processes of a web newsroom. The study is an ethnographic case study with the web production of Hufvudstadsbladet as its case. The overarching question is how the interaction between the newspaper newsroom and the web newsroom works, and why. The aim was to find and make visible the underlying tensions in the organisation that bimediality may have given rise to, and to analyse the production against the background of earlier research. The analysis covers three areas that recur frequently in media convergence research: organisation, content and attitude. The research material was collected through observation, interviews and an e-mail survey. Work in the newsroom was observed during six work shifts. The web reporter's work was observed, the observations were written down, and after each shift an interview with the web reporter was recorded. In addition to these interviews, three further interviews were conducted with two news editors and the editor-in-chief. An e-mail survey with open-ended questions was sent to all members of the newsroom. The thesis takes its starting point in media convergence research, newsroom research and activity theory. Because the theoretical starting point lies partly in activity theory and developmental work research, disturbances in the work process were also counted in order to identify underlying tensions in the organisation. All events that caused a longer or shorter interruption in the work process were noted and divided into categories. In total, sixty disturbances were identified, of which the largest share, one third, were found to stem from organisational and communication factors, chiefly as a result of deficient internal communication. The conclusion is that web production, as a result of heterogeneous objects in the activity system (unclear goals and uncertainty about the role of the web in the organisation), is stuck in the gap between management's vision and the reality of the newsroom.
Several conflicting views of the role of web production prevail in the newsroom. This leads to disturbances in the work process, which in turn cause production to falter and fail to develop. The lack of clarity about goals leads to unclear concrete practice, communication difficulties, misunderstandings and an uneven division of labour, all of which affect how smoothly production runs.
Abstract:
Modern smartphones often come with a significant amount of computational power and an integrated digital camera, making them an ideal platform for intelligent assistants. This work is restricted to retail environments, where users could be provided with, for example, navigational instructions to desired products or information about special offers in their close proximity. Applications of this kind usually require information about the user's current location in the domain environment, which in our case corresponds to a retail store. We propose a vision-based positioning approach that recognizes the products the user's mobile phone camera is currently pointing at. The products are related to locations within the store, which enables us to locate the user when the phone's camera is pointed at a group of products. The first step of our method is to extract meaningful features from digital images. We use the Scale-Invariant Feature Transform (SIFT) algorithm, which extracts features that are highly distinctive in the sense that they can be correctly matched against a large database of features from many images. We collect a comprehensive set of images from all meaningful locations within our domain and extract the SIFT features from each of these images. As the SIFT features are of high dimensionality and comparing individual features is therefore infeasible, we apply the Bags of Keypoints method, which creates a generic representation, a visual category, from all features extracted from images taken at a specific location. A category for an unseen image can be deduced by extracting its SIFT features and choosing the category that best fits the extracted features. We have applied the proposed method in a Finnish supermarket. We treat grocery shelves as categories, which is a sufficient level of accuracy to help users navigate or to provide useful information about nearby products.
We achieve 40% accuracy, which is quite low for commercial applications but significantly outperforms the random-guess baseline. Our results suggest that classification accuracy could be increased with a deeper analysis of the domain and by combining existing positioning methods with ours.
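The Bags of Keypoints idea described in the abstract above can be sketched with toy data. This is a simplified illustration, not the authors' pipeline: real SIFT descriptors are 128-dimensional and the codebook is normally learned by clustering, whereas here the descriptors are 2-D tuples, the codebook is given by hand, and the nearest-histogram classifier is a stand-in chosen for brevity. All names are hypothetical.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the visual word (codebook entry) closest to a descriptor."""
    return min(range(len(codebook)), key=lambda i: math.dist(descriptor, codebook[i]))


def bag_of_keypoints(descriptors, codebook):
    """Normalised histogram of visual-word occurrences for one image or location."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def classify(descriptors, codebook, category_histograms):
    """Pick the category whose histogram is closest to the query image's histogram."""
    query = bag_of_keypoints(descriptors, codebook)
    return min(category_histograms,
               key=lambda name: math.dist(query, category_histograms[name]))
```

Each shelf would contribute one entry to `category_histograms`, built from the descriptors of its training images; an unseen image is then assigned to the shelf with the best-fitting histogram.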
Abstract:
This paper is concerned with grasping biological cells in an aqueous medium with miniature grippers that can also help estimate forces using vision-based displacement measurement and computation. We present the design, fabrication, and testing of three single-piece, compliant miniature grippers with parallel and angular jaw motions. Two grippers were designed using experience and intuition, while the third was designed using topology optimization with implicit manufacturing constraints. These grippers were fabricated by different manufacturing techniques from spring steel and polydimethylsiloxane (PDMS). The grippers also serve as force sensors. To this end, we present a vision-based force-sensing technique that solves Cauchy's problem in elasticity using an improved algorithm. We validated this technique at the macroscale, where an independent method was available to estimate the force. In this study, the gripper was used to hold a yeast ball and a zebrafish egg cell, each less than 1 mm in diameter. The forces involved were estimated to be about 30 and 10 mN for the yeast ball and the zebrafish egg cell, respectively.