628 resultados para Visual identification tasks
Resumo:
We propose a method for learning specific object representations that can be applied (and reused) in visual detection and identification tasks. A machine learning technique called Cartesian Genetic Programming (CGP) is used to create these models based on a series of images. Our research investigates how manipulation actions might allow for the development of better visual models and therefore better robot vision. This paper describes how visual object representations can be learned and improved by performing object manipulation actions, such as, poke, push and pick-up with a humanoid robot. The improvement can be measured and allows for the robot to select and perform the `right' action, i.e. the action with the best possible improvement of the detector.
Resumo:
This paper introduces an improved line tracker using IMU and vision data for visual servoing tasks. We utilize an Image Jacobian which describes motion of a line feature to corresponding camera movements. These camera motions are estimated using an IMU. We demonstrate impacts of the proposed method in challenging environments: maximum angular rate ~160 0/s, acceleration ~6m /s2 and in cluttered outdoor scenes. Simulation and quantitative tracking performance comparison with the Visual Servoing Platform (ViSP) are also presented.
Resumo:
Purpose To quantify the effects of driver age on night-time pedestrian conspicuity, and to determine whether individual differences in visual performance can predict drivers' ability to recognise pedestrians at night. Methods Participants were 32 visually normal drivers (20 younger: M = 24.4 years ± 6.4 years; 12 older: M = 72.0 years ± 5.0 years). Visual performance was measured in a laboratory-based testing session including visual acuity, contrast sensitivity, motion sensitivity and the useful field of view. Night-time pedestrian recognition distances were recorded while participants drove an instrumented vehicle along a closed road course at night; to increase the workload of drivers, auditory and visual distracter tasks were presented for some of the laps. Pedestrians walked in place, sideways to the oncoming vehicles, and wore either a standard high visibility reflective vest or reflective tape positioned on the movable joints (biological motion). Results Driver age and pedestrian clothing significantly (p < 0.05) affected the distance at which the drivers first responded to the pedestrians. Older drivers recognised pedestrians at approximately half the distance of the younger drivers and pedestrians were recognised more often and at longer distances when they wore a biological motion reflective clothing configuration than when they wore a reflective vest. Motion sensitivity was an independent predictor of pedestrian recognition distance, even when controlling for driver age. Conclusions The night-time pedestrian recognition capacity of older drivers was significantly worse than that of younger drivers. The distance at which drivers first recognised pedestrians at night was best predicted by a test of motion sensitivity.
Resumo:
Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities regards why these architectures perform so well. In this paper, we offer a unique perspective to this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offer substantial improvements in terms of computational and storage efficiency.
Resumo:
This paper introduces a minimalistic approach to produce a visual hybrid map of a mobile robot’s working environment. The proposed system uses omnidirectional images along with odometry information to build an initial dense posegraph map. Then a two level hybrid map is extracted from the dense graph. The hybrid map consists of global and local levels. The global level contains a sparse topological map extracted from the initial graph using a dual clustering approach. The local level contains a spherical view stored at each node of the global level. The spherical views provide both an appearance signature for the nodes, which the robot uses to localize itself in the environment, and heading information when the robot uses the map for visual navigation. In order to show the usefulness of the map, an experiment was conducted where the map was used for multiple visual navigation tasks inside an office workplace.
Resumo:
In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot with-out environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.
Resumo:
Purpose: There have been few studies of visual temporal processing of myopic eyes. This study investigated the visual performance of emmetropic and myopic eyes using a backward visual masking location task. Methods: Data were collected for 39 subjects (15 emmetropes, 12 stable myopes, 12 progressing myopes). In backward visual masking, a target’s visibility is reduced by a mask presented in quick succession ‘after’ the target. The target and mask stimuli were presented at different interstimulus intervals (from 12 to 300 ms). The task involved locating the position of a target letter with both a higher (seven per cent) and a lower (five per cent) contrast. Results: Emmetropic subjects had significantly better performance for the lower contrast location task than the myopes (F2,36 = 22.88; p < 0.001) but there was no difference between the progressing and stable myopic groups (p = 0.911). There were no differences between the groups for the higher contrast location task (F2,36 = 0.72, p = 0.495). No relationship between task performance and either the magnitude of myopia or axial length was found for either task. Conclusions: A location task deficit was observed in myopes only for lower contrast stimuli. Both emmetropic and myopic groups had better performance for the higher contrast task compared to the lower contrast task, with myopes showing considerable improvement. This suggests that five per cent contrast may be the contrast threshold required to bias the task towards the magnocellular system (where myopes have a temporal processing deficit). Alternatively, the task may be sensitive to the contrast sensitivity of the observer.
Resumo:
Visual recording devices such as video cameras, CCTVs, or webcams have been broadly used to facilitate work progress or safety monitoring on construction sites. Without human intervention, however, both real-time reasoning about captured scenes and interpretation of recorded images are challenging tasks. This article presents an exploratory method for automated object identification using standard video cameras on construction sites. The proposed method supports real-time detection and classification of mobile heavy equipment and workers. The background subtraction algorithm extracts motion pixels from an image sequence, the pixels are then grouped into regions to represent moving objects, and finally the regions are identified as a certain object using classifiers. For evaluating the method, the formulated computer-aided process was implemented on actual construction sites, and promising results were obtained. This article is expected to contribute to future applications of automated monitoring systems of work zone safety or productivity.
Resumo:
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
Resumo:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This final report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
Resumo:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This Industry focused report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
Resumo:
Objectives. Intrusive memories of extreme trauma can disrupt a stepwise approach to imaginal exposure. Concurrent tasks that load the visuospatial sketchpad (VSSP) of working memory reduce the vividness of recalled images. This study tested whether relief of distress from competing VSSP tasks during imaginal exposure is at the cost of impaired desensitization . Design. This study examined repeated exposure to emotive memories using 18 unselected undergraduates and a within-subjects design with three exposure conditions (Eye Movement, Visual Noise, Exposure Alone) in random, counterbalanced order. Method. At baseline, participants recalled positive and negative experiences, and rated the vividness and emotiveness of each image. A different positive and negative recollection was then used for each condition. Vividness and emotiveness were rated after each of eight exposure trials. At a post-exposure session 1 week later, participants rated each image without any concurrent task. Results. Consistent with previous research, vividness and distress during imaging were lower during Eye Movements than in Exposure Alone, with passive visual interference giving intermediate results. A reduction in emotional responses from Baseline to Post was of similar size for the three conditions. Conclusion. Visuospatial tasks may offer a temporary response aid for imaginal exposure without affecting desensitization.
Resumo:
A set of five tasks was designed to examine dynamic aspects of visual attention: selective attention to color, selective attention to pattern, dividing and switching attention between color and pattern, and selective attention to pattern with changing target. These varieties of visual attention were examined using the same set of stimuli under different instruction sets; thus differences between tasks cannot be attributed to differences in the perceptual features of the stimuli. ERP data are presented for each of these tasks. A within-task analysis of different stimulus types varying in similarity to the attended target feature revealed that an early frontal selection positivity (FSP) was evident in selective attention tasks, regardless of whether color was the attended feature. The scalp distribution of a later posterior selection negativity (SN) was affected by whether the attended feature was color or pattern. The SN was largely unaffected by dividing attention across color and pattern. A large widespread positivity was evident in most conditions, consisting of at least three subcomponents which were differentially affected by the attention conditions. These findings are discussed in relation to prior research and the time course of visual attention processes in the brain.
Resumo:
It has been claimed that the symptoms of post-traumatic stress disorder (PTSD) can be ameliorated by eye-movement desensitization-reprocessing therapy (EMD-R), a procedure that involves the individual making saccadic eye-movements while imagining the traumatic event. We hypothesized that these eye-movements reduce the vividness of distressing images by disrupting the function of the visuospatial sketchpad (VSSP) of working memory, and that by doing so they reduce the intensity of the emotion associated with the image. This hypothesis was tested by asking non-PTSD participants to form images of neutral and negative pictures under dual task conditions. Their images were less vivid with concurrent eye-movements and with a concurrent spatial tapping task that did not involve eye-movements. In the first three experiments, these secondary tasks did not consistently affect participants' emotional responses to the images. However, Expt 4 used personal recollections as stimuli for the imagery task, and demonstrated a significant reduction in emotional response under the same dual task conditions. These results suggest that, if EMD-R works, it does so by reducing the vividness and emotiveness of traumatic images via the VSSP of working memory. Other visuospatial tasks may also be of therapeutic value.
Resumo:
Surveillance networks are typically monitored by a few people, viewing several monitors displaying the camera feeds. It is then very difficult for a human operator to effectively detect events as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data, to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as a means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies. To recognize an event people and objects must be tracked. Tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or due to the detection routines being unable to split occluded regions into their component objects. Particle filters can be used as a self contained tracking system, and make it unnecessary for the task of detection to be carried out separately except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point each frame. Such systems however do not easily allow for multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter. A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow in surveillance video frames is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Evaluation du Traitement et de lInterpretation de Sequences vidEO - Evaluation for video understanding) database, and significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and maintenance of identity is handled by the underlying tracking system; and the tracking system is able to benefit from the improved performance in uncertain conditions arising from occlusion and noise provided by a particle filter. The system is evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either mode individually. Findings from the thesis contribute to improve the performance of semi-automated video processing and therefore improve security in areas under surveillance.