923 resultados para medical image processing
Resumo:
Video surveillance systems using Closed Circuit Television (CCTV) cameras, is one of the fastest growing areas in the field of security technologies. However, the existing video surveillance systems are still not at a stage where they can be used for crime prevention. The systems rely heavily on human observers and are therefore limited by factors such as fatigue and monitoring capabilities over long periods of time. This work attempts to address these problems by proposing an automatic suspicious behaviour detection which utilises contextual information. The utilisation of contextual information is done via three main components: a context space model, a data stream clustering algorithm, and an inference algorithm. The utilisation of contextual information is still limited in the domain of suspicious behaviour detection. Furthermore, it is nearly impossible to correctly understand human behaviour without considering the context where it is observed. This work presents experiments using video feeds taken from CAVIAR dataset and a camera mounted on one of the buildings Z-Block) at the Queensland University of Technology, Australia. From these experiments, it is shown that by exploiting contextual information, the proposed system is able to make more accurate detections, especially of those behaviours which are only suspicious in some contexts while being normal in the others. Moreover, this information gives critical feedback to the system designers to refine the system.
Resumo:
Automatic species recognition plays an important role in assisting ecologists to monitor the environment. One critical issue in this research area is that software developers need prior knowledge of specific targets people are interested in to build templates for these targets. This paper proposes a novel approach for automatic species recognition based on generic knowledge about acoustic events to detect species. Acoustic component detection is the most critical and fundamental part of this proposed approach. This paper gives clear definitions of acoustic components and presents three clustering algorithms for detecting four acoustic components in sound recordings; whistles, clicks, slurs, and blocks. The experiment result demonstrates that these acoustic component recognisers have achieved high precision and recall rate.
Resumo:
The automated extraction of roads from aerial imagery can be of value for tasks including mapping, surveillance and change detection. Unfortunately, there are no public databases or standard evaluation protocols for evaluating these techniques. Many techniques are further hindered by a reliance on manual initialisation, making large scale application of the techniques impractical. In this paper, we present a public database and evaluation protocol for the evaluation of road extraction algorithms, and propose an improved automatic seed finding technique to initialise road extraction, based on a combination of geometric and colour features.
Resumo:
Rats are superior to the most advanced robots when it comes to creating and exploiting spatial representations. A wild rat can have a foraging range of hundreds of meters, possibly kilometers, and yet the rodent can unerringly return to its home after each foraging mission, and return to profitable foraging locations at a later date (Davis, et al., 1948). The rat runs through undergrowth and pipes with few distal landmarks, along paths where the visual, textural, and olfactory appearance constantly change (Hardy and Taylor, 1980; Recht, 1988). Despite these challenges the rat builds, maintains, and exploits internal representations of large areas of the real world throughout its two to three year lifetime. While algorithms exist that allow robots to build maps, the questions of how to maintain those maps and how to handle change in appearance over time remain open. The robotic approach to map building has been dominated by algorithms that optimise the geometry of the map based on measurements of distances to features. In a robotic approach, measurements of distance to features are taken with range-measuring devices such as laser range finders or ultrasound sensors, and in some cases estimates of depth from visual information. The features are incorporated into the map based on previous readings of other features in view and estimates of self-motion. The algorithms explicitly model the uncertainty in measurements of range and the measurement of self-motion, and use probability theory to find optimal solutions for the geometric configuration of the map features (Dissanayake, et al., 2001; Thrun and Leonard, 2008). Some of the results from the application of these algorithms have been impressive, ranging from three-dimensional maps of large urban strucutures (Thrun and Montemerlo, 2006) to natural environments (Montemerlo, et al., 2003).
Resumo:
In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it to only be able to correctly pick the best matching image from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountain biking visual dataset where there was significant perceptual or environment change between repeated traverses of the environment, and compared performance to applying the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
Resumo:
This paper addresses the challenge of developing robots that map and navigate autonomously in real world, dynamic environments throughout the robot’s entire lifetime – the problem of lifelong navigation. Static mapping algorithms can produce highly accurate maps, but have found few applications in real environments that are in constant flux. Environments change in many ways: both rapidly and gradually, transiently and permanently, geometrically and in appearance. This paper demonstrates a biologically inspired navigation algorithm, RatSLAM, that uses principles found in rodent neural circuits. The algorithm is demonstrated in an office delivery challenge where the robot was required to perform mock deliveries to goal locations in two different buildings. The robot successfully completed 1177 out of 1178 navigation trials over 37 hours of around the clock operation spread over 11 days.
Resumo:
In this paper an existing method for indoor Simultaneous Localisation and Mapping (SLAM) is extended to operate in large outdoor environments using an omnidirectional camera as its principal external sensor. The method, RatSLAM, is based upon computational models of the area in the rat brain that maintains the rodent’s idea of its position in the world. The system uses the visual appearance of different locations to build hybrid spatial-topological maps of places it has experienced that facilitate relocalisation and path planning. A large dataset was acquired from a dynamic campus environment and used to verify the system’s ability to construct representations of the world and simultaneously use these representations to maintain localisation.
Resumo:
The head direction (HD) system in mammals contains neurons that fire to represent the direction the animal is facing in its environment. The ability of these cells to reliably track head direction even after the removal of external sensory cues implies that the HD system is calibrated to function effectively using just internal (proprioceptive and vestibular) inputs. Rat pups and other infant mammals display stereotypical warm-up movements prior to locomotion in novel environments, and similar warm-up movements are seen in adult mammals with certain brain lesion-induced motor impairments. In this study we propose that synaptic learning mechanisms, in conjunction with appropriate movement strategies based on warm-up movements, can calibrate the HD system so that it functions effectively even in darkness. To examine the link between physical embodiment and neural control, and to determine that the system is robust to real-world phenomena, we implemented the synaptic mechanisms in a spiking neural network and tested it on a mobile robot platform. Results show that the combination of the synaptic learning mechanisms and warm-up movements are able to reliably calibrate the HD system so that it accurately tracks real-world head direction, and that calibration breaks down in systematic ways if certain movements are omitted. This work confirms that targeted, embodied behaviour can be used to calibrate neural systems, demonstrates that ‘grounding’ of modeled biological processes in the real world can reveal underlying functional principles (supporting the importance of robotics to biology), and proposes a functional role for stereotypical behaviours seen in infant mammals and those animals with certain motor deficits. We conjecture that these calibration principles may extend to the calibration of other neural systems involved in motion tracking and the representation of space, such as grid cells in entorhinal cortex.
Resumo:
This paper presents a novel technique for performing SLAM along a continuous trajectory of appearance. Derived from components of FastSLAM and FAB-MAP, the new system dubbed Continuous Appearance-based Trajectory SLAM (CAT-SLAM) augments appearancebased place recognition with particle-filter based ‘pose filtering’ within a probabilistic framework, without calculating global feature geometry or performing 3D map construction. For loop closure detection CAT-SLAM updates in constant time regardless of map size. We evaluate the effectiveness of CAT-SLAM on a 16km outdoor road network and determine its loop closure performance relative to FAB-MAP. CAT-SLAM recognizes 3 times the number of loop closures for the case where no false positives occur, demonstrating its potential use for robust loop closure detection in large environments.
Resumo:
In this paper, we present a new algorithm for boosting visual template recall performance through a process of visual expectation. Visual expectation dynamically modifies the recognition thresholds of learnt visual templates based on recently matched templates, improving the recall of sequences of familiar places while keeping precision high, without any feedback from a mapping backend. We demonstrate the performance benefits of visual expectation using two 17 kilometer datasets gathered in an outdoor environment at two times separated by three weeks. The visual expectation algorithm provides up to a 100% improvement in recall. We also combine the visual expectation algorithm with the RatSLAM SLAM system and show how the algorithm enables successful mapping
Resumo:
The Lingodroids are a pair of mobile robots that evolve a language for places and relationships between places (based on distance and direction). Each robot in these studies has its own understanding of the layout of the world, based on its unique experiences and exploration of the environment. Despite having different internal representations of the world, the robots are able to develop a common lexicon for places, and then use simple sentences to explain and understand relationships between places even places that they could not physically experience, such as areas behind closed doors. By learning the language, the robots are able to develop representations for places that are inaccessible to them, and later, when the doors are opened, use those representations to perform goal-directed behavior.
Resumo:
Thermal-infrared images have superior statistical properties compared with visible-spectrum images in many low-light or no-light scenarios. However, a detailed understanding of feature detector performance in the thermal modality lags behind that of the visible modality. To address this, the first comprehensive study on feature detector performance on thermal-infrared images is conducted. A dataset is presented which explores a total of ten different environments with a range of statistical properties. An investigation is conducted into the effects of several digital and physical image transformations on detector repeatability in these environments. The effect of non-uniformity noise, unique to the thermal modality, is analyzed. The accumulation of sensor non-uniformities beyond the minimum possible level was found to have only a small negative effect. A limiting of feature counts was found to improve the repeatability performance of several detectors. Most other image transformations had predictable effects on feature stability. The best-performing detector varied considerably depending on the nature of the scene and the test.
Resumo:
Recent algorithms for monocular motion capture (MoCap) estimate weak-perspective camera matrices between images using a small subset of approximately-rigid points on the human body (i.e. the torso and hip). A problem with this approach, however, is that these points are often close to coplanar, causing canonical linear factorisation algorithms for rigid structure from motion (SFM) to become extremely sensitive to noise. In this paper, we propose an alternative solution to weak-perspective SFM based on a convex relaxation of graph rigidity. We demonstrate the success of our algorithm on both synthetic and real world data, allowing for much improved solutions to marker less MoCap problems on human bodies. Finally, we propose an approach to solve the two-fold ambiguity over bone direction using a k-nearest neighbour kernel density estimator.
Resumo:
We propose an approach to employ eigen light-fields for face recognition across pose on video. Faces of a subject are collected from video frames and combined based on the pose to obtain a set of probe light-fields. These probe data are then projected to the principal subspace of the eigen light-fields within which the classification takes place. We modify the original light-field projection and found that it is more robust in the proposed system. Evaluation on VidTIMIT dataset has demonstrated that the eigen light-fields method is able to take advantage of multiple observations contained in the video.