60 results for Computer Imaging, Vision, Pattern Recognition and Graphics
Abstract:
A vision system for recognizing rigid and articulated three-dimensional objects in two-dimensional images is described. Geometrical models are extracted from a commercial computer aided design package. The models are then augmented with appearance and functional information which improves the system's hypothesis generation, hypothesis verification, and pose refinement. Significant advantages over existing CAD-based vision systems, which utilize only information available in the CAD system, are realized. Examples show the system recognizing, locating, and tracking a variety of objects in a robot work-cell and in natural scenes.
Abstract:
An algorithm for tracking multiple feature positions in a dynamic image sequence is presented. This is achieved using a combination of two trajectory-based methods, with the resulting hybrid algorithm exhibiting the advantages of both. An optimizing exchange algorithm is described which enables short feature paths to be tracked without prior knowledge of the motion being studied. The resulting partial trajectories are then used to initialize a fast predictor algorithm which is capable of rapidly tracking multiple feature paths. As this predictor algorithm becomes tuned to the feature positions being tracked, it is shown how the location of occluded or poorly detected features can be predicted. The results of applying this tracking algorithm to data obtained from real-world scenes are then presented.
Abstract:
The dynamics of inter-regional communication within the brain during cognitive processing – referred to as functional connectivity – are investigated as a control feature for a brain computer interface. EMDPL is used to map phase synchronization levels between all channel pair combinations in the EEG. This results in complex networks of channel connectivity at all time–frequency locations. The mean clustering coefficient is then used as a descriptive feature encapsulating information about inter-channel connectivity. Hidden Markov models are applied to characterize and classify dynamics of the resulting complex networks. Highly accurate levels of classification are achieved when this technique is applied to classify EEG recorded during real and imagined single finger taps. These results are compared to traditional features used in the classification of a finger tap BCI demonstrating that functional connectivity dynamics provide additional information and improved BCI control accuracies.
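As a rough illustration of the descriptive feature mentioned above, the sketch below computes a mean clustering coefficient from a matrix of pairwise channel synchronization levels. The abstract does not say how the network is built from those levels, so the hard threshold, the binary undirected graph, and the names `sync` and `threshold` are all assumptions made for this example.

```python
from itertools import combinations


def mean_clustering_coefficient(sync, threshold):
    """Mean local clustering coefficient of a thresholded synchronization graph.

    sync is a symmetric matrix (list of lists) of pairwise synchronization
    levels; threshold is a hypothetical cut-off, not taken from the abstract.
    """
    n = len(sync)
    # Assumption: channels i and j are linked when sync[i][j] >= threshold.
    neighbours = [
        {j for j in range(n) if j != i and sync[i][j] >= threshold}
        for i in range(n)
    ]
    coefficients = []
    for i in range(n):
        k = len(neighbours[i])
        if k < 2:
            # A node with fewer than two neighbours cannot form a triangle.
            coefficients.append(0.0)
            continue
        # Count links that exist between pairs of neighbours of node i.
        links = sum(
            1 for a, b in combinations(sorted(neighbours[i]), 2)
            if b in neighbours[a]
        )
        coefficients.append(2.0 * links / (k * (k - 1)))
    return sum(coefficients) / n
```

In a setting like the one described, such a value would be computed at each time-frequency location to give a feature sequence for a classifier; that pipeline is not reproduced here.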
Abstract:
There is a rising demand for the quantitative performance evaluation of automated video surveillance. To advance research in this area, it is essential that comparisons in detection and tracking approaches may be drawn and improvements in existing methods can be measured. There are a number of challenges related to the proper evaluation of motion segmentation, tracking, event recognition, and other components of a video surveillance system that are unique to the video surveillance community. These include the volume of data that must be evaluated, the difficulty in obtaining ground truth data, the definition of appropriate metrics, and achieving meaningful comparison of diverse systems. This chapter provides descriptions of useful benchmark datasets and their availability to the computer vision community. It outlines some ground truth and evaluation techniques, and provides links to useful resources. It concludes by discussing the future direction for benchmark datasets and their associated processes.
Abstract:
Analysis of human behaviour through visual information has been a highly active research topic in the computer vision community. This was previously achieved via images from a conventional camera, but recently depth sensors have made a new type of data available. This survey starts by explaining the advantages of depth imagery, then describes the new sensors that are available to obtain it. In particular, the Microsoft Kinect has made high-resolution real-time depth cheaply available. The main published research on the use of depth imagery for analysing human activity is reviewed. Much of the existing work focuses on body part detection and pose estimation. A growing research area addresses the recognition of human actions. The publicly available datasets that include depth imagery are listed, as are the software libraries that can acquire it from a sensor. This survey concludes by summarising the current state of work on this topic, and pointing out promising future research directions.
Abstract:
For general home monitoring, a system should automatically interpret people’s actions. The system should be non-intrusive, and able to deal with a cluttered background, and loose clothes. An approach based on spatio-temporal local features and a Bag-of-Words (BoW) model is proposed for single-person action recognition from combined intensity and depth images. To restore the temporal structure lost in the traditional BoW method, a dynamic time alignment technique with temporal binning is applied in this work, which has not been previously implemented in the literature for human action recognition on depth imagery. A novel human action dataset with depth data has been created using two Microsoft Kinect sensors. The ReadingAct dataset contains 20 subjects and 19 actions for a total of 2340 videos. To investigate the effect of using depth images and the proposed method, testing was conducted on three depth datasets, and the proposed method was compared to traditional Bag-of-Words methods. Results showed that the proposed method improves recognition accuracy when adding depth to the conventional intensity data, and has advantages when dealing with long actions.
Abstract:
OBJECTIVE: Assimilating the diagnosis of complete spinal cord injury (SCI) takes time and is not easy, as patients know that there is no 'cure' at the present time. Brain-computer interfaces (BCIs) can facilitate daily living. However, inter-subject variability demands measurements with potential user groups and an understanding of how they differ from the healthy users with whom BCIs are more commonly tested. Thus, a three-class motor imagery (MI) screening (left hand, right hand, feet) was performed with a group of 10 able-bodied and 16 complete spinal-cord-injured people (paraplegics, tetraplegics) with the objective of determining what differences were present between the user groups and how they would impact upon the ability of these user groups to interact with a BCI. APPROACH: Electrophysiological differences between patient groups and healthy users are measured in terms of sensorimotor rhythm deflections from baseline during MI, electroencephalogram microstate scalp maps and strengths of inter-channel phase synchronization. Additionally, using a common spatial pattern algorithm and a linear discriminant analysis classifier, the classification accuracy was calculated and compared between groups. MAIN RESULTS: It is seen that both patient groups (tetraplegic and paraplegic) have some significant differences in event-related desynchronization strengths, exhibit significant increases in synchronization and reach significantly lower accuracies (mean (M) = 66.1%) than the group of healthy subjects (M = 85.1%). SIGNIFICANCE: The results demonstrate significant differences in electrophysiological correlates of motor control between healthy individuals and those individuals who stand to benefit most from BCI technology (individuals with SCI). They highlight the difficulty in directly translating results from healthy subjects to participants with SCI and the challenges that, therefore, arise in providing BCIs to such individuals.
Abstract:
Dendritic cells (DC) can produce Th-polarizing cytokines and direct the class of the adaptive immune response. Microbial stimuli, cytokines, chemokines, and T cell-derived signals all have been shown to trigger cytokine synthesis by DC, but it remains unclear whether these signals are functionally equivalent and whether they determine the nature of the cytokine produced or simply initiate a preprogrammed pattern of cytokine production, which may be DC subtype specific. Here, we demonstrate that microbial and T cell-derived stimuli can synergize to induce production of high levels of IL-12 p70 or IL-10 by individual murine DC subsets but that the choice of cytokine is dictated by the microbial pattern recognition receptor engaged. We show that bacterial components such as CpG-containing DNA or extracts from Mycobacterium tuberculosis predispose CD8alpha(+) and CD8alpha(-)CD4(-) DC to make IL-12 p70. In contrast, exposure of CD8alpha(+), CD4(+) and CD8alpha(-)CD4(-) DC to heat-killed yeasts leads to production of IL-10. In both cases, secretion of high levels of cytokine requires a second signal from T cells, which can be replaced by CD40 ligand. Consistent with their differential effects on cytokine production, extracts from M. tuberculosis promote IL-12 production primarily via Toll-like receptor 2 and an MyD88-dependent pathway, whereas heat-killed yeasts activate DC via a Toll-like receptor 2-, MyD88-, and Toll/IL-1R domain containing protein-independent pathway. These results show that T cell feedback amplifies innate signals for cytokine production by DC and suggest that pattern recognition rather than ontogeny determines the production of cytokines by individual DC subsets.
Abstract:
A new formulation of a pose refinement technique using "active" models is described. An error term derived from the detection of image derivatives close to an initial object hypothesis is linearised and solved by least squares. The method is particularly well suited to problems involving external geometrical constraints (such as the ground-plane constraint). We show that the method is able to recover both the pose of a rigid model, and the structure of a deformable model. We report an initial assessment of the performance and cost of pose and structure recovery using the active model in comparison with our previously reported "passive" model-based techniques in the context of traffic surveillance. The new method is more stable, and requires fewer iterations, especially when the number of free parameters increases, but shows somewhat poorer convergence.
Abstract:
Flood modelling of urban areas is still at an early stage, partly because until recently topographic data of sufficiently high resolution and accuracy have been lacking in urban areas. However, Digital Surface Models (DSMs) generated from airborne scanning laser altimetry (LiDAR) having sub-metre spatial resolution have now become available, and these are able to represent the complexities of urban topography. The paper describes the development of a LiDAR post-processor for urban flood modelling based on the fusion of LiDAR and digital map data. The map data are used in conjunction with LiDAR data to identify different object types in urban areas, though pattern recognition techniques are also employed. Post-processing produces a Digital Terrain Model (DTM) for use as model bathymetry, and also a friction parameter map for use in estimating spatially-distributed friction coefficients. In vegetated areas, friction is estimated from LiDAR-derived vegetation height, and (unlike most vegetation removal software) the method copes with short vegetation less than ~1m high, which may occupy a substantial fraction of even an urban floodplain. The DTM and friction parameter map may also be used to help to generate an unstructured mesh of a vegetated urban floodplain for use by a 2D finite element model. The mesh is decomposed to reflect floodplain features having different frictional properties to their surroundings, including urban features such as buildings and roads as well as taller vegetation features such as trees and hedges. This allows a more accurate estimation of local friction. The method produces a substantial node density due to the small dimensions of many urban features.
Abstract:
In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3-D model of the scene based on known baseline information from interocular separation or proprioception as the observer walks. An alternative is that observers use view-based methods to guide their actions and to represent the spatial layout of the scene. In this case, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk. We describe the way in which the eye movement strategy of animals simplifies motion processing if their goal is to move towards a desired image and discuss dorsal and ventral stream processing of moving images in that context. Although many questions about view-based approaches to scene representation remain unanswered, the solutions are likely to be highly relevant to understanding biological 3-D vision.
Abstract:
Airborne scanning laser altimetry (LiDAR) is an important new data source for river flood modelling. LiDAR can give dense and accurate DTMs of floodplains for use as model bathymetry. Spatial resolutions of 0.5m or less are possible, with a height accuracy of 0.15m. LiDAR gives a Digital Surface Model (DSM), so vegetation removal software (e.g. TERRASCAN) must be used to obtain a DTM. An example used to illustrate the current state of the art will be the LiDAR data provided by the EA, which has been processed by their in-house software to convert the raw data to a ground DTM and separate vegetation height map. Their method distinguishes trees from buildings on the basis of object size. EA data products include the DTM with or without buildings removed, a vegetation height map, a DTM with bridges removed, etc. Most vegetation removal software ignores short vegetation less than say 1m high. We have attempted to extend vegetation height measurement to short vegetation using local height texture. Typically most of a floodplain may be covered in such vegetation. The idea is to assign friction coefficients depending on local vegetation height, so that friction is spatially varying. This obviates the need to calibrate a global floodplain friction coefficient. It’s not clear at present if the method is useful, but it’s worth testing further. The LiDAR DTM is usually determined by looking for local minima in the raw data, then interpolating between these to form a space-filling height surface. This is a low pass filtering operation, in which objects of high spatial frequency such as buildings, river embankments and walls may be incorrectly classed as vegetation. The problem is particularly acute in urban areas. A solution may be to apply pattern recognition techniques to LiDAR height data fused with other data types such as LiDAR intensity or multispectral CASI data. 
We are attempting to use digital map data (Mastermap structured topography data) to help to distinguish buildings from trees, and roads from areas of short vegetation. The problems involved in doing this will be discussed. A related problem of how best to merge historic river cross-section data with a LiDAR DTM will also be considered. LiDAR data may also be used to help generate a finite element mesh. In rural areas we have decomposed a floodplain mesh according to taller vegetation features such as hedges and trees, so that e.g. hedge elements can be assigned higher friction coefficients than those in adjacent fields. We are attempting to extend this approach to urban areas, so that the mesh is decomposed in the vicinity of buildings, roads, etc., as well as trees and hedges. A dominant points algorithm is used to identify points of high curvature on a building or road, which act as initial nodes in the meshing process. A difficulty is that the resulting mesh may contain a very large number of nodes. However, the mesh generated may be useful to allow a high resolution FE model to act as a benchmark for a more practical lower resolution model. A further problem discussed will be how best to exploit data redundancy due to the high resolution of the LiDAR compared to that of a typical flood model. Problems occur if features have dimensions smaller than the model cell size; e.g. for a 5m-wide embankment within a raster grid model with 15m cell size, the maximum height of the embankment locally could be assigned to each cell covering the embankment. But how could a 5m-wide ditch be represented? Again, this redundancy has been exploited to improve wetting/drying algorithms using the sub-grid-scale LiDAR heights within finite elements at the waterline.
Abstract:
Acute doses of Ginkgo biloba have been shown to improve attention and memory in young, healthy participants, but there has been a lack of investigation into possible effects on executive function. In addition, only one study has investigated the effects of chronic treatment in young volunteers. This study was conducted to compare the effects of ginkgo after acute and chronic treatment on tests of attention, memory and executive function in healthy university students. Using a placebo-controlled double-blind design, in experiment 1, 52 students were randomly allocated to receive a single dose of ginkgo (120 mg, n=26) or placebo (n=26), and were tested 4h later. In experiment 2, 40 students were randomly allocated to receive ginkgo (120 mg/day; n=20) or placebo (n=20) for a 6-week period and were tested at baseline and after 6 weeks of treatment. In both experiments, participants underwent tests of sustained attention, episodic and working memory, mental flexibility and planning, and completed mood rating scales. The acute dose of ginkgo significantly improved performance on the sustained-attention task and pattern-recognition memory task; however, there were no effects on working memory, planning, mental flexibility or mood. After 6 weeks of treatment, there were no significant effects of ginkgo on mood or any of the cognitive tests. In line with the literature, after acute administration ginkgo improved performance in tests of attention and memory. However, there were no effects after 6 weeks, suggesting that tolerance develops to the effects in young, healthy participants.
Abstract:
In this paper, a fuzzy Markov random field (FMRF) model is used to segment land-objects into tree, grass, building, and road regions by fusing remotely sensed LIDAR data and co-registered color bands, i.e. scanned aerial color (RGB) photo and near infra-red (NIR) photo. An FMRF model is defined as a Markov random field (MRF) model in a fuzzy domain. Three optimization algorithms in the FMRF model, i.e. Lagrange multiplier (LM), iterated conditional mode (ICM), and simulated annealing (SA), are compared with respect to the computational cost and segmentation accuracy. The results have shown that the FMRF model-based ICM algorithm balances the computational cost and segmentation accuracy in land-cover segmentation from LIDAR data and co-registered bands.
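For readers unfamiliar with iterated conditional modes, the sketch below shows ICM on an ordinary (crisp) MRF with a Potts smoothness prior and a single scalar feature per pixel. It is not the fuzzy MRF formulation of the paper. The squared-difference data term, the 4-neighbourhood, and the names `class_means` and `beta` are assumptions chosen for illustration.

```python
def icm_segment(image, class_means, beta=1.0, max_sweeps=10):
    """Label each pixel by iterated conditional modes on a Potts-model MRF.

    image is a list of rows of scalar values; class_means gives one
    representative value per class. Both energy terms are illustrative
    assumptions, not the paper's fuzzy MRF energy.
    """
    rows, cols = len(image), len(image[0])
    classes = range(len(class_means))

    def data_cost(r, c, k):
        return (image[r][c] - class_means[k]) ** 2

    # Initialise with the per-pixel best class, ignoring neighbours.
    labels = [
        [min(classes, key=lambda k: data_cost(r, c, k)) for c in range(cols)]
        for r in range(rows)
    ]

    def local_energy(r, c, k):
        # Potts prior: pay beta for every 4-neighbour with a different label.
        disagreements = 0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != k:
                disagreements += 1
        return data_cost(r, c, k) + beta * disagreements

    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Greedy update: pick the label with the lowest local energy
                # given the current labels of the neighbours.
                best = min(classes, key=lambda k: local_energy(r, c, k))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # A full sweep with no changes means a local minimum.
    return labels
```

ICM is greedy and converges to a local minimum of the energy, which is consistent with the trade-off between computational cost and accuracy that the abstract reports when comparing it with simulated annealing.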