Digital Library

536 results for colour image processing

Automatic surveillance in transportation hubs: No longer just about catching the bad guy

Relevance:

80.00%

Abstract:

As critical infrastructure such as transportation hubs continues to grow in complexity, greater importance is placed on monitoring these facilities to ensure their secure and efficient operation. To achieve these goals, technology continues to evolve in response to the needs of various types of infrastructure. To date, however, surveillance technology has been primarily concerned with security, and little attention has been paid to assisting operations and monitoring performance in real time. Consequently, solutions have emerged to provide real-time measurements of queues and crowding in spaces, but these have been installed as system add-ons (rather than making better use of existing infrastructure), resulting in expensive infrastructure outlay for the owner/operator and an overload of surveillance systems, which in itself creates further complexity. Given that many critical infrastructure facilities already have camera networks installed, it is much more desirable to better utilise these networks to address operational monitoring as well as security needs. Recently, a growing number of approaches have been proposed to monitor operational aspects such as pedestrian throughput, crowd size and dwell times. In this paper, we explore how these techniques relate to and complement the more commonly seen security analytics, and demonstrate the value that operational analytics can add by evaluating their performance on airport surveillance data. We explore how multiple analytics and systems can be combined to better leverage the large amount of data that is available, and we discuss the applicability and resulting benefits of the proposed framework for the ongoing operation of airports and airport networks.

Improving deep convolutional neural networks with unsupervised feature learning

Relevance:

80.00%

Abstract:

The latest generation of Deep Convolutional Neural Networks (DCNNs) has dramatically advanced challenging computer vision tasks, especially object detection and object classification, achieving state-of-the-art performance in several computer vision tasks including text recognition, sign recognition, face recognition and scene understanding. The depth of these supervised networks has enabled the learning of deeper, hierarchical feature representations. In parallel, unsupervised deep learning methods such as the Convolutional Deep Belief Network (CDBN) have also achieved state-of-the-art results in many computer vision tasks. However, there is very limited research on jointly exploiting the strengths of these two approaches. In this paper, we investigate the learning capability of both methods. We compare the output of individual layers and show that many learnt filters and outputs of the corresponding layers are very similar for both approaches. Stacking the DCNN on top of unsupervised layers, or replacing layers in the DCNN with the corresponding learnt layers in the CDBN, can improve recognition/classification accuracy and reduce the computational expense of training. We demonstrate the validity of the proposal on the ImageNet dataset.

The vertebral venous system in healthy and scoliotic adolescent spines - a 3D MRI investigation

Relevance:

80.00%

Abstract:

Introduction. The venous drainage system within vertebral bodies (VBs) has been well documented previously in cadaveric specimens. Advances in 3D imaging and image processing now allow for in vivo quantification of larger venous vessels, such as the basivertebral vein. Differences between healthy and scoliotic VB veins can therefore be investigated. Methods. 20 healthy adolescent controls and 21 adolescent idiopathic scoliosis (AIS) patients were recruited (with ethics approval) to undergo 3D MRI, using a 3 Tesla, T1-weighted 3D gradient echo sequence, resulting in 512 slices across the thoraco-lumbar spine, with a voxel size of 0.5 x 0.5 x 0.5 mm. Using Amira Filament Editor, five transverse slices through the VB were examined simultaneously and the resulting observable vascular network traced. Each VB was assessed, and a vascular network recorded when observable. A local coordinate system was created in the centre of each VB and the vascular networks aligned to this. The length of the vascular network on the left and right sides (with a small central region) of the VB was calculated, and the spatial patterning of the networks assessed level-by-level within each subject. Results. An average of 6 (range 4-10) vascular networks, consistent with descriptions of the basivertebral vein, were identifiable within each subject, most commonly between T10-L1. Differences were seen in the left/right distribution of vessels in the control and AIS subjects. Healthy controls showed a percentage distribution of 29:18:53 across the left:centre:right regions respectively, whereas the AIS subjects had a slightly shifted distribution of 33:25:42. The control group showed consistent spatial patterning of the vascular networks across most levels, but this was not seen in the AIS group. Conclusion. Observation and quantification of the basivertebral vein in vivo is possible using 3D MRI. The AIS group lacked the spatial pattern repetition seen in the control group, and minor differences were seen in the left/right distribution of vessels.

Advanced CUBIC protocols for whole-brain and whole-body clearing and imaging

Relevance:

80.00%

Abstract:

Here we describe a protocol for advanced CUBIC (Clear, Unobstructed Brain/Body Imaging Cocktails and Computational analysis). The CUBIC protocol enables simple and efficient organ clearing, rapid imaging by light-sheet microscopy and quantitative imaging analysis of multiple samples. The organ or body is cleared by immersion for 1–14 d, with the exact time required dependent on the sample type and the experimental purposes. A single imaging set can be completed in 30–60 min. Image processing and analysis can take <1 d, but it is dependent on the number of samples in the data set. The CUBIC clearing protocol can process multiple samples simultaneously. We previously used CUBIC to image whole-brain neural activities at single-cell resolution using Arc-dVenus transgenic (Tg) mice. CUBIC informatics calculated the Venus signal subtraction, comparing different brains at a whole-organ scale. These protocols provide a platform for organism-level systems biology by comprehensively detecting cells in a whole organ or body.

A biomechanical approach to iris normalization

Relevance:

80.00%

Abstract:

The richness of the iris texture and its variability across individuals make it a useful biometric trait for personal authentication. One of the key stages in classical iris recognition is the normalization process, where the annular iris region is mapped to a dimensionless pseudo-polar coordinate system. This process results in a rectangular structure that can be used to compensate for differences in scale and variations in pupil size. Most iris recognition methods in the literature adopt linear sampling in the radial and angular directions when performing iris normalization. In this paper, a biomechanical model of the iris is used to define a novel nonlinear normalization scheme that improves iris recognition accuracy under different degrees of pupil dilation. The proposed biomechanical model is used to predict the radial displacement of any point in the iris at a given dilation level, and this information is incorporated in the normalization process. Experimental results on the WVU pupil light reflex database (WVU-PLR) indicate the efficacy of the proposed technique, especially when matching iris images with large differences in pupil size.
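The conventional linear sampling that this abstract contrasts itself with can be sketched as follows. This is a minimal illustration, not the paper's biomechanical scheme: it assumes the pupil and iris boundaries are concentric circles, uses nearest-pixel lookup, and the function name and parameters are hypothetical.

```python
import math

def normalize_iris_linear(image, center, r_pupil, r_iris, n_radial=8, n_angular=16):
    """Unwrap the annular iris region into an n_radial x n_angular rectangle.

    Illustrative assumptions: `image` is a list of rows of grey values, and the
    pupil and iris boundaries are concentric circles around `center` = (cx, cy).
    """
    if n_radial < 2:
        raise ValueError("n_radial must be at least 2")
    cx, cy = center
    height, width = len(image), len(image[0])
    strip = []
    for i in range(n_radial):
        # Linear radial sampling: rho runs from 0 (pupil edge) to 1 (iris edge).
        rho = i / (n_radial - 1)
        r = r_pupil + rho * (r_iris - r_pupil)
        row = []
        for j in range(n_angular):
            theta = 2.0 * math.pi * j / n_angular
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            # Clamp so that samples near the image border stay valid.
            x = min(max(x, 0), width - 1)
            y = min(max(y, 0), height - 1)
            row.append(image[y][x])
        strip.append(row)
    return strip
```

Because `rho` is dimensionless, the output has the same size whatever the pupil radius is. A nonlinear scheme such as the one proposed would, in effect, replace the linear expression for `r` with a displacement predicted by a model of the iris tissue.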

Automating marine mammal detection in aerial images captured during wildlife surveys: A deep learning approach

Relevance:

80.00%

Abstract:

Aerial surveys conducted using manned or unmanned aircraft with customized camera payloads can generate a large number of images. Manual review of these images to extract data is prohibitive in terms of time and financial resources, thus providing strong incentive to automate this process using computer vision systems. There are potential applications for these automated systems in areas such as surveillance and monitoring, precision agriculture, law enforcement, asset inspection, and wildlife assessment. In this paper, we present an efficient machine learning system for automating the detection of marine species in aerial imagery. The effectiveness of our approach can be credited to the combination of a well-suited region proposal method and the use of Deep Convolutional Neural Networks (DCNNs). In comparison to previous algorithms designed for the same purpose, we have been able to dramatically improve recall to more than 80% and improve precision to 27% by using DCNNs as the core approach.
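To make the reported operating point concrete, the short sketch below computes precision and recall from detection counts. The counts are hypothetical, chosen only so that the result lands near 80% recall and 27% precision; they are not taken from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a detector, guarding against empty sets."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical survey: 100 animals present, 80 found, 216 false alarms.
precision, recall = precision_recall(
    true_positives=80, false_positives=216, false_negatives=20
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

At such an operating point most animals are found, but roughly three out of four detections would still need to be discarded by a human reviewer.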

Detection of anuran calling activity in long field recordings for bio-acoustic monitoring

Relevance:

80.00%

Abstract:

This paper presents a system to analyze long field recordings with low signal-to-noise ratio (SNR) for bio-acoustic monitoring. A method based on spectral peak tracks, Shannon entropy, harmonic structure and oscillation structure is proposed to automatically detect anuran (frog) calling activity. A Gaussian mixture model (GMM) is introduced to model those features. Four anuran species widespread in Queensland, Australia, are selected to evaluate the proposed system. A visualization method based on extracted indices is employed for detection of anuran calling activity, which achieves high accuracy.
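One of the listed features, Shannon entropy, can be sketched for a single spectrogram frame as follows. The abstract does not give the exact definition used, so the normalisation to the range [0, 1] and the function name are assumptions made for illustration.

```python
import math

def spectral_entropy(power_spectrum):
    """Normalised Shannon entropy of one frame's power spectrum (0 to 1)."""
    total = sum(power_spectrum)
    if total <= 0 or len(power_spectrum) < 2:
        return 0.0
    # Treat the spectrum as a probability distribution over frequency bins.
    probs = [p / total for p in power_spectrum if p > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Divide by the maximum possible entropy (uniform spectrum).
    return entropy / math.log2(len(power_spectrum))

noise_like = spectral_entropy([1.0] * 8)                 # flat spectrum
tonal = spectral_entropy([0.1, 0.1, 9.0, 0.1, 0.1, 0.1, 0.1, 0.1])
```

Under this definition a flat, noise-like frame scores close to 1, while a frame dominated by a narrow-band call scores much lower, which is why entropy can help separate calling activity from background noise.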

Acoustic classification of Australian anurans using syllable features

Relevance:

80.00%

Abstract:

Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environmental studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate the desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experimental results show that syllable features can achieve an average classification accuracy of 90.5%, which outperforms Mel-frequency cepstral coefficient features (79.0%).
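The final classification step can be sketched as a plain k-nearest-neighbour vote. This is an illustrative sketch only: it assumes the syllable features have already been reduced (for example by principal component analysis, which is not shown), and the feature values and species labels below are made up.

```python
import math
from collections import Counter

def knn_classify(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors (e.g. two principal components per syllable).
train = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1), (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
labels = ["species_a"] * 3 + ["species_b"] * 3
predicted = knn_classify(train, labels, (0.15, 0.15))
```

Euclidean distance is used here for simplicity; the abstract does not state which distance measure or value of k was used.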

From vision to actions: Towards adaptive and autonomous humanoid robots

Relevance:

80.00%

Abstract:

Although robotics research has seen advances over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together with, helping and coexisting with humans in daily life. In all of these, a clear need arises to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of the research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, an integration of computer vision and machine learning techniques is employed. From a robot vision point of view, combining domain knowledge from both image processing and machine learning techniques can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in the incoming camera streams and has been successfully demonstrated on many different problem domains. The approach is fast, scalable and robust, yet requires only very small training sets (it was tested with 5 to 10 images per experiment). Additionally, it can generate human-readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proofs of concept that integrate the motion and action sides. First, reactive reaching and grasping is shown. It allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables us to use the robot in non-static environments, i.e. the reaching is adapted on the fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by improving the visual detection through object manipulation actions.

Compact features for birdcall retrieval from environmental acoustic recordings

Relevance:

80.00%

Abstract:

Bioacoustic data can be used for monitoring animal species diversity. The deployment of acoustic sensors enables acoustic monitoring at large temporal and spatial scales. We describe a content-based birdcall retrieval algorithm for the exploration of large databases of acoustic recordings. In the algorithm, an event-based searching scheme and compact features are developed. In detail, ridge events are detected from audio files using event detection on spectral ridges. Then event alignment is used to search through audio files to locate candidate instances. A similarity measure is then applied to dimension-reduced spectral ridge feature vectors. The event-based searching method processes a smaller list of instances for faster retrieval. The experimental results demonstrate that our features achieve a better success rate than existing methods and that the feature dimension is greatly reduced.

Automated topometric graph generation from floor plan analysis

Relevance:

80.00%

Abstract:

The world is rich with information such as signage and maps to assist humans to navigate. We present a method to extract topological spatial information from a generic bitmap floor plan and build a topometric graph that can be used by a mobile robot for tasks such as path planning and guided exploration. The algorithm first detects and extracts text in an image of the floor plan. Using the locations of the extracted text, flood fill is used to find the rooms and hallways. Doors are found by matching SURF features and these form the connections between rooms, which are the edges of the topological graph. Our system is able to automatically detect doors and differentiate between hallways and rooms, which is important for effective navigation. We show that our method can extract a topometric graph from a floor plan and is robust against ambiguous cases most commonly seen in floor plans including elevators and stairwells.
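The room-finding step described above can be sketched with a breadth-first flood fill seeded at a text location. This is a minimal sketch under simplifying assumptions: the floor plan is already binarised into a grid (0 = free space, 1 = wall or closed door), the seed is a hypothetical text-label position, and text detection and SURF-based door matching are not shown.

```python
from collections import deque

def flood_fill_region(grid, seed):
    """Return the set of free cells 4-connected to `seed`, given as (row, col)."""
    rows, cols = len(grid), len(grid[0])
    if grid[seed[0]][seed[1]] != 0:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in region:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Two rooms separated by a wall; seeds stand in for detected room labels.
plan = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
room_a = flood_fill_region(plan, (1, 1))
room_b = flood_fill_region(plan, (1, 4))
```

Each filled region would become a node of the graph, and a detected door between two regions would become the edge that connects them.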

Learning Temporal Alignment Uncertainty for Efficient Event Detection

Relevance:

80.00%

Abstract:

In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred in this regard due to their scalability and efficiency during estimation and evaluation. A popular approach in this regard is to represent a sequence using a bag of words (BOW) representation due to: (i) its fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics in the sequence. A drawback of the BOW representation, however, is the intrinsic destruction of the temporal ordering information. In this paper we propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, is of a fixed dimensionality, making it easily integrated with a linear detection function. Extensive experiments on the CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
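The two properties of the BOW baseline, and its drawback, can be seen in a small sketch. It assumes each frame has already been quantised to an integer codeword index; the function names, vocabulary size, and weights are hypothetical, and the paper's own proposed representation is not reproduced here.

```python
def bag_of_words(codeword_sequence, vocabulary_size):
    """Normalised histogram of codeword indices; length is fixed by the vocabulary."""
    histogram = [0.0] * vocabulary_size
    for word in codeword_sequence:
        histogram[word] += 1.0
    n = len(codeword_sequence)
    return [count / n for count in histogram] if n else histogram

def linear_score(weights, bias, features):
    """Linear detection function: a dot product plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

forward = bag_of_words([0, 1, 2, 1], vocabulary_size=3)
shuffled = bag_of_words([1, 2, 1, 0], vocabulary_size=3)
score = linear_score([0.5, -1.0, 2.0], 0.1, forward)
```

Here `forward` and `shuffled` are identical even though the frames occur in a different order, so any linear detector applied to them must return the same score. That loss of temporal ordering is the drawback the abstract sets out to address.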

Unsupervised Domain Adaptation by Domain Invariant Projection

Relevance:

80.00%

Abstract:

Domain-invariant representations are key to addressing the domain shift problem where the training and test examples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some of the features may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: An unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and target domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a standard domain adaptation benchmark dataset.
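The idea of comparing source and target distributions after a projection can be illustrated as below. The abstract does not name the distance it minimises, so this sketch assumes a Gaussian-kernel maximum mean discrepancy (MMD) as one plausible choice. It only evaluates the distance for fixed, hand-picked projections; learning the projection is not shown, and all names and data are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma):
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def project(W, x):
    """Apply a linear projection; W is a list of rows (latent_dim x input_dim)."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]

def mmd_squared(source, target, W, sigma=1.0):
    """Biased estimate of squared MMD between the projected samples."""
    s = [project(W, x) for x in source]
    t = [project(W, x) for x in target]

    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(s, s) + mean_kernel(t, t) - 2.0 * mean_kernel(s, t)

# Toy data: the domains differ only in the second (domain-specific) feature.
source = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
target = [(0.0, 3.0), (1.0, 3.1), (2.0, 2.9)]
invariant = mmd_squared(source, target, W=[[1.0, 0.0]])  # keep feature 1 only
shifted = mmd_squared(source, target, W=[[0.0, 1.0]])    # keep feature 2 only
```

In this toy case the projection that discards the shifted feature gives a distance near zero, while the projection that keeps it gives a large one. A method of the kind described would search for the projection with the smallest distance instead of having it chosen by hand.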

Enhancing human action recognition with region proposals

Relevance:

80.00%

Abstract:

Deep convolutional network models have dominated recent work in human action recognition as well as image classification. However, these methods are often unduly influenced by the image background, learning and exploiting the presence of cues in typical computer vision datasets. For unbiased robotics applications, the degree of variation and novelty in action backgrounds is far greater than in computer vision datasets. To address this challenge, we propose an “action region proposal” method that, informed by optical flow, extracts image regions likely to contain actions for input into the network during both training and testing. In a range of experiments, we demonstrate that manually segmenting the background is not enough, but that through active action region proposals during training and testing, state-of-the-art or better performance can be achieved on individual spatial and temporal video components. Finally, we show that by focusing attention through action region proposals, we can further improve upon the existing state of the art in spatio-temporally fused action recognition performance.

Class-specific sparse codes for representing activities

Relevance:

80.00%

Abstract:

In this paper we investigate the effectiveness of class-specific sparse codes in the context of discriminative action classification. The bag-of-words representation is widely used in activity recognition to encode features, and although it yields state-of-the-art performance with several feature descriptors, it still suffers from large quantization errors, which reduce overall performance. Recently proposed sparse representation methods have been shown to effectively represent features as a linear combination of atoms from an overcomplete dictionary by minimizing the reconstruction error. In contrast to most sparse representation methods, which focus on Sparse-Reconstruction based Classification (SRC), this paper focuses on discriminative classification using an SVM by constructing class-specific sparse codes for motion and appearance separately. Experimental results demonstrate that separate motion-specific and appearance-specific sparse coefficients provide the most effective and discriminative representation for each class compared to a single set of class-specific sparse coefficients.
