281 resultados para histogram
Resumo:
This paper presents two algorithms to automate the detection of marine species in aerial imagery. An algorithm from an initial pilot study is presented in which morphology operations and colour analysis formed the basis of its working principle. A second approach is presented in which saturation channel and histogram-based shape profiling were used. We report on performance for both algorithms using datasets collected from an unmanned aerial system at an altitude of 1000 ft. Early results have demonstrated recall values of 48.57% and 51.4%, and precision values of 4.01% and 4.97%.
Resumo:
Background Physiotherapists are a professional group with a high rate of attrition and at high risk of musculoskeletal disorders. The purpose of this investigation was to examine the physical activity levels and health-related quality of life of physiotherapists working in metropolitan clinical settings in an Australian hospital and health service. It was hypothesized that practicing physiotherapists would report excellent health-related quality of life and would already be physically active. Such a finding would add weight to a claim that general physical activity conditioning strategies may not be useful for preventing musculoskeletal disorders among active healthy physiotherapists, but rather, future investigations should focus on the development and evaluation of role specific conditioning strategies. Methods A questionnaire was completed by 44 physiotherapists from three inpatient units and three ambulatory clinics (63.7% response rate). Physical activity levels were reported using the Active Australia Survey. Health-related quality of life was examined using the EQ-5D instrument. Physical activity and EQ-5D data were examined using conventional descriptive statistics; with domain responses for the EQ-5D presented in a frequency histogram. Results The majority of physiotherapists in this sample were younger than 30 years of age (n = 25, 56.8%) consistent with the presence of a high attrition rate. Almost all respondents exceeded minimum recommended physical activity guidelines (n = 40, 90.9%). Overall the respondents engaged in more vigorous physical activity (median = 180 minutes) and walking (median = 135 minutes) than moderate exercise (median = 35 minutes) each week. Thirty-seven (84.1%) participants reported no pain or discomfort impacting their health-related quality of life, with most (n = 35,79.5%) being in full health. Conclusions Physical-conditioning based interventions for the prevention of musculoskeletal disorders among practicing physiotherapists may be better targeted to role or task specific conditioning rather than general physical conditioning among this physically active population. It is plausible that an inherent attrition of physiotherapists may occur among those not as active or healthy as therapists who cope with the physical demands of clinical practice. Extrapolation of findings from this study may be limited due to the sample characteristics. However, this investigation addressed the study objectives and has provided a foundation for larger scale longitudinal investigations in this field.
Resumo:
Clustering identities in a broadcast video is a useful task to aid in video annotation and retrieval. Quality based frame selection is a crucial task in video face clustering, to both improve the clustering performance and reduce the computational cost. We present a frame work that selects the highest quality frames available in a video to cluster the face. This frame selection technique is based on low level and high level features (face symmetry, sharpness, contrast and brightness) to select the highest quality facial images available in a face sequence for clustering. We also consider the temporal distribution of the faces to ensure that selected faces are taken at times distributed throughout the sequence. Normalized feature scores are fused and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face clustering system. We present a news video database to evaluate the clustering system performance. Experiments on the newly created news database show that the proposed method selects the best quality face images in the video sequence, resulting in improved clustering performance.
Resumo:
Efficient and effective feature detection and representation is an important consideration when processing videos, and a large number of applications such as motion analysis, 3D scene understanding, tracking etc. depend on this. Amongst several feature description methods, local features are becoming increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational complexity, their performance is still too limited for real world applications. Furthermore, rapid increases in the uptake of mobile devices has increased the demand for algorithms that can run with reduced memory and computational requirements. In this paper we propose a semi binary based feature detectordescriptor based on the BRISK detector, which can detect and represent videos with significantly reduced computational requirements, while achieving comparable performance to the state of the art spatio-temporal feature descriptors. First, the BRISK feature detector is applied on a frame by frame basis to detect interest points, then the detected key points are compared against consecutive frames for significant motion. Key points with significant motion are encoded with the BRISK descriptor in the spatial domain and Motion Boundary Histogram in the temporal domain. This descriptor is not only lightweight but also has lower memory requirements because of the binary nature of the BRISK descriptor, allowing the possibility of applications using hand held devices.We evaluate the combination of detectordescriptor performance in the context of action classification with a standard, popular bag-of-features with SVM framework. Experiments are carried out on two popular datasets with varying complexity and we demonstrate comparable performance with other descriptors with reduced computational complexity.
Resumo:
A cell classification algorithm that uses first, second and third order statistics of pixel intensity distributions over pre-defined regions is implemented and evaluated. A cell image is segmented into 6 regions extending from a boundary layer to an inner circle. First, second and third order statistical features are extracted from histograms of pixel intensities in these regions. Third order statistical features used are one-dimensional bispectral invariants. 108 features were considered as candidates for Adaboost based fusion. The best 10 stage fused classifier was selected for each class and a decision tree constructed for the 6-class problem. The classifier is robust, accurate and fast by design.
Resumo:
Real-time image analysis and classification onboard robotic marine vehicles, such as AUVs, is a key step in the realisation of adaptive mission planning for large-scale habitat mapping in previously unexplored environments. This paper describes a novel technique to train, process, and classify images collected onboard an AUV used in relatively shallow waters with poor visibility and non-uniform lighting. The approach utilises Förstner feature detectors and Laws texture energy masks for image characterisation, and a bag of words approach for feature recognition. To improve classification performance we propose a usefulness gain to learn the importance of each histogram component for each class. Experimental results illustrate the performance of the system in characterisation of a variety of marine habitats and its ability to operate onboard an AUV's main processor suitable for real-time mission planning.
Resumo:
In this paper, we propose a new steganalytic method to detect the message hidden in a black and white image using the steganographic technique developed by Liang, Wang and Zhang. Our detection method estimates the length of hidden message embedded in a binary image. Although the hidden message embedded is visually imperceptible, it changes some image statistic (such as inter-pixels correlation). Based on this observation, we first derive the 512 patterns histogram from the boundary pixels as the distinguishing statistic, then we compute the histogram difference to determine the changes of the 512 patterns histogram induced by the embedding operation. Finally we propose histogram quotient to estimate the length of the embedded message. Experimental results confirm that the proposed method can effectively and reliably detect the length of the embedded message.
Resumo:
Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.
Resumo:
Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
Resumo:
We propose a method of representing audience behavior through facial and body motions from a single video stream, and use these features to predict the rating for feature-length movies. This is a very challenging problem as: i) the movie viewing environment is dark and contains views of people at different scales and viewpoints; ii) the duration of feature-length movies is long (80-120 mins) so tracking people uninterrupted for this length of time is still an unsolved problem, and; iii) expressions and motions of audience members are subtle, short and sparse making labeling of activities unreliable. To circumvent these issues, we use an infrared illuminated test-bed to obtain a visually uniform input. We then utilize motion-history features which capture the subtle movements of a person within a pre-defined volume, and then form a group representation of the audience by a histogram of pair-wise correlations over a small-window of time. Using this group representation, we learn our movie rating classifier from crowd-sourced ratings collected by rottentomatoes.com and show our prediction capability on audiences from 30 movies across 250 subjects (> 50 hrs).
Resumo:
Existing crowd counting algorithms rely on holistic, local or histogram based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram based methods, and to compare various image features and regression models. A K-fold cross validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central datasets. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram based features; optimal performance is observed using all image features except for textures; and that GPR outperforms linear, KNN and NN regression
Resumo:
Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can efficiently detect events automatically. This paper proposes a novel algorithm for multi-person event detection. To ensure greater than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a “weakly supervised” method that allows the use of coarse temporal annotations which are much simpler to obtain. This novel system is able to run at approximately ten times real-time, while preserving state-of-theart detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
Resumo:
Abnormal event detection has attracted a lot of attention in the computer vision research community during recent years due to the increased focus on automated surveillance systems to improve security in public places. Due to the scarcity of training data and the definition of an abnormality being dependent on context, abnormal event detection is generally formulated as a data-driven approach where activities are modeled in an unsupervised fashion during the training phase. In this work, we use a Gaussian mixture model (GMM) to cluster the activities during the training phase, and propose a Gaussian mixture model based Markov random field (GMM-MRF) to estimate the likelihood scores of new videos in the testing phase. Further-more, we propose two new features: optical acceleration, and the histogram of optical flow gradients; to detect the presence of any abnormal objects and speed violations in the scene. We show that our proposed method outperforms other state of the art abnormal event detection algorithms on publicly available UCSD dataset.
Resumo:
We propose expected attainable discrimination (EAD) as a measure to select discrete valued features for reliable discrimination between two classes of data. EAD is an average of the area under the ROC curves obtained when a simple histogram probability density model is trained and tested on many random partitions of a data set. EAD can be incorporated into various stepwise search methods to determine promising subsets of features, particularly when misclassification costs are difficult or impossible to specify. Experimental application to the problem of risk prediction in pregnancy is described.
Resumo:
Local spatio-temporal features with a Bag-of-visual words model is a popular approach used in human action recognition. Bag-of-features methods suffer from several challenges such as extracting appropriate appearance and motion features from videos, converting extracted features appropriate for classification and designing a suitable classification framework. In this paper we address the problem of efficiently representing the extracted features for classification to improve the overall performance. We introduce two generative supervised topic models, maximum entropy discrimination LDA (MedLDA) and class- specific simplex LDA (css-LDA), to encode the raw features suitable for discriminative SVM based classification. Unsupervised LDA models disconnect topic discovery from the classification task, hence yield poor results compared to the baseline Bag-of-words framework. On the other hand supervised LDA techniques learn the topic structure by considering the class labels and improve the recognition accuracy significantly. MedLDA maximizes likelihood and within class margins using max-margin techniques and yields a sparse highly discriminative topic structure; while in css-LDA separate class specific topics are learned instead of common set of topics across the entire dataset. In our representation first topics are learned and then each video is represented as a topic proportion vector, i.e. it can be comparable to a histogram of topics. Finally SVM classification is done on the learned topic proportion vector. We demonstrate the efficiency of the above two representation techniques through the experiments carried out in two popular datasets. Experimental results demonstrate significantly improved performance compared to the baseline Bag-of-features framework which uses kmeans to construct histogram of words from the feature vectors.