372 resultados para Image processing, computer-assisted
Resumo:
This paper presents an efficient face detection method suitable for real-time surveillance applications. Improved efficiency is achieved by constraining the search window of an AdaBoost face detector to pre-selected regions. Firstly, the proposed method takes a sparse grid of sample pixels from the image to reduce whole image scan time. A fusion of foreground segmentation and skin colour segmentation is then used to select candidate face regions. Finally, a classifier-based face detector is applied only to selected regions to verify the presence of a face (the Viola-Jones detector is used in this paper). The proposed system is evaluated using 640 x 480 pixels test images and compared with other relevant methods. Experimental results show that the proposed method reduces the detection time to 42 ms, where the Viola-Jones detector alone requires 565 ms (on a desktop processor). This improvement makes the face detector suitable for real-time applications. Furthermore, the proposed method requires 50% of the computation time of the best competing method, while reducing the false positive rate by 3.2% and maintaining the same hit rate.
Resumo:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
Resumo:
Background subtraction is a fundamental low-level processing task in numerous computer vision applications. The vast majority of algorithms process images on a pixel-by-pixel basis, where an independent decision is made for each pixel. A general limitation of such processing is that rich contextual information is not taken into account. We propose a block-based method capable of dealing with noise, illumination variations, and dynamic backgrounds, while still obtaining smooth contours of foreground objects. Specifically, image sequences are analyzed on an overlapping block-by-block basis. A low-dimensional texture descriptor obtained from each block is passed through an adaptive classifier cascade, where each stage handles a distinct problem. A probabilistic foreground mask generation approach then exploits block overlaps to integrate interim block-level decisions into final pixel-level foreground segmentation. Unlike many pixel-based methods, ad-hoc postprocessing of foreground masks is not required. Experiments on the difficult Wallflower and I2R datasets show that the proposed approach obtains on average better results (both qualitatively and quantitatively) than several prominent methods. We furthermore propose the use of tracking performance as an unbiased approach for assessing the practical usefulness of foreground segmentation methods, and show that the proposed approach leads to considerable improvements in tracking accuracy on the CAVIAR dataset.
Resumo:
Speaker diarization determines instances of the same speaker within a recording. Extending this task to a collection of recordings for linking together segments spoken by a unique speaker requires speaker linking. In this paper we propose a speaker linking system using linkage clustering and state-of-the-art speaker recognition techniques. We evaluate our approach against two baseline linking systems using agglomerative cluster merging (AC) and agglomerative clustering with model retraining (ACR). We demonstrate that our linking method, using complete-linkage clustering, provides a relative improvement of 20% and 29% in attribution error rate (AER), over the AC and ACR systems, respectively.
Resumo:
Our everyday environment is full of text but this rich source of information remains largely inaccessible to mobile robots. In this paper we describe an active text spotting system that uses a small number of wide angle views to locate putative text in the environment and then foveates and zooms onto that text in order to improve the reliability of text recognition. We present extensive experimental results obtained with a pan/tilt/zoom camera and a ROS-based mobile robot operating in an indoor environment.
Resumo:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans,with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
Resumo:
In this paper, we propose an approach which attempts to solve the problem of surveillance event detection, assuming that we know the definition of the events. To facilitate the discussion, we first define two concepts. The event of interest refers to the event that the user requests the system to detect; and the background activities are any other events in the video corpus. This is an unsolved problem due to many factors as listed below: 1) Occlusions and clustering: The surveillance scenes which are of significant interest at locations such as airports, railway stations, shopping centers are often crowded, where occlusions and clustering of people are frequently encountered. This significantly affects the feature extraction step, and for instance, trajectories generated by object tracking algorithms are usually not robust under such a situation. 2) The requirement for real time detection: The system should process the video fast enough in both of the feature extraction and the detection step to facilitate real time operation. 3) Massive size of the training data set: Suppose there is an event that lasts for 1 minute in a video with a frame rate of 25fps, the number of frames for this events is 60X25 = 1500. If we want to have a training data set with many positive instances of the event, the video is likely to be very large in size (i.e. hundreds of thousands of frames or more). How to handle such a large data set is a problem frequently encountered in this application. 4) Difficulty in separating the event of interest from background activities: The events of interest often co-exist with a set of background activities. Temporal groundtruth typically very ambiguous, as it does not distinguish the event of interest from a wide range of co-existing background activities. However, it is not practical to annotate the locations of the events in large amounts of video data. This problem becomes more serious in the detection of multi-agent interactions, since the location of these events can often not be constrained to within a bounding box. 5) Challenges in determining the temporal boundaries of the events: An event can occur at any arbitrary time with an arbitrary duration. The temporal segmentation of events is difficult and ambiguous, and also affected by other factors such as occlusions.
Resumo:
In the field of face recognition, Sparse Representation (SR) has received considerable attention during the past few years. Most of the relevant literature focuses on holistic descriptors in closed-set identification applications. The underlying assumption in SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such assumption is easily violated in the more challenging face verification scenario, where an algorithm is required to determine if two faces (where one or both have not been seen before) belong to the same person. In this paper, we first discuss why previous attempts with SR might not be applicable to verification problems. We then propose an alternative approach to face verification via SR. Specifically, we propose to use explicit SR encoding on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which are then concatenated to form an overall face descriptor. Due to the deliberate loss spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment & various image deformations. Within the proposed framework, we evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN), and an implicit probabilistic technique based on Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the proposed local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, in both verification and closed-set identification problems. The experiments also show that l1-minimisation based encoding has a considerably higher computational than the other techniques, but leads to higher recognition rates.
Resumo:
Abstract. In recent years, sparse representation based classification(SRC) has received much attention in face recognition with multipletraining samples of each subject. However, it cannot be easily applied toa recognition task with insufficient training samples under uncontrolledenvironments. On the other hand, cohort normalization, as a way of mea-suring the degradation effect under challenging environments in relationto a pool of cohort samples, has been widely used in the area of biometricauthentication. In this paper, for the first time, we introduce cohort nor-malization to SRC-based face recognition with insufficient training sam-ples. Specifically, a user-specific cohort set is selected to normalize theraw residual, which is obtained from comparing the test sample with itssparse representations corresponding to the gallery subject, using poly-nomial regression. Experimental results on AR and FERET databases show that cohort normalization can bring SRC much robustness against various forms of degradation factors for undersampled face recognition.
Resumo:
An algorithm for computing dense correspondences between images of a stereo pair or image sequence is presented. The algorithm can make use of both standard matching metrics and the rank and census filters, two filters based on order statistics which have been applied to the image matching problem. Their advantages include robustness to radiometric distortion and amenability to hardware implementation. Results obtained using both real stereo pairs and a synthetic stereo pair with ground truth were compared. The rank and census filters were shown to significantly improve performance in the case of radiometric distortion. In all cases, the results obtained were comparable to, if not better than, those obtained using standard matching metrics. Furthermore, the rank and census have the additional advantage that their computational overhead is less than these metrics. For all techniques tested, the difference between the results obtained for the synthetic stereo pair, and the ground truth results was small.
Resumo:
Aerial Vehicles (UAV) has become a significant growing segment of the global aviation industry. These vehicles are developed with the intention of operating in regions where the presence of onboard human pilots is either too risky or unnecessary. Their popularity with both the military and civilian sectors have seen the use of UAVs in a diverse range of applications, from reconnaissance and surveillance tasks for the military, to civilian uses such as aid relief and monitoring tasks. Efficient energy utilisation on an UAV is essential to its functioning, often to achieve the operational goals of range, endurance and other specific mission requirements. Due to the limitations of the space available and the mass budget on the UAV, it is often a delicate balance between the onboard energy available (i.e. fuel) and achieving the operational goals. This paper presents the development of a parallel Hybrid Electric Propulsion System (HEPS) on a small fixed-wing UAV incorporating an Ideal Operating Line (IOL) control strategy. A simulation model of an UAV was developed in the MATLAB Simulink environment, utilising the AeroSim Blockset and the in-built Aerosonde UAV block and its parameters. An IOL analysis of an Aerosonde engine was performed, and the most efficient (i.e. provides greatest torque output at the least fuel consumption) points of operation for this engine were determined. Simulation models of the components in a HEPS were designed and constructed in the MATLAB Simulink environment. It was demonstrated through simulation that an UAV with the current HEPS configuration was capable of achieving a fuel saving of 6.5%, compared to the ICE-only configuration. These components form the basis for the development of a complete simulation model of a Hybrid-Electric UAV (HEUAV).
Resumo:
The main objective of this paper is to describe the development of a remote sensing airborne air sampling system for Unmanned Aerial Systems (UAS) and provide the capability for the detection of particle and gas concentrations in real time over remote locations. The design of the air sampling methodology started by defining system architecture, and then by selecting and integrating each subsystem. A multifunctional air sampling instrument, with capability for simultaneous measurement of particle and gas concentrations was modified and integrated with ARCAA’s Flamingo UAS platform and communications protocols. As result of the integration process, a system capable of both real time geo-location monitoring and indexed-link sampling was obtained. Wind tunnel tests were conducted in order to evaluate the performance of the air sampling instrument in controlled nonstationary conditions at the typical operational velocities of the UAS platform. Once the remote fully operative air sampling system was obtained, the problem of mission design was analyzed through the simulation of different scenarios. Furthermore, flight tests of the complete air sampling system were then conducted to check the dynamic characteristics of the UAS with the air sampling system and to prove its capability to perform an air sampling mission following a specific flight path.
Resumo:
With the explosive growth of resources available through the Internet, information mismatching and overload have become a severe concern to users. Web users are commonly overwhelmed by huge volume of information and are faced with the challenge of finding the most relevant and reliable information in a timely manner. Personalised information gathering and recommender systems represent state-of-the-art tools for efficient selection of the most relevant and reliable information resources, and the interest in such systems has increased dramatically over the last few years. However, web personalization has not yet been well-exploited; difficulties arise while selecting resources through recommender systems from a technological and social perspective. Aiming to promote high quality research in order to overcome these challenges, this paper provides a comprehensive survey on the recent work and achievements in the areas of personalised web information gathering and recommender systems. The report covers concept-based techniques exploited in personalised information gathering and recommender systems.
Resumo:
This report describes the available functionality and use of the ClusterEval evaluation software. It implements novel and standard measures for the evaluation of cluster quality. This software has been used at the INEX XML Mining track and in the MediaEval Social Event Detection task.
Resumo:
Iris based identity verification is highly reliable but it can also be subject to attacks. Pupil dilation or constriction stimulated by the application of drugs are examples of sample presentation security attacks which can lead to higher false rejection rates. Suspects on a watch list can potentially circumvent the iris based system using such methods. This paper investigates a new approach using multiple parts of the iris (instances) and multiple iris samples in a sequential decision fusion framework that can yield robust performance. Results are presented and compared with the standard full iris based approach for a number of iris degradations. An advantage of the proposed fusion scheme is that the trade-off between detection errors can be controlled by setting parameters such as the number of instances and the number of samples used in the system. The system can then be operated to match security threat levels. It is shown that for optimal values of these parameters, the fused system also has a lower total error rate.