515 resultados para Medical image processing
Resumo:
Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform the dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most of the current approaches rely on a non-realistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to deal with the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions which are continuously updated. Initially, all video segments are represented by Bags-Of-Words (BOW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied for updating the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well known Weizmann and KTH datasets to show the efficacy of the proposed approach.
Resumo:
In public venues, crowd size is a key indicator of crowd safety and stability. Crowding levels can be detected using holistic image features, however this requires a large amount of training data to capture the wide variations in crowd distribution. If a crowd counting algorithm is to be deployed across a large number of cameras, such a large and burdensome training requirement is far from ideal. In this paper we propose an approach that uses local features to count the number of people in each foreground blob segment, so that the total crowd estimate is the sum of the group sizes. This results in an approach that is scalable to crowd volumes not seen in the training data, and can be trained on a very small data set. As a local approach is used, the proposed algorithm can easily be used to estimate crowd density throughout different regions of the scene and be used in a multi-camera environment. A unique localised approach to ground truth annotation reduces the required training data is also presented, as a localised approach to crowd counting has different training requirements to a holistic one. Testing on a large pedestrian database compares the proposed technique to existing holistic techniques and demonstrates improved accuracy, and superior performance when test conditions are unseen in the training set, or a minimal training set is used.
Resumo:
This paper describes a novel framework for facial expression recognition from still images by selecting, optimizing and fusing ‘salient’ Gabor feature layers to recognize six universal facial expressions using the K nearest neighbor classifier. The recognition comparisons with all layer approach using JAFFE and Cohn-Kanade (CK) databases confirm that using ‘salient’ Gabor feature layers with optimized sizes can achieve better recognition performance and dramatically reduce computational time. Moreover, comparisons with the state of the art performances demonstrate the effectiveness of our approach.
Resumo:
In the filed of semantic grid, QoS-based Web service scheduling for workflow optimization is an important problem.However, in semantic and service rich environment like semantic grid, the emergence of context constraints on Web services is very common making the scheduling consider not only quality properties of Web services, but also inter service dependencies which are formed due to the context constraints imposed on Web services. In this paper, we present a repair genetic algorithm, namely minimal-conflict hill-climbing repair genetic algorithm, to address scheduling optimization problems in workflow applications in the presence of domain constraints and inter service dependencies. Experimental results demonstrate the scalability and effectiveness of the genetic algorithm.
Resumo:
In Web service based systems, new value-added Web services can be constructed by integrating existing Web services. A Web service may have many implementations, which are functionally identical, but have different Quality of Service (QoS) attributes, such as response time, price, reputation, reliability, availability and so on. Thus, a significant research problem in Web service composition is how to select an implementation for each of the component Web services so that the overall QoS of the composite Web service is optimal. This is so called QoS-aware Web service composition problem. In some composite Web services there are some dependencies and conflicts between the Web service implementations. However, existing approaches cannot handle the constraints. This paper tackles the QoS-aware Web service composition problem with inter service dependencies and conflicts using a penalty-based genetic algorithm (GA). Experimental results demonstrate the effectiveness and the scalability of the penalty-based GA.
Resumo:
Spatial information captured from optical remote sensors on board unmanned aerial vehicles (UAVs) has great potential in automatic surveillance of electrical infrastructure. For an automatic vision-based power line inspection system, detecting power lines from a cluttered background is one of the most important and challenging tasks. In this paper, a novel method is proposed, specifically for power line detection from aerial images. A pulse coupled neural filter is developed to remove background noise and generate an edge map prior to the Hough transform being employed to detect straight lines. An improved Hough transform is used by performing knowledge-based line clustering in Hough space to refine the detection results. The experiment on real image data captured from a UAV platform demonstrates that the proposed approach is effective for automatic power line detection.
Resumo:
The application of object-based approaches to the problem of extracting vegetation information from images requires accurate delineation of individual tree crowns. This paper presents an automated method for individual tree crown detection and delineation by applying a simplified PCNN model in spectral feature space followed by post-processing using morphological reconstruction. The algorithm was tested on high resolution multi-spectral aerial images and the results are compared with two existing image segmentation algorithms. The results demonstrate that our algorithm outperforms the other two solutions with the average accuracy of 81.8%.
Resumo:
The relationship between multiple cameras viewing the same scene may be discovered automatically by finding corresponding points in the two views and then solving for the camera geometry. In camera networks with sparsely placed cameras, low resolution cameras or in scenes with few distinguishable features it may be difficult to find a sufficient number of reliable correspondences from which to compute geometry. This paper presents a method for extracting a larger number of correspondences from an initial set of putative correspondences without any knowledge of the scene or camera geometry. The method may be used to increase the number of correspondences and make geometry computations possible in cases where existing methods have produced insufficient correspondences.
Resumo:
One of the major challenges facing a present day game development company is the removal of bugs from such complex virtual environments. This work presents an approach for measuring the correctness of synthetic scenes generated by a rendering system of a 3D application, such as a computer game. Our approach builds a database of labelled point clouds representing the spatiotemporal colour distribution for the objects present in a sequence of bug-free frames. This is done by converting the position that the pixels take over time into the 3D equivalent points with associated colours. Once the space of labelled points is built, each new image produced from the same game by any rendering system can be analysed by measuring its visual inconsistency in terms of distance from the database. Objects within the scene can be relocated (manually or by the application engine); yet the algorithm is able to perform the image analysis in terms of the 3D structure and colour distribution of samples on the surface of the object. We applied our framework to the publicly available game RacingGame developed for Microsoft(R) Xna(R). Preliminary results show how this approach can be used to detect a variety of visual artifacts generated by the rendering system in a professional quality game engine.
Resumo:
A method of improving the security of biometric templates which satisfies desirable properties such as (a) irreversibility of the template, (b) revocability and assignment of a new template to the same biometric input, (c) matching in the secure transformed domain is presented. It makes use of an iterative procedure based on the bispectrum that serves as an irreversible transformation for biometric features because signal phase is discarded each iteration. Unlike the usual hash function, this transformation preserves closeness in the transformed domain for similar biometric inputs. A number of such templates can be generated from the same input. These properties are illustrated using synthetic data and applied to images from the FRGC 3D database with Gabor features. Verification can be successfully performed using these secure templates with an EER of 5.85%
Resumo:
Cultural objects are increasingly generated and stored in digital form, yet effective methods for their indexing and retrieval still remain an important area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. We firstly discuss the requirements from a number of perspectives: users, content providers, content managers and technical systems. We then present an overview of our system architecture and describe various techniques which underlie the major components of the system. These include: automatic object category detection; user-driven tagging; metadata transform and augmentation, and an expression language for digital cultural objects. In addition, we discuss our experience on testing and evaluating some existing collections, analyse the difficulties encountered and propose ways to address these problems.
Resumo:
Abstract—Corneal topography estimation that is based on the Placido disk principle relies on good quality of precorneal tear film and sufficiently wide eyelid (palpebral) aperture to avoid reflections from eyelashes. However, in practice, these conditions are not always fulfilled resulting in missing regions, smaller corneal coverage, and subsequently poorer estimates of corneal topography. Our aim was to enhance the standard operating range of a Placido disk videokeratoscope to obtain reliable corneal topography estimates in patients with poor tear film quality, such as encountered in those diagnosed with dry eye, and with narrower palpebral apertures as in the case of Asian subjects. This was achieved by incorporating in the instrument’s own topography estimation algorithm an image processing technique that comprises a polar-domain adaptive filter and amorphological closing operator. The experimental results from measurements of test surfaces and real corneas showed that the incorporation of the proposed technique results in better estimates of corneal topography, and, in many cases, to a significant increase in the estimated coverage area making such an enhanced videokeratoscope a better tool for clinicians.
Resumo:
A new method for noninvasive assessment of tear film surface quality (TFSQ) is proposed. The method is based on high-speed videokeratoscopy in which the corneal area for the analysis is dynamically estimated in a manner that removes videokeratoscopy interference from the shadows of eyelashes but not that related to the poor quality of the precorneal tear film that is of interest. The separation between the two types of seemingly similar videokeratoscopy interference is achieved by region-based classification in which the overall noise is first separated from the useful signal (unaltered videokeratoscopy pattern), followed by a dedicated interference classification algorithm that distinguishes between the two considered interferences. The proposed technique provides a much wider corneal area for the analysis of TFSQ than the previously reported techniques. A preliminary study with the proposed technique, carried out for a range of anterior eye conditions, showed an effective behavior in terms of noise to signal separation, interference classification, as well as consistent TFSQ results. Subsequently, the method proved to be able to not only discriminate between the bare eye and the lens on eye conditions but also to have the potential to discriminate between the two types of contact lenses.