923 resultados para medical image processing
Resumo:
We present a novel approach to video summarisation that makes use of a Bag-of-visual-Textures (BoT) approach. Two systems are proposed, one based solely on the BoT approach and another which exploits both colour information and BoT features. On 50 short-term videos from the Open Video Project we show that our BoT and fusion systems both achieve state-of-the-art performance, obtaining an average F-measure of 0.83 and 0.86 respectively, a relative improvement of 9% and 13% when compared to the previous state-of-the-art. When applied to a new underwater surveillance dataset containing 33 long-term videos, the proposed system reduces the amount of footage by a factor of 27, with only minor degradation in the information content. This order of magnitude reduction in video data represents significant savings in terms of time and potential labour cost when manually reviewing such footage.
Resumo:
This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time and labour intensive. Computer Aided Diagnostic (CAD) systems have been developed to address these problems, which automatically classify a HEp-2 cell image into one of its known patterns (eg. speckled, homogeneous). Most of the existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which is comprised of regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
Resumo:
Recent advances in computer vision and machine learning suggest that a wide range of problems can be addressed more appropriately by considering non-Euclidean geometry. In this paper we explore sparse dictionary learning over the space of linear subspaces, which form Riemannian structures known as Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping, which enables us to devise a closed-form solution for updating a Grassmann dictionary, atom by atom. Furthermore, to handle non-linearity in data, we propose a kernelised version of the dictionary learning algorithm. Experiments on several classification tasks (face recognition, action recognition, dynamic texture classification) show that the proposed approach achieves considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as kernelised Affine Hull Method and graph-embedding Grassmann discriminant analysis.
Resumo:
Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
Resumo:
This paper presents a novel place recognition algorithm inspired by the recent discovery of overlapping and multi-scale spatial maps in the rodent brain. We mimic this hierarchical framework by training arrays of Support Vector Machines to recognize places at multiple spatial scales. Place match hypotheses are then cross-validated across all spatial scales, a process which combines the spatial specificity of the finest spatial map with the consensus provided by broader mapping scales. Experiments on three real-world datasets including a large robotics benchmark demonstrate that mapping over multiple scales uniformly improves place recognition performance over a single scale approach without sacrificing localization accuracy. We present analysis that illustrates how matching over multiple scales leads to better place recognition performance and discuss several promising areas for future investigation.
Resumo:
Brain decoding of functional Magnetic Resonance Imaging data is a pattern analysis task that links brain activity patterns to the experimental conditions. Classifiers predict the neural states from the spatial and temporal pattern of brain activity extracted from multiple voxels in the functional images in a certain period of time. The prediction results offer insight into the nature of neural representations and cognitive mechanisms and the classification accuracy determines our confidence in understanding the relationship between brain activity and stimuli. In this paper, we compared the efficacy of three machine learning algorithms: neural network, support vector machines, and conditional random field to decode the visual stimuli or neural cognitive states from functional Magnetic Resonance data. Leave-one-out cross validation was performed to quantify the generalization accuracy of each algorithm on unseen data. The results indicated support vector machine and conditional random field have comparable performance and the potential of the latter is worthy of further investigation.
Resumo:
Previous studies have found that the lateral posterior fusiform gyri respond more robustly to pictures of animals than pictures of manmade objects and suggested that these regions encode the visual properties characteristic of animals. We suggest that such effects actually reflect processing demands arising when items with similar representations must be finely discriminated. In a positron emission tomography (PET) study of category verification with colored photographs of animals and vehicles, there was robust animal-specific activation in the lateral posterior fusiform gyri when stimuli were categorized at an intermediate level of specificity (e.g., dog or car). However, when the same photographs were categorized at a more specific level (e.g., Labrador or BMW), these regions responded equally strongly to animals and vehicles. We conclude that the lateral posterior fusiform does not encode domain-specific representations of animals or visual properties characteristic of animals. Instead, these regions are strongly activated whenever an item must be discriminated from many close visual or semantic competitors. Apparent category effects arise because, at an intermediate level of specificity, animals have more visual and semantic competitors than do artifacts.
Resumo:
Studies of semantic impairment arising from brain disease suggest that the anterior temporal lobes are critical for semantic abilities in humans; yet activation of these regions is rarely reported in functional imaging studies of healthy controls performing semantic tasks. Here, we combined neuropsychological and PET functional imaging data to show that when healthy subjects identify concepts at a specific level, the regions activated correspond to the site of maximal atrophy in patients with relatively pure semantic impairment. The stimuli were color photographs of common animals or vehicles, and the task was category verification at specific (e.g., robin), intermediate (e.g., bird), or general (e.g., animal) levels. Specific, relative to general, categorization activated the antero-lateral temporal cortices bilaterally, despite matching of these experimental conditions for difficulty. Critically, in patients with atrophy in precisely these areas, the most pronounced deficit was in the retrieval of specific semantic information.
Resumo:
The proliferation of news reports published in online websites and news information sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three categorized emotions, positive, neutral and negative, based on facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation has been observed for neutral and negative. The system can be directly used for applications, such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.
Resumo:
INTRODUCTION Calculating segmental (vertebral level-by-level) torso masses in Adolescent Idiopathic Scoliosis (AIS) patients allows the gravitational loading on the scoliotic spine during relaxed standing to be estimated. METHODS Existing low dose CT scans were used to calculate vertebral level-by-level torso masses and joint moments occurring in the spine for a group of female AIS patients with right-sided thoracic curves. Image processing software, ImageJ (v1.45 NIH USA) was used to reconstruct the torso segments and subsequently measure the torso volume and mass corresponding to each vertebral level. Body segment masses for the head, neck and arms were taken from published anthropometric data. Intervertebral joint moments at each vertebral level were found by summing each of the torso segment masses above the required joint and multiplying it by the perpendicular distance to the centre of the disc. RESULTS AND DISCUSSION Twenty patients were included in this study with a mean age of 15.0±2.7 years and a mean Cobb angle 52±5.9°. The mean total trunk mass, as a percentage of total body mass, was 27.8 (SD 0.5) %. Mean segmental torso mass increased inferiorly from 0.6kg at T1 to 1.5kg at L5. The coronal plane joint moments during relaxed standing were typically 5-7Nm at the apex of the curve (Figure 1), with the highest apex joint of 7Nm. CT scans were performed in the supine position and curve magnitudes are known to be 7-10° smaller than those measured in standing [1]. Therefore joint moments produced by gravity will be greater than those calculated here. CONCLUSIONS Coronal plane joint moments as high as 7Nm can occur during relaxed standing in scoliosis patients, which may help to explain the mechanics of AIS progression. The body mass distributions calculated in this study can be used to estimate joint moments derived using other imaging modalities such as MRI and subsequently determine if a relationship exists between joint moments and progressive vertebral deformity.
Resumo:
The paper examines the knowledge of pedestrian movements, both in real scenarios, and from more recent years, in the virtual 4 simulation realm. Aiming to verify whether it is possible to learn from the study of virtual environments how people will behave in real 5 environments, it is vital to understand what is already known about behavior in real environments. Besides the walking interaction among 6 pedestrians, the interaction between pedestrians and the built environment in which they are walking also have greatest relevance. Force-based 7 models were compared with the other three major microscopic models of pedestrian simulation to demonstrate a more realistic and capable 8 heuristic approach is needed for the study of the dynamics of pedestrians.
Resumo:
The ability to build high-fidelity 3D representations of the environment from sensor data is critical for autonomous robots. Multi-sensor data fusion allows for more complete and accurate representations. Furthermore, using distinct sensing modalities (i.e. sensors using a different physical process and/or operating at different electromagnetic frequencies) usually leads to more reliable perception, especially in challenging environments, as modalities may complement each other. However, they may react differently to certain materials or environmental conditions, leading to catastrophic fusion. In this paper, we propose a new method to reliably fuse data from multiple sensing modalities, including in situations where they detect different targets. We first compute distinct continuous surface representations for each sensing modality, with uncertainty, using Gaussian Process Implicit Surfaces (GPIS). Second, we perform a local consistency test between these representations, to separate consistent data (i.e. data corresponding to the detection of the same target by the sensors) from inconsistent data. The consistent data can then be fused together, using another GPIS process, and the rest of the data can be combined as appropriate. The approach is first validated using synthetic data. We then demonstrate its benefit using a mobile robot, equipped with a laser scanner and a radar, which operates in an outdoor environment in the presence of large clouds of airborne dust and smoke.
Resumo:
Outdoor robots such as planetary rovers must be able to navigate safely and reliably in order to successfully perform missions in remote or hostile environments. Mobility prediction is critical to achieving this goal due to the inherent control uncertainty faced by robots traversing natural terrain. We propose a novel algorithm for stochastic mobility prediction based on multi-output Gaussian process regression. Our algorithm considers the correlation between heading and distance uncertainty and provides a predictive model that can easily be exploited by motion planning algorithms. We evaluate our method experimentally and report results from over 30 trials in a Mars-analogue environment that demonstrate the effectiveness of our method and illustrate the importance of mobility prediction in navigating challenging terrain.
Resumo:
This paper presents a full system demonstration of dynamic sensorbased reconfiguration of a networked robot team. Robots sense obstacles in their environment locally and dynamically adapt their global geometric configuration to conform to an abstract goal shape. We present a novel two-layer planning and control algorithm for team reconfiguration that is decentralised and assumes local (neighbour-to-neighbour) communication only. The approach is designed to be resource-efficient and we show experiments using a team of nine mobile robots with modest computation, communication, and sensing. The robots use acoustic beacons for localisation and can sense obstacles in their local neighbourhood using IR sensors. Our results demonstrate globally-specified reconfiguration from local information in a real robot network, and highlight limitations of standard mesh networks in implementing decentralised algorithms.
Contrast transfer function correction applied to cryo-electron tomography and sub-tomogram averaging
Resumo:
Cryo-electron tomography together with averaging of sub-tomograms containing identical particles can reveal the structure of proteins or protein complexes in their native environment. The resolution of this technique is limited by the contrast transfer function (CTF) of the microscope. The CTF is not routinely corrected in cryo-electron tomography because of difficulties including CTF detection, due to the low signal to noise ratio, and CTF correction, since images are characterised by a spatially variant CTF. Here we simulate the effects of the CTF on the resolution of the final reconstruction, before and after CTF correction, and consider the effect of errors and approximations in defocus determination. We show that errors in defocus determination are well tolerated when correcting a series of tomograms collected at a range of defocus values. We apply methods for determining the CTF parameters in low signal to noise images of tilted specimens, for monitoring defocus changes using observed magnification changes, and for correcting the CTF prior to reconstruction. Using bacteriophage PRDI as a test sample, we demonstrate that this approach gives an improvement in the structure obtained by sub-tomogram averaging from cryo-electron tomograms.