532 resultados para swd: Image processing
Resumo:
Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear and Gaussian process regresion. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed on a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.
Resumo:
Active Appearance Models (AAMs) employ a paradigm of inverting a synthesis model of how an object can vary in terms of shape and appearance. As a result, the ability of AAMs to register an unseen object image is intrinsically linked to two factors. First, how well the synthesis model can reconstruct the object image. Second, the degrees of freedom in the model. Fewer degrees of freedom yield a higher likelihood of good fitting performance. In this paper we look at how these seemingly contrasting factors can complement one another for the problem of AAM fitting of an ensemble of images stemming from a constrained set (e.g. an ensemble of face images of the same person).
Resumo:
This thesis is aimed at further understanding the uppermost lipid-filled membranous layer (i.e. surface amorphous layer (SAL)) of articular cartilage and to develop a scientific framework for re-introducing lipids onto the surface of lipid-depleted articular cartilage (i.e. "resurfacing"). The outcome will potentially contribute to knowledge that will facilitate the repair of the articular surface of cartilage where degradation is limited to the loss of the lipids of the SAL only. The surface amorphous layer is of utmost importance to the effective load-spreading, lubrication, and semipermeability (which controls its fluid management, nutrient transport and waste removal) of articular cartilage in the mammalian joints. However, because this uppermost layer of cartilage is often in contact during physiological function, it is prone to wear and tear, and thus, is the site for damage initiation that can lead to the early stages of joint condition like osteoarthritis, and related conditions that cause pain and discomfort leading to low quality of life in patients. It is therefore imperative to conduct a study which offers insight into remedying this problem. It is hypothesized that restoration (resurfacing) of the surface amorphous layer can be achieved by re-introducing synthetic surface-active phospholipids (SAPL) into the joint space. This hypothesis was tested in this thesis by exposing cartilage samples whose surface lipids had been depleted to individual and mixtures of synthetic saturated and unsaturated phospholipids. The surfaces of normal, delipidized, and relipidized samples of cartilage were characterized for their structural integrity and functionality using atomic force microscope (AFM), confocal microscope (COFM), Raman spectroscopy, magnetic resonance imaging (MRI) with image processing in the MATLAB® environment and mechanical loading experiments. The results from AFM imaging, confocal microscopy, and Raman spectroscopy revealed a successful deposition of new surface layer on delipidized cartilage when incubated in synthetic phospholipids. The relipidization resulted in a significant improvement in the surface nanostructure of the artificially degraded cartilage, with the complete SAPL mixture providing better outcomes in comparison to those created with the single SAPL components (palmitoyl-oleoyl-phosphatidylcholine, POPC and dipalmitoyl-phosphatidylcholine, DPPC). MRI analysis revealed that the surface created with the complete mixture of synthetic lipids was capable of providing semipermeability to the surface layer of the treated cartilage samples relative to the normal intact surface. Furthermore, deformation energy analysis revealed that the treated samples were capable of delivering the elastic properties required for load bearing and recovery of the tissue relative to the normal intact samples, with this capability closer between the normal and the samples incubated in the complete lipid mixture. In conclusion, this thesis has established that it is possible to deposit/create a potentially viable layer on the surface of cartilage following degradation/lipid loss through incubation in synthetic lipid solutions. However, further studies will be required to advance the ideas developed in this thesis, for the development of synthetic lipid-based injections/drugs for treatment of osteoarthritis and other related joint conditions.
Resumo:
The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.
Resumo:
Non-periodic structural variation has been found in the high Tc cuprates, YBa2Cu3O7-x and Hg0.67Pb0.33Ba2Ca2Cu 3O8+δ, by image analysis of high resolution transmission electron microscope (HRTEM) images. We use two methods for analysis of the HRTEM images. The first method is a means for measuring the bending of lattice fringes at twin planes. The second method is a low-pass filter technique which enhances information contained by diffuse-scattered electrons and reveals what appears to be an interference effect between domains of differing lattice parameter in the top and bottom of the thin foil. We believe that these methods of image analysis could be usefully applied to the many thousands of HRTEM images that have been collected by other workers in the high temperature superconductor field. This work provides direct structural evidence for phase separation in high Tc cuprates, and gives support to recent stripes models that have been proposed to explain various angle resolved photoelectron spectroscopy and nuclear magnetic resonance data. We believe that the structural variation is a response to an opening of an electronic solubility gap where holes are not uniformly distributed in the material but are confined to metallic stripes. Optimum doping may occur as a consequence of the diffuse boundaries between stripes which arise from spinodal decomposition. Theoretical ideas about the high Tc cuprates which treat the cuprates as homogeneous may need to be modified in order to take account of this type of structural variation.
Resumo:
The assessment of choroidal thickness from optical coherence tomography (OCT) images of the human choroid is an important clinical and research task, since it provides valuable information regarding the eye’s normal anatomy and physiology, and changes associated with various eye diseases and the development of refractive error. Due to the time consuming and subjective nature of manual image analysis, there is a need for the development of reliable objective automated methods of image segmentation to derive choroidal thickness measures. However, the detection of the two boundaries which delineate the choroid is a complicated and challenging task, in particular the detection of the outer choroidal boundary, due to a number of issues including: (i) the vascular ocular tissue is non-uniform and rich in non-homogeneous features, and (ii) the boundary can have a low contrast. In this paper, an automatic segmentation technique based on graph-search theory is presented to segment the inner choroidal boundary (ICB) and the outer choroidal boundary (OCB) to obtain the choroid thickness profile from OCT images. Before the segmentation, the B-scan is pre-processed to enhance the two boundaries of interest and to minimize the artifacts produced by surrounding features. The algorithm to detect the ICB is based on a simple edge filter and a directional weighted map penalty, while the algorithm to detect the OCB is based on OCT image enhancement and a dual brightness probability gradient. The method was tested on a large data set of images from a pediatric (1083 B-scans) and an adult (90 B-scans) population, which were previously manually segmented by an experienced observer. The results demonstrate the proposed method provides robust detection of the boundaries of interest and is a useful tool to extract clinical data.
Resumo:
Purpose Videokeratoscopy images can be used for the non-invasive assessment of the tear film. In this work the applicability of an image processing technique, textural-analysis, for the assessment of the tear film in Placido disc images has been investigated. Methods In the presence of tear film thinning/break-up, the reflected pattern from the videokeratoscope is disturbed in the region of tear film disruption. Thus, the Placido pattern carries information about the stability of the underlying tear film. By characterizing the pattern regularity, the tear film quality can be inferred. In this paper, a textural features approach is used to process the Placido images. This method provides a set of texture features from which an estimate of the tear film quality can be obtained. The method is tested for the detection of dry eye in a retrospective dataset from 34 subjects (22-normal and 12-dry eye), with measurements taken under suppressed blinking conditions. Results To assess the capability of each texture-feature to discriminate dry eye from normal subjects, the receiver operating curve (ROC) was calculated and the area under the curve (AUC), specificity and sensitivity extracted. For the different features examined, the AUC value ranged from 0.77 to 0.82, while the sensitivity typically showed values above 0.9 and the specificity showed values around 0.6. Overall, the estimated ROCs indicate that the proposed technique provides good discrimination performance. Conclusions Texture analysis of videokeratoscopy images is applicable to study tear film anomalies in dry eye subjects. The proposed technique appears to have demonstrated its clinical relevance and utility.
Resumo:
Trees are capable of portraying the semi-structured data which is common in web domain. Finding similarities between trees is mandatory for several applications that deal with semi-structured data. Existing similarity methods examine a pair of trees by comparing through nodes and paths of two trees, and find the similarity between them. However, these methods provide unfavorable results for unordered tree data and result in yielding NP-hard or MAX-SNP hard complexity. In this paper, we present a novel method that encodes a tree with an optimal traversing approach first, and then, utilizes it to model the tree with its equivalent matrix representation for finding similarity between unordered trees efficiently. Empirical analysis shows that the proposed method is able to achieve high accuracy even on the large data sets.
Resumo:
Non-rigid face alignment is a very important task in a large range of applications but the existing tracking based non-rigid face alignment methods are either inaccurate or requiring person-specific model. This dissertation has developed simultaneous alignment algorithms that overcome these constraints and provide alignment with high accuracy, efficiency, robustness to varying image condition, and requirement of only generic model.
Resumo:
Safety concerns in the operation of autonomous aerial systems require safe-landing protocols be followed during situations where the a mission should be aborted due to mechanical or other failure. On-board cameras provide information that can be used in the determination of potential landing sites, which are continually updated and ranked to prevent injury and minimize damage. Pulse Coupled Neural Networks have been used for the detection of features in images that assist in the classification of vegetation and can be used to minimize damage to the aerial vehicle. However, a significant drawback in the use of PCNNs is that they are computationally expensive and have been more suited to off-line applications on conventional computing architectures. As heterogeneous computing architectures are becoming more common, an OpenCL implementation of a PCNN feature generator is presented and its performance is compared across OpenCL kernels designed for CPU, GPU and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images obtained during unmanned aerial vehicle trials to determine the plausibility for real-time feature detection.
Resumo:
The huge amount of CCTV footage available makes it very burdensome to process these videos manually through human operators. This has made automated processing of video footage through computer vision technologies necessary. During the past several years, there has been a large effort to detect abnormal activities through computer vision techniques. Typically, the problem is formulated as a novelty detection task where the system is trained on normal data and is required to detect events which do not fit the learned ‘normal’ model. There is no precise and exact definition for an abnormal activity; it is dependent on the context of the scene. Hence there is a requirement for different feature sets to detect different kinds of abnormal activities. In this work we evaluate the performance of different state of the art features to detect the presence of the abnormal objects in the scene. These include optical flow vectors to detect motion related anomalies, textures of optical flow and image textures to detect the presence of abnormal objects. These extracted features in different combinations are modeled using different state of the art models such as Gaussian mixture model(GMM) and Semi- 2D Hidden Markov model(HMM) to analyse the performances. Further we apply perspective normalization to the extracted features to compensate for perspective distortion due to the distance between the camera and objects of consideration. The proposed approach is evaluated using the publicly available UCSD datasets and we demonstrate improved performance compared to other state of the art methods.
Resumo:
Clustering identities in a broadcast video is a useful task to aid in video annotation and retrieval. Quality based frame selection is a crucial task in video face clustering, to both improve the clustering performance and reduce the computational cost. We present a frame work that selects the highest quality frames available in a video to cluster the face. This frame selection technique is based on low level and high level features (face symmetry, sharpness, contrast and brightness) to select the highest quality facial images available in a face sequence for clustering. We also consider the temporal distribution of the faces to ensure that selected faces are taken at times distributed throughout the sequence. Normalized feature scores are fused and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face clustering system. We present a news video database to evaluate the clustering system performance. Experiments on the newly created news database show that the proposed method selects the best quality face images in the video sequence, resulting in improved clustering performance.
Resumo:
This workshop was supported by the Australian Centre for Ecological Analysis and Synthesis (ACEAS, http://www.aceas.org.au/), a facility of the Australian Government-funded Terrestrial Ecosystem Research Network (http://www.tern.org.au/), a research infrastructure facility established under the National Collaborative Research Infrastructure Strategy and Education Infrastructure Fund - Super Science Initiative, through the Department of Industry, Innovation, Science, Research and Tertiary Education. Hosted by: Queensland University of Technology, Brisbane, Queensland. (QUT, http://www.qut.edu.au/) Dates: 8-11 May 2012 Report Editors: Prof Stuart Parsons (Uni. Auckland, NZ) and Dr Michael Towsey (QUT). This report is a compilation of notes and discussion summaries contributed by those attending the Workshop. They have been assembled into a logical order by the editors. Another report (with photographs) can be obtained at: http://www.aceas.org.au/index.php?option=com_content&view=article&id=94&Itemid=96
Resumo:
Acoustic recordings of the environment are an important aid to ecologists monitoring biodiversity and environmental health. However, rapid advances in recording technology, storage and computing make it possible to accumulate thousands of hours of recordings, of which, ecologists can only listen to a small fraction. The big-data challenge is to visualize the content of long-duration audio recordings on multiple scales, from hours, days, months to years. The visualization should facilitate navigation and yield ecologically meaningful information. Our approach is to extract (at one minute resolution) acoustic indices which reflect content of ecological interest. An acoustic index is a statistic that summarizes some aspect of the distribution of acoustic energy in a recording. We combine indices to produce false-colour images that reveal acoustic content and facilitate navigation through recordings that are months or even years in duration.
Resumo:
A new approach for recognizing the iris of the human eye is presented. Zero-crossings of the wavelet transform at various resolution levels are calculated over concentric circles on the iris, and the resulting one-dimensional (1-D) signals are compared with model features using different dissimilarity functions.