884 results for computer vision, facial expression recognition, swig, red5, actionscript, ruby on rails, html5


Relevance:

100.00%

Publisher:

Abstract:

This paper proposes a new method of using foreground silhouette images for human pose estimation. Labels are introduced to the silhouette images, providing an extra layer of information that can be used in the model fitting process. The pixels in the silhouettes are labelled according to the corresponding body part in the model of the current fit, and the labels are then propagated into the silhouette of the next frame to guide its fitting. Both single and multi-view implementations are detailed, with results showing performance improvements over using standard unlabelled silhouettes alone.
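
A minimal sketch of the labelling-and-propagation idea, assuming NumPy; the helper names and the dense motion field used for propagation are illustrative, not from the paper:

```python
import numpy as np

def label_silhouette(silhouette, part_masks):
    """Assign each foreground pixel the ID of the body part
    (from the current model fit) that covers it."""
    labels = np.zeros(silhouette.shape, dtype=np.int32)  # 0 = background
    for part_id, mask in part_masks.items():             # boolean masks
        labels[silhouette & mask] = part_id
    return labels

def propagate_labels(labels, flow):
    """Carry labels into the next frame along a dense motion field
    (flow[y, x] = (dy, dx)); a stand-in for the paper's propagation step."""
    h, w = labels.shape
    out = np.zeros_like(labels)
    ys, xs = np.nonzero(labels)
    ny = np.clip(ys + flow[ys, xs, 0].astype(int), 0, h - 1)
    nx = np.clip(xs + flow[ys, xs, 1].astype(int), 0, w - 1)
    out[ny, nx] = labels[ys, xs]
    return out
```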

Relevance:

100.00%

Publisher:

Abstract:

This paper discusses the use of models in automatic computer forensic analysis, and proposes and elaborates on a novel model for use in computer profiling: the computer profiling object model. This is an information model that represents a computer as objects with various attributes and inter-relationships. Together, these provide the information necessary for a human investigator or an automated reasoning engine to make judgements about the probable usage and evidentiary value of a computer system. The model can be implemented so as to support automated analysis, providing an investigator with the information needed to decide whether manual analysis is required.
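
The abstract describes an information model of a computer as objects with attributes and inter-relationships. A hypothetical sketch of what such an object model could look like; the class and field names are invented for illustration, not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ProfileObject:
    """One element of the profile: a user account, application,
    document, network endpoint, etc."""
    kind: str
    attributes: dict = field(default_factory=dict)
    related: list = field(default_factory=list)  # links to other objects

    def link(self, other, relationship):
        self.related.append((relationship, other))

# Example: a document linked to the application that created it
word = ProfileObject("application", {"name": "WordProcessor"})
doc = ProfileObject("document", {"path": "report.doc", "mtime": "2009-04-01"})
doc.link(word, "created_by")
```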

Relevance:

100.00%

Publisher:

Abstract:

This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. The submission consisted of a score-level fusion of three component systems: a joint-factor GMM system and two SVM systems using GLDS and GMM supervector kernels. Development and evaluation results are presented, demonstrating the effectiveness of this fused system approach.
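
Score-level fusion of several subsystems is commonly a weighted sum of normalised scores. A small sketch under that assumption, using NumPy; the z-normalisation and the weights are illustrative, not the submission's actual fusion backend:

```python
import numpy as np

def fuse_scores(scores, weights):
    """Linear score-level fusion: z-normalise each subsystem's scores,
    then take a weighted sum. Weights would be tuned on development data."""
    fused = np.zeros_like(scores[0], dtype=float)
    for s, w in zip(scores, weights):
        s = np.asarray(s, dtype=float)
        fused += w * (s - s.mean()) / s.std()
    return fused

# e.g. three component systems: JFA-GMM, SVM-GLDS, SVM-GSV (synthetic scores)
rng = np.random.default_rng(0)
jfa, glds, gsv = (rng.normal(size=100) for _ in range(3))
fused = fuse_scores([jfa, glds, gsv], [0.5, 0.25, 0.25])
```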

Relevance:

100.00%

Publisher:

Abstract:

Visual localization systems that are practical for autonomous vehicles in outdoor industrial applications must perform reliably in a wide range of conditions. Changing outdoor conditions cause difficulty by drastically altering the information available in the camera images. To confront the problem, we have developed a visual localization system that uses a surveyed three-dimensional (3D) edge map of permanent structures in the environment. The map has the invariant properties necessary to achieve long-term robust operation. Previous 3D edge map localization systems usually maintain a single pose hypothesis, making it difficult to initialize without an accurate prior pose estimate and also making them susceptible to misalignment with unmapped edges detected in the camera image. A multihypothesis particle filter is employed here to perform the initialization procedure with significant uncertainty in the vehicle's initial pose. A novel observation function for the particle filter is developed and evaluated against two existing functions. The new function is shown to further improve the ability of the particle filter to converge given a very coarse estimate of the vehicle's initial pose. An intelligent exposure control algorithm is also developed that improves the quality of the pertinent information in the image. Results gathered over an entire sunny day and also during rainy weather illustrate that the localization system can operate in a wide range of outdoor conditions. The conclusion is that an invariant map, a robust multihypothesis localization algorithm, and an intelligent exposure control algorithm all combine to enable reliable visual localization through challenging outdoor conditions.
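
A compact sketch of one predict/update/resample cycle of a multihypothesis (particle) filter, assuming NumPy; the noise scales and the pluggable `observe` function (standing in for the paper's edge-map observation function) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(particles, weights, odometry, observe):
    """One cycle of a multihypothesis (particle) filter.
    particles: (N, 3) array of [x, y, heading] pose hypotheses.
    observe(pose) -> likelihood of the current image given that pose,
    e.g. an edge-alignment score against the surveyed 3D edge map."""
    # Predict: apply odometry with added noise to keep hypotheses diverse.
    particles = particles + odometry + rng.normal(
        scale=[0.05, 0.05, 0.01], size=particles.shape)
    # Update: re-weight each hypothesis by the observation function.
    weights = weights * np.array([observe(p) for p in particles])
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

Starting from particles scattered widely over the site, repeated calls to this step are what lets the filter converge from only a very coarse initial pose estimate.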

Relevance:

100.00%

Publisher:

Abstract:

Silhouettes are common features used by many applications in computer vision. For many of these algorithms to perform optimally, accurately segmenting the objects of interest from the background to extract the silhouettes is essential. Motion segmentation is a popular technique for segmenting moving objects from the background; however, such algorithms can be prone to poor segmentation, particularly in noisy or low-contrast conditions. In this paper, the work of [3], which combines motion detection with graph cuts, is extended into two novel implementations that aim to allow greater uncertainty in the output of the motion segmentation, providing a less restricted input to the graph cut algorithm. The proposed algorithms are evaluated on a portion of the ETISEO dataset using hand-segmented ground truth data, and an improvement in performance over the motion segmentation alone and over the baseline system of [3] is shown.
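
One way to read the extension is that the motion detector supplies soft confidences rather than hard seeds, which then become the unary terms of a graph cut. A sketch under that assumption, using the PyMaxflow library (not necessarily what [3] or this paper used):

```python
import numpy as np
import maxflow  # PyMaxflow; assumed available

def segment_with_soft_seeds(motion_conf, smoothness=2.0):
    """Graph-cut segmentation whose unary terms come from a *soft*
    motion-confidence map in [0, 1], rather than hard foreground/
    background seeds - the kind of relaxed input argued for above."""
    eps = 1e-6
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(motion_conf.shape)
    g.add_grid_edges(nodes, smoothness)          # pairwise smoothness term
    fg_cost = -np.log(motion_conf + eps)         # cost of labelling FG
    bg_cost = -np.log(1.0 - motion_conf + eps)   # cost of labelling BG
    g.add_grid_tedges(nodes, fg_cost, bg_cost)
    g.maxflow()
    return g.get_grid_segments(nodes)  # True = foreground under this setup
```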

Relevance:

100.00%

Publisher:

Abstract:

Intelligent surveillance systems typically use a single visual-spectrum modality for their input. These systems work well in controlled conditions, but often fail when lighting is poor or environmental effects such as shadows, dust or smoke are present. Thermal-spectrum imagery is not as susceptible to environmental effects; however, thermal imaging sensors are more sensitive to noise and produce only greyscale images, making it difficult to distinguish between objects. Several approaches to combining the visual and thermal modalities have been proposed, but they are limited by the assumption that both modalities are performing equally well. When one modality fails, existing approaches are unable to detect the drop in performance and disregard the underperforming modality. In this paper, a novel middle-fusion approach for combining visual and thermal spectrum images for object tracking is proposed. Motion and object detection is performed on each modality, and the object detection results for each modality are fused based on the current performance of each modality. Modality performance is determined by comparing the number of objects tracked by the system with the number detected by each mode, with a small allowance made for objects entering and exiting the scene. The tracking performance of the proposed fusion scheme is compared with the performance of the visual and thermal modes individually, and with a baseline middle-fusion scheme. Improvement in tracking performance using the proposed fusion approach is demonstrated. The proposed approach is also shown to be able to detect the failure of an individual modality and disregard its results, ensuring performance is not degraded in such situations.
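
A small sketch of the performance-weighting idea as described: each modality is scored by how well its detection count agrees with the tracker's object count, and the per-modality detection masks are fused with those weights. The function names and the thresholding rule are illustrative:

```python
import numpy as np

def modality_weight(num_tracked, num_detected, slack=1):
    """Rate a modality by how closely its detection count matches the
    number of objects the tracker maintains; 'slack' forgives objects
    entering or leaving the scene."""
    disagreement = max(abs(num_tracked - num_detected) - slack, 0)
    return 1.0 / (1.0 + disagreement)

def fuse_masks(visual_mask, thermal_mask, w_vis, w_th, threshold=0.5):
    """Weighted combination of per-modality detection masks; a failing
    modality (weight near zero) contributes almost nothing."""
    support = (w_vis * visual_mask + w_th * thermal_mask) / (w_vis + w_th)
    return support >= threshold
```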

Relevance:

100.00%

Publisher:

Abstract:

This work addresses the application of high-speed machine vision to closed-loop position control, or visual servoing, of a robot manipulator. It provides comprehensive coverage of all aspects of the visual servoing problem: robotics, vision, control, technology and implementation issues. While much of the discussion is quite general, the experimental work described is based on the use of a high-speed binary vision system with a monocular "eye-in-hand" camera.
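
For context, the classical image-based visual servoing update drives image-plane feature error to zero with a proportional law, v = -lambda * L^+ * e. A minimal sketch; the interaction-matrix pseudo-inverse `L_pinv` is assumed to be computed elsewhere, and this is generic textbook IBVS rather than this work's specific controller:

```python
import numpy as np

def ibvs_step(feature_px, target_px, L_pinv, gain=0.5):
    """One image-based visual servoing update: map the stacked
    image-plane feature error through the pseudo-inverse of the image
    Jacobian (interaction matrix) to a commanded camera velocity."""
    error = (feature_px - target_px).ravel()
    return -gain * L_pinv @ error  # 6-vector: translational + angular velocity
```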

Relevance:

100.00%

Publisher:

Abstract:

In public venues, crowd size is a key indicator of crowd safety and stability. In this paper we propose a crowd counting algorithm that uses tracking and local features to count the number of people in each group as represented by a foreground blob segment, so that the total crowd estimate is the sum of the group sizes. Tracking is employed to improve the robustness of the estimate, by analysing the history of each group, including splitting and merging events. A simplified ground truth annotation strategy results in an approach with minimal setup requirements that is highly accurate.
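
A toy sketch of the counting logic as described: the crowd estimate is the sum of per-group (blob) size estimates, with each group's size smoothed over its track history. The smoothing scheme is an illustrative stand-in, not the paper's:

```python
def crowd_estimate(groups):
    """Total crowd size = sum of per-group (blob) size estimates."""
    return sum(g["size"] for g in groups)

def update_group_size(history, new_estimate, alpha=0.2):
    """Smooth a group's size over its track history so one bad frame
    (e.g. around a split/merge event) does not corrupt the count."""
    return (1 - alpha) * history + alpha * new_estimate
```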

Relevance:

100.00%

Publisher:

Abstract:

The main goal of this research is to design an efficient compression algorithm for fingerprint images. The wavelet transform technique is the principal tool used to reduce interpixel redundancies and to obtain a parsimonious representation of these images. A specific fixed decomposition structure is designed for the wavelet packet in order to save on computation, transmission and storage costs. This decomposition structure is based on an analysis of the information packing performance of several decompositions, the two-dimensional power spectral density, the effect of each frequency band on the reconstructed image, and human visual sensitivities. This fixed structure is found to provide the most suitable representation for fingerprints according to the chosen criteria. Different compression techniques are used for different subbands, based on their observed statistics; the decision rests on the effect of each subband on the reconstructed image under the mean-square criterion, as well as on the sensitivities of human vision.

To design an efficient quantization algorithm, a precise model for the distribution of the wavelet coefficients is developed, based on the generalized Gaussian distribution. A least-squares algorithm on a nonlinear function of the distribution model's shape parameter is formulated to estimate the model parameters. A noise-shaping bit allocation procedure is then used to assign the bit rate among subbands. To obtain high compression ratios, vector quantization is used; in this work, lattice vector quantization (LVQ) is chosen because of its superior performance over other types of vector quantizers. The structure of a lattice quantizer is determined by its parameters, known as the truncation level and scaling factor. In lattice-based compression algorithms reported in the literature, the lattice structure is commonly predetermined, leading to a non-optimized quantization approach. In this research, a new technique for determining the lattice parameters is proposed: no assumption about the lattice parameters is made, and no training or multi-quantizing is required. The design is based on minimizing the quantization distortion by adapting to the statistical characteristics of the source in each subimage.

Since LVQ is a multidimensional generalization of uniform quantizers, it produces minimum distortion for inputs with uniform distributions. To take advantage of the properties of LVQ and its fast implementation, while accounting for the i.i.d. nonuniform distribution of wavelet coefficients, the piecewise-uniform pyramid LVQ algorithm is proposed. The proposed algorithm quantizes almost all source vectors without the need to project them onto the lattice's outermost shell, while maintaining a small codebook size. It also resolves the wedge-region problem commonly encountered with sharply distributed random sources. These represent some of the drawbacks of the algorithm proposed by Barlaud [26]. The proposed algorithm handles all types of lattices, not only cubic lattices, as opposed to the algorithms developed by Fischer [29] and Jeong [42]. Furthermore, no training or multi-quantizing (to determine lattice parameters) is required, as opposed to Powell's algorithm [78]. For coefficients with high-frequency content, the positive-negative mean algorithm is proposed to improve the resolution of reconstructed images. For coefficients with low-frequency content, a lossless predictive compression scheme is used to preserve the quality of reconstructed images. A method to reduce the bit requirements of the necessary side information is also introduced. Lossless entropy coding techniques are subsequently used to remove coding redundancy.

The algorithms result in high-quality reconstructed images with better compression ratios than other available algorithms. To evaluate the proposed algorithms, objective and subjective performance comparisons with other available techniques are presented. The quality of the reconstructed images is important for reliable identification, so enhancement and feature extraction on the reconstructed images are also investigated in this research. A structure-based feature extraction algorithm is proposed in which the unique properties of fingerprint textures are used to enhance the images and improve the fidelity of their characteristic features. The ridges are extracted from enhanced grey-level foreground areas based on the local ridge dominant directions. The proposed ridge extraction algorithm properly preserves the natural shape of grey-level ridges as well as the precise locations of the features, as opposed to the ridge extraction algorithm in [81]. Furthermore, it is fast and operates only on foreground regions, as opposed to the adaptive floating-average thresholding process in [68]. Spurious features are subsequently eliminated using the proposed post-processing scheme.
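
A toy stand-in for the front end of such a pipeline, assuming the PyWavelets package: a fixed wavelet decomposition followed by per-subband uniform quantization with coarser steps for finer bands. The thesis's designed packet structure and lattice vector quantizer are far more elaborate than this:

```python
import numpy as np
import pywt  # PyWavelets; assumed available

def compress_subbands(image, wavelet="bior4.4", level=4, base_step=4.0):
    """Fixed wavelet decomposition + per-subband uniform quantization:
    a simplified sketch of 'different treatment per subband'."""
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=level)
    quantized = [np.round(coeffs[0] / base_step)]  # approximation band
    for lvl, details in enumerate(coeffs[1:], start=1):
        step = base_step * 2 ** lvl  # coarser steps for finer detail bands
        quantized.append(tuple(np.round(d / step) for d in details))
    return quantized  # entropy coding would follow in a full pipeline
```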

Relevance:

100.00%

Publisher:

Abstract:

Maintenance trains travel in convoy. In Australia, only the first train of the convoy pays attention to the track signalization (the other convoy vehicles simply follow the preceding vehicle). Because of human error, collisions can happen between the maintenance vehicles. Although an anti-collision system based on a laser distance meter is already in operation, the existing system has a limited range due to the curvature of the tracks. In this paper, we introduce an anti-collision system based on vision. The two main ideas are (1) to warp the camera image, through a projective transform, into an image in which the rails are parallel, and (2) to track the two rail curves simultaneously by evaluating small parallel segments. The performance of the system is demonstrated on an image dataset.
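
Idea (1) amounts to a homography mapping the rail plane to a bird's-eye view in which the two rails appear parallel. A sketch using OpenCV; the point correspondences, image path and output size are invented for illustration:

```python
import cv2
import numpy as np

frame = cv2.imread("track_frame.png")  # placeholder path

# Four points on the rail plane in the camera image (hypothetical values),
# mapped to a top-down view where the rails become two vertical lines.
src = np.float32([[420, 700], [860, 700], [610, 350], [670, 350]])
dst = np.float32([[500, 700], [780, 700], [500, 0], [780, 0]])

H = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(frame, H, (1280, 720))
# Rail tracking then evaluates small parallel segments in 'warped'.
```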

Relevance:

100.00%

Publisher:

Abstract:

We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes; genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models of prostate cancer progression. The latter application shows that, with a combined statistical/bioinformatics analysis, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.
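
A minimal sketch of the clustering half, assuming scikit-learn and synthetic data: a Gaussian mixture groups genes by expression profile, and the cluster means would then feed the state-space network model. This is illustrative scaffolding, not the authors' estimation procedure:

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # assumed available

# expr: genes x timepoints matrix of expression values (synthetic here)
rng = np.random.default_rng(0)
expr = np.vstack([rng.normal(m, 0.3, size=(50, 8)) for m in (-1.0, 0.0, 1.0)])

gmm = GaussianMixture(n_components=3, random_state=0).fit(expr)
cluster = gmm.predict(expr)  # gene-to-cluster assignment
profiles = gmm.means_        # cluster-specific expression profiles
# 'profiles' would drive the state-space model over clusters.
```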

Relevance:

100.00%

Publisher:

Abstract:

Camera calibration information is required in order for multiple-camera networks to deliver more than the sum of many single-camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time-consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage; this is referred to as a wide-baseline scenario. Finding sufficient correspondences for camera calibration in wide-baseline scenarios presents a significant challenge.

This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide-baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view-covariant local feature extractors. The second involves finding methods to extract scene information using the information contained in a limited set of matched affine features.

Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale-space primal sketch of local image features, and a scale selection method was implemented that makes use of the primal sketch. The primal-sketch-based scale selection method has several advantages over existing methods: it allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency.

Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost.

A wide-baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: an affine region alignment algorithm that ensures accurate alignment between matched features; a method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; and an algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improve the success rate of computing the epipolar geometry of very widely separated views. The new method succeeds in many cases where the features produced by the best wide-baseline matching algorithms are insufficient for computing the scene geometry.
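
A sketch of the Hessian-based shape estimation mentioned above, assuming SciPy: the matrix of smoothed second derivatives at a point plays the role the second moment matrix plays in standard affine adaptation. The function name and parameters are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter  # assumed available

def hessian_shape(image, y, x, sigma=2.0):
    """Estimate local affine shape from the Hessian at (y, x): second
    derivatives of the Gaussian-smoothed image, in place of the second
    moment matrix. Works best on blob-like structures, as noted above."""
    img = image.astype(float)
    Lxx = gaussian_filter(img, sigma, order=(0, 2))[y, x]
    Lyy = gaussian_filter(img, sigma, order=(2, 0))[y, x]
    Lxy = gaussian_filter(img, sigma, order=(1, 1))[y, x]
    H = np.array([[Lxx, Lxy], [Lxy, Lyy]])
    # The eigen-decomposition gives the feature's orientation/anisotropy.
    eigvals, eigvecs = np.linalg.eigh(H)
    return H, eigvals, eigvecs
```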

Relevance:

100.00%

Publisher:

Abstract:

Wireless Multimedia Sensor Networks (WMSNs) have become increasingly popular in recent years, driven in part by the increasing commoditization of small, low-cost CMOS sensors. As such, the challenge of automatically calibrating these types of camera nodes has become an important research problem, especially when a large number of such devices are deployed. This paper presents a method for automatically calibrating a wireless camera node with the ability to rotate around one axis. The method involves capturing images as the camera is rotated and computing the homographies between the images. The camera parameters, including focal length, principal point, and the angle and axis of rotation, can then be recovered from two or more homographies. The homography computation algorithm is designed to deal with the limited resources of the wireless sensor and to minimize energy consumption. In this paper, a modified RANdom SAmple Consensus (RANSAC) algorithm is proposed to effectively increase the efficiency and reliability of the calibration procedure.
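
A sketch of the homography step using OpenCV's stock feature matching and RANSAC; the paper's modified, energy-aware RANSAC is not reproduced here, and the detector choice is an assumption:

```python
import cv2
import numpy as np

def pairwise_homography(img1, img2, ransac_thresh=2.0):
    """Match features between two images captured as the node rotates,
    then estimate their homography with RANSAC. For a purely rotating
    camera, H = K R K^-1, so K (focal length, principal point) and the
    rotation follow from two or more such homographies."""
    orb = cv2.ORB_create(500)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])
    H, inliers = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransac_thresh)
    return H
```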