96 resultados para computer vision, facial expression recognition, swig, red5, actionscript, ruby on rails, html5


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Color segmentation of images usually requires a manual selection and classification of samples to train the system. This paper presents an automatic system that performs these tasks without the need of a long training, providing a useful tool to detect and identify figures. In real situations, it is necessary to repeat the training process if light conditions change, or if, in the same scenario, the colors of the figures and the background may have changed, being useful a fast training method. A direct application of this method is the detection and identification of football players.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we exploit the analogy between protein sequence alignment and image pair correspondence to design a bioinformatics-inspired framework for stereo matching based on dynamic programming. This approach also led to the creation of a meaningfulness graph, which helps to predict matching validity according to image overlap and pixel similarity. Finally, we propose an automatic procedure to estimate automatically all matching parameters. This work is evaluated qualitatively and quantitatively using a standard benchmarking dataset and by conducting stereo matching experiments between images captured at different resolutions. Results confirm the validity of the computer vision/bioinformatics analogy to develop a versatile and accurate low complexity stereo matching algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A novel non-linear dimensionality reduction method, called Temporal Laplacian Eigenmaps, is introduced to process efficiently time series data. In this embedded-based approach, temporal information is intrinsic to the objective function, which produces description of low dimensional spaces with time coherence between data points. Since the proposed scheme also includes bidirectional mapping between data and embedded spaces and automatic tuning of key parameters, it offers the same benefits as mapping-based approaches. Experiments on a couple of computer vision applications demonstrate the superiority of the new approach to other dimensionality reduction method in term of accuracy. Moreover, its lower computational cost and generalisation abilities suggest it is scalable to larger datasets. © 2010 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a novel framework for dense pixel matching based on dynamic programming is introduced. Unlike most techniques proposed in the literature, our approach assumes neither known camera geometry nor the availability of rectified images. Under such conditions, the matching task cannot be reduced to finding correspondences between a pair of scanlines. We propose to extend existing dynamic programming methodologies to a larger dimensional space by using a 3D scoring matrix so that correspondences between a line and a whole image can be calculated. After assessing our framework on a standard evaluation dataset of rectified stereo images, experiments are conducted on unrectified and non-linearly distorted images. Results validate our new approach and reveal the versatility of our algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose a statistical model for detection and tracking of human silhouette and the corresponding 3D skeletal structure in gait sequences. We follow a point distribution model (PDM) approach using a Principal Component Analysis (PCA). The problem of non-lineal PCA is partially resolved by applying a different PDM depending of pose estimation; frontal, lateral and diagonal, estimated by Fisher's linear discriminant. Additionally, the fitting is carried out by selecting the closest allowable shape from the training set by means of a nearest neighbor classifier. To improve the performance of the model we develop a human gait analysis to take into account temporal dynamic to track the human body. The incorporation of temporal constraints on the model increase reliability and robustness.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we consider the problem of tracking similar objects. We show how a mean field approach can be used to deal with interacting targets and we compare it with Markov Chain Monte Carlo (MCMC). Two mean field implementations are presented. The first one is more general and uses particle filtering. We discuss some simplifications of the base algorithm that reduce the computation time. The second one is based on suitable Gaussian approximations of probability densities that lead to a set of self-consistent equations for the means and covariances. These equations give the Kalman solution if there is no interaction. Experiments have been performed on two kinds of sequences. The first kind is composed of a single long sequence of twenty roaming ants and was previously analysed using MCMC. In this case, our mean field algorithms obtain substantially better results. The second kind corresponds to selected sequences of a football match in which the interaction avoids tracker coalescence in situations where independent trackers fail.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we show how interacting and occluding targets can be tackled successfully within a Gaussian approximation. For that purpose, we develop a general expansion of the mean and covariance of the posterior and we consider a first order approximation of it. The proposed method differs from EKF in that neither a non-linear dynamical model nor a non-linear measurement vector to state relation have to be defined, so it works with any kind of interaction potential and likelihood. The approach has been tested on three sequences (10400, 2500, and 400 frames each one). The results show that our approach helps to reduce the number of failures without increasing too much the computation time with respect to methods that do not take into account target interactions.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Support vector machines (SVMs), though accurate, are not preferred in applications requiring high classification speed or when deployed in systems of limited computational resources, due to the large number of support vectors involved in the model. To overcome this problem we have devised a primal SVM method with the following properties: (1) it solves for the SVM representation without the need to invoke the representer theorem, (2) forward and backward selections are combined to approach the final globally optimal solution, and (3) a criterion is introduced for identification of support vectors leading to a much reduced support vector set. In addition to introducing this method the paper analyzes the complexity of the algorithm and presents test results on three public benchmark problems and a human activity recognition application. These applications demonstrate the effectiveness and efficiency of the proposed algorithm.


--------------------------------------------------------------------------------

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Automatic gender classification has many security and commercial applications. Various modalities have been investigated for gender classification with face-based classification being the most popular. In some real-world scenarios the face may be partially occluded. In these circumstances a classification based on individual parts of the face known as local features must be adopted. We investigate gender classification using lip movements. We show for the first time that important gender specific information can be obtained from the way in which a person moves their lips during speech. Furthermore our study indicates that the lip dynamics during speech provide greater gender discriminative information than simply lip appearance. We also show that the lip dynamics and appearance contain complementary gender information such that a model which captures both traits gives the highest overall classification result. We use Discrete Cosine Transform based features and Gaussian Mixture Modelling to model lip appearance and dynamics and employ the XM2VTS database for our experiments. Our experiments show that a model which captures lip dynamics along with appearance can improve gender classification rates by between 16-21% compared to models of only lip appearance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Laughter is a frequently occurring social signal and an important part of human non-verbal communication. However it is often overlooked as a serious topic of scientific study. While the lack of research in this area is mostly due to laughter’s non-serious nature, it is also a particularly difficult social signal to produce on demand in a convincing manner; thus making it a difficult topic for study in laboratory settings. In this paper we provide some techniques and guidance for inducing both hilarious laughter and conversational laughter. These techniques were devised with the goal of capturing mo- tion information related to laughter while the person laughing was either standing or seated. Comments on the value of each of the techniques and general guidance as to the importance of atmosphere, environment and social setting are provided.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For the first time in this paper we present results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has upon the features themselves and show that by selecting coefficients with minimum variance w.r.t. pose angle, recognition performance can be improved when train-test pose angles differ. Experiments are conducted using the initial phase of a unique multi view Audio-Visual database designed specifically for research and development of pose-invariant lip-reading systems. We firstly show that it is the higher order horizontal spatial frequency components that become most detrimental as the pose deviates. Secondly we assess the performance of different feature selection masks across a range of pose angles including a new mask based on Minimum Cross-Pose Variance coefficients. We report a relative improvement of 50% in Word Error Rate when using our selection mask over a common energy based selection during profile view lip-reading.