31 resultados para Human face recognition (Computer science)

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increasing use of 3D modeling of Human Face in Face Recognition systems, User Interfaces, Graphics, Gaming and the like has made it an area of active study. Majority of the 3D sensors rely on color coded light projection for 3D estimation. Such systems fail to generate any response in regions covered by Facial Hair (like beard, mustache), and hence generate holes in the model which have to be filled manually later on. We propose the use of wavelet transform based analysis to extract the 3D model of Human Faces from a sinusoidal white light fringe projected image. Our method requires only a single image as input. The method is robust to texture variations on the face due to space-frequency localization property of the wavelet transform. It can generate models to pixel level refinement as the phase is estimated for each pixel by a continuous wavelet transform. In cases of sparse Facial Hair, the shape distortions due to hairs can be filtered out, yielding an estimate for the underlying face. We use a low-pass filtering approach to estimate the face texture from the same image. We demonstrate the method on several Human Faces both with and without Facial Hairs. Unseen views of the face are generated by texture mapping on different rotations of the obtained 3D structure. To the best of our knowledge, this is the first attempt to estimate 3D for Human Faces in presence of Facial hair structures like beard and mustache without generating holes in those areas.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

3D Face Recognition is an active area of research for past several years. For a 3D face recognition system one would like to have an accurate as well as low cost setup for constructing 3D face model. In this paper, we use Profilometry approach to obtain a 3D face model.This method gives a low cost solution to the problem of acquiring 3D data and the 3D face models generated by this method are sufficiently accurate. We also develop an algorithm that can use the 3D face model generated by the above method for the recognition purpose.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we use optical flow based complex-valued features extracted from video sequences to recognize human actions. The optical flow features between two image planes can be appropriately represented in the Complex plane. Therefore, we argue that motion information that is used to model the human actions should be represented as complex-valued features and propose a fast learning fully complex-valued neural classifier to solve the action recognition task. The classifier, termed as, ``fast learning fully complex-valued neural (FLFCN) classifier'' is a single hidden layer fully complex-valued neural network. The neurons in the hidden layer employ the fully complex-valued activation function of the type of a hyperbolic secant function. The parameters of the hidden layer are chosen randomly and the output weights are estimated as the minimum norm least square solution to a set of linear equations. The results indicate the superior performance of FLFCN classifier in recognizing the actions compared to real-valued support vector machines and other existing results in the literature. Complex valued representation of 2D motion and orthogonal decision boundaries boost the classification performance of FLFCN classifier. (c) 2012 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present a machine learning approach for subject independent human action recognition using depth camera, emphasizing the importance of depth in recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. In our approach, we have obtained the 2-D optical flow and used it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine to coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, referred to as PBL-McRBFN, henceforth. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBLMcRBFN uses the sample overlapping conditions, and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of a Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers with representation of every person and action in the training and testing datasets. Performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted by leave-one-subject-out strategy and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with Video Analytics Lab (VAL) dataset and Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Large variations in human actions lead to major challenges in computer vision research. Several algorithms are designed to solve the challenges. Algorithms that stand apart, help in solving the challenge in addition to performing faster and efficient manner. In this paper, we propose a human cognition inspired projection based learning for person-independent human action recognition in the H.264/AVC compressed domain and demonstrate a PBL-McRBEN based approach to help take the machine learning algorithms to the next level. Here, we use gradient image based feature extraction process where the motion vectors and quantization parameters are extracted and these are studied temporally to form several Group of Pictures (GoP). The GoP is then considered individually for two different bench mark data sets and the results are classified using person independent human action recognition. The functional relationship is studied using Projection Based Learning algorithm of the Meta-cognitive Radial Basis Function Network (PBL-McRBFN) which has a cognitive and meta-cognitive component. The cognitive component is a radial basis function network while the Meta-Cognitive Component(MCC) employs self regulation. The McC emulates human cognition like learning to achieve better performance. Performance of the proposed approach can handle sparse information in compressed video domain and provides more accuracy than other pixel domain counterparts. Performance of the feature extraction process achieved more than 90% accuracy using the PTIL-McRBFN which catalyzes the speed of the proposed high speed action recognition algorithm. We have conducted twenty random trials to find the performance in GoP. The results are also compared with other well known classifiers in machine learning literature.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose a H.264/AVC compressed domain human action recognition system with projection based metacognitive learning classifier (PBL-McRBFN). The features are extracted from the quantization parameters and the motion vectors of the compressed video stream for a time window and used as input to the classifier. Since compressed domain analysis is done with noisy, sparse compression parameters, it is a huge challenge to achieve performance comparable to pixel domain analysis. On the positive side, compressed domain allows rapid analysis of videos compared to pixel level analysis. The classification results are analyzed for different values of Group of Pictures (GOP) parameter, time window including full videos. The functional relationship between the features and action labels are established using PBL-McRBFN with a cognitive and meta-cognitive component. The cognitive component is a radial basis function, while the meta-cognitive component employs self-regulation to achieve better performance in subject independent action recognition task. The proposed approach is faster and shows comparable performance with respect to the state-of-the-art pixel domain counterparts. It employs partial decoding, which rules out the complexity of full decoding, and minimizes computational load and memory usage. This results in reduced hardware utilization and increased speed of classification. The results are compared with two benchmark datasets and show more than 90% accuracy using the PBL-McRBFN. The performance for various GOP parameters and group of frames are obtained with twenty random trials and compared with other well-known classifiers in machine learning literature. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a completely automatic approach for recognizing low resolution face images captured in uncontrolled environment. The approach uses multidimensional scaling to learn a common transformation matrix for the entire face which simultaneously transforms the facial features of the low resolution and the high resolution training images such that the distance between them approximates the distance had both the images been captured under the same controlled imaging conditions. Stereo matching cost is used to obtain the similarity of two images in the transformed space. Though this gives very good recognition performance, the time taken for computing the stereo matching cost is significant. To overcome this limitation, we propose a reference-based approach in which each face image is represented by its stereo matching cost from a few reference images. Experimental evaluation on the real world challenging databases and comparison with the state-of-the-art super-resolution, classifier based and cross modal synthesis techniques show the effectiveness of the proposed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Indian logic has a long history. It somewhat covers the domains of two of the six schools (darsanas) of Indian philosophy, namely, Nyaya and Vaisesika. The generally accepted definition of Indian logic over the ages is the science which ascertains valid knowledge either by means of six senses or by means of the five members of the syllogism. In other words, perception and inference constitute the subject matter of logic. The science of logic evolved in India through three ages: the ancient, the medieval and the modern, spanning almost thirty centuries. Advances in Computer Science, in particular, in Artificial Intelligence have got researchers in these areas interested in the basic problems of language, logic and cognition in the past three decades. In the 1980s, Artificial Intelligence has evolved into knowledge-based and intelligent system design, and the knowledge base and inference engine have become standard subsystems of an intelligent system. One of the important issues in the design of such systems is knowledge acquisition from humans who are experts in a branch of learning (such as medicine or law) and transferring that knowledge to a computing system. The second important issue in such systems is the validation of the knowledge base of the system i.e. ensuring that the knowledge is complete and consistent. It is in this context that comparative study of Indian logic with recent theories of logic, language and knowledge engineering will help the computer scientist understand the deeper implications of the terms and concepts he is currently using and attempting to develop.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present a fast learning neural network classifier for human action recognition. The proposed classifier is a fully complex-valued neural network with a single hidden layer. The neurons in the hidden layer employ the fully complex-valued hyperbolic secant as an activation function. The parameters of the hidden layer are chosen randomly and the output weights are estimated analytically as a minimum norm least square solution to a set of linear equations. The fast leaning fully complex-valued neural classifier is used for recognizing human actions accurately. Optical flow-based features extracted from the video sequences are utilized to recognize 10 different human actions. The feature vectors are computationally simple first order statistics of the optical flow vectors, obtained from coarse to fine rectangular patches centered around the object. The results indicate the superior performance of the complex-valued neural classifier for action recognition. The superior performance of the complex neural network for action recognition stems from the fact that motion, by nature, consists of two components, one along each of the axes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses a novel high-speed approach for human action recognition in H. 264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can effect in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions with speed (>2000 fps) approximately 100 times more than existing state-of-the-art pixel-domain algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose to develop a 3-D optical flow features based human action recognition system. Optical flow based features are employed here since they can capture the apparent movement in object, by design. Moreover, they can represent information hierarchically from local pixel level to global object level. In this work, 3-D optical flow based features a re extracted by combining the 2-1) optical flow based features with the depth flow features obtained from depth camera. In order to develop an action recognition system, we employ a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). The m of McFIS is to find the decision boundary separating different classes based on their respective optical flow based features. McFIS consists of a neuro-fuzzy inference system (cognitive component) and a self-regulatory learning mechanism (meta-cognitive component). During the supervised learning, self-regulatory learning mechanism monitors the knowledge of the current sample with respect to the existing knowledge in the network and controls the learning by deciding on sample deletion, sample learning or sample reserve strategies. The performance of the proposed action recognition system was evaluated on a proprietary data set consisting of eight subjects. The performance evaluation with standard support vector machine classifier and extreme learning machine indicates improved performance of McFIS is recognizing actions based of 3-D optical flow based features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Facial emotions are the most expressive way to display emotions. Many algorithms have been proposed which employ a particular set of people (usually a database) to both train and test their model. This paper focuses on the challenging task of database independent emotion recognition, which is a generalized case of subject-independent emotion recognition. The emotion recognition system employed in this work is a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). McFIS has two components, a neuro-fuzzy inference system, which is the cognitive component and a self-regulatory learning mechanism, which is the meta-cognitive component. The meta-cognitive component, monitors the knowledge in the neuro-fuzzy inference system and decides on what-to-learn, when-to-learn and how-to-learn the training samples, efficiently. For each sample, the McFIS decides whether to delete the sample without being learnt, use it to add/prune or update the network parameter or reserve it for future use. This helps the network avoid over-training and as a result improve its generalization performance over untrained databases. In this study, we extract pixel based emotion features from well-known (Japanese Female Facial Expression) JAFFE and (Taiwanese Female Expression Image) TFEID database. Two sets of experiment are conducted. First, we study the individual performance of both databases on McFIS based on 5-fold cross validation study. Next, in order to study the generalization performance, McFIS trained on JAFFE database is tested on TFEID and vice-versa. The performance The performance comparison in both experiments against SVNI classifier gives promising results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses a novel high-speed approach for human action recognition in H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of the proposed work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can result in reduced hardware utilization and faster recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust to outdoor as well as indoor testing scenarios. We have evaluated the performance of the proposed method on two benchmark action datasets and achieved more than 85 % accuracy. The proposed algorithm classifies actions with speed (> 2,000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract-The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the Identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous normaliy distributed featureIss developed. The three cases of using a) single feature, b)multipliendependent measurements of a single feature, and c)multpleindependent features are explored.The number iofndependent features needed for areliable personal identification is computed based on the theoretcal model and an expklatory study of some speech featues.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An adaptive learning scheme, based on a fuzzy approximation to the gradient descent method for training a pattern classifier using unlabeled samples, is described. The objective function defined for the fuzzy ISODATA clustering procedure is used as the loss function for computing the gradient. Learning is based on simultaneous fuzzy decisionmaking and estimation. It uses conditional fuzzy measures on unlabeled samples. An exponential membership function is assumed for each class, and the parameters constituting these membership functions are estimated, using the gradient, in a recursive fashion. The induced possibility of occurrence of each class is useful for estimation and is computed using 1) the membership of the new sample in that class and 2) the previously computed average possibility of occurrence of the same class. An inductive entropy measure is defined in terms of induced possibility distribution to measure the extent of learning. The method is illustrated with relevant examples.