128 results for Hand gesture recognition
Abstract:
N-gram language models and lexicon-based word recognition are popular methods in the literature for improving recognition accuracy on online and offline handwritten data. However, very few works apply these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus, and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word (8%) recognition accuracies. Lexicon methods offer much greater improvements (30%) in word recognition, but depend heavily on choosing the right lexicon. For comparison with lexicon and language model based methods, we have also explored re-evaluation techniques that use expert classifiers to improve symbol and word recognition accuracies.
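The bigram rescoring idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the toy Latin-letter corpus, the add-one smoothing, the `lm_weight` parameter and the function names `train_bigram` and `rescore` are all assumptions made for the example.

```python
import math
from collections import Counter

def train_bigram(corpus, alpha=1.0):
    """Build a symbol-level bigram model with additive smoothing (assumed scheme)."""
    unigrams = Counter()
    bigrams = Counter()
    vocab = set()
    for word in corpus:
        symbols = ["<s>"] + list(word) + ["</s>"]
        vocab.update(symbols)
        for prev, cur in zip(symbols, symbols[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    v = len(vocab)

    def log_prob(prev, cur):
        # Smoothed log P(cur | prev); unseen pairs get a small non-zero probability.
        return math.log((bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * v))

    return log_prob

def rescore(candidates, log_prob, lm_weight=1.0):
    """Pick the candidate maximizing classifier log-score plus weighted bigram log-probability."""
    def total(item):
        seq, score = item
        symbols = ["<s>"] + list(seq) + ["</s>"]
        lm = sum(log_prob(p, c) for p, c in zip(symbols, symbols[1:]))
        return score + lm_weight * lm
    return max(candidates, key=total)[0]

# Hypothetical toy corpus and recognizer hypotheses (sequence, classifier log-score).
corpus = ["abab", "abba", "baba", "abab"]
log_prob = train_bigram(corpus)
best = rescore([("aabb", -1.0), ("abab", -1.2)], log_prob)
print(best)
```

In this toy case the language model overrides the slightly higher classifier score of the first hypothesis, because the pair "aa" never occurs in the corpus.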
Abstract:
We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera captured scene images, born digital images (BDI) and street view images. Using a Matlab based tool developed by us, we have annotated at the pixel level more than 3600 word images from the five data sets. The word images binarized by the tool, as well as by our own midline analysis and propagation of segmentation (MAPS) algorithm, are recognized using the trial version of Nuance Omnipage OCR, and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011, respectively, and these values are higher than the best reported values in the literature of 61.1% and 41.2%, respectively. The MAPS result of 82.8% for the BDI 2011 data set matches the performance of the state-of-the-art method based on power law transform.
Abstract:
Due to limited available therapeutic options, developing new lead compounds against hepatitis C virus is an urgent need. Human La protein stimulates hepatitis C virus translation through interaction with the hepatitis C viral RNA. A cyclic peptide mimicking the beta-turn of the human La protein that interacts with the viral RNA was synthesized. It inhibits hepatitis C viral RNA translation significantly better than the corresponding linear peptide at longer post-treatment times. The cyclic peptide also inhibited replication as measured by replicon RNA levels using real time RT-PCR. The cyclic peptide emerges as a promising lead compound against hepatitis C.
Abstract:
In this paper, we present a machine learning approach for subject independent human action recognition using a depth camera, emphasizing the importance of depth in recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. In our approach, we have obtained the 2-D optical flow and used it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine to coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, henceforth referred to as PBL-McRBFN. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers with representation of every person and action in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted by a leave-one-subject-out strategy and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with the Video Analytics Lab (VAL) dataset and the Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.
Abstract:
In this paper, we report a breakthrough result on the difficult task of segmentation and recognition of coloured text from the word image dataset of ICDAR robust reading competition challenge 2: reading text in scene images. We split the word image into individual colour, gray and lightness planes and enhance the contrast of each of these planes independently by a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding. The plane that has the maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used on the binarized words for recognition. Our recognition results on ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As a baseline, the images binarized by simple global and local thresholding techniques were also recognized. The word recognition rate obtained by our non-linear enhancement and selection of plane method is 72.8% and 66.2% for ICDAR 2011 and 2003 word datasets, respectively. We have created ground-truth for each image at the pixel level to benchmark these datasets using a toolkit developed by us. The recognition rate of benchmarked images is 86.7% and 83.9% for ICDAR 2011 and 2003 datasets, respectively.
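A rough sketch of the plane-selection step described in the abstract above: each plane is enhanced with a power-law transform, scored by the maximum between-class variance from Otsu thresholding, and the best-scoring plane is kept. The exponent `gamma=1.5`, the flat-list image representation, the toy pixel values and the function names are assumptions made for the example; the abstract does not give the paper's parameter settings or any normalization it may apply.

```python
def power_law(pixels, gamma):
    """Apply s = 255 * (r / 255) ** gamma to 8-bit intensities."""
    return [round(255 * (p / 255) ** gamma) for p in pixels]

def otsu_max_variance(pixels):
    """Return (maximum between-class variance, threshold) over all 8-bit thresholds."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_t = 0.0, 0
    w0 = 0      # number of pixels at or below the threshold
    sum0 = 0    # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_var, best_t

def select_plane(planes, gamma=1.5):
    """planes: dict of name -> flat list of 8-bit pixels. Return (best plane name, its threshold)."""
    scores = {name: otsu_max_variance(power_law(px, gamma)) for name, px in planes.items()}
    best = max(scores, key=lambda name: scores[name][0])
    return best, scores[best][1]

# Hypothetical planes: "red" separates text from background far better than "gray".
planes = {
    "red": [10] * 50 + [240] * 50,
    "gray": [100] * 50 + [140] * 50,
}
name, threshold = select_plane(planes)
print(name, threshold)
```

The selected plane would then be binarized at the returned threshold before being passed to the OCR engine.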
Abstract:
We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm, derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which it is possible to find the instruments that are active at a given time. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.
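The weight-estimation idea in the abstract above can be illustrated with a simplified EM loop in which the instrument models are fixed and only the mixture weights are updated. This is a sketch under strong assumptions: each instrument is a single one-dimensional Student's-t density rather than a full tMM over audio features, and the model parameters, data values and function names are hypothetical.

```python
import math

def student_t_pdf(x, mu, sigma, nu):
    """Density of a location-scale Student's-t distribution."""
    z = (x - mu) / sigma
    norm = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * sigma * math.gamma(nu / 2))
    return norm * (1 + z * z / nu) ** (-(nu + 1) / 2)

def em_weights(data, models, iterations=50):
    """Estimate mixture weights by EM, keeping every component density fixed."""
    k = len(models)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        totals = [0.0] * k
        for x in data:
            # E-step: posterior responsibility of each instrument model for x.
            joint = [w * student_t_pdf(x, *m) for w, m in zip(weights, models)]
            norm = sum(joint)
            for i in range(k):
                totals[i] += joint[i] / norm
        # M-step: new weight is the average responsibility.
        weights = [t / len(data) for t in totals]
    return weights

# Hypothetical instrument models as (mu, sigma, nu) and a toy "polyphonic" frame sequence.
models = {"flute": (0.0, 1.0, 4.0), "piano": (10.0, 1.0, 4.0), "harp": (20.0, 1.0, 4.0)}
data = [-0.5, 0.2, 0.1, 9.8, 10.3, 0.4]
weights = em_weights(data, list(models.values()))
for instrument, w in zip(models, weights):
    print(instrument, round(w, 3))
```

In this toy run the weight of the unused "harp" model shrinks towards zero, which is the sense in which the weights can indicate instrument activity.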
Abstract:
Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural analyses using a non-redundant data set of known sialic acid binding proteins were carried out, including exhaustive binding site comparisons and site alignments using in-house algorithms, followed by clustering and tree computation; these have led to the derivation of sialic acid recognition principles. Although the proteins in the data set belong to several sequence and structure families, their binding sites could be grouped into only six types. Structural comparison of the binding sites indicates that all sites contain one or more different combinations of key structural features over a common scaffold. The six binding site types thus serve as structural motifs for recognizing sialic acid. Scanning the motifs against a non-redundant set of binding sites from PDB indicated the motifs to be specific for sialic acid recognition. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. As an example analysis, a genome-wide scan for the motifs in structures of the Mycobacterium tuberculosis proteome identified 17 hits that contain combinations of the features, suggesting a possible function of sialic acid binding by these proteins.
Abstract:
Facile synthesis of triad 3 and tetrad 4 incorporating -B(Mes)(2) (Mes = mesityl (2,4,6-trimethylphenyl)), boron dipyrromethene (BODIPY), and triphenylamine is reported. Introduction of two dissimilar acceptors (triarylborane and BODIPY) on a single donor resulted in two distinct intramolecular charge transfer processes (amine-to-borane and amine-to-BODIPY). The absorption and emission properties of the new triad and tetrad are highly dependent on individual building units. The nature of electronic communication among the individual fluorophore units has been comprehensively investigated and compared with building units. Compounds 3 and 4 showed chromogenic and fluorogenic responses for small anions such as fluoride and cyanide.
Abstract:
The enzyme inosine monophosphate dehydrogenase (IMPDH) is involved in the GMP biosynthesis pathway. Type I hIMPDH is expressed at lower levels in all cells, whereas type II is especially observed in acute myelogenous leukemia and chronic myelogenous leukemia cancer cells. Simulations (10 ns) of the IMP-NAD(+) complex structures (PDB ID 1B3O and 1JCN) have revealed the presence of a few conserved hydrophilic centers near the carboxamide group of NAD(+). Three conserved water molecules (W1, W, and W1') in the di-nucleotide binding pocket of the enzyme have played a significant role in the recognition of the carboxamide group (of NAD(+)) to the D274 and H93 residues. Based on the H-bonding interaction of conserved hydrophilic (water molecular) centers within IMP-NAD(+)-enzyme complexes and their recognition to NAD(+), a covalent modification at the carboxamide group of the di-nucleotide (NAD(+)) has been made by substituting the -CONH(2) group by -CONHNH2 (carboxyl hydrazide group) using a water mimic inhibitor design protocol. The modeled structure of the modified ligand may be useful for the development of an antileukemic agent, or it could act as a better inhibitor for hIMPDH-II.
Abstract:
Primates exhibit laterality in hand usage either in terms of (a) the hand with which an individual solves a task or, while solving a task that requires both hands, executes the most complex action, that is, hand preference, or (b) the hand with which an individual executes actions most efficiently, that is, hand performance. Observations from previous studies indicate that laterality in hand usage might reflect specialization of the two hands for accomplishing tasks that require maneuvering dexterity or physical strength. However, no existing study has investigated handedness with regard to this possibility. In this study, we examined laterality in hand usage in urban free-ranging bonnet macaques, Macaca radiata, with regard to the above possibility. While solving four distinct food extraction tasks which varied in the number of steps involved in the food extraction process and the dexterity required in executing the individual steps, the macaques consistently used one hand for extracting food (i.e., task requiring maneuvering dexterity), the maneuvering hand, and the other hand for supporting the body (i.e., task requiring physical strength), the supporting hand. Analogously, the macaques used the maneuvering hand for the spontaneous routine activities that involved maneuvering in three-dimensional space, such as grooming, and hitting an opponent during an agonistic interaction, and the supporting hand for those that required physical strength, such as pulling the body up while climbing. Moreover, while solving a task that ergonomically forced the usage of a particular hand, the macaques extracted food faster with the maneuvering hand as compared to the supporting hand, demonstrating the higher maneuvering dexterity of the maneuvering hand. As opposed to the conventional ideas of handedness in non-human primates, these observations demonstrate division of labor between the two hands marked by their consistent usage across spontaneous and experimental tasks requiring maneuvering in three-dimensional space or those requiring physical strength. Am. J. Primatol. 76:576-585, 2014. (c) 2013 Wiley Periodicals, Inc.
Abstract:
We develop noise robust features using Gammatone wavelets derived from the popular Gammatone functions. These wavelets incorporate the characteristics of human peripheral auditory systems, in particular the spatially-varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). The procedure involved in extracting GWCC from a speech signal is similar to that of the conventional Mel-Frequency Cepstral Coefficients (MFCC) technique, with the difference being in the type of filterbank used. We replace the conventional mel filterbank in MFCC with a Gammatone wavelet filterbank, which we construct using Gammatone wavelets. We also explore the effect of Gammatone filterbank based features (Gammatone Cepstral Coefficients (GCC)) for robust speech recognition. On AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that Gammatone based features yield a better recognition performance at low SNRs.
Abstract:
This paper discusses a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to portray a much faster algorithm than pixel domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding, and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at a speed (>2000 fps) approximately 100 times that of existing state-of-the-art pixel-domain algorithms.
Abstract:
Multi-species mating aggregations are crowded environments within which mate recognition must occur. Mating aggregations of fig wasps can consist of thousands of individuals of many species that attain sexual maturity simultaneously and mate in the same microenvironment, i.e., in syntopy, within the close confines of an enclosed globular inflorescence called a syconium - a system that has many signalling constraints such as darkness and crowding. All wasps develop within individual galled flowers. Since mating mostly occurs when females are still confined within their galls, male wasps have the additional burden of detecting conspecific females that are "hidden" behind barriers consisting of gall walls. In Ficus racemosa, we investigated signals used by pollinating fig wasp males to differentiate conspecific females from females of other syntopic fig wasp species. Male Ceratosolen fusciceps could detect conspecific females using cues from galls containing females, empty galls, as well as cues from gall volatiles and gall surface hydrocarbons. In many figs, syconia are pollinated by single foundress wasps, leading to high levels of wasp inbreeding due to sibmating. In F. racemosa, as most syconia contain many foundresses, we expected male pollinators to prefer non-sib females to female siblings to reduce inbreeding. We used galls containing females from non-natal figs as a proxy for non-sibs and those from natal figs as a proxy for sibling females. We found that males preferred galls of female pollinators from natal figs. However, males were undecided when given a choice between galls containing non-pollinator females from natal syconia and pollinator females from non-natal syconia, suggesting olfactory imprinting by the natal syconial environment. (C) 2013 Elsevier Masson SAS. All rights reserved.
Abstract:
Enzymes utilizing a pyridoxal 5'-phosphate (PLP) dependent mechanism for catalysis are observed in all cellular forms of living organisms. PLP-dependent enzymes catalyze a wide variety of reactions involving amino acid substrates and their analogs. Structurally, these ubiquitous enzymes have been classified into four major fold types. We have carried out investigations on the structure and function of the fold type I enzymes serine hydroxymethyl transferase and acetylornithine amino transferase, and the fold type II enzymes catabolic threonine deaminase, D-serine deaminase, D-cysteine desulfhydrase and diaminopropionate ammonia lyase. This review summarizes the major findings of investigations on fold type II enzymes in the context of similar studies on other PLP-dependent enzymes. Fold type II enzymes participate in pathways of both degradation and synthesis of amino acids. Polypeptide folds of these enzymes, features of their active sites, nature of interactions between the cofactor and the polypeptide, oligomeric structure, catalytic activities with various ligands, origin of specificity and plausible regulation of activity are briefly described. Analysis of the available crystal structures of fold type II enzymes revealed five different classes. The dimeric interfaces found in these enzymes vary across the classes and probably have functional significance.
Abstract:
In this article, we aim at reducing the error rate of the online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine classifier. Motivated by the relatively high percentage of occurrence of base consonants in the script, a reevaluation technique has been proposed to correct any ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters. Class-specific features derived from these regions aid in reducing the degree of confusion. Thirdly, statistics of specific features are proposed for resolving any confusions in vowel modifiers. The reevaluation approaches are tested on two databases (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate of the isolated test symbols of the IWFHR database improves by 1.9 %. For the word database, the incorporation of the reevaluation step improves the symbol recognition rate by 3.5 % (from 88.4 to 91.9 %). This, in turn, boosts the word recognition rate by 11.9 % (from 65.0 to 76.9 %). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.
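For readers unfamiliar with dynamic time warping, mentioned in the abstract above, the following is a generic sketch of the alignment it computes between two one-dimensional sequences. It shows only the standard DTW recurrence and backtracking; how the authors derive discriminative regions from such alignments is not described in the abstract, and the absolute-difference local cost and the function name `dtw` are assumptions made for the example.

```python
def dtw(a, b):
    """Return (total alignment cost, warping path) between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Two toy pen trajectories that differ only in speed align with zero cost.
distance, path = dtw([0, 1, 2, 3], [0, 1, 1, 2, 3])
print(distance, path)
```

Path positions with a large local distance mark where two aligned samples differ most, which is one plausible starting point for locating regions that separate confusable characters.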