998 resultados para digit recognition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common data base of 256× 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16× 16 pixel low-pass filtered and decimated versions of the same data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Development of Malayalam speech recognition system is in its infancy stage; although many works have been done in other Indian languages. In this paper we present the first work on speaker independent Malayalam isolated speech recognizer based on PLP (Perceptual Linear Predictive) Cepstral Coefficient and Hidden Markov Model (HMM). The performance of the developed system has been evaluated with different number of states of HMM (Hidden Markov Model). The system is trained with 21 male and female speakers in the age group ranging from 19 to 41 years. The system obtained an accuracy of 99.5% with the unseen data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A primary medium for the human beings to communicate through language is Speech. Automatic Speech Recognition is wide spread today. Recognizing single digits is vital to a number of applications such as voice dialling of telephone numbers, automatic data entry, credit card entry, PIN (personal identification number) entry, entry of access codes for transactions, etc. In this paper we present a comparative study of SVM (Support Vector Machine) and HMM (Hidden Markov Model) to recognize and identify the digits used in Malayalam speech.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian ``ink generators'' spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition driven segmentation. (3) The method involves a relatively small number of parameters and hence training is relatively easy and fast. (4) Unlike many other recognition schemes it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is it requires much more computation than more standard OCR techniques.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Garment information tracking is required for clean room garment management. In this paper, we present a camera-based robust system with implementation of Optical Character Reconition (OCR) techniques to fulfill garment label recognition. In the system, a camera is used for image capturing; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then adopted for object detection in the binary image as a part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the system is verified by a system database. The system has been tested. The results show that it is capable of coping with variance of lighting, digit twisting, background complexity, and font orientations. The system performance with association to the digit recognition rate has met the design requirement. It has achieved real-time and error-free garment information tracking during the testing.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker independent isolated digit recognition task, and compare our approach to two commonly adopted feature based techniques for incorporating speech dynamics. Results are presented from baseline feature based systems and the combined modelling technique. We illustrate that both of these techniques achieve similar levels of performance when used independently. However significant improvements in performance can be achieved through a combination of the two. In particular we report an improvement in excess of 17% relative Word Error Rate in comparison to our best baseline system.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing telephone, automated banking system, etc. This paper presents speaker independent speech recognition system for Malayalam digits. The system employs Mel frequency cepstrum coefficient (MFCC) as feature for signal processing and Hidden Markov model (HMM) for recognition. The system is trained with 21 male and female voices in the age group of 20 to 40 years and there was 98.5% word recognition accuracy (94.8% sentence recognition accuracy) on a test set of continuous digit recognition task.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Although much of the brain’s functional organization is genetically predetermined, it appears that some noninnate functions can come to depend on dedicated and segregated neural tissue. In this paper, we describe a series of experiments that have investigated the neural development and organization of one such noninnate function: letter recognition. Functional neuroimaging demonstrates that letter and digit recognition depend on different neural substrates in some literate adults. How could the processing of two stimulus categories that are distinguished solely by cultural conventions become segregated in the brain? One possibility is that correlation-based learning in the brain leads to a spatial organization in cortex that reflects the temporal and spatial clustering of letters with letters in the environment. Simulations confirm that environmental co-occurrence does indeed lead to spatial localization in a neural network that uses correlation-based learning. Furthermore, behavioral studies confirm one critical prediction of this co-occurrence hypothesis, namely, that subjects exposed to a visual environment in which letters and digits occur together rather than separately (postal workers who process letters and digits together in Canadian postal codes) do indeed show less behavioral evidence for segregated letter and digit processing.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Deep neural networks have recently gained popularity for improv- ing state-of-the-art machine learning algorithms in diverse areas such as speech recognition, computer vision and bioinformatics. Convolutional networks especially have shown prowess in visual recognition tasks such as object recognition and detection in which this work is focused on. Mod- ern award-winning architectures have systematically surpassed previous attempts at tackling computer vision problems and keep winning most current competitions. After a brief study of deep learning architectures and readily available frameworks and libraries, the LeNet handwriting digit recognition network study case is developed, and lastly a deep learn- ing network for playing simple videogames is reviewed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

讨论基于多种分类方法的模块组合实现的混合模式识别系统,它不同于利用多分类器输出结果表决的集成系统.提出两个系统:一个面向印刷体汉字文本识别,另一个面向自由手写体数字识别.利用多种特征和多种分类方法的组合、部分识别信息控制混淆字判别策略以及提出的动态模板库匹配后处理方法,使系统的性能与传统单一分类器系统比较,获得明显改善.实验表明:多方法多策略混合是解决复杂和增强系统鲁棒性的一条途径

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Ce mémoire est composé de trois articles et présente les résultats de travaux de recherche effectués dans le but d'améliorer les techniques actuelles permettant d'utiliser des données associées à certaines tâches dans le but d'aider à l'entraînement de réseaux de neurones sur une tâche différente. Les deux premiers articles présentent de nouveaux ensembles de données créés pour permettre une meilleure évaluation de ce type de techniques d'apprentissage machine. Le premier article introduit une suite d'ensembles de données pour la tâche de reconnaissance automatique de chiffres écrits à la main. Ces ensembles de données ont été générés à partir d'un ensemble de données déjà existant, MNIST, auquel des nouveaux facteurs de variation ont été ajoutés. Le deuxième article introduit un ensemble de données pour la tâche de reconnaissance automatique d'expressions faciales. Cet ensemble de données est composé d'images de visages qui ont été collectées automatiquement à partir du Web et ensuite étiquetées. Le troisième et dernier article présente deux nouvelles approches, dans le contexte de l'apprentissage multi-tâches, pour tirer avantage de données pour une tâche donnée afin d'améliorer les performances d'un modèle sur une tâche différente. La première approche est une généralisation des neurones Maxout récemment proposées alors que la deuxième consiste en l'application dans un contexte supervisé d'une technique permettant d'inciter des neurones à apprendre des fonctions orthogonales, à l'origine proposée pour utilisation dans un contexte semi-supervisé.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We introduce Thurstonian Boltzmann Machines (TBM), a unified architecture that can naturally incorporate a wide range of data inputs at the same time. Our motivation rests in the Thurstonian view that many discrete data types can be considered as being generated from a subset of underlying latent continuous variables, and in the observation that each realisation of a discrete type imposes certain inequalities on those variables. Thus learning and inference in TBM reduce to making sense of a set of inequalities. Our proposed TBM naturally supports the following types: Gaussian, intervals, censored, binary, categorical, muticategorical, ordinal, (in)-complete rank with and without ties. We demonstrate the versatility and capacity of the proposed model on three applications of very different natures; namely handwritten digit recognition, collaborative filtering and complex social survey analysis. Copyright 2013 by the author(s).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Neuromorphic computing has become an emerging field in wide range of applications. Its challenge lies in developing a brain-inspired architecture that can emulate human brain and can work for real time applications. In this report a flexible neural architecture is presented which consists of 128 X 128 SRAM crossbar memory and 128 spiking neurons. For Neuron, digital integrate and fire model is used. All components are designed in 45nm technology node. The core can be configured for certain Neuron parameters, Axon types and synapses states and are fully digitally implemented. Learning for this architecture is done offline. To train this circuit a well-known algorithm Restricted Boltzmann Machine (RBM) is used and linear classifiers are trained at the output of RBM. Finally, circuit was tested for handwritten digit recognition application. Future prospects for this architecture are also discussed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However, conventional PCA does not correspond to a probability density, and so there is no unique way to combine PCA models. Previous attempts to formulate mixture models for PCA have therefore to some extent been ad hoc. In this paper, PCA is formulated within a maximum-likelihood framework, based on a specific form of Gaussian latent variable model. This leads to a well-defined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context of clustering, density modelling and local dimensionality reduction, and we demonstrate its application to image compression and handwritten digit recognition.