142 resultados para Online handwriting recognition
Resumo:
Effective feature extraction for robust speech recognition is a widely addressed topic and currently there is much effort to invoke non-stationary signal models instead of quasi-stationary signal models leading to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling and recently new feature sets for automatic speech recognition (ASR) have been derived based on a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performances for robust speech recognition in noise, using the AURORA-2 database. We show that FEPSTRUM representation proposed is more effective than others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM
Resumo:
Segmental dynamic time warping (DTW) has been demonstrated to be a useful technique for finding acoustic similarity scores between segments of two speech utterances. Due to its high computational requirements, it had to be computed in an offline manner, limiting the applications of the technique. In this paper, we present results of parallelization of this task by distributing the workload in either a static or dynamic way on an 8-processor cluster and discuss the trade-offs among different distribution schemes. We show that online unsupervised pattern discovery using segmental DTW is plausible with as low as 8 processors. This brings the task within reach of today's general purpose multi-core servers. We also show results on a 32-processor system, and discuss factors affecting scalability of our methods.
Resumo:
3D Face Recognition is an active area of research for past several years. For a 3D face recognition system one would like to have an accurate as well as low cost setup for constructing 3D face model. In this paper, we use Profilometry approach to obtain a 3D face model.This method gives a low cost solution to the problem of acquiring 3D data and the 3D face models generated by this method are sufficiently accurate. We also develop an algorithm that can use the 3D face model generated by the above method for the recognition purpose.
Resumo:
Solubilization of single walled carbon nanotubes (SWNTs) in aqueous milieu by self assembly of bivalent glycolipids is described. Thorough analysis of the resulting composites involving Vis/near-IR spectroscopy, surface plasmon resonance, confocal Raman and atomic force microscopy reveals that glycolipid-coated SWNTs possess specific molecular recognition properties towards lectins.
Resumo:
Ergonomic design of products demands accurate human dimensions-anthropometric data. Manual measurement over live subjects, has several limitations like long time, required presence of subjects for every new measurement, physical contact etc. Hence the data currently available is limited and anthropometric data related to facial features is difficult to obtain. In this paper, we discuss a methodology to automatically detect facial features and landmarks from scanned human head models. Segmentation of face into meaningful patches corresponding to facial features is achieved by Watershed algorithms and Mathematical Morphology tools. Many Important physiognomical landmarks are identified heuristically.
Resumo:
We address the problem of recognition and retrieval of relatively weak industrial signal such as Partial Discharges (PD) buried in excessive noise. The major bottleneck being the recognition and suppression of stochastic pulsive interference (PI) which has similar time-frequency characteristics as PD pulse. Therefore conventional frequency based DSP techniques are not useful in retrieving PD pulses. We employ statistical signal modeling based on combination of long-memory process and probabilistic principal component analysis (PPCA). An parametric analysis of the signal is exercised for extracting the features of desired pules. We incorporate a wavelet based bootstrap method for obtaining the noise training vectors from observed data. The procedure adopted in this work is completely different from the research work reported in the literature, which is generally based on deserved signal frequency and noise frequency.
Resumo:
Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) which does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model, for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.
Resumo:
Some experimental results on the recognition of three-dimensional wire-frame objects are presented. In order to overcome the limitations of a recent model, which employs radial basis functions-based neural networks, we have proposed a hybrid learning system for object recognition, featuring: an optimization strategy (simulated annealing) in order to avoid local minima of an energy functional; and an appropriate choice of centers of the units. Further, in an attempt to achieve improved generalization ability, and to reduce the time for training, we invoke the principle of self-organization which utilises an unsupervised learning algorithm.
Resumo:
This paper presents an artificial feed forward neural network (FFNN) approach for the assessment of power system voltage stability. A novel approach based on the input-output relation between real and reactive power, as well as voltage vectors for generators and load buses is used to train the neural net (NN). The input properties of the feed forward network are generated from offline training data with various simulated loading conditions using a conventional voltage stability algorithm based on the L-index. The neural network is trained for the L-index output as the target vector for each of the system loads. Two separate trained NN, corresponding to normal loading and contingency, are investigated on the 367 node practical power system network. The performance of the trained artificial neural network (ANN) is also investigated on the system under various voltage stability assessment conditions. As compared to the computationally intensive benchmark conventional software, near accurate results in the value of L-index and thus the voltage profile were obtained. Proposed algorithm is fast, robust and accurate and can be used online for predicting the L-indices of all the power system buses. The proposed ANN approach is also shown to be effective and computationally feasible in voltage stability assessment as well as potential enhancements within an overall energy management system in order to determining local and global stability indices