32 resultados para Speech Recognition System using LPC


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation for Speaker recognition work is presented in the first part of the thesis. An exhaustive survey of past work in this field is also presented. A low cost system not including complex computation has been chosen for implementation. Towards achieving this a PC based system is designed and developed. A front end analog to digital convertor (12 bit) is built and interfaced to a PC. Software to control the ADC and to perform various analytical functions including feature vector evaluation is developed. It is shown that a fixed set of phrases incorporating evenly balanced phonemes is aptly suited for the speaker recognition work at hand. A set of phrases are chosen for recognition. Two new methods are adopted for the feature evaluation. Some new measurements involving a symmetry check method for pitch period detection and ACE‘ are used as featured. Arguments are provided to show the need for a new model for speech production. Starting from heuristic, a knowledge based (KB) speech production model is presented. In this model, a KB provides impulses to a voice producing mechanism and constant correction is applied via a feedback path. It is this correction that differs from speaker to speaker. Methods of defining measurable parameters for use as features are described. Algorithms for speaker recognition are developed and implemented. Two methods are presented. The first is based on the model postulated. Here the entropy on the utterance of a phoneme is evaluated. The transitions of voiced regions are used as speaker dependent features. The second method presented uses features found in other works, but evaluated differently. A knock—out scheme is used to provide the weightage values for the selection of features. Results of implementation are presented which show on an average of 80% recognition. It is also shown that if there are long gaps between sessions, the performance deteriorates and is speaker dependent. Cross recognition percentages are also presented and this in the worst case rises to 30% while the best case is 0%. Suggestions for further work are given in the concluding chapter.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Any automatically measurable, robust and distinctive physical characteristic or personal trait that can be used to identify an individual or verify the claimed identity of an individual, referred to as biometrics, has gained significant interest in the wake of heightened concerns about security and rapid advancements in networking, communication and mobility. Multimodal biometrics is expected to be ultra-secure and reliable, due to the presence of multiple and independent—verification clues. In this study, a multimodal biometric system utilising audio and facial signatures has been implemented and error analysis has been carried out. A total of one thousand face images and 250 sound tracks of 50 users are used for training the proposed system. To account for the attempts of the unregistered signatures data of 25 new users are tested. The short term spectral features were extracted from the sound data and Vector Quantization was done using K-means algorithm. Face images are identified based on Eigen face approach using Principal Component Analysis. The success rate of multimodal system using speech and face is higher when compared to individual unimodal recognition systems

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we address the problem of face detection and recognition of grey scale frontal view images. We propose a face recognition system based on probabilistic neural networks (PNN) architecture. The system is implemented using voronoi/ delaunay tessellations and template matching. Images are segmented successfully into homogeneous regions by virtue of voronoi diagram properties. Face verification is achieved using matching scores computed by correlating edge gradients of reference images. The advantage of classification using PNN models is its short training time. The correlation based template matching guarantees good classification results

Relevância:

100.00% 100.00%

Publicador:

Resumo:

n this paper we address the problem of face detection and recognition of grey scale frontal view images. We propose a face recognition system based on probabilistic neural networks (PNN) architecture. The system is implemented using voronoi/ delaunay tessellations and template matching. Images are segmented successfully into homogeneous regions by virtue of voronoi diagram properties. Face verification is achieved using matching scores computed by correlating edge gradients of reference images. The advantage of classification using PNN models is its short training time. The correlation based template matching guarantees good classification results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose a handwritten character recognition system for Malayalam language. The feature extraction phase consists of gradient and curvature calculation and dimensionality reduction using Principal Component Analysis. Directional information from the arc tangent of gradient is used as gradient feature. Strength of gradient in curvature direction is used as the curvature feature. The proposed system uses a combination of gradient and curvature feature in reduced dimension as the feature vector. For classification, discriminative power of Support Vector Machine (SVM) is evaluated. The results reveal that SVM with Radial Basis Function (RBF) kernel yield the best performance with 96.28% and 97.96% of accuracy in two different datasets. This is the highest accuracy ever reported on these datasets

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using Kohonen network. It would help in recognizing Malayalam text entered using pen-like devices. It will be more natural and efficient way for users to enter text using a pen than keyboard and mouse. To identify the difference between similar characters in Malayalam a novel feature extraction method has been adopted-a combination of context bitmap and normalized (x, y) coordinates. The system reported an accuracy of 88.75% which is writer independent with a recognition time of 15-32 milliseconds

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Timely detection of sudden change in dynamics that adversely affect the performance of systems and quality of products has great scientific relevance. This work focuses on effective detection of dynamical changes of real time signals from mechanical as well as biological systems using a fast and robust technique of permutation entropy (PE). The results are used in detecting chatter onset in machine turning and identifying vocal disorders from speech signal.Permutation Entropy is a nonlinear complexity measure which can efficiently distinguish regular and complex nature of any signal and extract information about the change in dynamics of the process by indicating sudden change in its value. Here we propose the use of permutation entropy (PE), to detect the dynamical changes in two non linear processes, turning under mechanical system and speech under biological system.Effectiveness of PE in detecting the change in dynamics in turning process from the time series generated with samples of audio and current signals is studied. Experiments are carried out on a lathe machine for sudden increase in depth of cut and continuous increase in depth of cut on mild steel work pieces keeping the speed and feed rate constant. The results are applied to detect chatter onset in machining. These results are verified using frequency spectra of the signals and the non linear measure, normalized coarse-grained information rate (NCIR).PE analysis is carried out to investigate the variation in surface texture caused by chatter on the machined work piece. Statistical parameter from the optical grey level intensity histogram of laser speckle pattern recorded using a charge coupled device (CCD) camera is used to generate the time series required for PE analysis. Standard optical roughness parameter is used to confirm the results.Application of PE in identifying the vocal disorders is studied from speech signal recorded using microphone. Here analysis is carried out using speech signals of subjects with different pathological conditions and normal subjects, and the results are used for identifying vocal disorders. Standard linear technique of FFT is used to substantiate thc results.The results of PE analysis in all three cases clearly indicate that this complexity measure is sensitive to change in regularity of a signal and hence can suitably be used for detection of dynamical changes in real world systems. This work establishes the application of the simple, inexpensive and fast algorithm of PE for the benefit of advanced manufacturing process as well as clinical diagnosis in vocal disorders.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speech signals are one of the most important means of communication among the human beings. In this paper, a comparative study of two feature extraction techniques are carried out for recognizing speaker independent spoken isolated words. First one is a hybrid approach with Linear Predictive Coding (LPC) and Artificial Neural Networks (ANN) and the second method uses a combination of Wavelet Packet Decomposition (WPD) and Artificial Neural Networks. Voice signals are sampled directly from the microphone and then they are processed using these two techniques for extracting the features. Words from Malayalam, one of the four major Dravidian languages of southern India are chosen for recognition. Training, testing and pattern recognition are performed using Artificial Neural Networks. Back propagation method is used to train the ANN. The proposed method is implemented for 50 speakers uttering 20 isolated words each. Both the methods produce good recognition accuracy. But Wavelet Packet Decomposition is found to be more suitable for recognizing speech because of its multi-resolution characteristics and efficient time frequency localizations

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Malayalam is one of the 22 scheduled languages in India with more than 130 million speakers. This paper presents a report on the development of a speaker independent, continuous transcription system for Malayalam. The system employs Hidden Markov Model (HMM) for acoustic modeling and Mel Frequency Cepstral Coefficient (MFCC) for feature extraction. It is trained with 21 male and female speakers in the age group ranging from 20 to 40 years. The system obtained a word recognition accuracy of 87.4% and a sentence recognition accuracy of 84%, when tested with a set of continuous speech data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a content based image retrieval (CBIR) system using the local colour and texture features of selected image sub-blocks and global colour and shape features of the image. The image sub-blocks are roughly identified by segmenting the image into partitions of different configuration, finding the edge density in each partition using edge thresholding, morphological dilation and finding the corner density in each partition. The colour and texture features of the identified regions are computed from the histograms of the quantized HSV colour space and Gray Level Co- occurrence Matrix (GLCM) respectively. A combined colour and texture feature vector is computed for each region. The shape features are computed from the Edge Histogram Descriptor (EHD). Euclidean distance measure is used for computing the distance between the features of the query and target image. Experimental results show that the proposed method provides better retrieving result than retrieval using some of the existing methods

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a region based image retrieval system using the local colour and texture features of image sub regions. The regions of interest (ROI) are roughly identified by segmenting the image into fixed partitions, finding the edge map and applying morphological dilation. The colour and texture features of the ROIs are computed from the histograms of the quantized HSV colour space and Gray Level co- occurrence matrix (GLCM) respectively. Each ROI of the query image is compared with same number of ROIs of the target image that are arranged in the descending order of white pixel density in the regions, using Euclidean distance measure for similarity computation. Preliminary experimental results show that the proposed method provides better retrieving result than retrieval using some of the existing methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a content based image retrieval (CBIR) system using the local colour and texture features of selected image sub-blocks and global colour and shape features of the image. The image sub-blocks are roughly identified by segmenting the image into partitions of different configuration, finding the edge density in each partition using edge thresholding, morphological dilation. The colour and texture features of the identified regions are computed from the histograms of the quantized HSV colour space and Gray Level Co- occurrence Matrix (GLCM) respectively. A combined colour and texture feature vector is computed for each region. The shape features are computed from the Edge Histogram Descriptor (EHD). A modified Integrated Region Matching (IRM) algorithm is used for finding the minimum distance between the sub-blocks of the query and target image. Experimental results show that the proposed method provides better retrieving result than retrieval using some of the existing methods

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speech is a natural mode of communication for people and speech recognition is an intensive area of research due to its versatile applications. This paper presents a comparative study of various feature extraction methods based on wavelets for recognizing isolated spoken words. Isolated words from Malayalam, one of the four major Dravidian languages of southern India are chosen for recognition. This work includes two speech recognition methods. First one is a hybrid approach with Discrete Wavelet Transforms and Artificial Neural Networks and the second method uses a combination of Wavelet Packet Decomposition and Artificial Neural Networks. Features are extracted by using Discrete Wavelet Transforms (DWT) and Wavelet Packet Decomposition (WPD). Training, testing and pattern recognition are performed using Artificial Neural Networks (ANN). The proposed method is implemented for 50 speakers uttering 20 isolated words each. The experimental results obtained show the efficiency of these techniques in recognizing speech

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A primary medium for the human beings to communicate through language is Speech. Automatic Speech Recognition is wide spread today. Recognizing single digits is vital to a number of applications such as voice dialling of telephone numbers, automatic data entry, credit card entry, PIN (personal identification number) entry, entry of access codes for transactions, etc. In this paper we present a comparative study of SVM (Support Vector Machine) and HMM (Hidden Markov Model) to recognize and identify the digits used in Malayalam speech.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The objective of the study is to develop a hand written character recognition system that could recognisze all the characters in the mordern script of malayalam language at a high recognition rate