Digital Library

993 results for Speech perception

Anticipating the future of prevention based on genetic risk: a qualitative analysis of the perceptions of participants in the "Dessine-moi un futur!" study

Relevance: 20.00%

Abstract:

In this era of the "new public health," professionals are urged to shift their attention away from the individual in order to focus on the social determinants of health. The opposite is happening in the biomedical sciences, where a movement toward personalized health makes it possible to envisage preventive and curative care tailored to each individual according to his or her genetic risk profile. Although these scientific advances have only partially entered our health care system, they are likely to change the face of prevention significantly and, in the process, to give rise to major public debates. The proposed study aims to contribute to a reflection on the future of one of the essential functions of public health by seeking to better understand how the public perceives prevention based on genetic risk. This qualitative research project consists of a secondary analysis of the exchanges that took place during four deliberative workshops attended by members of the public from diverse backgrounds, during which they debated the desirability of a fictional preventive technology, the "cardiac rectifier" (rectificateur cardiaque). Anthony Giddens's structuration theory serves as the conceptual framework guiding the analysis of the exchanges. The analysis yields three findings: (a) the "cardiac rectifier" is far from being interpreted by all participants as a preventive intervention; (b) its use is perceived as legitimate or not depending mainly on the groups of people it would target; (c) the proposed intervention cannot be thought of out of context.

Non-custodial parents' perception of their bond with their child in a context where parental conflicts persist after marital separation

Relevance: 20.00%

Abstract:

It is well known that many children experience their parents' marital separation. After the separation, most children live with their mother (the custodial parent) while maintaining ties with their father (the non-custodial parent). Although legal principles hold that the child has the right to preserve ties with each parent after the separation, these ties are no longer sustained on a daily basis and may be affected. A child living through the parents' separation may be exposed to parental conflict, since separation can increase its intensity. The objective of this master's thesis is to better understand non-custodial parents' perception of their bond with their child in a context where parental conflicts persist after marital separation. A sub-objective is to document the factors that influence the bond between non-custodial parents and their child after the separation. To this end, individual semi-structured interviews were conducted with eight non-custodial parents, and a thematic content analysis of their perspective on the research topic was carried out. From the non-custodial parents' perspective, the results show that the quality of the relationship between them and their child remains positive. The most prominent factor is post-separation parental conflict. It emerges that this conflict feeds other factors, such as custody arrangements and access rights, the frequency of contact between non-custodial parents and their child, the children's behaviour toward their non-custodial parent, the parental involvement of non-custodial parents, and the post-separation parental relationship.

Naive non-native perception of Portuguese nasal vowels

Relevance: 20.00%

Abstract:

Adults may have difficulty discriminating phonemes of a second language (L2) that do not serve to distinguish lexical items in their native language (L1). Brown's (1998) Feature Model (FM) proposes that adults can create new sound categories only if these can be built from distinctive features that exist in the listeners' L1. This hypothesis has been tested on several consonantal contrasts in different languages; however, it appears that features applying to vowels have never been examined from this perspective, still less features that operate in both the vocalic and consonantal systems and that may have distinctive or non-distinctive status. The main objective of the present study was to test the validity of the FM for the oral-nasal vowel contrast of Brazilian Portuguese (BP). Naive perception of the /i/-/ĩ/ contrast by speakers of French, English, Caribbean Spanish and conservative Spanish was examined, given that these four languages differ in the status of nasality. In addition, perception of the non-naive contrast /e/-/ẽ/ was included in order to compare performance in naive and non-naive perception. The results for naive discrimination of /i/-/ĩ/ support the following conclusions for first exposure to a non-native contrast: (1) a [nasal] feature that operates distinctively in the grammar of a given L1 can be redeployed within the vowel system; (2) a [nasal] feature that operates distinctively in the grammar of a given L1 cannot be redeployed across systems (consonant to vowel); and (3) a [nasal] feature that operates non-distinctively in the grammar of a given L1 may or may not be redeployed to distinctive status. Finally, non-naive discrimination of /e/-/ẽ/ was successful for all groups, suggesting that all three types of redeployment become possible with more experience in the L2.

Once upon a time there was a target and a distractor: electrophysiology of the cortical mechanisms of visual attention in perception and memory

Relevance: 20.00%

Abstract:

This work explores, in three parts, aspects of the attentional processing of visual targets and distractors and their electrophysiological measures. The first chapter addresses attentional processing specific to the target and to distractors during visual search. Dividing the N2pc into an NT and a PD calls into question the theory that attentional activity is systematically associated with a salient distractor, because a green distractor elicits no lateralized activity of its own. The second chapter addresses the lateralization of the structures responsible for maintaining and retrieving information in visual short-term memory. Using a paradigm that lateralizes the target and the distractor, we are able to verify that there is a negative lateralized component over the temporal region, the TCN, specific to the target during memory recall. In addition, a lateralized component for the distractor is observed over the posterior part of the scalp. These two findings converge to indicate that the structures activated during retrieval of information from visual short-term memory are lateralized according to the hemifield in which the target or the distractor is located. Finally, the third chapter examines how adding low-salience grey distractors around potential targets affects attentional deployment. Adding these distractors makes the target harder to identify. This difficulty shifts N2pc activity toward the time window associated with the Ptc component. A larger number of grey distractors causes a larger proportion of the activity to be delayed. Moreover, grey distractors placed between the potential targets cause a greater delay than distractors placed outside that region. Throughout this thesis, the question of the attentional salience of different colours during visual search recurs. We observe greater salience for red than for green when they are distractors, and green is harder to distinguish from grey than yellow is.

Romantic attachment and perception of partner support to explain psychological aggression perpetrated in couples seeking couples therapy

Relevance: 20.00%

Abstract:

Doctoral essay submitted to the Faculté des études supérieures in fulfilment of the requirements for the degree of Doctor of Psychology (D.Psy.), clinical option.

A novel philosophical reflection on the web: a reading of "L'être et l'écran. Comment le numérique change la perception" by Stéphane Vial (Puf 2013)

Relevance: 20.00%

Abstract:

[Stéphane Vial's "L'être et l'écran"] does not merely assert the right of philosophers, a right now recognized, to concern themselves with the web, applications, algorithms and interfaces: it goes well beyond this observation, bringing all the technical instruments that give rise to the web within the scope of a philosophical, indeed phenomenological, analysis that treats them as "phenomenotechnical" instruments, instruments that "make the world and give it to us" and determine "the quality of our experience of existing." [...]

Politics of the passage from the individual to the collective in Annie Ernaux's Les Années

Relevance: 20.00%

Abstract:

This work approaches the writing of Annie Ernaux by showing her constant concern to reach toward others, a concern that stems from the conviction that the individual is made up of the discourses that run through him or her and of the surrounding social world. Influenced by Bourdieu's sociology and by her own experience as a class defector, Ernaux also shows great sensitivity to the relations of domination that pervade social space. Through her work, she seeks to bring into the sphere of the legitimate experiences that have been relegated to the illegitimate, an approach close in this respect to Foucault's. This includes the way of life and the memory of the dominated, which generally enter neither literature nor the dominant discourses. She also wishes to show that experiences lived as individual are in fact widely shared and have social and political origins. Although all her books aim to accomplish this passage from the individual to the collective, this work focuses in particular on Les Années, which combines narrative strategies used in her earlier books with a new form of third-person narration, allowing her to deliver a text that is even more "auto-socio-biographical."

Perception, apperception and consciousness in Leibniz

Relevance: 20.00%

Abstract:

This article proposes an interpretation of certain passages that raise a problem of coherence in Leibniz's theory of perception and apperception. This is notably the case for a passage in the Nouveaux essais sur l'entendement humain (1704), which grants apperception to animals, and for the fourth paragraph of the Principes de la Nature et de la Grâce (1710), where Leibniz seems instead to equate apperception with reflection, even though reflection is reserved for rational minds elsewhere in his work. To avoid the contradiction, our interpretation gives particular weight to the passage from the Nouveaux essais, defending the idea that Leibniz grants apperception to animals but reserves reflection for minds. We also try to show how certain passages that seem to contradict this position can nevertheless be interpreted in this sense.

Speech Analysis using Modern Techniques of Nonlinear Dynamics

Relevance: 20.00%

Abstract:

Medical fields require fast, simple and noninvasive diagnostic methods. Several such methods have become possible because advances in technology provide the necessary means of collecting and processing signals. The present thesis details work done in the field of voice signals. New methods of analysis, such as nonlinear dynamics, have been developed to explore the dynamic nature of voice signals and to understand their complexity. The purpose of this thesis is to characterize the complexity of pathological voice signals relative to healthy ones and to differentiate stuttering signals from healthy signals. The efficiency of various acoustic and nonlinear time series methods is analysed. Three groups of samples are used: healthy individuals, subjects with vocal pathologies and stuttering subjects. Individual vowels and continuous speech data for the Malayalam sentence "iruvarum changatimaranu" (in English, "Both are good friends") are recorded using a microphone. The recorded audio is converted to digital signals and subjected to analysis. Acoustic perturbation measures such as fundamental frequency (F0), jitter, shimmer and zero crossing rate (ZCR) are computed, and nonlinear measures such as the maximum Lyapunov exponent (λmax), correlation dimension (D2), Kolmogorov exponent (K2) and a new entropy measure, permutation entropy (PE), are evaluated for all three groups of subjects. Permutation entropy is a nonlinear complexity measure that can efficiently distinguish the regular or complex nature of a signal and can reveal changes in the dynamics of the underlying process through sudden changes in its value. The results show that nonlinear dynamical methods appear to be a suitable technique for voice signal analysis, owing to the chaotic component of the human voice. Permutation entropy is well suited because of its sensitivity to uncertainties, since the pathologies are characterized by an increase in signal complexity and unpredictability. Pathological groups have higher entropy values than the normal group, whereas stuttering signals have lower entropy values than normal signals. PE is effective in characterising the level of improvement after two weeks of speech therapy in stuttering subjects. PE is also effective in characterizing the dynamical difference between healthy and pathological subjects. This suggests that PE can improve and complement the voice analysis methods currently available to clinicians. The work establishes the application of the simple, inexpensive and fast PE algorithm for the diagnosis of vocal disorders and stuttering.
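As an illustration of the permutation entropy measure named in this abstract, the following is a minimal sketch of the standard ordinal-pattern definition, not the thesis's own implementation. The function name and the `order`, `delay` and `normalize` parameters are assumptions chosen for the example; the abstract does not state which embedding settings were used.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy (in bits) of a 1-D sequence of numbers.

    `order` and `delay` are illustrative defaults, not values from the thesis.
    """
    n_windows = len(signal) - (order - 1) * delay
    if order < 2 or n_windows < 1:
        raise ValueError("signal too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window;
        # ties are broken by position because sorted() is stable.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    # Shannon entropy of the relative frequencies of the observed patterns.
    entropy = sum(
        -(count / n_windows) * math.log2(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy
```

A strictly increasing ramp contains a single ordinal pattern and scores 0, while a sequence that alternates up and down scores 1 at `order=2`; real voice frames would fall between these extremes.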

Farm Programmes of Electronic Media: A Comparative Study of Audience Perception in Kerala

Relevance: 20.00%

Abstract:

Farm communication and extension programs are a vital part of farm development efforts, and electronic media play a major role in farm extension activities. Kerala, now a consumer state but a wholly agricultural state in the pre-independence period, is the birthplace of agricultural extension and publication activities in print media. Later, the farm programs of AIR (All India Radio) and the farm broadcasting of Doordarshan strengthened the role of electronic media in farm extension. This media-saturated southern state of India embraced the new electronic farm communication wholeheartedly. After 1990, however, Kerala witnessed a flood of private TV channels, and there are currently 24 channels in the regional language, Malayalam. All major news and entertainment channels broadcast farm programs. The farm programs of AIR and Doordarshan, broadcast in Malayalam, have been well accepted by farmers in Kerala. The post-independence period saw the formation of Kerala state within the Indian Union, and the first ballot-elected communist government took office. After the land reform bills, the state witnessed a gradual decrease in agricultural production. Although this was not much reflected in the attitudes and practices of the farm community or in the farm broadcasts of traditional electronic media, a change is observable in the post-liberalization era of India. Private television channels, which had focused on the entertainment value of programs, started broadcasting farm programs, and the parameters of program production changed. In this situation, a comparative study of audience perception of the farm programs of electronic media is highly relevant. The study is limited to the state of Kerala, as it is the most media-saturated state in India. It analyzes the rate, nature and scope of adoption of farming methods transmitted through electronic media (TV and radio) in Malayalam. All kinds of farm programs, including comprehensive program serials, success stories, seasonal cropping methods and experts' opinions, have been analyzed on the basis of the following objectives: (1) to find whether the propagation of new farm methods through farm programs in electronic media, or the availability of adequate infrastructure and economic factors, leads a farmer to adopt a new farming method; (2) to find which electronic medium has more influence on farmers' adoption of agricultural programs; (3) to find which form of electronic media gets better feedback from farmers; (4) to find whether TV or radio programs are more acceptable to farmers than print media; (5) to find whether farmers get the message through their preferred medium.
The researcher recorded opinions from a panel of agricultural officers, farm information officers, agro-extension researchers and experts. Following their opinions and guidelines, a pilot study was designed and conducted in Kanjikuzhy Panchayath, in Alappuzha district, Kerala. The Panchayath was selected because it is well suited to serve as a sample for social science research; the nature of its farming, which is devoid of cash crop cultivation, also supported its value as a sample. Based on observations from the pilot study, the researcher adopted triangulation as the research methodology. The questionnaire survey, the primary component, contained 42 questions with 6 independent and 32 dependent variables. The survey was conducted among 400 respondents in Idukki, Alappuzha and Pathanamthitta districts, taking into account geographical differences and the distribution of different types of crops. Responses from a total of 360 respondents, 120 from each district, were finally selected for tabulation and data analysis. The data analysis, based on percentage analysis, together with the results of a focus group discussion among a selected group of 20 farmers, produced the following results.
Farmers, who are the audience of farm programs, take the medium very seriously and maintain a critical view of program content. They are reasonably aware of the financial side of the programs and of the monetary aspirations of both private and government-owned television channels. Although farmers are not familiar with technical terminology and jargon, they have ideas about success stories and program serials, and they even know that the channels do not maintain an audience research section as AIR does. Though farmers accept Doordarshan as a credible source of farm information and methods, they are also drawn to the entertainment value of programs and would like Doordarshan's programs to have more of it. Surprisingly, they have very solid suggestions, even about the shots that would add entertainment value to Doordarshan's farm broadcasts. Farmers are well aware that media are just an instrument of inspiration and persuasion. They strongly believe that the source of information and new methods is agricultural research, and that effective change happens only when there are adequate infrastructure and marketing facilities, along with proper support from government agricultural guidance and support systems such as Krishi Bhavans. They strongly believe that media alone cannot work any magic in increasing agricultural production. Farmers point to the lack of response to their feedback and queries on farming methods as evidence of the difference in commitment between government and privately owned television channels. Farmers still perceive AIR farm programs as far more committed to farmers and farming than any other electronic medium; however, they seriously lack radio receivers with medium-wave reception. Farmers perceive that, on both private and government-owned television channels, farming methods for new crops are more adoptable than those for traditional crops; multiple factors lie behind this observation. Farmers' viewing habits have changed: they prefer success stories, even though these are totally irrelevant, and they think such stories encourage people to take up farming and are good sources of inspiration. They are all convinced of the importance of an entertainment factor, and insist on its presence, even in farm programs. Farmers expect direct interaction with an expert in a new farming method before implementing it in their own practice. Though the introduction of a new idea on TV is acceptable, farmers need direct instruction from an expert in the field to start implementing new farming practices. Farmers still have an affinity for print media reports and agricultural pages, and they complain about the removal of agricultural information pages from newspapers. They prefer print reports because they can collect articles and refer to them when needed. Farmers are doubtful about the credibility of farm programs on private TV channels. Even when they prefer private television channels for learning about and adopting new farming methods and other farm information, they scrutinize programs to see whether they are sponsored by agrochemical or agro-fertilizer manufacturers.

Speech sample estimation from composite zerocrossings and encoding via adaptive switching of transforms

Relevance: 20.00%

Abstract:

This thesis investigates the potential use of zerocrossing information for speech sample estimation. It provides a new method to estimate speech samples using composite zerocrossings. A simple linear interpolation technique is developed for this purpose. By using this method, the A/D converter can be avoided in a speech coder. The newly proposed zerocrossing sampling theory is supported by the results of computer simulations using real speech data. The thesis also presents two methods for voiced/unvoiced classification. One of these methods is based on a distance measure that is a function of the short-time zerocrossing rate and the short-time energy of the signal. The other is based on the attractor dimension and entropy of the signal. Of the two, the first is simpler and requires far fewer computations than the other. This method is used in a later chapter to design an enhanced Adaptive Transform Coder. The latter part of the thesis addresses a few problems in Adaptive Transform Coding and presents an improved ATC. The transform coefficient with maximum amplitude is considered as 'side information'. This enables more accurate bit assignment and step-size computation. A new bit reassignment scheme is also introduced in this work. Finally, an ATC that switches between the Discrete Cosine Transform and the Discrete Walsh-Hadamard Transform for voiced and unvoiced speech segments, respectively, is presented. Simulation results are provided to show the improved performance of the coder.
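The two short-time quantities behind the first voiced/unvoiced classifier can be sketched as follows. The abstract does not give the thesis's distance measure, so the decision rule and both threshold values below are hypothetical placeholders; only the zero-crossing rate and energy computations follow their usual definitions.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def classify_frame(frame, zcr_threshold=0.25, energy_threshold=0.01):
    """Label a frame with a hypothetical two-threshold rule.

    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    zcr = zero_crossing_rate(frame)
    energy = short_time_energy(frame)
    # Voiced speech tends to combine high energy with few zero crossings.
    if energy >= energy_threshold and zcr <= zcr_threshold:
        return "voiced"
    return "unvoiced"


# A 100 Hz tone sampled at 8 kHz stands in for a voiced frame, and a weak
# sample-by-sample alternation stands in for an unvoiced, noise-like frame.
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
hiss = [0.01, -0.01] * 80
```

With these synthetic frames, `classify_frame(tone)` falls on the voiced side of the rule and `classify_frame(hiss)` on the unvoiced side; real speech would need thresholds tuned on labelled data.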

Development of a Biometric Personal Authentication System Based on Fingerprint and Speech

Relevance: 20.00%

Abstract:

Biometrics deals with the physiological and behavioural characteristics of an individual to establish identity. Fingerprint-based authentication is the most advanced biometric authentication technology. The minutiae-based fingerprint identification method offers a reasonable identification rate. The minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the size of the database grows. Hence the fingerprint feature code should be made as small as possible so that identification becomes much easier. In this research, a novel global singularity-based fingerprint representation is proposed. The fingerprint baseline, which is the line between the distal and intermediate phalangeal joint in the fingerprint, is taken as the reference line. A polygon is formed with the singularities and the fingerprint baseline. The feature vectors are the polygonal angles, sides, area and type, and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae-based recognition method in terms of computation time, receiver operating characteristic (ROC) and feature vector length. Speech is a behavioural biometric modality and can be used to identify a speaker. In this work, the MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation-based artificial neural network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of the VQ-based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multifinger feature-level fusion-based fingerprint recognition is developed, and its performance is measured in terms of the ROC curve. Score-level fusion of the fingerprint- and speech-based recognition systems is performed, and 100% accuracy is achieved over a considerable range of matching thresholds.
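Score-level fusion, mentioned at the end of this abstract, can be sketched with a common textbook scheme: min-max normalization followed by a weighted sum. The abstract does not say which fusion rule the thesis used, so the rule, the function names, the weight and the threshold below are all assumptions made for illustration.

```python
def min_max_normalize(score, low, high):
    """Map a raw matcher score into [0, 1] given its expected range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (score - low) / (high - low)))


def fuse_scores(fingerprint_score, speech_score, weight=0.5):
    """Weighted-sum fusion of two normalized match scores.

    `weight` is the share given to the fingerprint matcher (hypothetical).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * fingerprint_score + (1.0 - weight) * speech_score


def accept(fused_score, threshold=0.6):
    """Accept the identity claim when the fused score reaches the threshold."""
    return fused_score >= threshold
```

For example, a fingerprint score of 0.9 and a speech score of 0.5 fuse to about 0.7 with equal weights, which this sketch would accept at the illustrative threshold of 0.6. Sweeping the threshold over such fused scores is what produces an ROC curve.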

Modified Block Adaptive Predictive Coder For Speech Processing

Relevance: 20.00%

Abstract:

This thesis investigated the potential use of Linear Predictive Coding in speech communication applications. A Modified Block Adaptive Predictive Coder is developed that reduces the computational burden and complexity, without sacrificing speech quality, compared with the conventional adaptive predictive coding (APC) system. For this, changes in the evaluation methods have been introduced. The method differs from the usual APC system in that the difference between the true and the predicted value is not transmitted. This allows the high-order predictor in the transmitter section of a predictive coding system to be replaced by a simple delay unit, which makes the transmitter quite simple. Also, the block length used in processing the speech signal is adjusted relative to the pitch period of the signal being processed, rather than being fixed at a constant length as other researchers have done. The efficiency of the newly proposed coder is supported by results of computer simulation using real speech data. Three methods for voiced/unvoiced/silent/transition classification are presented. The first is based on energy, zerocrossing rate and the periodicity of the waveform. The second uses the normalised correlation coefficient as the main parameter, while the third utilizes a pitch-dependent correlation factor. The third algorithm, which gives the minimum error probability, is chosen in a later chapter to design the modified coder. The thesis also presents a comparative study between the autocorrelation and covariance methods used in the evaluation of the predictor parameters. It has been proved that the autocorrelation method is superior to the covariance method with respect to filter stability and also in an SNR sense, though the increase in gain is only small. The Modified Block Adaptive Coder switches from pitch prediction to spectrum prediction when the speech segment changes from a voiced or transition region to an unvoiced region. The experiments conducted in coding, transmission and simulation used speech samples from Malayalam and English phrases. Proposals for a speaker recognition system and a phoneme identification system are also outlined towards the end of the thesis.
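The autocorrelation method for estimating predictor parameters, which this abstract compares with the covariance method, is conventionally solved with the Levinson-Durbin recursion. The sketch below shows that standard recursion, not the thesis's coder; the function names and the coefficient sign convention are assumptions made for the example.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, error), where the assumed predictor convention is
    x_hat[n] = sum(a[k] * x[n - 1 - k] for k in range(order)).
    """
    a = [0.0] * order
    error = r[0]
    for i in range(order):
        if error <= 0:
            raise ValueError("degenerate frame: prediction error vanished")
        # Reflection coefficient for this stage.
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / error
        # Update the lower-order coefficients, then append the new one.
        updated = a[:]
        for k in range(i):
            updated[k] = a[k] - k_i * a[i - 1 - k]
        updated[i] = k_i
        a = updated
        # The residual energy shrinks by (1 - k_i^2) at each stage.
        error *= 1.0 - k_i * k_i
    return a, error


# A decaying exponential behaves like a first-order process x[n] = 0.5 * x[n-1].
frame = [0.5 ** n for n in range(50)]
coefficients, residual = levinson_durbin(autocorrelation(frame, 2), 2)
```

For this synthetic frame the first coefficient comes out close to 0.5 and the second close to 0. With the autocorrelation method every reflection coefficient has magnitude below one, which corresponds to the filter stability advantage the abstract reports.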

COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH RECOGNITION

Relevance: 20.00%

Abstract:

Speech processing and the consequent recognition are important areas of digital signal processing, since speech allows people to communicate more naturally and efficiently. In this work, a speech recognition system is developed for recognizing digits in Malayalam. To recognize speech, features must be extracted from it, and hence the feature extraction method plays an important role in speech recognition. Here, front-end processing for extracting the features is performed using two wavelet-based methods, namely Discrete Wavelet Transforms (DWT) and Wavelet Packet Decomposition (WPD). A Naive Bayes classifier is used for classification. After classification with the Naive Bayes classifier, DWT produced a recognition accuracy of 83.5% and WPD produced an accuracy of 80.7%. This paper aims to devise a new feature extraction method that improves recognition accuracy. To that end, a new method called Discrete Wavelet Packet Decomposition (DWPD) is introduced, which utilizes the hybrid features of both DWT and WPD. The performance of this new approach is evaluated, and it produced an improved recognition accuracy of 86.2% with the Naive Bayes classifier.
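To make the DWT front end concrete, here is a minimal sketch of a multi-level discrete wavelet decomposition that turns a signal into a short feature vector. The abstract does not name the wavelet or the features derived from the coefficients, so the Haar wavelet and the per-band energy features used here are assumptions chosen for simplicity.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [
        (signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)
    ]
    detail = [
        (signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)
    ]
    return approx, detail


def dwt_energy_features(signal, levels=3):
    """Energy of each detail band, then of the final approximation band.

    The signal length must be divisible by 2 ** levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        # Only the approximation branch is split again, which is what
        # distinguishes the DWT from a full wavelet packet decomposition.
        approx, detail = haar_dwt(approx)
        features.append(sum(c * c for c in detail))
    features.append(sum(c * c for c in approx))
    return features
```

A constant signal such as `[1.0] * 8` has no detail energy at any level, so all of its energy ends up in the final approximation band. A feature vector of this kind is what a classifier such as Naive Bayes would then take as input.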

PERFORMANCE OF DIFFERENT CLASSIFIERS IN SPEECH RECOGNITION

Relevance: 20.00%

Abstract:

Speech is the most natural means of communication among human beings, and speech processing and recognition have been intensive areas of research for the last five decades. Since speech recognition is a pattern recognition problem, classification is an important part of any speech recognition system. In this work, a speech recognition system is developed for recognizing speaker-independent spoken digits in Malayalam. Voice signals are sampled directly from the microphone. The proposed method is implemented for 1000 speakers uttering 10 digits each. Since the speech signals are affected by background noise, the signals are cleaned by removing the noise with a wavelet denoising method based on soft thresholding. The features are extracted from the signals using Discrete Wavelet Transforms (DWT) because they are well suited to processing non-stationary signals like speech, owing to their multi-resolution, multi-scale analysis characteristics. Speech recognition is a multiclass classification problem, so the resulting feature vector set is classified using three classifiers capable of handling multiple classes: Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Naive Bayes. During the classification stage, the classifiers are trained on input feature vectors using information relating to known patterns and are then tested on the test data set. The performance of all these classifiers is evaluated on the basis of recognition accuracy. All three methods produced good recognition accuracy: the DWT and ANN combination produced 89%, the SVM and DWT combination 86.6%, and the Naive Bayes and DWT combination 83.5%. ANN is found to be the best of the three methods.
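The soft thresholding step used for wavelet denoising in this abstract can be sketched as below. Only the shrinkage rule itself is standard; the function name is an assumption, and the threshold is left as a caller-supplied value because the abstract does not say how it was chosen.

```python
def soft_threshold(coefficients, threshold):
    """Shrink each wavelet coefficient toward zero by `threshold`.

    Coefficients with magnitude at or below the threshold become zero.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            # Small coefficients are treated as noise and removed.
            shrunk.append(0.0)
        else:
            # Larger coefficients keep their sign but lose `threshold`.
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk
```

In a denoising pipeline this function would be applied to the detail coefficients of the noisy signal before the inverse transform; for instance, `soft_threshold([3.0, -2.0, 0.5, -0.25], 1.0)` keeps the two large coefficients in shrunken form and zeroes the two small ones.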
