924 resultados para acoustic speech recognition system
Resumo:
This paper reviews a study to determine if an auditory approach to speech correction can be of beneift to hearing impaired children who have become visually oriented.
Resumo:
This paper discusses a study done to determine how cochlear implant users perceive speech sounds using MPEAK or SPEAK speech coding strategy.
Resumo:
The aim of this thesis is to investigate computerized voice assessment methods to classify between the normal and Dysarthric speech signals. In this proposed system, computerized assessment methods equipped with signal processing and artificial intelligence techniques have been introduced. The sentences used for the measurement of inter-stress intervals (ISI) were read by each subject. These sentences were computed for comparisons between normal and impaired voice. Band pass filter has been used for the preprocessing of speech samples. Speech segmentation is performed using signal energy and spectral centroid to separate voiced and unvoiced areas in speech signal. Acoustic features are extracted from the LPC model and speech segments from each audio signal to find the anomalies. The speech features which have been assessed for classification are Energy Entropy, Zero crossing rate (ZCR), Spectral-Centroid, Mean Fundamental-Frequency (Meanf0), Jitter (RAP), Jitter (PPQ), and Shimmer (APQ). Naïve Bayes (NB) has been used for speech classification. For speech test-1 and test-2, 72% and 80% accuracies of classification between healthy and impaired speech samples have been achieved respectively using the NB. For speech test-3, 64% correct classification is achieved using the NB. The results direct the possibility of speech impairment classification in PD patients based on the clinical rating scale.
Resumo:
This work aims to investigate the efficiency of digital signal processing tools of acoustic emission signals in order to detect thermal damages in grinding process. To accomplish such a goal, an experimental work was carried out for 15 runs in a surface grinding machine operating with an aluminum oxide grinding wheel and ABNT 1045. The acoustic emission signals were acquired from a fixed sensor placed on the workpiece holder. A high sampling rate data acquisition system at 2.5 MHz was used to collect the raw acoustic emission instead of root mean square value usually employed. Many statistics have shown effective to detect burn, such as the root mean square (RMS), correlation of the AE, constant false alarm (CFAR), ratio of power (ROP) and mean-value deviance (MVD). However, the CFAR, ROP, Kurtosis and correlation of the AE have been presented more sensitive than the RMS.
Resumo:
We present a new approach for corpus-based speech enhancement that significantly improves over a method published by Xiao and Nickel in 2010. Corpus-based enhancement systems do not merely filter an incoming noisy signal, but resynthesize its speech content via an inventory of pre-recorded clean signals. The goal of the procedure is to perceptually improve the sound of speech signals in background noise. The proposed new method modifies Xiao's method in four significant ways. Firstly, it employs a Gaussian mixture model (GMM) instead of a vector quantizer in the phoneme recognition front-end. Secondly, the state decoding of the recognition stage is supported with an uncertainty modeling technique. With the GMM and the uncertainty modeling it is possible to eliminate the need for noise dependent system training. Thirdly, the post-processing of the original method via sinusoidal modeling is replaced with a powerful cepstral smoothing operation. And lastly, due to the improvements of these modifications, it is possible to extend the operational bandwidth of the procedure from 4 kHz to 8 kHz. The performance of the proposed method was evaluated across different noise types and different signal-to-noise ratios. The new method was able to significantly outperform traditional methods, including the one by Xiao and Nickel, in terms of PESQ scores and other objective quality measures. Results of subjective CMOS tests over a smaller set of test samples support our claims.
Resumo:
Users of cochlear implant systems, that is, of auditory aids which stimulate the auditory nerve at the cochlea electrically, often complain about poor speech understanding in noisy environments. Despite the proven advantages of multimicrophone directional noise reduction systems for conventional hearing aids, only one major manufacturer has so far implemented such a system in a product, presumably because of the added power consumption and size. We present a physically small (intermicrophone distance 7 mm) and computationally inexpensive adaptive noise reduction system suitable for behind-the-ear cochlear implant speech processors. Supporting algorithms, which allow the adjustment of the opening angle and the maximum noise suppression, are proposed and evaluated. A portable real-time device for test in real acoustic environments is presented.
Resumo:
Bone-anchored hearing implants (BAHI) are routinely used to alleviate the effects of the acoustic head shadow in single-sided sensorineural deafness (SSD). In this study, the influence of the directional microphone setting and the maximum power output of the BAHI sound processor on speech understanding in noise in a laboratory setting were investigated. Eight adult BAHI users with SSD participated in this pilot study. Speech understanding in noise was measured using a new Slovak speech-in-noise test in two different spatial settings, either with noise coming from the front and noise from the side of the BAHI (S90N0) or vice versa (S0N90). In both spatial settings, speech understanding was measured without a BAHI, with a Baha BP100 in omnidirectional mode, with a BP100 in directional mode, with a BP110 power in omnidirectional and with a BP110 power in directional mode. In spatial setting S90N0, speech understanding in noise with either sound processor and in either directional mode was improved by 2.2-2.8 dB (p = 0.004-0.016). In spatial setting S0N90, speech understanding in noise was reduced by either BAHI, but was significantly better by 1.0-1.8 dB, if the directional microphone system was activated (p = 0.046), when compared to the omnidirectional setting. With the limited number of subjects in this study, no statistically significant differences were found between the two sound processors.
Resumo:
OBJECTIVE To confirm the clinical efficacy and safety of a direct acoustic cochlear implant. STUDY DESIGN Prospective multicenter study. SETTING The study was performed at 3 university hospitals in Europe (Germany, The Netherlands, and Switzerland). PATIENTS Fifteen patients with severe-to-profound mixed hearing loss because of otosclerosis or previous failed stapes surgery. INTERVENTION Implantation with a Codacs direct acoustic cochlear implant investigational device (ID) combined with a stapedotomy with a conventional stapes prosthesis MAIN OUTCOME MEASURES Preoperative and postoperative (3 months after activation of the investigational direct acoustic cochlear implant) audiometric evaluation measuring conventional pure tone and speech audiometry, tympanometry, aided thresholds in sound field and hearing difficulty by the Abbreviated Profile of Hearing Aid Benefit questionnaire. RESULTS The preoperative and postoperative air and bone conduction thresholds did not change significantly by the implantation with the investigational Direct Acoustic Cochlear Implant. The mean sound field thresholds (0.25-8 kHz) improved significantly by 48 dB. The word recognition scores (WRS) at 50, 65, and 80 dB SPL improved significantly by 30.4%, 75%, and 78.2%, respectively, after implantation with the investigational direct acoustic cochlear implant compared with the preoperative unaided condition. The difficulty in hearing, measured by the Abbreviated Profile of Hearing Aid Benefit, decreased by 27% after implantation with the investigational direct acoustic cochlear implant. CONCLUSION Patients with moderate-to-severe mixed hearing loss because of otosclerosis can benefit substantially using the Codacs investigational device.
Resumo:
Objective. To compare hearing and speech understanding between a new, nonskin penetrating Baha system (Baha Attract) to the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz, increasing to 20–25 dB above 6000 Hz. However, aided sound field threshold shows smaller differences and aided speech understanding in quiet and in noise does not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominately high frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or in noise was found.
Resumo:
Several issues concerning the current use of speech interfaces are discussed and the design and development of a speech interface that enables air traffic controllers to command and control their terminals by voice is presented. A special emphasis is made in the comparison between laboratory experiments and field experiments in which a set of ergonomics-related effects are detected that cannot be observed in the controlled laboratory experiments. The paper presents both objective and subjective performance obtained in field evaluation of the system with student controllers at an air traffic control (ATC) training facility. The system exhibits high word recognition test rates (0.4% error in Spanish and 1.5% in English) and low command error (6% error in Spanish and 10.6% error in English in the field tests). Subjective impression has also been positive, encouraging future development and integration phases in the Spanish ATC terminals designed by Aeropuertos Españoles y Navegación Aérea (AENA).
Resumo:
This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs allow incorporating syntactic-semantic information during the translation process. This new information permits to reduce significantly the translation error rate. This paper also analyses different alternatives for dealing with the non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of Identity Documents and Driver’s License. The translation system uses a phrase-based translation system (Moses). The evaluation results reveal that the BLEU (BiLingual Evaluation Understudy) has improved from 69.1% to 73.9% and the mSER (multiple references Sign Error Rate) has been reduced from 30.6% to 24.8%.
Resumo:
This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. This categorization module replaces Spanish words with associated tags. When implementing this module, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not relevant in the translation process. The categorization module has been incorporated into a phrase-based system and a Statistical Finite State Transducer (SFST). The evaluation results reveal that the BLEU has increased from 69.11% to 78.79% for the phrase-based system and from 69.84% to 75.59% for the SFST.