36 results for acoustic speech recognition system
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition, based on the Bag of Features (BoF) model. An extensive technical investigation was conducted for the identification and optimization of the best performing components involved in the BoF architecture, as well as the estimation of the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset with nearly 5,000 food images was created and organized into 11 classes. The optimized system computes dense local features, using the scale-invariant feature transform on the HSV color space, builds a visual dictionary of 10,000 visual words by using the hierarchical k-means clustering and finally classifies the food images with a linear support vector machine classifier. The system achieved classification accuracy of the order of 78%, thus proving the feasibility of the proposed approach in a very challenging image dataset.
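The encoding step of a Bag of Features pipeline like the one described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes descriptors and a visual dictionary are already available, uses a brute-force nearest-word search instead of hierarchical k-means, and works on a toy three-word dictionary rather than 10,000 words. The names `encode_bof`, `toy_dictionary`, and `toy_descriptors` are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def encode_bof(descriptors, dictionary):
    """Return an L1-normalized histogram of visual-word counts.

    descriptors: iterable of feature vectors (e.g. dense local descriptors)
    dictionary:  list of visual-word centroids with the same dimensionality
    """
    hist = [0.0] * len(dictionary)
    for d in descriptors:
        # Assign the descriptor to its nearest visual word (brute force).
        nearest = min(range(len(dictionary)), key=lambda k: dist(d, dictionary[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 2-D data, invented purely for illustration.
toy_dictionary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
toy_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = encode_bof(toy_descriptors, toy_dictionary)
```

The resulting histogram is the fixed-length image representation that a classifier, such as the linear SVM mentioned in the abstract, would then receive as input.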
Abstract:
OBJECTIVE To evaluate speech intelligibility in noise with a new cochlear implant (CI) processor that uses a pinna-effect-imitating directional microphone system. STUDY DESIGN Prospective experimental study. SETTING Tertiary referral center. PATIENTS Ten experienced, unilateral CI recipients with bilateral severe-to-profound hearing loss. INTERVENTION All participants performed speech-in-noise tests with the Opus 2 processor (omnidirectional microphone mode only) and the newer Sonnet processor (omnidirectional and directional microphone modes). MAIN OUTCOME MEASURE The speech reception threshold (SRT) in noise was measured in four spatial settings. The test sentences were always presented from the front. The noise arrived either from the front (S0N0), the ipsilateral side of the CI (S0NIL), the contralateral side of the CI (S0NCL), or the back (S0N180). RESULTS The directional mode improved the SRTs by 3.6 dB (p < 0.01), 2.2 dB (p < 0.01), and 1.3 dB (p < 0.05) in the S0N180, S0NIL, and S0NCL situations, respectively, when compared with the Sonnet in the omnidirectional mode. There was no statistically significant difference in the S0N0 situation. No differences between the Opus 2 and the Sonnet in the omnidirectional mode were observed. CONCLUSION Speech intelligibility with the Sonnet system differed significantly from that with the Opus 2 system, suggesting that CI users might profit from the pinna-effect-imitating directional mode in noisy environments.
Abstract:
Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes comprises many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated with their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance from a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes. Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.
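The four measures reported above follow from the standard confusion-matrix definitions. The sketch below is illustrative only: the function names and the toy counts are assumptions, not the study's code or data. It also checks that the reported F-measure is consistent with the harmonic mean of the reported precision and sensitivity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    precision = tp / (tp + fp)            # positive predictive value
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


def f_measure_from(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (F1)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Toy counts, invented for illustration (not taken from the study).
toy = classification_metrics(tp=50, fp=10, tn=30, fn=10)

# Plugging in the percentages reported in the abstract for the RF classifier.
reported_f = f_measure_from(74.41, 68.49)  # about 71.33
```

The gap between the high specificity and the lower sensitivity is typical when most time windows contain no instance of a given activity, since true negatives then dominate the counts.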
Abstract:
Crowdsourcing linguistic phenomena with smartphone applications is relatively new. Apps have been used to train acoustic models for automatic speech recognition (de Vries et al. 2014) and to archive endangered languages (Iwaidja Inyaman Team 2012). Leemann and Kolly (2013) developed a free app for iOS—Dialäkt Äpp (DÄ) (>78k downloads)—to document language change in Swiss German. Here, we present results of sound change based on DÄ data. DÄ predicts the users’ dialects: for 16 variables, users select their dialectal variant. DÄ then tells users which dialect they speak. Underlying this prediction are maps from the Linguistic Atlas of German-speaking Switzerland (SDS, 1962-2003), which documents the linguistic situation around 1950. If predicted wrongly, users indicate their actual dialect. With this information, the 16 variables can be assessed for language change. Results revealed robustness of phonetic variables; lexical and morphological variables were more prone to change. Phonetic variables like to lift (variants: /lupfə, lʏpfə, lipfə/) revealed SDS agreement scores of nearly 85%, i.e., little sound change. Not all phonetic variables are equally robust: ladle (variants: /xælə, xællə, xæuə, xæɫə, xæɫɫə/) exhibited significant sound change. We will illustrate the results using maps that show details of the sound changes at hand.
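The abstract does not spell out how the app turns variant choices into a dialect prediction. One plausible reading, sketched below under that assumption, is a simple vote: each selected variant votes for every locality where the atlas records it, and the locality with the most votes wins. The function `predict_dialect`, the data layout, and the locality names are all hypothetical; only the variant spellings come from the abstract, and their assignment to localities here is invented.

```python
from collections import Counter


def predict_dialect(answers, atlas):
    """Return the locality matching the most selected variants, or None.

    answers: {variable: variant chosen by the user}
    atlas:   {variable: {variant: set of localities where it is attested}}
    """
    votes = Counter()
    for variable, variant in answers.items():
        for locality in atlas.get(variable, {}).get(variant, ()):
            votes[locality] += 1
    return votes.most_common(1)[0][0] if votes else None


# Placeholder atlas: locality assignments are invented for illustration.
toy_atlas = {
    "to lift": {"lupfə": {"locality_A", "locality_B"}, "lipfə": {"locality_C"}},
    "ladle": {"xælə": {"locality_A"}, "xællə": {"locality_B"}},
}
toy_answers = {"to lift": "lupfə", "ladle": "xælə"}
prediction = predict_dialect(toy_answers, toy_atlas)
```

Under this reading, a user who corrects a wrong prediction supplies exactly the pair (actual locality, chosen variant) needed to compare present-day usage with the atlas.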
Abstract:
The level of improvement in the audiological results of Baha® users mainly depends on the patient's preoperative hearing thresholds and the type of Baha sound processor used. This investigation shows correlations between the preoperative hearing threshold and postoperative aided thresholds and audiological results in speech understanding in quiet of 84 Baha users with unilateral conductive hearing loss, bilateral conductive hearing loss and bilateral mixed hearing loss. Secondly, speech understanding in noise of 26 Baha users with different Baha sound processors (Compact, Divino, and BP100) is investigated. Linear regression between aided sound field thresholds and bone conduction (BC) thresholds of the better ear shows highest correlation coefficients and the steepest slope. Differences between better BC thresholds and aided sound field thresholds are smallest for mid-frequencies (1 and 2 kHz) and become larger at 0.5 and 4 kHz. For Baha users, the gain in speech recognition in quiet can be expected to lie in the order of magnitude of the gain in their hearing threshold. Compared to its predecessor sound processors Baha® Compact and Baha® Divino, Baha® BP100 improves speech understanding in noise significantly by +0.9 to +4.6 dB signal-to-noise ratio, depending on the setting and the use of directional microphone. For Baha users with unilateral and bilateral conductive hearing loss and bilateral mixed hearing loss, audiological results in aided sound field thresholds can be estimated with the better BC hearing threshold. The benefit in speech understanding in quiet can be expected to be similar to the gain in their sound field hearing threshold. The most recent technology of Baha sound processor improves speech understanding in noise by an order of magnitude that is well perceived by users and which can be very useful in everyday life.
Abstract:
A new implantable hearing system, the direct acoustic cochlear stimulator (DACS), is presented. This system is based on the principle of a power-driven stapes prosthesis and intended for the treatment of severe mixed hearing loss due to advanced otosclerosis. It consists of an implantable electromagnetic transducer, which transfers acoustic energy directly to the inner ear, and an audio processor worn externally behind the implanted ear. The device is implanted using a specially developed retromeatal microsurgical approach. After removal of the stapes, a conventional stapes prosthesis is attached to the transducer and placed in the oval window to allow direct acoustical coupling to the perilymph of the inner ear. In order to restore the natural sound transmission of the ossicular chain, a second stapes prosthesis is placed in parallel to the first one into the oval window and attached to the patient's own incus, as in a conventional stapedectomy. Four patients were implanted with an investigational DACS device. The hearing threshold of the implanted ears before implantation ranged from 78 to 101 dB (air conduction, pure tone average, 0.5-4 kHz) with air-bone gaps of 33-44 dB in the same frequency range. Postoperatively, substantial improvements in sound field thresholds, speech intelligibility as well as in the subjective assessment of everyday situations were found in all patients. Two years after the implantations, monosyllabic word recognition scores in quiet at 75 dB improved by 45-100 percentage points when using the DACS. Furthermore, hearing thresholds were already improved by the second stapes prosthesis alone by 14-28 dB (pure tone average 0.5-4 kHz, DACS switched off). No device-related serious medical complications occurred and all patients have continued to use their device on a daily basis for over 2 years. Copyright (c) 2008 S. Karger AG, Basel.
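The pure tone average and air-bone gap quoted above can be illustrated with a short calculation. This sketch assumes the common four-frequency average over 0.5, 1, 2 and 4 kHz, which the abstract's "0.5-4 kHz" suggests but does not state explicitly. The audiogram values are hypothetical and are not patient data from the study.

```python
# Assumed test frequencies for the 0.5-4 kHz pure tone average.
PTA_FREQS_HZ = (500, 1000, 2000, 4000)


def pure_tone_average(thresholds_db, freqs=PTA_FREQS_HZ):
    """Mean hearing threshold in dB over the given frequencies."""
    return sum(thresholds_db[f] for f in freqs) / len(freqs)


def air_bone_gap(air_db, bone_db, freqs=PTA_FREQS_HZ):
    """Difference between air- and bone-conduction pure tone averages."""
    return pure_tone_average(air_db, freqs) - pure_tone_average(bone_db, freqs)


# Hypothetical audiogram (dB) for one ear, invented for illustration.
air_conduction = {500: 80, 1000: 85, 2000: 90, 4000: 95}
bone_conduction = {500: 40, 1000: 45, 2000: 55, 4000: 60}

pta_air = pure_tone_average(air_conduction)            # 87.5 dB
gap = air_bone_gap(air_conduction, bone_conduction)    # 37.5 dB
```

A large air-bone gap indicates a substantial conductive component, which is the part of the loss that direct coupling to the inner ear is meant to bypass.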
Abstract:
Users of cochlear implant systems, that is, of auditory aids which electrically stimulate the auditory nerve at the cochlea, often complain about poor speech understanding in noisy environments. Despite the proven advantages of multimicrophone directional noise reduction systems for conventional hearing aids, only one major manufacturer has so far implemented such a system in a product, presumably because of the added power consumption and size. We present a physically small (intermicrophone distance 7 mm) and computationally inexpensive adaptive noise reduction system suitable for behind-the-ear cochlear implant speech processors. Supporting algorithms, which allow the adjustment of the opening angle and the maximum noise suppression, are proposed and evaluated. A portable real-time device for testing in real acoustic environments is presented.
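To give a feel for what two microphones 7 mm apart can do, the sketch below models a fixed first-order differential (delay-and-subtract) arrangement. This is a textbook technique swapped in for illustration; the abstract does not say which algorithm the authors use, and their system is adaptive whereas this one is not. The speed of sound, the 1 kHz test frequency, and all names are assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature value
MIC_SPACING = 0.007      # m, the 7 mm intermicrophone distance from the abstract


def differential_response(theta_deg, freq_hz, internal_delay_s,
                          spacing=MIC_SPACING, c=SPEED_OF_SOUND):
    """Magnitude of (front mic) minus (rear mic delayed by internal_delay_s)
    for a plane wave arriving from theta_deg (0 = front, 180 = back)."""
    acoustic_delay = spacing * math.cos(math.radians(theta_deg)) / c
    omega = 2 * math.pi * freq_hz
    return abs(1 - cmath.exp(-1j * omega * (internal_delay_s + acoustic_delay)))


# An internal delay equal to the acoustic travel time between the microphones
# (about 20 microseconds here) yields a cardioid pattern with its null at the back.
cardioid_delay = MIC_SPACING / SPEED_OF_SOUND
front = differential_response(0, 1000, cardioid_delay)
side = differential_response(90, 1000, cardioid_delay)
rear = differential_response(180, 1000, cardioid_delay)
```

Changing the internal delay moves the null to other angles, which is one way a system could steer its suppression toward a noise source. Whether the presented system works this way is not stated in the abstract.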
Abstract:
Bone-anchored hearing implants (BAHI) are routinely used to alleviate the effects of the acoustic head shadow in single-sided sensorineural deafness (SSD). In this study, the influence of the directional microphone setting and the maximum power output of the BAHI sound processor on speech understanding in noise in a laboratory setting was investigated. Eight adult BAHI users with SSD participated in this pilot study. Speech understanding in noise was measured using a new Slovak speech-in-noise test in two different spatial settings, either with speech coming from the side of the BAHI and noise from the front (S90N0) or vice versa (S0N90). In both spatial settings, speech understanding was measured without a BAHI, with a Baha BP100 in omnidirectional mode, with a BP100 in directional mode, with a BP110 power in omnidirectional mode and with a BP110 power in directional mode. In spatial setting S90N0, speech understanding in noise with either sound processor and in either directional mode was improved by 2.2-2.8 dB (p = 0.004-0.016). In spatial setting S0N90, speech understanding in noise was reduced by either BAHI, but was significantly better, by 1.0-1.8 dB, when the directional microphone system was activated than in the omnidirectional setting (p = 0.046). With the limited number of subjects in this study, no statistically significant differences were found between the two sound processors.
Abstract:
OBJECTIVE To confirm the clinical efficacy and safety of a direct acoustic cochlear implant. STUDY DESIGN Prospective multicenter study. SETTING The study was performed at 3 university hospitals in Europe (Germany, The Netherlands, and Switzerland). PATIENTS Fifteen patients with severe-to-profound mixed hearing loss because of otosclerosis or previous failed stapes surgery. INTERVENTION Implantation with a Codacs direct acoustic cochlear implant investigational device (ID) combined with a stapedotomy with a conventional stapes prosthesis. MAIN OUTCOME MEASURES Preoperative and postoperative (3 months after activation of the investigational direct acoustic cochlear implant) audiometric evaluation measuring conventional pure tone and speech audiometry, tympanometry, aided thresholds in sound field and hearing difficulty by the Abbreviated Profile of Hearing Aid Benefit questionnaire. RESULTS The preoperative and postoperative air and bone conduction thresholds did not change significantly after implantation with the investigational direct acoustic cochlear implant. The mean sound field thresholds (0.25-8 kHz) improved significantly by 48 dB. The word recognition scores (WRS) at 50, 65, and 80 dB SPL improved significantly by 30.4%, 75%, and 78.2%, respectively, after implantation with the investigational direct acoustic cochlear implant compared with the preoperative unaided condition. The difficulty in hearing, measured by the Abbreviated Profile of Hearing Aid Benefit, decreased by 27% after implantation with the investigational direct acoustic cochlear implant. CONCLUSION Patients with moderate-to-severe mixed hearing loss because of otosclerosis can benefit substantially using the Codacs investigational device.
Abstract:
Objective. To compare hearing and speech understanding between a new, non-skin-penetrating Baha system (Baha Attract) and the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz, increasing to 20–25 dB above 6000 Hz. However, aided sound field thresholds showed smaller differences, and aided speech understanding in quiet and in noise did not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominantly high-frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or noise was found.
Abstract:
Supramolecular two-dimensional engineering epitomizes the design of complex molecular architectures through recognition events in multicomponent self-assembly. Despite being the subject of in-depth experimental studies, such articulated phenomena have not yet been elucidated in time and space with atomic precision. Here we use atomistic molecular dynamics to simulate the recognition of complementary hydrogen-bonding modules forming 2D porous networks on graphite. We describe the transition path from the melt to the crystalline hexagonal phase and show that self-assembly proceeds through a series of intermediate states featuring a plethora of polygonal types. Finally, we design in silico a novel bicomponent system possessing kinetically improved self-healing ability, thus demonstrating that a priori engineering of 2D self-assembly is possible.
Abstract:
A new hearing therapy based on direct acoustic cochlear stimulation was developed for the treatment of severe to profound mixed hearing loss. The device efficacy was validated in an initial clinical trial with four patients. This semi-implantable investigational device consists of an externally worn audio processor, a percutaneous connector, and an implantable microactuator. The actuator is placed in the mastoid bone, right behind the external auditory canal. It generates vibrations that are directly coupled to the inner ear fluids and that, therefore, bypass the external and the middle ear. The system is able to provide an equivalent sound pressure level of 125 dB over the frequency range between 125 and 8000 Hz. The hermetically sealed actuator is designed to provide maximal output power while keeping its dimensions small enough to enable implantation. A network model is used to simulate the dynamic characteristics of the actuator to adjust its transfer function to the characteristics of the middle ear. The geometry of the different actuator components is optimized using finite-element modeling.
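To put the 125 dB figure in physical terms, the sketch below converts between dB SPL and sound pressure using the standard 20 micropascal reference for airborne sound. It is a unit conversion only and assumes that the abstract's "equivalent sound pressure level" is expressed on the usual dB SPL scale. The function names are hypothetical.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL in air (20 micropascals)


def spl_to_pascal(level_db_spl):
    """Convert a sound pressure level in dB SPL to RMS pressure in pascals."""
    return P_REF_PA * 10 ** (level_db_spl / 20)


def pascal_to_spl(pressure_pa):
    """Convert an RMS pressure in pascals to dB SPL."""
    return 20 * math.log10(pressure_pa / P_REF_PA)


# 125 dB SPL corresponds to roughly 35.6 Pa RMS.
max_output_pa = spl_to_pascal(125)
```

Because the scale is logarithmic, every 20 dB step corresponds to a tenfold change in pressure, so 125 dB SPL is about 1000 times the pressure of a 65 dB SPL conversational level.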