997 resultados para Speech acquisition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Pronunciation is an important part of speech acquisition, but little attention has been given to the mechanism or mechanisms by which it develops. Speech sound qualities, for example, have just been assumed to develop by simple imitation. In most accounts this is then assumed to be by acoustic matching, with the infant comparing his output to that of his caregiver. There are theoretical and empirical problems with both of these assumptions, and we present a computational model- Elija-that does not learn to pronounce speech sounds this way. Elija starts by exploring the sound making capabilities of his vocal apparatus. Then he uses the natural responses he gets from a caregiver to learn equivalence relations between his vocal actions and his caregiver's speech. We show that Elija progresses from a babbling stage to learning the names of objects. This demonstrates the viability of a non-imitative mechanism in learning to pronounce.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Automatic speech recognition from multiple distant micro- phones poses significant challenges because of noise and reverberations. The quality of speech acquisition may vary between microphones because of movements of speakers and channel distortions. This paper proposes a channel selection approach for selecting reliable channels based on selection criterion operating in the short-term modulation spectrum domain. The proposed approach quantifies the relative strength of speech from each microphone and speech obtained from beamforming modulations. The new technique is compared experimentally in the real reverb conditions in terms of perceptual evaluation of speech quality (PESQ) measures and word error rate (WER). Overall improvement in recognition rate is observed using delay-sum and superdirective beamformers compared to the case when the channel is selected randomly using circular microphone arrays.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Recent research into the acquisition of spoken language has stressed the importance of learning through embodied linguistic interaction with caregivers rather than through passive observation. However the necessity of interaction makes experimental work into the simulation of infant speech acquisition difficult because of the technical complexity of building real-time embodied systems. In this paper we present KLAIR: a software toolkit for building simulations of spoken language acquisition through interactions with a virtual infant. The main part of KLAIR is a sensori-motor server that supplies a client machine learning application with a virtual infant on screen that can see, hear and speak. By encapsulating the real-time complexities of audio and video processing within a server that will run on a modern PC, we hope that KLAIR will encourage and facilitate more experimental research into spoken language acquisition through interaction. Copyright © 2009 ISCA.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper describes a neural model of speech acquisition and production that accounts for a wide range of acoustic, kinematic, and neuroimaging data concerning the control of speech movements. The model is a neural network whose components correspond to regions of the cerebral cortex and cerebellum, including premotor, motor, auditory, and somatosensory cortical areas. Computer simulations of the model verify its ability to account for compensation to lip and jaw perturbations during speech. Specific anatomical locations of the model's components are estimated, and these estimates are used to simulate fMRI experiments of simple syllable production with and without jaw perturbations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The literature has discussed the importance of early implementation of augmentative and alternative systems with people with various disabilities. This discussion is related to concerns about language acquisition and development within the various expressive possibilities. Researchers advise that starting intervention early through resources and procedures using augmentative and alternative communication does not impede speech acquisition and development. This paper aimed to describe oral expressive abilities during the implementation of augmentative and alternative communication with a student with cerebral palsy. An 11-year-old student with cerebral palsy participated in the augmentative and alternative communication program for two years. Twelve sessions were selected during the first year of the intervention. The sessions were filmed and the augmentative and alternative communication resource procedures were transcribed. The categories of analysis were defined as verbal expression; nonverbal expression and verbal and nonverbal expression associated. The results of this study identified that augmentative and alternative communication resources supported the use of verbal expression such as vocalizations, words and unintelligible oral expressions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Pós-graduação em Educação - FFC

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Deafness is an invisible handicap which makes education difficult due to its effect on language acquisition and development, speech acquisition, and ability to communicate with others. Teachers of students who are hearing impaired have been at odds for more than a century as to the best method of communication to use in teaching students who are hearing impaired, and in providing these students with communication systems that will enable them to be effective communicators. This dissertation utilized qualitative research methods to analyze whether the Dade County Public Schools Procedures for Providing Special Education for Exceptional Students-Hearing Impaired (DCPS), (1976-77; 1992-93) show evidence of an appropriate curriculum and instructional program that is responsive to the conditions facing exceptional education students identified as hearing impaired. Results indicate that many of the curriculum and instructional program requirements are not identified nor described when analyzed by Tyler's "Rationale" for Curriculum and Stufflebeam's Improvement-Oriented Evaluation Model, better known as Stufflebeam's CIPP Model. Recommendations to improve and enhance programs for students who are hearing impaired are offered based on this analysis. ^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper discusses a study to compare test results of the CID GAEL test among hearing impaired children who are enrolled in cued speech vs. oral vs. signed english programs.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

How do infants learn word meanings? Research has established the impact of both parent and child behaviors on vocabulary development, however the processes and mechanisms underlying these relationships are still not fully understood. Much existing literature focuses on direct paths to word learning, demonstrating that parent speech and child gesture use are powerful predictors of later vocabulary. However, an additional body of research indicates that these relationships don’t always replicate, particularly when assessed in different populations, contexts, or developmental periods.

The current study examines the relationships between infant gesture, parent speech, and infant vocabulary over the course of the second year (10-22 months of age). Through the use of detailed coding of dyadic mother-child play interactions and a combination of quantitative and qualitative data analytic methods, the process of communicative development was explored. Findings reveal non-linear patterns of growth in both parent speech content and child gesture use. Analyses of contingency in dyadic interactions reveal that children are active contributors to communicative engagement through their use of gestures, shaping the type of input they receive from parents, which in turn influences child vocabulary acquisition. Recommendations for future studies and the use of nuanced methodologies to assess changes in the dynamic system of dyadic communication are discussed.

Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentence under the auditory only conditions. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias.Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated catarcts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

While close talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays in contrast to close talking microphones alleviates the feeling of discomfort and distraction to the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployment of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied for ad-hoc microphone arrays. A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. An alternative approach is instead to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach which includes clustering the microphones and selecting the clusters based on their proximity to the speaker is proposed. Novel experiments are conducted to demonstrate that the proposed method to automatically select clusters of microphones (ie, a subarray), closely located both to each other and to the desired speech source, may in fact provide a more robust speech enhancement and recognition than the full array could.