6 resultados para Parties of speech

em Boston University Digital Common


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A neuroanatomical parcellation system is described which encompasses the entire cerebral cortex and the cerebellum. The cortical system modified version of the scheme described by Caviness et al. (1996) and is designed particularly for studies of speech processing. The cerebellum is parcellated into 6 cortical regions of interest (ROIs) and an ROI representing the deep cerebellar nuclei in each hemisphere. The boundaries of each ROI are based on individual anatomical markers that are clearly visible from standard structural MRI acquistions. The system permits averaginh of functional imaging data sets from multiple sujects while accounting for individual anatomical variability. Used in conjuction with region-of-interest analysis techniques such as that described by Nieto-Castanon et al. (2003), the parcellation system provides a more powerful means of analyzing functional data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper describes a neural model of speech acquisition and production that accounts for a wide range of acoustic, kinematic, and neuroimaging data concerning the control of speech movements. The model is a neural network whose components correspond to regions of the cerebral cortex and cerebellum, including premotor, motor, auditory, and somatosensory cortical areas. Computer simulations of the model verify its ability to account for compensation to lip and jaw perturbations during speech. Specific anatomical locations of the model's components are estimated, and these estimates are used to simulate fMRI experiments of simple syllable production with and without jaw perturbations.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitchindependent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper describes a model of speech production called DIVA that highlights issues of self-organization and motor equivalent production of phonological units. The model uses a circular reaction strategy to learn two mappings between three levels of representation. Data on the plasticity of phonemic perceptual boundaries motivates a learned mapping between phoneme representations and vocal tract variables. A second mapping between vocal tract variables and articulator movements is also learned. To achieve the flexible control made possible by the redundancy of this mapping, desired directions in vocal tract configuration space are mapped into articulator velocity commands. Because each vocal tract direction cell learns to activate several articulator velocities during babbling, the model provides a natural account of the formation of coordinative structures. Model simulations show automatic compensation for unexpected constraints despite no previous experience or learning under these constraints.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A neural model of peripheral auditory processing is described and used to separate features of coarticulated vowels and consonants. After preprocessing of speech via a filterbank, the model splits into two parallel channels, a sustained channel and a transient channel. The sustained channel is sensitive to relatively stable parts of the speech waveform, notably synchronous properties of the vocalic portion of the stimulus it extends the dynamic range of eighth nerve filters using coincidence deteectors that combine operations of raising to a power, rectification, delay, multiplication, time averaging, and preemphasis. The transient channel is sensitive to critical features at the onsets and offsets of speech segments. It is built up from fast excitatory neurons that are modulated by slow inhibitory interneurons. These units are combined over high frequency and low frequency ranges using operations of rectification, normalization, multiplicative gating, and opponent processing. Detectors sensitive to frication and to onset or offset of stop consonants and vowels are described. Model properties are characterized by mathematical analysis and computer simulations. Neural analogs of model cells in the cochlear nucleus and inferior colliculus are noted, as are psychophysical data about perception of CV syllables that may be explained by the sustained transient channel hypothesis. The proposed sustained and transient processing seems to be an auditory analog of the sustained and transient processing that is known to occur in vision.