180 results for Syllable


Relevance: 10.00%

Abstract:

This article deals with the phonological process by which, in Basque, sibilant fricatives become affricates after a sonorant consonant. The analysis of this process is particularly well suited to discussing the reciprocal relationship between phonetics and phonology as defended by Natural Phonology. Within that theoretical framework, this work studies the phonetic motivation of phonology; it also explores the perceptual, and perhaps productive, consequences of each language's phonemic inventory, comparing the Basque affrication process with the better-known English process of stop insertion. It is argued that the terminological choice between affrication and insertion may not be a trivial matter but rather the reflection of some difference in the phonological processing of essentially equivalent phonetic conditions. The optimization of syllable structure is presented as another possible element in the configuration of the process, and as a factor contributing to its greater or lesser prominence in typologically distinct languages. Section 3 offers some comments on spectrographic images as a sample of the observations that gave rise to the research in progress.

Relevance: 10.00%

Abstract:

Neurons in the songbird forebrain nucleus HVc are highly sensitive to auditory temporal context and have some of the most complex auditory tuning properties yet discovered. HVc is crucial for learning, perceiving, and producing song; it is therefore important to understand the neural circuitry and mechanisms that give rise to these remarkable auditory response properties. This thesis investigates these issues experimentally and computationally.

Extracellular studies reported here compare the auditory context sensitivity of neurons in HVc with neurons in the afferent areas of field L, and demonstrate a substantial increase in auditory temporal context sensitivity from the areas of field L to HVc. Whole-cell recordings of HVc neurons from acute brain slices show that excitatory synaptic transmission between HVc neurons involves the release of glutamate and the activation of both AMPA/kainate and NMDA-type glutamate receptors. Additionally, widespread inhibitory interactions exist between HVc neurons, mediated by postsynaptic GABA_A receptors. Intracellular recordings of HVc auditory neurons in vivo provide evidence that HVc neurons encode information about temporal structure using a variety of cellular and synaptic mechanisms, including syllable-specific inhibition, excitatory postsynaptic potentials with a range of time courses, burst firing, and song-specific hyperpolarization.

The final part of this thesis presents two computational approaches for representing and learning temporal structure. The first method uses computational elements analogous to the temporal combination-sensitive neurons in HVc; a network of these elements can learn using local information and lateral inhibition. The second method presents a more general framework that allows a network to discover mixtures of temporal features in a continuous stream of input.
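The "temporal combination-sensitive" elements mentioned above can be illustrated with a minimal sketch (hypothetical code, not the thesis implementation): a unit that responds only when one syllable immediately follows another.

```python
# Hypothetical sketch of a temporal combination-sensitive unit: it
# "fires" only at positions where syllable `first` is immediately
# followed by syllable `second` in the input sequence.
def combination_sensitive(sequence, first, second):
    """Return the indices at which `second` directly follows `first`."""
    hits = []
    for i in range(1, len(sequence)):
        if sequence[i - 1] == first and sequence[i] == second:
            hits.append(i)
    return hits

song = ["a", "b", "c", "a", "b"]
print(combination_sensitive(song, "a", "b"))  # [1, 4]
```

A real network of such units would, as the abstract notes, tune these pairings through local learning and lateral inhibition rather than hard-coded matching.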

Relevance: 10.00%

Abstract:

The aim of the present study was to investigate the functional role of syllables in sign language and how different phonological combinations influence sign production. Moreover, the influence of age of acquisition was evaluated. Deaf signers (native and non-native) of Catalan Sign Language (LSC) were asked, in a picture-sign interference task, to sign picture names while ignoring distractor-signs with which they shared two phonological parameters (out of the three main sign parameters: Location, Movement, and Handshape). The results revealed a different impact for each of the three phonological combinations. While no effect was observed for the combination Handshape-Location, the combination Handshape-Movement slowed signing latencies, but only in the non-native group. A facilitatory effect was observed for both groups when pictures and distractors shared Location-Movement. Importantly, linguistic models have considered this phonological combination to be a privileged unit in the composition of signs, as syllables are in spoken languages. Thus, our results support the functional role of syllable units during phonological articulation in sign language production.

Relevance: 10.00%

Abstract:

This paper proposes an HMM-based approach to generating emotional intonation patterns. A set of models was built to represent syllable-length intonation units. In a classification framework, the models were able to detect a sequence of intonation units from raw fundamental frequency values. Using the models in a generative framework, we were able to synthesize smooth and natural-sounding pitch contours. As a case study for emotional intonation generation, Maximum Likelihood Linear Regression (MLLR) adaptation was used to transform the neutral model parameters with a small amount of happy and sad speech data. Perceptual tests showed that listeners could identify the speech with the sad intonation 80% of the time. On the other hand, listeners formed a bimodal distribution in their ability to detect the system-generated happy intonation, and on average listeners were able to detect happy intonation only 46% of the time. © Springer-Verlag Berlin Heidelberg 2005.
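The core of the MLLR adaptation step mentioned above is a shared affine transform of the Gaussian means, mu' = A·mu + b, estimated from a small amount of emotional speech. A minimal sketch, with invented transform values (not the paper's actual parameters):

```python
import numpy as np

# Hedged sketch of the MLLR mean update: one affine transform (A, b),
# estimated from adaptation data, is applied to every Gaussian mean of
# the neutral model. The values of A and b below are illustrative only.
def mllr_adapt_means(means, A, b):
    """Apply mu' = A @ mu + b to each row of `means`."""
    return means @ A.T + b

neutral_means = np.array([[100.0, 0.0],   # e.g. F0-related features of
                          [120.0, 5.0]])  # two states of the neutral model
A = np.array([[1.1, 0.0],                 # assumed regression matrix
              [0.0, 0.9]])
b = np.array([-5.0, 1.0])                 # assumed bias vector

adapted = mllr_adapt_means(neutral_means, A, b)
print(adapted)  # [[105.   1. ] [127.   5.5]]
```

In practice A and b are estimated in closed form by maximizing the likelihood of the adaptation data, which is what lets a small amount of happy or sad speech retune the whole neutral model.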

Relevance: 10.00%

Abstract:

One important issue in designing state-of-the-art LVCSR systems is the choice of acoustic units. Context-dependent (CD) phones remain the dominant form of acoustic units. They can capture the co-articulatory effect in speech via explicit modelling. However, for other more complicated phonological processes, they rely on the implicit modelling ability of the underlying statistical models. Alternatively, it is possible to construct acoustic models based on higher-level linguistic units, for example syllables, to explicitly capture these complex patterns. When sufficient training data is available, this approach may show an advantage over implicit acoustic modelling. In this paper a wide range of acoustic units is investigated to improve LVCSR system performance. Significant error rate gains of up to 7.1% relative (0.8% absolute) were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using word and syllable position dependent triphone and quinphone models. © 2011 IEEE.
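The context-dependent phone units discussed above are conventionally written in the HTK-style `left-centre+right` notation. A small illustrative sketch (not the paper's toolkit) of expanding a phone sequence into triphone labels:

```python
# Illustrative sketch: expand a word's phone sequence into triphone
# labels of the conventional form left-centre+right, padding the word
# edges with a silence context. Position tagging, as used in the paper,
# would further mark each unit's position within the word or syllable.
def to_triphones(phones):
    out = []
    for i, p in enumerate(phones):
        left = phones[i - 1] if i > 0 else "sil"
        right = phones[i + 1] if i < len(phones) - 1 else "sil"
        out.append(f"{left}-{p}+{right}")
    return out

print(to_triphones(["b", "a", "t"]))  # ['sil-b+a', 'b-a+t', 'a-t+sil']
```

Quinphones extend the same idea to two phones of context on each side, which is where data sparsity, and hence the need for sufficient training data, becomes acute.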

Relevance: 10.00%

Abstract:

Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the 'same' vowel would differ enormously. The differences are mainly due to the differences in vocal tract length (VTL) and glottal pulse rate (GPR), which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically, to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.
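To first approximation, the VTL difference described above scales the spectral envelope along the frequency axis, so normalization can be pictured as warping frequencies by a single factor. A minimal sketch under that simplifying assumption (the formant and warp values are invented):

```python
# Minimal sketch of VTL normalization as a linear frequency warp:
# dividing the frequency axis by a factor alpha approximates mapping a
# shorter vocal tract (higher formants) onto a longer one. The child
# formant values and alpha below are assumptions for illustration.
def warp_frequencies(freqs_hz, alpha):
    """Scale formant frequencies by warp factor alpha (alpha > 1: shorter tract)."""
    return [f / alpha for f in freqs_hz]

child_formants = [1000.0, 3000.0]           # assumed formants of a child vowel
adult_like = warp_frequencies(child_formants, 1.25)
print(adult_like)  # [800.0, 2400.0]
```

The auditory image model in the paper performs a richer, dynamic version of this adaptation, also accounting for GPR, but the warp captures the basic geometry of the problem.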

Relevance: 10.00%

Abstract:

State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple sub-systems that may even be developed at different sites. Cross-system adaptation, in which model adaptation is performed using the outputs from another sub-system, can be used as an alternative to hypothesis-level combination schemes such as ROVER. Normally cross adaptation is only performed on the acoustic models. However, there are many other levels in LVCSR systems' modelling hierarchy where complementary features may be exploited, for example the sub-word and the word level, to further improve cross-adaptation-based system combination. It is thus interesting to also cross adapt language models (LMs) to capture these additional useful features. In this paper cross adaptation is applied to three forms of language models: a multi-level LM that models both syllable and word sequences, a word-level neural network LM, and the linear combination of the two. Significant error rate reductions of 4.0-7.1% relative were obtained over ROVER and acoustic-model-only cross adaptation when combining a range of Chinese LVCSR sub-systems used in the 2010 and 2011 DARPA GALE evaluations. © 2012 Elsevier Ltd. All rights reserved.
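The "linear combination of the two" LMs mentioned above is ordinary linear interpolation of per-word probabilities. A hedged sketch (the probabilities and weight are invented, not values from the paper):

```python
# Sketch of linear LM interpolation: mix the probability a multi-level
# (syllable + word) LM and a neural network LM assign to the same word.
# The probabilities and interpolation weight below are illustrative.
def interpolate(p_multilevel, p_nnlm, lam):
    """Return lam * p_multilevel + (1 - lam) * p_nnlm."""
    return lam * p_multilevel + (1.0 - lam) * p_nnlm

p = interpolate(0.02, 0.04, 0.5)  # equal-weight mix of the two estimates
print(p)
```

The interpolation weight is typically tuned on held-out data; cross adaptation then supplies the other sub-system's hypotheses as the adaptation material for these LMs.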

Relevance: 10.00%

Abstract:

The aim of the present study was to explore whether the CPS (Closure Positive Shift), which reflects prosodic processing, is elicited when listeners perceive different hierarchical prosodic boundaries in Chinese sentences and discourse (quatrains). In addition, the similarities and differences in amplitude, onset latency, and scalp distribution among these CPS components were investigated, and the nature of the CPS and its relationship to acoustic parameters were explored systematically. The main results and conclusions were: (1) Phonological phrase boundaries and intonational phrase boundaries in Chinese sentences both elicited the CPS, whereas phonological word boundaries did not. The CPS induced by phonological phrase boundaries had an earlier onset latency than the one related to intonational phrase boundaries, and its amplitude was also somewhat lower. When the pauses in the vicinity of these two boundaries were removed, the onset-latency difference disappeared, while the amplitude in the new conditions was also lower. This indicates that whenever listeners segment a sentence into phrases the CPS is elicited; the pause is not the decisive factor in eliciting the CPS, but it can effectively modify its onset latency and amplitude. (2) The different hierarchical prosodic boundaries in seven-character quatrains, including phonological phrase boundaries, intonational phrase boundaries, and sentence-pair boundaries, each elicited the CPS. Furthermore, just as at the sentence level, the onset latency of the CPS induced by prosodic boundaries in discourse was influenced by the length of the pause: the shorter the pause, the earlier the onset latency. Comparing the CPS evoked by the same versus different hierarchical prosodic boundaries, its amplitude was influenced by the extent to which prosodic representations were activated. Thus, the conditions for CPS elicitation extend to prosodic boundaries in discourse, and the CPS is further shown to be influenced by acoustic parameters. (3) The CPS was evoked regardless of the task the participants completed, whether word detection or rhythm matching. However, its amplitude was larger in the anterior region when listeners completed the word-detection task, which demanded more attention and a higher working-memory load. This indicates that CPS elicitation is not influenced by the task, but that different tasks influence its scalp distribution. (4) The final syllable of a sentence or quatrain did not elicit the CPS but rather a P300-like positive component; although its scalp distribution was similar to that of the CPS, its amplitude was much higher. This suggests that only prosodic boundaries that reflect both the closure of the preceding prosodic unit and the integration of the following one elicit the CPS.

Relevance: 10.00%

Abstract:

Unlike alphabetic languages, Chinese uses an ideographic writing system: each Chinese character is monosyllabic and usually carries a direct meaning. Chinese characters are therefore valuable experimental material for research on reading and for comparing the reading mechanisms of different languages. In this study, normal readers and patients with semantic dementia took part in two sets of experiments on the orthographic, phonological, semantic, and frequency effects in reading Chinese characters. A Stroop-like character-picture interference paradigm was used to investigate the orthographic, phonological, semantic, and frequency effects of Chinese characters on picture naming when characters were presented together with pictures to normal readers. The results indicated that an orthographic facilitation effect, a phonological facilitation effect, and a semantic interference effect occurred at different SOA values, and that the orthographic and phonological facilitation effects were independent. For the first time, an interaction between the orthographic and semantic variables was shown when high-frequency Chinese characters were read. Comparison of the SOAs showed that phonological representations were activated faster than semantic representations. In general, this means that normal readers can read Chinese characters without accessing meaning. The orthographic, phonological, semantic, frequency, and concreteness effects of Chinese characters were further investigated in dementia patients with DAT (dementia of the Alzheimer type), CVA, or both, all of whom had impaired semantic memory. The results showed that patients with dementia could read the names of pictures aloud even when they could not name the pictures or match them with the correct character: this is reading without meaning in Chinese among dementia patients. Meanwhile, these patients showed a selective reading impairment and made more LARC errors (legitimate alternative readings of components), especially when reading low-frequency irregular, low-frequency inconsistent, and abstract Chinese characters. As the patients' semantic impairment progressed, their ability to read picture names remained, whereas their ability to read low-frequency irregular and low-frequency inconsistent characters declined. These results indicate that low-frequency irregular Chinese characters can be read correctly only with the support of their semantic information. Based on these findings of reading without meaning, and of the reading of low-frequency irregular characters being supported by semantic information, it is reasonable to suggest that at least two routes are involved in reading Chinese characters: a direct phonological route and an indirect semantic route, and that the two routes are independent.

Relevance: 10.00%

Abstract:

This report describes a computational system with which phonologists may describe a natural language in terms of autosegmental phonology, currently the most advanced theory pertaining to the sound systems of human languages. The system allows linguists to easily test autosegmental hypotheses against a large corpus of data. It was designed primarily with tonal systems in mind, but also provides support for tree or feature-matrix representation of phonemes (as in The Sound Pattern of English), as well as syllable structures and other aspects of phonological theory. Underspecification is allowed, and trees may be specified before, during, and after rule application. The association convention is automatically applied, and other principles such as the conjunctivity condition are supported. The method of representation was designed so that rules are specified as closely as possible to the existing conventions of autosegmental theory, while adhering to a textual constraint for maximum portability.
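The association convention that the system applies automatically can be sketched in a few lines (an illustration of the textbook convention, not the system's actual code): link tones to tone-bearing units one-to-one, left to right, then spread or dock the leftovers.

```python
# Sketch of the autosegmental association convention: tones associate
# with syllables one-to-one, left to right; a leftover final tone docks
# onto the last syllable, and leftover syllables take the last tone
# (spreading). This is the textbook convention, not the report's code.
def associate(tones, syllables):
    n = min(len(tones), len(syllables))
    links = [(tones[i], syllables[i]) for i in range(n)]
    for s in syllables[n:]:          # spread the last tone rightward
        links.append((tones[-1], s))
    for t in tones[n:]:              # dock extra tones on the last syllable
        links.append((t, syllables[-1]))
    return links

print(associate(["H", "L"], ["ba", "na", "na"]))
# [('H', 'ba'), ('L', 'na'), ('L', 'na')]
```

Language-particular rules can then re-link or delink associations; the convention only supplies the default mapping.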

Relevance: 10.00%

Abstract:

Francis, Matthew, Whereabouts (London: Rufus Books, 2005) RAE2008

Relevance: 10.00%

Abstract:

To investigate the process underlying audiovisual speech perception, the McGurk illusion was examined across a range of phonetic contexts. Two major changes were found. First, the frequency of illusory /g/ fusion percepts increased relative to the frequency of illusory /d/ fusion percepts as the vowel context was shifted from /i/ to /a/ to /u/. This trend could not be explained by biases present in perception of the unimodal visual stimuli. However, the change found in the McGurk fusion effect across vowel environments did correspond systematically with changes in second formant frequency patterns across contexts. Second, the order of consonants in illusory combination percepts was found to depend on syllable type. This may be due to differences occurring across syllable contexts in the time courses of inputs from the two modalities, as delaying the auditory track of a vowel-consonant stimulus resulted in a change in the order of consonants perceived. Taken together, these results suggest that the speech perception system either fuses audiovisual inputs into a visually compatible percept with a second formant pattern similar to that of the acoustic stimulus, or interleaves the information from different modalities, at a phonemic or subphonemic level, based on their relative arrival times.

Relevance: 10.00%

Abstract:

Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.

Relevance: 10.00%

Abstract:

Perceiving or producing complex vocalizations such as speech and birdsong requires the coordinated activity of neuronal populations, and these activity patterns can vary over space and time. How learned communication signals are represented by populations of sensorimotor neurons essential to vocal perception and production remains poorly understood. Using a combination of two-photon calcium imaging, intracellular electrophysiological recording, and retrograde tracing in anesthetized adult male zebra finches (Taeniopygia guttata), I addressed how the bird's own song and its component syllables are represented by the spatiotemporal activity patterns of two spatially intermingled populations of projection neurons (PNs) in HVC, a sensorimotor area required for song perception and production. These experiments revealed that neighboring PNs can respond at markedly different times to song playback and that different syllables activate spatially intermingled HVC PNs within a small region. Moreover, noise correlation analysis reveals enhanced functional connectivity between PNs that respond most strongly to the same syllable, and also provides evidence of a spatial gradient of functional connectivity specific to PNs that project to the song motor nucleus RA (HVC_RA cells). These findings support a model in which syllabic and temporal features of song are represented by spatially intermingled PNs functionally organized into cell- and syllable-type networks.
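The noise correlation analysis mentioned above correlates trial-to-trial fluctuations of two neurons' responses around their own means to the same stimulus. A minimal sketch on simulated responses (the data and noise levels are invented for illustration):

```python
import numpy as np

# Hedged sketch of a noise-correlation computation: for a pair of
# neurons, correlate mean-subtracted responses across repeated trials
# of the same stimulus. High values suggest functional connectivity
# or shared input. The simulated responses below are illustrative.
def noise_correlation(resp_a, resp_b):
    """Pearson correlation of mean-subtracted trial responses."""
    a = resp_a - resp_a.mean()
    b = resp_b - resp_b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
shared = rng.normal(size=100)                # common trial-to-trial drive
cell1 = shared + 0.5 * rng.normal(size=100)  # private noise per cell
cell2 = shared + 0.5 * rng.normal(size=100)
print(round(noise_correlation(cell1, cell2), 2))  # strongly positive
```

In the thesis this quantity is computed between imaged PN pairs and related to their syllable preferences and projection targets.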

Relevance: 10.00%

Abstract:

Undergraduates were asked to generate a name for a hypothetical new exemplar of a category. They produced names that had the same numbers of syllables, the same endings, and the same types of word stems as existing exemplars of that category. In addition, novel exemplars, each consisting of a nonsense syllable root and a prototypical ending, were accurately assigned to categories. The data demonstrate the abstraction and use of surface properties of words.
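One of the surface properties the participants matched, syllable count, can be approximated programmatically. A rough sketch using the common vowel-group heuristic (a crude approximation offered only as an illustration, not the study's method):

```python
# Crude syllable-count heuristic: count maximal runs of vowel letters.
# English orthography makes this approximate (e.g. silent final e), but
# it illustrates the kind of surface property the study examined.
def count_syllables(word):
    vowels = "aeiouy"
    count, prev_vowel = 0, False
    for ch in word.lower():
        is_vowel = ch in vowels
        if is_vowel and not prev_vowel:
            count += 1
        prev_vowel = is_vowel
    return max(count, 1)

print(count_syllables("banana"))  # 3
```

A generator of category-consistent names would combine such a count with the endings and stem types abstracted from existing exemplars.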