Biblioteca Digital

180 results for Syllable

Preliminary phonological analysis of the Limi dialect of Humla Bhotia

Abstract:

The purpose of this research was to analyse the phonological system of the Limi dialect of Humla Bhotia. Humla Bhotia is a Tibeto-Burman language that is spoken by approximately 4000–5000 people in the far northwestern Humla province of the Kingdom of Nepal. The language has not previously been the subject of analysis. The database for this thesis was collected on two different dialects of Humla Bhotia in Kathmandu, the capital of Nepal, from February to May 2000. I had three language informants who speak Humla Bhotia as their mother tongue. One of the informants speaks the Upper Humla dialect and the other two informants speak the Limi dialect. In this thesis I have concentrated on the phonology of the Limi dialect, but occasionally I also make reference to the Upper Humla dialect. The Limi database consists of 600 words elicited in isolation, sentences where words have been checked for consonantal and pitch variation, and five texts comprising 117 sentences. Firstly, I have studied the geographical location, population and dialects of Humla Bhotia. Five dialects were identified: Limi, Upper Humla, La Yakba, Nyinba and Humli Khyampa. Information on the dialect areas is based on the accounts of seven mother tongue speakers of the language and on Nancy Levine's (1988) anthropological research on the Nyinba ethnic group. Secondly, I have analysed the phonological system of Limi from the viewpoint of American structuralism, much along the lines followed by Pike 1966 [1947] and 1967 [1948]. In defining the prosodic elements I have also used acoustic analysis. In the Limi dialect there are 7 vowel phonemes. No vowel clusters occur within the same syllable. In this preliminary analysis 29 contrastive plosives, 8 affricates and 5–6 fricatives were found. The data also revealed 4 nasal phonemes, two rhotic phonemes, one lateral phoneme and two central approximants. Further research is, however, called for to check the phonemic status of these segments. 
Four contrastive prosodic elements were encountered: nasalisation, length, phonation type and pitch movement. There are two contrastive types of phonation: tense and lax. Many words were found with a third type of phonation, modal phonation. How modal phonation relates to the prosodic system is unclear at this stage and is therefore left for further research to determine. There are two contrastive pitch movement tonemes: a rising toneme and a falling toneme. The falling toneme occurs in free variation with a level pitch contour. The rising toneme appears to be linked with lax phonation and the falling toneme with tense phonation.

Cortical processing of speech and non-speech sounds in adults and newborns

Abstract:

Comprehension of a complex acoustic signal - speech - is vital for human communication, with numerous brain processes required to convert the acoustics into an intelligible message. In four studies in the present thesis, cortical correlates for different stages of speech processing in a mature linguistic system of adults were investigated. In two further studies, developmental aspects of cortical specialisation and its plasticity in adults were examined. In the present studies, electroencephalographic (EEG) and magnetoencephalographic (MEG) recordings of the mismatch negativity (MMN) response elicited by changes in repetitive unattended auditory events and the phonological mismatch negativity (PMN) response elicited by unexpected speech sounds in attended speech inputs served as the main indicators of cortical processes. Changes in speech sounds elicited the MMNm, the magnetic equivalent of the electric MMN, that differed in generator loci and strength from those elicited by comparable changes in non-speech sounds, suggesting intra- and interhemispheric specialisation in the processing of speech and non-speech sounds at an early automatic processing level. This neuronal specialisation for the mother tongue was also reflected in the more efficient formation of stimulus representations in auditory sensory memory for typical native-language speech sounds compared with those formed for unfamiliar, non-prototype speech sounds and simple tones. Further, adding a speech or non-speech sound context to syllable changes was found to modulate the MMNm strength differently in the left and right hemispheres. Following the acoustic-phonetic processing of speech input, phonological effort related to the selection of possible lexical (word) candidates was linked with distinct left-hemisphere neuronal populations. In summary, the results suggest functional specialisation in the neuronal substrates underlying different levels of speech processing. 
Subsequently, plasticity of the brain's mature linguistic system was investigated in adults, in whom representations for an aurally-mediated communication system, Morse code, were found to develop within the same hemisphere where representations for the native-language speech sounds were already located. Finally, recording and localization of the MMNm response to changes in speech sounds was successfully accomplished in newborn infants, encouraging future MEG investigations on, for example, the state of neuronal specialisation at birth.

Lukemisen vaikeuden kuntoutus ensiluokkalaisilla : Kolme pedagogista interventiota

Abstract:

Remediation of Reading Difficulties in Grade 1. Three Pedagogical Interventions

Keywords: initial teaching, learning to read, reading difficulties, intervention, dyslexia, remediation of dyslexia, home reading, computerized training

In this study three different reading interventions were tested for first-graders at risk of reading difficulties at school commencement. The intervention groups were compared with each other and with a control group receiving special education provided by the school. The first intervention was a new approach called syllable rhythmics, in which syllabic rhythm, phonological knowledge and letter-phoneme correspondence are emphasized. Syllable rhythmics is based on multi-sensory training elements aimed at finding the most functional modality for every child. The second intervention was computerized training of letter-sound correspondence with the Ekapeli learning game. The third intervention was home-based shared book reading, where every family was given a story book, and reading and writing exercises in the style of dialogic reading were prepared for each chapter of the book. The participants were 80 first-graders in 19 classes in nine schools. The children were matched into four groups according to pre-test results: three intervention groups and one control group. The interventions lasted ten weeks, starting in September of grade 1. The first post-test, including several measures of reading abilities, was administered in December. The first delayed post-test was administered in March, the second in September in grade 2, and the third, the “ALLU” test (a reading test for primary school), was administered in March in grade 2. The intervention and control groups differed only slightly from each other in grade 1. However, girls progressed significantly more than boys in both word reading and reading comprehension in December, and this difference remained in March. 
The children who had been cited as inattentive by their teachers also lagged behind the others in the post-tests in December and March. When participants were divided into two groups according to their initial letter knowledge at school entry, the weaker group (maximum 17 correctly named letters in pre-test) progressed more slowly in both word reading and reading comprehension in grade 1. Intervention group and gender had no interaction effect in grade 1. Instead, intervention group and attentiveness had an interaction effect on most test measures, with the inattentive students in the syllable rhythmics group doing worst and the attentive students in the control group doing best in grade 1. The smallest difference between the results of attentive and inattentive students was in the Ekapeli group. In grade 2, still only minor differences were found between the intervention groups and the control group. The only significant difference was in non-word reading, with the syllable rhythmics group outperforming the other groups in the fall. The difference between girls’ and boys’ performances in both technical reading and text comprehension disappeared in grade 2. The difference between the inattentive and attentive students could no longer be found in technical reading, and the difference became smaller in text comprehension as well. The difference between the two groups divided according to their initial letter knowledge disappeared in technical reading but remained significant in text comprehension measures in the ALLU test in the spring of grade 2. In all, the children in the study did better in the ALLU test than expected according to ALLU test norms. Although they were the weakest readers in their classes in the pre-test, 52.3 % reached the normal reading ability level. In the norm group 72.3 % of all students attained normal reading ability. The results of this study indicate that different types of remediation programs can be effective, and that special education has apparently been useful. 
The results suggest careful consideration of first-graders’ initial reading abilities (especially letter knowledge) and possible failure of attention; remediation should be individually targeted while flexibly using different methods.

Adaptive frequency scaled wavelet packet decomposition for frog call classification

Abstract:

Environmental changes have put great pressure on biological systems, leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored with the aid of acoustic sensors deployed in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time-consuming. Therefore, it is essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of the clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated to produce the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings and field recordings of 8 frog species from James Cook University recordings). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.
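
The clustering step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the centroids are turned into a frequency scale, so the `band_edges` helper, which places band boundaries at the midpoints between neighbouring centroids, is a hypothetical choice, and the dominant-frequency values and the 8000 Hz upper limit are made up for illustration.

```python
from statistics import mean


def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means (Lloyd's algorithm) with deterministic initialisation."""
    ordered = sorted(values)
    # Spread the initial centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def band_edges(centroids, upper_hz):
    """Hypothetical mapping: band boundaries at midpoints between neighbouring centroids."""
    inner = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
    return [0.0] + inner + [float(upper_hz)]


# Made-up dominant frequencies (Hz), one per segmented syllable.
dominant_hz = [610, 640, 655, 1480, 1500, 1525, 2950, 3000, 3040]
centroids = kmeans_1d(dominant_hz, k=3)
edges = band_edges(centroids, upper_hz=8000)
print(centroids)
print(edges)
```

With these toy values the three centroids fall near the three groups of calls, and the resulting edges give narrower bands where calls are concentrated, which is the general idea behind an adaptive frequency scale.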

Language learning in infancy

Abstract:

Although immensely complex, speech is also a very efficient means of communication between humans. Understanding how we acquire the skills necessary for perceiving and producing speech remains an intriguing goal for research. However, while learning is likely to begin as soon as we start hearing speech, the tools for studying the language acquisition strategies in the earliest stages of development remain scarce. One prospective strategy is statistical learning. In order to investigate its role in language development, we designed a new research method. The method was tested in adults using magnetoencephalography (MEG) as a measure of cortical activity. Neonatal brain activity was measured with electroencephalography (EEG). Additionally, we developed a method for assessing the integration of seen and heard syllables in the developing brain as well as a method for assessing the role of visual speech when learning phoneme categories. The MEG study showed that adults learn statistical properties of speech during passive listening to syllables. The amplitude of the N400m component of the event-related magnetic fields (ERFs) reflected the location of syllables within pseudowords. The amplitude was also enhanced for syllables in a statistically unexpected position. The results suggest a role for the N400m component in statistical learning studies in adults. Using the same research design with sleeping newborn infants, the auditory event-related potentials (ERPs) measured with EEG reflected the location of syllables within pseudowords. The results were successfully replicated in another group of infants. The results show that even newborn infants have a powerful mechanism for automatic extraction of statistical characteristics from speech. We also found that 5-month-old infants integrate some auditory and visual syllables into a fused percept, whereas other syllable combinations are not fully integrated. 
Auditory syllables were paired with visual syllables possessing a different phonetic identity, and the ERPs for these artificial syllable combinations were compared with the ERPs for normal syllables. For congruent auditory-visual syllable combinations, the ERPs did not differ from those for normal syllables. However, for incongruent auditory-visual syllable combinations, we observed a mismatch response in the ERPs. The results show an early ability to perceive speech cross-modally. Finally, we exposed two groups of 6-month-old infants to artificially created auditory syllables located between two stereotypical English syllables in the formant space. The auditory syllables followed, equally for both groups, a unimodal statistical distribution, suggestive of a single phoneme category. The visual syllables combined with the auditory syllables, however, were different for the two groups, one group receiving visual stimuli suggestive of two separate phoneme categories, the other receiving visual stimuli suggestive of only one phoneme category. After a short exposure, we observed different learning outcomes for the two groups of infants. The results thus show that visual speech can influence learning of phoneme categories. Altogether, the results demonstrate that complex language learning skills exist from birth. They also suggest a role for the visual component of speech in the learning of phoneme categories.

Phonetic tone signals quantity and word structure

Abstract:

Many languages exploit suprasegmental devices in signaling word meaning. Tone languages exploit fundamental frequency, whereas quantity languages rely on segmental durations to distinguish otherwise similar words. Traditionally, duration and tone have been taken as mutually exclusive. However, some evidence suggests that, in addition to durational cues, phonological quantity is associated with and co-signaled by changes in fundamental frequency in quantity languages such as Finnish, Estonian, and Serbo-Croat. The results from the present experiment show that the structure of disyllabic word stems in Finnish is indeed signaled tonally and that the phonological length of the stressed syllable is further tonally distinguished within the disyllabic sequence. The results further indicate that the observed association of tone and duration in perception is systematically exploited in speech production in Finnish.

Mig eller mej, själ eller sjel? Problem och lösningar vid transkription av svenska sångtexter

Abstract:

Abstract (Mig or mej, själ or sjel? Problems and solutions in the transcription of Swedish song texts): In this article I point out and discuss problems and solutions concerning the phonetic transcription of Swedish song texts. My material consists of 66 phonetically transcribed Swedish songs. The transcriptions were published by The Academy of Finnish Art Song in 2009. The first issue was which level of accuracy should be chosen. The transcriptions were created to be clear at a glance and suitable for the interpretive needs of non-Swedish-speaking singers. The principle was to use as few signs and symbols as possible without sacrificing accuracy. Certain songs were provided with additional information whenever there was a chance of misinterpretation. The second issue was which geographic variety of the language should be visible in the transcription: Standard Swedish or Finland-Swedish? The songs in the volume are a selection of well-known works that are also of international interest. Most were composed by Jean Sibelius (1865–1957), a substantial number of whose songs were based on poems written by Finland’s national poet, Johan Ludvig Runeberg (1804–1877). Thus I chose to use the variety of Swedish spoken in Finland, in order to reflect the cultural origin of the songs. This variety differs slightly from the variety spoken in Sweden on both the prosodic and the phonetic level. In singing, the musical notation gives the interpreter enough information about prosody. The differences mostly concern the phonemes. A fully consistent transcription was, however, difficult to make, due to vocal requirements. For example, in an unstressed final syllable the vowel was often indicated as a central vowel, which in singing is given a more direct emphasis than in a literal pronunciation, even if this central vowel does not occur in spoken Finland-Swedish.

Quantity and tone in Finnish lexically stressed syllables

Abstract:

This paper presents results from a study on the tonal aspects of quantity in Finnish lexically stressed syllables. Fourteen speakers produced a set of 66 utterances in which the quantity and structure of the lexically stressed syllable were systematically varied. The tonal aspects of the syllable nucleus (and of the nucleus and coda in the case of closed syllables) were studied in the framework of the Target Approximation theory as formulated by Yi Xu. The results show a clear tendency for the quantity distinction, and bimoraicity in general, to be signalled tonally in Finnish by a dynamic falling tone as opposed to a static high tone in short (one-mora) nuclei.

Word order and tonal shape in the production of focus in short Finnish utterances

Abstract:

This paper presents results from a study on the production of Finnish prosody. The effect of word order and tonal shape on the production of Finnish prosody was studied in utterances produced by 8 native Finnish speakers. Predictions formulated with regard to results from an earlier study pertaining to the perception of prominence were tested. These predictions had to do with the tonal shape of the utterances in the form of a flat hat pattern and the effect of word order on the so-called top-line declination within an adverbial phrase in the utterances. The results from the experiment give support to the following claims: the temporal domain of prosodic focus is the whole utterance, word order reversal from unmarked to marked has an effect on the production of prosody, and the production of the tonal aspects of focus in Finnish follows a basic flat hat pattern. That is, the prominence of a word can be produced by an f0 rise or a fall, depending on the location of the word in an utterance. The basic accentual shape of a Finnish word is then not a pointed rise/fall hat shape as claimed before, since it can vary depending on the syllable structure and the position within an utterance.

Testing concordance in species boundaries using acoustic, morphological, and molecular data in the field cricket genus Itaropsis (Orthoptera: Grylloidea, Gryllidae: Gryllinae)

Abstract:

In most taxa, species boundaries are inferred based on differences in morphology or DNA sequences revealed by taxonomic or phylogenetic analyses. In crickets, acoustic mating signals or calling songs have species-specific structures and provide a third data set to infer species boundaries. We examined the concordance in species boundaries obtained using acoustic, morphological, and molecular data sets in the field cricket genus Itaropsis. This genus is currently described by only one valid species, Itaropsis tenella, with a broad distribution in western peninsular India and Sri Lanka. Calling songs of males sampled from four sites in peninsular India exhibited significant differences in a number of call features, suggesting the existence of multiple species. Cluster analysis of the acoustic data, molecular phylogenetic analyses, and phylogenetic analyses combining all data sets suggested the existence of three clades. Whatever the differences in calling signals, no full congruence was obtained between all the data sets, even though the resultant lineages were largely concordant with the acoustic clusters. The genus Itaropsis could thus be represented by three morphologically cryptic incipient species in peninsular India; their distributions are congruent with usual patterns of endemism in the Western Ghats, India. Song evolution is analysed through the divergence in syllable period, syllable and call duration, and dominant frequency.

A novel acoustic-vibratory multimodal duet

Abstract:

The communication strategy of most crickets and bushcrickets typically consists of males broadcasting loud acoustic calling songs, while females perform phonotaxis, moving towards the source of the call. Males of the pseudophylline bushcricket species Onomarchus uninotatus produce an unusually low-pitched call, and we found that the immediate and most robust response of females to the male acoustic call was a bodily vibration, or tremulation, following each syllable of the call. We hypothesized that these bodily oscillations might send out a vibrational signal along the substrate on which the female stands, which males could use to localize her position. We quantified these vibrational signals using a laser vibrometer and found a clear phase relationship of alternation between the chirps of the male acoustic call and the female vibrational response. This system therefore constitutes a novel multimodal duet with a reliable temporal structure. We also found that males could localize the source of vibration but only if both the acoustic and vibratory components of the duet were played back. This unique multimodal duetting system may have evolved in response to higher levels of bat predation on searching bushcricket females than calling males, shifting part of the risk associated with partner localization onto the male. This is the first known example of bushcricket female tremulation in response to a long-range male acoustic signal and the first known example of a multimodal duet among animals.

Exploiting Chinese character models to improve speech recognition performance

Abstract:

The Chinese language is based on characters which are syllabic in nature. Since languages have syllabotactic rules which govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first-level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first-level recognition unit with multiple pronunciations per character. For comparison, the CU-HTK Mandarin word-based system was used to recognize words, which were then converted to character sequences. The character-only system error rates for one-best recognition were slightly worse than those of word-based character recognition. However, combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent CER gains of 0.1-0.2% absolute over the word-based standard system. Copyright © 2009 ISCA.
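
The log-linear combination mentioned in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the CU-HTK implementation: it assumes both systems score the same candidate list, and the hypothesis labels and probabilities are made up. A log-linear combination is a weighted sum of log scores, and equal weights correspond to the equally weighted setup the abstract reports.

```python
import math


def log_linear_combine(scores_a, scores_b, weight_a=0.5, weight_b=0.5):
    """Weighted sum of log scores for every hypothesis scored by both systems."""
    shared = scores_a.keys() & scores_b.keys()
    return {h: weight_a * scores_a[h] + weight_b * scores_b[h] for h in shared}


# Made-up log probabilities for three candidate transcriptions (hypothetical labels).
word_system = {"hyp_a": math.log(0.45), "hyp_b": math.log(0.50), "hyp_c": math.log(0.05)}
char_system = {"hyp_a": math.log(0.60), "hyp_b": math.log(0.30), "hyp_c": math.log(0.10)}

combined = log_linear_combine(word_system, char_system)
best_word = max(word_system, key=word_system.get)
best_combined = max(combined, key=combined.get)
print(best_word, best_combined)
```

In this toy case the word-based system alone prefers `hyp_b`, while the combination prefers `hyp_a`, showing how a second knowledge source can change the one-best choice.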

Language model combination and adaptation using weighted finite state transducers

Abstract:

In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.
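
As background for the idea of combining multiple n-gram models, the sketch below shows plain linear interpolation of two bigram models. This is a common textbook way of combining LMs and is an assumption here: it is not the WFST-based approach the paper investigates, and the token sequences and the interpolation weight are made up. The models are unsmoothed, so an unseen bigram simply contributes probability zero.

```python
from collections import Counter


def train_bigram(tokens):
    """Maximum-likelihood bigram probabilities P(next | previous)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return {pair: n / context_counts[pair[0]] for pair, n in pair_counts.items()}


def interpolate(model_a, model_b, weight_a):
    """Linear interpolation: P = weight_a * P_a + (1 - weight_a) * P_b."""
    pairs = model_a.keys() | model_b.keys()
    return {
        p: weight_a * model_a.get(p, 0.0) + (1.0 - weight_a) * model_b.get(p, 0.0)
        for p in pairs
    }


# Two made-up text sources, tokenised into syllable-like units.
source_one = "ba na na ba na".split()
source_two = "ba ba na ba ba".split()
combined = interpolate(train_bigram(source_one), train_bigram(source_two), weight_a=0.7)
print(combined[("ba", "na")], combined[("ba", "ba")])
```

Because each component model is a proper conditional distribution over the contexts it has seen, the interpolated probabilities for a shared context such as `ba` still sum to one.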

Prueba de emparejamiento de unidades fonológicas a partir de dibujos : diferencias de rendimiento entre niños prelectores de distinto estrato socioeconómico

Abstract:

This work is set within reading research that regards phonological awareness as an essential factor in learning to read. With the aim of detecting difficulties in the manipulation of sublexical units at an early stage, a total of 127 pre-reading children from the classes for four-year-olds and five-year-olds (Sala de 4 and Sala de 5) of two schools of different socioeconomic levels completed a syllable and phoneme matching test based on drawings. The results show significant differences in performance according to the unit assessed, as well as between the schools of different socioeconomic levels.

Language model cross adaptation for LVCSR system combination

Abstract:

State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple subsystems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. In normal cross adaptation it is assumed that useful diversity among systems exists only at the acoustic level. However, complementary features among complex LVCSR systems also manifest themselves in other layers of the modelling hierarchy, e.g., the subword and word levels. It is thus interesting to also cross adapt language models (LMs) to capture them. In this paper cross adaptation of multi-level LMs modelling both syllable and word sequences was investigated to improve LVCSR system combination. Significant error rate gains of up to 6.7% rel. were obtained over ROVER and acoustic-model-only cross adaptation when combining 13 Chinese LVCSR subsystems used in the 2010 DARPA GALE evaluation. © 2010 ISCA.
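
Error rate gains in abstracts like this one are reported in relative terms, which differ from absolute gains. The sketch below is a generic illustration with made-up strings, not data from the paper: it computes a character error rate from the Levenshtein distance and then reports the same improvement both ways. The helper names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,             # deletion of a reference symbol
                current[j - 1] + 1,          # insertion of a hypothesis symbol
                previous[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        previous = current
    return previous[-1]


def error_rate(ref, hyp):
    """Error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_gain(baseline, improved):
    """Relative error rate reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


# Made-up reference and system outputs.
reference = "abcdefghij"
baseline_cer = error_rate(reference, "abxdefgzij")   # two substitutions
improved_cer = error_rate(reference, "abxdefghij")   # one substitution
print(baseline_cer - improved_cer, relative_gain(baseline_cer, improved_cer))
```

Here the absolute gain is 10 percentage points while the relative gain is 50%, so the two figures should not be compared directly across papers.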
