158 resultados para Vowels
Resumo:
Dissertação de mestrado em Português Língua Não Materna (PLNM) - Português Língua Estrangeira (PLE) / Português Língua Segunda (PL2)
Resumo:
Dissertação de mestrado em Ciências da Linguagem
Resumo:
Tese de Doutoramento em Engenharia de Eletrónica e de Computadores
Resumo:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristicsin a text that are rarely controlled by the author, with those in other texts. When thegoal is to settle authorship questions, these characteristics should relate to the author’s style andnot to the genre, epoch or editor, and they should be such that their variation between authors islarger than the variation within comparable texts from the same author.For an overview of the literature on stylometry and some of the techniques involved, see for exampleMosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) orLebart, Salem and Berry (1998).Tirant lo Blanc, a chivalry book, is the main work in catalan literature and it was hailed to be“the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writterslike Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translatedseveral times into Spanish, Italian and French, with modern English translations by Rosenthal(1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465,but it was not printed until 1490.There is an intense and long lasting debate around its authorship sprouting from its first edition,where its introduction states that the whole book is the work of Martorell (1413?-1468), while atthe end it is stated that the last one fourth of the book is by Galba (?-1490), after the death ofMartorell. Some of the authors that support the theory of single authorship are Riquer (1990),Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer(1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).Neither of the two candidate authors left any text comparable to the one under study, and thereforediscriminant analysis can not be used to help classify chapters by author. By using sample textsencompassing about ten percent of the book, and looking at word length and at the use of 44conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that mightindicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba andGinebra (2000) estimates that stylistic boundary to be near chapter 383.Following the lead of the extensive literature, this paper looks into word length, the use of the mostfrequent words and into the use of vowels in each chapter of the book. Given that the featuresselected are categorical, that leads to three contingency tables of ordered rows and therefore tothree sequences of multinomial observations.Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3describes the problem of the estimation of a suden change-point in those sequences, in the followingsections we propose various ways to estimate change-points in multinomial sequences; the methodin section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma modelsonto the sequence of Chi-square distances between each row profiles and the average profile, theone in Section 6 fits models onto the sequence of values taken by the first component of thecorrespondence analysis as well as onto sequences of other summary measures like the averageword length. In Section 7 we fit models onto the marginal binomial sequences to identify thefeatures that distinguish the chapters before and after that boundary. Most methods rely heavilyon the use of generalized linear models
Resumo:
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
Resumo:
RÉSUMÉ Cette thèse porte sur le développement de méthodes algorithmiques pour découvrir automatiquement la structure morphologique des mots d'un corpus. On considère en particulier le cas des langues s'approchant du type introflexionnel, comme l'arabe ou l'hébreu. La tradition linguistique décrit la morphologie de ces langues en termes d'unités discontinues : les racines consonantiques et les schèmes vocaliques. Ce genre de structure constitue un défi pour les systèmes actuels d'apprentissage automatique, qui opèrent généralement avec des unités continues. La stratégie adoptée ici consiste à traiter le problème comme une séquence de deux sous-problèmes. Le premier est d'ordre phonologique : il s'agit de diviser les symboles (phonèmes, lettres) du corpus en deux groupes correspondant autant que possible aux consonnes et voyelles phonétiques. Le second est de nature morphologique et repose sur les résultats du premier : il s'agit d'établir l'inventaire des racines et schèmes du corpus et de déterminer leurs règles de combinaison. On examine la portée et les limites d'une approche basée sur deux hypothèses : (i) la distinction entre consonnes et voyelles peut être inférée sur la base de leur tendance à alterner dans la chaîne parlée; (ii) les racines et les schèmes peuvent être identifiés respectivement aux séquences de consonnes et voyelles découvertes précédemment. L'algorithme proposé utilise une méthode purement distributionnelle pour partitionner les symboles du corpus. Puis il applique des principes analogiques pour identifier un ensemble de candidats sérieux au titre de racine ou de schème, et pour élargir progressivement cet ensemble. Cette extension est soumise à une procédure d'évaluation basée sur le principe de la longueur de description minimale, dans- l'esprit de LINGUISTICA (Goldsmith, 2001). L'algorithme est implémenté sous la forme d'un programme informatique nommé ARABICA, et évalué sur un corpus de noms arabes, du point de vue de sa capacité à décrire le système du pluriel. Cette étude montre que des structures linguistiques complexes peuvent être découvertes en ne faisant qu'un minimum d'hypothèses a priori sur les phénomènes considérés. Elle illustre la synergie possible entre des mécanismes d'apprentissage portant sur des niveaux de description linguistique distincts, et cherche à déterminer quand et pourquoi cette coopération échoue. Elle conclut que la tension entre l'universalité de la distinction consonnes-voyelles et la spécificité de la structuration racine-schème est cruciale pour expliquer les forces et les faiblesses d'une telle approche. ABSTRACT This dissertation is concerned with the development of algorithmic methods for the unsupervised learning of natural language morphology, using a symbolically transcribed wordlist. It focuses on the case of languages approaching the introflectional type, such as Arabic or Hebrew. The morphology of such languages is traditionally described in terms of discontinuous units: consonantal roots and vocalic patterns. Inferring this kind of structure is a challenging task for current unsupervised learning systems, which generally operate with continuous units. In this study, the problem of learning root-and-pattern morphology is divided into a phonological and a morphological subproblem. The phonological component of the analysis seeks to partition the symbols of a corpus (phonemes, letters) into two subsets that correspond well with the phonetic definition of consonants and vowels; building around this result, the morphological component attempts to establish the list of roots and patterns in the corpus, and to infer the rules that govern their combinations. We assess the extent to which this can be done on the basis of two hypotheses: (i) the distinction between consonants and vowels can be learned by observing their tendency to alternate in speech; (ii) roots and patterns can be identified as sequences of the previously discovered consonants and vowels respectively. The proposed algorithm uses a purely distributional method for partitioning symbols. Then it applies analogical principles to identify a preliminary set of reliable roots and patterns, and gradually enlarge it. This extension process is guided by an evaluation procedure based on the minimum description length principle, in line with the approach to morphological learning embodied in LINGUISTICA (Goldsmith, 2001). The algorithm is implemented as a computer program named ARABICA; it is evaluated with regard to its ability to account for the system of plural formation in a corpus of Arabic nouns. This thesis shows that complex linguistic structures can be discovered without recourse to a rich set of a priori hypotheses about the phenomena under consideration. It illustrates the possible synergy between learning mechanisms operating at distinct levels of linguistic description, and attempts to determine where and why such a cooperation fails. It concludes that the tension between the universality of the consonant-vowel distinction and the specificity of root-and-pattern structure is crucial for understanding the advantages and weaknesses of this approach.
Resumo:
Mismatch negativity (MMN) overlaps with other auditory event-related potential (ERP) components. We examined the ERPs of 50 9- to 11-year-old children for vowels /i/, /y/ and equivalent complex tones. The goal was to separate MMN from obligatory ERP components using principal component analysis and equal probability control condition. In addition to the contrast of the deviant minus standard response, we employed the contrast of the deviant minus control response, to see whether the obligatory processing contributes to MMN in children. When looking for differences in speech deviant minus standard contrast, MMN starts around 112 ms. However, when both contrasts are examined, MMN emerges for speech at 160 ms whereas for nonspeech MMN is observed at 112 ms regardless of contrast. We argue that this discriminative response to speech stimuli at 112 ms is obligatory in nature rather than reflecting change detection processing.
Resumo:
Summary: Vowels in eastern Finnmark Saami
Resumo:
In this paper we provide a formal account for underapplication of vowel reduction to schwa in Majorcan Catalan loanwords and learned words. On the basis of the comparison of these data with those concerning productive derivation and verbal inflection, which show analogous patterns, in this paper we also explore the existing and not yet acknowledged correlation between those processes that exhibit a particular behaviour in the loanword phonology with respect to the native phonology of the language, those processes that show lexical exceptions and those processes that underapply due to morphological reasons. In light of the analysis of the very same data and taking into account the aforementioned correlation, we show how there might exist a natural diachronic relation between two kinds of Optimality Theory constraints which are commonly used but, in principle, mutually exclusive: positional faithfulness and contextual markedness constraints. Overall, phonological productivity is proven to be crucial in three respects: first, as a context of the grammar, given that «underapplication» is systematically found in what we call the productive phonology of the dialect (including loanwords, learned words, productive derivation and verbal inflection); second, as a trigger or blocker of processes, in that the productivity or the lack of productivity of a specific process or constraint in the language is what explains whether it is challenged or not in any of the depicted situations, and, third, as a guiding principle which can explain the transition from the historical to the synchronic phonology of a linguistic variety.
Resumo:
En el català de Mallorca, el procés de reducció vocàlica de les vocals mitjanes de la sèrie anterior a e neutra en posició àtona no opera (o subaplica) en determinades circumstàncies: (a) en formes derivades productives amb una vocal àtona situada a la síl·laba esquerra o inicial del radical i alternant amb e tancada o e oberta en el radical de la forma primitiva; (b) en formes verbals amb una vocal àtona situada a la síl·laba esquerra o inicial del radical i alternant amb e tancada en una altra forma verbal del mateix paradigma flexiu; (c) en manlleus i paraules apreses amb una vocal àtona e situada, també, a la síl·laba esquerra o inicial del radical i generalment precedida d"una consonant labial. En aquest treball, argumentem que hi ha dos factors que conspiren perquè això sigui d"aquesta manera: (a) tal com ja s"ha fet notar en treballs anteriors, la voluntat d"aquestes vocals d"assemblar-se a les vocals corresponents que apareixen en mots del mateix paradigma derivatiu o flexiu, i, en el cas de la derivació, sobretot quan la relació derivativa és productiva; (b) l"estatus privilegiat de la síl·laba situada a l"esquerra o a l"inici del radical. Per donar compte del primer factor, proposem una nova interpretació dels fets emmarcada en la Teoria de la Correspondència Transderivacional i el model dels Paradigmes Òptims. Per donar compte del segon factor, que ha passat desapercebut en aproximacions anteriors a les mateixes dades, partim de la Teoria de la Fidelitat Posicional i de la Teoria del Marcatge Posicional.
Resumo:
This paper gives a full description of the phonetics and phonology of Traditional Cockney and Popular London speech, treating these varieties as constituting a continuum rather than two separate dialects. Exemplification of the vowels, diphthongs and consonants is provided, both in isolate words and in connected speech, along with their range of variation. The frequencies of the vowels have been charted on the basis of the pronunciation of three elderly male speakers. Regarding the consonants, there are detailed observations on the features typically associated with the linguistic varieties examined: strong aspiration of unvoiced plosives, glottalization, H-dropping, L-vocalization and TH-fronting. A section on prosody provides coverage of lexical stress, rhythm and intonation. The paper takes into account up-to-date research on these phenomena, but does not deal with the most recent vowel shifts, some of which form part of Multi-cultural London English.
Resumo:
Preattentive perception of occasional deviating stimuli in the stream of standard stimuli can be recorded with cognitive event-related potential (ERP) mismatch negativity (MMN). The earlier detection of stimuli at the auditory cortex can be examined with N1 and P2 ERPs. The MMN recording does not require co-operation, it correlates with perceptual threshold, and even complex sounds can be used as stimuli. The aim of this study was to examine different aspects that should be considered when measuring discrimination of hearing with ERPs. The MMN was found to be stimulusintensity- dependent. As the intensity of sine wave stimuli was increased from 40 to 80 dB HL, MMN mean amplitudes increased. The effect of stimulus frequency on the MMN was studied so that the pitch difference would be equal in each stimulus block according to the psychophysiological mel scale or the difference limen of frequency (DLF). However, the blocks differed from each other. The contralateral white noise masking (50 dB EML) was found to attenuate the MMN amplitude when the right ear was stimulated. The N1 amplitude was attenuated and, in contrast, P2 amplitude was not affected by contralateral white noise masking. The perception and production of vowels by four postlingually deafened patients with a cochlear implant were studied. The MMN response could be elicited in the patient with the best vowel perception abilities. The results of the studies show that concerning the MMN recordings, the stimulus parameters and recording procedure design have a great influence on the results.
Resumo:
Middle ear infections (acute otitis media, AOM) are among the most common infectious diseases in childhood, their incidence being greatest at the age of 6–12 months. Approximately 10–30% of children undergo repetitive periods of AOM, referred to as recurrent acute otitis media (RAOM). Middle ear fluid during an AOM episode causes, on average, 20–30 dB of hearing loss lasting from a few days to as much as a couple of months. It is well known that even a mild permanent hearing loss has an effect on language development but so far there is no consensus regarding the consequences of RAOM on childhood language acquisition. The results of studies on middle ear infections and language development have been partly discrepant and the exact effects of RAOM on the developing central auditory nervous system are as yet unknown. This thesis aims to examine central auditory processing and speech production among 2-year-old children with RAOM. Event-related potentials (ERPs) extracted from electroencephalography can be used to objectively investigate the functioning of the central auditory nervous system. For the first time this thesis has utilized auditory ERPs to study sound encoding and preattentive auditory discrimination of speech stimuli, and neural mechanisms of involuntary auditory attention in children with RAOM. Furthermore, the level of phonological development was studied by investigating the number and the quality of consonants produced by these children. Acquisition of consonant phonemes, which are harder to hear than vowels, is a good indicator of the ability to form accurate memory representations of ambient language and has not been studied previously in Finnish-speaking children with RAOM. The results showed that the cortical sound encoding was intact but the preattentive auditory discrimination of multiple speech sound features was atypical in those children with RAOM. Furthermore, their neural mechanisms of auditory attention differed from those of their peers, thus indicating that children with RAOM are atypically sensitive to novel but meaningless sounds. The children with RAOM also produced fewer consonants than their controls. Noticeably, they had a delay in the acquisition of word-medial consonants and the Finnish phoneme /s/, which is acoustically challenging to perceive compared to the other Finnish phonemes. The findings indicate the immaturity of central auditory processing in the children with RAOM, and this might also emerge in speech production. This thesis also showed that the effects of RAOM on central auditory processing are long-lasting because the children had healthy ears at the time of the study. An effective neural network for speech sound processing is a basic requisite of language acquisition, and RAOM in early childhood should be considered as a risk factor for language development.
Resumo:
Les adultes peuvent éprouver des difficultés à discriminer des phonèmes d’une langue seconde (L2) qui ne servent pas à distinguer des items lexicaux dans leur langue maternelle (L1). Le Feature Model (FM) de Brown (1998) propose que les adultes peuvent réussir à créer des nouvelles catégories de sons seulement si celles-ci peuvent être construites à partir de traits distinctifs existant dans la L1 des auditeurs. Cette hypothèse a été testée sur plusieurs contrastes consonantiques dans différentes langues; cependant, il semble que les traits qui s’appliquent sur les voyelles n’aient jamais été examinés dans cette perspective et encore moins les traits qui opèrent à la fois dans les systèmes vocalique et consonantique et qui peuvent avoir un statut distinctif ou non-distinctif. Le principal objectif de la présente étude était de tester la validité du FM concernant le contraste vocalique oral-nasal du portugais brésilien (PB). La perception naïve du contraste /i/-/ĩ/ par des locuteurs du français, de l’anglais, de l’espagnol caribéen et de l’espagnol conservateur a été examinée, étant donné que ces quatre langues diffèrent en ce qui a trait au statut de la nasalité. De plus, la perception du contraste non-naïf /e/-/ẽ/ a été inclus afin de comparer les performances dans la perception naïve et non-naïve. Les résultats obtenus pour la discrimination naïve de /i/-/ĩ/ a permis de tirer les conclusions suivantes pour la première exposition à un contraste non natif : (1) le trait [nasal] qui opère de façon distinctive dans la grammaire d’une certaine L1 peut être redéployé au sein du système vocalique, (2) le trait [nasal] qui opère de façon distinctive dans la grammaire d’une certaine L1 ne peut pas être redéployé à travers les systèmes (consonne à voyelle) et (3) le trait [nasal] qui opère de façon non-distinctive dans la grammaire d’une certaine L1 peut être ou ne pas être redéployé au statut distinctif. En dernier lieu, la discrimination non-naïve de /e/-/ẽ/ a été réussie par tous les groupes, suggérant que les trois types de redéploiement s’avèrent possibles avec plus d’expérience dans la L2.
Resumo:
Medical fields requires fast, simple and noninvasive methods of diagnostic techniques. Several methods are available and possible because of the growth of technology that provides the necessary means of collecting and processing signals. The present thesis details the work done in the field of voice signals. New methods of analysis have been developed to understand the complexity of voice signals, such as nonlinear dynamics aiming at the exploration of voice signals dynamic nature. The purpose of this thesis is to characterize complexities of pathological voice from healthy signals and to differentiate stuttering signals from healthy signals. Efficiency of various acoustic as well as non linear time series methods are analysed. Three groups of samples are used, one from healthy individuals, subjects with vocal pathologies and stuttering subjects. Individual vowels/ and a continuous speech data for the utterance of the sentence "iruvarum changatimaranu" the meaning in English is "Both are good friends" from Malayalam language are recorded using a microphone . The recorded audio are converted to digital signals and are subjected to analysis.Acoustic perturbation methods like fundamental frequency (FO), jitter, shimmer, Zero Crossing Rate(ZCR) were carried out and non linear measures like maximum lyapunov exponent(Lamda max), correlation dimension (D2), Kolmogorov exponent(K2), and a new measure of entropy viz., Permutation entropy (PE) are evaluated for all three groups of the subjects. Permutation Entropy is a nonlinear complexity measure which can efficiently distinguish regular and complex nature of any signal and extract information about the change in dynamics of the process by indicating sudden change in its value. The results shows that nonlinear dynamical methods seem to be a suitable technique for voice signal analysis, due to the chaotic component of the human voice. Permutation entropy is well suited due to its sensitivity to uncertainties, since the pathologies are characterized by an increase in the signal complexity and unpredictability. Pathological groups have higher entropy values compared to the normal group. The stuttering signals have lower entropy values compared to the normal signals.PE is effective in charaterising the level of improvement after two weeks of speech therapy in the case of stuttering subjects. PE is also effective in characterizing the dynamical difference between healthy and pathological subjects. This suggests that PE can improve and complement the recent voice analysis methods available for clinicians. The work establishes the application of the simple, inexpensive and fast algorithm of PE for diagnosis in vocal disorders and stuttering subjects.