Biblioteca Digital

This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques that combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular holistic framework. The framework is implemented using the Relational Tree (R-Tree) technique (Ehrich and Foith, 1976). R-Tree is a sophisticated data structure that provides a multi-dimensional description of a waveform. A Skeletal Tree (S-Tree) is first generated using algorithms based on the tone phonological rules of SY. Subsequent steps update the S-Tree by computing the numerical values of the prosody dimensions. To implement the intonation dimension, fuzzy control rules were developed based on data from native speakers of Yorùbá. The Classification And Regression Tree (CART) and the Fuzzy Decision Tree (FDT) techniques were tested in modelling the duration dimension. The FDT was selected based on its better performance. An important feature of our R-Tree framework is its flexibility: it facilitates the independent implementation of the different dimensions of prosody, i.e. duration and intonation, using different techniques, and their subsequent integration. Our approach provides us with a flexible and extendible model that can also be used to implement, study and explain the theory behind aspects of the phenomena observed in speech prosody.
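The two-stage idea in this abstract (build a skeletal tree first, then fill in each prosody dimension independently) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Node` class, the `build_skeleton` and `fill` helpers, the made-up syllables, and the toy duration and F0 numbers are not taken from the thesis. The toy models merely stand in for the fuzzy control rules and the Fuzzy Decision Tree the author describes, and the sketch does not reproduce the actual R-Tree or S-Tree algorithms.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node; syllable leaves carry a tone label (H, M or L)."""
    label: str
    tone: Optional[str] = None
    duration_ms: Optional[float] = None  # duration dimension, filled in later
    f0_hz: Optional[float] = None        # intonation dimension, filled in later
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> Iterator["Node"]:
        """Yield the leaf nodes (syllables) from left to right."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


def build_skeleton(words: List[List[Tuple[str, str]]]) -> Node:
    """Stage 1: build a skeletal tree that holds structure and tones only."""
    root = Node("utterance")
    for index, syllables in enumerate(words):
        word = Node(f"word{index}")
        word.children = [Node(text, tone=tone) for text, tone in syllables]
        root.children.append(word)
    return root


def fill(tree: Node, attribute: str, model: Callable[[Node], float]) -> None:
    """Stage 2: compute one prosody dimension without touching the others."""
    for leaf in tree.leaves():
        setattr(leaf, attribute, model(leaf))


# Toy stand-in models with invented numbers (assumptions, not results).
TOY_F0_HZ = {"H": 140.0, "M": 120.0, "L": 100.0}


def toy_duration(leaf: Node) -> float:
    return 150.0 + 20.0 * len(leaf.label)


def toy_f0(leaf: Node) -> float:
    return TOY_F0_HZ[leaf.tone]


# Made-up syllables with tone labels; not real Yorùbá words.
tree = build_skeleton([[("ba", "H"), ("ba", "L")], [("wa", "M")]])
fill(tree, "duration_ms", toy_duration)  # duration dimension
fill(tree, "f0_hz", toy_f0)              # intonation dimension

for leaf in tree.leaves():
    print(leaf.label, leaf.tone, leaf.duration_ms, leaf.f0_hz)
```

Because `fill` takes the model as a parameter, either dimension could be swapped for a different technique without changing the tree, which is the kind of flexibility the abstract attributes to the framework.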

A modular holistic approach to prosody modelling for Standard Yorùbá speech synthesis

Relevance: 20.00%

Abstract:

This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.

Infants' selective attention to faces and prosody of speech: The roles of intersensory redundancy and exploratory time

Relevance: 20.00%

Abstract:

One of the overarching questions in the field of infant perceptual and cognitive development concerns how selective attention is organized during early development to facilitate learning. The following study examined how infants' selective attention to properties of social events (i.e., prosody of speech and facial identity) changes in real time as a function of intersensory redundancy (redundant audiovisual, nonredundant unimodal visual) and exploratory time. Intersensory redundancy refers to the spatially coordinated and temporally synchronous occurrence of information across multiple senses. Real-time macro- and micro-structural change in infants' scanning patterns of dynamic faces was also examined.

According to the Intersensory Redundancy Hypothesis, information presented redundantly and in temporal synchrony across two or more senses recruits infants' selective attention and facilitates perceptual learning of highly salient amodal properties (properties that can be perceived across several sensory modalities, such as the prosody of speech) at the expense of less salient modality-specific properties. Conversely, information presented to only one sense facilitates infants' learning of modality-specific properties (properties that are specific to a particular sensory modality, such as facial features) at the expense of amodal properties (Bahrick & Lickliter, 2000, 2002).

Infants' selective attention to and discrimination of prosody of speech and facial configuration were assessed in a modified visual paired comparison paradigm. In redundant audiovisual stimulation, it was predicted infants would show discrimination of prosody of speech in the early phases of exploration and facial configuration in the later phases of exploration. Conversely, in nonredundant unimodal visual stimulation, it was predicted infants would show discrimination of facial identity in the early phases of exploration and prosody of speech in the later phases of exploration. Results provided support for the first prediction and indicated that following redundant audiovisual exposure, infants showed discrimination of prosody of speech earlier in processing time than discrimination of facial identity. Data from the nonredundant unimodal visual condition provided partial support for the second prediction and indicated that infants showed discrimination of facial identity, but not prosody of speech. The dissertation study contributes to the understanding of the nature of infants' selective attention and processing of social events across exploratory time.

Airwaves: 100 years of radio

Relevance: 10.00%

Abstract:

Music composition using prominent broadcast speeches across the whole twentieth century in commemoration of the centenary of Marconi's first transatlantic radio transmission. The work is based on creating music from the found objects of melody derived from spoken intonation. Recordings of the speeches are accompanied throughout by live instrumental music.

Taken

Relevance: 10.00%

Abstract:

This eighty-minute work examines responses to Australia's Stolen Generations history. Based on recorded interviews with Aboriginal survivors of state removal of children, Taken uses music as a metaphor for white Australians listening to these stories, as the music follows the spoken intonation in the interviews.

The social orders of family mealtime

Relevance: 10.00%

Abstract:

This study examined the everyday practices of families within the context of family mealtime to investigate how members accomplished mealtime interactions. Using an ethnomethodological approach, conversation analysis and membership categorization analysis, the study investigated the interactional resources that family members used to assemble their social orders moment by moment during family mealtimes. While there is interest in mealtimes within educational policy, health research and the media, there remain few studies that provide fine-grained detail about how members produce the social activity of having a family meal. Findings from this study contribute empirical understandings about families and family mealtime. Two families with children aged 2 to 10 years were observed as they accomplished their everyday mealtime activities. Data collection took place in the family homes where family members video recorded their naturally occurring mealtimes. Each family was provided with a video camera for a one-month period and they decided which mealtimes they recorded, a method that afforded participants greater agency in the data collection process and made available to the analyst a window into the unfolding of the everyday lives of the families. A total of 14 mealtimes across the two families were recorded, capturing 347 minutes of mealtime interactions. Selected episodes from the data corpus, which includes centralised breakfast and dinnertime episodes, were transcribed using the Jeffersonian system. Three data chapters examine extended sequences of family talk at mealtimes, to show the interactional resources used by members during mealtime interactions. The first data chapter explores multiparty talk to show how the uniqueness of the occasion of having a meal influences turn design. 
It investigates the ways in which members accomplish two-party talk within a multiparty setting, showing how one child "tells" a funny story to accomplish the drawing together of his brothers as an audience. As well, this chapter identifies the interactional resources used by the mother to cohort her children to accomplish the choralling of grace. The second data chapter draws on sequential and categorical analysis to show how members are mapped to a locally produced membership category. The chapter shows how the mapping of members into particular categories is consequential for social order; for example, aligning members who belong to the membership category "had haircuts" and keeping out those who "did not have haircuts". Additional interactional resources such as echoing, used here to refer to the use of exactly the same words, similar prosody and physical action, and increasing physical closeness, are identified as important to the unfolding talk particularly as a way of accomplishing alignment between the grandmother and grand-daughter. The third and final data analysis chapter examines topical talk during family mealtimes. It explicates how members introduce topics of talk with an orientation to their co-participant and the way in which the take up of a topic is influenced both by the sequential environment in which it is introduced and the sensitivity of the topic. Together, these three data chapters show aspects of how family members participated in family mealtimes. The study contributes four substantive themes that emerged during the analytic process and, as such, the themes reflect what the members were observed to be doing. The first theme identified how family knowledge was relevant and consequential for initiating and sustaining interaction during mealtime with, for example, members buying into the talk of other members or being requested to help out with knowledge about a shared experience. 
Knowledge about members and their activities was evident with the design of questions evidencing an orientation to coparticipant’s knowledge. The second theme found how members used topic as a resource for social interaction. The third theme concerned the way in which members utilised membership categories for producing and making sense of social action. The fourth theme, evident across all episodes selected for analysis, showed how children’s competence is an ongoing interactional accomplishment as they manipulated interactional resources to manage their participation in family mealtime. The way in which children initiated interactions challenges previous understandings about children’s restricted rights as conversationalists. As well as making a theoretical contribution, the study offers methodological insight by working with families as research participants. The study shows the procedures involved as the study moved from one where the researcher undertook the decisions about what to videorecord to offering this decision making to the families, who chose when and what to videorecord of their mealtime practices. Evident also are the ways in which participants orient both to the video-camera and to the absent researcher. For the duration of the mealtime the video-camera was positioned by the adults as out of bounds to the children; however, it was offered as a "treat" to view after the mealtime was recorded. While situated within family mealtimes and reporting on the experiences of two families, this study illuminates how mealtimes are not just about food and eating; they are social. The study showed the constant and complex work of establishing and maintaining social orders and the rich array of interactional resources that members draw on during family mealtimes. The family’s interactions involved members contributing to building the social orders of family mealtime. 
With mealtimes occurring in institutional settings involving young children, such as long day care centres and kindergartens, the findings of this study may help educators working with young children to see the rich interactional opportunities mealtimes afford children, the interactional competence that children demonstrate during mealtimes, and the important role/s that adults may assume as co-participants in interactions with children within institutional settings.

Multi-method, multi-theoretical, multi-level research in the learning sciences

Relevance: 10.00%

Abstract:

We examine methodologies and methods that apply to multi-level research in the learning sciences. In so doing we describe how multiple theoretical frameworks inform the use of different methods that apply to social levels involving space-time relationships that are not accessible consciously as social life is enacted. Most of the methods involve analyses of video and audio files. Within a framework of interpretive research we present a methodology of event-oriented social science, which employs video ethnography, narrative, conversation analysis, prosody analysis, and facial expression analysis. We illustrate multi-method research in an examination of the role of emotions in teaching and learning. Conversation and prosody analyses augment facial expression analysis and ethnography. We conclude with an exploration of ways in which multi-level studies can be complemented with neural level analyses.

Relationship between emotional climate and the fluency of classroom interactions

Relevance: 10.00%

Abstract:

This study examined emotional climate in relation to the teaching and learning of grade 7 science. A multi-method and multi-theoretic approach used sociocultural frameworks as a foundation for interpretive research, conversation analysis, prosody analysis, and studies of nonverbal conduct. Emotional climate varied continuously throughout a lesson. Dialogues occurred and afforded learning when interactions between the teacher and students were fluent and included humour and collective effervescence. Emotional climate was negatively valenced when the teacher and/or students endeavoured to establish and maintain power by restricting others’ participation to spectator roles. The teacher’s endeavours to maintain and establish control over students were potentially detrimental to teaching and learning, teachers and learners. This type of teaching gradually evolved into a form we referred to as cranky teaching, whereby the teacher and her students showed signs of frustration and the enacted teaching and learning roles lacked fluency. The methods we pioneered in the present study might be helpful for other teachers who wish to participate in research on their classes to ascertain what works and should be strengthened, and identify practices and rituals that are deleterious and in need of change.

Methods for sociological inquiry on emotion in educational settings

Relevance: 10.00%

Abstract:

Sociological approaches to inquiry on emotion in educational settings are growing. Despite a long tradition of research and theory in disciplines such as psychology and sociology, the methods and approaches for naturalistic investigation of emotion are in a developmental phase in educational settings. In this article, recent empirical studies on emotion in educational contexts are canvassed. The discussion focuses on the use of multiple methods within research conducted in high school and university classrooms, highlighting recent methodological progress. The methods discussed include facial expression analysis, verbal and non-verbal conduct, and self-report methods. Analyses drawn from different studies, informed by perspectives from microsociology, highlight the strengths and limitations of any one method. The power and limitations of multi-method approaches are discussed.

Structure informationnelle et constructions du kabyle : Etude de trois types de phrase dans le cadre de la grammaire constructionnelle

Relevance: 10.00%

Abstract:

Information structure and Kabyle constructions: Three sentence types in the Construction Grammar framework. The study examines three Kabyle sentence types and their variants. These sentence types have been chosen because they code the same state of affairs but have different syntactic structures. The sentence types are Dislocated sentence, Cleft sentence, and Canonical sentence. I argue first that a proper description of these sentence types should include information structure and, second, that a description which takes into account information structure is possible in the Construction Grammar framework. The study thus constitutes a testing ground for the applicability of Construction Grammar to a lesser-known language. It constitutes a testing ground notably because the three types of sentences cannot be differentiated without information structure categories and, consequently, these categories must also be integrated in the grammatical description. The information structure analysis is based on the model outlined by Knud Lambrecht. In that model, information structure is considered a component of sentence grammar that assures the pragmatically correct sentence forms. The work starts with an examination of the three sentence types and the analyses that have been done in André Martinet's functional grammar framework. This introduces the sentence types chosen as the object of study and discusses the difficulties related to their analysis. After a presentation of the state of the art, including earlier and more recent models, the principles and notions of Construction Grammar and of Lambrecht's model are introduced and explicated. The information structure analysis is presented in three chapters, each treating one of the three sentence types. The analyses are based on spoken language data and elicitation. Prosody is included in the study when a syntactic structure seems to code two different focus structures. 
In such cases, it is pertinent to investigate whether these are coded by prosody. The final chapter presents the constructions that have been established and the problems encountered in analysing them. It also discusses the impact of the study on the theories used and on the theory of syntax in general.

Spontaanin puheen prosodinen jaksottelu

Relevance: 10.00%

Abstract:

The common focus of the studies brought together in this work is the prosodic segmentation of spontaneous speech. The theoretically most central aspect is the introduction and further development of the IJ-model of intonational chunking. The study consists of a general introduction and five detailed studies that approach prosodic chunking from different perspectives. The data consist of recordings of face-to-face interaction in several spoken varieties of Finnish and Finland Swedish; the methodology is usage-based and qualitative. The term “speech prosody” refers primarily to the melodic and rhythmic characteristics of speech. Both speaking and understanding speech require the ability to segment the flow of speech into suitably sized prosodic chunks. In order to be usage-based, a study of spontaneous speech consequently needs to be based on material that is segmented into prosodic chunks of various sizes. The segmentation is seen to form a hierarchy of chunking. The prosodic models that have so far been developed and employed in Finland have been based on sentences read aloud, which has made it difficult to apply these models in the analysis of spontaneous speech. The prosodic segmentation of spontaneous speech has not previously been studied in detail in Finland. This research focuses mainly on the following three questions: (1) What are the factors that need to be considered when developing a model of prosodic segmentation of speech, so that the model can be employed regardless of the language or dialect under analysis? (2) What are the characteristics of a prosodic chunk, and what are the similarities in the ways chunks of different languages and varieties manifest themselves that will make it possible to analyze different data according to the same criteria? (3) How does the IJ-model of intonational chunking introduced as a solution to question (1) function in practice in the study of different varieties of Finnish and Finland Swedish? 
The boundaries of the prosodic chunks were manually marked in the material according to context-specific acoustic and auditory criteria. On the basis of the data analyzed, the IJ-model was further elaborated and implemented, thus allowing comparisons between different language varieties. On the basis of the empirical comparisons, a prosodic typology is presented for the dialects of Swedish in Finland. The general contention is that the principles of the IJ-model can readily be used as a methodological tool for prosodic analysis irrespective of language varieties.

La contextualisation du discours radiophonique par des moyens prosodiques. L’exemple de cinq grands philosophes français du XXe siècle

Relevance: 10.00%

Abstract:

"The contextualisation of radio discourse by prosodic means. The example of five great French philosophers of the 20th century." The dissertation deals with the contextualisation of speech by prosodic means. In other words, it examines how the prosodic features of speech (such as pitch contour, intensity, pauses, duration and rhythm) guide the interpretation of speech alongside word and sentence meanings, which have traditionally received more study. The work focuses on seven prosodically marked patterns that consist of conspicuous changes in one or several parameters. The phenomena are examined from the point of view of their acoustic forms as well as their typical contexts of occurrence and discursive functions. The data consist of radio programmes in which five great French philosophers of the 20th century speak: Gaston Bachelard, Albert Camus, Michel Foucault, Maurice Merleau-Ponty and Jean-Paul Sartre. The programmes were broadcast on various radio channels in France between 1948 and 1973. The results of the dissertation show that prosodically marked patterns are multidimensional speech phenomena that play a central role in contextualising what is said: they can, for example, raise or lower the informational value of what is said, express the speaker's strong or weak commitment to what they say, signal the continuation or the end of a structural unit, etc. The dissertation also contains contrastive parts in which the phenomena are compared with a melodic figure that occurs in classical piano music and with a prosodic phenomenon of Finnish. The results suggest that a certain kind of melodic figure is used as a similar structuring device in both speech and classical music. In addition, the results indicate that certain melodic forms are used to create similar implications in two languages as different as Finnish and French. One part of the dissertation deals with the prosodic marking of the full stop and the comma in speech. According to the results, the full stop and the comma each have their own oral prototype: the full stop is typically marked by a fall in pitch and a pause, and the comma by a rise in pitch and a pause. The most significant results, however, concern cases in which a punctuation mark is interpreted in a prosodically atypical way: both the full stop and the comma appear to have several different oral counterparts, and the functions of punctuation marks can turn out very differently depending on their prosodic interpretation.
