Digital Library

41 results for "Aquecimento vocal" (vocal warm-up)

in Queensland University of Technology - ePrints Archive

Utilise Vocal Tract Length Normalisation for Robust Automatic Language Identification

Relevance: 20.00%


Fixing the volatile : studio vocal performance techniques

Relevance: 20.00%

Abstract:

The process of compiling a studio vocal performance from many takes can often result in the performer producing a new complete performance once this new "best of" assemblage is heard back. This paper investigates the ways that the physical process of recording can alter vocal performance techniques, and in particular, the establishing of a definitive melodic and rhythmic structure. Drawing on his many years of experience as a commercially successful producer, including the attainment of a Grammy award, the author will analyse the process of producing a “credible” vocal performance in depth, with specific case studies and examples. The question of authenticity in rock and pop will also be discussed and, in this context, the uniqueness of the producer’s role as critical arbiter – what gives the producer the authority to make such performance evaluations? Techniques for creating conditions in the studio that are conducive to vocal performances, in many ways a very unnatural performance environment, will be discussed, touching on areas such as the psycho-acoustic properties of headphone mixes, the avoidance of intimidatory practices, and a methodology for inducing the perception of a “familiar” acoustic environment.


Performing for (and against) the microphone : personal takes : producing a credible vocal

Relevance: 20.00%

Abstract:

This contribution examines the effect that the studio practice of compiling vocals from many takes has on the performance of vocalists.


Vocal repertoire of the New Zealand kea parrot Nestor notabilis

Relevance: 20.00%

Abstract:

The unique alpine-living kea parrot Nestor notabilis has been the focus of numerous cognitive studies, but its communication system has so far been largely neglected. We examined 2,884 calls recorded in New Zealand’s Southern Alps. Based on audio and visual spectrographic differences, these calls were categorised into seven distinct call types: the non-oscillating ‘screech’ contact call and ‘mew’; the oscillating ‘trill’, ‘chatter’, ‘warble’ and ‘whistle’; and a hybrid ‘screech-trill’. Most of these calls contained aspects that were individually unique, in addition to potentially encoding an individual’s sex and age. Additionally, for each recording, the sender’s previous and next calls were noted, as well as any response given by conspecifics. We found that the previous and next calls made by the sender were most often of the same type, and that the next most likely preceding and/or following call type was the screech call, a contact call which sounds like the ‘kee-ah’ from which the bird’s name derives. Because the kea is a social bird capable of covering large distances over visually obstructive terrain, long-distance contact calls may be of considerable importance for social cohesion. Contact calls allow kea to locate conspecifics and congregate in temporary groups for social activities. The most likely response to any given call was a screech, usually followed by the same type of call as the initial call made by the sender, although responses differed depending on the age of the caller. The exception was the warble, the kea’s play call, to which the most likely response was another warble. As the most common call type and the default response to another call, the ‘contagious’ screech contact call appears to play a central role in kea vocal communication and social cohesion.
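
The tallying of previous and next call types described in this abstract can be illustrated with a small sketch. It assumes that each sender's calls are available as an ordered list of call-type labels; the function names and any example sequence are hypothetical and are not taken from the study.

```python
from collections import Counter


def transition_counts(calls):
    """Count how often each call type is immediately followed by another.

    `calls` is an ordered list of call-type labels from one sender
    (an assumed data layout, not the study's actual format).
    """
    # Pair every call with the call that follows it in the sequence.
    return Counter(zip(calls, calls[1:]))


def most_likely_next(calls, call_type):
    """Return the label that most often follows `call_type`, or None."""
    followers = Counter(
        nxt for prev, nxt in zip(calls, calls[1:]) if prev == call_type
    )
    return followers.most_common(1)[0][0] if followers else None
```

Calling `most_likely_next(sequence, "warble")` on a labelled sequence would return whichever call type most often followed a warble in that sequence.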


Orthographic/phonological facilitation of naming responses in the picture-word task: An event-related fMRI study using overt vocal responding

Relevance: 20.00%

Abstract:

In the picture-word interference task, naming responses are facilitated when a distractor word is orthographically and phonologically related to the depicted object as compared to an unrelated word. We used event-related functional magnetic resonance imaging (fMRI) to investigate the cerebral hemodynamic responses associated with this priming effect. Serial (or independent-stage) and interactive models of word production that explicitly account for picture-word interference effects assume that the locus of the effect is at the level of retrieving phonological codes, a role attributed recently to the left posterior superior temporal cortex (Wernicke's area). This assumption was tested by randomly presenting participants with trials from orthographically related and unrelated distractor conditions and acquiring image volumes coincident with the estimated peak hemodynamic response for each trial. Overt naming responses occurred in the absence of scanner noise, allowing reaction time data to be recorded. Analysis of this data confirmed the priming effect. Analysis of the fMRI data revealed blood oxygen level-dependent signal decreases in Wernicke's area and the right anterior temporal cortex, whereas signal increases were observed in the anterior cingulate, the right orbitomedial prefrontal, somatosensory, and inferior parietal cortices, and the occipital lobe. The results are interpreted as supporting the locus for the facilitation effect as assumed by both classes of theoretical model of word production. In addition, our results raise the possibilities that, counterintuitively, picture-word interference might be increased by the presentation of orthographically related distractors, due to competition introduced by activation of phonologically related word forms, and that this competition requires inhibitory processes to be resolved. The priming effect is therefore viewed as being sufficient to offset the increased interference. 
We conclude that information from functional imaging studies might be useful for constraining theoretical models of word production.


Temporal and environmental influences on the vocal behaviour of a nocturnal bird

Relevance: 20.00%

Abstract:

Temporal and environmental variation in vocal activity can provide information on avian behaviour and call function not available to short-term experimental studies. Intersexual differences in this variation can provide insight into selection effects. Yet factors influencing vocal behaviour have not been assessed in many birds, even those monitored by acoustic methods. This applies to the New Zealand kiwi (Apterygidae), for which call censuses are used extensively in conservation monitoring, yet whose acoustic ecology is poorly understood. We investigated little spotted kiwi Apteryx owenii vocal behaviour over 3 yr, measuring influences on vocal activity in both sexes from time of night, season, weather conditions and lunar cycle. We tested hypotheses that call rate variation reflects call function, foraging efficiency, historic predation risk and variability in sound transmission, and that there are intersexual differences in call function. Significant seasonal variation showed that vocalisations were important in kiwi reproduction, and intersexual synchronisation of call rates indicated that contact, pair-bonding or resource defence were key functions. All weather variables significantly affected call rates, with elevated calling during increased humidity and ground moisture indicating a relation between vocal activity and foraging conditions. A significant decrease in calling activity on cloudy nights, combined with no moonlight effect, suggests an impact of light pollution in this species. These influences on vocal activity provide insight into kiwi call function, have direct consequences for conservation monitoring of kiwi, and have wider implications in understanding vocal behaviour in a range of nocturnal birds.


The fabric of transcultural collaboration: Interweaving the traditional Korean vocal form of p'ansori and the contemporary Japanese dance form of butoh in a transculturally Australian context

Relevance: 20.00%

Abstract:

This practice-led research investigated the negotiation processes informing effective models of transcultural collaboration. In a creative project interweaving the image-based physicality of the Japanese dance form of butoh with the traditional Korean vocal style of p'ansori, a series of creative development cycles were undertaken with a team of artists from Australia and Korea, culminating in Deluge, a work of physical theatre. The development of interventions at 'sites of transcultural potential' resulted in improvements to the negotiation of interpersonal relationships and assisted in the emergence of a productive working environment in transculturally collaborative artistic practice.


Size distribution and sites of origin of droplets expelled during expiratory activities

Relevance: 10.00%

Abstract:

A new Expiratory Droplet Investigation System (EDIS) was used to conduct the most comprehensive program of study to date of the dilution-corrected droplet size distributions produced during different respiratory activities. Distinct physiological processes were responsible for specific size distribution modes. The majority of particles for all activities were produced in one or more modes, with diameters below 0.8 µm. That mode occurred during all respiratory activities, including normal breathing. A second mode at 1.8 µm was produced during all activities, but at lower concentrations. Speech produced particles in modes near 3.5 µm and 5 µm. The modes became most pronounced during continuous vocalization, suggesting that the aerosolization of secretions lubricating the vocal cords is a major source of droplets in terms of number. Non-equilibrium droplet evaporation was not detectable for particles between 0.5 and 20 µm, implying that evaporation to the equilibrium droplet size occurred within 0.8 s.


Amphibian

Relevance: 10.00%

Abstract:

Amphibian is a 10’00’’ musical work which explores new musical interfaces and approaches to hybridising performance practices from the popular music, electronic dance music and computer music traditions. The work is designed to be presented in a range of contexts associated with the electro-acoustic, popular and classical music traditions. The work is for two performers using two synchronised laptops, an electric guitar and a custom designed gestural interface for vocal performers - the e-Mic (Extended Mic-stand Interface Controller). This interface was developed by one of the co-authors, Donna Hewitt. The e-Mic allows a vocal performer to manipulate the voice in real time through the capture of physical gestures via an array of sensors - pressure, distance, tilt - along with ribbon controllers and an X-Y joystick microphone mount. Performance data are then sent to a computer, running audio-processing software, which is used to transform the audio signal from the microphone. In this work, data are also exchanged between performers via a local wireless network, allowing performers to work with shared data streams. The duo employs the gestural conventions of guitarist and singer (i.e. 'a band' in a popular music context), but transforms these sounds and gestures into new digital music. The gestural language of popular music is deliberately subverted and taken into a new context. The piece thus explores the nexus between the sonic and performative practices of electro-acoustic music and intelligent electronic dance music (‘idm’). This work was situated in the research fields of new musical interfacing, interaction design, experimental music composition and performance. The contexts in which the research was conducted were live musical performance and studio music production. The work investigated new methods for musical interfacing, performance data mapping, hybrid performance and compositional practices in electronic music. The research methodology was practice-led.
New insights were gained from the iterative experimental workshopping of gestural inputs, musical data mapping, inter-performer data exchange, software patch design, data and audio processing chains. In respect of interfacing, there were innovations in the design and implementation of a novel sensor-based gestural interface for singers, the e-Mic, one of the few existing gestural controllers for singers. This work explored the compositional potential of sharing real time performance data between performers and deployed novel methods for inter-performer data exchange and mapping. As regards stylistic and performance innovation, the work explored and demonstrated an approach to the hybridisation of the gestural and sonic language of popular music with recent ‘post-digital’ approaches to laptop-based experimental music. The development of the work was supported by an Australia Council Grant. Research findings have been disseminated via a range of international conference publications, recordings, radio interviews (ABC Classic FM), broadcasts, and performances at international events and festivals. The work was curated into the major Australian international festival, Liquid Architecture, and was selected by an international music jury (through blind peer review) for presentation at the International Computer Music Conference in Belfast, N. Ireland.


Nodule

Relevance: 10.00%

Abstract:

Nodule is a 19'54" musical work for two electronic music performers, two laptop computers and a custom built, sensor-based microphone controller - the e-Mic (Extended Mic-stand Interface Controller). This interface was developed by one of the co-authors, Donna Hewitt. The e-Mic allows a vocal performer to manipulate their voice in real time by capturing physical gestures via an array of sensors - pressure, distance, tilt - in addition to ribbon controllers and an X-Y joystick microphone mount. Performance data are then sent to a computer, running audio-processing software, which is used to transform the audio signal from the microphone in real time. The work seeks to explore the liminal space between the electro-acoustic music tradition and more recent developments in the electronic dance music tradition. It does so on both a performative (gestural) and compositional (sonic) level. Visually, the performance consists of a singer and a laptop performer, hybridising the gestural context of these traditions. On a sonic level, the work explores hybridity at deeper levels of the musical structure than simple bricolage or collage approaches. Hybridity is explored at the level of the sonic gesture (source material), in production (audio processing gestures), in performance gesture, and in approaches to the use of the frequency spectrum, pulse and meter. The work was designed to be performed in a range of contexts from concert halls, to clubs, to rock festivals, across a range of staging and production platforms. As a consequence, the work has been tested in a range of audience contexts, and has allowed the transportation of compositional and performance practices across traditional audience demographic boundaries.


Towards improved speech recognition for resource poor languages

Relevance: 10.00%

Abstract:

In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) technology have made it viable for use in a number of commercial products. Unfortunately, these types of applications are limited to only a few of the world’s languages, primarily because ASR development is reliant on the availability of large amounts of language-specific resources. This motivates the need for techniques which reduce this language-specific resource dependency. Ideally, these approaches should generalise across languages, thereby providing scope for rapid creation of ASR capabilities for resource poor languages. Cross Lingual ASR emerges as a means for addressing this need. Underpinning this approach is the observation that sound production is largely influenced by the physiological construction of the vocal tract, and accordingly, is human, and not language specific. As a result, a common inventory of sounds exists across languages; a property which is exploitable, as sounds from a resource poor, target language can be recognised using models trained on resource rich, source languages. One of the initial impediments to the commercial uptake of ASR technology was its fragility in more challenging environments, such as conversational telephone speech. Subsequent improvements in these environments have gained consumer confidence. Pragmatically, if cross lingual techniques are to be considered a viable alternative when resources are limited, they need to perform under the same types of conditions. Accordingly, this thesis evaluates cross lingual techniques using two speech environments: clean read speech and conversational telephone speech. Languages used in evaluations are German, Mandarin, Japanese and Spanish. Results highlight that previously proposed approaches provide respectable results for simpler environments such as read speech, but degrade significantly in the more taxing conversational environment.
Two separate approaches for addressing this degradation are proposed. The first is based on deriving a better target language lexical representation, in terms of the source language model set. The second, and ultimately more successful approach, focuses on improving the classification accuracy of context-dependent (CD) models, by catering for the adverse influence of language-specific phonotactic properties. Whilst the primary research goal in this thesis is directed towards improving cross lingual techniques, the catalyst for investigating its use was based on expressed interest from several organisations for an Indonesian ASR capability. In Indonesia alone, there are over 200 million speakers of some Malay variant, which provides further impetus and commercial justification for speech related research on this language. Unfortunately, at the beginning of the candidature, limited research had been conducted on the Indonesian language in the field of speech science, and virtually no resources existed. This thesis details the investigative and development work dedicated towards obtaining an ASR system with a 10000 word recognition vocabulary for the Indonesian language.


Coding speech signals at very low rates (below 1 kb/s) with high intelligibility

Relevance: 10.00%

Abstract:

This thesis presents an original approach to parametric speech coding at rates below 1 kbits/sec, primarily for speech storage applications. Essential processes considered in this research encompass efficient characterization of evolutionary configuration of vocal tract to follow phonemic features with high fidelity, representation of speech excitation using minimal parameters with minor degradation in naturalness of synthesized speech, and finally, quantization of resulting parameters at the nominated rates. For encoding speech spectral features, a new method relying on Temporal Decomposition (TD) is developed which efficiently compresses spectral information through interpolation between most steady points over time trajectories of spectral parameters using a new basis function. The compression ratio provided by the method is independent of the updating rate of the feature vectors, hence allows high resolution in tracking significant temporal variations of speech formants with no effect on the spectral data rate. Accordingly, regardless of the quantization technique employed, the method yields a high compression ratio without sacrificing speech intelligibility. Several new techniques for improving performance of the interpolation of spectral parameters through phonetically-based analysis are proposed and implemented in this research, comprising event approximated TD, near-optimal shaping event approximating functions, efficient speech parametrization for TD on the basis of an extensive investigation originally reported in this thesis, and a hierarchical error minimization algorithm for decomposition of feature parameters which significantly reduces the complexity of the interpolation process.
Speech excitation in this work is characterized based on a novel Multi-Band Excitation paradigm which accurately determines the harmonic structure in the LPC (linear predictive coding) residual spectra, within individual bands, using the concept of Instantaneous Frequency (IF) estimation in frequency domain. The model yields an effective two-band approximation to excitation and computes pitch and voicing with high accuracy as well. New methods for interpolative coding of pitch and gain contours are also developed in this thesis. For pitch, relying on the correlation between phonetic evolution and pitch variations during voiced speech segments, TD is employed to interpolate the pitch contour between critical points introduced by event centroids. This compresses pitch contour in the ratio of about 1/10 with negligible error. To approximate gain contour, a set of uniformly-distributed Gaussian event-like functions is used which reduces the amount of gain information to about 1/6 with acceptable accuracy. The thesis also addresses a new quantization method applied to spectral features on the basis of statistical properties and spectral sensitivity of spectral parameters extracted from TD-based analysis. The experimental results show that good quality speech, comparable to that of conventional coders at rates over 2 kbits/sec, can be achieved at rates 650-990 bits/sec.
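
The general idea of storing a contour only at a few points and interpolating between them can be illustrated with a simplified sketch. The code below uses plain linear interpolation between evenly spaced anchor frames as a stand-in; it is not the Temporal Decomposition method of the thesis, and the function names and the anchor spacing are assumptions chosen purely for illustration.

```python
def compress_contour(contour, step):
    """Keep every `step`-th frame (plus the last one) as (index, value) anchors.

    Assumes `contour` is a non-empty list of numbers and `step` >= 1.
    Evenly spaced anchors are an illustrative simplification only.
    """
    anchors = [(i, contour[i]) for i in range(0, len(contour), step)]
    last = len(contour) - 1
    if anchors[-1][0] != last:
        anchors.append((last, contour[last]))
    return anchors


def reconstruct_contour(anchors):
    """Rebuild a full-length contour by linear interpolation between anchors."""
    out = []
    for (i0, v0), (i1, v1) in zip(anchors, anchors[1:]):
        for i in range(i0, i1):
            t = (i - i0) / (i1 - i0)  # fractional position between anchors
            out.append(v0 + t * (v1 - v0))
    out.append(anchors[-1][1])  # the final anchor closes the contour
    return out
```

With a step of 10, a 21-frame contour is reduced to 3 stored anchors; how much error the reconstruction introduces depends on how smoothly the contour varies between anchors.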


Automatic spoken language identification utilizing acoustic and phonetic speech information

Relevance: 10.00%

Abstract:

Automatic spoken Language Identification (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identification systems provide. A prominent application arises in call centers dealing with speakers speaking different languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable features for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Phonetic speech information is extracted using existing speech recognition technology. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by different speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
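
A linear fusion of per-language scores from an acoustic subsystem and a phonetic subsystem might look like the sketch below. The abstract does not say how the fusion weights were chosen or what form the scores take, so the single `weight` parameter, the dictionary layout, and the function names are all assumptions made for illustration.

```python
def fuse_scores(acoustic, phonetic, weight=0.5):
    """Linearly combine per-language scores from two subsystems.

    `acoustic` and `phonetic` map language names to scores (higher is
    better). `weight` is a hypothetical mixing factor in [0, 1] applied
    to the acoustic score; the phonetic score receives 1 - weight.
    """
    if acoustic.keys() != phonetic.keys():
        raise ValueError("both subsystems must score the same languages")
    return {
        lang: weight * acoustic[lang] + (1.0 - weight) * phonetic[lang]
        for lang in acoustic
    }


def identify_language(acoustic, phonetic, weight=0.5):
    """Return the language with the highest fused score."""
    fused = fuse_scores(acoustic, phonetic, weight)
    return max(fused, key=fused.get)
```

In practice the weight would be tuned on held-out data; shifting it towards one subsystem makes the decision follow that subsystem's ranking more closely.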


Theatre audience contribution : facilitating a new text through the post-performance discussion

Relevance: 10.00%

Abstract:

Theatre Audience Contribution introduces a new approach to theatre audience research: audience contribution through the post-performance discussion. This volume considers the physical and vocal behaviour of audience members as an integral part of the theatrical event that changes, adds to and informs the theatrical experience. Post-performance discussions, although rising in popularity, are yet an under-explored and under-utilised avenue for audience contribution. Beginning with an overview of reception theory and the historical role of theatre audiences, the author introduces a new method for the facilitation of post-performance discussions that encourages audience contribution and privileges the audience voice. Two case studies explore post-performance discussions that inform the theatrical event and discover a new role for the contemporary audience: audience critic. This accessible volume has significant implications for theatre theorists, practitioners and audiences alike.


IDOL

Relevance: 10.00%

Abstract:

Idol is a collaborative performance work for vocal performer and dancers. The work explores movement and sound relative to a vocal interface called the eMic (Extended Microphone Interface Controller). The eMic is a gestural controller designed by the composer for live vocal performance and real-time processing. The process for generating the work involves the choreographer being provided an opportunity to experiment with gestures and movement relative to the eMic interface. The choreographer explored the interface as an object, a prop, an instrument and as an extension of the body. The movement was then videoed and the data coming from the sensors simultaneously recorded. The data and the video were then used as part of the compositional process, allowing the composer to see what the performance looks like and to experiment with mapping strategies using the captured sensor data. This approach represents a new compositional direction for working with the eMic, in that previously the compositional process commenced at the computer, building processing patches and assigning parameters to eMic sensors. In order to play the composition, the body needed to adapt to 'playing' the instrument. This approach treats the eMic like a traditional instrument that requires the human body to develop a command over the instrument. Working with the movement as a starting point inverts the process, using choreographic gestures as the basis for musical structures.

