215 resultados para audio processing

em Queensland University of Technology - ePrints Archive


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Amphibian is an 10’00’’ musical work which explores new musical interfaces and approaches to hybridising performance practices from the popular music, electronic dance music and computer music traditions. The work is designed to be presented in a range of contexts associated with the electro-acoustic, popular and classical music traditions. The work is for two performers using two synchronised laptops, an electric guitar and a custom designed gestural interface for vocal performers - the e-Mic (Extended Mic-stand Interface Controller). This interface was developed by one of the co-authors, Donna Hewitt. The e-Mic allows a vocal performer to manipulate the voice in real time through the capture of physical gestures via an array of sensors - pressure, distance, tilt - along with ribbon controllers and an X-Y joystick microphone mount. Performance data are then sent to a computer, running audio-processing software, which is used to transform the audio signal from the microphone. In this work, data is also exchanged between performers via a local wireless network, allowing performers to work with shared data streams. The duo employs the gestural conventions of guitarist and singer (i.e. 'a band' in a popular music context), but transform these sounds and gestures into new digital music. The gestural language of popular music is deliberately subverted and taken into a new context. The piece thus explores the nexus between the sonic and performative practices of electro acoustic music and intelligent electronic dance music (‘idm’). This work was situated in the research fields of new musical interfacing, interaction design, experimental music composition and performance. The contexts in which the research was conducted were live musical performance and studio music production. The work investigated new methods for musical interfacing, performance data mapping, hybrid performance and compositional practices in electronic music. The research methodology was practice-led. New insights were gained from the iterative experimental workshopping of gestural inputs, musical data mapping, inter-performer data exchange, software patch design, data and audio processing chains. In respect of interfacing, there were innovations in the design and implementation of a novel sensor-based gestural interface for singers, the e-Mic, one of the only existing gestural controllers for singers. This work explored the compositional potential of sharing real time performance data between performers and deployed novel methods for inter-performer data exchange and mapping. As regards stylistic and performance innovation, the work explored and demonstrated an approach to the hybridisation of the gestural and sonic language of popular music with recent ‘post-digital’ approaches to laptop based experimental music The development of the work was supported by an Australia Council Grant. Research findings have been disseminated via a range of international conference publications, recordings, radio interviews (ABC Classic FM), broadcasts, and performances at international events and festivals. The work was curated into the major Australian international festival, Liquid Architecture, and was selected by an international music jury (through blind peer review) for presentation at the International Computer Music Conference in Belfast, N. Ireland.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Sleeper is an 18'00" musical work for live performer and laptop computer which exists as both a live performance work and a recorded work for audio CD. The work has been presented at a range of international performance events and survey exhibitions. These include the 2003 International Computer Music Conference (Singapore) where it was selected for CD publication, Variable Resistance (San Francisco Museum of Modern Art, USA), and i.audio, a survey of experimental sound at the Performance Space, Sydney. The source sound materials are drawn from field recordings made in acoustically resonant spaces in the Australian urban environment, amplified and acoustic instruments, radio signals, and sound synthesis procedures. The processing techniques blur the boundaries between, and exploit, the perceptual ambiguities of de-contextualised and processed sound. The work thus challenges the arbitrary distinctions between sound, noise and music and attempts to reveal the inherent musicality in so-called non-musical materials via digitally re-processed location audio. Thematically the work investigates Paul Virilio’s theory that technology ‘collapses space’ via the relationship of technology to speed. Technically this is explored through the design of a music composition process that draws upon spatially and temporally dispersed sound materials treated using digital audio processing technologies. One of the contributions to knowledge in this work is a demonstration of how disparate materials may be employed within a compositional process to produce music through the establishment of musically meaningful morphological, spectral and pitch relationships. This is achieved through the design of novel digital audio processing networks and a software performance interface. The work explores, tests and extends the music perception theories of ‘reduced listening’ (Schaeffer, 1967) and ‘surrogacy’ (Smalley, 1997), by demonstrating how, through specific audio processing techniques, sounds may shifted away from ‘causal’ listening contexts towards abstract aesthetic listening contexts. In doing so, it demonstrates how various time and frequency domain processing techniques may be used to achieve this shift.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Nodule is 19'54" musical work for two electronic music performers, two laptop computers and a custom built, sensor-based microphone controller - the e-Mic (Extended Mic-stand Interface Controller). This interface was developed by one of the co-authors, Donna Hewitt. The e-Mic allows a vocal performer to manipulate their voice in real time by capturing physical gestures via an array of sensors - pressure, distance, tilt – in addition to ribbon controllers and an X-Y joystick microphone mount. Performance data are then sent to a computer, running audio-processing software, which is used to transform the audio signal from the microphone in real time. The work seeks to explore the liminal space between the electro-acoustic music tradition and more recent developments in the electronic dance music tradition. It does so on both a performative (gestural) and compositional (sonic) level. Visually, the performance consists of a singer and a laptop performer, hybridising the gestural context of these traditions. On a sonic level, the work explores hybridity at deeper levels of the musical structure than simple bricolage or collage approaches. Hybridity is explored at the level of the sonic gesture (source material), in production (audio processing gestures), in performance gesture, and in approaches to the use of the frequency spectrum, pulse and meter. The work was designed to be performed in a range of contexts from concert halls, to clubs, to rock festivals, across a range of staging and production platforms. As a consequence, the work has been tested in a range of audience contexts, and has allowed the transportation of compositional and performance practices across traditional audience demographic boundaries.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This dissertation seeks to define and classify potential forms of Nonlinear structure and explore the possibilities they afford for the creation of new musical works. It provides the first comprehensive framework for the discussion of Nonlinear structure in musical works and provides a detailed overview of the rise of nonlinearity in music during the 20th century. Nonlinear events are shown to emerge through significant parametrical discontinuity at the boundaries between regions of relatively strong internal cohesion. The dissertation situates Nonlinear structures in relation to linear structures and unstructured sonic phenomena and provides a means of evaluating Nonlinearity in a musical structure through the consideration of the degree to which the structure is integrated, contingent, compressible and determinate as a whole. It is proposed that Nonlinearity can be classified as a three dimensional space described by three continuums: the temporal continuum, encompassing sequential and multilinear forms of organization, the narrative continuum encompassing processual, game structure and developmental narrative forms and the referential continuum encompassing stylistic allusion, adaptation and quotation. The use of spectrograms of recorded musical works is proposed as a means of evaluating Nonlinearity in a musical work through the visual representation of parametrical divergence in pitch, duration, timbre and dynamic over time. Spectral and structural analysis of repertoire works is undertaken as part of an exploration of musical nonlinearity and the compositional and performative features that characterize it. The contribution of cultural, ideological, scientific and technological shifts to the emergence of Nonlinearity in music is discussed and a range of compositional factors that contributed to the emergence of musical Nonlinearity is examined. The evolution of notational innovations from the mobile score to the screen score is plotted and a novel framework for the discussion of these forms of musical transmission is proposed. A computer coordinated performative model is discussed, in which a computer synchronises screening of notational information, provides temporal coordination of the performers through click-tracks or similar methods and synchronises the audio processing and synthesized elements of the work. It is proposed that such a model constitutes a highly effective means of realizing complex Nonlinear structures. A creative folio comprising 29 original works that explore nonlinearity is presented, discussed and categorised utilising the proposed classifications. Spectrograms of these works are employed where appropriate to illustrate the instantiation of parametrically divergent substructures and examples of structural openness through multiple versioning.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Raven and Song Scope are two automated sound anal-ysis tools based on machine learning technique for en-vironmental monitoring. Many research works have been conducted upon them, however, no or rare explo-ration mentions about the performance and comparison between them. This paper investigates the comparisons from six aspects: theory, software interface, ease of use, detection targets, detection accuracy, and potential application. Through deep exploration one critical gap is identified that there is a lack of approach to detect both syllables and call structures, since Raven only aims to detect syllables while Song Scope targets call structures. Therefore, a Timed Probabilistic Automata (TPA) system is proposed which separates syllables first and clusters them into complex structures after.

Relevância:

40.00% 40.00%

Publicador: