365 results for Sound production
in Queensland University of Technology - ePrints Archive
Abstract:
In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) technology have made it viable for use in a number of commercial products. Unfortunately, these types of applications are limited to only a few of the world’s languages, primarily because ASR development is reliant on the availability of large amounts of language-specific resources. This motivates the need for techniques which reduce this language-specific resource dependency. Ideally, these approaches should generalise across languages, thereby providing scope for rapid creation of ASR capabilities for resource-poor languages. Cross Lingual ASR emerges as a means for addressing this need. Underpinning this approach is the observation that sound production is largely influenced by the physiological construction of the vocal tract and, accordingly, is human-specific rather than language-specific. As a result, a common inventory of sounds exists across languages; a property which is exploitable, as sounds from a resource-poor target language can be recognised using models trained on resource-rich source languages. One of the initial impediments to the commercial uptake of ASR technology was its fragility in more challenging environments, such as conversational telephone speech. Subsequent improvements in these environments have gained consumer confidence. Pragmatically, if cross lingual techniques are to be considered a viable alternative when resources are limited, they need to perform under the same types of conditions. Accordingly, this thesis evaluates cross lingual techniques using two speech environments: clean read speech and conversational telephone speech. Languages used in evaluations are German, Mandarin, Japanese and Spanish. Results highlight that previously proposed approaches provide respectable results for simpler environments such as read speech, but degrade significantly in the more taxing conversational environment.
Two separate approaches for addressing this degradation are proposed. The first is based on deriving a better target language lexical representation in terms of the source language model set. The second, and ultimately more successful, approach focuses on improving the classification accuracy of context-dependent (CD) models by catering for the adverse influence of language-specific phonotactic properties. Whilst the primary research goal in this thesis is directed towards improving cross lingual techniques, the catalyst for investigating their use was expressed interest from several organisations in an Indonesian ASR capability. In Indonesia alone, there are over 200 million speakers of some Malay variant, which provides further impetus and commercial justification for speech-related research on this language. Unfortunately, at the beginning of the candidature, limited research had been conducted on the Indonesian language in the field of speech science, and virtually no resources existed. This thesis details the investigative and development work dedicated towards obtaining an ASR system with a 10,000-word recognition vocabulary for the Indonesian language.
Abstract:
Research background: Ananyi (Going) is an intercultural music project with lyrics sung in Luritja and English, undertaken in collaboration with the Tjupi Band and producer Jeffrey McLaughlin. The project contributes to cultural maintenance for Australian First Nations peoples, and is informed by prior work in this area by scholars including Peter Dunbar-Hall, Chris Gibson and Karl Neuenfeldt. These existing studies have discussed the complexities of intercultural collaboration, and the types of cultural politics that are involved when Indigenous and non-Indigenous musicians and scholars work together on projects of cultural significance. Critical race theory has also informed the creative work, as a means of interpreting the implicit and explicit discourses of race that arise through intercultural creative practice. The project asked the research question: how can collaborative music making contribute to intercultural understanding and the maintenance of Australian First Nations languages and cultures? Research contribution: The project has identified that recorded popular music is important in the maintenance of Luritja language and culture, and that intercultural collaboration in the areas of digital sound production and distribution can assist with cultural maintenance in both local and national contexts. Research significance: The compact disc was released on the CAAMA Music label, and supported through competitive grants from the Australian Government’s Contemporary Music Touring Grant and the Arnhem Land Progress Association (ALPA). The research context of the work is detailed in Brydie-Leigh Bartleet and Gavin Carfoot 2013. "Desert harmony: Stories of collaboration between Indigenous musicians and university students." International Education Journal: Comparative Perspectives 12 (1): 180-196.
Abstract:
This article compares YouTube and the National Film and Sound Archive (NFSA) as resources for television historians interested in viewing old Australian television programs. The author searched for seventeen important television programs, identified in a previous research project, to compare what was available in the two archives and how easy it was to find. The analysis focused on differences in curatorial practices of accessioning and cataloguing. NFSA is stronger in current affairs and older programs, while YouTube is stronger in game shows and lifestyle programs. YouTube is stronger than the NFSA on “human interest” material—births, marriages, and deaths. YouTube accessioning more strongly accords with popular histories of Australian television. Both NFSA and YouTube offer complete episodes of programs, while YouTube also offers many short clips of “moments.” YouTube has more surprising pieces of rare ephemera. YouTube cataloguing is more reliable than that of the NFSA, with fewer broken links. The YouTube metadata can be searched more intuitively. The NFSA generally provides more useful reference information about production and broadcast dates.
Abstract:
The development and recording of 10 songs for a CD to accompany DeepBlue's new live orchestra production "Who Are You", which began touring Australia and Asia in 2012.
Abstract:
This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.