352 results for Audio indexing
Abstract:
Jonzi D, one of the leading Hip Hop voices in the UK, creates contemporary theatrical works that merge dance, street art, original scored music and contemporary rap poetry, to create theatrical events that expand a thriving sense of a Hip Hop nation with citizens in the UK, throughout southern Africa and the rest of the world. In recent years Hip Hop has evolved as a performance genre in and of itself that not only borrows from other forms but vitally now contributes back to the body of contemporary practice in the performing arts. As part of this work Jonzi’s company Jonzi D Productions is committed to creating and touring original Hip Hop theatre that promotes the continuing development and awareness of a nation with its own language, culture and currency that exists without borders. Through the deployment of a universal voice from the local streets of Johannesburg and the East End of London, Jonzi D creates a form of highly energized performance that elevates Hip Hop as a great democratiser between the highly developed global and under-resourced local in the world. It is the staging of this democratised and technologised future (and present), that poses the greatest challenge for the scenographer working with Jonzi and his company, and the associated deprogramming and translation of the artist's particular filmic vision to the stage, that this discussion will explore. This paper interrogates not only how a scenographic strategy can support the existence of this work but also how the scenographer as outsider can enter and influence this nation.
Abstract:
This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This submission consisted of a score-level fusion of three component systems: a joint-factor GMM system and two SVM systems using GLDS and GMM supervector kernels. Development and evaluation results are presented, demonstrating the effectiveness of this fused system approach.
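As a rough illustration of what score-level fusion means, the sketch below normalises the scores of each component system and combines them with a weighted sum. The abstract does not say how the QUT submission normalised or weighted its scores, so the z-normalisation, the equal weights and all function names here are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def z_norm(scores):
    """Scale one system's scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # A system that gives every trial the same score carries no information.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(system_scores, weights):
    """Return one fused score per trial as a weighted sum of normalised scores.

    system_scores: list of score lists, one list per component system,
                   all covering the same trials in the same order.
    weights:       one weight per component system (hypothetical values).
    """
    if len(system_scores) != len(weights):
        raise ValueError("need exactly one weight per system")
    normalised = [z_norm(scores) for scores in system_scores]
    n_trials = len(normalised[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, normalised))
        for i in range(n_trials)
    ]


# Made-up scores for three trials from three hypothetical component systems.
jfa_scores = [2.0, -1.0, 0.5]
glds_svm_scores = [0.8, 0.1, 0.4]
gmm_sv_svm_scores = [1.5, -0.5, 0.2]

fused = fuse_scores(
    [jfa_scores, glds_svm_scores, gmm_sv_svm_scores],
    weights=[1 / 3, 1 / 3, 1 / 3],
)
```

In practice the fusion weights would normally be trained on development data rather than fixed by hand; equal weights are used here only to keep the sketch short.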
Abstract:
The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more dispersed representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mismatched to the evaluation conditions.
Abstract:
Defibrillator is a 16’41” musical work for solo performer, laptop computer and electric guitar. The electric guitar is processed in real time by a digital signal processing network in software, with gestural control provided by a foot-operated pedal board. The work is informed by a range of ideas from the genres of electroacoustic music, western art music, popular music and cinematic sound. It seeks to fluidly cross and hybridise musical practices from these diverse sonic traditions and to develop a compositional language that draws upon multiple genres, but at the same time resists the ability to be located within a singular genre. Musical structures and sonic markers which form genre are ruptured at strategic levels of the musical structure in order to allow for a cross flow of concepts between genres. The process of rupture is facilitated by the practical implementation of music and sound reception theories into the compositional process. The piece exhibits the by-products of a composer born into a media-saturated environment, drawing on a range of musical and sonic traditions, actively seeking to explore the liminal space in between these traditions. The project stems from the author's research interests in locating points of connection between traditions of experimentation in diverse musical and sonic traditions arising from the broad uptake of media technologies in the early 20th century.
Abstract:
This approach to sustainable design explores the possibility of creating an architectural design process which can iteratively produce optimised and sustainable design solutions. Driven by an evolution process based on genetic algorithms, the system allows the designer to “design the building design generator” rather than to “design the building”. The design concept is abstracted into a digital design schema, which allows transfer of the human creative vision into the rational language of a computer. The schema is then elaborated into the use of genetic algorithms to evolve innovative, performative and sustainable design solutions. The prioritisation of the project’s constraints and the subsequent design solutions synthesised during design generation are expected to resolve most of the major conflicts in the evaluation and optimisation phases. Mosques are used as the example building typology to ground the research activity. The spatial organisations of various mosque typologies are graphically represented by adjacency constraints between spaces. Each configuration is represented by a planar graph which is then translated into a non-orthogonal dual graph and fed into the genetic algorithm system with fixed constraints and expected performance criteria set to govern evolution. The resultant Hierarchical Evolutionary Algorithmic Design System is developed by linking the evaluation process with environmental assessment tools to rank the candidate designs. The proposed system generates the concept, the seed, and the schema, and has environmental performance as one of the main criteria in driving optimisation.
Abstract:
Working with 12 journalism students plus a research assistant, producer/director Romano conducted five community focus groups and discussions with 80 people on the street. These provided the themes and concepts and the creative approaches for each program. Each was structured around one of the emergent themes; all programs offered different voices rather than coming to a single conclusion. New Horizons, New Homes aired over three weeks on Radio 4EB and was entered into the 2005 UN Media Peace Award, where it won the Best Radio Category ahead of ABC and SBS. The UN commended the way in which the programs brought together a wide base of research to create a better understanding in the community on this issue. This project did not just improve the accuracy and social inclusiveness of reporting. It applied principles of deliberative democracy in the creation of journalism that enhances citizens’ deliberative potential on complex social issues.
Abstract:
The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), and analyse each stage of the cascade and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.
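The first two stages of the cascade named above (2D-DCT followed by feature mean normalisation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the block size, the number of retained coefficients and all function names are assumptions, and the interstep LDA stage is omitted because it requires labelled training data.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed [vertical][horizontal] frequency.
    return [list(r) for r in zip(*cols)]


def frame_features(block, keep=3):
    """Keep the keep x keep lowest-frequency coefficients (assumed selection rule)."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def mean_normalise(frames):
    """Subtract the per-dimension mean computed over all frames of an utterance."""
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


# Two made-up 4x4 "mouth region" frames with different brightness.
frame_a = [[1.0] * 4 for _ in range(4)]
frame_b = [[3.0] * 4 for _ in range(4)]
features = mean_normalise([frame_features(frame_a), frame_features(frame_b)])
```

Mean normalisation removes the static component of each coefficient, so what remains reflects how the mouth region changes from frame to frame, which is the kind of dynamic information the abstract reports as useful for visual VAD.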
Abstract:
Managing livestock movement in extensive systems has environmental and production benefits. Currently permanent wire fencing is used to control cattle; this is both expensive and inflexible. Cattle are known to respond to auditory and visual cues and we investigated whether these can be used to manipulate their behaviour. Twenty-five Belmont Red steers with a mean live weight of 270kg were each randomly assigned to one of five treatments. Treatments consisted of a combination of cues (audio, tactile and visual stimuli) and consequence (electrical stimulation). The treatments were electrical stimulation alone, audio plus electrical stimulation, vibration plus electrical stimulation, light plus electrical stimulation and an electrified fence (6kV) plus electrical stimulation. Cue stimuli were administered for 3s followed immediately by electrical stimulation (consequence) of 1kV for 1s. The experiment tested the operational efficacy of an on-animal control or virtual fencing system. A collar-halter device was designed to carry the electronics, batteries and equipment of a prototype virtual fencing device that provided the stimuli (audio, vibration, light and electrical). Cattle were allowed to travel along a 40m alley to a group of peers and feed while their rate of travel and response to the stimuli were recorded. The prototype virtual fencing system was successful in modifying the behaviour of the cattle. The rate of travel of cattle along the alley demonstrated the large variability in behavioural response associated with tactile, visual and audible cues. The experiment demonstrated that virtual fencing has potential for controlling cattle in extensive grazing systems. However, larger numbers of cattle need to be tested to derive a better understanding of the behavioural variance. Further controlled experimental work is also necessary to quantify the interaction between cues, consequences and cattle learning.
Abstract:
Agriculture accounts for a significant portion of the GDP in most developed countries. However, managing farms, particularly large-scale extensive farming systems, is hindered by a lack of data and an increasing shortage of labour. We have deployed a large heterogeneous sensor network on a working farm to explore sensor network applications that can address some of the issues identified above. Our network is solar powered and has been running for over 6 months. The current deployment consists of over 40 moisture sensors that provide soil moisture profiles at varying depths, weight sensors to compute the amount of food and water consumed by animals, electronic tag readers, up to 40 sensors that can be used to track animal movement (consisting of GPS, compass and accelerometers), and 20 sensor/actuators that can be used to apply different stimuli (audio, vibration and mild electric shock) to the animal. The static part of the network is designed for 24/7 operation and is linked to the Internet via a dedicated high-gain radio link, also solar powered. The initial goals of the deployment are to provide a testbed for sensor network research in programmability and data handling while also being a vital tool for scientists to study animal behavior. Our longer term aim is to create a management system that completely transforms the way farms are managed.
Abstract:
This paper proposes a security architecture for the basic cross indexing systems emerging as foundational structures in current health information systems. In these systems unique identifiers are issued to healthcare providers and consumers. In most cases, such numbering schemes are national in scope and must therefore necessarily be used via an indexing system to identify records contained in pre-existing local, regional or national health information systems. Most large scale electronic health record systems envisage that such correlation between national healthcare identifiers and pre-existing identifiers will be performed by some centrally administered cross referencing, or index system. This paper is concerned with the security architecture for such indexing servers and the manner in which they interface with pre-existing health systems (including both workstations and servers). The paper proposes two required structures to achieve the goal of a national scale, and secure exchange of electronic health information, including: (a) the employment of high trust computer systems to perform an indexing function, and (b) the development and deployment of an appropriate high trust interface module, a Healthcare Interface Processor (HIP), to be integrated into the connected workstations or servers of healthcare service providers. This proposed architecture is specifically oriented toward requirements identified in the Connectivity Architecture for Australia’s e-health scheme as outlined by NEHTA and the national e-health strategy released by the Australian Health Ministers.
Abstract:
Creating sustainable urban environments is one of the challenging issues that need a clear vision and implementation strategies involving changes in governmental values and decision-making processes for local governments. Particularly, internalisation of environmental externalities of daily urban activities (e.g. manufacturing, transportation and so on) has immense importance for which local policies are formulated to provide better living conditions for the people inhabiting urban areas. Even if environmental problems are defined succinctly by various stakeholders, the complicated nature of sustainability issues demands a structured evaluation strategy and well-defined sustainability parameters for efficient and effective policy making. Following this reasoning, this study involves assessment of the sustainability performance of urban settings, mainly focusing on environmental problems caused by rapid urban expansion and transformation. By taking into account land-use and transportation interaction, it tries to reveal how future urban developments would alter the daily urban travel behaviour of people and affect the urban and natural environments. The paper introduces a grid-based indexing method developed for this research and trialled as a GIS-based decision support tool to analyse and model selected spatial and aspatial indicators of sustainability in the Gold Coast. This process reveals parameters of the site-specific relationship among selected indicators that are used to evaluate index-based performance characteristics of the area. The evaluation is made through an embedded decision support module by assigning relative weights to indicators. The resolution of the selected grid-based unit of analysis provides insights about the service level of projected urban development proposals at a disaggregate level, such as accessibility to transportation and urban services, and pollution.
The paper concludes by discussing the findings including the capacity of the decision support system to assist decision-makers in determining problematic areas and developing intervention policies for sustainable outcomes of future developments.
Abstract:
Microphone arrays have been used in various applications to capture conversations, such as in meetings and teleconferences. In many cases, the microphone and likely source locations are known a priori, and calculating beamforming filters is therefore straightforward. In ad-hoc situations, however, when the microphones have not been systematically positioned, this information is not available and beamforming must be achieved blindly. In achieving this, a commonly neglected issue is whether it is optimal to use all of the available microphones, or only an advantageous subset of these. This paper commences by reviewing different approaches to blind beamforming, characterising them by the way they estimate the signal propagation vector and the spatial coherence of noise in the absence of prior knowledge of microphone and speaker locations. Following this, a novel clustered approach to blind beamforming is motivated and developed. Without using any prior geometrical information, microphones are first grouped into localised clusters, which are then ranked according to their relative distance from a speaker. Beamforming is then performed using either the closest microphone cluster, or a weighted combination of clusters. The clustered algorithms are compared to the full set of microphones in experiments on a database recorded on different ad-hoc array geometries. These experiments evaluate the methods in terms of signal enhancement as well as performance on a large vocabulary speech recognition task.
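The cluster-then-rank idea described above can be sketched in a few lines. The abstract does not state which measurements drive the clustering or the ranking, so this sketch assumes a pairwise coherence matrix for grouping and a per-microphone arrival delay as a proxy for distance to the speaker. Both inputs, the threshold and the function names are hypothetical.

```python
def cluster_microphones(coherence, threshold):
    """Group microphones into clusters by thresholding pairwise coherence.

    Two microphones share a cluster when they are linked by a chain of
    pairs whose coherence is at least `threshold` (connected components).
    """
    n = len(coherence)
    assigned = [False] * n
    clusters = []
    for start in range(n):
        if assigned[start]:
            continue
        assigned[start] = True
        stack = [start]
        members = []
        while stack:
            m = stack.pop()
            members.append(m)
            for other in range(n):
                if not assigned[other] and coherence[m][other] >= threshold:
                    assigned[other] = True
                    stack.append(other)
        clusters.append(sorted(members))
    return clusters


def rank_clusters(clusters, delays):
    """Order clusters by mean arrival delay, smallest (assumed closest) first."""
    return sorted(clusters, key=lambda c: sum(delays[m] for m in c) / len(c))


# Made-up values for four microphones: 0 and 1 sit together, as do 2 and 3.
coherence = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
delays_ms = [3.0, 3.2, 0.5, 0.7]

clusters = cluster_microphones(coherence, threshold=0.5)
ranked = rank_clusters(clusters, delays_ms)
closest_cluster = ranked[0]
```

Beamforming would then run on `closest_cluster` alone or on a weighted combination of the ranked clusters, as the abstract describes; the beamformer itself is outside the scope of this sketch.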
Abstract:
Digital rights management allows information owners to control the use and dissemination of electronic documents via a machine-readable licence. Documents are distributed in a protected form such that they may only be used with trusted environments, and only in accordance with terms and conditions stated in the licence. Digital rights management has found uses in protecting copyrighted audio-visual productions, private personal information, and companies' trade secrets and intellectual property. This chapter describes a general model of digital rights management together with the technologies used to implement each component of a digital rights management system, and describes how digital rights management can be applied to secure the distribution of electronic information in a variety of contexts.
Abstract:
Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These image hashes can be used for watermarking, image integrity authentication or image indexing for fast retrieval. This paper introduces a new method of generating image hashes based on extracting Higher Order Spectral features from the Radon projection of an input image. The feature extraction process is non-invertible and non-linear, and different hashes can be produced from the same image through the use of random permutations of the input. We show that the transform is robust to typical image transformations such as JPEG compression, noise, scaling, rotation, smoothing and cropping. We evaluate our system using a verification-style framework based on calculating false match and false non-match likelihoods using the publicly available Uncompressed Colour Image database (UCID) of 1320 images. We also compare our results to Swaminathan’s Fourier-Mellin based hashing method, achieving at least a 1% EER improvement under noise, scaling and sharpening.
Abstract:
While close talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by a microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays in contrast to close talking microphones alleviates the feeling of discomfort and distraction to the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployment of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied for ad-hoc microphone arrays.
A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. An alternative approach is instead to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach is proposed, which clusters the microphones and selects clusters based on their proximity to the speaker. Novel experiments are conducted to demonstrate that the proposed method to automatically select clusters of microphones (i.e., a subarray), closely located both to each other and to the desired speech source, may in fact provide more robust speech enhancement and recognition than the full array could.