883 results for Voice synthesization


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Cherbourg State School is some 300 kilometres northwest of Brisbane. It is situated in an Aboriginal community at Cherbourg with approximately 250 students, all of whom are Indigenous Australian children. Cherbourg State School aims to generate good academic outcomes for its students from kindergarten to Year 7 and nurture a strong and positive sense of what it means to be Aboriginal in today's society. In a context where the community continues to grapple with many social issues born of the historical processes of dispossession and disempowerment, Cherbourg State School is determined that its children can and will learn to become 'Strong and Smart'. It is a journey that has been charted by Chris Sarra, the school's first Aboriginal principal, in his paper Young and Black and Deadly: Strategies for Improving Outcomes for Indigenous Students, which describes how pride and expectations were engendered in the school over a four-year period from 1998. In this article the author discusses the historical context of the school and its impact on the Indigenous people of Cherbourg. The aim is to consider the historical, political, social and cultural context around the creation of Cherbourg State School. The author critically examines the historical records of the role of the State Government and the white settlers in the setting up and creation of the Aboriginal Reserve and later the primary school. Throughout the author addresses an absence – a voice missing from history – the voice of the Aboriginal people. This exercise in collective memory was designed to provide an opportunity for those who have seldom been given the opportunity to tell their story. Instead of the official view of Cherbourg School it provides a narrative which restores the victims of history to a place of dignity and indeed humanity.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Acoustically, vehicles are extremely noisy environments, and as a consequence audio-only in-car voice recognition systems perform very poorly. Because the visual modality is immune to acoustic noise, using visual lip information from the driver is seen as a viable strategy for circumventing this problem. However, implementing such an approach requires a system that can accurately locate and track the driver’s face and facial features in real time. In this paper we present such an approach using the Viola-Jones algorithm. Our results show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose.
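The Viola-Jones detector named in this abstract is built on Haar-like rectangle features evaluated over an integral image (summed-area table), which makes each rectangle sum a constant-time lookup. The following is a minimal sketch of that building block only, not the authors' tracking system; the function names and the toy image are hypothetical and chosen for illustration.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current image row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: upper half minus lower half."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower


# Hypothetical toy image: a bright row above a dark row.
toy = [[9, 9], [1, 1]]
toy_ii = integral_image(toy)
feature_value = two_rect_feature(toy_ii, 0, 0, 2, 2)
```

In a full detector, many such features are evaluated at multiple positions and scales and combined by a boosted cascade of classifiers; that training stage is outside the scope of this sketch.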

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper argues that the increasing visibility of Indigenous families in the mainstream Australian media over the past ten years has produced new opportunities for addressing national injustices of the Stolen Generations. It shows how, as certain celebrities like Ernie Dingo, Nova Peris and Cathy Freeman have become popular household names, a concurrent public interest in their family backgrounds has grown. Descriptive accounts of relationships and shared histories – propelled by the expansion of the lifestyle television genre in this context – have enabled some stories of the ‘Stolen Generations’ to be seen as ‘ordinary’, and part of a broader sense of everyday Australian life, for the first time. With the aid of recent sexual citizenship research, the article illustrates that such middle-class representations give voice to new embodiments of citizenship in the post-apology era, making Indigenous justice more subjectively interconnected with life in the white Australian public sphere.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Soft biometrics are characteristics that can be used to describe, but not uniquely identify, an individual. These include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics (e.g. face, voice), which require cooperation from the subject, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. Whilst these traits cannot provide robust authentication, they can be used to provide coarse authentication or identification at long range, to locate a subject who has been previously seen or who matches a description, and to aid in object tracking. In this paper we propose three-part (head, torso, legs) height and colour soft biometric models, and demonstrate their verification performance on a subset of the PETS 2006 database. We show that these models, whilst not as accurate as traditional biometrics, can still achieve acceptable rates of accuracy in situations where traditional biometrics cannot be applied.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The (dis)orientation of thought in its encounter with art can be understood as the direct result of an encounter with indeterminacy as a lack in meaning. As an artist I am aware of how this indeterminacy impacts on the perceived value and authority of the artistic voice and in particular its value as a research voice. This paper explores this indeterminacy of meaning as a profound and disturbing unknowing characteristic of the sublime, and argues its value to advanced thought and for any methodological understanding of practice-led research. Lyotard described the sublime as an ‘understanding’ through which art and its associated practices may be able to resist an all too easy assimilation by the public as just a consumer commodity. His thought represents an attempt to both politically and philosophically understand art’s, and particularly abstract painting’s, affect as a state of profound and positive unknowing. To talk of the sublime in art is to speak of the suspension of any comfortable certainty in being and instead to engage with the real as a limit to meaning and knowing. It is to talk of the presentation of the unpresentable as a momentary but significant dissolution of representation. This understanding of the sublime is then further explored through the cultural phenomenon of the monochrome painting and applied to the work of the two contemporary artists, Franz Erhard Walter and Günter Umberg. Initially the monochrome was understood as an attempt to go beyond traditional representation and present the unpresentable. In the one hundred years or so since that initial move this understanding has broadened. The monochrome now presents itself as a genre or even project within visual art, but it still has much to teach us.
In the concretely abstract and performative artworks of Franz Erhard Walter and Günter Umberg, traces of this ambition remain and their work can be seen to pose questions probing our understandings and experiences of artistic meaning, its value and the real.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Prime Minister Kevin Rudd’s Apology to Australia’s Stolen Generations, delivered on 13 February 2008, is both personal and political to me, just as the people who talk about it make it political and personal through their actions. This paper represents my attempt to turn the gaze by articulating some of my thoughts on the Apology, policy statements (Close the Gap) and the inconsistencies within the leadership of the present governments. I have endeavoured to do this through exploring the articulations of others and by sharing examples and personal experiences. In bringing some analysis to the literature, examples and experiences, I reveal the relationships between oppression, white race privilege and institutional privilege and the epistemology that maintains them. In moving from the position of being silent on the Apology, and my political experiences, to speaking about them, I am able to move from the position of object to subject and to gain a form of liberated voice (hooks 1989:9). Furthermore, I am hopeful that it will encourage others to examine their own practices within political parties and governments and to challenge the domination that continues to subjugate Indigenous peoples. It is only through people enacting their responsibilities and making changes in their daily lives and through the institutions and organisations to which they belong (the personal and political) that the Apology can move beyond the symbolic to action.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel – wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMMs are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability – an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier.
Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system, providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several characteristics beneficial to the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset, whilst the refined dataset is only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.
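To make the generative side of this abstract concrete, the sketch below scores a test utterance against a speaker GMM and a background GMM using an average log-likelihood ratio. This is a common textbook formulation and is offered as an assumption; the abstract does not state the scoring rule, the feature type, or any model parameters, so every name and number here is hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log p(x | GMM), computed with the log-sum-exp trick for stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def llr_score(frames, speaker_gmm, background_gmm):
    """Average per-frame log-likelihood ratio: speaker model vs background model."""
    total = sum(
        gmm_log_likelihood(f, *speaker_gmm) - gmm_log_likelihood(f, *background_gmm)
        for f in frames
    )
    return total / len(frames)


# Hypothetical toy models: (weights, means, variances) over 2-D features.
speaker_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
background_gmm = ([1.0], [[5.0, 5.0]], [[1.0, 1.0]])

# A positive score favours the claimed speaker; a negative one favours the background.
score = llr_score([[0.0, 0.0], [1.0, 1.0]], speaker_gmm, background_gmm)
```

An SVM-based system would instead learn a decision boundary from positive (target speaker) and negative (impostor) examples, which is why the choice of impostor background dataset discussed in the abstract matters.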

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Jonzi D, one of the leading Hip Hop voices in the UK, creates contemporary theatrical works that merge dance, street art, original scored music and contemporary rap poetry, to create theatrical events that expand a thriving sense of a Hip Hop nation with citizens in the UK, throughout southern Africa and the rest of the world. In recent years Hip Hop has evolved as a performance genre in and of itself that not only borrows from other forms but vitally now contributes back to the body of contemporary practice in the performing arts. As part of this work Jonzi’s company Jonzi D Productions is committed to creating and touring original Hip Hop theatre that promotes the continuing development and awareness of a nation with its own language, culture and currency that exists without borders. Through the deployment of a universal voice from the local streets of Johannesburg and the East End of London, Jonzi D creates a form of highly energized performance that elevates Hip Hop as a great democratiser between the highly developed global and the under-resourced local in the world. It is the staging of this democratised and technologised future (and present) that poses the greatest challenge for the scenographer working with Jonzi and his company, and it is this challenge, together with the associated deprogramming and translation of the artist's particular filmic vision to the stage, that this discussion will explore. This paper interrogates not only how a scenographic strategy can support the existence of this work but also how the scenographer as outsider can enter and influence this nation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

"What was silent will speak, what is closed will open and will take on a voice" (Paul Virilio). The fundamental problem in dealing with the digital is that we are forced to contend with a fundamental deconstruction of form: a deconstruction that renders our content and practice into a single state that can be openly and easily manipulated, reimagined and mashed together in rapid time to create completely unique artefacts and potentially unwranglable jumbles of data. Once our work is essentially broken down into this series of number sequences (or bytes), our sound, images, movies and documents – our memory files – we are left with nothing but choice, and this is the key concern. This absence of form transforms our work into new collections and poses unique challenges for the artist seeking opportunities to exploit the potential of digital deconstruction. It is through this struggle with the absent form that we are able to thoroughly explore the latent potential of content, exploit modern abstractions of time and devise approaches within our practice that actively deal with the digital as an essential matter of course.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Natural disasters and deliberate, willful damage to telecommunication infrastructure can result in a loss of critical voice and data services. This loss of service hinders efficient emergency response and can cause delays leading to loss of life. Current mobile devices are generally tied to one network operator. When a disaster is of significant impact, that network operator cannot be relied upon to provide the service and coverage levels that would normally exist. While some operators have agreements with other operators to share resources (such as network roaming), these agreements are contractual in nature and cannot be activated quickly in an emergency. This paper introduces Fourth Generation (4G) wireless networks. 4G networks are highly mobile and heterogeneous, which makes them highly resilient in times of disaster.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Acoustically, car cabins are extremely noisy, and as a consequence audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using visual lip information from the driver in audio-visual automatic speech recognition (AVASR) is seen as a viable strategy for circumventing this problem. However, implementing AVASR requires a system that can accurately locate and track the driver’s face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This collaborative event was organised to coincide with international celebrations by the International Council of Societies of Industrial Design (ICSID). The panel discussion involved industrial designers from a variety of backgrounds including academics, theorists and practitioners. Each panel member was given time to voice their opinion surrounding the theme of WIDD2010, "Industrial Design: Humane Solutions for a Resilient World". The discussion was then extended to the audience through active question and answer time. The panel included:
* Professor Vesna Popovic FDIA - Queensland University of Technology
* Adam Doyle, Studio Manager - Infinity Design Development
* Scott Cox MDIA, Creative Director - Formwerx
* Alexander Lotersztain, Director - Derlot
* Philip Whiting FDIA, Design Convenor - QCA
* Professor Tony Fry, Director Team D/E/S & QCA
Afterwards, the documentary "Objectified" by Gary Hustwit was screened (75 min).

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio-Visual Automatic Speech Recognition (AVASR). Research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems for use in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVASR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is a publicly available in-car database, and we show that the use of visual speech in conjunction with the audio modality is a better approach to improving the robustness and effectiveness of voice-only recognition systems in car cabin environments.
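One simple way to combine the two modalities is decision-level (late) fusion, where per-word scores from an audio recogniser and a visual recogniser are merged with a reliability weight. The abstract does not say which fusion scheme the system uses, so the sketch below is an illustrative assumption; the function names, the weight and the toy scores are all hypothetical.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Weighted sum of per-word audio and visual log-likelihood scores."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return {
        word: audio_weight * audio_scores[word]
        + (1.0 - audio_weight) * visual_scores[word]
        for word in audio_scores
    }


def recognise(audio_scores, visual_scores, audio_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores for two command words in a noisy cabin:
# the audio stream is nearly ambiguous, the visual stream is not.
audio = {"left": -10.0, "right": -9.5}
visual = {"left": -4.0, "right": -8.0}

audio_only = recognise(audio, visual, audio_weight=1.0)
fused_choice = recognise(audio, visual, audio_weight=0.5)
```

Lowering `audio_weight` as cabin noise rises shifts the decision towards the visual stream, which is the intuition behind using lip information when the audio channel is unreliable.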