199 results for Speech and Audio Research Laboratory
in Queensland University of Technology - ePrints Archive
Abstract:
This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This submission consisted of a score-level fusion of three component systems: a joint-factor GMM system and two SVM systems using GLDS and GMM supervector kernels. Development and evaluation results are presented, demonstrating the effectiveness of this fused system approach.
Abstract:
This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This competitive submission consisted of a score-level fusion of three component systems: a joint-factor analysis GMM system and two SVM systems using GLDS and GMM supervector kernels. Development, evaluation and post-submission results are presented in this study, demonstrating the effectiveness of this fused system approach. This study highlights the challenges associated with system calibration from limited development data, and shows that mismatch between training and testing conditions continues to be a major source of error in speaker verification technology.
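As a rough illustration of what score-level fusion of three component systems can look like, the sketch below normalises each system's scores against development-set statistics and combines them with a weighted sum. It is not the submitted QUT system: the function names, the z-normalisation step, and the weights are assumptions made purely for this example.

```python
from statistics import mean, pstdev

def znorm(scores, dev_scores):
    """Normalise scores with the mean and standard deviation of
    development-set scores (an assumed calibration step)."""
    mu, sigma = mean(dev_scores), pstdev(dev_scores)
    return [(s - mu) / sigma for s in scores]

def fuse(score_lists, weights):
    """Weighted sum across systems, one fused score per trial.

    score_lists: one list of per-trial scores for each component system.
    weights: one (hypothetical) weight per component system.
    """
    return [
        sum(w * s for w, s in zip(weights, trial))
        for trial in zip(*score_lists)
    ]

# Hypothetical scores for two trials from three component systems.
jfa_gmm = [1.0, 2.0]
svm_glds = [3.0, 4.0]
svm_gmm_sv = [5.0, 6.0]
fused = fuse([jfa_gmm, svm_glds, svm_gmm_sv], [0.5, 0.25, 0.25])
```

In practice the weights would be estimated from development data, which is where the calibration difficulty noted in the abstract arises.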
Abstract:
The Autistic Behavioural Indicators Instrument (ABII) is an 18-item instrument developed to identify children with Autistic Disorder (AD) based on the presence of unique autistic behavioural indicators. The ABII was administered to 20 children with AD, 20 children with speech and language impairment (SLI) and 20 typically developing (TD) children aged 2-6 years. Results indicated that the ABII discriminated children diagnosed with AD from those diagnosed with SLI and those who were TD, based on the presence of specific social attention, sensory, and behavioural symptoms. A combination of symptomology across these domains correctly classified 100% of children with and without AD. The paper concludes that the ABII shows considerable promise as an instrument for the early identification of AD.
Abstract:
What really changed for Australian Aboriginal and Torres Strait Islander people between Paul Keating’s Redfern Park Speech (Keating 1992) and Kevin Rudd’s Apology to the stolen generations (Rudd 2008)? What will change between the Apology and the next speech of an Australian Prime Minister? The two speeches were intricately linked, and they were both personal and political. But do they really signify change at the political level? This paper reflects my attempt to turn the gaze away from Aboriginal and Torres Strait Islander people, and back to where the speeches originated: the Australian Labor Party (ALP). I question whether the changes foreshadowed in the two speeches – including changes by the Australian public and within Australian society – are evident in the internal mechanisms of the ALP. I also seek to understand why non-Indigenous women seem to have given in to the existing ways of the ALP instead of challenging the status quo which keeps Aboriginal and Torres Strait Islander peoples marginalised. I believe that, without a thorough examination and a change in the ALP’s practices, the domination and subjugation of Indigenous peoples will continue – within the Party, through the Australian political process and, therefore, through governments.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
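To make the idea of a modality weighting concrete, the sketch below fuses a speech score and a lip score with a single weight and picks that weight by a plain grid search over labelled development trials. This is a generic baseline, not the weighting technique proposed in the paper; all names, the fixed threshold, and the toy trials are assumptions for illustration only.

```python
def fused_score(speech, lip, alpha):
    """Linear fusion: alpha weights speech, (1 - alpha) weights lip."""
    return alpha * speech + (1.0 - alpha) * lip

def error_count(trials, alpha, threshold):
    """Count false acceptances plus false rejections.

    trials: iterable of (speech_score, lip_score, is_target) tuples.
    """
    errors = 0
    for speech, lip, is_target in trials:
        accepted = fused_score(speech, lip, alpha) >= threshold
        if accepted != is_target:
            errors += 1
    return errors

def best_alpha(trials, threshold=0.0, steps=11):
    """Grid search over alpha in [0, 1]; the first minimum wins."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda a: error_count(trials, a, threshold))

# Toy development trials in which noise has corrupted the speech scores.
dev_trials = [
    (-1.0, 2.0, True),
    (1.0, -2.0, False),
    (0.5, 1.0, True),
    (-0.5, -1.0, False),
]
alpha = best_alpha(dev_trials)
```

With these toy trials the search favours the lip stream, mirroring the abstract's point that lip information helps most when background noise degrades the speech scores.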
Abstract:
This paper investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
Sound tagging has been studied for years. Among all sound types, music, speech, and environmental sound are the three most active research areas. This survey aims to provide an overview of state-of-the-art developments in these areas. We begin by discussing what tagging means in each sound area. Some examples of sound tagging applications are introduced in order to illustrate the significance of this research. Typical tagging techniques include manual, automatic, and semi-automatic approaches. After reviewing work in music, speech and environmental sound tagging, we compare them and state the research progress to date. Research gaps are identified for each research area, and the common features and differences between the three areas are identified as well. Published datasets, tools used by researchers, and evaluation measures frequently applied in the analysis are listed. Finally, we summarise the worldwide distribution of countries that have been active in sound tagging research.
Abstract:
For decades, marketing and marketing research have been based on a concept of consumer behaviour that is deeply embedded in a linear notion of marketing activities. With increasing regularity, key organising frameworks for marketing and marketing activities are being challenged by academics and practitioners alike. In turn, this has led to the search for new approaches and tools that will help marketers understand the interaction among attitudes, emotions and product/brand choice. More recently, the approach developed by Harvard Professor Gerald Zaltman, known as the Zaltman Metaphor Elicitation Technique (ZMET), has gained considerable interest. This paper seeks to demonstrate the effectiveness of this alternative qualitative method, using a non-conventional approach, thus providing a useful contribution to the qualitative research area.
Abstract:
Developing an effective impact evaluation framework, managing and conducting rigorous impact evaluations, and developing a strong research and evaluation culture within development communication organisations present many challenges. This is especially so when both the community and organisational contexts are continually changing and the outcomes of programs are complex and difficult to clearly identify.
This paper presents a case study from a research project being conducted from 2007-2010 that aims to address these challenges and issues, entitled Assessing Communication for Social Change: A New Agenda in Impact Assessment. Building on previous development communication projects which used ethnographic action research, this project is developing, trialling and rigorously evaluating a participatory impact assessment methodology for assessing the social change impacts of community radio programs in Nepal. This project is a collaboration between Equal Access – Nepal (EAN), Equal Access – International, local stakeholders and listeners, a network of trained community researchers, and a research team from two Australian universities. A key element of the project is the establishment of an organisational culture within EAN that values and supports the impact assessment process being developed, which is based on continuous action learning and improvement. The paper describes the situation related to monitoring and evaluation (M&E) and impact assessment before the project began, in which EAN was often reliant on time-bound studies and ‘success stories’ derived from listener letters and feedback. We then outline the various strategies used in an effort to develop stronger and more effective impact assessment and M&E systems, and the gradual changes that have occurred to date. These changes include a greater understanding of the value of adopting a participatory, holistic, evidence-based approach to impact assessment.
We also critically review the many challenges experienced in this process, including:
• Tension between the pressure from donors to ‘prove’ impacts and the adoption of a bottom-up, participatory approach based on ‘improving’ programs in ways that meet community needs and aspirations.
• Resistance from the content teams to changing their existing M&E practices and to the perceived complexity of the approach.
• Lack of meaningful connection between the M&E and content teams.
• Human resource problems and lack of capacity in analysing qualitative data and reporting results.
• The contextual challenges, including extreme poverty, wide cultural and linguistic diversity, poor transport and communications infrastructure, and political instability.
• A general lack of acceptance of the importance of evaluation within Nepal due to accepting everything as fate or ‘natural’ rather than requiring investigation into a problem.
Abstract:
This paper describes methods used to support collaboration and communication between practitioners, designers and engineers when designing ubiquitous computing systems. We tested methods such as “Wizard of Oz” and design games in a real domain, the dental surgery, in an attempt to create a system that is affordable, is minimally disruptive of the natural flow of work, and improves human-computer interaction. In doing so we found that such activities allowed the practitioners to be on a ‘level playing ground’ with designers and engineers. The findings we present suggest that dentists are willing to engage in detailed exploration and constructive critique of technical design possibilities if the design ideas and prototypes are presented in the context of their work practice and are of a resolution and relevance that allow them to jointly explore and question with the design team. This paper is an extension of a short paper submitted to the Participatory Design Conference, 2004.