288 results for audio-visual education
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the audio models trained on extensive external audio datasets and the internal audio models by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.
Abstract:
In recent years, many of the world’s leading media producers, screenwriters, technicians and investors, particularly those in the Asia-Pacific region, have been drawn to work in the People's Republic of China (hereafter China or Mainland China). Media projects with a lighter commercial entertainment feel – compared with the heavy propaganda-oriented content of the past – have multiplied, thanks to the Chinese state’s newfound willingness to consider collaboration with foreign partners. This is no more evident than in film. Despite their long-standing reputation for rigorous censorship, state policymakers are now encouraging Chinese media entrepreneurs to generate fresh ideas and to develop products that will revitalise the stagnant domestic production sector. It is hoped that an increase in both the quality and quantity of domestic feature films, stimulated by an infusion of creativity and cutting-edge technology from outside the country, will help reverse China’s ‘cultural trade deficit’ (wenhua maoyi chizi) (Keane 2007).
Abstract:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
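The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes the two likelihood scores per segment are already available (GMM training is omitted), and it uses Euclidean distance between score pairs as the dissimilarity measure, which is an assumption since the abstract does not name the measure. The function names and the rule for deciding which cluster is speech are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import dist


def complete_linkage_two_clusters(points):
    """Agglomerative complete-linkage clustering, stopped at two clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 2:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def label_segments(scores):
    """Label each segment as speech (True) or non-speech (False).

    scores: list of (speech_loglik, nonspeech_loglik) pairs, one per segment.
    """
    if not scores:
        return []
    clusters = complete_linkage_two_clusters(scores)

    # Assumption: the cluster whose segments favour the speech model most,
    # on average, is taken to be the speech cluster.
    def mean_margin(cluster):
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    speech = max(clusters, key=mean_margin)
    labels = [False] * len(scores)
    for i in speech:
        labels[i] = True
    return labels
```

For example, `label_segments([(-10, -20), (-30, -12)])` marks the first segment as speech because its speech-model score exceeds its non-speech score by a wider margin.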
Abstract:
This doctoral thesis comprises three distinct yet related projects which investigate interdisciplinary practice across: music collaboration; mime performance; and corporate communication. Both the processes and underpinning research of these projects explore, expose and exploit areas where disparate and apparently conflicting fields of professional practice successfully and effectively intersect, interact, and inform each other, rather than conflict, thereby enhancing each, both individually and collectively. Informed by three decades of professional practice across: music; stage performance; television; corporate communication; design; and tertiary education, the three projects have produced innovative, creative, and commercially viable outcomes, manifest in a variety of media including: music; written text; digital, audio/visual; and internet. In exploring new practice and creating new knowledge, these project outcomes clearly demonstrate the value and effectiveness of reconciling disparate fields of practice through the application of interdisciplinary creativity and innovation to professional practice.
Abstract:
The performance of visual speech recognition (VSR) systems is significantly influenced by the accuracy of the visual front-end. The current state-of-the-art VSR systems use off-the-shelf face detectors such as Viola-Jones (VJ), which have limited reliability under changes in illumination and head pose. For a VSR system to perform well under these conditions, an accurate visual front-end is required. This is an important problem to be solved in many practical implementations of audio-visual speech recognition systems, for example in automotive environments for an efficient human-vehicle computer interface. In this paper, we re-examine the current state-of-the-art VSR by comparing off-the-shelf face detectors with the recently developed Fourier Lucas-Kanade (FLK) image alignment technique. A variety of image alignment and visual speech recognition experiments are performed on a clean dataset as well as on a challenging automotive audio-visual speech dataset. Our results indicate that the FLK image alignment technique can significantly outperform off-the-shelf face detectors, but requires frequent fine-tuning.
Abstract:
The objective of the present study was to understand teachers' perceptions of students' academic stress and other welfare-related issues. A group of 125 secondary and higher secondary school teachers (43 male and 82 female) from five schools located in Kolkata were covered in the study using a convenience sampling technique. Data were collected using a semi-structured questionnaire developed by the first author. Findings revealed that more than half of the teachers (55.8% male and 54.9% female) felt that today's students are not brought up in a child-friendly environment, while an overwhelming number of teachers stated that students face some social problems (88.4% male and 96.3% female) which affect their mental health and cause stress (90.7% male and 92.7% female). However, the majority of them (79.1% male and 78% female teachers), irrespective of gender, denied that the teaching methods followed in schools could cause academic stress. The vast majority of the teachers felt that the New Education System in India, i.e., making the Grade X examination (popularly known as the secondary examination) optional, will not be beneficial for students. So far as motivation of the students is concerned, introducing innovative teaching methods such as project work, field visits and the use of audio-visual aids in schools was suggested by more than 95% of the teachers. In addition, most of the teachers suggested a reward system in schools, along with teachers taking classes seriously and being punctual. Reduction of the homework load was also suggested by more than two-fifths of the teachers. Although corporal punishment has declined, it is still practiced by some of the teachers, especially male teachers, in Kolkata. Male and female teachers differed significantly on two issues only (p < .05), i.e., applying corporal punishment and the impact of sexual health education. Male teachers apply more corporal punishment than female teachers, and male teachers do not foresee any negative influence of sexual health education.
Abstract:
Background: Bachelor of Pharmacy programs were introduced in 2006 into two Sri Lankan universities, the University of Peradeniya and the University of Sri Jayewardenepura. Due to minimal clinical pharmacy experience in the country, these universities invited international colleagues to develop and teach the clinical pharmacy course. Aims: To describe the development, delivery and evaluation of both a clinical pharmacy undergraduate course and a "Train-the-Trainer" program provided to local academics delivering undergraduate pharmacy programs. Method: In 2009, Australian pharmacist academics developed and piloted an undergraduate clinical pharmacy course at the University of Peradeniya. In 2010, this was refined and delivered at the University of Sri Jayewardenepura, along with a "Train-the-Trainer" program for local academics. These were evaluated using surveys. Results: Most students considered lecture delivery speed and use of audio-visual aids appropriate, and lecture content relevant. Most academics found the "Train-the-Trainer" program increased their knowledge and improved their teaching skills. Conclusion: Experienced pharmacist academics can improve the quality of clinical pharmacy teaching in developing countries such as Sri Lanka.
Abstract:
Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. In order to provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information in the form of lip movements of the speaker in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.
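The idea of approximate matching at the search stage can be illustrated with a minimal sketch. It assumes the index holds a single phone sequence per segment (the abstract mentions multiple hypotheses, which are not modelled here) and uses unit-cost edit distance against any contiguous span of the index, which is one common choice and not necessarily the matching technique used in the paper. The function names, the phone labels and the `max_cost` threshold are hypothetical.

```python
def best_match_cost(query, indexed):
    """Minimum edit distance between `query` and any contiguous span of `indexed`.

    Both arguments are sequences of phone labels.
    """
    # Row for the empty query: it matches anywhere in the index at zero cost,
    # which lets a match start at any position.
    prev = [0] * (len(indexed) + 1)
    for i, q in enumerate(query, start=1):
        # With no indexed phones consumed, i query phones are unmatched.
        curr = [i] + [0] * len(indexed)
        for j, p in enumerate(indexed, start=1):
            curr[j] = min(
                prev[j - 1] + (q != p),  # match (cost 0) or substitution (cost 1)
                prev[j] + 1,             # query phone missing from the index
                curr[j - 1] + 1,         # extra phone in the index
            )
        prev = curr
    # The match may end at any position in the index.
    return min(prev)


def detect(query, indexed, max_cost=1):
    """Report a detection if the query matches some span within `max_cost` edits."""
    return best_match_cost(query, indexed) <= max_cost
```

With a tolerance of one edit, a query such as `["k", "ae", "t"]` is still detected in an index where the recogniser produced `["k", "eh", "t"]`, which is the kind of recognition error approximate matching is meant to absorb.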
Abstract:
There are two aspects to the problem of digital scholarship and pedagogy. One is to do with scholarship; the other with pedagogy. In scholarship, the association of knowledge with its printed form remains dominant. In pedagogy, the desire to abandon print for ‘new’ media is urgent, at least in some parts of the academy. Film and media studies are thus at the intersection of opposing forces – pulling the field ‘back’ to print and ‘forward’ to digital media. These tensions may be especially painful in a field whose own object of study is another form of communication, neither print nor digital but broadcast. Although print has been overtaken in the popular marketplace by audio-visual forms, this was never achieved in the domain of scholarship. Even when it is digitally distributed, the output of research is still a ‘paper.’ But meanwhile, in the realm of teaching, production- and practice-based pedagogy has become firmly established. Nevertheless a disjunction remains, between high-end scholarship in research universities and vocational training in teaching institutions; but neither is well equipped to deal with the digital challenge.
Abstract:
Videotelephony (real-time audio-visual communication) has been used successfully in adult palliative home care. This paper describes two attempts to complete an RCT (both of which were abandoned following difficulties with family recruitment), designed to investigate the use of videotelephony with families receiving palliative care from a tertiary paediatric oncology service in Brisbane, Australia. To investigate whether providing videotelephone-based support was acceptable to these families, a 12-month non-randomised acceptability trial was completed. Seventeen palliative care families were offered access to a videotelephone support service in addition to the 24-hour ‘on-call’ service already offered. A 92% participation rate in this study provided some reassurance that the use of videotelephones themselves was not a factor in poor RCT participation rates. The next phase of research is to investigate the integration of videotelephone-based support from the time of diagnosis, through outpatient care and support, and for palliative care, rather than for palliative care in isolation.
Abstract:
Research is often characterised as the search for new ideas and understanding. The language of this view privileges the cognitive and intellectual aspects of discovery. However, in the research process theoretical claims are usually evaluated in practice and, indeed, the observations and experiences of practical circumstances often lead to new research questions. This feedback loop between speculation and experimentation is fundamental to research in many disciplines, and is also appropriate for research in the creative arts. In this chapter we examine how our creative desire for artistic expressivity results in an interplay between actions and ideas that directs the development of techniques and approaches for our audio/visual live-coding activities.