Biblioteca Digital

123 results for College teaching Audio-visual aids

Multiple cameras for audio-visual speech recognition in an automotive environment

Relevance: 100.00%

Abstract:

Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) is conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contribution of the side and centrally oriented cameras in improving visual speech recognition accuracy. Finally, combining the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy compared with the acoustic-only approach in the noisiest conditions of the AVICAR database.

Students' academic stress and welfare as perceived by the teachers

Relevance: 100.00%

Abstract:

The objective of the present study was to understand teachers' perceptions of students' academic stress and other welfare-related issues. A group of 125 secondary and higher secondary school teachers (43 male and 82 female) from five schools located in Kolkata was covered in the study using a convenience sampling technique. Data were collected using a semi-structured questionnaire developed by the first author. Findings revealed that more than half of the teachers (55.8% male and 54.9% female) felt that today's students are not brought up in a child-friendly environment, while an overwhelming number of teachers stated that students face some social problems (88.4% male and 96.3% female) which affect their mental health and cause stress (90.7% male and 92.7% female). However, the majority of them (79.1% of male and 78% of female teachers), irrespective of gender, denied that the teaching method followed in schools could cause academic stress. The vast majority of the teachers felt that the New Education System in India, i.e., making the Grade X examination (popularly known as the secondary examination) optional, will not be beneficial for students. So far as motivation of the students is concerned, introducing innovative teaching methods such as project work, field visits and the use of audio-visual aids in the schools was suggested by more than 95% of the teachers. Apart from this, most of the teachers suggested a reward system in the schools, in addition to teachers taking classes seriously and being punctual. Reducing the homework load was also suggested by more than two-fifths of the teachers. Although corporal punishment has declined, it is still practiced by some of the teachers, especially male teachers, in Kolkata. Male and female teachers differed significantly on two issues only (p < .05), i.e., applying corporal punishment and the impact of sexual health education. Male teachers apply more corporal punishment than female teachers, and male teachers do not foresee any negative influence of sexual health education.

Acoustic adaptation in cross database audio visual SHMM training for phonetic spoken term detection

Relevance: 100.00%

Abstract:

Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.

Cross database training of audio-visual hidden Markov models for phone recognition

Relevance: 100.00%

Abstract:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.

East Asian audio-visual collaboration and the global expansion of Chinese media

Relevance: 100.00%

Abstract:

In recent years, many of the world’s leading media producers, screenwriters, technicians and investors, particularly those in the Asia-Pacific region, have been drawn to work in the People's Republic of China (hereafter China or Mainland China). Media projects with a lighter commercial entertainment feel – compared with the heavy propaganda-oriented content of the past – have multiplied, thanks to the Chinese state’s newfound willingness to consider collaboration with foreign partners. This is no more evident than in film. Despite their long-standing reputation for rigorous censorship, state policymakers are now encouraging Chinese media entrepreneurs to generate fresh ideas and to develop products that will revitalise the stagnant domestic production sector. It is hoped that an increase in both the quality and quantity of domestic feature films, stimulated by an infusion of creativity and cutting-edge technology from outside the country, will help reverse China’s ‘cultural trade deficit’ (wenhua maoyi chizi) (Keane 2007).

The nature of interaction in educational videoconferencing

Relevance: 100.00%

Complete-linkage clustering for voice activity detection in audio and visual speech

Relevance: 100.00%

Abstract:

We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best-performing baseline system across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm when using visual features, audio-visual features, or a proposed fusion of these features.
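
The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-segment scores are hypothetical values standing in for the GMM log-likelihoods, the dissimilarity is assumed to be the Euclidean distance between score pairs (the abstract does not specify the measure), and the function names `complete_linkage` and `label_segments` are invented for illustration.

```python
import math
from itertools import combinations


def complete_linkage(points, n_clusters=2):
    """Agglomerative clustering in which the distance between two clusters
    is the largest pairwise distance between their members."""
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Assumption: Euclidean distance between score pairs as dissimilarity.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest complete-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return clusters


def label_segments(scores):
    """scores: one (speech_loglik, nonspeech_loglik) pair per segment.
    Returns a list of booleans, True where the segment is labelled speech."""
    clusters = complete_linkage(scores, n_clusters=2)

    def mean_margin(cluster):
        # Average advantage of the speech model over the non-speech model.
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    speech_cluster = max(clusters, key=mean_margin)
    labels = [False] * len(scores)
    for i in speech_cluster:
        labels[i] = True
    return labels


# Hypothetical log-likelihood scores for six segments of one recording.
scores = [(-10.0, -30.0), (-12.0, -28.0), (-29.0, -11.0),
          (-31.0, -9.0), (-11.0, -27.0), (-30.0, -12.0)]
labels = label_segments(scores)
```

Because the segments are clustered relative to one another within a recording, no fixed likelihood threshold is needed; the sketch simply names the cluster whose members favour the speech model on average as the speech cluster.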

Using a Free-Parts Representation for Visual Speech Recognition

Relevance: 100.00%

Digital scholarship and pedagogy, the next step : cultural science.

Relevance: 100.00%

Abstract:

There are two aspects to the problem of digital scholarship and pedagogy. One is to do with scholarship; the other with pedagogy. In scholarship, the association of knowledge with its printed form remains dominant. In pedagogy, the desire to abandon print for ‘new’ media is urgent, at least in some parts of the academy. Film and media studies are thus at the intersection of opposing forces – pulling the field ‘back’ to print and ‘forward’ to digital media. These tensions may be especially painful in a field whose own object of study is another form of communication, neither print nor digital but broadcast. Although print has been overtaken in the popular marketplace by audio-visual forms, this was never achieved in the domain of scholarship. Even when it is digitally distributed, the output of research is still a ‘paper.’ But meanwhile, in the realm of teaching, production- and practice-based pedagogy has become firmly established. Nevertheless a disjunction remains, between high-end scholarship in research universities and vocational training in teaching institutions; but neither is well equipped to deal with the digital challenge.

Facilitating a reflective, collaborative teaching development project in higher education : reflections on experience

Relevance: 100.00%

Visual front-end wars : Viola-Jones face detector vs Fourier Lucas-Kanade

Relevance: 100.00%

Abstract:

The performance of visual speech recognition (VSR) systems is significantly influenced by the accuracy of the visual front-end. Current state-of-the-art VSR systems use off-the-shelf face detectors such as Viola-Jones (VJ), which have limited reliability under changes in illumination and head pose. For a VSR system to perform well under these conditions, an accurate visual front-end is required. This is an important problem to be solved in many practical implementations of audio-visual speech recognition systems, for example in automotive environments for an efficient human-vehicle computer interface. In this paper, we re-examine the current state-of-the-art VSR by comparing off-the-shelf face detectors with the recently developed Fourier Lucas-Kanade (FLK) image alignment technique. A variety of image alignment and visual speech recognition experiments are performed on a clean dataset as well as on a challenging automotive audio-visual speech dataset. Our results indicate that the FLK image alignment technique can significantly outperform off-the-shelf face detectors, but requires frequent fine-tuning.

Negotiating indeterminacy in studio teaching: Using the zones of material and immaterial play to shape pedagogies and practices in art

Relevance: 100.00%

Abstract:

The processes of studio-based teaching in visual art are often still tied to traditional models of discrete disciplines and largely immersed in skill-based learning. These approaches to training artists are also tied to an individual model of art practice that is clearly defined by the boundaries of those disciplines. This paper will explain how the open studio program at QUT can be broadly understood as an action research model of learning that ‘plays’ with the post-medium, post-studio genealogies and zones of contemporary art. This emphasises developing conceptual, contextual and formal skills as essential for engaging with and practicing in the often-indeterminate spatio-temporal sites of studio teaching. It will explore how this approach looks to Sutton-Smith’s observations on the role of play and Vygotsky’s zone of proximal development in early childhood learning as a way to develop strategies for promoting creative learning environments that are collaborative and self-sustainable. Social, cultural, political and philosophical dialogues are examined as they relate to art practice, with the aim of forming the shared interests, aims, and ambitions of graduating students into self-initiated collectives or ARIs.

Incorporating visual information for spoken term detection

Relevance: 100.00%

Abstract:

Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. In order to provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information in the form of lip movements of the speaker in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.
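
As a rough illustration of approximate matching at the search stage, the sketch below scans an indexed phone sequence for windows within a small edit distance of the query. It is a generic technique rather than the method used in the paper: a single best hypothesis is assumed instead of multiple hypotheses, the window length is fixed to the query length, and the phone labels and the names `edit_distance` and `search_term` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def search_term(query, indexed, max_distance=1):
    """Slide a window of len(query) over the indexed phones and return
    (start, distance) for every window within max_distance of the query."""
    hits = []
    n = len(query)
    for start in range(len(indexed) - n + 1):
        d = edit_distance(query, indexed[start:start + n])
        if d <= max_distance:
            hits.append((start, d))
    return hits


# Hypothetical index in which "k ae t" was misrecognised as "k ae p".
indexed = ["dh", "ax", "k", "ae", "p", "s", "ae", "t"]
query = ["k", "ae", "t"]
hits = search_term(query, indexed)
```

Allowing one phone error recovers the misrecognised occurrence at position 2, but it also admits the window "s ae t" at position 5. This is the usual trade-off between misses and false alarms that better phone recognition at the indexing stage is meant to ease.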

Using videotelephony to support paediatric oncology-related palliative care in the home : from abandoned RCT to acceptability study

Relevance: 100.00%

Abstract:

Videotelephony (real-time audio-visual communication) has been used successfully in adult palliative home care. This paper describes two attempts to complete an RCT (both of which were abandoned following difficulties with family recruitment) designed to investigate the use of videotelephony with families receiving palliative care from a tertiary paediatric oncology service in Brisbane, Australia. To investigate whether providing videotelephone-based support was acceptable to these families, a 12-month non-randomised acceptability trial was completed. Seventeen palliative care families were offered access to a videotelephone support service in addition to the 24-hour ‘on-call’ service already offered. A 92% participation rate in this study provided some reassurance that the use of videotelephones themselves was not a factor in poor RCT participation rates. The next phase of research is to investigate the integration of videotelephone-based support from the time of diagnosis, through outpatient care and support, and for palliative care, rather than for palliative care in isolation.

Integrating Creative Practice and Research in the Digital Media Arts

Relevance: 100.00%

Abstract:

Research is often characterised as the search for new ideas and understanding. The language of this view privileges the cognitive and intellectual aspects of discovery. However, in the research process theoretical claims are usually evaluated in practice and, indeed, the observations and experiences of practical circumstances often lead to new research questions. This feedback loop between speculation and experimentation is fundamental to research in many disciplines, and is also appropriate for research in the creative arts. In this chapter we will examine how our creative desire for artistic expressivity results in interplay between actions and ideas that direct the development of techniques and approaches for our audio/visual live-coding activities.
