939 results for audio equipment
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.
Abstract:
Automated digital recordings are useful for large-scale temporal and spatial environmental monitoring. An important research effort has been the automated classification of calling bird species. In this paper we examine a related task: retrieving, from a database of audio recordings, birdcalls similar to a user-supplied query call. Such a retrieval task can sometimes be more useful than an automated classifier. We compare three approaches to similarity-based birdcall retrieval using spectral ridge features and two kinds of gradient features, structure tensor and the histogram of oriented gradients. The retrieval accuracy of our spectral ridge method is 94% compared to 82% for the structure tensor method and 90% for the histogram of gradients method. Additionally, the spectral ridge approach potentially offers a more compact representation and is more computationally efficient.
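The retrieval task described above can be sketched as ranking stored calls by their similarity to a query. This is a minimal sketch only: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the feature vectors, the call identifiers and the `retrieve` helper are hypothetical stand-ins for real spectral ridge or gradient features.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query, database, top_k=3):
    """Return the ids of the top_k database calls most similar to the query."""
    scored = [(cosine_similarity(query, feats), call_id)
              for call_id, feats in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [call_id for _, call_id in scored[:top_k]]


# Hypothetical feature vectors (assumed values, not taken from the paper).
database = {
    "call_a": [0.9, 0.1, 0.0, 0.0],
    "call_b": [0.1, 0.8, 0.1, 0.0],
    "call_c": [0.7, 0.2, 0.1, 0.0],
}
query = [1.0, 0.1, 0.0, 0.0]
ranked = retrieve(query, database, top_k=2)
```

A ranked list, rather than a single predicted label, is what lets a user browse near matches to a call that no classifier was trained on.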
Abstract:
Traction is recognised as an important component of the overall playability and safety of a sportsfield. It relates to the "grip", or footing, provided through an athlete's shoe when in contact with the surface, and is normally measured by the torque generated when a weighted studded disc apparatus is dropped onto the turf and twisted manually. This paper describes the development of an automated traction tester, which mechanises the dropping and twisting of the weighted studded disc. By standardising these operational stages, more repeatable and reliable results can be expected than from the original hand-operated design where positioning of the disc and speed of rotation are controlled manually and so can vary from one measurement to the next. As well as measuring the maximum torque reached during rotation of the studded disc, the automated traction tester generates a profile of torque showing changes over time and calculates the angle through which the studded disc moved before reaching maximum torque. These aspects are now covered by a utility patent (PAT/AU/2004270767). Use of the automated traction tester is illustrated by comparative data for a range of warm-season turfgrasses, by comparisons of traction under different surface conditions generated by wear on Cynodon dactylon cultivars, and by the effects of environment, management and playing patterns on traction across a multi-use sports stadium.
Abstract:
The international traveller needs to plan ahead to ensure medicines are available and used as directed for optimal therapeutic outcome. The planning needs to take account of legal and customs requirements for travelling with medicines for personal use. The standard advice by travel health providers is that travellers should check with the country of destination for requirements when travelling into the country with medicines for personal use. This is akin to introducing a barrier to care for this category of travellers. An innovative method of care for this group of travellers is needed.
Abstract:
Acoustic recordings play an increasingly important role in monitoring terrestrial and aquatic environments. However, rapid advances in technology make it possible to accumulate thousands of hours of recordings, more than ecologists can ever listen to. Our approach to this big-data challenge is to visualize the content of long-duration audio recordings on multiple scales, from minutes, hours, days to years. The visualization should facilitate navigation and yield ecologically meaningful information prior to listening to the audio. To construct images, we calculate acoustic indices, statistics that describe the distribution of acoustic energy and reflect content of ecological interest. We combine various indices to produce false-color spectrogram images that reveal acoustic content and facilitate navigation. The technical challenge we investigate in this work is how to navigate recordings that are days or even months in duration. We introduce a method of zooming through multiple temporal scales, analogous to Google Maps. However, the “landscape” to be navigated is not geographical and not therefore intrinsically visual, but rather a graphical representation of the underlying audio. We describe solutions to navigating spectrograms that range over three orders of magnitude of temporal scale. We make three sets of observations: 1. We determine that at least ten intermediate scale steps are required to zoom over three orders of magnitude of temporal scale; 2. We determine that three different visual representations are required to cover the range of temporal scales; 3. We present a solution to the problem of maintaining visual continuity when stepping between different visual representations. Finally, we demonstrate the utility of the approach with four case studies.
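To make the first observation concrete, the sketch below generates a ladder of temporal scales spanning three orders of magnitude with ten intermediate steps. The geometric spacing, the `zoom_scales` helper and the end-point values (60 s per pixel down to 0.06 s per pixel) are all assumptions for illustration; the abstract does not state how the steps are spaced.

```python
def zoom_scales(coarsest, finest, intermediate_steps):
    """Return geometrically spaced scales from coarsest to finest, inclusive.

    With `intermediate_steps` scales strictly between the two end points,
    the list has intermediate_steps + 2 entries.
    """
    intervals = intermediate_steps + 1
    ratio = (finest / coarsest) ** (1.0 / intervals)
    return [coarsest * ratio ** i for i in range(intervals + 1)]


# Hypothetical end points: 60 s/pixel (overview) to 0.06 s/pixel (detail).
scales = zoom_scales(60.0, 0.06, intermediate_steps=10)
step_ratio = scales[1] / scales[0]
```

Under these assumptions each zoom step shrinks the time span per pixel by a factor of roughly 1.9, so no single step changes the view by as much as a full order of magnitude.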
Abstract:
The stochastic version of Pontryagin's maximum principle is applied to determine an optimal maintenance policy of equipment subject to random deterioration. The deterioration of the equipment with age is modelled as a random process. Next the model is generalized to include random catastrophic failure of the equipment. The optimal maintenance policy is derived for two special probability distributions of time to failure of the equipment, namely, exponential and Weibull distributions. Both the salvage value and deterioration rate of the equipment are treated as state variables and the maintenance as a control variable. The result is illustrated by an example.
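The two time-to-failure distributions named above are related: the exponential distribution is the Weibull distribution with shape parameter 1. The sketch below shows the standard textbook survival and hazard functions only; it does not reproduce the paper's maintenance model, and the function names and parameter values are illustrative assumptions.

```python
import math


def weibull_survival(t, scale, shape):
    """Probability that the equipment has not failed by time t (t >= 0)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Instantaneous failure rate at time t (t > 0).

    shape == 1 gives the exponential case with constant hazard 1 / scale;
    shape > 1 gives a hazard that increases with age.
    """
    return (shape / scale) * (t / scale) ** (shape - 1.0)


# Assumed parameters for illustration only.
constant_rate = weibull_hazard(2.0, scale=5.0, shape=1.0)
early_rate = weibull_hazard(2.0, scale=5.0, shape=2.0)
late_rate = weibull_hazard(4.0, scale=5.0, shape=2.0)
```

A shape parameter above 1 is the case in which failure becomes more likely as the equipment ages, which is where the timing of maintenance matters most.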
Abstract:
The aim of this paper is to present results of research investigating the effectiveness of audio feedback in a third year undergraduate unit. While there is a large and growing body of literature about providing assessment feedback, there is little focussing on the use of audio media. This study employs a mixed method approach, involving semi-structured interviews with academic staff and a survey of students. Analysis of the interview data suggests that there are a number of issues surrounding acceptance of using audio feedback by lecturers. The next stage of the study is to examine the extent to which lecturers change their perceptions as they use audio feedback and to analyse the perceptions of the students (n=120), including the perceived importance of feedback, the ways in which they used the audio feedback and the extent to which they believe they control events that affect them. Ultimately, this study seeks to provide recommendations appropriate to the implementation of audio feedback in higher education.
Abstract:
Providing audio feedback on assessment is relatively uncommon in higher education. However, published research suggests that students prefer it to written feedback, while lecturers are less convinced. The aim of this paper is to examine these findings further in the context of a third year business ethics unit. Data was collected from two sources. The first is a series of in-depth, semi-structured interviews conducted with three lecturers providing audio feedback for the first time in Semester One 2011. The second source of data was drawn from the university student evaluation system. A total of 363 responses were used, providing 'before' and 'after' perspectives about the effectiveness of audio feedback versus written feedback. Between 2005 and 2009 the survey data provided information about student attitudes to written assessment feedback (n=261). From 2010 onwards the data relates to audio (mp3) feedback (n=102). The analysis of the interview data indicated that introducing audio feedback should be done with care. The perception of the participating lecturers was mixed, ranging from scepticism to outright enthusiasm, but over time the overall approach became positive. It was found that particular attention needs to be paid to small (but important) technical details, and lecturers need to be convinced of its effectiveness, especially that it is not necessarily more time consuming than providing written feedback. For students, the analysis revealed a clear preference for audio feedback. It is concluded that there is cause for concern and reason for optimism. There is cause for concern because scepticism on the part of academic staff may be based on assumptions about what students prefer and a concern about using the technology. There is reason for optimism because the evidence points towards students preferring audio feedback and, as academic staff become more familiar with the technology, the scepticism tends to evaporate.
While this study is limited in scope, questions are raised about tackling negative staff perceptions of audio feedback that are worthy of further research.
Abstract:
This research investigates techniques to analyse long duration acoustic recordings to help ecologists monitor birdcall activities. It designs a generalized algorithm to identify a broad range of bird species. It allows ecologists to search for arbitrary birdcalls of interest, rather than restricting them to just a very limited number of species on which the recogniser is trained. The algorithm can help ecologists find sounds of interest more efficiently by filtering out large volumes of unwanted sounds and only focusing on birdcalls.
Abstract:
Volumetric method based adsorption measurements of nitrogen on two specimens of activated carbon (Fluka and Sarabhai) reported by us are refitted to two popular isotherms, namely, Dubinin−Astakhov (D−A) and Toth, in light of improved fitting methods derived recently. Those isotherms have been used to derive other data of relevance in design of engineering equipment such as the concentration dependence of heat of adsorption and Henry’s law coefficients. The present fits provide a better representation of experimental measurements than before because the temperature dependence of adsorbed phase volume and structural heterogeneity of micropore distribution have been accounted for in the D−A equation. A new correlation to the Toth equation is a further contribution. The heat of adsorption in the limiting uptake condition is correlated with the Henry’s law coefficients at the near zero uptake condition.
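For orientation, the two isotherms can be written in their common textbook forms as below. This is a sketch under stated assumptions: it does not include the temperature-dependent adsorbed phase volume or the new Toth correlation that the abstract describes, and all function names and parameter values are illustrative rather than fitted to the Fluka or Sarabhai data.

```python
import math

R = 8.314  # universal gas constant, J/(mol K)


def toth_uptake(p, q_max, b, t):
    """Toth isotherm: q = q_max * b*p / (1 + (b*p)**t)**(1/t)."""
    return q_max * b * p / (1.0 + (b * p) ** t) ** (1.0 / t)


def toth_henry_coefficient(q_max, b):
    """Slope of the Toth isotherm as p -> 0 (Henry's law limit)."""
    return q_max * b


def da_uptake(p, p_sat, temperature, w0, e, n):
    """Dubinin-Astakhov isotherm: W = w0 * exp(-(A/e)**n).

    A = R*T*ln(p_sat/p) is the adsorption potential; valid for 0 < p <= p_sat.
    """
    potential = R * temperature * math.log(p_sat / p)
    return w0 * math.exp(-((potential / e) ** n))


# Assumed parameters for illustration only.
langmuir_case = toth_uptake(2.0, q_max=10.0, b=0.5, t=1.0)
low_pressure_slope = toth_uptake(1e-9, q_max=10.0, b=0.5, t=0.7) / 1e-9
saturated = da_uptake(100.0, p_sat=100.0, temperature=300.0, w0=0.4, e=8000.0, n=1.5)
```

With `t = 1` the Toth form reduces to the Langmuir isotherm, and at very low pressure its slope approaches `q_max * b`, which is the Henry's law coefficient in this parameterisation.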
Abstract:
In recent years, many of the world’s leading media producers, screenwriters, technicians and investors, particularly those in the Asia-Pacific region, have been drawn to work in the People's Republic of China (hereafter China or Mainland China). Media projects with a lighter commercial entertainment feel – compared with the heavy propaganda-oriented content of the past – have multiplied, thanks to the Chinese state’s newfound willingness to consider collaboration with foreign partners. This is no more evident than in film. Despite their long-standing reputation for rigorous censorship, state policymakers are now encouraging Chinese media entrepreneurs to generate fresh ideas and to develop products that will revitalise the stagnant domestic production sector. It is hoped that an increase in both the quality and quantity of domestic feature films, stimulated by an infusion of creativity and cutting-edge technology from outside the country, will help reverse China’s ‘cultural trade deficit’ (wenhua maoyi chizi) (Keane 2007).
Abstract:
Communication applications are usually delay restricted, especially in the case of musicians playing over the Internet. This application requires a maximum one-way delay of 25 ms, and high audio quality is desired at feasible bit rates. The ultra low delay (ULD) audio coding structure is well suited to this application, and we further investigate the application of multistage vector quantization (MSVQ) to reach a bit rate range below 64 kb/s in a scalable manner. Results at 32 kb/s and 64 kb/s show that the trained codebook MSVQ performs best, better than KLT normalization followed by a simulated Gaussian MSVQ or simulated Gaussian MSVQ alone. The results also show that there is only a weak dependence on the training data, and that we indeed converge to the perceptual quality of our previous ULD coder at 64 kb/s.
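The general idea of multistage vector quantization can be sketched as follows: each stage quantizes the residual left by the previous stage, and a decoder that receives only the first stages still obtains a coarser reconstruction. This is a generic illustration, not the ULD coder's implementation; the tiny two-dimensional codebooks and the helper names are hypothetical, whereas the paper uses trained codebooks.

```python
def nearest_index(vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sq_dist(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def msvq_encode(vector, codebooks):
    """Quantize stage by stage; each stage codes the previous residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest_index(residual, codebook)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def msvq_decode(indices, codebooks):
    """Sum the selected codewords; fewer indices give a coarser result."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical codebooks: a coarse first stage and a finer second stage.
codebooks = [
    [(0.0, 0.0), (4.0, 4.0), (-4.0, -4.0)],
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)],
]
indices = msvq_encode([4.9, 4.2], codebooks)
full = msvq_decode(indices, codebooks)
coarse = msvq_decode(indices[:1], codebooks)  # first stage only
```

Dropping later stages lowers the bit rate at the cost of accuracy, which is the sense in which a multistage scheme is scalable.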