917 results for Hearing Aids
Abstract:
Audio feedback remains little used in most graphical user interfaces despite its potential to greatly enhance interaction. Sonic enhancement of interfaces not only permits more natural human-computer communication but also allows users to employ an appropriate sense to solve a problem rather than relying solely on vision. Research shows that designers typically do not know how to use sound effectively; consequently, their ad hoc use of sound often leads to audio feedback being considered an annoying distraction. Unlike the design of purely graphical user interfaces, for which guidelines are common, the audio enhancement of graphical user interfaces has until now been plagued by a lack of suitable guidance. This paper presents a series of empirically substantiated guidelines for the design and use of audio-enhanced graphical user interface widgets.
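As an illustration of the kind of widget-level audio feedback such guidelines address, here is a minimal sketch (not from the paper) that attaches an auditory cue to a button using Python's standard tkinter toolkit; the system bell stands in for a properly designed earcon, and the redundant visual status line reflects common multimodal design practice.

```python
import tkinter as tk

def on_click() -> None:
    """Confirm activation with an auditory cue alongside the visual change."""
    root.bell()            # system bell as a placeholder earcon
    status.set("Saved.")   # redundant visual feedback

root = tk.Tk()
root.title("Audio-enhanced widget sketch")

status = tk.StringVar(value="Ready")
tk.Button(root, text="Save", command=on_click).pack(padx=20, pady=10)
tk.Label(root, textvariable=status).pack(pady=(0, 10))

root.mainloop()
```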
Abstract:
In this report we summarize the state of the art of speech emotion recognition from the signal-processing point of view. On the basis of multi-corpus experiments with machine-learning classifiers, we observe that existing approaches to supervised machine learning lead to database-dependent classifiers that cannot be applied to multi-language speech emotion recognition without additional training, because they discriminate the emotion classes according to the training language used. Since experimental results show that humans can perform language-independent categorisation, we drew a parallel between machine recognition and the human cognitive process and tried to discover the sources of these divergent results. The analysis suggests that the main difference is that speech perception allows the extraction of language-independent features, even though language-dependent features are incorporated at all levels of the speech signal and play a strong discriminative role in human perception. Based on several results in related domains, we further suggest that the cognitive process of emotion recognition rests on categorisation, assisted by a hierarchical structure of emotional categories present in the cognitive space of all humans. We propose a strategy for developing language-independent machine emotion recognition, based on the identification of language-independent speech features and the use of additional information from visual (expression) features.
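The database dependence the report describes can be checked with a simple cross-corpus protocol: train on one corpus, test on another, and compare against within-corpus scores. The sketch below is illustrative only; the feature matrices and labels (X_a, y_a from one language's corpus, X_b, y_b from another's) are assumptions, not artefacts of the report.

```python
# Illustrative cross-corpus evaluation for speech emotion recognition.
# X_a, y_a and X_b, y_b are assumed acoustic feature matrices and emotion
# labels from two corpora recorded in different languages.
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def within_vs_cross(X_a, y_a, X_b, y_b):
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    # Within-corpus score: 5-fold cross-validation on corpus A alone.
    within = cross_val_score(clf, X_a, y_a, cv=5).mean()
    # Cross-corpus score: fit on all of corpus A, test on corpus B.
    clf.fit(X_a, y_a)
    cross = clf.score(X_b, y_b)
    return within, cross

# A large gap between the two scores is the database (language)
# dependence the report discusses.
```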
Abstract:
ACM Computing Classification System (1998): H.3.7 Digital Libraries, K.6.5 Security and Protection
Abstract:
In the digital age, the internet and ICT devices have changed our daily lives and routines to the point that we can hardly live without these services and devices anywhere (at work, at home, on holiday). This is evident in the tourism sector: digital content has become a key tool in 21st-century tourism, adapting the traditional tourist-guide methodology to applications running on novel digital devices. Tourists belong to a new generation, an "ICT generation" that uses innovative tools and new info-media to communicate. A promising direction for tourism development is therefore to use modern ICT systems and devices. Besides participating in classical tours led by travel guides, individual tourists now have the opportunity to enjoy high-quality, ICT-based guided walks built on the knowledge of travel guides. The main idea of the GUIDE@HAND service is to reuse existing tourism content and create new content for advanced mobile devices, in order to give a contemporary answer to traditional tourism-information systems by developing new tourism services based on digital content for innovative mobile applications. The service is based on a new concept of enhancing territorial heritage and values through knowledge, innovation, languages and multilingual solutions that go along with new tourists' "sensitiveness".
Abstract:
It is well established that accent recognition can be as accurate as 95% when the signals are noise-free, using feature extraction techniques such as mel-frequency cepstral coefficients and binary classifiers such as discriminant analysis, support vector machines and k-nearest neighbors. In this paper, we demonstrate that predictive performance can be reduced by as much as 15% when the signals are noisy. Specifically, we perturb the signals with different levels of white noise; as the noise becomes stronger, the out-of-sample predictive performance deteriorates from 95% to 80%, while the in-sample prediction gives overly optimistic results. ACM Computing Classification System (1998): C.3, C.5.1, H.1.2, H.2.4, G.3.
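A minimal sketch of the perturbation step described here, using numpy and librosa; the SNR grid, MFCC settings and example clip are assumptions for illustration, not values taken from the paper.

```python
# Add white noise to a speech signal at a target SNR, then extract MFCCs.
import numpy as np
import librosa

def add_white_noise(y: np.ndarray, snr_db: float, rng=None) -> np.ndarray:
    """Return y corrupted by white Gaussian noise at the given SNR in dB."""
    if rng is None:
        rng = np.random.default_rng(0)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=y.shape)
    return y + noise

y, sr = librosa.load(librosa.example("trumpet"))  # stand-in audio clip
for snr_db in (30, 20, 10, 5):                    # progressively stronger noise
    noisy = add_white_noise(y, snr_db)
    mfcc = librosa.feature.mfcc(y=noisy, sr=sr, n_mfcc=13)
    print(f"SNR {snr_db:>2} dB -> MFCC matrix {mfcc.shape}")
```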
Abstract:
In this paper we discuss how an innovative audio-visual project was adopted to foster active, rather than declarative, learning in critical International Relations (IR). First, we explore the aesthetic turn in IR, contrasting it with the forms of representation that have dominated IR scholarship. Second, we describe how students were asked to record short audio or video projects to explore their own insights through aesthetic and non-written formats. Third, we explain how these projects are understood to be deeply embedded in social-science methodologies. We cite our inspiration from applying a personal sociological imagination as a way to counterbalance a 'marketised' slant in higher education, in a global economy where students are often encouraged to consume, rather than produce, knowledge. Finally, we draw conclusions in terms of deeper forms of student engagement leading to new ways of thinking and presenting, new skills, and new connections between theory and practice.
Abstract:
Digital systems can generate left and right audio channels that create the effect of virtual sound source placement (spatialization) by processing an audio signal through pairs of Head-Related Transfer Functions (HRTFs) or, equivalently, Head-Related Impulse Responses (HRIRs). The spatialization effect is better when individually measured HRTFs or HRIRs are used than when generic ones (e.g., from a mannequin) are used, but the measurement process is not available to the majority of users. There is therefore ongoing interest in finding mechanisms to customize HRTFs or HRIRs to a specific user, in order to achieve an improved spatialization effect for that subject. Unfortunately, the models currently used for HRTFs and HRIRs contain over a hundred parameters, none of which can be easily related to the characteristics of the subject. This dissertation proposes an alternative model for the representation of HRTFs which contains at most 30 parameters, all of which have a defined functional significance. It also presents methods to obtain the parameter values that make the model approximately equivalent to an individually measured HRTF. This conversion is achieved by the systematic deconstruction of HRIR sequences through an augmented version of the Hankel Total Least Squares (HTLS) decomposition approach. An average 95% match (fit) was observed between the original HRIRs and those reconstructed from the Damped and Delayed Sinusoids (DDSs) found by the decomposition process, for ipsilateral source locations. The dissertation also introduces and evaluates an HRIR customization procedure, based on a multilinear model implemented through a 3-mode tensor, for mapping anatomical data from subjects to HRIR sequences at different sound source locations. This model uses the Higher-Order Singular Value Decomposition (HOSVD) method to represent the HRIRs and is capable of generating customized HRIRs from easily attainable anatomical measurements of a new intended user of the system. Listening tests were performed to compare the spatialization performance of customized, generic and individually measured HRIRs when used for synthesized spatial audio. Statistical analysis of the results confirms that the type of HRIR used for spatialization is a significant factor in spatialization success, with customized HRIRs yielding better results than generic HRIRs.
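The damped-and-delayed-sinusoid representation at the heart of the dissertation can be written as a short synthesis routine. The sketch below illustrates the model form only; the component parameter values are made up, and the HTLS fitting step that would recover them from a measured HRIR is not shown.

```python
# Synthesize an impulse-response-like sequence as a sum of damped and
# delayed sinusoids (DDSs): each component has an amplitude, damping
# factor, frequency, phase, and integer onset delay.
import numpy as np

def dds_sequence(n_samples, components, fs=44100.0):
    h = np.zeros(n_samples)
    for amp, damping, freq_hz, phase, delay in components:
        n = np.arange(n_samples - delay)
        tone = amp * np.exp(-damping * n) * np.cos(2 * np.pi * freq_hz * n / fs + phase)
        h[delay:] += tone
    return h

# Three made-up components: (amplitude, damping, frequency Hz, phase, delay samples)
components = [
    (1.00, 0.020, 3000.0, 0.0, 10),
    (0.60, 0.015, 5500.0, 1.2, 14),
    (0.35, 0.010, 8000.0, 2.1, 18),
]
hrir_like = dds_sequence(256, components)
print(hrir_like[:5])
```

With a handful of such components, each carrying five interpretable parameters, a model of this form stays well under the 30-parameter budget the dissertation describes.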
Abstract:
This study examines the correlation between how certified music educators understand audio technology and how they incorporate it in their instructional methods. Participants were classroom music teachers selected from fifty middle schools in Miami-Dade Public Schools. The study adopted a non-experimental research design in which a survey was the primary tool of investigation. The findings reveal that a majority of middle school music teachers in Miami-Dade are not familiar with advanced audio-recording software or any other digital device dedicated to the recording and processing of audio signals. Moreover, they report a lack of opportunities to develop this knowledge. Younger music teachers, however, are more open to developing up-to-date instructional methodologies. Most of the participants agreed that music instruction should be a platform for preparing students for a future in the entertainment industry, and that a basic knowledge of the music business should be delivered to students enrolled in middle-school music courses.
Abstract:
In an audio cueing system, a teacher is presented with randomly spaced auditory signals via tape recorder or intercom. The teacher is instructed to praise a child who is on-task each time the cue is presented. In this study, a baseline was obtained on the teacher's praise rate and the children's on-task behaviour in a Grade 5 class of 37 students. Children were then divided into high, medium and low on-task groups. Following baseline, the teacher's praise rate and the children's on-task behaviour were observed under the following successively implemented conditions: (1) Audio Cueing 1: audio cueing at a rate of 30 cues per hour was introduced into the classroom and remained in effect during subsequent conditions; a group of consistently low on-task children was delineated. (2) Audio Cueing Plus 'focus praise package': instructions to direct two-thirds of the praise to children identified by the experimenter (the consistently low on-task children), feedback, and experimenter praise for meeting or surpassing the criterion distribution of praise (the 'focus praise package') were introduced. (3) Audio Cueing 2: the 'focus praise package' was removed. (4) Audio Cueing Plus 'increase praise package': instructions to increase the rate of praise, feedback, and experimenter praise for improved praise rates (the 'increase praise package') were introduced. The primary aims of the study were to determine the distribution of praise among high, medium and low on-task children when audio cueing was first introduced, and to investigate the effect of the 'focus praise package' on the distribution of teacher praise. The teacher distributed her praise evenly among the high, medium and low on-task groups during Audio Cueing 1. The effect of the 'focus praise package' was to increase the percentage of praise received by the consistently low on-task children. Other findings tended to suggest that audio cueing increased the teacher's praise rate. However, the teacher's praise rate unexpectedly decreased to a level considerably below the cued rate during Audio Cueing 2. The 'increase praise package' appeared to increase the teacher's praise rate above the Audio Cueing 2 level. The effects of an increased praise rate and of two distributions of praise on on-task behaviour were considered. Significant increases in on-task behaviour were found in Audio Cueing 1 for the low on-task group, in the Audio Cueing Plus 'focus praise package' condition for the entire class and the consistently low on-task group, and in Audio Cueing 2 for the medium on-task group. Except for the high on-task children, who did not change, the effects of the experimental manipulations on on-task behaviour were equivocal. However, there were some indications that directing 67% of the praise to the consistently low on-task children was more effective for increasing this group's on-task behaviour than distributing praise equally among on-task groups.
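The cueing schedule itself is easy to reproduce: randomly spaced cues at an average rate of 30 per hour correspond to exponentially distributed inter-cue intervals (a Poisson process). The sketch below is one plausible implementation of "randomly spaced", an assumption rather than a detail given in the abstract.

```python
# Generate randomly spaced cue times at an average rate of 30 cues/hour,
# modelling the cues as a Poisson process (exponential inter-arrival times).
import random

def cue_times(rate_per_hour=30.0, session_minutes=60.0, seed=1):
    rng = random.Random(seed)
    mean_gap_min = 60.0 / rate_per_hour   # average minutes between cues
    t, times = 0.0, []
    while True:
        t += rng.expovariate(1.0 / mean_gap_min)  # expovariate takes the rate
        if t >= session_minutes:
            return times
        times.append(t)

for t in cue_times()[:5]:
    print(f"cue at {t:5.1f} min")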
Abstract:
This article introduces HomeBank, a public, permanent, extensible, online database of daylong audio recorded in naturalistic environments. HomeBank serves two primary purposes. First, it is a repository for raw audio and associated files: one database requires special permissions, and another, redacted database allows unrestricted public access. Associated files include metadata such as participant demographics and clinical diagnostics, automated annotations, and human-generated transcriptions and annotations. Many recordings use the child-perspective LENA recorders (LENA Research Foundation, Boulder, Colorado, United States), but various recordings and metadata can be accommodated. The HomeBank database can hold both vetted and unvetted recordings, with different levels of accessibility. Second, HomeBank is an open repository for tools for processing and analyzing HomeBank or similar data sets. HomeBank is flexible for users and contributors, making primary data available to researchers, especially those in child development, linguistics, and audio engineering. HomeBank facilitates researchers' access to large-scale data and tools, linking the acoustic, auditory, and linguistic characteristics of children's environments with variables including socioeconomic status, family characteristics, language trajectories, and disorders. Automated processing applied to daylong home audio recordings is now becoming widely used in early-intervention initiatives, helping parents to provide richer speech input to at-risk children.
Abstract:
Situational awareness is achieved naturally by the human senses of sight and hearing working in combination. Automatic scene understanding aims to replicate this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstraction. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than in comparable reported work, which is usually limited to round-table meetings. The system is relatively simple, consisting of just four microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single-modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single- and multi-modality tracking. The performance improvement given by audio–video integration and fusion is quantified in terms of tracking precision and accuracy, as well as speaker diarisation error rate and precision–recall (recognition). Improvements over the closest related work are a 56% reduction in sound source localisation computational cost relative to an audio-only system, an 8% improvement in speaker diarisation error rate over an audio-only speaker recognition unit, and a 36% gain on the precision–recall metric over an audio–video dominant-speaker recognition method.
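As a toy illustration of combining estimates from the two modalities (not the paper's actual tracker; the variances and measurements below are invented), a standard inverse-variance weighted fusion of an audio localisation estimate and a video one looks like this:

```python
# Inverse-variance fusion of two noisy 2-D position estimates of a speaker:
# one from audio localisation (e.g., microphone-pair delays), one from video.
import numpy as np

def fuse(pos_audio, var_audio, pos_video, var_video):
    """Weight each modality by the inverse of its (scalar) variance."""
    w_a, w_v = 1.0 / var_audio, 1.0 / var_video
    fused = (w_a * pos_audio + w_v * pos_video) / (w_a + w_v)
    fused_var = 1.0 / (w_a + w_v)   # fused estimate is more certain than either
    return fused, fused_var

pos_audio = np.array([2.1, 0.9])   # metres; audio is coarse but occlusion-proof
pos_video = np.array([1.8, 1.0])   # video is precise but fails under occlusion
print(fuse(pos_audio, 0.25, pos_video, 0.04))
```

Weighting by confidence is what lets a fused tracker degrade gracefully when one modality fails, e.g., video under occlusion or audio during cross-talk, which is the qualitative behaviour the paper reports.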