899 results for Audio signals


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Universidade Estadual de Campinas. Faculdade de Educação Física

Relevance:

30.00% 30.00%

Publisher:

Abstract:

One of the goals in the field of Music Information Retrieval is to obtain a measure of similarity between two musical recordings. Such a measure is at the core of automatic classification, query, and retrieval systems, which have become a necessity due to the ever increasing availability and size of musical databases. This paper proposes a method for calculating a similarity distance between two music signals. The method extracts a set of features from the audio recordings, models the features, and determines the distance between models. While further work is needed, preliminary results show that the proposed method has the potential to be used as a similarity measure for musical signals.
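The three steps named in the abstract above (extract features, model them, measure the distance between models) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which features, model, or distance the paper uses, so the log-energy and zero-crossing-rate features, the diagonal Gaussian model, and the symmetrised Kullback-Leibler divergence are all stand-ins, and every function name is hypothetical.

```python
import math

def frame_features(signal, frame_len=256):
    """Per-frame log-energy and zero-crossing rate (illustrative features only)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((math.log(energy + 1e-12), crossings / (frame_len - 1)))
    return feats

def fit_gaussian(feats, floor=1e-6):
    """Assumed model: diagonal Gaussian (per-dimension mean and floored variance)."""
    n = len(feats)
    dims = len(feats[0])
    means = [sum(f[d] for f in feats) / n for d in range(dims)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in feats) / n, floor)
        for d in range(dims)
    ]
    return means, variances

def symmetric_kl(model_a, model_b):
    """Assumed distance: KL(a||b) + KL(b||a) for two diagonal Gaussians."""
    (mean_a, var_a), (mean_b, var_b) = model_a, model_b
    total = 0.0
    for m1, v1, m2, v2 in zip(mean_a, var_a, mean_b, var_b):
        diff2 = (m1 - m2) ** 2
        total += 0.5 * (v1 / v2 + v2 / v1 - 2.0) + 0.5 * diff2 * (1.0 / v1 + 1.0 / v2)
    return total

# Two synthetic "recordings": a 200 Hz tone and a 2000 Hz tone sampled at 8 kHz.
low_tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(4096)]
high_tone = [math.sin(2 * math.pi * 2000 * n / 8000) for n in range(4096)]
distance = symmetric_kl(fit_gaussian(frame_features(low_tone)),
                        fit_gaussian(frame_features(high_tone)))
```

A recording compared with itself gives a distance of zero, and the value grows as the feature distributions move apart. A real system would presumably use richer spectral features and models than this toy pair.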

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Many audio watermarking schemes divide the audio signal into several blocks such that part of the watermark is embedded into each of them. One of the key issues in these block-oriented watermarking schemes is to preserve the synchronisation, i.e. to recover the exact position of each block in the mark recovery process. In this paper, a novel time domain synchronisation technique is presented together with a new blind watermarking scheme which works in the Discrete Fourier Transform (DFT or FFT) domain. The combined scheme provides excellent imperceptibility results whilst achieving robustness against typical attacks. Furthermore, the execution of the scheme is fast enough to be used in real-time applications. The excellent transparency of the embedding algorithm makes it particularly useful for professional applications, such as the embedding of monitoring information in broadcast signals. The scheme is also compared with some recent results of the literature.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The subject of the thesis was the digital audio broadcasting technology developed in the Eureka project 147. The research was based on the literature on the subject. At first, some reasons for the digitisation of broadcasting technology were given. Next, the channel multiplexing and channel coding methods employed by digital radio were discussed. The design of these methods is based on certain phenomena related to the propagation of radio-frequency signals, and these phenomena were also described. After that, audio and data transfer mechanisms as well as the structure of digital radio network were explained. Furthermore, digital audio and data services were considered. Finally, the digital radio was examined from marketing and administrative aspects. From a merely technical point of view, the digital radio technology offers several improvements in comparison with analogue technology. However, the digital radio has not become as widespread as it was perhaps originally expected during its development.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis studies high-dimensional sequence models based on recurrent neural networks (RNNs) and their application to music and speech. Although in principle RNNs can represent the long-term dependencies and complex temporal dynamics of sequences of interest such as video, audio and natural language, they have not been used to their full potential since their introduction by Rumelhart et al. (1986a) because of the difficulty of training them efficiently by gradient descent. Recently, the successful application of Hessian-free optimization and other advanced training techniques has led to a resurgence of their use in several state-of-the-art systems. The work of this thesis is part of this development. The central idea is to exploit the flexibility of RNNs to learn a probabilistic description of sequences of symbols, i.e. high-level information associated with the observed signals, which in turn can serve as a prior to improve the accuracy of information retrieval. For example, by modelling the evolution of groups of notes in polyphonic music, of chords in a harmonic progression, of phonemes in a spoken utterance or of individual sources in an audio mixture, we can significantly improve methods for polyphonic transcription, chord recognition, speech recognition and audio source separation respectively. The practical application of our models to these tasks is detailed in the last four articles presented in this thesis. In the first article, we replace the output layer of an RNN with conditional restricted Boltzmann machines to describe much richer multimodal output distributions. In the second article, we evaluate and propose advanced methods for training RNNs. 
In the last four articles, we examine different ways of combining our symbolic models with deep networks and non-negative matrix factorization, notably through products of experts, input/output architectures and generative frameworks that generalize hidden Markov models. We also propose and analyse efficient inference methods for these models, such as chronological greedy search, high-dimensional beam search, pruned beam search and gradient descent. Finally, we address the issues of label bias, teacher forcing, temporal smoothing, regularization and pre-training.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Machine tool chatter is an unfavorable phenomenon during metal cutting that results in heavy vibration of the cutting tool. As the depth of cut increases, the cutting regime changes from chatter-free cutting to cutting with chatter. In this paper, we propose the use of permutation entropy (PE), a conceptually simple and computationally fast measure, to detect the onset of chatter from the time series of a sound signal recorded with a unidirectional microphone. PE can efficiently distinguish the regular and complex nature of any signal and extract information about the dynamics of the process by indicating a sudden change in its value. In situations where the data sets are huge and there is no time for preprocessing and fine-tuning, PE can effectively detect dynamical changes of the system. This makes PE an ideal choice for online detection of chatter, which is not possible with other conventional nonlinear methods. In the present study, the variation of PE under two cutting conditions is analyzed. An abrupt variation in the value of PE as the depth of cut increases indicates the onset of chatter vibrations. The results are verified using frequency spectra of the signals and a nonlinear measure, the normalized coarse-grained information rate (NCIR).
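For readers unfamiliar with the measure, a minimal sketch of normalised permutation entropy is given here, following the usual ordinal-pattern definition. The embedding order, the delay, and the function name are illustrative choices, not values taken from the paper.

```python
import math
import random
from collections import Counter

def permutation_entropy(series, order=3, delay=1):
    """Normalised permutation entropy in [0, 1] from ordinal patterns."""
    n_patterns = len(series) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("series is too short for this order and delay")
    counts = Counter()
    for i in range(n_patterns):
        window = [series[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    entropy = -sum((c / n_patterns) * math.log(c / n_patterns)
                   for c in counts.values())
    # Divide by log(order!) so that a uniform pattern distribution gives 1.
    return entropy / math.log(math.factorial(order))

# A monotonic ramp shows a single pattern; uniform noise shows all of them.
ramp = list(range(1000))
rng = random.Random(0)
noise = [rng.random() for _ in range(5000)]
pe_ramp = permutation_entropy(ramp)
pe_noise = permutation_entropy(noise)
```

Regular signals give values near 0 and irregular ones values near 1. For online monitoring, the function could be applied to successive windows of the microphone signal and an abrupt change in the value flagged.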

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Timely detection of sudden changes in dynamics that adversely affect the performance of systems and the quality of products has great scientific relevance. This work focuses on effective detection of dynamical changes in real-time signals from mechanical as well as biological systems using the fast and robust technique of permutation entropy (PE). The results are used to detect chatter onset in machine turning and to identify vocal disorders from speech signals. Permutation entropy is a nonlinear complexity measure that can efficiently distinguish the regular and complex nature of any signal and extract information about changes in the dynamics of a process by indicating a sudden change in its value. Here we propose the use of PE to detect dynamical changes in two nonlinear processes: turning, a mechanical system, and speech, a biological system. The effectiveness of PE in detecting changes in the dynamics of the turning process is studied from time series generated with samples of audio and current signals. Experiments are carried out on a lathe for a sudden increase and a continuous increase in depth of cut on mild steel workpieces, keeping the speed and feed rate constant. The results are applied to detect chatter onset in machining and are verified using frequency spectra of the signals and a nonlinear measure, the normalized coarse-grained information rate (NCIR). PE analysis is also carried out to investigate the variation in surface texture caused by chatter on the machined workpiece. A statistical parameter from the optical grey-level intensity histogram of the laser speckle pattern, recorded using a charge-coupled device (CCD) camera, is used to generate the time series required for PE analysis. A standard optical roughness parameter is used to confirm the results. The application of PE in identifying vocal disorders is studied from speech signals recorded using a microphone. 
Here the analysis is carried out using speech signals of subjects with different pathological conditions and of normal subjects, and the results are used for identifying vocal disorders. The standard linear technique of the FFT is used to substantiate the results. The results of PE analysis in all three cases clearly indicate that this complexity measure is sensitive to changes in the regularity of a signal and hence can suitably be used for detection of dynamical changes in real-world systems. This work establishes the application of the simple, inexpensive and fast algorithm of PE for the benefit of advanced manufacturing processes as well as clinical diagnosis of vocal disorders.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This letter describes a novel algorithm, based on autoregressive decomposition and pole tracking, for recognizing two patterns of speech data: normal voice and dysphonic voice caused by nodules. The presented method relates the poles to the peaks of the signal spectrum, which represent the periodic components of the voice. The results show that the perturbation contained in the signal is clearly depicted by the pole positions, whose variability is related to jitter and shimmer. The pole dispersion for pathological voices is about 20% higher than for normal voices; therefore, the proposed approach is a more trustworthy measure than the classical ones. © 2007.
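To make the link between autoregressive poles and spectral peaks concrete, here is a small sketch that fits a second-order AR model to one frame with the Yule-Walker equations and reads a frequency off the pole angle. The model order, the estimator, and all function names are assumptions made for illustration. The letter's actual decomposition and tracking procedure is not described in the abstract.

```python
import cmath
import math

def autocorr(frame, lag):
    """Biased autocorrelation estimate at the given lag."""
    n = len(frame)
    return sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n

def ar2_poles(frame):
    """Poles of an AR(2) model x[n] = a1*x[n-1] + a2*x[n-2] + e[n]."""
    r0, r1, r2 = (autocorr(frame, k) for k in range(3))
    den = r0 * r0 - r1 * r1
    # Closed-form solution of the 2x2 Yule-Walker system.
    a1 = r1 * (r0 - r2) / den
    a2 = (r0 * r2 - r1 * r1) / den
    # The poles are the roots of z^2 - a1*z - a2 = 0.
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0

def pole_frequency(pole, fs):
    """Frequency in Hz corresponding to the angle of a pole."""
    return abs(cmath.phase(pole)) * fs / (2.0 * math.pi)

# A 1000 Hz cosine sampled at 8 kHz should put the pole pair near 1000 Hz.
fs = 8000
frame = [math.cos(2 * math.pi * 1000 * n / fs) for n in range(4000)]
pole, conjugate = ar2_poles(frame)
estimated_hz = pole_frequency(pole, fs)
```

Repeating the fit on consecutive frames and measuring how much the pole positions scatter would give one possible dispersion measure. Real voice signals would need a higher model order than the two coefficients used here.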

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This article describes a series of experiments which were carried out to measure the sense of presence in auditory virtual environments. Within the study a comparison of self-created signals to signals created by the surrounding environment is drawn. Furthermore, it is investigated if the room characteristics of the simulated environment have consequences on the perception of presence during vocalization or when listening to speech. Finally the experiments give information about the influence of background signals on the sense of presence. In the experiments subjects rated the degree of perceived presence in an auditory virtual environment on a perceptual scale. It is described which parameters have the most influence on the perception of presence and which ones are of minor influence. The results show that on the one hand an external speaker has more influence on the sense of presence than an adequate presentation of one’s own voice. On the other hand both room reflections and adequately presented background signals significantly increase the perceived presence in the virtual environment.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Once networked audio technology (data networks, current protocols, etc.) has been introduced, the audio system installation is designed, taking as its starting point the creative side of the activity held in the installation: a game in which auditory communication is fundamental. The installation will consist of a central room, three group rooms, three actor-booth rooms and eight passage rooms. Such a particular activity calls for special configurations, equipment and ways of working which, through audio-over-data-network technology and the equipment auxiliary to that network, can be realised optimally while meeting all the objectives of the activity, both technical and game-related. The book is divided into two parts. The first part explains what data networks are and the basics needed to understand them from a practical point of view: what Ethernet is, the components of a network, and so on. Once the specific network terminology has been explained, the protocols currently used to transmit professional audio are presented. The second part begins by presenting the activity to be held in the installation: a role-playing game. The existing signal flow is then described, after which what was learned in the first part is put into practice by designing an audiovisual installation using audio networking. A system of this kind needs conventional audio systems in addition to networked devices. During the design, and because of the very specific needs of the installation, it became necessary to devise special systems to make possible the activity for which the installation was conceived. The goals of this project are to develop the points that a system integrator would have to consider when designing a networked audio system for an audiovisual installation, and then to put this knowledge into practice by presenting the design of an installation that will host a recreational and learning activity in which optimal real-time transmission of the audio signal is fundamental.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Federal Highway Administration, Office of Research, Washington, D.C.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In an audio cueing system, a teacher is presented with randomly spaced auditory signals via tape recorder or intercom. The teacher is instructed to praise a child who is on-task each time the cue is presented. In this study, a baseline was obtained on the teacher's praise rate and the children's on-task behaviour in a Grade 5 class of 37 students. Children were then divided into high, medium and low on-task groups. Following baseline, the teacher's praise rate and the children's on-task behaviour were observed under the following successively implemented conditions: (1) Audio Cueing 1: Audio cueing at a rate of 30 cues per hour was introduced into the classroom and remained in effect during subsequent conditions. A group of consistently low on-task children was delineated. (2) Audio Cueing Plus 'focus praise package': Instructions to direct two-thirds of the praise to children identified by the experimenter (consistently low on-task children), feedback and experimenter praise for meeting or surpassing the criterion distribution of praise ('focus praise package') were introduced. (3) Audio Cueing 2: The 'focus praise package' was removed. (4) Audio Cueing Plus 'increase praise package': Instructions to increase the rate of praise, feedback and experimenter praise for improved praise rates ('increase praise package') were introduced. The primary aims of the study were to determine the distribution of praise among high, medium and low on-task children when audio cueing was first introduced and to investigate the effect of the 'focus praise package' on the distribution of teacher praise. The teacher distributed her praise evenly among the high, medium and low on-task groups during audio cueing 1. The effect of the 'focus praise package' was to increase the percentage of praise received by the consistently low on-task children. Other findings tended to suggest that audio cueing increased the teacher's praise rate. 
However, the teacher's praise rate unexpectedly decreased to a level considerably below the cued rate during audio cueing 2. The 'increase praise package' appeared to increase the teacher's praise rate above the audio cueing 2 level. The effects of an increased praise rate and two distributions of praise on on-task behaviour were considered. Significant increases in on-task behaviour were found in audio cueing 1 for the low on-task group, in the audio cueing plus 'focus praise package' condition for the entire class and the consistently low on-task group, and in audio cueing 2 for the medium on-task group. Except for the high on-task children, who did not change, the effects of the experimental manipulations on on-task behaviour were equivocal. However, there were some indications that directing 67% of the praise to the consistently low on-task children was more effective for increasing this group's on-task behaviour than distributing praise equally among on-task groups.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstractions. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than the comparable reported work, which is usually limited to round table meetings. The system is relatively simple: consisting of just 4 microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single and multi-modality tracking. The performance improvement given by the audio–video integration and fusion is quantified in terms of tracking precision and accuracy as well as speaker diarisation error rate and precision–recall (recognition). Improvements vs. the closest works are evaluated: 56% sound source localisation computational cost over an audio only system, 8% speaker diarisation error rate over an audio only speaker recognition unit and 36% on the precision–recall metric over an audio–video dominant speaker recognition method.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This article introduces the genre of a digital audio game and discusses selected play interaction solutions implemented in the Audio Game Hub, a prototype designed and evaluated in the years 2014 and 2015 at the Gamification Lab at Leuphana University Lüneburg. The Audio Game Hub constitutes a set of familiar playful activities (aiming at a target, reflex-based reacting to sound signals, labyrinth exploration) and casual games (e.g. Tetris, Memory) adapted to the digital medium and converted into the audio sphere, where the player is guided predominantly or solely by sound. The authors will discuss the design questions raised at early stages of the project, and confront them with the results of user experience testing performed on two groups of sighted and one group of visually impaired gamers.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Common bottlenose dolphins (Tursiops truncatus) produce a wide variety of vocal emissions for communication and echolocation, of which the pulsed repertoire has been the most difficult to categorize. Packets of high-repetition, broadband pulses are still largely reported under a general designation of burst-pulses, and traditional attempts to classify these emissions rely mainly on their aural characteristics and on graphical aspects of spectrograms. Here, we present a quantitative analysis of pulsed signals emitted by wild bottlenose dolphins in the Sado estuary, Portugal (2011-2014), and test the reliability of a traditional classification approach. Acoustic parameters (minimum frequency, maximum frequency, peak frequency, duration, repetition rate and inter-click-interval) were extracted from 930 pulsed signals, previously categorized using a traditional approach. Discriminant function analysis revealed a high reliability of the traditional classification approach (93.5% of pulsed signals were consistently assigned to their aurally based categories). According to the discriminant function analysis (Wilks' Λ = 0.11, F3, 2.41 = 282.75, P < 0.001), repetition rate is the feature that best enables the discrimination of different pulsed signals (structure coefficient = 0.98). Classification using hierarchical cluster analysis led to a similar categorization pattern: two main signal types with distinct magnitudes of repetition rate were clustered into five groups. The pulsed signals described here present significant differences in their time-frequency features, especially repetition rate (P < 0.001), inter-click-interval (P < 0.001) and duration (P < 0.001). We document the occurrence of a distinct signal type, short burst-pulses, and highlight the existence of a diverse repertoire of pulsed vocalizations emitted in graded sequences. 
The use of quantitative analysis of pulsed signals is essential to improve classifications and to better assess the contexts of emission, geographic variation and the functional significance of pulsed signals.
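Two of the temporal parameters listed in the abstract above, inter-click-interval and repetition rate, can be computed from detected click times as in the sketch below. The definitions used (interval between consecutive clicks, and number of intervals per second of signal) are common conventions assumed here, not taken from the study, and the function names and click times are hypothetical.

```python
def inter_click_intervals(click_times):
    """Time differences in seconds between consecutive clicks."""
    times = sorted(click_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]

def repetition_rate(click_times):
    """Clicks per second, taken as intervals divided by total duration."""
    times = sorted(click_times)
    if len(times) < 2:
        raise ValueError("at least two clicks are needed")
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("clicks must span a positive duration")
    return (len(times) - 1) / duration

# Hypothetical click times in seconds for one pulsed signal.
clicks = [0.00, 0.01, 0.02, 0.03, 0.04]
intervals = inter_click_intervals(clicks)
rate = repetition_rate(clicks)
```

With these definitions the repetition rate is the reciprocal of the mean inter-click-interval, so the two parameters carry closely related information for evenly spaced pulse trains.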