Digital Library

313 results for audio processing

Statistical parametric speech synthesis based on speaker and language factorization

Relevance:

20.00%

Publisher:

Abstract:

An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. © 2012 IEEE.

Hidden Markov models in speech and language processing

Relevance:

20.00%

Publisher:

Bayesian interpolation in a dynamic sinusoidal model with application to packet-loss concealment

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we consider Bayesian interpolation and parameter estimation in a dynamic sinusoidal model. This model is more flexible than the static sinusoidal model since it enables the amplitudes and phases of the sinusoids to be time-varying. For the dynamic sinusoidal model, we derive a Bayesian inference scheme for the missing observations, hidden states and model parameters of the dynamic model. The inference scheme is based on a Markov chain Monte Carlo method known as the Gibbs sampler. We illustrate the performance of the inference scheme by applying it to packet-loss concealment of lost audio and speech packets. © EURASIP, 2010.
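To make the packet-loss concealment setting concrete, the sketch below fills a gap using the simpler static sinusoidal model that the abstract contrasts with its dynamic one. It is not the paper's Bayesian/Gibbs scheme: it assumes a single sinusoid of known frequency with constant amplitude and phase, and fits it by ordinary least squares to the samples that were received. The function names `fit_sinusoid` and `conceal`, and the convention of marking lost samples with `None`, are illustrative assumptions.

```python
import math


def fit_sinusoid(samples, freq, rate):
    """Least-squares fit of a*cos(w*n) + b*sin(w*n) to the received samples.

    `samples` is a list of floats in which lost samples are None
    (an assumed convention for this sketch). Returns (a, b).
    """
    w = 2.0 * math.pi * freq / rate
    scc = scs = sss = syc = sys_ = 0.0
    for n, y in enumerate(samples):
        if y is None:
            continue  # lost sample: contributes nothing to the fit
        c, s = math.cos(w * n), math.sin(w * n)
        scc += c * c
        scs += c * s
        sss += s * s
        syc += y * c
        sys_ += y * s
    # Solve the 2x2 normal equations [[scc, scs], [scs, sss]] [a, b] = [syc, sys_].
    det = scc * sss - scs * scs
    a = (syc * sss - sys_ * scs) / det
    b = (sys_ * scc - syc * scs) / det
    return a, b


def conceal(samples, freq, rate):
    """Replace lost samples (None) with values predicted by the fitted sinusoid."""
    a, b = fit_sinusoid(samples, freq, rate)
    w = 2.0 * math.pi * freq / rate
    return [
        y if y is not None else a * math.cos(w * n) + b * math.sin(w * n)
        for n, y in enumerate(samples)
    ]


# Usage: a 440 Hz tone sampled at 8 kHz with samples 60..99 lost.
rate, freq = 8000, 440.0
w0 = 2.0 * math.pi * freq / rate
clean = [0.8 * math.cos(w0 * n + 0.3) for n in range(160)]
lossy = [None if 60 <= n < 100 else y for n, y in enumerate(clean)]
restored = conceal(lossy, freq, rate)
```

Because amplitude and phase are held fixed here, this baseline only works for stationary tones. The dynamic model described in the abstract lets both vary over time, which is why it requires inference over hidden states rather than a single closed-form fit.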

An automatic speaker recognition system for intelligence applications

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an automatic speaker recognition system for intelligence applications. The system has to provide functionalities for a speaker skimming application in which databases of recorded conversations belonging to an ongoing investigation can be annotated and quickly browsed by an operator. The paper discusses the difficulties introduced by the characteristics of the audio signals under consideration, in particular background noise and channel/coding distortions, as well as the requirements and functionalities of the system under development. It is shown that the performance of state-of-the-art approaches degrades significantly in the presence of moderately high background noise. Finally, a novel speaker recognizer based on phonetic features and an ensemble classifier is presented. Results show that the proposed approach improves performance on clean audio, and suggest that it can be employed towards improved real-world robustness. © EURASIP, 2009.

Methods and Apparatus for Processing Image Streams

Relevance:

20.00%

Publisher:

Speaker and Noise Factorization for Robust Speech Recognition

Relevance:

20.00%

Publisher:

Development of a Transformable Plasma Device for Materials Processing

Relevance:

20.00%

Publisher:

Variation in Carbon Nanotube Polymer Composite Conductivity from the Effects of Processing, Dispersion, Aging and Sample Size

Relevance:

20.00%

Publisher:

Modulation of somatosensory processing by action.

Relevance:

20.00%

Publisher:

Abstract:

Psychophysical evidence suggests that sensations arising from our own movements are diminished when predicted by motor forward models and that these models may also encode the timing and intensity of movement. Here we report a functional magnetic resonance imaging study in which the effects on sensation of varying the occurrence, timing and force of movements were measured. We observed that tactile-related activity in a region of secondary somatosensory cortex is reduced when sensation is associated with movement and further that this reduction is maximal when movement and sensation occur synchronously. Motor force is not represented in the degree of attenuation but rather in the magnitude of this region's response. These findings provide neurophysiological correlates of previously-observed behavioural forward-model phenomena, and advocate the adopted approach for the study of clinical conditions in which forward-model deficits have been posited to play a crucial role.

Production and processing of graphene and 2d crystals

Relevance:

20.00%

Publisher:

Abstract:

Graphene is at the center of an ever-growing research effort due to its unique properties, interesting for both fundamental science and applications. A key requirement for applications is the development of industrial-scale, reliable, inexpensive production processes. Here we review the state of the art of graphene preparation, production, placement and handling. Graphene is just the first of a new class of two-dimensional materials, derived from layered bulk crystals. Most of the approaches used for graphene can be extended to these crystals, accelerating their journey towards applications. © 2012 Elsevier Ltd.

Graphics processing unit cooling solutions: Acoustic characteristics

Relevance:

20.00%

Publisher:

Abstract:

The change in the acoustic characteristics of systems ranging from personal computers to console gaming and home entertainment systems when the Graphics Processing Unit (GPU) is changed is presented. The tests are carried out using identical configurations of the software and system hardware. The main hardware components used in the project are the central processing unit, motherboard, hard disc drive, memory, power supply, optical drive, and additional cooling system. The results from the measurements taken for each GPU tested are analyzed and compared. The test results are obtained using a photo tachometer and reflective tape adhered to one particular fan blade. Loudness is a psychoacoustic metric developed by Zwicker and Fastl that aims to quantify how loud a sound is perceived to be compared with a standard sound. The acoustic experiment reveals that the inherent noise generation increases with the complexity of the cooling solution.

Vowel normalisation: Time-domain processing of the internal dynamics of speech

Relevance:

20.00%

Publisher:

Abstract:

Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the 'same' vowel would differ enormously. The differences are mainly due to differences in vocal tract length (VTL) and glottal pulse rate (GPR), which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically, to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.

Autoregressive Models for Statistical Parametric Speech Synthesis

Relevance:

20.00%

Publisher:

Structured SVMs for Automatic Speech Recognition

Relevance:

20.00%

Publisher:

Confidence in beliefs about pain predicts expectancy effects on pain perception and anticipatory processing in right anterior insula.

Relevance:

20.00%

Publisher:

Abstract:

Psychological factors play a major role in exacerbating chronic pain. Effective self-management of pain is often hindered by inaccurate beliefs about the nature of pain which lead to a high degree of emotional reactivity. Probabilistic models of perception state that greater confidence (certainty) in beliefs increases their influence on perception and behavior. In this study, we treat confidence as a metacognitive process dissociable from the content of belief. We hypothesized that confidence is associated with anticipatory activation of areas of the pain matrix involved with top-down modulation of pain. Healthy volunteers rated their beliefs about the emotional distress that experimental pain would cause, and separately rated their level of confidence in this belief. Confidence predicted the influence of anticipation cues on experienced pain. We measured brain activity during anticipation of pain using high-density EEG and used electromagnetic tomography to determine neural substrates of this effect. Confidence correlated with activity in right anterior insula, posterior midcingulate and inferior parietal cortices during the anticipation of pain. Activity in the right anterior insula predicted a greater influence of anticipation cues on pain perception, whereas activity in right inferior parietal cortex predicted a decreased influence of anticipatory cues. The results support probabilistic models of pain perception and suggest that confidence in beliefs is an important determinant of expectancy effects on pain perception.
