Digital Library

30 results for Voice.

The Linear Transformation of LF Glottal Waveforms for Voice Conversion

Relevance: 20.00%

The use of syntax and multiple alternatives in the VODIS voice operated database inquiry system

Relevance: 20.00%

Abstract:

The paper describes the architecture of VODIS, a voice operated database inquiry system, and presents some experiments which investigate the effects on performance of varying the level of a priori syntactic constraints. The VODIS system includes a novel mechanism for incorporating context-free grammatical constraints directly into the word recognition algorithm. This allows the degree of a priori constraint to be smoothly varied and provides for the controlled generation of multiple alternatives. The results show that when the spoken input deviates from the predefined task grammar, a combination of weak a priori syntax rules in conjunction with full a posteriori parsing on a lattice of alternative word matches provides the most robust recognition performance. © 1991.

The design and implementation of dialogue control in voice operated database inquiry systems

Relevance: 20.00%

Abstract:

This paper describes work performed as part of the U.K. Alvey-sponsored Voice Operated Database Inquiry System (VODIS) project in the area of intelligent dialogue control. The principal aims of the work were to develop a habitable interface for the untrained user; to investigate the degree to which dialogue control can be used to compensate for deficiencies in recognition performance; and to examine the requirements on dialogue control for generating natural speech output. A data-driven methodology is described, based on the use of frames in which dialogue topics are organized hierarchically. The concept of a dynamically adjustable scope is introduced to permit adaptation to recognizer performance, and the use of historical and hierarchical contexts is described to facilitate the construction of contextually relevant output messages. © 1989.

Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis

Relevance: 20.00%

Abstract:

In current methods for voice transformation and speech synthesis, the vocal tract filter is usually assumed to be excited by a flat amplitude spectrum. In this article, we present a method using a mixed source model defined as a mixture of the Liljencrants-Fant (LF) model and Gaussian noise. Because it uses the LF model, the approach is close to a vocoder with exogenous input, such as ARX-based methods or the Glottal Spectral Separation (GSS) method. Such approaches are dedicated to voice processing and promise improved naturalness compared to generic signal models. To estimate the Vocal Tract Filter (VTF) by spectral division, as in GSS, we show that a glottal source model can be used with any envelope estimation method, unlike the ARX approach, where a least-squares AR solution is used. We therefore derive a VTF estimate which takes into account the amplitude spectra of both the deterministic and random components of the glottal source. The proposed mixed source model is controlled by a small set of intuitive and independent parameters. The relevance of this voice production model is evaluated, through listening tests, in the context of resynthesis, HMM-based speech synthesis, breathiness modification and pitch transposition. © 2012 Elsevier B.V. All rights reserved.

Unsupervised Clustering of Emotion and Voice Styles for Expressive TTS

Relevance: 20.00%

Gaussian Process Experts for Voice Conversion

Relevance: 20.00%

Stability Analysis of a Max-Min Fair Rate Control Protocol (RCP) in a Small Buffer Regime

Relevance: 10.00%

Stochastically Scalable Flow Control

Relevance: 10.00%

Stability and fairness of explicit congestion control with small buffers

Relevance: 10.00%

Robust noise reduction for speech and audio signals

Relevance: 10.00%

Abstract:

Statistical model-based methods are presented for the reconstruction of autocorrelated signals in impulsive plus continuous noise environments. Signals are modelled as autoregressive and noise sources as discrete and continuous mixtures of Gaussians, allowing for robustness in highly impulsive and non-Gaussian environments. Markov Chain Monte Carlo methods are used for reconstruction of the corrupted waveforms within a Bayesian probabilistic framework and results are presented for contaminated voice and audio signals.
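
The signal and noise model named in this abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' method: it only generates data from an autoregressive signal corrupted by a two-component Gaussian mixture (a narrow background component plus a rare, wide impulsive component) and does not implement the MCMC reconstruction. The function names, AR coefficients, and noise parameters are all illustrative choices.

```python
import random


def simulate_ar(coeffs, n, excitation_std, rng):
    """Generate n samples of x[t] = sum_k coeffs[k] * x[t-1-k] + e[t].

    e[t] is zero-mean Gaussian excitation; initial conditions are zero.
    """
    order = len(coeffs)
    x = [0.0] * order
    for _ in range(n):
        past = x[-order:][::-1]  # x[t-1], x[t-2], ...
        value = sum(a * p for a, p in zip(coeffs, past))
        x.append(value + rng.gauss(0.0, excitation_std))
    return x[order:]


def add_mixture_noise(signal, impulse_prob, background_std, impulse_std, rng):
    """Corrupt a signal with noise drawn from a two-component Gaussian mixture.

    With probability impulse_prob a sample receives wide "impulsive" noise,
    otherwise narrow "continuous" background noise. Returns the noisy signal
    and the per-sample impulse indicators.
    """
    noisy, is_impulse = [], []
    for s in signal:
        hit = rng.random() < impulse_prob
        std = impulse_std if hit else background_std
        noisy.append(s + rng.gauss(0.0, std))
        is_impulse.append(hit)
    return noisy, is_impulse


# Illustrative parameters (assumptions): a stable AR(2) process with 5% impulses.
rng = random.Random(0)
clean = simulate_ar([1.3, -0.6], 200, 1.0, rng)
noisy, flags = add_mixture_noise(clean, 0.05, 0.1, 5.0, rng)
```

A reconstruction method would then try to recover `clean` from `noisy` without access to `flags`.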

A neural network speech recogniser for directory access applications

Relevance: 10.00%

Gaussian processes for fast policy optimisation of POMDP-based dialogue managers

Relevance: 10.00%

Abstract:

Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is to maintain tractability, so the use of approximation is inevitable. We propose applying Gaussian Processes in reinforcement learning of optimal POMDP dialogue policies, in order (1) to make the learning process faster and (2) to obtain an estimate of the uncertainty of the approximation. We first demonstrate the idea on a simple voice mail dialogue task and then apply this method to a real-world tourist information dialogue task. © 2010 Association for Computational Linguistics.
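
The second benefit listed in this abstract, an uncertainty estimate for the approximation, is a general property of Gaussian Process regression. The following is a minimal sketch of plain GP regression on a one-dimensional toy problem, not the paper's reinforcement learning algorithm; the squared-exponential kernel, its hyperparameters, and all function names are assumptions made for illustration.

```python
import math


def rbf(a, b, length=1.0, signal_var=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_posterior(xs, ys, x_star, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at a test input x_star."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star) for a in xs]
    alpha = solve(K, ys)        # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_star, x_star) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


# Toy observations (assumed): the variance is small near the data
# and returns to the prior variance far away from it.
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
mean_near, var_near = gp_posterior(xs, ys, 1.0)
mean_far, var_far = gp_posterior(xs, ys, 10.0)
```

In a policy-learning setting the inputs would be belief-state and action features rather than scalars, and the variance could be used to decide where further exploration is worthwhile.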

Foresight vehicle programme - Customer understanding processes in design

Relevance: 10.00%

Abstract:

Customer feedback is normally fed into product design and engineering via quality surveys and therefore mainly comprises negative comments: complaints about things gone wrong. Whilst eradication of such problems will result in a feeling of satisfaction in existing customers, it will not instil the sense of delight required to attract conquest buyers. CUPID's aim is to conceive and evaluate ideas to stimulate product desirability through the provision of delightful features and execution. By definition, surprise and delight features cannot be foreseen, so we have to understand sensory appeal and, therefore, the "hidden" voice of the customer. Copyright © 2002 Society of Automotive Engineers, Inc.

Continuous ASR for flexible incremental dialogue

Relevance: 10.00%

Abstract:

Spoken dialogue systems provide a convenient way for users to interact with a machine using only speech. However, they often rely on a rigid turn-taking regime in which a voice activity detection (VAD) module is used to determine when the user is speaking and to decide when it is appropriate for the system to respond. This paper investigates replacing the VAD and discrete utterance recogniser of a conventional turn-taking system with a continuously operating recogniser that is always listening, and using the recogniser 1-best path to guide turn-taking. In this way, a flexible framework for incremental dialogue management is possible. Experimental results show that it is possible to remove the VAD component and successfully use the recogniser best path to identify user speech, with more robustness to noise, potentially lower latency, and a reduction in overall recognition error rate compared to the conventional approach. © 2013 IEEE.

AI in the 21st century - With historical reflections

Relevance: 10.00%

Abstract:

The discipline of Artificial Intelligence (AI) was born in the summer of 1956 at Dartmouth College in Hanover, New Hampshire. Half of a century has passed, and AI has turned into an important field whose influence on our daily lives can hardly be overestimated. The original view of intelligence as a computer program - a set of algorithms to process symbols - has led to many useful applications now found in internet search engines, voice recognition software, cars, home appliances, and consumer electronics, but it has not yet contributed significantly to our understanding of natural forms of intelligence. Since the 1980s, AI has expanded into a broader study of the interaction between the body, brain, and environment, and how intelligence emerges from such interaction. This advent of embodiment has provided an entirely new way of thinking that goes well beyond artificial intelligence proper, to include the study of intelligent action in agents other than organisms or robots. For example, it supplies powerful metaphors for viewing corporations, groups of agents, and networked embedded devices as intelligent and adaptive systems acting in highly uncertain and unpredictable environments. In addition to giving us a novel outlook on information technology in general, this broader view of AI also offers unexpected perspectives into how to think about ourselves and the world around us. In this chapter, we briefly review the turbulent history of AI research, point to some of its current trends, and to challenges that the AI of the 21st century will have to face. © Springer-Verlag Berlin Heidelberg 2007.
