33 results for text to scene conversion
in Cambridge University Engineering Department Publications Database
Abstract:
An 80 GSPS photonic ADC system is demonstrated, using broadband MLL and dispersive fibre to form a continuous waveform with time-wavelength mapping, and AWG to channelise. Tests are carried out for RF signals up to 10 GHz. © 2005 Optical Society of America.
Abstract:
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.
Abstract:
This paper presents an overview of the Text-to-Speech synthesis system developed at the Institute for Language and Speech Processing (ILSP). It focuses on the key issues regarding the design of the system components. The system currently fully supports three languages (Greek, English, Bulgarian) and is designed to be as language- and speaker-independent as possible. Experimental results are also presented which show that the system produces high-quality synthetic speech in terms of naturalness and intelligibility. The system was recently ranked among the first three systems worldwide in terms of achieved quality for the English language, at the international Blizzard Challenge 2013 workshop. © 2014 Springer International Publishing.
Abstract:
This paper describes a trainable method for generating letter-to-sound rules for the Greek language, for producing the pronunciation of out-of-vocabulary words. Several approaches have been adopted over the years for grapheme-to-phoneme conversion, such as hand-seeded rules, finite state transducers, neural networks, HMMs, etc.; nevertheless, it has been proven that the most reliable method is a rule-based one. Our approach is based on a semi-automatically pre-transcribed lexicon, from which we derived rules for automatic transcription. The efficiency and robustness of our method are demonstrated by experiments on out-of-vocabulary words, which resulted in over 98% accuracy on a word-based criterion.
Abstract:
In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results regarding the quality of the resulting system. As the initial corpus is a crucial factor for the quality delivered by the Text-to-Speech system, special effort has been put into designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system, and hence that of the corpus, is news reports, and although it is a restricted one, it is characterized by an unlimited vocabulary. The paper focuses on issues regarding the design of an optimal corpus for such a framework and the ideas on which our approach was based. A novel multi-stage approach is presented, with special attention given to language- and speaker-dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.
Abstract:
Several options of fuel assembly design are investigated for a BWR core operating in a closed self-sustainable Th-233U fuel cycle. The designs rely on an axially heterogeneous fuel assembly structure consisting of a single axial fissile zone "sandwiched" between two fertile blanket zones, in order to improve fertile to fissile conversion ratio. The main objective of the study was to identify the most promising assembly design parameters, dimensions of fissile and fertile zones, for achieving net breeding of 233U. The design challenge, in this respect, is that the fuel breeding potential is at odds with axial power peaking and the core minimum critical power ratio (CPR), hence limiting the maximum achievable core power rating. Calculations were performed with the BGCore system, which consists of the MCNP code coupled with fuel depletion and thermo-hydraulic feedback modules. A single 3-dimensional fuel assembly having reflective radial boundaries was modeled applying simplified restrictions on the maximum centerline fuel temperature and the CPR. It was found that axially heterogeneous fuel assembly design with a single fissile zone can potentially achieve net breeding, while matching conventional BWR core power rating under certain restrictions to the core loading pattern design. © 2013 Elsevier B.V. All rights reserved.
Abstract:
A micromachined electrometer, based on the concept of a variable capacitor, has been designed, modeled, fabricated, and tested. The device presented in this paper functions as a modulated variable capacitor, wherein a dc charge to be measured is up-modulated and converted to an ac voltage output, thus improving the signal-to-noise ratio. The device was fabricated in a commercial standard SOI micromachining process without the need for any additional processing steps. The electrometer was tested in both air and vacuum at room temperature. In air, it has a charge-to-voltage conversion gain of 2.06 nV/e, and a measured charge noise floor of 52.4 e/rtHz. To reduce the effects of input leakage current, an electrically isolated capacitor has been introduced between the variable capacitor and input to sensor electronics. Methods to improve the sensitivity and resolution are suggested while the long-term stability of these sensors is modeled and discussed. © 2006 IEEE.
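The two in-air figures quoted in this abstract can be combined with simple arithmetic. The sketch below is only an illustration: it assumes the conversion gain is linear and the charge noise is white over the measurement bandwidth, neither of which the abstract states, and all function names are hypothetical.

```python
import math

# Figures quoted in the abstract (measured in air).
GAIN_NV_PER_E = 2.06       # charge-to-voltage conversion gain, nV per electron
NOISE_E_PER_RTHZ = 52.4    # charge noise floor, electrons per sqrt(Hz)


def output_voltage_nv(n_electrons: float) -> float:
    """AC output amplitude in nV for a given input charge (assumes linear gain)."""
    return GAIN_NV_PER_E * n_electrons


def rms_charge_noise_e(bandwidth_hz: float) -> float:
    """RMS charge noise in electrons over a bandwidth (assumes white noise)."""
    return NOISE_E_PER_RTHZ * math.sqrt(bandwidth_hz)


def output_noise_density_nv_per_rthz() -> float:
    """Charge noise floor referred to the output, in nV per sqrt(Hz)."""
    return GAIN_NV_PER_E * NOISE_E_PER_RTHZ


if __name__ == "__main__":
    print(output_voltage_nv(1000))             # output for 1000 electrons, in nV
    print(rms_charge_noise_e(100.0))           # RMS noise in a 100 Hz bandwidth
    print(output_noise_density_nv_per_rthz())  # output-referred noise density
```

Under these assumptions, narrowing the measurement bandwidth lowers the RMS charge noise in proportion to the square root of the bandwidth.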
Abstract:
In a Text-to-Speech system based on time-domain techniques that employ pitch-synchronous manipulation of the speech waveforms, one of the most important issues affecting the output quality is the way the analysis points of the speech signal are estimated, and the resulting points themselves, i.e. the analysis pitchmarks. In this paper we present our methodology for calculating the pitchmarks of a speech waveform: a pitchmark detection algorithm which, after thorough experimentation and in comparison with other algorithms, proves to behave better with our TD-PSOLA-based Text-to-Speech synthesizer (Time-Domain Pitch-Synchronous Overlap-Add Text-to-Speech system).
Abstract:
In recent years, the use of morphological decomposition strategies for Arabic Automatic Speech Recognition (ASR) has become increasingly popular. Systems trained on morphologically decomposed data are often used in combination with standard word-based approaches, and they have been found to yield consistent performance improvements. The present article contributes to this ongoing research endeavour by exploring the use of the 'Morphological Analysis and Disambiguation for Arabic' (MADA) tools for this purpose. System integration issues concerning language modelling and dictionary construction, as well as the estimation of pronunciation probabilities, are discussed. In particular, a novel solution for morpheme-to-word conversion is presented which makes use of an N-gram Statistical Machine Translation (SMT) approach. System performance is investigated within a multi-pass adaptation/combination framework. All the systems described in this paper are evaluated on an Arabic large vocabulary speech recognition task which includes both Broadcast News and Broadcast Conversation test data. It is shown that the use of MADA-based systems, in combination with word-based systems, can reduce the Word Error Rates by up to 8.1% relative. © 2012 Elsevier Ltd. All rights reserved.
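A relative Word Error Rate reduction, as quoted in this abstract, is the absolute reduction divided by the baseline WER. The sketch below shows that calculation only; the function name and the WER values in the usage example are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the formula.
    print(relative_wer_reduction(20.0, 18.0))
```

Note that a relative figure says nothing about the absolute error rates, which the abstract does not report.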