35 results for Text to speech

in Aston University Research Archive


Relevance:

100.00%

Publisher:

Abstract:

This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular holistic framework. The framework is implemented using the Relational Tree (R-Tree) techniques (Ehrich and Foith, 1976). R-Tree is a sophisticated data structure that provides a multi-dimensional description of a waveform. A Skeletal Tree (S-Tree) is first generated using algorithms based on the tone phonological rules of SY. Subsequent steps update the S-Tree by computing the numerical values of the prosody dimensions. To implement the intonation dimension, fuzzy control rules were developed based on data from native speakers of Yorùbá. The Classification And Regression Tree (CART) and the Fuzzy Decision Tree (FDT) techniques were tested in modelling the duration dimension. The FDT was selected based on its better performance. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration and intonation, using different techniques and their subsequent integration. Our approach provides us with a flexible and extendible model that can also be used to implement, study and explain the theory behind aspects of the phenomena observed in speech prosody.

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we present syllable-based duration modelling in the context of a prosody model for Standard Yorùbá (SY) text-to-speech (TTS) synthesis applications. Our prosody model is conceptualised around a modular holistic framework. This framework is implemented using the Relational Tree (R-Tree) techniques. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration, intonation, and intensity, using different techniques and their subsequent integration. We applied the Fuzzy Decision Tree (FDT) technique to model the duration dimension. In order to evaluate the effectiveness of FDT in duration modelling, we have also developed a Classification And Regression Tree (CART) based duration model using the same speech data. Each of these models was integrated into our R-Tree based prosody model. We performed both quantitative (i.e. Root Mean Square Error (RMSE) and Correlation (Corr)) and qualitative (i.e. intelligibility and naturalness) evaluations on the two duration models. The results show that CART models the training data more accurately than FDT. The FDT model, however, shows a better ability to extrapolate from the training data since it achieved a better accuracy for the test data set. Our qualitative evaluation results show that our FDT model produces synthesised speech that is perceived to be more natural than our CART model. In addition, we also observed that the expressiveness of FDT is much better than that of CART. That is because the representation in FDT is not restricted to a set of piece-wise or discrete constant approximation. We, therefore, conclude that the FDT approach is a practical approach for duration modelling in SY TTS applications. © 2006 Elsevier Ltd. All rights reserved.
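The two quantitative measures named in this abstract, RMSE and correlation, can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function names are hypothetical, and the correlation is assumed to be the Pearson coefficient, which the abstract does not specify.

```python
import math


def rmse(predicted, observed):
    """Root Mean Square Error between predicted and observed durations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_corr(predicted, observed):
    """Pearson correlation (assumed here; the abstract only says 'Corr')."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    # Undefined when either series is constant (zero variance).
    return cov / math.sqrt(var_p * var_o)
```

A model that fits the training data closely but generalises poorly would show a low RMSE on the training set and a higher RMSE on the held-out test set, which is the pattern the abstract reports for CART relative to FDT.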

Relevance:

100.00%

Publisher:

Abstract:

This paper presents a novel intonation modelling approach and demonstrates its applicability using the Standard Yorùbá language. Our approach is motivated by the theory that abstract and realised forms of intonation and other dimensions of prosody should be modelled within a modular and unified framework. In our model, this framework is implemented using the Relational Tree (R-Tree) technique. The R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. Our R-Tree for an utterance is generated in two steps. First, the abstract structure of the waveform, called the Skeletal Tree (S-Tree), is generated using tone phonological rules for the target language. Second, the numerical values of the perceptually significant peaks and valleys on the S-Tree are computed using a fuzzy logic based model. The resulting points are then joined by applying interpolation techniques. The actual intonation contour is synthesised by Pitch Synchronous Overlap Technique (PSOLA) using the Praat software. We performed both quantitative and qualitative evaluations of our model. The preliminary results suggest that, although the model does not predict the numerical speech data as accurately as contemporary data-driven approaches, it produces synthetic speech with comparable intelligibility and naturalness. Furthermore, our model is easy to implement, interpret and adapt to other tone languages.
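The step in which the computed peak and valley points are joined by interpolation can be illustrated with a short sketch. The abstract does not say which interpolation technique was used, so linear interpolation is an assumption here, and the function name and the (time, F0) representation are hypothetical.

```python
from bisect import bisect_right


def interpolate_contour(anchors, query_times):
    """Join (time, f0) anchor points into a contour sampled at query_times.

    Assumption: straight-line interpolation between neighbouring anchors,
    holding the first and last values constant outside the anchor range.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchor points")
    anchors = sorted(anchors)
    times = [t for t, _ in anchors]
    values = [f for _, f in anchors]
    contour = []
    for t in query_times:
        if t <= times[0]:
            contour.append(values[0])
        elif t >= times[-1]:
            contour.append(values[-1])
        else:
            # Index of the last anchor at or before t.
            i = bisect_right(times, t) - 1
            frac = (t - times[i]) / (times[i + 1] - times[i])
            contour.append(values[i] + frac * (values[i + 1] - values[i]))
    return contour
```

For example, with made-up anchors at (0.0 s, 120 Hz), (0.5 s, 180 Hz) and (1.0 s, 100 Hz), sampling at 0.25 s falls halfway up the rise from the first point to the peak.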

Relevance:

100.00%

Publisher:

Abstract:

In this paper we present the design and analysis of an intonation model for text-to-speech (TTS) synthesis applications using a combination of Relational Tree (RT) and Fuzzy Logic (FL) technologies. The model is demonstrated using the Standard Yorùbá (SY) language. In the proposed intonation model, phonological information extracted from text is converted into an RT. RT is a sophisticated data structure that represents the peaks and valleys as well as the spatial structure of a waveform symbolically in the form of trees. An initial approximation to the RT, called Skeletal Tree (ST), is first generated algorithmically. The exact numerical values of the peaks and valleys on the ST are then computed using FL. Quantitative analysis of the result gives RMSE values of 0.56 and 0.71 for peak and valley respectively. Mean Opinion Scores (MOS) of 9.5 and 6.8, on a scale of 1 to 10, were obtained for intelligibility and naturalness respectively.

Relevance:

100.00%

Publisher:

Abstract:

This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.

Relevance:

90.00%

Publisher:

Abstract:

The influence of text messaging on language has been hotly debated, especially in relation to spelling and the lexicon, but the impact of SMS on syntax has received less attention. This article focuses on manipulations within the verbal domain, as language evolution points towards a consistent trend going from synthetic to analytical forms (Bybee et al. 1994), which goes against the need for concision in texting. Based on an authentic corpus of about 500 SMS (Fairon et al. 2006b), the present study shows condensation strategies that are similar to those already described, yet reveals specific features such as the absence of aphaeresis and the scarcity of apocope, as well as the overuse of synthetic forms. It can thus be concluded that while SMS writing displays oral characteristics, it cannot obviously be assimilated to speech; in addition, it may well slow down language evolution and support the conservation of short standard forms.

Relevance:

90.00%

Publisher:

Abstract:

It has been proposed that language impairments in children with Autism Spectrum Disorders (ASD) stem from atypical neural processing of speech and/or nonspeech sounds. However, the strength of this proposal is compromised by the unreliable outcomes of previous studies of speech and nonspeech processing in ASD. The aim of this study was to determine whether there was an association between poor spoken language and atypical event-related field (ERF) responses to speech and nonspeech sounds in children with ASD (n = 14) and controls (n = 18). Data from this developmental population (ages 6-14) were analysed using a novel combination of methods to maximize the reliability of our findings while taking into consideration the heterogeneity of the ASD population. The results showed that poor spoken language scores were associated with atypical left hemisphere brain responses (200 to 400 ms) to both speech and nonspeech in the ASD group. These data support the idea that some children with ASD may have an immature auditory cortex that affects their ability to process both speech and nonspeech sounds. Their poor speech processing may impair their ability to process the speech of other people, and hence reduce their ability to learn the phonology, syntax, and semantics of their native language.

Relevance:

90.00%

Publisher:

Abstract:

This paper investigates the use of web-based textbook supplementary teaching and learning materials which include multiple choice test banks, animated demonstrations, simulations, quizzes and electronic versions of the text. To gauge their experience of the web-based material students were asked to score the main elements of the material in terms of usefulness. In general it was found that while the electronic text provides a flexible platform for presentation of material there is a need for continued monitoring of student use of this material as the literature suggests that digital viewing habits may mean there is little time spent in evaluating information, either for relevance, accuracy or authority. From a lecturer perspective these materials may provide an effective and efficient way of presenting teaching and learning materials to the students in a variety of multimedia formats, but at this stage do not overcome the need for a VLE such as Blackboard™.

Relevance:

80.00%

Publisher:

Abstract:

The paper illustrates the role of world knowledge in comprehending and translating texts. A short news item, which displays world knowledge fairly implicitly in condensed lexical forms, was translated by students from English into German. It is shown that their translation strategies changed from a first draft which was rather close to the surface structure of the source text to a final version which took situational aspects, texttypological conventions and the different background knowledge of the respective addressees into account. Decisions on how much world knowledge has to be made explicit in the target text, however, must be based on the relevance principle. Consequences for teaching and for the notions of semantic knowledge and world knowledge are discussed.

Relevance:

80.00%

Publisher:

Abstract:

At the beginning of the 80s new approaches to translation were emerging in such a way that, in the global context of postmodernism and poststructuralism, they provoked a reassessment of Translation Studies (TS), acknowledging ideologies as a relevant concept to TS and considering the political and visible role of the translator. This introduction aims to establish a basic theoretical framework in which we can develop an analysis of the ‘alterations’ that, consciously or unconsciously, translators have imposed on Le deuxième sexe (1949, Gallimard) by Simone de Beauvoir for the last fifty years. Furthermore, it is essential to examine the divergences of the censoring attitude adopted by the first male translators (Parshley, Palant and Milliet) who considered this text to be a sex manual, and the one adopted by more recent female translators (Martorell and Simons) who considered it to be a philosophical book on feminism. Nevertheless, despite this tendency to consider that translators are the only professionals responsible for the translation process, it is necessary to bear in mind the work carried out by the paratranslator, who is the real censor and ‘decider’ of the way a work is presented to the translation community. Paratranslators work with paratexts (also known as ‘analysis-spaces’), and this makes it possible to study the ideological adaptation that a cultural object undergoes when it is incorporated into a new culture and society (covers, volumes, tables of contents, titles, iconic or visual elements and so forth). 
In short, the analysis of the texts and paratexts of Le deuxième sexe, along with its subsequent translations and rewritings into Spanish, Portuguese and English, will help reveal the function of the censoring apparatus and demonstrate the essential role that –without exception– ideologies play in the professional work of translation and paratranslation, since they have a decisive influence on the reception of the cultural (and ideological) object, in both the society in which it is created and that in which it is received.

Relevance:

80.00%

Publisher:

Abstract:

Keyword identification in one of two simultaneous sentences is improved when the sentences differ in F0, particularly when they are almost continuously voiced. Sentences of this kind were recorded, monotonised using PSOLA, and re-synthesised to give a range of harmonic ΔF0s (0, 1, 3, and 10 semitones). They were additionally re-synthesised by LPC with the LPC residual frequency shifted by 25% of F0, to give excitation with inharmonic but regularly spaced components. Perceptual identification of frequency-shifted sentences showed a similar large improvement with nominal ΔF0 as seen for harmonic sentences, although overall performance was about 10% poorer. We compared performance with that of two autocorrelation-based computational models comprising four stages: (i) peripheral frequency selectivity and half-wave rectification; (ii) within-channel periodicity extraction; (iii) identification of the two major peaks in the summary autocorrelation function (SACF); (iv) a template-based approach to speech recognition using dynamic time warping. One model sampled the correlogram at the target-F0 period and performed spectral matching; the other deselected channels dominated by the interferer and performed matching on the short-lag portion of the residual SACF. Both models reproduced the monotonic increase observed in human performance with increasing ΔF0 for the harmonic stimuli, but not for the frequency-shifted stimuli. A revised version of the spectral-matching model, which groups patterns of periodicity that lie on a curve in the frequency-delay plane, showed a closer match to the perceptual data for frequency-shifted sentences. The results extend the range of phenomena originally attributed to harmonic processing to grouping by common spectral pattern.
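The stimulus arithmetic in this abstract can be made concrete with a small sketch. It shows how a difference in semitones maps to an F0 ratio, and why shifting every component by 25% of F0 gives frequencies that are no longer integer multiples of F0 but keep the same spacing. The function names and the 100 Hz example value are hypothetical and not taken from the study.

```python
def f0_after_semitone_shift(f0_hz, semitones):
    """F0 raised by a number of equal-tempered semitones (12 per octave)."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def component_frequencies(f0_hz, n_components, shift_fraction=0.0):
    """Frequencies of the first n components of a periodic excitation.

    With shift_fraction=0.0 these are the harmonics k * F0. A non-zero
    shift_fraction adds the same offset (a fraction of F0) to every
    component, so they become inharmonic but stay F0 apart.
    """
    offset = shift_fraction * f0_hz
    return [k * f0_hz + offset for k in range(1, n_components + 1)]
```

With an illustrative F0 of 100 Hz, a shift fraction of 0.25 moves the first three components from 100, 200 and 300 Hz to 125, 225 and 325 Hz: the spacing is still 100 Hz, but the components no longer share a common fundamental of 100 Hz.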

Relevance:

80.00%

Publisher:

Abstract:

This thesis presents a study of interlanguage variability in the use of three tense/aspect forms: the simple present, simple past, and the present perfect. The need for research in this area comes from the problems encountered in the classroom. Language performance in one task sometimes does not reflect that in another. How and why this occurs is what this thesis aims to discover. A preliminary study explores the viability of using the Labovian variable model to elicit and explain variability. Difficulties highlight problems which help refine the methodology used in the main study. A review of past research points the direction in which this study should go. Armed with a sample of 17 Chinese Singaporean university students, whose first language is Chinese or a dialect of Chinese, the investigation began with the elicitation of variability to be found in four tasks. Using the attention-to-speech framework, these four tasks are designed to reflect varying degrees of required attention to language form. The results show that there is variability in the use of tense/aspect in all the tasks. However, the framework on which the tasks are based cannot explain the variability pattern. Further analyses of contextual factors, primarily pragmatic ones, point to a complex interplay of factors affecting the variability found in the results.

Relevance:

80.00%

Publisher:

Abstract:

Applied Pharmaceutical Practice is an invaluable resource and will guide the student pharmacist and pharmacy technician through the main stages involved in pharmaceutical dispensing. As a core reference text, it is ideal as a companion to the compulsory dispensing courses found in all undergraduate MPharm programmes and the equivalent technical training courses. Contents include: medicines classification and standard operating procedures; NHS supply in the community and within hospitals; non-NHS supply; controlled drugs; emergency supply; patient counselling and communication; poisons and spirits. This practical textbook contains useful exercises with an answers section and numerous examples and is written by authors with extensive experience within the field. This is a comprehensive guide through the main stages of pharmaceutical dispensing. The textbook is designed to guide student pharmacists or pharmacy technicians through the main stages involved in pharmaceutical dispensing. It provides students with a core reference text to accompany the compulsory dispensing course found in all pharmacy undergraduate programmes, highlighting and explaining all key concepts behind the processes involved in pharmaceutical dispensing.

Relevance:

80.00%

Publisher:

Abstract:

The eighth edition is a fundamental and essential update to the seventh edition published in 2000. This new edition examines a comprehensive range of existing and newer topics that are relevant to project financing in 2012 and explores current trends in the project finance and leasing industries. Contributors are experienced academics and practitioners. Since the first edition was published, the financial markets have undergone tremendous upheavals and many new structures and instruments have been created to meet the financing needs of business. This edition considers the wider world of project finance, applicable to such diverse situations as venture capital and leveraged buyouts, and using new approaches such as Islamic finance techniques. The eighth edition is an essential and overdue update to the previous edition published in 2000. The eighth edition updates a comprehensive review of financial and related topics which are relevant to project financing in 2012 and explores current trends in financial modelling of a project, risk management and the private finance initiatives. This is a comprehensive and practical book full of advice and tips for successful project financing, including leasing, offering a clear, easy to understand guide to a complex area with examples. The topic coverage is well organized and complete, moving from the fundamentals to the more complex issues. There is an extensive glossary to support readers. Finally, the use of 12 practitioner case studies brings many of these complex issues to life. This is the new edition of the clear, easy-to-understand industry-standard text on project financing. 
With a good overview of a broad area and using principles of project financing to explain complex structures, this book includes lots of examples and case studies (including Eurotunnel, Dabhol, multiple Paiton deals and other recent deals along with subsequent developments) to show the concepts in use, examine outcomes and to ensure you understand important issues such as effective project structuring and financing, financial modelling for project valuation, and risk management. Substantially updated and expanded to provide the latest developments in all aspects of project financing. An important manual reference, this book is a must-have for every project financier's desk. The text unites the domain of project financing with a wealth of project management techniques, supported by diagrams and charts and other pictorial features, where appropriate. All these supporting features facilitate a better understanding of the accompanying text for the reader. In many chapters there are diagrams to clarify the specific transaction structure discussed in the accompanying text. These diagrams enable the reader to get a very clear idea of the transaction structure, which is particularly useful where it is complex or unusual. There are also a number of checklists to assist stakeholders in the project and resource management of complex project financings. The new financial modelling chapters allow exploration of some of the pitfalls project models encounter, challenging the accurate replication of the project cash flows for stakeholders to evaluate. In the later new risk management chapters, worked examples are included to illustrate the techniques in practice. The new public private partnership/private finance initiatives chapter introduces readers to this new approach to public projects. References are made to useful websites throughout the text. 
Cases are included at the end of the main text to encourage examination of real-life examples of project financing in practice and also highlight specific issues of current interest. The book will be helpful to project finance sponsors, lawyers, host governments, bankers and providers of capital