31 results for Text-to-speech
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Summary:
David Norbrook, Review of English Studies 56 (Sept. 2005), 675-6.
‘We have waited a long time for a study of Marvell’s Latin poetry; fortunately, Estelle Haan’s monograph generously makes good the loss ... One of her most intriguing suggestions ... is that Marvell may have presented paired poems like ‘Ros’ and ‘On a Drop of Dew’, and the poems to the obligingly named Dr Witty, to his student Maria Fairfax as his own patterns for the pedagogical practice of double translation. Perhaps the most original parts of the book, however, move beyond the familiar canon to cover the generic range of the Latin verse. Haan offers a very full contextualization of the early Horatian Ode to Charles I in seventeenth-century exercises in parodia. In a rewarding reading of the poem to Dr Ingelo she shows how Marvell deploys the language of Ovid’s Tristia to present Sweden as a place of shivering exile, only to subvert this model with a neo-Virgilian celebration of Christina as a virtuous, city-building Dido. She draws extensively on historical as well as literary sources to offer very detailed contextualizations of the poem to Maniban and ‘Scaevola Scotto-Britannus’ ... This monograph opens up many new ways into the Latin verse, not least because it is rounded off with new texts and prose translations of the Latin poems. These make a substantial contribution in their own right. They are the best and most accurate translations to date (those in Smith’s edition having some lapses); they avoid poeticisms but bring out the structure of the poems’ wordplay very clearly. This book brings us a lot closer to seeing Marvell whole.’
Summary:
This paper studies single-channel speech separation, assuming unknown, arbitrary temporal dynamics for the speech signals to be separated. A data-driven approach is described, which matches each mixed speech segment against a composite training segment to separate the underlying clean speech segments. To advance the separation accuracy, the new approach seeks and separates the longest mixed speech segments with matching composite training segments. Lengthening the mixed speech segments to match reduces the uncertainty of the constituent training segments, and hence the error of separation. For convenience, we call the new approach Composition of Longest Segments, or CLOSE. The CLOSE method includes a data-driven approach to model long-range temporal dynamics of speech signals, and a statistical approach to identify the longest mixed speech segments with matching composite training segments. Experiments are conducted on the Wall Street Journal database, for separating mixtures of two simultaneous large-vocabulary speech utterances spoken by two different speakers. The results are evaluated using various objective and subjective measures, including the challenge of large-vocabulary continuous speech recognition. It is shown that the new separation approach leads to significant improvement in all these measures.
Summary:
Temporal dynamics and speaker characteristics are two important features of speech that distinguish speech from noise. In this paper, we propose a method to maximally extract these two features of speech for speech enhancement. We demonstrate that this can reduce the requirement for prior information about the noise, which can be difficult to estimate for fast-varying noise. Given noisy speech, the new approach estimates clean speech by recognizing long segments of the clean speech as whole units. In the recognition, clean speech sentences, taken from a speech corpus, are used as examples. Matching segments are identified between the noisy sentence and the corpus sentences. The estimate is formed by using the longest matching segments found in the corpus sentences. Longer speech segments as whole units contain more distinct dynamics and richer speaker characteristics, and can be identified more accurately from noise than shorter speech segments. Therefore, estimation based on the longest recognized segments increases the noise immunity and hence the estimation accuracy. The new approach consists of a statistical model to represent up to sentence-long temporal dynamics in the corpus speech, and an algorithm to identify the longest matching segments between the noisy sentence and the corpus sentences. The algorithm is made more robust to noise uncertainty by introducing missing-feature based noise compensation into the corpus sentences. Experiments have been conducted on the TIMIT database for speech enhancement from various types of nonstationary noise including song, music, and crosstalk speech. The new approach has shown improved performance over conventional enhancement algorithms in both objective and subjective evaluations.
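The idea of estimating clean speech from the longest matching corpus segments can be illustrated with a toy sketch. This is not the authors' algorithm: frames are single numbers, the statistical model is replaced by a simple per-frame tolerance (`tol`), and the greedy left-to-right tiling is an assumption made for illustration. The function names are hypothetical.

```python
from typing import List, Tuple


def longest_matching_segment(
    noisy: List[float], corpus: List[List[float]], start: int, tol: float
) -> Tuple[int, int, int]:
    """Find the longest run beginning at noisy[start] that stays within
    `tol` of some corpus segment, frame by frame.

    Returns (length, sentence_index, corpus_start); length is 0 if no
    corpus frame matches noisy[start].
    """
    best = (0, -1, -1)
    for s_idx, sentence in enumerate(corpus):
        for c_start in range(len(sentence)):
            length = 0
            while (
                start + length < len(noisy)
                and c_start + length < len(sentence)
                and abs(noisy[start + length] - sentence[c_start + length]) <= tol
            ):
                length += 1
            if length > best[0]:
                best = (length, s_idx, c_start)
    return best


def estimate_clean(
    noisy: List[float], corpus: List[List[float]], tol: float = 0.5
) -> List[float]:
    """Replace each matched stretch of the noisy input with the corpus
    segment it matched (assumed greedy tiling, left to right)."""
    estimate = list(noisy)
    t = 0
    while t < len(noisy):
        length, s_idx, c_start = longest_matching_segment(noisy, corpus, t, tol)
        if length == 0:
            t += 1  # no corpus match: keep the noisy frame unchanged
            continue
        estimate[t:t + length] = corpus[s_idx][c_start:c_start + length]
        t += length
    return estimate


corpus = [[1.0, 2.0, 3.0, 4.0], [10.0, 11.0]]
noisy = [1.2, 1.9, 3.3, 10.4, 50.0]
print(estimate_clean(noisy, corpus))
```

In this toy input the first three frames are matched as one unit against the first corpus sentence, which is the intuition in the abstract: a longer match leaves less room for a spurious alignment than three independent single-frame matches would.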
Summary:
This paper presents a new approach to speech enhancement from single-channel measurements involving both noise and channel distortion (i.e., convolutional noise), and demonstrates its applications for robust speech recognition and for improving noisy speech quality. The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise for speech estimation. Third, we present an iterative algorithm which updates the noise and channel estimates of the corpus data model. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement.
Summary:
This paper provides a summary of our studies on robust speech recognition based on a new statistical approach – the probabilistic union model. We consider speech recognition given that part of the acoustic features may be corrupted by noise. The union model is a method for basing the recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. In this respect, the union model is similar to the missing feature method. However, the two methods achieve this end through different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated the applications of the union model to speech recognition involving unknown partial corruption in frequency band, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied, as a means of dealing with a mixture of known or trainable noise and unknown unexpected noise. In this paper, we provide a unified review of each of these applications in the context of unknown partial feature corruption, giving the appropriate theory and implementation algorithms along with an experimental evaluation.
Summary:
Reviews the books, Lessons From the Northern Ireland Peace Process edited by Timothy J. White (2013) and Human Rights as War by Other Means by Jennifer Curtis (2014). Edited by a U.S.-based academic with an enduring interest in Ireland, the first book draws together an interdisciplinary group of academics from across North America and the U.K. (though notably not Northern Ireland itself) to cover such topics as third party intervention, nationalism, grassroots change, and community development. The second text to be reviewed may be seen as a thorough analysis of this particular point: what is the role played by human rights in Northern Ireland’s peace process?
Summary:
The authors are concerned with the development of computer systems that are capable of using information from faces and voices to recognise people's emotions in real-life situations. The paper addresses the nature of the challenges that lie ahead, and provides an assessment of the progress that has been made in the areas of signal processing and analysis techniques (with regard to speech and face), and the psychological and linguistic analyses of emotion. Ongoing developmental work by the authors in each of these areas is described.
Summary:
Studies in sensory neuroscience reveal the critical importance of accurate sensory perception for cognitive development. There is considerable debate concerning the possible sensory correlates of phonological processing, the primary cognitive risk factor for developmental dyslexia. Across languages, children with dyslexia have a specific difficulty with the neural representation of the phonological structure of speech. The identification of a robust sensory marker of phonological difficulties would enable early identification of risk for developmental dyslexia and early targeted intervention. Here, we explore whether phonological processing difficulties are associated with difficulties in processing acoustic cues to speech rhythm. Speech rhythm is used across languages by infants to segment the speech stream into words and syllables. Early difficulties in perceiving auditory sensory cues to speech rhythm and prosody could lead developmentally to impairments in phonology. We compared matched samples of children with and without dyslexia, learning three very different spoken and written languages, English, Spanish, and Chinese. The key sensory cue measured was rate of onset of the amplitude envelope (rise time), known to be critical for the rhythmic timing of speech. Despite phonological and orthographic differences, for each language, rise time sensitivity was a significant predictor of phonological awareness, and rise time was the only consistent predictor of reading acquisition. The data support a language-universal theory of the neural basis of developmental dyslexia on the basis of rhythmic perception and syllable segmentation. They also suggest that novel remediation strategies on the basis of rhythm and music may offer benefits for phonological and linguistic development.
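As a rough illustration of the sensory cue named above, the sketch below measures the rise time of a sampled amplitude envelope. The abstract does not say how rise time was operationalised; the 10%-to-90%-of-peak thresholds used here are a common engineering convention and are an assumption, as is the function name `rise_time`.

```python
from typing import Sequence


def rise_time(
    envelope: Sequence[float],
    sample_rate_hz: float,
    low: float = 0.1,
    high: float = 0.9,
) -> float:
    """Seconds taken for the envelope to climb from `low` to `high`
    times its peak value, measured on the way up to the first peak.

    The 10%/90% defaults are an assumed convention, not taken from the study.
    """
    peak = max(envelope)
    if peak <= 0:
        raise ValueError("envelope must contain a positive peak")
    peak_idx = list(envelope).index(peak)
    onset = list(envelope)[: peak_idx + 1]  # only the rising part up to the peak
    i_low = next(i for i, v in enumerate(onset) if v >= low * peak)
    i_high = next(i for i, v in enumerate(onset) if v >= high * peak)
    return (i_high - i_low) / sample_rate_hz


# Toy envelope sampled at 100 Hz: a fast onset followed by decay.
envelope = [0.0, 0.05, 0.3, 0.6, 0.95, 1.0, 0.8]
print(rise_time(envelope, sample_rate_hz=100.0))
```

A slower onset spreads the same climb over more samples and therefore yields a larger value.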
Summary:
Intertextuality is central to the production and reception of translations. Yet the possibility of translating most foreign intertexts with any completeness or precision is so limited as to be virtually nonexistent. As a result, they are usually replaced by analogous but ultimately different intertextual relations in the receiving language. The creation of a receiving intertext permits a translation to be read with comprehension by translating-language readers. It also results in a disjunction between the foreign and translated texts, a proliferation of linguistic and cultural differences that are at once interpretive and interrogative. Intertextuality enables and complicates translation, preventing it from being an untroubled communication and opening the translated text to interpretive possibilities that vary with cultural constituencies in the receiving situation. To activate these possibilities and at the same time to improve the study and practice of translation, this article aims to theorize the relative autonomy of the translated text and to increase the self-consciousness of translators and readers of translations alike.
Summary:
You and I may be little words but they do a great deal. In spoken discourse they reference shared knowledge and mark stance. In pedagogical contexts, they maintain relations in teacher-student discourse. However, language classrooms may rarely explore this array of pragmatic meanings. A lack of awareness of the variety of these functions may be problematic for learners when seeking to construct interpersonal relations and operate successfully in particular spoken contexts. This paper presents a study of you and I in two spoken corpora: a corpus of English language learner task talk and a corpus of university seminar talk. Findings illustrate different patterns of I and you between the two corpora: I and you have a higher rate of occurrence in learner discourse, and pronoun repetition is more frequent in learner discourse, though it does not account for the higher rate of you and I. These findings suggest that language learner task talk displays more features tied to speech production and self-regulation and fewer features associated with attempting to point to the informational space of others, a key feature of university classroom talk. This paper concludes by outlining pedagogical applications to overcome features perceived as disfluent.
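The two corpus measures mentioned above, rate of occurrence and pronoun repetition, can be sketched in a few lines. This is only an illustration under stated assumptions: the tokenizer is a crude regular expression, contractions such as "I'm" are not counted as "I", rates are normalised per 1,000 tokens, and a repetition means the same pronoun twice in a row. The function name `pronoun_stats` is hypothetical and the study's actual corpus tooling is not described in the abstract.

```python
import re
from typing import Dict, Tuple


def pronoun_stats(text: str, pronouns: Tuple[str, ...] = ("i", "you")) -> Dict[str, float]:
    """Count target pronouns, immediate repetitions ("I I", "you you"),
    and the pronoun rate per 1,000 tokens in a transcript."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for tok in tokens if tok in pronouns)
    repeats = sum(
        1 for a, b in zip(tokens, tokens[1:]) if a == b and a in pronouns
    )
    rate = 1000.0 * hits / len(tokens) if tokens else 0.0
    return {
        "tokens": len(tokens),
        "hits": hits,
        "repeats": repeats,
        "rate_per_1000": rate,
    }


print(pronoun_stats("I I think you know what I mean"))
```

Reporting repetitions separately from the overall rate makes it possible to check, as the abstract does, whether repetition alone explains a higher pronoun rate in one corpus.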
Summary:
The analysis of policy-based party competition will not make serious progress beyond the constraints of (a) the unitary actor assumption and (b) a static approach to analyzing party competition between elections until a method is available for deriving reliable and valid time-series estimates of the policy positions of large numbers of political actors. Retrospective estimation of these positions in past party systems will require a method for estimating policy positions from political texts.
Previous hand-coding content analysis schemes deal with policy emphasis rather than policy positions. We propose a new hand-coding scheme for policy positions, together with a new English-language computer-coding scheme that is compatible with this. We apply both schemes to party manifestos from Britain and Ireland in 1992 and 1997 and cross-validate the resulting estimates with those derived from quite independent expert surveys and with previous manifesto analyses.
There is a high degree of cross-validation between coding methods, including computer coding. This implies that it is indeed possible to use computer-coded content analysis to derive reliable and valid estimates of policy positions from political texts. This will allow vast volumes of text to be coded, including texts generated by individuals and other internal party actors, allowing the empirical elaboration of dynamic rather than static models of party competition that move beyond the unitary actor assumption.
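A minimal sketch of what dictionary-based computer coding of a political text might look like follows. Everything specific here is an assumption for illustration: the two-category word list is invented, and the balance score (right minus left, divided by their sum) is one plausible way to turn counts into a position, not a formula stated in the abstract.

```python
import re
from collections import Counter
from typing import Dict, Optional, Set

# Hypothetical coding dictionary; a real scheme would be far larger
# and derived from hand-coded texts.
HYPOTHETICAL_DICTIONARY: Dict[str, Set[str]] = {
    "left": {"nationalise", "welfare", "regulate"},
    "right": {"privatise", "deregulate", "competition"},
}


def position_score(
    text: str, dictionary: Dict[str, Set[str]] = HYPOTHETICAL_DICTIONARY
) -> Optional[float]:
    """Score a text from -1.0 (all 'left' words) to +1.0 (all 'right' words).

    Returns None when the text contains no dictionary words, since no
    position can be estimated from it.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts: Counter = Counter()
    for tok in tokens:
        for category, words in dictionary.items():
            if tok in words:
                counts[category] += 1
    total = counts["left"] + counts["right"]
    if total == 0:
        return None
    return (counts["right"] - counts["left"]) / total


manifesto = "We will privatise rail and deregulate energy, but protect welfare."
print(position_score(manifesto))
```

Because scoring needs only a word list and the text, the same function can be applied to any number of documents, which is the scalability argument made in the abstract.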