950 resultados para Text to speech


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Folks, because we find ourselves in a time crunch, I'm going to forego the terrific speech I planned to give today and quickly make five key points. Please know I will be happy to provide additional information on any of them at any time.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study investigated the influence of top-down and bottom-up information on speech perception in complex listening environments. Specifically, the effects of listening to different types of processed speech were examined on intelligibility and on simultaneous visual-motor performance. The goal was to extend the generalizability of results in speech perception to environments outside of the laboratory. The effect of bottom-up information was evaluated with natural, cell phone and synthetic speech. The effect of simultaneous tasks was evaluated with concurrent visual-motor and memory tasks. Earlier works on the perception of speech during simultaneous visual-motor tasks have shown inconsistent results (Choi, 2004; Strayer & Johnston, 2001). In the present experiments, two dual-task paradigms were constructed in order to mimic non-laboratory listening environments. In the first two experiments, an auditory word repetition task was the primary task and a visual-motor task was the secondary task. Participants were presented with different kinds of speech in a background of multi-speaker babble and were asked to repeat the last word of every sentence while doing the simultaneous tracking task. Word accuracy and visual-motor task performance were measured. Taken together, the results of Experiments 1 and 2 showed that the intelligibility of natural speech was better than synthetic speech and that synthetic speech was better perceived than cell phone speech. The visual-motor methodology was found to demonstrate independent and supplemental information and provided a better understanding of the entire speech perception process. Experiment 3 was conducted to determine whether the automaticity of the tasks (Schneider & Shiffrin, 1977) helped to explain the results of the first two experiments. It was found that cell phone speech allowed better simultaneous pursuit rotor performance only at low intelligibility levels when participants ignored the listening task. Also, simultaneous task performance improved dramatically for natural speech when intelligibility was good. Overall, it could be concluded that knowledge of intelligibility alone is insufficient to characterize processing of different speech sources. Additional measures such as attentional demands and performance of simultaneous tasks were also important in characterizing the perception of different kinds of speech in complex listening environments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

New technology in the Freedom (R) speech processor for cochlear implants was developed to improve how incoming acoustic sound is processed; this applies not only for new users, but also for previous generations of cochlear implants. Aim: To identify the contribution of this technology - the Nucleus 22 (R) - on speech perception tests in silence and in noise, and on audiometric thresholds. Methods: A cross-sectional cohort study was undertaken. Seventeen patients were selected. The last map based on the Spectra (R) was revised and optimized before starting the tests. Troubleshooting was used to identify malfunction. To identify the contribution of the Freedom (R) technology for the Nucleus22 (R), auditory thresholds and speech perception tests were performed in free field in soundproof booths. Recorded monosyllables and sentences in silence and in noise (SNR = 0dB) were presented at 60 dBSPL. The nonparametric Wilcoxon test for paired data was used to compare groups. Results: Freedom (R) applied for the Nucleus22 (R) showed a statistically significant difference in all speech perception tests and audiometric thresholds. Conclusion: The reedom (R) technology improved the performance of speech perception and audiometric thresholds of patients with Nucleus 22 (R).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a new method for the enhancement of speech. The method is designed for scenarios in which targeted speaker enrollment as well as system training within the typical noise environment are feasible. The proposed procedure is fundamentally different from most conventional and state-of-the-art denoising approaches. Instead of filtering a distorted signal we are resynthesizing a new “clean” signal based on its likely characteristics. These characteristics are estimated from the distorted signal. A successful implementation of the proposed method is presented. Experiments were performed in a scenario with roughly one hour of clean speech training data. Our results show that the proposed method compares very favorably to other state-of-the-art systems in both objective and subjective speech quality assessments. Potential applications for the proposed method include jet cockpit communication systems and offline methods for the restoration of audio recordings.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A new idea for waveform coding using vector quantisation (VQ) is introduced. This idea makes it possible to deal with codevectors much larger than before for a fixed bit per sample rate. Also a solution to the matching problem (inherent in the present context) in the &-norm describing a measure of neamess is presented. The overall computational complexity of this solution is O(n3 log, n). Sample results are presented to demonstrate the advantage of using this technique in the context of coding of speech waveforms.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a new approach for corpus-based speech enhancement that significantly improves over a method published by Xiao and Nickel in 2010. Corpus-based enhancement systems do not merely filter an incoming noisy signal, but resynthesize its speech content via an inventory of pre-recorded clean signals. The goal of the procedure is to perceptually improve the sound of speech signals in background noise. The proposed new method modifies Xiao's method in four significant ways. Firstly, it employs a Gaussian mixture model (GMM) instead of a vector quantizer in the phoneme recognition front-end. Secondly, the state decoding of the recognition stage is supported with an uncertainty modeling technique. With the GMM and the uncertainty modeling it is possible to eliminate the need for noise dependent system training. Thirdly, the post-processing of the original method via sinusoidal modeling is replaced with a powerful cepstral smoothing operation. And lastly, due to the improvements of these modifications, it is possible to extend the operational bandwidth of the procedure from 4 kHz to 8 kHz. The performance of the proposed method was evaluated across different noise types and different signal-to-noise ratios. The new method was able to significantly outperform traditional methods, including the one by Xiao and Nickel, in terms of PESQ scores and other objective quality measures. Results of subjective CMOS tests over a smaller set of test samples support our claims.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

From the moment of their birth, a person's life is determined by their sex. Ms. Goroshko wants to know why this difference is so striking, why society is so concerned to sustain it, and how it is able to persist even when certain national or behavioural stereotypes are erased between people. She is convinced of the existence of not only social, but biological differences between men and women, and set herself the task, in a manuscript totalling 126 pages, written in Ukrainian and including extensive illustrations, of analysing these distinctions as they are manifested in language. She points out that, even before 1900, certain stylistic differences between the ways that men and women speak had been noted. Since then it has become possible, for instance in the case of Japanese, to point to examples of male and female sub-languages. In general, one can single out the following characteristics. Males tend to write with less fluency, to refer to events in a verb-phrase, to be time-oriented, to involve themselves more in their references to events, to locate events in their personal sphere of activity, and to refer less to others. Therefore, concludes Ms Goroshko, the male is shown to be more active, more ego-involved in what he does, and less concerned about others. Women, in contrast, were more fluent, referred to events in a noun-phrase, were less time-oriented, tended to be less involved in their event-references, locate events within their interactive community and refer more to others. They spent much more time discussing personal and domestic subjects, relationship problems, family, health and reproductive matters, weight, food and clothing, men, and other women. As regards discourse strategies, Ms Goroshko notes the following. Men more often begin a conversation, they make more utterances, these utterances are longer, they make more assertions, speak less carefully, generally determine the topic of conversation, speak more impersonally, use more vulgar expressions, and use fewer diminutives and more imperatives. Women's speech strategies, apart from being the opposite of those enumerated above, also contain more euphemisms, polite forms, apologies, laughter and crying. All of the above leads Ms. Goroshko to conclude that the differences between male and female speech forms are more striking than the similarities. Furthermore she is convinced that the biological divergence between the sexes is what generates the verbal divergence, and that social factors can only intensify or diminish the differentiation in verbal behaviour established by the sex of a person. Bearing all this in mind, Ms Goroshko set out to construct a grammar of male and female styles of speaking within Russian. One of her most important research tools was a certain type of free association test. She took a list comprising twelve stimuli (to love, to have, to speak, to fuck, a man, a woman, a child, the sky, a prayer, green, beautiful) and gave it to a group of participants specially selected, according to a preliminary psychological testing, for the high levels of masculinity or femininity they displayed. Preliminary responses revealed that the female reactions were more diverse than the male ones, there were more sentences and word combinations in the female reactions, men gave more negative responses to the stimulus and sometimes didn't want to react at all, women reacted more to adjectives and men to nouns, and that, surprisingly, women coloured more negatively their reactions to the words man, to love and a child (Ms. Goroshko is inclined to attribute this to the present economic situation in Russia). Another test performed by Ms. Goroshko was the so-called "defective text" developed by A.A. Brudny. All participants were distributed with packets of complete sentences, which had been taken from a text and then mixed at random. The task was to reconstruct the original text. There were three types of test, the first descriptive, the second narrative, and the third logical. Ms. Goroshko created computer programmes to analyse the results. She found that none of the reconstructed texts was coincident with the original, differing both from the original text and amongst themselves and that there were many more disparities in the male than the female texts. In the descriptive and logical texts the differences manifested themselves more clearly in the male texts, and in the narrative texts in the female texts. The widest dispersal of values was observed at the outset, while the female text ending was practically coincident with the original (in contrast to the male ending). The greatest differences in text reconstruction for both males and females were registered in the middle of the texts. Women, Ms. Goroshko claims, were more sensitive to the semantic structure of the texts, since they assembled the narrative text much more accurately than the other two, while the men assembled more accurately the logical text. Texts written by women were assembled more accurately by women and texts by men by men. On the basis of computer analysis, Ms. Goroshko found that female speech was substantially more emotional. It was expressed by various means, hyperbole, metaphor, comparisons, epithets, ways of enumeration, and with the aid of interjections, rhetorical questions, exclamations. The level of literacy was higher for female speech, and there were fewer mistakes in grammar and spelling in female texts. The last stage of Ms Goroshko's research concerned the social stereotypes of beliefs about men and women in Russian society today. A large number of respondents were asked questions such as "What merits must a woman possess?", "What are male vices and virtues?", etc. After statistical manipulation, an image of modern man and woman, as it exists in the minds of modern Russian men and women, emerged. Ms. Goroshko believes that her findings are significant not only within the field of linguistics. She has already successfully worked on anonymous texts and been able to decide on the sex of the author and consequently believes that in the future her research may even be of benefit to forensic science.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

From Bush’s September 20, 2001 “War on Terror” speech to Congress to President-Elect Barack Obama’s acceptance speech on November 4, 2008, the U.S. Army produced visual recruitment material that addressed the concerns of falling enlistment numbers—due to the prolonged and difficult war in Iraq—with quickly-evolving and compelling rhetorical appeals: from the introduction of an “Army of One” (2001) to “Army Strong” (2006); from messages focused on education and individual identity to high-energy adventure and simulated combat scenarios, distributed through everything from printed posters and music videos to first-person tactical-shooter video games. These highly polished, professional visual appeals introduced to the American public during a time of an unpopular war fought by volunteers provide rich subject matter for research and analysis. This dissertation takes a multidisciplinary approach to the visual media utilized as part of the Army’s recruitment efforts during the War on Terror, focusing on American myths—as defined by Barthes—and how these myths are both revealed and reinforced through design across media platforms. Placing each selection in its historical context, this dissertation analyzes how printed materials changed as the War on Terror continued. It examines the television ad that introduced “Army Strong” to the American public, considering how the combination of moving image, text, and music structure the message and the way we receive it. This dissertation also analyzes the video game America’s Army, focusing on how the interaction of the human player and the computer-generated player combine to enhance the persuasive qualities of the recruitment message. Each chapter discusses how the design of the particular medium facilitates engagement/interactivity of the viewer. The conclusion considers what recruitment material produced during this time period suggests about the persuasive strategies of different media and how they create distinct relationships with their spectators. It also addresses how theoretical frameworks and critical concepts used by a variety of disciplines can be combined to analyze recruitment media utilizing a Selber inspired three literacy framework (functional, critical, rhetorical) and how this framework can contribute to the multimodal classroom by allowing instructors and students to do a comparative analysis of multiple forms of visual media with similar content.