Biblioteca Digital

997 results for Presidential speech

Two Stage Transform Vector Quantization of LSFs for Wideband Speech Coding

Relevance:

20.00%

Publisher:

Abstract:

We investigate the use of a two-stage transform vector quantizer (TSTVQ) for coding line spectral frequency (LSF) parameters in wideband speech coding. The first-stage quantizer of TSTVQ provides better matching of the source distribution, and the second-stage quantizer provides additional coding gain by using an individual cluster-specific decorrelating transform and variance normalization. Further coding gain is achieved by exploiting the slowly time-varying nature of speech spectra, using the inter-frame cluster continuity (ICC) property in the first stage of the TSTVQ method. Compared to the traditional split vector quantizer (SVQ), the proposed method saves 3-4 bits and reduces computational complexity by 58-66%, at the expense of 1.5-2.5 times the memory.

Joint evaluation of multiple speech patterns for speech recognition and training

Relevance:

20.00%

Publisher:

Abstract:

We address the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for noisy speech recognition. We extend the concept to HMM training in which some patterns may be noisy or distorted. Using the concept of a "virtual pattern" developed for joint evaluation, we propose selective iterative training of HMMs. When evaluated on isolated word recognition with speech affected by burst/transient noise, the new algorithms yield a significant improvement in recognition accuracy over those that do not use the joint evaluation strategy.
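As a point of reference for the DTW-based approach mentioned in the abstract above, here is a minimal sketch of classical two-sequence DTW. It is not the joint multi-pattern evaluation the authors propose; the function name `dtw_distance`, the scalar features, and the absolute-difference local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classical DTW cost between two sequences of scalar features.

    Assumption: local cost is |x - y|; real systems would use a
    distance between feature vectors (e.g. cepstral frames).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # allowed moves: advance in a only, in b only, or in both
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Two utterances of the same pattern that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost under this sketch.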

Seeing and hearing speech, sounds, and signs: functional magnetic resonance studies on fluent and dyslexic readers

Relevance:

20.00%

Publisher:

Abstract:

Speech has both auditory and visual components (heard speech sounds and seen articulatory gestures). During all perception, selective attention facilitates efficient information processing and enables concentration on high-priority stimuli. Auditory and visual sensory systems interact at multiple processing levels during speech perception and, further, the classical motor speech regions seem also to participate in speech perception. Auditory, visual, and motor-articulatory processes may thus work in parallel during speech perception, their use possibly depending on the information available and the individual characteristics of the observer. Because of their subtle speech perception difficulties possibly stemming from disturbances at elemental levels of sensory processing, dyslexic readers may rely more on motor-articulatory speech perception strategies than do fluent readers. This thesis aimed to investigate the neural mechanisms of speech perception and selective attention in fluent and dyslexic readers. We conducted four functional magnetic resonance imaging experiments, during which subjects perceived articulatory gestures, speech sounds, and other auditory and visual stimuli. Gradient echo-planar images depicting blood oxygenation level-dependent contrast were acquired during stimulus presentation to indirectly measure brain hemodynamic activation. Lip-reading activated the primary auditory cortex, and selective attention to visual speech gestures enhanced activity within the left secondary auditory cortex. Attention to non-speech sounds enhanced auditory cortex activity bilaterally; this effect showed modulation by sound presentation rate. 
A comparison between fluent and dyslexic readers' brain hemodynamic activity during audiovisual speech perception revealed stronger activation of predominantly motor speech areas in dyslexic readers during a contrast test that allowed exploration of the processing of phonetic features extracted from auditory and visual speech. The results show that visual speech perception modulates hemodynamic activity within auditory cortex areas once considered unimodal, and suggest that the left secondary auditory cortex specifically participates in extracting the linguistic content of seen articulatory gestures. They are strong evidence for the importance of attention as a modulator of auditory cortex function during both sound processing and visual speech perception, and point out the nature of attention as an interactive process (influenced by stimulus-driven effects). Further, they suggest heightened reliance on motor-articulatory and visual speech perception strategies among dyslexic readers, possibly compensating for their auditory speech perception difficulties.

Joint decoding of multiple speech patterns for robust speech recognition

Relevance:

20.00%

Publisher:

Abstract:

We address a new problem: improving automatic speech recognition performance given multiple utterances of patterns from the same class. We formulate the problem of jointly decoding K patterns given a single Hidden Markov Model. It is shown that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm, followed by the Constrained Multi Pattern Viterbi Algorithm. The new formulation is tested in the context of speaker-independent isolated word recognition for both clean and noisy patterns. When 10 percent of the speech is affected by burst noise at -5 dB Signal to Noise Ratio (local), it is shown that joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent, when compared to single-pattern decoding using the Viterbi Algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about a 7 percent reduction in error rate.

GMM based Bayesian approach to speech enhancement in signal transform domain

Relevance:

20.00%

Publisher:

Abstract:

Considering a general linear model of signal degradation, by modeling the probability density function (PDF) of the clean signal using a Gaussian mixture model (GMM) and the additive noise by a Gaussian PDF, we derive the minimum mean square error (MMSE) estimator. The derived MMSE estimator is non-linear, and the linear MMSE estimator is shown to be a special case. For a speech signal corrupted by independent additive noise, by modeling the joint PDF of the time-domain speech samples of a speech frame using a GMM, we propose a speech enhancement method based on the derived MMSE estimator. We also show that the same estimator can be used for transform-domain speech enhancement.
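To make the structure of such an estimator concrete, the sketch below works through the scalar case y = x + n, with x drawn from a GMM and n zero-mean Gaussian. This is a simplification assumed here for illustration (the abstract treats a general linear degradation model over whole frames), and the names `gaussian_pdf` and `gmm_mmse_estimate` are hypothetical. The estimate is a posterior-weighted sum of per-component linear (Wiener-style) estimates, which is why it is non-linear in y overall and reduces to the linear MMSE estimator when there is a single component.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_mmse_estimate(y, weights, means, variances, noise_var):
    """MMSE estimate of x from y = x + n (scalar sketch).

    Assumptions: x ~ sum_k weights[k] * N(means[k], variances[k]),
    n ~ N(0, noise_var), x and n independent.
    """
    # Under component k, y is Gaussian with variance variances[k] + noise_var.
    evidences = [
        w * gaussian_pdf(y, m, v + noise_var)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(evidences)  # may underflow to 0 for extreme y; not handled here
    estimate = 0.0
    for evidence, m, v in zip(evidences, means, variances):
        posterior = evidence / total      # P(component k | y)
        gain = v / (v + noise_var)        # per-component Wiener gain
        estimate += posterior * (m + gain * (y - m))
    return estimate
```

With one component of mean 0 and variance 1, and unit noise variance, the estimate is simply y / 2, the familiar linear MMSE result.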

We Are All Georgians: The Neoconservative Narrative on the Russia-Georgia War

Relevance:

20.00%

Publisher:

Abstract:

In this thesis I examine the U.S. foreign policy discussion that followed the war between Russia and Georgia in August 2008. In the politically charged setting that preceded the presidential elections, the subject of the debate was not only Washington's response to the crisis in the Caucasus but, more generally, the direction of U.S. foreign policy after the presidency of George W. Bush. As of November 2010, the reasons for and consequences of the Russia-Georgia war continue to be contested. My thesis demonstrates that there were already a number of different stories about the conflict immediately after the outbreak of hostilities. I want to argue that among these stories one can discern a “neoconservative narrative” that described the war as a confrontation between the East and the West and considered it as a test for Washington’s global leadership. I draw on the theory of securitization, particularly on a framework introduced by Holger Stritzel. Accordingly, I consider statements about the conflict as “threat texts” and analyze these based on the existing discursive context, the performative force of the threat texts and the positional power of the actors presenting them. My thesis suggests that a notion of narrativity can complement Stritzel’s securitization framework and take it further. Threat texts are established as narratives by attaching causal connections, meaning and actorship to the discourse. By focusing on this process I want to shed light on the relationship between the text and the context, capture the time dimension of a speech act articulation and help to explain how some interpretations of the conflict are privileged and others marginalized. I develop the theoretical discussion through an empirical analysis of the neoconservative narrative. Drawing on Stritzel’s framework, I argue that the internal logic of the narrative which was presented as self-evident can be analyzed in its historicity. 
Asking what was perceived to be at stake in the conflict, how the narrative was formed and what purposes it served also reveals the possibility for alternative explanations. My main source material consists of transcripts of think tank seminars organized in Washington, D.C. in August 2008. In addition, I resort to the foreign policy discussion in the mainstream media.

Developing a speech intelligibility test based on measuring speech reception thresholds in noise for English and Finnish

Relevance:

20.00%

Publisher:

New Method for Delexicalization and its Application to Prosodic Tagging for Text-to-Speech Synthesis

Relevance:

20.00%

Publisher:

Abstract:

This paper describes a new, flexible delexicalization method based on a glottal-excited parametric speech synthesis scheme. The system uses inverse-filtered glottal flow and all-pole modelling of the vocal tract. The method makes it possible to retain and manipulate all relevant prosodic features of any kind of speech. Most importantly, the features include voice quality, which has not been properly modeled in earlier delexicalization methods. The functionality of the new method was tested in a prosodic tagging experiment aimed at providing word prominence data for a text-to-speech synthesis system. The experiment confirmed the usefulness of the method and further corroborated earlier evidence that linguistic factors influence the perception of prosodic prominence.

A speech-music discriminator using HILN model based features

Relevance:

20.00%

Publisher:

Abstract:

We propose a simple speech-music discriminator that uses features based on the HILN (Harmonics, Individual Lines and Noise) model. We tested the strength of the feature set on a standard database of 66 files and obtained an accuracy of around 97%. We also tested on sung queries and polyphonic music and obtained very good results. The current algorithm is being used to discriminate between sung queries and played queries (using an instrument such as a flute) for a Query by Humming (QBH) system currently under development in the lab.

Dynamic programming based optimum non-uniform samples for speech reconstruction and coding

Relevance:

20.00%

Publisher:

Abstract:

Non-uniform sampling of a signal is formulated as an optimization problem that minimizes the signal reconstruction error. Dynamic programming (DP) is used to solve this problem efficiently for a finite-duration signal. Further, the optimum samples are quantized to realize a speech coder. The quantizer and the DP-based optimum search for non-uniform samples (DP-NUS) can be combined in a closed-loop manner, which provides a distinct advantage over the open-loop formulation. The DP-NUS formulation provides useful control over the trade-off between bitrate and performance (reconstruction error). It is shown that a 5-10 dB SNR improvement is possible using DP-NUS compared to the extrema sampling approach. In addition, the closed-loop DP-NUS gives a 4-5 dB improvement in reconstruction error.
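The optimization described in the abstract above can be illustrated with a small dynamic program. The sketch below is not the authors' DP-NUS formulation: it assumes reconstruction by linear interpolation between kept samples, a squared-error criterion, and that the first and last samples are always kept. The names `segment_error` and `optimal_nonuniform_samples` are hypothetical.

```python
import math


def segment_error(x, i, j):
    """Squared error of linearly interpolating x between kept samples i and j."""
    err = 0.0
    for t in range(i + 1, j):
        interp = x[i] + (x[j] - x[i]) * (t - i) / (j - i)
        err += (x[t] - interp) ** 2
    return err


def optimal_nonuniform_samples(x, k):
    """Choose k sample indices of x minimizing total interpolation error.

    Assumption: indices 0 and len(x) - 1 are always kept.
    Returns (sorted list of kept indices, minimum squared error).
    """
    n = len(x)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= len(x)")
    # best[c][j]: minimum error over x[0..j] using c kept samples, the last at j
    best = [[math.inf] * n for _ in range(k + 1)]
    prev = [[-1] * n for _ in range(k + 1)]
    best[1][0] = 0.0
    for c in range(2, k + 1):
        for j in range(1, n):
            for i in range(j):
                if best[c - 1][i] == math.inf:
                    continue
                candidate = best[c - 1][i] + segment_error(x, i, j)
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    prev[c][j] = i
    # Backtrack from the last sample to recover the chosen indices.
    indices = []
    j = n - 1
    for c in range(k, 0, -1):
        indices.append(j)
        j = prev[c][j]
    indices.reverse()
    return indices, best[k][n - 1]
```

For the triangular signal `[0, 1, 2, 1, 0]` with `k = 3`, the sketch keeps the two endpoints and the peak, since those three samples reconstruct the signal exactly. Varying `k` is one way to see the bitrate versus reconstruction-error trade-off the abstract mentions.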

Comparative study of filter-bank mean-energy distance for automated segmentation of speech signals

Relevance:

20.00%

Publisher:

Abstract:

This paper describes a method for automated segmentation of speech that assumes the signal is continuously time varying, rather than following the traditional short-time stationary model. It has been shown that this representation gives comparable, if not marginally better, results than other techniques for automated segmentation. A formulation of the 'Bach' (music semitonal) frequency scale filter-bank is proposed. A comparative study has been made of the performance of Mel, Bark and Bach scale filter-banks under this model. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. 'Bach' filters are seen to marginally outperform the other filters.

Language independent automated segmentation of speech using Bach scale filter-banks

Relevance:

20.00%

Publisher:

Abstract:

This correspondence describes a method for automated segmentation of speech. The proposed method uses a specially designed filter-bank, called the Bach filter-bank, which makes use of 'music' related perception criteria. The speech signal is treated as a continuously time-varying signal, as opposed to a short-time stationary model. A comparative study has been made of the performance of Mel, Bark and Bach scale filter-banks. The preliminary results show up to 80% matches within 20 ms of the manually segmented data, without any information about the content of the text and without any language dependence. The Bach filters are seen to marginally outperform the other filters.

Multi-Pattern Viterbi Algorithm for joint decoding of multiple speech patterns

Relevance:

20.00%

Publisher:

Abstract:

Joint decoding of multiple speech patterns so as to improve speech recognition performance is important, especially in the presence of noise. In this paper, we propose a Multi-Pattern Viterbi Algorithm (MPVA) to jointly decode and recognize multiple speech patterns for automatic speech recognition (ASR). The MPVA is a generalization of the Viterbi Algorithm to jointly decode multiple patterns given a Hidden Markov Model (HMM). Unlike the previously proposed two-stage Constrained Multi-Pattern Viterbi Algorithm (CMPVA), the MPVA is a single-stage algorithm. MPVA has the advantage that it can be extended to connected word recognition (CWR) and continuous speech recognition (CSR) problems. MPVA is shown to provide better speech recognition performance than the earlier techniques: using only two repetitions of noisy speech patterns (-5 dB SNR, 10% burst noise), the word error rate using MPVA decreased by 28.5% compared to individual decoding. (C) 2010 Elsevier B.V. All rights reserved.
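For readers unfamiliar with the baseline that MPVA generalizes, the following is a minimal sketch of the standard single-pattern Viterbi Algorithm for a discrete-observation HMM, computed in the log domain. It is not the multi-pattern algorithm from the abstract; the function name `viterbi` and the list-based parameter layout (`start_p`, `trans_p`, `emit_p`) are assumptions made for illustration.

```python
import math


def _log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(observations, start_p, trans_p, emit_p):
    """Most likely state path for one observation sequence.

    start_p[s]     : initial probability of state s
    trans_p[r][s]  : probability of moving from state r to state s
    emit_p[s][o]   : probability of emitting symbol o in state s
    Returns (state path, log-probability of that path).
    """
    n_states = len(start_p)
    delta = [
        _log(start_p[s]) + _log(emit_p[s][observations[0]])
        for s in range(n_states)
    ]
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = [], []
        for s in range(n_states):
            # best predecessor state for landing in s at this time step
            best_prev = max(
                range(n_states), key=lambda r: delta[r] + _log(trans_p[r][s])
            )
            pointers.append(best_prev)
            new_delta.append(
                delta[best_prev] + _log(trans_p[best_prev][s]) + _log(emit_p[s][obs])
            )
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    best_logp = delta[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_logp
```

In the individual-decoding baseline the abstract compares against, each repetition of a word would be decoded separately in this way; the joint decoding it describes instead processes the repetitions together.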

Rhetoric and public speech in English republicanism, 1642-1681

Relevance:

20.00%

Publisher:

Abstract:

The study analyses the ambivalent relationship that republicanism, as a form of self-government free from domination, had with the ideal of participatory oratory and non-dominated speech on the one hand, and with the danger of unhindered demagogy and its possibly fatal consequences for that form of government on the other. Although previous scholarship has delved deeply into republicanism as well as into rhetoric and public speech, the interplay between those aspects has attracted only scattered interest, and there has been no systematic study considering the variety of republican approaches to rhetoric and public speech in 17th-century England. The rare attempts to do so have been studies in English literature, and they have not analysed the political philosophy of republicanism, as the focus has been on republicanism as a literary culture. This study connects the fields of political theory, political history and literature in order to make a multidisciplinary contribution to intellectual history. The study shows that, within the tradition of classical republicanism, individual authors could make different choices when addressing the problematic topics of public speech and rhetoric, and the variety of their conclusions often set the authors against each other, resulting in the development of their theories through internal debates within the republican tradition. The authors under study were chosen to reflect this variety and the connections between them: the similarities between James Harrington and John Streater, and between John Milton and John Hall of Durham, are shown, as well as the controversies between Harrington and Milton, and Streater and Hall, respectively. In addition, by analysing the writings of Marchamont Nedham, the study shows that the choices were not limited to more, or less, democratic brands of republicanism.
Most significantly, the study provides a thorough analysis of the political philosophies behind the various brands of republicanism, in addition to describing them. By means of this analysis, the study shows that previous attempts to assess the role of free speech and public debate through the lenses of modern, rights-based liberal political theory have resulted in an inappropriate framework for understanding early modern English republicanism. By approaching the topics through concepts used by the republicans (legitimate authority, leadership by oratory, and republican freedom) and through the frames of reference available and familiar to them (the roles of education and institutions), the study presents a thorough and systematic analysis of the role and function of rhetoric and public speech in English republicanism. The findings of this analysis have significant consequences for our current understanding of the history and development of republican political theory, and, more generally, of the connections between democratic theory and free speech.
