984 resultados para Decoding Speech Prosody


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Non-orthogonal space-time block codes (STBC) with large dimensions are attractive because they can simultaneously achieve both high spectral efficiencies (same spectral efficiency as in V-BLAST for a given number of transmit antennas) as well as full transmit diversity. Decoding of non-orthogonal STBCs with large dimensions has been a challenge. In this paper, we present a reactive tabu search (RTS) based algorithm for decoding non-orthogonal STBCs from cyclic division algebras (CDA) having largedimensions. Under i.i.d fading and perfect channel state information at the receiver (CSIR), our simulation results show that RTS based decoding of 12 X 12 STBC from CDA and 4-QAM with 288 real dimensions achieves i) 10(-3) uncoded BER at an SNR of just 0.5 dB away from SISO AWGN performance, and ii) a coded BER performance close to within about 5 dB of the theoretical MIMO capacity, using rate-3/4 turbo code at a spectral efficiency of 18 bps/Hz. RTS is shown to achieve near SISO AWGN performance with less number of dimensions than with LAS algorithm (which we reported recently) at some extra complexity than LAS. We also report good BER performance of RTS when i.i.d fading and perfect CSIR assumptions are relaxed by considering a spatially correlated MIMO channel model, and by using a training based iterative RTS decoding/channel estimation scheme.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Non-orthogonal space-time block codes (STBC) from cyclic division algebras (CDA) are attractive because they can simultaneously achieve both high spectral efficiencies (same spectral efficiency as in V-BLAST for a given number of transmit antennas) as well as full transmit diversity. Decoding of non-orthogonal STBCs with hundreds of dimensions has been a challenge. In this paper, we present a probabilistic data association (PDA) based algorithm for decoding non-orthogonal STBCs with large dimensions. Our simulation results show that the proposed PDA-based algorithm achieves near SISO AWGN uncoded BER as well as near-capacity coded BER (within 5 dB of the theoretical capacity) for large non-orthogonal STBCs from CDA. We study the effect of spatial correlation on the BER, and show that the performance loss due to spatial correlation can be alleviated by providing more receive spatial dimensions. We report good BER performance when a training-based iterative decoding/channel estimation is used (instead of assuming perfect channel knowledge) in channels with large coherence times. A comparison of the performances of the PDA algorithm and the likelihood ascent search (LAS) algorithm (reported in our recent work) is also presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper,we present a belief propagation (BP) based algorithm for decoding non-orthogonal space-time block codes (STBC) from cyclic division algebras (CDA) having large dimensions. The proposed approachinvolves message passing on Markov random field (MRF) representation of the STBC MIMO system. Adoption of BP approach to decode non-orthogonal STBCs of large dimensions has not been reported so far. Our simulation results show that the proposed BP-based decoding achieves increasingly closer to SISO AWGN performance for increased number of dimensions. In addition, it also achieves near-capacity turbo coded BER performance; for e.g., with BP decoding of 24 x 24 STBC from CDA using BPSK (i.e.,n576 real dimensions) and rate-1/2 turbo code (i.e., 12 bps/Hz spectral efficiency), coded BER performance close to within just about 2.5 dB from the theoretical MIMO capacity is achieved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, Space-Time Block Codes (STBCs) with reduced Sphere Decoding Complexity (SDC) are constructed for two-user Multiple-Input Multiple-Output (MIMO) fading multiple access channels. In this set-up, both the users employ identical STBCs and the destination performs sphere decoding for the symbols of the two users. First, we identify the positions of the zeros in the R matrix arising out of the Q-R decomposition of the lattice generator such that (i) the worst case SDC (WSDC) and (ii) the average SDC (ASDC) are reduced. Then, a set of necessary and sufficient conditions on the lattice generator is provided such that the R matrix has zeros at the identified positions. Subsequently, explicit constructions of STBCs which results in the reduced ASDC are presented. The rate (in complex symbols per channel use) of the proposed designs is at most 2/N-t where N-t denotes the number of transmit antennas for each user. We also show that the class of STBCs from complex orthogonal designs (other than the Alamouti design) reduce the WSDC but not the ASDC.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a full-rate, full-diversity space-time block code(STBC) with low maximum likelihood (ML) decoding complexity and high coding gain for the 4 transmit antenna, 2 receive antenna (4 x 2) multiple-input multiple-output (MIMO) system that employs 4/16-QAM. For such a system, the best code known is the DjABBA code and recently, Biglieri, Hong and Viterbo have proposed another STBC (BHV code) for 4-QAM which has lower ML-decoding complexity than the DjABBA code but does not have full-diversity like the DjABBA code. The code proposed in this paper has the same ML-decoding complexity as the BHV code for any square M-QAM but has full-diversity for 4- and 16-QAM. Compared with the DjABBA code, the proposed code has lower ML-decoding complexity for square M-QAM constellation, higher coding gain for 4- and 16-QAM, and hence a better codeword error rate (CER) performance. Simulation results confirming this are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The study analyses the ambivalent relationship republicanism, as a form of self-government free from domination, had with the ideal of participatory oratory and non-dominated speech on the one hand, and with the danger of unhindered demagogy and its possibly fatal consequences to that form of government on the other. Although previous scholarship has delved deeply into republicanism as well as into rhetoric and public speech, the interplay between those aspects has only gathered scattered interest, and there has been no systematic study considering the variety of republican approaches to rhetoric and public speech in 17th-century England. The rare attempts to do so have been studies in English literature, and they have not analysed the political philosophy of republicanism, as the focus has been on republicanism as a literary culture. This study connects the fields of political theory, political history as well as literature in order to make a multidisciplinary contribution to intellectual history. The study shows that, within the tradition of classical republicanism, individual authors could make different choices when addressing the problematic topics of public speech and rhetoric, and the variety of their conclusions often set the authors against each other, resulting in the development of their theories through internal debates within the republican tradition. The authors under study were chosen to reflect this variety and the connections between them: the similarities between James Harrington and John Streater, and between John Milton and John Hall of Durham are shown, as well the controversies between Harrington and Milton, and Streater and Hall, respectively. In addition, by analysing the writings of Marchamont Nedham the study will show that the choices were not limited to more, or less, democratic brands of republicanism. Most significantly, the study provides a thorough analysis of the political philosophies behind the various brands of republicanism, in addition to describing them. By means of this analysis, the study shows that previous attempts to assess the role of free speech and public debate, through the lenses of modern, rights-based liberal political theory have resulted in an inappropriate framework for understanding early modern English republicanism. By approaching the topics through concepts used by the republicans legitimate authority, leadership by oratory, and republican freedom and through the frames of reference available and familiar to them roles of education and institutions the study presents a thorough and systematic analysis of the role and function of rhetoric and public speech in English republicanism. The findings of this analysis have significant consequences to our current understanding of the history and development of republican political theory, and, more generally, of the connections between democratic theory and free speech.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We use parallel weighted finite-state transducers to implement a part-of-speech tagger, which obtains state-of-the-art accuracy when used to tag the Europarl corpora for Finnish, Swedish and English. Our system consists of a weighted lexicon and a guesser combined with a bigram model factored into two weighted transducers. We use both lemmas and tag sequences in the bigram model, which guarantees reliable bigram estimates.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new method based on unit continuity metric (UCM) is proposed for optimal unit selection in text-to-speech (TTS) synthesis. UCM employs two features, namely, pitch continuity metric and spectral continuity metric. The methods have been implemented and tested on our test bed called MILE-TTS and it is available as web demo. After verification by a self selection test, the algorithms are evaluated on 8 paragraphs each for Kannada and Tamil by native users of the languages. Mean-opinion-score (MOS) shows that naturalness and comprehension are better with UCM based algorithm than the non-UCM based ones. The naturalness of the TTS output is further enhanced by a new rule based algorithm for pause prediction for Tamil language. The pauses between the words are predicted based on parts-of-speech information obtained from the input text.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For an n(t) transmit, n(r) receive antenna system (n(t) x nr system), a full-rate space time block code (STBC) transmits min(n(t), n(r)) complex symbols per channel use. In this paper, a scheme to obtain a full-rate STBC for 4 transmit antennas and any n(r), with reduced ML-decoding complexity is presented. The weight matrices of the proposed STBC are obtained from the unitary matrix representations of a Clifford Algebra. By puncturing the symbols of the STBC, full rate designs can be obtained for n(r) < 4. For any value of n(r), the proposed design offers the least ML-decoding complexity among known codes. The proposed design is comparable in error performance to the well known Perfect code for 4 transmit antennas while offering lower ML-decoding complexity. Further, when n(r) < 4, the proposed design has higher ergodic capacity than the punctured Perfect code. Simulation results which corroborate these claims are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recently, Guo and Xia gave sufficient conditions for an STBC to achieve full diversity when a PIC (Partial Interference Cancellation) or a PIC-SIC (PIC with Successive Interference Cancellation) decoder is used at the receiver. In this paper, we give alternative conditions for an STBC to achieve full diversity with PIC and PIC-SIC decoders, which are equivalent to Guo and Xia's conditions, but are much easier to check. Using these conditions, we construct a new class of full diversity PIC-SIC decodable codes, which contain the Toeplitz codes and a family of codes recently proposed by Zhang, Xu et. al. as proper subclasses. With the help of the new criteria, we also show that a class of PIC-SIC decodable codes recently proposed by Zhang, Shi et. al. can be decoded with much lower complexity than what is reported, without compromising on full diversity.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The paper describes a modular, unit selection based TTS framework, which can be used as a research bed for developing TTS in any new language, as well as studying the effect of changing any parameter during synthesis. Using this framework, TTS has been developed for Tamil. Synthesis database consists of 1027 phonetically rich prerecorded sentences. This framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications like mobiles and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resultant synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that with uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation,thus proposing to exploit the misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with their confused ones.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traditional subspace based speech enhancement (SSE)methods use linear minimum mean square error (LMMSE) estimation that is optimal if the Karhunen Loeve transform (KLT) coefficients of speech and noise are Gaussian distributed. In this paper, we investigate the use of Gaussian mixture (GM) density for modeling the non-Gaussian statistics of the clean speech KLT coefficients. Using Gaussian mixture model (GMM), the optimum minimum mean square error (MMSE) estimator is found to be nonlinear and the traditional LMMSE estimator is shown to be a special case. Experimental results show that the proposed method provides better enhancement performance than the traditional subspace based methods.Index Terms: Subspace based speech enhancement, Gaussian mixture density, MMSE estimation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We formulate a two-stage Iterative Wiener filtering (IWF) approach to speech enhancement, bettering the performance of constrained IWF, reported in literature. The codebook constrained IWF (CCIWF) has been shown to be effective in achieving convergence of IWF in the presence of both stationary and non-stationary noise. To this, we include a second stage of unconstrained IWF and show that the speech enhancement performance can be improved in terms of average segmental SNR (SSNR), Itakura-Saito (IS) distance and Linear Prediction Coefficients (LPC) parameter coincidence. We also explore the tradeoff between the number of CCIWF iterations and the second stage IWF iterations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Effective feature extraction for robust speech recognition is a widely addressed topic and currently there is much effort to invoke non-stationary signal models instead of quasi-stationary signal models leading to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling and recently new feature sets for automatic speech recognition (ASR) have been derived based on a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performances for robust speech recognition in noise, using the AURORA-2 database. We show that FEPSTRUM representation proposed is more effective than others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM