996 results for Electrical Communication Engineering
Abstract:
Two families of low correlation QAM sequences are presented here. In a CDMA setting, these sequences have the ability to transport a large amount of data as well as enable variable-rate signaling on the reverse link. The first family, $\mathcal{I}^2\mathcal{SQ}\text{-}B$, is constructed by interleaving 2 selected QAM sequences. This family is defined over $M^2$-QAM, where $M = 2^m$, $m \ge 2$. Over 16-QAM, the normalized maximum correlation $\bar{\theta}_{\max}$ is bounded above by $\lesssim 1.17\sqrt{N}$, where $N$ is the period of the sequences in the family. This upper bound on $\bar{\theta}_{\max}$ is the lowest among all known sequence families over 16-QAM. The second family, $\mathcal{I}^4\mathcal{SQ}$, is constructed by interleaving 4 selected QAM sequences. This family is defined over $M^2$-QAM, where $M = 2^m$, $m \ge 3$, i.e., 64-QAM and beyond. For sequences in this family over 64-QAM, $\bar{\theta}_{\max}$ is upper bounded by $\lesssim 1.60\sqrt{N}$. For large $M$, $\bar{\theta}_{\max} \lesssim 1.64\sqrt{N}$. These upper bounds on $\bar{\theta}_{\max}$ are the lowest among all known sequence families over $M^2$-QAM, $M = 2^m$, $m \ge 3$.
Abstract:
Traditional subspace based speech enhancement (SSE) methods use linear minimum mean square error (LMMSE) estimation, which is optimal if the Karhunen-Loève transform (KLT) coefficients of speech and noise are Gaussian distributed. In this paper, we investigate the use of a Gaussian mixture (GM) density for modeling the non-Gaussian statistics of the clean speech KLT coefficients. Using a Gaussian mixture model (GMM), the optimum minimum mean square error (MMSE) estimator is found to be nonlinear, and the traditional LMMSE estimator is shown to be a special case. Experimental results show that the proposed method provides better enhancement performance than the traditional subspace based methods. Index Terms: Subspace based speech enhancement, Gaussian mixture density, MMSE estimation.
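The claim that the GMM-based MMSE estimator is nonlinear, with LMMSE as a special case, can be illustrated with a minimal scalar sketch. It assumes a single coefficient observed as y = x + n with zero-mean Gaussian noise of known variance. The function names and the scalar setting are illustrative assumptions, not the estimator used in the paper.

```python
import math

def gaussian_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gmm_mmse_estimate(y, weights, means, variances, noise_var):
    """MMSE estimate of x from y = x + n, with x ~ GMM and n ~ N(0, noise_var).

    Hypothetical scalar illustration: E[x | y] is a posterior-weighted sum of
    per-component linear (Wiener-type) estimates.
    """
    # Given component k, y is Gaussian with variance var_k + noise_var.
    likelihoods = [w * gaussian_pdf(y, m, v + noise_var)
                   for w, m, v in zip(weights, means, variances)]
    total = sum(likelihoods)
    estimate = 0.0
    for lik, m, v in zip(likelihoods, means, variances):
        gain = v / (v + noise_var)          # per-component LMMSE gain
        estimate += (lik / total) * (m + gain * (y - m))
    return estimate
```

With a single mixture component the posterior weight is always 1 and the estimate reduces to the usual linear gain applied to y. With several components the weights depend on y, so the estimator is no longer linear in y.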
Abstract:
We formulate a two-stage iterative Wiener filtering (IWF) approach to speech enhancement that improves on the performance of the constrained IWF reported in the literature. The codebook constrained IWF (CCIWF) has been shown to be effective in achieving convergence of IWF in the presence of both stationary and non-stationary noise. To this, we add a second stage of unconstrained IWF and show that the speech enhancement performance can be improved in terms of average segmental SNR (SSNR), Itakura-Saito (IS) distance, and Linear Prediction Coefficients (LPC) parameter coincidence. We also explore the tradeoff between the number of CCIWF iterations and the number of second-stage IWF iterations.
Abstract:
Effective feature extraction for robust speech recognition is a widely addressed topic, and currently there is much effort to invoke non-stationary signal models instead of the quasi-stationary signal models that lead to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling, and recently new feature sets for automatic speech recognition (ASR) have been derived based on a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performance for robust speech recognition in noise, using the AURORA-2 database. We show that the proposed FEPSTRUM representation is more effective than the others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM.
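For readers unfamiliar with the Teager energy operator mentioned above, the sketch below shows its standard discrete form, psi[n] = x[n]^2 - x[n-1] x[n+1]. The abstract does not say how the TEO is combined with FEPSTRUM, so this only illustrates the operator itself; the function name and test tone are assumptions.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Illustrative test tone (values chosen arbitrarily): A * cos(omega * n).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)

# For a pure tone the operator is constant and equals A^2 * sin(omega)^2,
# i.e. it tracks both amplitude and frequency.
expected = amplitude ** 2 * math.sin(omega) ** 2
```

Because the output depends on both the amplitude and the frequency of the tone, the operator is a natural fit for AM-FM style signal descriptions.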
Abstract:
Segmental dynamic time warping (DTW) has been demonstrated to be a useful technique for finding acoustic similarity scores between segments of two speech utterances. Due to its high computational requirements, it has had to be computed offline, limiting the applications of the technique. In this paper, we present results of parallelizing this task by distributing the workload in either a static or dynamic way on an 8-processor cluster, and we discuss the trade-offs among different distribution schemes. We show that online unsupervised pattern discovery using segmental DTW is plausible with as few as 8 processors. This brings the task within reach of today's general-purpose multi-core servers. We also show results on a 32-processor system and discuss factors affecting the scalability of our methods.
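As a rough illustration of the workload involved, the sketch below implements plain DTW on 1-D sequences and a round-robin static split of jobs across workers. It is not the segmental variant, which adds band constraints and multiple start points and operates on feature vectors, and the partitioning helper is a hypothetical example of a static scheme rather than the one evaluated in the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def static_partition(jobs, workers):
    """Hypothetical static scheme: deal jobs out round-robin, one list per worker."""
    return [jobs[w::workers] for w in range(workers)]
```

Each DTW call costs time proportional to the product of the two sequence lengths, which is why comparing every utterance pair is expensive. A static split is simple but can leave workers idle when job sizes differ, which is the kind of trade-off a dynamic scheme addresses.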
Abstract:
We consider a framework in which several service providers offer downlink wireless data access service in a certain area. Each provider serves its end-users through opportunistic secondary access to licensed spectrum, and needs to pay the primary license holders of the spectrum usage-based and membership-based charges for such secondary spectrum access. In these circumstances, if providers pool their resources and allow end-users to be served by any of the cooperating providers, the total user satisfaction as well as the aggregate revenue earned by providers may increase. We use coalitional game theory to investigate such cooperation among providers, and show that the optimal cooperation schemes can be obtained as solutions of convex optimizations. We next show that under a usage-based charging scheme, if all providers cooperate, there always exists an operating point that maximizes the aggregate revenue of providers, while presenting each provider a share of the revenue such that no subset of providers has an incentive to leave the coalition. Furthermore, such an operating point can be computed in polynomial time. Finally, we show that when the charging scheme involves membership-based charges, the above result holds in important special cases.
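The stability condition described above (no subset of providers has an incentive to leave) is the core of a coalitional game. The sketch below checks it by brute force for a small game. The function name and the way coalition values are supplied are illustrative assumptions, and the check enumerates all coalitions, so it illustrates the definition only and is not the polynomial-time procedure the abstract refers to.

```python
from itertools import combinations

def in_core(players, value, allocation, tol=1e-9):
    """True if `allocation` splits value(all players) exactly and no coalition
    could earn more on its own than its members receive in total."""
    grand = frozenset(players)
    if abs(sum(allocation[p] for p in players) - value(grand)) > tol:
        return False  # revenue not fully (or over-) distributed
    for size in range(1, len(players)):
        for coalition in combinations(players, size):
            share = sum(allocation[p] for p in coalition)
            if share < value(frozenset(coalition)) - tol:
                return False  # this coalition would rather break away
    return True

# Made-up revenue function for illustration: a coalition of k providers earns k^2.
providers = ["A", "B", "C"]
revenue = lambda coalition: float(len(coalition) ** 2)
```

With this made-up revenue function, an equal split of the grand-coalition revenue satisfies every coalition, whereas a lopsided split can leave a pair of providers better off on their own.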
Abstract:
In this paper, we address the problem of transmitting correlated sources over a fading multiple access channel (MAC). We provide sufficient conditions for transmission with given distortions. Next, these conditions are specialized to a Gaussian MAC (GMAC). Transmission schemes for discrete and Gaussian sources over a fading GMAC are considered. Various power allocation strategies are also compared. Keywords: Fading MAC, Power allocation, Random TDMA, Amplify and Forward, Correlated sources.
Abstract:
In this paper, we present a new speech enhancement approach that is based on exploiting the intra-frame dependency of discrete cosine transform (DCT) domain coefficients. Existing enhancement techniques treat the transform-domain coefficients independently. Instead of this traditional approach of independently processing the scalars, we split the DCT domain noisy speech vector into sub-vectors, and each sub-vector is enhanced independently. Through this sub-vector based approach, the higher dimensional enhancement advantage, viz. non-linear dependency, is exploited. In the developed method, each clean speech sub-vector is modeled using a Gaussian mixture (GM) density. We show that the proposed Gaussian mixture model (GMM) based DCT domain method, using the sub-vector processing approach, provides better performance than the conventional approach of enhancing the transform domain scalar components independently. Performance improvement over the recently proposed GMM based time domain approach is also shown.
Abstract:
We develop a Gaussian mixture model (GMM) based vector quantization (VQ) method for coding wideband speech line spectrum frequency (LSF) parameters at low complexity. The PDF of the LSF source vector is modeled using a Gaussian mixture (GM) density with a higher number of uncorrelated Gaussian mixtures, and an optimum scalar quantizer (SQ) is designed for each Gaussian mixture. The quantization complexity is reduced by using only the relevant subset of the available optimum SQs. For an input vector, the subset of quantizers is chosen using a nearest-neighbor criterion. The developed method is compared with recent VQ methods and shown to provide high quality rate-distortion (R/D) performance at lower complexity. In addition, the developed method provides the advantages of bitrate scalability and rate-independent complexity.
Abstract:
We address the problem of local-polynomial modeling of smooth time-varying signals with unknown functional form, in the presence of additive noise. The problem formulation is in the time domain and the polynomial coefficients are estimated in the pointwise minimum mean square error (PMMSE) sense. The choice of the window length for local modeling introduces a bias-variance tradeoff, which we solve optimally by using the intersection-of-confidence-intervals (ICI) technique. The combination of the local polynomial model and the ICI technique gives rise to an adaptive signal model equipped with a time-varying PMMSE-optimal window length whose performance is superior to that obtained by using a fixed window length. We also evaluate the sensitivity of the ICI technique with respect to the confidence interval width. Simulation results on electrocardiogram (ECG) signals show that at 0 dB signal-to-noise ratio (SNR), one can achieve about 12 dB improvement in SNR. Monte-Carlo performance analysis shows that the performance is comparable to that of basic wavelet techniques. At 0 dB SNR, the adaptive window technique yields about 2-3 dB higher SNR than wavelet regression techniques, and for SNRs greater than 12 dB, the wavelet techniques yield about 2 dB higher SNR.
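The ICI window-selection rule can be sketched as follows. To keep the example short it uses a zero-order local fit (a windowed mean) rather than a general polynomial, and assumes the noise standard deviation is known. The function name, the candidate half-widths, and the default threshold `gamma` are illustrative assumptions.

```python
import math

def ici_window(x, n, half_widths, noise_std, gamma=2.0):
    """Return (half_width, estimate) for sample n: the largest candidate window
    whose confidence interval still intersects those of all smaller windows.

    `half_widths` must be in increasing order.
    """
    lower, upper = -math.inf, math.inf
    best_h, best_est = None, None
    for h in half_widths:
        window = x[max(0, n - h): n + h + 1]
        est = sum(window) / len(window)                   # zero-order local fit
        half_ci = gamma * noise_std / math.sqrt(len(window))
        lower = max(lower, est - half_ci)                 # running intersection
        upper = min(upper, est + half_ci)
        if lower > upper:                                 # intersection is empty
            break                                         # bias now dominates
        best_h, best_est = h, est
    return best_h, best_est
```

Wider windows shrink the confidence interval (lower variance), but they also pull the estimate away once the window spans a change in the signal (higher bias). The rule stops growing the window at the point where the intervals stop overlapping, and the parameter `gamma` controls the interval width whose sensitivity the abstract mentions.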
Abstract:
We determine the optimal allocation of power between the analog and digital sections of an RF receiver while meeting the BER constraint. Unlike conventional RF receiver designs, we treat the SNR at the output of the analog front end ($\mathrm{SNR}_{AD}$) as a design parameter rather than a specification to arrive at this optimal allocation. We first determine the relationship of $\mathrm{SNR}_{AD}$ to the resolution and operating frequency of the digital section. We then use power models for the analog and digital sections to solve the power minimization problem. As an example, we consider an 802.15.4-compliant low-IF receiver operating at 2.4 GHz in 0.13 μm technology with a 1.2 V power supply. We find that the overall receiver power is minimized by having the analog front end provide an SNR of 1.3 dB and the ADC and the digital section operate at 1-bit resolution with an 18 MHz sampling frequency, while achieving a power dissipation of 7 mW.
Abstract:
A low correlation interleaved QAM sequence family is presented here. In a CDMA setting, these sequences have the ability to transport a large amount of data as well as enable variable-rate signaling on the reverse link. The new interleaved selected family INQ has period $N$ and normalized maximum correlation parameter $\bar{\theta}_{\max}$ bounded above by $\lesssim a\sqrt{N}$, where $a$ ranges from 1.17 in the 16-QAM case to 1.99 for large $M^2$-QAM, where $M = 2^m$, $m \ge 2$. Each user is enabled to transfer $m + 1$ bits of data per period of the spreading sequence. These constructions have the lowest known value of maximum correlation of any sequence family with the same alphabet.
Abstract:
The capacity region of a two-user Gaussian Multiple Access Channel (GMAC) with complex finite input alphabets and continuous output alphabet is studied. When both users are equipped with the same code alphabet, it is shown that rotating one of the users' alphabets by an appropriate angle not only makes the new pair of alphabets uniquely decodable, but also enlarges the capacity region. For this set-up, we identify the primary problem to be finding the angle(s) of rotation between the alphabets such that the capacity region is maximally enlarged. It is shown that the angle of rotation which provides maximum enlargement of the capacity region also minimizes the union bound on the probability of error of the sum alphabet, and vice versa. The optimum angle(s) of rotation vary with the SNR. Through simulations, the optimal angle(s) of rotation that give maximum enlargement of the capacity region of the GMAC are presented for some well-known alphabets, such as M-QAM and M-PSK for some M, at several values of SNR. It is shown that for a large number of points in the alphabets, the capacity gains due to rotation progressively reduce. As the number of points N tends to infinity, our results match the results in the literature, wherein the capacity region of the Gaussian code alphabet doesn't change with rotation for any SNR.
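The unique-decodability effect of rotation can be seen in a small sketch that builds the sum alphabet of two QPSK users and measures its minimum distance. The choice of QPSK, the helper names, and the example angle of pi/6 are illustrative assumptions. The angle is not claimed to be optimal, since the abstract notes that the optimum varies with SNR.

```python
import cmath
import math

# Unit-energy QPSK alphabet used by both users (illustrative choice).
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]

def sum_alphabet(alphabet, theta):
    """All points s1 + exp(j*theta) * s2 seen at the receiver in the noiseless case."""
    rot = cmath.exp(1j * theta)
    return [s1 + rot * s2 for s1 in alphabet for s2 in alphabet]

def min_distance(points):
    """Smallest distance between any two entries of the sum alphabet."""
    return min(abs(p - q) for i, p in enumerate(points) for q in points[i + 1:])

unrotated = min_distance(sum_alphabet(QPSK, 0.0))        # coinciding sum points
rotated = min_distance(sum_alphabet(QPSK, math.pi / 6))  # all sum points distinct
```

A minimum distance of zero means two different symbol pairs produce the same received point, so the pair of alphabets is not uniquely decodable. A strictly positive minimum distance after rotation means every symbol pair maps to its own point.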