895 results for Noisy corpora.
Abstract:
In this paper, new results and insights are derived for the performance of multiple-input, single-output systems with beamforming at the transmitter, when the channel state information is quantized and sent to the transmitter over a noisy feedback channel. It is assumed that there exists a per-antenna power constraint at the transmitter; hence, the equal gain transmission (EGT) beamforming vector is quantized and sent from the receiver to the transmitter. The loss in received signal-to-noise ratio (SNR) relative to perfect beamforming is analytically characterized, and it is shown that at high rates, the overall distortion can be expressed as the sum of the quantization-induced distortion and the channel error-induced distortion, and that the asymptotic performance depends on the error-rate behavior of the noisy feedback channel as the number of codepoints gets large. The optimum density of codepoints (also known as the point density) that minimizes the overall distortion subject to a boundedness constraint is shown to be the same as the point density for a noiseless feedback channel, i.e., the uniform density. The binary symmetric channel with random index assignment is a special case of the analysis, and it is shown that as the number of quantized bits gets large, the distortion approaches that obtained with random beamforming. The accuracy of the theoretical expressions obtained is verified through Monte Carlo simulations.
Abstract:
This paper considers the design and analysis of a filter at the receiver of a source coding system to mitigate the excess Mean-Squared Error (MSE) distortion caused by channel errors. It is assumed that the source encoder is channel-agnostic, i.e., that a Vector Quantization (VQ) based compression designed for a noiseless channel is employed. The index output by the source encoder is sent over a noisy memoryless discrete symmetric channel, and the possibly incorrect received index is decoded by the corresponding VQ decoder. The output of the VQ decoder is processed by a receive filter to obtain an estimate of the source instantiation. In the sequel, the optimum linear receive filter structure to minimize the overall MSE is derived, and shown to have a minimum mean-squared-error receiver-type structure. Further, expressions are derived for the resulting high-rate MSE performance. The performance is compared with the MSE obtained using conventional VQ as well as the channel-optimized VQ. The accuracy of the expressions is demonstrated through Monte Carlo simulations.
Abstract:
We address the problem of local-polynomial modeling of smooth time-varying signals with unknown functional form, in the presence of additive noise. The problem formulation is in the time domain and the polynomial coefficients are estimated in the pointwise minimum mean square error (PMMSE) sense. The choice of the window length for local modeling introduces a bias-variance tradeoff, which we solve optimally by using the intersection-of-confidence-intervals (ICI) technique. The combination of the local polynomial model and the ICI technique gives rise to an adaptive signal model equipped with a time-varying PMMSE-optimal window length whose performance is superior to that obtained by using a fixed window length. We also evaluate the sensitivity of the ICI technique with respect to the confidence interval width. Simulation results on electrocardiogram (ECG) signals show that at 0 dB signal-to-noise ratio (SNR), one can achieve about 12 dB improvement in SNR. Monte Carlo performance analysis shows that the performance is comparable to the basic wavelet techniques. For 0 dB SNR, the adaptive window technique yields about 2-3 dB higher SNR than wavelet regression techniques, and for SNRs greater than 12 dB, the wavelet techniques yield about 2 dB higher SNR.
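The ICI window-selection rule mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a zeroth-order local model (a plain moving average instead of a local polynomial), a known noise standard deviation, and hypothetical names `local_mean_estimates`, `ici_select`, and `gamma` (the confidence-interval width parameter).

```python
import math

def local_mean_estimates(signal, t, half_widths, noise_std):
    """Estimate signal[t] with moving averages of increasing half-width.

    Assumption: order-0 local model and known noise_std, so the standard
    deviation of each estimate is noise_std / sqrt(window size).
    """
    estimates, stds = [], []
    for h in half_widths:
        window = signal[max(0, t - h): t + h + 1]
        estimates.append(sum(window) / len(window))
        stds.append(noise_std / math.sqrt(len(window)))
    return estimates, stds

def ici_select(estimates, stds, gamma):
    """Return the index of the largest window whose confidence interval
    still intersects the intervals of all smaller windows."""
    lo, hi = float("-inf"), float("inf")
    best = 0
    for i, (est, sd) in enumerate(zip(estimates, stds)):
        lo = max(lo, est - gamma * sd)  # running max of lower bounds
        hi = min(hi, est + gamma * sd)  # running min of upper bounds
        if lo > hi:                     # intersection became empty
            break
        best = i
    return best

# Usage: a step signal; the selected window should stop before the step.
signal = [0.0] * 10 + [10.0] * 10
half_widths = [1, 2, 4, 8]
est, sd = local_mean_estimates(signal, 5, half_widths, noise_std=1.0)
chosen = half_widths[ici_select(est, sd, gamma=2.0)]
```

A smaller `gamma` makes the rule stop earlier (shorter windows, lower bias, higher variance), which is the sensitivity the abstract refers to.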
Abstract:
This paper considers the high-rate performance of source coding for noisy discrete symmetric channels with random index assignment (IA). Accurate analytical models are developed to characterize the expected distortion performance of vector quantization (VQ) for a large class of distortion measures. It is shown that when the point density is continuous, the distortion can be approximated as the sum of the source quantization distortion and the channel-error induced distortion. Expressions are also derived for the continuous point density that minimizes the expected distortion. Next, for the case of mean squared error distortion, a more accurate analytical model for the distortion is derived by allowing the point density to have a singular component. The extent of the singularity is also characterized. These results provide analytical models for the expected distortion performance of both conventional VQ and channel-optimized VQ. As a practical example, compression of the linear predictive coding parameters in the wideband speech spectrum is considered, with the log spectral distortion as performance metric. The theory is able to correctly predict the channel error rate that is permissible for operation at a particular level of distortion.
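The two-term structure of the distortion can be illustrated on a toy case. The sketch below is not the paper's model: it assumes a uniform source on [0, 1), a uniform scalar quantizer with midpoint codepoints, and a binary symmetric channel averaged over random index assignments, under which an index error is equally likely to land on any of the other codepoints. The function name `random_ia_distortion` and its parameters are hypothetical. In this special case the midpoints are the centroids of their cells, so the two terms add exactly; for general VQ the abstract's result is a high-rate approximation.

```python
def random_ia_distortion(bits, eps):
    """Return (quantization MSE, channel-induced MSE) for a toy setup.

    Assumptions: source uniform on [0, 1); uniform scalar quantizer with
    2**bits midpoint codepoints (bits >= 1); BSC with crossover eps;
    averaging over random index assignments spreads the index-error
    probability evenly over the other codepoints.
    """
    n = 2 ** bits
    step = 1.0 / n
    codepoints = [(i + 0.5) * step for i in range(n)]
    d_quant = step ** 2 / 12.0            # MSE of a uniform cell of width step
    p_err = 1.0 - (1.0 - eps) ** bits     # probability that any bit flips
    p_each = p_err / (n - 1)              # per-codepoint error probability
    d_chan = 0.0
    for ci in codepoints:                 # each cell has probability 1/n
        for cj in codepoints:
            if ci != cj:
                d_chan += (1.0 / n) * p_each * (ci - cj) ** 2
    return d_quant, d_chan

# Usage: total expected MSE at 4 bits and 1% crossover probability.
d_q, d_c = random_ia_distortion(bits=4, eps=0.01)
total = d_q + d_c
```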
Abstract:
Signal acquisition under a compressed sensing (CS) scheme makes it possible to acquire and reconstruct signals that are sparse in some basis incoherent with the measurement kernel, using a sub-Nyquist number of measurements. In particular, when the sole objective of the acquisition is to detect the frequency of a signal rather than to reconstruct it exactly, an undersampling framework such as CS is able to perform the task. In this paper, we explore the acquisition and frequency detection of multiple analog signals heavily corrupted with additive white Gaussian noise. We improve upon the MOSAICS architecture proposed in our previous work to include a wider class of signals having non-integral frequency components. This makes it possible to perform multiplexed compressed sensing for general frequency-sparse signals.
Abstract:
We consider the problem of Probably Approximately Correct (PAC) learning of a binary classifier from noisy labeled examples acquired from multiple annotators (each characterized by a respective classification noise rate). First, we consider the complete information scenario, where the learner knows the noise rates of all the annotators. For this scenario, we derive a sample complexity bound for the Minimum Disagreement Algorithm (MDA) on the number of labeled examples to be obtained from each annotator. Next, we consider the incomplete information scenario, where each annotator is strategic and holds the respective noise rate as private information. For this scenario, we design a cost-optimal procurement auction mechanism along the lines of Myerson's optimal auction design framework in a non-trivial manner. This mechanism satisfies the incentive compatibility property, thereby enabling the learner to elicit the true noise rates of all the annotators.
Abstract:
Scatter/Gather systems are becoming increasingly useful in browsing document corpora. The usability of present-day systems is restricted to monolingual corpora, and their methods for clustering and labeling do not easily extend to the multilingual setting, especially in the absence of dictionaries/machine translation. In this paper, we study the cluster labeling problem for multilingual corpora in the absence of machine translation, but using comparable corpora. Using a variational approach, we show that multilingual topic models can effectively handle the cluster labeling problem, which in turn allows us to design a novel Scatter/Gather system, ShoBha. Experimental results on three datasets, namely the Canadian Hansards corpus, the entire overlapping Wikipedia of English, Hindi and Bengali articles, and a trilingual news corpus containing 41,000 articles, confirm the utility of the proposed system.
Abstract:
This study considers linear filtering methods for minimising the end-to-end average distortion of a fixed-rate source quantisation system. For the source encoder, both scalar and vector quantisation are considered. The codebook index output by the encoder is sent over a noisy discrete memoryless channel whose statistics could be unknown at the transmitter. At the receiver, the code vector corresponding to the received index is passed through a linear receive filter, whose output is an estimate of the source instantiation. Under this setup, an approximate expression for the average weighted mean-square error (WMSE) between the source instantiation and the reconstructed vector at the receiver is derived using high-resolution quantisation theory. Also, a closed-form expression for the linear receive filter that minimises the approximate average WMSE is derived. The generality of the framework developed is further demonstrated by theoretically analysing the performance of other adaptation techniques that can be employed when the channel statistics are also available at the transmitter, such as joint transmit-receive linear filtering and codebook scaling. Monte Carlo simulation results validate the theoretical expressions, and illustrate the improvement in the average distortion that can be obtained using linear filtering techniques.
Abstract:
This paper considers cooperative spectrum sensing algorithms for Cognitive Radios which focus on reducing the number of samples needed to make a reliable detection. We propose algorithms based on decentralized sequential hypothesis testing in which the Cognitive Radios sequentially collect the observations, make local decisions and send them to the fusion center for further processing to make a final decision on spectrum usage. The reporting channel between the Cognitive Radios and the fusion center is modeled, more realistically, as a Multiple Access Channel (MAC) with receiver noise. Furthermore, the communication for reporting is limited, thereby reducing the communication cost. We start with an algorithm where the fusion center uses an SPRT-like (Sequential Probability Ratio Test) procedure and theoretically analyze its performance. Asymptotically, its performance is close to the optimal centralized test without fusion center noise. We further modify this algorithm to improve its performance at practical operating points. Later we generalize these algorithms to handle uncertainties in SNR and fading. (C) 2014 Elsevier B.V. All rights reserved.
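For readers unfamiliar with the SPRT, the basic single-node test that such procedures build on can be sketched as below. This is the textbook Wald test for a Gaussian mean shift with known variance, using Wald's approximate thresholds; it is not the decentralized fusion-center algorithm of the paper, and the name `sprt_gaussian` and its parameters are illustrative assumptions.

```python
import math

def sprt_gaussian(samples, mu0, mu1, sigma, alpha, beta):
    """Wald SPRT for H0: mean mu0 versus H1: mean mu1, known sigma.

    alpha and beta are the target false-alarm and miss probabilities.
    Returns (decision, samples_used); decision is "H0", "H1", or None
    if the samples run out before a threshold is crossed.
    """
    upper = math.log((1.0 - beta) / alpha)   # decide H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # decide H0 at or below this
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return None, n

# Usage: noiseless observations at the H1 mean trigger a decision quickly.
decision, used = sprt_gaussian([1.0] * 50, mu0=0.0, mu1=1.0,
                               sigma=1.0, alpha=0.05, beta=0.05)
```

The test stops as soon as the accumulated evidence is strong enough, which is why sequential schemes typically need fewer samples than fixed-sample-size detectors at the same error targets.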
Abstract:
Retransmission protocols such as HDLC and TCP are designed to ensure reliable communication over noisy channels (i.e., channels that can corrupt messages). Thakkar et al. [15] have recently presented an algorithmic verification technique for deterministic streaming string transducer (DSST) models of such protocols. The verification problem is posed as equivalence checking between the specification and protocol DSSTs. In this paper, we argue that more general models need to be obtained using non-deterministic streaming string transducers (NSSTs). However, equivalence checking is undecidable for NSSTs. We present two classes where the models belong to a sub-class of NSSTs for which equivalence checking is decidable. (C) 2015 Elsevier B.V. All rights reserved.
Abstract:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high-quality corpora, especially in Indian languages, makes this problem hard, e.g. state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other "auxiliary" languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
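The MRR metric quoted above is computed as in the following sketch. The data layout (dictionaries keyed by source word) and the names `mean_reciprocal_rank` and `relative_improvement` are assumptions made for illustration; only the two MRR values in the usage line come from the abstract.

```python
def mean_reciprocal_rank(ranked_candidates, gold):
    """Mean reciprocal rank of the first correct translation.

    ranked_candidates: dict mapping a source word to its candidate
        translations, best first (hypothetical layout).
    gold: dict mapping a source word to the set of correct translations.
    A source word with no correct candidate contributes 0.
    """
    if not ranked_candidates:
        return 0.0
    total = 0.0
    for src, candidates in ranked_candidates.items():
        correct = gold.get(src, set())
        for rank, cand in enumerate(candidates, start=1):
            if cand in correct:
                total += 1.0 / rank
                break
    return total / len(ranked_candidates)

def relative_improvement(old, new):
    """Relative change from old to new, as a percentage."""
    return 100.0 * (new - old) / old

# Usage: the Telugu-Kannada figures from the abstract.
gain = relative_improvement(0.187, 0.419)   # about 124 percent
```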
Abstract:
In this paper, methods are developed for enhancement and analysis of autoregressive moving average (ARMA) signals observed in additive noise which can be represented as mixtures of heavy-tailed non-Gaussian sources and a Gaussian background component. Such models find application in systems such as atmospheric communications channels or early sound recordings which are prone to intermittent impulse noise. Markov chain Monte Carlo (MCMC) simulation techniques are applied to the joint problem of signal extraction, model parameter estimation and detection of impulses within a fully Bayesian framework. The algorithms require only simple linear iterations for all of the unknowns, including the MA parameters, which is in contrast with existing MCMC methods for analysis of noise-free ARMA models. The methods are illustrated using synthetic data and noise-degraded sound recordings.
Abstract:
This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.