105 resultados para Speech in Noise
Resumo:
In this paper, we present a new speech enhancement approach, that is based on exploiting the intra-frame dependency of discrete cosine transform (DCT) domain coefficients. It can be noted that the existing enhancement techniques treat the transformdomain coefficients independently. Instead of this traditional approach of independently processing the scalars, we split the DCT domain noisy speech vector into sub-vectors and each sub-vector is enhanced independently. Through this sub-vector based approach, the higher dimensional enhancement advantage, viz. non-linear dependency, is exploited. In the developed method, each clean speech sub-vector is modeled using a Gaussian mixture (GM) density. We show that the proposed Gaussian mixture model (GMM) based DCT domain method, using sub-vector processing approach, provides better performance than the conventional approach of enhancing the transform domain scalar components independently. Performance improvement over the recently proposed GMM based time domain approach is also shown.
Resumo:
Localization of underwater acoustic sources is a problem of great interest in the area of ocean acoustics. There exist several algorithms for source localization based on array signal processing.It is of interest to know the theoretical performance limits of these estimators. In this paper we develop expressions for the Cramer-Rao-Bound (CRB) on the variance of direction-of-arrival(DOA) and range-depth estimators of underwater acoustic sources in a shallow range-independent ocean for the case of generalized Gaussian noise. We then study the performance of some of the popular source localization techniques,through simulations, for DOA/range-depth estimation of underwater acoustic sources in shallow ocean by comparing the variance of the estimators with the corresponding CRBs.
Resumo:
Chronic recording of neural signals is indispensable in designing efficient brain–machine interfaces and to elucidate human neurophysiology. The advent of multichannel micro-electrode arrays has driven the need for electronics to record neural signals from many neurons. The dynamic range of the system can vary over time due to change in electrode–neuron distance and background noise. We propose a neural amplifier in UMC 130 nm, 1P8M complementary metal–oxide–semiconductor (CMOS) technology. It can be biased adaptively from 200 nA to 2 $mu{rm A}$, modulating input referred noise from 9.92 $mu{rm V}$ to 3.9 $mu{rm V}$. We also describe a low noise design technique which minimizes the noise contribution of the load circuitry. Optimum sizing of the input transistors minimizes the accentuation of the input referred noise of the amplifier and obviates the need of large input capacitance. The amplifier achieves a noise efficiency factor of 2.58. The amplifier can pass signal from 5 Hz to 7 kHz and the bandwidth of the amplifier can be tuned for rejecting low field potentials (LFP) and power line interference. The amplifier achieves a mid-band voltage gain of 37 dB. In vitro experiments are performed to validate the applicability of the neural low noise amplifier in neural recording systems.
Resumo:
The problem of on-line recognition and retrieval of relatively weak industrial signals such as partial discharges (PD), buried in excessive noise, has been addressed in this paper. The major bottleneck being the recognition and suppression of stochastic pulsive interference (PI) due to the overlapping broad band frequency spectrum of PI and PD pulses. Therefore, on-line, onsite, PD measurement is hardly possible in conventional frequency based DSP techniques. The observed PD signal is modeled as a linear combination of systematic and random components employing probabilistic principal component analysis (PPCA) and the pdf of the underlying stochastic process is obtained. The PD/PI pulses are assumed as the mean of the process and modeled instituting non-parametric methods, based on smooth FIR filters, and a maximum aposteriori probability (MAP) procedure employed therein, to estimate the filter coefficients. The classification of the pulses is undertaken using a simple PCA classifier. The methods proposed by the authors were found to be effective in automatic retrieval of PD pulses completely rejecting PI.
Resumo:
In this paper, we have studied the effect of gate-drain/source overlap (LOV) on the drain channel noise and induced gate current noise (SIg) in 90 nm N-channel metal oxide semiconductor field effect transistors using process and device simulations. As the change in overlap affects the gate tunneling leakage current, its effect on shot noise component of SIg has been taken into consideration. It has been shown that “control over LOV” allows us to get better noise performance from the device, i.e., it allows us to reduce noise figure, for a given leakage current constraint. LOV in the range of 0–10 nm is recommended for the 90 nm gate length transistors, in order to get the best performance in radio frequency applications.
Resumo:
The design and operation of the minimum cost classifier, where the total cost is the sum of the measurement cost and the classification cost, is computationally complex. Noting the difficulties associated with this approach, decision tree design directly from a set of labelled samples is proposed in this paper. The feature space is first partitioned to transform the problem to one of discrete features. The resulting problem is solved by a dynamic programming algorithm over an explicitly ordered state space of all outcomes of all feature subsets. The solution procedure is very general and is applicable to any minimum cost pattern classification problem in which each feature has a finite number of outcomes. These techniques are applied to (i) voiced, unvoiced, and silence classification of speech, and (ii) spoken vowel recognition. The resulting decision trees are operationally very efficient and yield attractive classification accuracies.
Resumo:
Intrinsically disordered proteins, IDPs, are proteins that lack a rigid 3D structure under physiological conditions, at least in vitro. Despite the lack of structure, IDPs play important roles in biological processes and transition from disorder to order upon binding to their targets. With multiple conformational states and rapid conformational dynamics, they engage in myriad and often ``promiscuous'' interactions. These stochastic interactions between IDPs and their partners, defined here as conformational noise, is an inherent characteristic of IDP interactions. The collective effect of conformational noise is an ensemble of protein network configurations, from which the most suitable can be explored in response to perturbations, conferring protein networks with remarkable flexibility and resilience. Moreover, the ubiquitous presence of IDPs as transcriptional factors and, more generally, as hubs in protein networks, is indicative of their role in propagation of transcriptional (genetic) noise. As effectors of transcriptional and conformational noise, IDPs rewire protein networks and unmask latent interactions in response to perturbations. Thus, noise-driven activation of latent pathways could underlie state-switching events such as cellular transformation in cancer. To test this hypothesis, we created a model of a protein network with the topological characteristics of a cancer protein network and tested its response to a perturbation in presence of IDP hubs and conformational noise. Because numerous IDPs are found to be epigenetic modifiers and chromatin remodelers, we hypothesize that they could further channel noise into stable, heritable genotypic changes.
Resumo:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods is by exploiting the structure of these matrices and efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both the methods lead to similar speedups but the latter leads to far lesser impact on the recognition accuracy. Experiments on 1,138 work vocabulary RM1 task and 6,224 word vocabulary TIMIT task using Sphinx 3.7 system show that, for a typical case the matrix multiplication based approach leads to overall speedup of 46 % on RM1 task and 115 % for TIMIT task. Our low-rank approximation methods provide a way for trading off recognition accuracy for a further increase in computational performance extending overall speedups up to 61 % for RM1 and 119 % for TIMIT for an increase of word error rate (WER) from 3.2 to 3.5 % for RM1 and for no increase in WER for TIMIT. We also express pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication leading to saving of approximately of computational operations. In our experiments using efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and overall speedup up to 3.25 for DTW.
Resumo:
This paper considers the problem of weak signal detection in the presence of navigation data bits for Global Navigation Satellite System (GNSS) receivers. Typically, a set of partial coherent integration outputs are non-coherently accumulated to combat the effects of model uncertainties such as the presence of navigation data-bits and/or frequency uncertainty, resulting in a sub-optimal test statistic. In this work, the test-statistic for weak signal detection is derived in the presence of navigation data-bits from the likelihood ratio. It is highlighted that averaging the likelihood ratio based test-statistic over the prior distributions of the unknown data bits and the carrier phase uncertainty leads to the conventional Post Detection Integration (PDI) technique for detection. To improve the performance in the presence of model uncertainties, a novel cyclostationarity based sub-optimal PDI technique is proposed. The test statistic is analytically characterized, and shown to be robust to the presence of navigation data-bits, frequency, phase and noise uncertainties. Monte Carlo simulation results illustrate the validity of the theoretical results and the superior performance offered by the proposed detector in the presence of model uncertainties.
Resumo:
Chronic recording of neural signals is indispensable in designing efficient brain machine interfaces and in elucidating human neurophysiology. The advent of multichannel microelectrode arrays has driven the need for electronics to record neural signals from many neurons. The dynamic range of the system is limited by background system noise which varies over time. We propose a neural amplifier in UMC 130 nm, 2P8M CMOS technology. It can be biased adaptively from 200 nA to 2 uA, modulating input referred noise from 9.92 uV to 3.9 uV. We also describe a low noise design technique which minimizes the noise contribution of the load circuitry. The amplifier can pass signal from 5 Hz to 7 kHz while rejecting input DC offsets at electrode-electrolyte interface. The bandwidth of the amplifier can be tuned by the pseudo-resistor for selectively recording low field potentials (LFP) or extra cellular action potentials (EAP). The amplifier achieves a mid-band voltage gain of 37 dB and minimizes the attenuation of the signal from neuron to the gate of the input transistor. It is used in fully differential configuration to reject noise of bias circuitry and to achieve high PSRR.
Resumo:
We address the problem of speech enhancement in real-world noisy scenarios. We propose to solve the problem in two stages, the first comprising a generalized spectral subtraction technique, followed by a sequence of perceptually-motivated post-processing algorithms. The role of the post-processing algorithms is to compensate for the effects of noise as well as to suppress any artifacts created by the first-stage processing. The key post-processing mechanisms are aimed at suppressing musical noise and to enhance the formant structure of voiced speech as well as to denoise the linear-prediction residual. The parameter values in the techniques are fixed optimally by experimentally evaluating the enhancement performance as a function of the parameters. We used the Carnegie-Mellon university Arctic database for our experiments. We considered three real-world noise types: fan noise, car noise, and motorbike noise. The enhancement performance was evaluated by conducting listening experiments on 12 subjects. The listeners reported a clear improvement (MOS improvement of 0.5 on an average) over the noisy signal in the perceived quality (increase in the mean-opinion score (MOS)) for positive signal-to-noise-ratios (SNRs). For negative SNRs, however, the improvement was found to be marginal.