Digital Library

244 results for Speech Processing

Influence of homogenization on the processing map for hot working of as-cast Mg-2Zn-1Mn alloy

Relevance: 20.00%

Abstract:

Processing maps have been developed for hot deformation of Mg-2Zn-1Mn alloy in the as-cast condition and after homogenization, with a view to evaluating the influence of homogenization. Hot compression data in the temperature range 300-500 °C and strain rate range 0.001-100 s⁻¹ were used to generate the processing maps. In the map for the as-cast alloy, the domain of dynamic recrystallization occurring at 450 °C and 0.1 s⁻¹ has merged with another domain, occurring at 500 °C and 0.001 s⁻¹, that represents grain boundary cracking. The latter domain is eliminated by homogenization, and the dynamic recrystallization domain expands, with a higher peak efficiency occurring at 500 °C and 0.05 s⁻¹. The flow localization occurring at strain rates higher than 5 s⁻¹ is unaffected by homogenization.

Validation of processing maps for a 15Cr-15Ni-2.2Mo-0.3Ti austenitic stainless steel using hot forging and rolling tests

Relevance: 20.00%

Abstract:

Processing maps are being developed for use in optimising hot workability and controlling the microstructure of the product. The present investigation assesses the predictions of the processing maps for a 15Cr-15Ni-2.2Mo-0.3Ti austenitic stainless steel using forging and rolling tests at temperatures in the range 600-1200 °C. The tensile properties of the deformed products were evaluated at room temperature. The influence of the processing conditions, i.e. strain rate and temperature, on the tensile properties of the deformed product was analysed to identify the optimum processing parameters. The results show good agreement between the regimes exhibited by the map and the properties of the rolled or forged product. The optimum processing route for this steel was identified as rolling or press forging at temperatures above 1050 °C. © 2002 Elsevier Science B.V. All rights reserved.

Research bed for unit selection based text to speech synthesis

Relevance: 20.00%

Abstract:

The paper describes a modular, unit selection based TTS framework that can be used as a research bed for developing TTS in any new language, as well as for studying the effect of changing any parameter during synthesis. Using this framework, TTS has been developed for Tamil. The synthesis database consists of 1027 phonetically rich prerecorded sentences. The framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose exploiting this misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the phones they are confused with.

Subspace Based Speech Enhancement Using Gaussian Mixture Model

Relevance: 20.00%

Abstract:

Traditional subspace based speech enhancement (SSE) methods use linear minimum mean square error (LMMSE) estimation, which is optimal if the Karhunen-Loève transform (KLT) coefficients of speech and noise are Gaussian distributed. In this paper, we investigate the use of a Gaussian mixture (GM) density for modeling the non-Gaussian statistics of the clean speech KLT coefficients. Using a Gaussian mixture model (GMM), the optimum minimum mean square error (MMSE) estimator is found to be nonlinear, and the traditional LMMSE estimator is shown to be a special case. Experimental results show that the proposed method provides better enhancement performance than the traditional subspace based methods.

Index Terms: Subspace based speech enhancement, Gaussian mixture density, MMSE estimation.

Two stage iterative Wiener filtering for speech enhancement

Relevance: 20.00%

Abstract:

We formulate a two-stage iterative Wiener filtering (IWF) approach to speech enhancement that improves on the performance of constrained IWF reported in the literature. The codebook constrained IWF (CCIWF) has been shown to be effective in achieving convergence of IWF in the presence of both stationary and non-stationary noise. To this we add a second stage of unconstrained IWF and show that speech enhancement performance improves in terms of average segmental SNR (SSNR), Itakura-Saito (IS) distance and linear prediction coefficient (LPC) parameter coincidence. We also explore the tradeoff between the number of CCIWF iterations and the number of second-stage IWF iterations.

Comparison of AM-FM Based Features For Robust Speech Recognition

Relevance: 20.00%

Abstract:

Effective feature extraction for robust speech recognition is a widely addressed topic, and there is currently much effort to invoke non-stationary signal models instead of the quasi-stationary signal models that lead to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling, and new feature sets for automatic speech recognition (ASR) have recently been derived based on a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performance for robust speech recognition in noise, using the AURORA-2 database. We show that the proposed FEPSTRUM representation is more effective than the others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM.

Online Unsupervised Pattern Discovery in Speech Using Parallelization

Relevance: 20.00%

Abstract:

Segmental dynamic time warping (DTW) has been demonstrated to be a useful technique for finding acoustic similarity scores between segments of two speech utterances. Because of its high computational requirements, it has had to be computed offline, which limits the applications of the technique. In this paper, we present results of parallelizing this task by distributing the workload in either a static or a dynamic way on an 8-processor cluster, and we discuss the trade-offs among the different distribution schemes. We show that online unsupervised pattern discovery using segmental DTW is plausible with as few as 8 processors. This brings the task within reach of today's general purpose multi-core servers. We also show results on a 32-processor system and discuss factors affecting the scalability of our methods.
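
As a rough illustration of the underlying computation, the following is a minimal sketch of plain DTW between two one-dimensional sequences. It is not the segmental variant or the parallel scheme described in the abstract; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example only.

```python
import math


def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between a and b.

    Illustrative only: local cost is |a[i] - b[j]| (an assumption), and the
    allowed steps are the three standard ones (insert, delete, match).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

The quadratic table fill is what makes the method expensive. Each pair of utterances can be scored independently of the others, which is the kind of workload that can be split across processors.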

Low Complexity Wideband LSF Quantization Using GMM of Uncorrelated Gaussian Mixtures

Relevance: 20.00%

Abstract:

We develop a Gaussian mixture model (GMM) based vector quantization (VQ) method for coding wideband speech line spectrum frequency (LSF) parameters at low complexity. The PDF of the LSF source vector is modeled using a Gaussian mixture (GM) density with a higher number of uncorrelated Gaussian mixtures, and an optimum scalar quantizer (SQ) is designed for each Gaussian mixture. The reduction in quantization complexity is achieved by using only the relevant subset of the available optimum SQs. For an input vector, the subset of quantizers is chosen using a nearest neighbor criterion. The developed method is compared with recent VQ methods and shown to provide high quality rate-distortion (R/D) performance at lower complexity. In addition, the developed method provides the advantages of bitrate scalability and rate-independent complexity.

GMM based Bayesian approach to speech enhancement in signal/transform domain

Relevance: 20.00%

Abstract:

Considering a general linear model of signal degradation, and modeling the probability density function (PDF) of the clean signal using a Gaussian mixture model (GMM) and the additive noise by a Gaussian PDF, we derive the minimum mean square error (MMSE) estimator. The derived MMSE estimator is non-linear, and the linear MMSE estimator is shown to be a special case. For a speech signal corrupted by independent additive noise, by modeling the joint PDF of the time-domain speech samples of a speech frame using a GMM, we propose a speech enhancement method based on the derived MMSE estimator. We also show that the same estimator can be used for transform-domain speech enhancement.
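
The structure of such an estimator can be sketched for the simplest scalar case, y = x + n, where x follows a Gaussian mixture and n is zero-mean Gaussian noise. This is only an illustrative sketch under those assumptions, not the paper's general linear model or its vector formulation; the names `gauss_pdf` and `gmm_mmse_estimate` are hypothetical.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_mmse_estimate(y, weights, means, variances, noise_var):
    """MMSE estimate of x from y = x + n for a scalar GMM prior on x.

    Assumes n ~ N(0, noise_var), independent of x. The result is a
    posterior-weighted sum of per-component linear (Wiener-type) estimates.
    No guard is included for numerical underflow at extreme values of y.
    """
    # Unnormalized posterior probability of each component given y:
    # under component k, y is Gaussian with variance var_k + noise_var.
    likes = [
        w * gauss_pdf(y, m, v + noise_var)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(likes)
    estimate = 0.0
    for like, m, v in zip(likes, means, variances):
        gain = v / (v + noise_var)  # per-component linear MMSE gain
        estimate += (like / total) * (m + gain * (y - m))
    return estimate
```

Because the posterior weights depend on y, the overall estimate is non-linear in y. With a single mixture component the weights collapse to 1 and the expression reduces to the linear MMSE estimator, which matches the special case noted in the abstract.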

Microwave processing of giant dielectric CaCu3Ti4O12 ceramics

Relevance: 20.00%

High-rate analysis of channel-optimized vector quantization

Relevance: 20.00%

Abstract:

This paper considers the high-rate performance of channel optimized source coding for noisy discrete symmetric channels with random index assignment. Specifically, with mean squared error (MSE) as the performance metric, an upper bound on the asymptotic (i.e., high-rate) distortion is derived by assuming a general structure on the codebook. This structure enables extension of the analysis of the channel optimized source quantizer to one with a singular point density: for channels with small errors, the point density that minimizes the upper bound is continuous, while as the error rate increases, the point density becomes singular. The extent of the singularity is also characterized. The accuracy of the expressions obtained is verified through Monte Carlo simulations.

Production of bulk-nano materials by friction-stir processing

Relevance: 20.00%

Use of Geometric Phase in Quantum Information Processing by Nuclear Magnetic Resonance

Relevance: 20.00%

Novel auditory motivated subband temporal envelope based fundamental frequency estimation algorithm

Relevance: 20.00%

Abstract:

We address the problem of estimating the fundamental frequency of voiced speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the temporal envelope of various subbands. We provide a theoretical analysis to derive the new pitch estimator based on the temporal envelope of the bandpass speech signal. We report extensive experimental results for synthetic as well as natural vowels, for both real-world noisy and noise-free data. Experimental results show that the new technique performs accurate pitch estimation and is robust to noise. We also show that the technique is superior to the autocorrelation technique for pitch estimation.
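
For context, the autocorrelation baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the generic autocorrelation approach, not the subband temporal envelope algorithm the paper proposes; the function name `autocorr_pitch` and the 50-400 Hz search range are assumptions chosen for the example.

```python
import math


def autocorr_pitch(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Picks the lag with the largest unnormalized autocorrelation inside the
    lag range implied by [f_min, f_max] (an assumed search range).
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = None, -math.inf
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


# Usage: a clean 100 Hz sinusoid sampled at 8 kHz (0.1 s frame).
frame = [math.sin(2.0 * math.pi * 100.0 * n / 8000.0) for n in range(800)]
estimate = autocorr_pitch(frame, 8000)
```

On clean periodic input the strongest peak falls at the true period. In noise the peak can shift or be overtaken by a multiple of the period, which is the weakness that more robust estimators aim to address.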

High-temperature deformation processing maps for a NiTiCu shape memory alloy

Relevance: 20.00%

Abstract:

The properties of widely used Ni-Ti-based shape memory alloys (SMAs) are highly sensitive to the underlying microstructure. Hence, controlling the evolution of microstructure during high-temperature deformation becomes important. In this article, the "processing maps" approach is used to identify the combination of temperature and strain rate for thermomechanical processing of a Ni42Ti50Cu8 SMA. Uniaxial compression experiments were conducted in the temperature range 800-1050 °C and at strain rates between 10⁻³ and 10² s⁻¹. Two-dimensional power dissipation efficiency and instability maps were generated, and the deformation mechanisms that operate in different temperature and strain rate regimes were identified with the aid of the maps and complementary microstructural analysis of the deformed specimens. The results show that the safe window for industrial processing of this alloy is 800-850 °C at 0.1 s⁻¹, which leads to grain refinement and strain-free grains. Regions of instability were also identified; these result in a strained microstructure, which in turn can affect the performance of the SMA.
