244 results for Speech Processing


Relevance:

20.00%

Publisher:

Abstract:

The combined set of thermo-mechanical steps recommended for high-strength beta Ti alloys comprises homogenization, deformation, recrystallization, annealing and ageing, in sequence. Recrystallization carried out above or below the beta transus temperature generates either a beta-annealed (lath-type morphology of alpha) or a bimodal (lath + globular morphology of alpha) microstructure, respectively. Through variations in heat treatment parameters at these processing steps, wide ranges of length scales of features have been generated in both types of microstructure in a near-beta Ti alloy, Ti-5Al-5Mo-5V-3Cr (Ti-5553). The 0.2% yield strength (YS) has been correlated with various microstructural features and the associated heat treatment parameters. The relative importance of the microstructural features in influencing YS has been identified. Process parameters at the different steps have been identified and recommended for attaining different levels of YS in this near-beta Ti alloy. (C) 2014 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

A phase field modelling approach is implemented in the present study to simulate microstructure evolution during the cooling slope semi-solid slurry generation process for A380 aluminium alloy. First, experiments are performed to evaluate the number of seeds required within the simulation domain to simulate the near-spherical microstructure formation that occurs during cooling slope processing of the melt. Subsequently, microstructure evolution is studied using a phase field method. Simulations are performed to understand the effect of cooling rate on the slurry microstructure. The simulation results are encouraging and are validated by experimental observations. The quantities obtained from the mesoscopic phase field simulations are the grain size, grain density and degree of sphericity of the evolving primary Al phase, and the solid fraction present within the slurry at different time frames. The effect of grain refinement has also been studied with the aim of improving the slurry microstructure further. The numerical findings provide insight into the process and are found to be useful for process control.

Relevance:

20.00%

Publisher:

Abstract:

Ni-Fe-Ga-based alloys form a new class of ferromagnetic shape memory alloys (FSMAs) that show considerable formability because of the presence of a disordered fcc gamma-phase. The current study explores the deformation processing of this alloy class using an off-stoichiometric Ni55Fe19Ga26 alloy that contains the ductile gamma-phase. The hot deformation behavior of this alloy has been characterized on the basis of its flow stress variation, obtained by isothermal constant true strain rate compression tests in the temperature range 1123-1323 K and the strain rate range 10^-3 to 10 s^-1, using a combination of constitutive modeling and processing maps. The dynamic recrystallization (DRX) regime for thermomechanical processing has been identified for this Heusler alloy on the basis of the processing maps and the deformed microstructures. This alloy also shows evidence of a dynamic strain-aging (DSA) effect, which has not been reported so far for any Heusler FSMA. A similar effect is also noticed in a Ni-Mn-Ga-based Heusler alloy that is devoid of any gamma-phase. (C) 2014 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

This paper proposes an automatic acoustic-phonetic method for estimating voice-onset time of stops. This method requires neither transcription of the utterance nor training of a classifier. It makes use of the plosion index for the automatic detection of burst onsets of stops. Having detected the burst onset, the onset of the voicing following the burst is detected using the epochal information and a temporal measure named the maximum weighted inner product. For validation, several experiments are carried out on the entire TIMIT database and two of the CMU Arctic corpora. The performance of the proposed method compares well with three state-of-the-art techniques. (C) 2014 Acoustical Society of America

Relevance:

20.00%

Publisher:

Abstract:

USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and intended to be continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have so far also been collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community. (C) 2014 Acoustical Society of America.

Relevance:

20.00%

Publisher:

Abstract:

Friction stir processing (FSP) is emerging as one of the most competent severe plastic deformation (SPD) methods for producing bulk ultra-fine grained materials with improved properties. Optimizing the process parameters for a defect-free process is one of the challenging aspects of FSP on the way to commercial use. For a commercial aluminium alloy 2024-T3 plate of 6 mm thickness, a bottom-up approach has been attempted to optimize the major independent parameters of the process: plunge depth, tool rotation speed and traverse speed. Tensile properties of the optimum friction stir processed sample were correlated with microstructural characterization carried out using Scanning Electron Microscopy (SEM) and Electron Back-Scattered Diffraction (EBSD). The optimum parameters from the bottom-up approach led to a defect-free FSP with a maximum strength of 93% of the base material strength. Micro tensile testing of samples taken from the center of the processed zone showed an increased strength of 1.3 times that of the base material. The measured maximum longitudinal residual stress on the processed surface was only 30 MPa, which was attributed to the solid-state nature of FSP. Microstructural observation reveals significant grain refinement, with less variation in grain size across the thickness and a large amount of grain boundary precipitation compared to the base metal. The proposed experimental bottom-up approach can be applied as an effective method for optimizing parameters during FSP of aluminium alloys, which is otherwise difficult through analytical methods because of the complex interactions between work-piece, tool and process parameters. Precipitation mechanisms during FSP were responsible for the fine-grained microstructure in the nugget zone, which provided better mechanical properties than the base metal. (C) 2014 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

The Grating Compression Transform (GCT) is a two-dimensional analysis of the speech signal that has been shown to be effective for multi-pitch tracking in speech mixtures. Multi-pitch tracking methods using GCT apply a Kalman filter framework to obtain pitch tracks, which requires training the filter parameters on true pitch tracks. We propose an unsupervised method for obtaining multiple pitch tracks. In the proposed method, multiple pitch tracks are modeled using the time-varying means of a Gaussian mixture model (GMM), referred to as TVGMM. The TVGMM parameters are estimated using multiple pitch values at each frame of a given utterance, obtained from different patches of the spectrogram using GCT. We evaluate the performance of the proposed method on all-voiced speech mixtures as well as random speech mixtures having well-separated and close pitch tracks. TVGMM achieves multi-pitch tracking with 51% and 53% of multi-pitch estimates having error <= 20% for random mixtures and all-voiced mixtures, respectively. TVGMM also yields a lower root mean squared error in pitch track estimation than Kalman filtering.

Relevance:

20.00%

Publisher:

Abstract:

Designing a robust algorithm for visual object tracking has been a challenging task for many years. There are trackers in the literature that are reasonably accurate for many tracking scenarios, but most of them are computationally expensive. This narrows their applicability, as many tracking applications demand real-time response. In this paper, we present a tracker based on random ferns. Tracking is posed as a classification problem, and classification is done using ferns. We use ferns because they rely on binary features and are extremely fast at both training and classification compared to other classification algorithms. Our experiments show that the proposed tracker performs well on some of the most challenging tracking datasets and executes much faster than one of the state-of-the-art trackers, without much difference in tracking accuracy.
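
To make the classification step concrete, here is a minimal sketch of a random-fern classifier, not the tracker described in the abstract. It assumes that each binary feature compares two randomly chosen entries of a feature vector (for example, two pixel intensities in a patch) and that ferns are combined by summing log-likelihoods. The names `Fern` and `classify` and the Laplace-smoothed counts are illustrative choices, not taken from the paper.

```python
import math
import random


class Fern:
    """One fern: `depth` binary comparisons that index a table of per-class counts."""

    def __init__(self, dim, depth, n_classes, rng):
        # Each binary feature compares two distinct, randomly chosen positions.
        self.pairs = [tuple(rng.sample(range(dim), 2)) for _ in range(depth)]
        # Counts start at 1 (Laplace smoothing) so no leaf has zero probability.
        self.counts = [[1] * (2 ** depth) for _ in range(n_classes)]

    def leaf(self, x):
        """Concatenate the binary test outcomes into a table index."""
        index = 0
        for i, j in self.pairs:
            index = (index << 1) | (1 if x[i] > x[j] else 0)
        return index

    def train(self, x, label):
        self.counts[label][self.leaf(x)] += 1

    def log_likelihood(self, x, label):
        row = self.counts[label]
        return math.log(row[self.leaf(x)] / sum(row))


def classify(ferns, x, n_classes):
    """Treat ferns as independent and pick the class with the largest summed log-likelihood."""
    scores = [sum(f.log_likelihood(x, c) for f in ferns) for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)
```

In a tracking setting, class 0 and class 1 would plausibly stand for object and background patches. Training only increments counters and classification only does table look-ups, which is where the speed of ferns comes from.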

Relevance:

20.00%

Publisher:

Abstract:

Time-varying linear prediction has been studied in the context of speech signals, in which the auto-regressive (AR) coefficients of the system function are modeled as a linear combination of a set of known bases. Traditionally, least squares minimization is used to estimate the model parameters of the system. Motivated by the sparse nature of the excitation signal for voiced sounds, we explore time-varying linear prediction modeling of speech signals using sparsity constraints. Parameter estimation is posed as a 0-norm minimization problem. The re-weighted 1-norm minimization technique is used to estimate the model parameters. We show that for sparsely excited time-varying systems, the formulation models the underlying system function better than the least squares error minimization approach. Evaluation with synthetic and real speech examples shows that the estimated model parameters track the formant trajectories more closely than the least squares approach.
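
The model described above can be written out as follows. This is a sketch under stated assumptions: the symbols ($p$, $q$, $f_i$, $c_{k,i}$, $w_n$, $\epsilon$), the sign convention of the predictor, and the choice of applying the sparsity penalty to the prediction residual $e[n]$ are not given in the abstract. The weight update is the standard re-weighted 1-norm scheme, which may differ in detail from the authors' version.

```latex
% Time-varying AR model with coefficients expanded on known bases f_i[n]
x[n] = -\sum_{k=1}^{p} a_k[n]\, x[n-k] + e[n],
\qquad
a_k[n] = \sum_{i=0}^{q} c_{k,i}\, f_i[n]

% Sparse-excitation estimate (0-norm) and its re-weighted 1-norm relaxation
\hat{c} = \arg\min_{c} \lVert e \rVert_0
\quad\longrightarrow\quad
c^{(m+1)} = \arg\min_{c} \sum_{n} w_n^{(m)}\, \lvert e[n] \rvert,
\qquad
w_n^{(m+1)} = \frac{1}{\lvert e^{(m+1)}[n] \rvert + \epsilon}
```

Replacing the weighted 1-norm with $\sum_n e[n]^2$ would give the traditional least squares estimate that the abstract uses as its baseline.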

Relevance:

20.00%

Publisher:

Abstract:

We address the problem of estimating the parameters of an ellipse from a limited number of samples. We develop a new approach to the ellipse fitting problem by showing that the x and y coordinate functions of an ellipse are finite-rate-of-innovation (FRI) signals. Uniform samples of the x and y coordinate functions of the ellipse are modeled as a sum of weighted complex exponentials, for which we propose an efficient annihilating filter technique to estimate the ellipse parameters from the samples. The FRI framework allows the ellipse parameters to be estimated reliably from partial or incomplete measurements, even in the presence of noise. The efficiency and robustness of the proposed method are compared with those of a state-of-the-art direct method. The experimental results show that the estimated parameters have lower bias than those of the direct method and that the estimation error is reduced by 5-10 dB relative to the direct method.
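
The sum-of-exponentials claim can be checked numerically with a small sketch. It assumes the ellipse is sampled uniformly in its angle parameter with step `step`, so each coordinate sequence contains only the frequencies 0 and ±`step`. The helper names and the noiseless setup are illustrative, and this is not the authors' estimator.

```python
import math


def ellipse_samples(x0, y0, a, b, theta, step, count):
    """Points on an ellipse (centre x0, y0; semi-axes a, b; tilt theta), uniform in angle."""
    points = []
    for n in range(count):
        t = n * step
        x = x0 + a * math.cos(t) * math.cos(theta) - b * math.sin(t) * math.sin(theta)
        y = y0 + a * math.cos(t) * math.sin(theta) + b * math.sin(t) * math.cos(theta)
        points.append((x, y))
    return points


def annihilate(seq, step):
    """Apply the filter with roots 1, exp(+j*step), exp(-j*step) to a coordinate sequence."""
    c = math.cos(step)
    # Coefficients of (1 - z^-1)(1 - 2c z^-1 + z^-2).
    h = [1.0, -(1.0 + 2.0 * c), 1.0 + 2.0 * c, -1.0]
    return [sum(h[k] * seq[n - k] for k in range(4)) for n in range(3, len(seq))]


def estimate_step(seq):
    """Recover the angular step from the samples alone via the annihilation equations."""
    d = [seq[n] - seq[n - 1] for n in range(1, len(seq))]  # first difference removes the centre
    num = sum((d[n] + d[n - 2]) * d[n - 1] for n in range(2, len(d)))
    den = 2.0 * sum(d[n - 1] ** 2 for n in range(2, len(d)))
    return math.acos(max(-1.0, min(1.0, num / den)))
```

For example, twelve samples at a step of 0.4 rad cover only part of the ellipse, yet `annihilate` returns values at rounding-error level and `estimate_step` recovers the step. This is the property that makes estimation from partial arcs plausible.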

Relevance:

20.00%

Publisher:

Abstract:

We consider the problem of parameter estimation from real-valued multi-tone signals. Such problems arise frequently in spectral estimation. More recently, they have gained new importance in finite-rate-of-innovation signal sampling and reconstruction. The annihilating filter is a key tool for parameter estimation in these problems. The standard annihilating filter design has to be modified to give accurate estimates when dealing with real sinusoids, because the real-valued nature of the sinusoids must be factored into the design. We show that this constraint on the annihilating filter can be relaxed by making use of the Hilbert transform. We refer to this approach as the Hilbert annihilating filter approach. We show that accurate parameter estimation is possible with this approach. In the single-tone case, the mean-square error performance improves by 6 dB for signal-to-noise ratios (SNR) greater than 0 dB. We also present experimental results for the multi-tone case, which show that a significant improvement (about 6 dB) is obtained when the parameters are close to 0 or pi. In the mid-frequency range, the improvement is about 2 to 3 dB.
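
One plausible reading of the idea, for the single-tone case, is sketched below. The Hilbert transform turns a real sinusoid into a single complex exponential, so a first-order annihilating filter suffices and no conjugate-symmetry constraint on its roots is needed. The function names and the plain O(N^2) DFT are illustrative assumptions. The abstract does not give the authors' algorithm, so this is not a reproduction of it.

```python
import cmath
import math


def analytic_signal(x):
    """Discrete analytic signal x + j*Hilbert{x}, computed with a plain O(N^2) DFT."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0  # keep DC and, for even n, the Nyquist bin
        elif k < (n + 1) // 2:
            gain = 2.0  # double the positive frequencies
        else:
            gain = 0.0  # remove the negative frequencies
        spectrum[k] *= gain
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def estimate_single_tone(x):
    """Frequency (rad/sample) from the least-squares root u of z[t] - u * z[t-1] = 0."""
    z = analytic_signal(x)
    num = sum(z[t] * z[t - 1].conjugate() for t in range(1, len(z)))
    den = sum(abs(z[t - 1]) ** 2 for t in range(1, len(z)))
    return cmath.phase(num / den)
```

A quick check is to pass 64 samples of `cos(omega * t + 0.3)` with `omega = 2 * pi * 5 / 64`. The tone sits on a DFT bin, so the analytic signal is exact and the estimate matches `omega` up to rounding. Off-bin tones pick up edge effects from the finite-length Hilbert transform.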

Relevance:

20.00%

Publisher:

Abstract:

In this work, we address the recovery of block-sparse vectors with intra-block correlation, i.e., the recovery of vectors in which the correlated nonzero entries are constrained to lie in a few clusters, from noisy underdetermined linear measurements. Among Bayesian sparse recovery techniques, cluster Sparse Bayesian Learning (SBL) is an efficient tool for recovering block-sparse vectors with intra-block correlation. However, this technique uses a heuristic method to estimate the intra-block correlation. In this paper, we propose the Nested SBL (NSBL) algorithm, which we derive using a novel Bayesian formulation that facilitates the use of the monotonically convergent nested Expectation Maximization (EM) algorithm and a Kalman filtering based learning framework. Unlike the cluster-SBL algorithm, this formulation leads to closed-form EM updates for estimating the correlation coefficient. We demonstrate the efficacy of the proposed NSBL algorithm using Monte Carlo simulations.

Relevance:

20.00%

Publisher:

Abstract:

We address the problem of designing an optimal pointwise shrinkage estimator in the transform domain, based on the minimum probability of error (MPE) criterion. We assume an additive model for the noise corrupting the clean signal. The proposed formulation is general in the sense that it can handle various noise distributions. We consider several noise distributions (Gaussian, Student's-t, and Laplacian) and compare the denoising performance of the resulting estimator with that of mean-squared error (MSE)-based estimators. The MSE optimization is carried out using an unbiased estimator of the MSE, namely Stein's Unbiased Risk Estimate (SURE). Experimental results show that the MPE estimator outperforms the SURE estimator in terms of the SNR of the denoised output, for low (0-10 dB) and medium (10-20 dB) values of the input SNR.
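
For orientation, the SURE baseline mentioned above can be sketched for one familiar pointwise shrinkage rule, soft thresholding under Gaussian noise of known standard deviation `sigma`. Soft thresholding is swapped in here as an assumption, since the abstract does not say which shrinkage function is used, and the MPE criterion itself is not implemented. All function names are illustrative.

```python
import math


def soft_threshold(value, t):
    """Pointwise shrinkage: move the coefficient toward zero by t, clipping at zero."""
    return math.copysign(max(abs(value) - t, 0.0), value)


def sure_soft(coeffs, t, sigma):
    """Stein's unbiased estimate of the total squared error of soft thresholding at t."""
    n = len(coeffs)
    below = sum(1 for c in coeffs if abs(c) <= t)      # coefficients set to zero
    clipped = sum(min(abs(c), t) ** 2 for c in coeffs)  # squared size of the correction
    return n * sigma ** 2 - 2.0 * sigma ** 2 * below + clipped


def sure_threshold(coeffs, sigma):
    """Pick the threshold minimising SURE; the minimum lies at 0 or at some |coefficient|."""
    candidates = [0.0] + sorted(abs(c) for c in coeffs)
    return min(candidates, key=lambda t: sure_soft(coeffs, t, sigma))


def denoise(coeffs, sigma):
    t = sure_threshold(coeffs, sigma)
    return [soft_threshold(c, t) for c in coeffs]
```

The point of SURE is that `sure_soft` depends only on the noisy coefficients and `sigma`, so the MSE-optimal threshold can be chosen without access to the clean signal.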

Relevance:

20.00%

Publisher:

Abstract:

The electromagnetic articulography (EMA) technique is used to record the kinematics of different articulators while a subject speaks. EMA data often contain missing segments due to sensor failure. In this work, we propose maximum a-posteriori (MAP) estimation with a continuity constraint to recover the missing samples in articulatory trajectories recorded using EMA. This approach combines the benefits of statistical MAP estimation with the temporal continuity of the articulatory trajectories. Experiments on an articulatory corpus using different missing-segment durations show that the proposed continuity constraint yields a 30% reduction in average root mean squared estimation error compared with statistical estimation of missing segments without any continuity constraint.

Relevance:

20.00%

Publisher:

Abstract:

Accurately solving the 3D full-wave Method of Moments (MoM) system on an arbitrary mesh of a package-board structure does not guarantee an accurate result, since the discretization may not be fine enough to capture rapid spatial changes in the solution variable. At the same time, uniform over-meshing of the entire structure generates a large number of solution variables and therefore requires an unnecessarily large matrix solution. In this work, a suitable refinement criterion for MoM-based electromagnetic package-board extraction is proposed, and the advantages of the adaptive strategy are demonstrated from both accuracy and speed perspectives.