998 results for MP3 (Audio coding standard)


Relevance: 20.00%

Abstract:

This paper presents speaker normalization approaches for the audio search task. The conventional state-of-the-art feature set, Mel Frequency Cepstral Coefficients (MFCC), is known to implicitly contain both speaker-specific and linguistic information, which can be a problem for speaker-independent audio search. In this paper, a universal warping-based approach is used for vocal tract length normalization in audio search. In particular, features such as the scale transform and warped linear prediction are used to compensate for speaker variability in audio matching. The advantage of these features over the conventional feature set is that they apply universal frequency warping to both of the templates being matched during audio search. The performance of Scale Transform Cepstral Coefficients (STCC) and Warped Linear Prediction Cepstral Coefficients (WLPCC) is about 3% higher than that of the state-of-the-art MFCC feature set on the TIMIT database.

Relevance: 20.00%

Abstract:

The functional source coding problem, in which the receiver's side information (Has-set) and demands (Want-set) include functions of source messages, is studied using row-Latin rectangles. The source transmits encoded messages, called the functional source code, in order to satisfy the receiver's demands. We obtain a minimum code length using the row-Latin rectangle. Next, we consider the case of transmission errors and provide a necessary and sufficient condition that a functional source code must satisfy so that the receiver can correctly decode the values of the functions in its Want-set.

Relevance: 20.00%

Abstract:

This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapting interpolated language models is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in the recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER) or the translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR-adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
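The perplexity-minimizing baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the MBR method the paper proposes; it only shows the conventional approach, using the standard EM update for linear interpolation weights. The function names and the toy probabilities in `held_out` are assumptions made for illustration.

```python
import math


def perplexity(weights, component_probs):
    """Perplexity of the linearly interpolated model on the given tokens."""
    log_sum = 0.0
    for probs in component_probs:
        mix = sum(w * p for w, p in zip(weights, probs))
        log_sum += math.log(mix)
    return math.exp(-log_sum / len(component_probs))


def em_interpolation_weights(component_probs, iterations=50):
    """Estimate interpolation weights by EM, starting from uniform weights.

    component_probs[t][k] is the probability that component model k
    assigns to token t of the adaptation data.
    """
    n_models = len(component_probs[0])
    weights = [1.0 / n_models] * n_models
    for _ in range(iterations):
        counts = [0.0] * n_models
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            for k in range(n_models):
                # Posterior responsibility of model k for this token.
                counts[k] += weights[k] * probs[k] / mix
        weights = [c / len(component_probs) for c in counts]
    return weights


# Hypothetical per-token probabilities from two component models.
held_out = [(0.20, 0.05), (0.10, 0.02), (0.01, 0.30), (0.15, 0.05)]
uniform = [0.5, 0.5]
tuned = em_interpolation_weights(held_out)
print(tuned, perplexity(uniform, held_out), perplexity(tuned, held_out))
```

Each EM iteration cannot increase the perplexity on the data it is fitted to, so the tuned weights are at least as good as the uniform starting point on `held_out`.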

Relevance: 20.00%

Abstract:

In this paper we derive the a posteriori probability for the location of bursts of noise additively superimposed on a Gaussian AR process. The theory is developed to give a sequentially based restoration algorithm suitable for real-time applications. The algorithm is particularly appropriate for digital audio restoration, where clicks and scratches may be modelled as additive bursts of noise. Experiments are carried out on both real audio data and synthetic AR processes, and significant improvements are demonstrated over existing restoration techniques. © 1995 IEEE.
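The signal model in this abstract, a Gaussian AR process with additive noise bursts, can be sketched as follows. The detector shown is a plain prediction-error threshold, a much simpler technique than the a posteriori probability derived in the paper. The AR coefficients, burst amplitude, threshold, and all function names are assumptions chosen for illustration, and the AR coefficients are taken as known rather than estimated.

```python
import random


def synthesize_ar(coeffs, n, rng):
    """Generate n samples of a Gaussian AR process with unit-variance excitation."""
    x = [0.0] * n
    for t in range(n):
        pred = sum(a * x[t - k - 1] for k, a in enumerate(coeffs) if t - k - 1 >= 0)
        x[t] = pred + rng.gauss(0.0, 1.0)
    return x


def prediction_residual(signal, coeffs):
    """Linear prediction error; the first len(coeffs) samples are left at zero."""
    p = len(coeffs)
    residual = [0.0] * len(signal)
    for t in range(p, len(signal)):
        pred = sum(a * signal[t - k - 1] for k, a in enumerate(coeffs))
        residual[t] = signal[t] - pred
    return residual


def flag_bursts(signal, coeffs, threshold):
    """Indices whose prediction error exceeds the threshold in magnitude."""
    residual = prediction_residual(signal, coeffs)
    return [t for t, e in enumerate(residual) if abs(e) > threshold]


rng = random.Random(0)
AR_COEFFS = [1.2, -0.5]  # assumed stable AR(2) model
clean = synthesize_ar(AR_COEFFS, 300, rng)

# Superimpose an additive burst on samples 100..104.
corrupted = list(clean)
for t in range(100, 105):
    corrupted[t] += 50.0

flagged = flag_bursts(corrupted, AR_COEFFS, threshold=6.0)
print(flagged)
```

Because an AR(2) predictor uses the two preceding samples, the prediction error can stay large for up to two samples after the burst ends, so the flagged region may extend slightly beyond the corrupted samples.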

Relevance: 20.00%

Abstract:

Statistical model-based methods are presented for the reconstruction of autocorrelated signals in impulsive plus continuous noise environments. Signals are modelled as autoregressive and noise sources as discrete and continuous mixtures of Gaussians, allowing for robustness in highly impulsive and non-Gaussian environments. Markov Chain Monte Carlo methods are used for reconstruction of the corrupted waveforms within a Bayesian probabilistic framework and results are presented for contaminated voice and audio signals.

Relevance: 20.00%

Abstract:

We present a statistical model-based approach to signal enhancement in the case of additive broadband noise. Because broadband noise is localised in neither time nor frequency, its removal is one of the most pervasive and difficult signal enhancement tasks. In order to improve perceived signal quality, we take advantage of human perception and define a best estimate of the original signal in terms of a cost function incorporating perceptual optimality criteria. We derive the resultant signal estimator and implement it in a short-time spectral attenuation framework. Audio examples, references, and further information may be found at http://www-sigproc.eng.cam.ac.uk/~pjw47.
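A short-time spectral attenuation framework scales each frequency bin of a frame by a gain between 0 and 1. The sketch below uses a generic Wiener-style gain with a floor, not the perceptually motivated estimator derived in the paper. The function names, the floor value, and the toy power values are assumptions for illustration, and the noise power per bin is assumed to be known.

```python
def attenuation_gains(noisy_power, noise_power, floor=0.1):
    """Per-bin gain 1 - N/Y, limited from below by a floor.

    noisy_power[k] is the observed power in bin k and noise_power[k]
    the (assumed known) noise power in the same bin.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        gain = 1.0 - n / y if y > 0.0 else floor
        gains.append(max(gain, floor))
    return gains


def attenuate(spectrum, gains):
    """Apply the gains to one frame of (possibly complex) spectral values."""
    return [g * s for g, s in zip(gains, spectrum)]


# Hypothetical single frame with three frequency bins.
noisy_power = [4.0, 1.0, 0.5]
noise_power = [1.0, 1.0, 1.0]
gains = attenuation_gains(noisy_power, noise_power)
print(gains)  # [0.75, 0.1, 0.1]
```

The floor keeps bins dominated by noise from being zeroed out completely, a common way to limit audible artifacts in this kind of scheme.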

Relevance: 20.00%

Abstract:

This paper seeks an exogenous cause for the deterioration, beginning in 2005, of the mortgage lending standards that contributed to the subprime crisis in the United States. We argue that the new means-test provision of the Bankruptcy Abuse Prevention and Consumer Protection Act (BAPCPA) of 2005 was that exogenous shock to the mortgage market. We show that the means test, which prevents debtors with higher relative incomes from filing for bankruptcy under Chapter 7, shifted the supply of mortgage credit from debtors with higher relative incomes to debtors with lower relative incomes. At the same time, we observe that all debtors had to pay higher interest rates, regardless of income level. Our results imply that BAPCPA may have been a factor contributing to the deterioration of lending standards in the United States mortgage market.