881 results for San Miniato al Monte (Cemetery : Florence, Italy)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We address the problem of estimating the fundamental frequency of voiced speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the temporal envelope of various subbands. We provide theoretical analysis to derive the new pitch estimator based on the temporal envelope of the bandpass speech signal. We report extensive experimental results for synthetic as well as natural vowels, on both real-world noisy and noise-free data. The results show that the new technique estimates pitch accurately and is robust to noise. We also show that the technique is superior to the autocorrelation technique for pitch estimation.
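For orientation, the autocorrelation baseline that the abstract compares against can be sketched as follows. This is only a minimal illustration of that baseline, not the envelope-based estimator proposed in the paper; the function name `autocorr_pitch`, the search range of 60 to 400 Hz, and the synthetic five-harmonic test signal are all assumptions made for the example.

```python
import math

def autocorr_pitch(signal, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 in Hz from the largest autocorrelation peak in [fmin, fmax]."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]          # remove the DC offset
    lag_min = int(fs / fmax)                # shortest period considered
    lag_max = min(int(fs / fmin), n - 1)    # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation (no division by the overlap length), which
        # favours the shortest period over its integer multiples.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag

# Hypothetical vowel-like test signal: five harmonics of 125 Hz sampled at 8 kHz.
fs = 8000
f0 = 125.0
signal = [sum(math.sin(2 * math.pi * k * f0 * t / fs) / k for k in range(1, 6))
          for t in range(800)]
estimate = autocorr_pitch(signal, fs)
print(estimate)
```

On clean periodic input the peak falls on the true period; the abstract's claim is that an envelope-based estimator is more robust than this baseline when noise is present.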

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Pre-whitening techniques are employed in blind correlation detection of additive spread-spectrum watermarks in audio signals to reduce the host signal interference. A direct deterministic whitening (DDW) scheme is derived in this paper from the frequency-domain analysis of the time-domain correlation process. Our experimental studies reveal that Savitzky-Golay whitening (SGW), which is otherwise inferior to the DDW technique, performs better when the audio signal is predominantly lowpass. The novelty of this paper lies in exploiting the complementary nature of the two whitening techniques to obtain a hybrid whitening (HbW) scheme. In the hybrid scheme, the DDW and SGW techniques are selectively applied, based on the short-time spectral characteristics of the audio signal. The hybrid scheme extends the reliability of watermark detection to a wider range of audio signals.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Grating Compression Transform (GCT) is a two-dimensional analysis of the speech signal that has been shown to be effective for multi-pitch tracking in speech mixtures. Multi-pitch tracking methods using GCT apply a Kalman filter framework to obtain pitch tracks, which requires training the filter parameters using true pitch tracks. We propose an unsupervised method for obtaining multiple pitch tracks. In the proposed method, multiple pitch tracks are modeled using the time-varying means of a Gaussian mixture model (GMM), referred to as TVGMM. The TVGMM parameters are estimated using multiple pitch values at each frame in a given utterance, obtained from different patches of the spectrogram using GCT. We evaluate the performance of the proposed method on all-voiced speech mixtures as well as random speech mixtures having well-separated and close pitch tracks. TVGMM achieves multi-pitch tracking with 51% and 53% of multi-pitch estimates having error <= 20% for random mixtures and all-voiced mixtures, respectively. TVGMM also results in a lower root mean squared error in pitch track estimation than Kalman filtering.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Designing a robust algorithm for visual object tracking has been a challenging task for many years. There are trackers in the literature that are reasonably accurate for many tracking scenarios, but most of them are computationally expensive. This narrows their applicability, as many tracking applications demand real-time response. In this paper, we present a tracker based on random ferns. Tracking is posed as a classification problem, and classification is done using ferns. We use ferns because they rely on binary features and are extremely fast at both training and classification compared to other classification algorithms. Our experiments show that the proposed tracker performs well on some of the most challenging tracking datasets and executes much faster than one of the state-of-the-art trackers, without much difference in tracking accuracy.
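To make the idea of classifying with ferns concrete, here is a minimal sketch of a random fern classifier. It is a generic illustration under stated assumptions rather than the tracker described in the abstract: the class `FernClassifier`, the pairwise comparison used as the binary feature, and the toy 16-pixel "object versus background" patches are all hypothetical.

```python
import math
import random

class FernClassifier:
    """Semi-naive Bayes over ferns; each fern is a small group of binary tests."""

    def __init__(self, n_ferns, fern_size, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        # One binary test = "is feature i larger than feature j?"
        self.ferns = [[(rng.randrange(n_features), rng.randrange(n_features))
                       for _ in range(fern_size)] for _ in range(n_ferns)]
        self.n_classes = n_classes
        n_bins = 2 ** fern_size
        # counts[fern][class][bin], initialised to 1 (Laplace smoothing)
        self.counts = [[[1] * n_bins for _ in range(n_classes)]
                       for _ in range(n_ferns)]

    def _bin(self, fern, x):
        # Concatenate the binary test outcomes into a histogram bin index.
        index = 0
        for i, j in fern:
            index = (index << 1) | (1 if x[i] > x[j] else 0)
        return index

    def train(self, x, label):
        for f, fern in enumerate(self.ferns):
            self.counts[f][label][self._bin(fern, x)] += 1

    def predict(self, x):
        # Ferns are treated as independent, so their log-likelihoods add up.
        scores = [0.0] * self.n_classes
        for f, fern in enumerate(self.ferns):
            b = self._bin(fern, x)
            for c in range(self.n_classes):
                scores[c] += math.log(self.counts[f][c][b] / sum(self.counts[f][c]))
        return max(range(self.n_classes), key=lambda c: scores[c])

def make_patch(rng, is_object):
    # Hypothetical data: object patches are bright on the left half,
    # background patches are bright on the right half.
    base = ([1.0] * 8 + [0.0] * 8) if is_object else ([0.0] * 8 + [1.0] * 8)
    return [v + rng.gauss(0.0, 0.2) for v in base]

rng = random.Random(1)
clf = FernClassifier(n_ferns=10, fern_size=4, n_features=16, n_classes=2)
for _ in range(200):
    clf.train(make_patch(rng, True), 1)
    clf.train(make_patch(rng, False), 0)

tests = [(make_patch(rng, label == 1), label) for label in (0, 1) for _ in range(100)]
accuracy = sum(clf.predict(x) == label for x, label in tests) / len(tests)
print(accuracy)
```

Training only increments histogram counters and prediction only looks them up, which is why ferns are cheap at both stages.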

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Time-varying linear prediction has been studied in the context of speech signals, in which the auto-regressive (AR) coefficients of the system function are modeled as a linear combination of a set of known bases. Traditionally, least squares minimization is used to estimate the model parameters of the system. Motivated by the sparse nature of the excitation signal for voiced sounds, we explore time-varying linear prediction modeling of speech signals using sparsity constraints. Parameter estimation is posed as a 0-norm minimization problem. The re-weighted 1-norm minimization technique is used to estimate the model parameters. We show that for sparsely excited time-varying systems, the formulation models the underlying system function better than the least squares error minimization approach. Evaluation with synthetic and real speech examples shows that the estimated model parameters track the formant trajectories more closely than the least squares approach.
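One plausible way to write down the formulation the abstract describes is given below. The symbols (prediction order p, basis functions f_i[n], expansion coefficients b_{k,i}, weights w_n, small constant epsilon) and the sign convention of the predictor are assumptions introduced here for illustration; the abstract does not fix them.

```latex
% Time-varying AR model: each coefficient is expanded on known bases f_i[n]
s[n] = \sum_{k=1}^{p} a_k[n]\, s[n-k] + e[n], \qquad
a_k[n] = \sum_{i=0}^{q} b_{k,i}\, f_i[n]

% Sparse-excitation estimate (0-norm), relaxed to a re-weighted 1-norm at iteration m
\hat{b} = \arg\min_{b} \lVert e \rVert_0
\quad\longrightarrow\quad
\hat{b}^{(m+1)} = \arg\min_{b} \sum_{n} w_n^{(m)}\, \lvert e[n] \rvert, \qquad
w_n^{(m)} = \frac{1}{\lvert e^{(m)}[n] \rvert + \epsilon}
```

The least squares approach mentioned in the abstract corresponds to minimizing the sum of e[n]^2 instead, which penalizes the large, sparse excitation pulses of voiced speech more heavily than the re-weighted 1-norm does.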

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We address the problem of estimating the parameters of an ellipse from a limited number of samples. We develop a new approach for solving the ellipse fitting problem by showing that the x and y coordinate functions of an ellipse are finite-rate-of-innovation (FRI) signals. Uniform samples of the x and y coordinate functions of the ellipse are modeled as a sum of weighted complex exponentials, for which we propose an efficient annihilating filter technique to estimate the ellipse parameters from the samples. The FRI framework allows the ellipse parameters to be estimated reliably from partial or incomplete measurements, even in the presence of noise. The efficiency and robustness of the proposed method are compared with those of a state-of-the-art direct method. The experimental results show that the estimated parameters have lower bias than those of the direct method and that the estimation error is reduced by 5-10 dB relative to the direct method.
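The claim that the coordinate functions are sums of complex exponentials can be checked with the standard parametrization of a rotated ellipse. The notation below (center (x_0, y_0), semi-axes a and b, rotation angle theta, curve parameter t) is an assumption made for this sketch and may differ from the paper's.

```latex
% Standard parametrization of an ellipse rotated by \theta
x(t) = x_0 + a\cos\theta\,\cos t - b\sin\theta\,\sin t, \qquad
y(t) = y_0 + a\sin\theta\,\cos t + b\cos\theta\,\sin t

% Substituting \cos t = (e^{jt}+e^{-jt})/2 and \sin t = (e^{jt}-e^{-jt})/(2j)
x(t) = x_0 + c\,e^{jt} + c^{*} e^{-jt}, \qquad c = \tfrac{1}{2}\left(a\cos\theta + j\,b\sin\theta\right)
y(t) = y_0 + d\,e^{jt} + d^{*} e^{-jt}, \qquad d = \tfrac{1}{2}\left(a\sin\theta - j\,b\cos\theta\right)
```

Under this parametrization each coordinate function contains only three exponentials (frequencies 0, +1, and -1), so a short annihilating filter suffices and the five ellipse parameters follow from x_0, y_0, c, and d.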

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We consider the problem of parameter estimation from real-valued multi-tone signals. Such problems arise frequently in spectral estimation. More recently, they have gained new importance in finite-rate-of-innovation signal sampling and reconstruction. The annihilating filter is a key tool for parameter estimation in these problems. The standard annihilating filter design has to be modified to give accurate estimates for real sinusoids, because the real-valued nature of the sinusoids must be factored into the filter design. We show that this constraint on the annihilating filter can be relaxed by making use of the Hilbert transform. We refer to this approach as the Hilbert annihilating filter approach. We show that accurate parameter estimation is possible by this approach. In the single-tone case, the mean-square error performance improves by 6 dB for signal-to-noise ratios (SNR) greater than 0 dB. We also present experimental results in the multi-tone case, which show that a significant improvement (about 6 dB) is obtained when the parameters are close to 0 or pi. In the mid-frequency range, the improvement is about 2 to 3 dB.
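The constraint imposed by real-valued sinusoids can be illustrated in the single-tone case. A real tone x[n] = A cos(w n + phi) satisfies x[n] - 2 cos(w) x[n-1] + x[n-2] = 0, so its annihilating filter is the symmetric filter [1, -2 cos(w), 1]. The sketch below fits that constrained filter by least squares; it shows only the constrained design, not the Hilbert annihilating filter approach of the abstract, and the function name and test values are assumptions.

```python
import math

def estimate_frequency_real_tone(x):
    """Estimate w (radians/sample) of a real tone using the symmetric
    annihilating filter [1, -2*cos(w), 1], fitted by least squares."""
    num = 0.0
    den = 0.0
    for n in range(2, len(x)):
        # x[n] + x[n-2] = 2*cos(w)*x[n-1]; solve for cos(w) in the LS sense
        num += x[n - 1] * (x[n] + x[n - 2])
        den += 2.0 * x[n - 1] ** 2
    cos_w = max(-1.0, min(1.0, num / den))  # guard acos against rounding
    return math.acos(cos_w)

# Hypothetical noise-free test tone
omega = 0.7
x = [1.5 * math.cos(omega * n + 0.3) for n in range(32)]
omega_hat = estimate_frequency_real_tone(x)
print(omega_hat)
```

Because acos becomes very sensitive as cos(w) approaches 1 or -1, estimates of this kind degrade near w = 0 and w = pi, which is the region where the abstract reports its largest gains.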

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We address the problem of designing an optimal pointwise shrinkage estimator in the transform domain, based on the minimum probability of error (MPE) criterion. We assume an additive model for the noise corrupting the clean signal. The proposed formulation is general in the sense that it can handle various noise distributions. We consider several noise distributions (Gaussian, Student's-t, and Laplacian) and compare the denoising performance of the resulting estimator with that of mean-squared error (MSE)-based estimators. The MSE optimization is carried out using an unbiased estimator of the MSE, namely Stein's Unbiased Risk Estimate (SURE). Experimental results show that the MPE estimator outperforms the SURE estimator in terms of the SNR of the denoised output, for low (0-10 dB) and medium (10-20 dB) values of the input SNR.
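As a point of reference for the SURE-based baseline, the sketch below selects a soft threshold by minimizing SURE under Gaussian noise of known standard deviation. Soft thresholding is assumed here as the pointwise shrinkage function purely for illustration; the abstract does not state which shrinkage function is used, and the MPE estimator itself is not reproduced. All function names and the sparse test signal are hypothetical.

```python
import random

def soft_threshold(y, t):
    """Pointwise soft-thresholding shrinkage."""
    if y > t:
        return y - t
    if y < -t:
        return y + t
    return 0.0

def sure_soft(coeffs, t, sigma):
    """Stein's Unbiased Risk Estimate of the total squared error of soft
    thresholding at t, for i.i.d. Gaussian noise with standard deviation sigma."""
    n = len(coeffs)
    n_small = sum(1 for y in coeffs if abs(y) <= t)
    clipped = sum(min(abs(y), t) ** 2 for y in coeffs)
    return n * sigma ** 2 - 2 * sigma ** 2 * n_small + clipped

def sure_threshold(coeffs, sigma):
    # SURE is piecewise increasing between the sorted magnitudes, so the
    # minimiser is 0 or one of the observed magnitudes.
    candidates = [0.0] + sorted(abs(y) for y in coeffs)
    return min(candidates, key=lambda t: sure_soft(coeffs, t, sigma))

# Hypothetical sparse transform-domain signal with additive Gaussian noise
rng = random.Random(0)
sigma = 1.0
clean = [5.0 if i % 16 == 0 else 0.0 for i in range(256)]
noisy = [c + rng.gauss(0.0, sigma) for c in clean]

t_star = sure_threshold(noisy, sigma)
denoised = [soft_threshold(y, t_star) for y in noisy]
err_noisy = sum((y - c) ** 2 for y, c in zip(noisy, clean))
err_denoised = sum((d - c) ** 2 for d, c in zip(denoised, clean))
print(t_star, err_noisy, err_denoised)
```

SURE needs only the noisy coefficients and the noise level, never the clean signal, which is what makes MSE optimization feasible in practice.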

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The electromagnetic articulography (EMA) technique is used to record the kinematics of different articulators while a person speaks. EMA data often contain missing segments due to sensor failure. In this work, we propose maximum a posteriori (MAP) estimation with a continuity constraint to recover the missing samples in the articulatory trajectories recorded using EMA. This approach combines the benefits of statistical MAP estimation with the temporal continuity of the articulatory trajectories. Experiments on an articulatory corpus using different missing segment durations show that the proposed continuity constraint results in a 30% reduction in average root mean squared error over statistical estimation of the missing segments without any continuity constraint.

Relevance:

100.00% 100.00%

Publisher: