929 resultados para Audio acoustics


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose an iterative algorithm to detect transient segments in audio signals. Short time Fourier transform(STFT) is used to detect rapid local changes in the audio signal. The algorithm has two steps that iteratively - (a) calculate a function of the STFT and (b) build a transient signal. A dynamic thresholding scheme is used to locate the potential positions of transients in the signal. The iterative procedure ensures that genuine transients are built up while the localised spectral noise are suppressed by using an energy criterion. The extracted transient signal is later compared to a ground truth dataset. The algorithm performed well on two databases. On the EBU-SQAM database of monophonic sounds, the algorithm achieved an F-measure of 90% while on our database of polyphonic audio an F-measure of 91% was achieved. This technique is being used as a preprocessing step for a tempo analysis algorithm and a TSR (Transients + Sines + Residue) decomposition scheme.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Exhaust noise in engines has always been a major source of automotive noise. Challenges for muffler design have been constraints on size, back pressure, and, of course, the cost. Designing for sufficient insertion loss at the engine firing frequency and the first few harmonics has been the biggest challenge. Most advances in the design of efficient mufflers have resulted from linear plane wave theory, making use of the transfer matrix method. This review paper deals with evaluating approximate source characteristics required for prediction of the unmuffled intake and exhaust noise, making use of the electroacoustical analogies. In the last few years, significant advances have been made in the analysis of variable area perforated ducts, transverse plane wave analysis of short elliptical as well as circular chambers, double-tuned expansion chambers and concentric tube resonators, catalytic converters, diesel particulate filters, air cleaners, etc. The development of long strand fibrous materials that can be used in hot exhaust systems without binders has led to the use of combination mufflers in exhaust systems. Breakthroughs have been achieved in the prediction and control of breakout noise from the elliptical and circular muffler shell as well as the end plates of typical mufflers. Diesel particulate filters and inlet air cleaners have also been modeled acoustically. Some of these recent advances are the subject of this review paper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We address the problem of temporal envelope modeling for transient audio signals. We propose the Gamma distribution function (GDF) as a suitable candidate for modeling the envelope keeping in view some of its interesting properties such as asymmetry, causality, near-optimal time-bandwidth product, controllability of rise and decay, etc. The problem of finding the parameters of the GDF becomes a nonlinear regression problem. We overcome the hurdle by using a logarithmic envelope fit, which reduces the problem to one of linear regression. The logarithmic transformation also has the feature of dynamic range compression. Since temporal envelopes of audio signals are not uniformly distributed, in order to compute the amplitude, we investigate the importance of various loss functions for regression. Based on synthesized data experiments, wherein we have a ground truth, and real-world signals, we observe that the least-squares technique gives reasonably accurate amplitude estimates compared with other loss functions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The trapezoidal rule, which is a special case of the Newmark family of algorithms, is one of the most widely used methods for transient hyperbolic problems. In this work, we show that this rule conserves linear and angular momenta and energy in the case of undamped linear elastodynamics problems, and an ``energy-like measure'' in the case of undamped acoustic problems. These conservation properties, thus, provide a rational basis for using this algorithm. In linear elastodynamics problems, variants of the trapezoidal rule that incorporate ``high-frequency'' dissipation are often used, since the higher frequencies, which are not approximated properly by the standard displacement-based approach, often result in unphysical behavior. Instead of modifying the trapezoidal algorithm, we propose using a hybrid finite element framework for constructing the stiffness matrix. Hybrid finite elements, which are based on a two-field variational formulation involving displacement and stresses, are known to approximate the eigenvalues much more accurately than the standard displacement-based approach, thereby either bypassing or reducing the need for high-frequency dissipation. We show this by means of several examples, where we compare the numerical solutions obtained using the displacement-based and hybrid approaches against analytical solutions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a new state transition based embedding (STBE) technique for audio watermarking with high fidelity. Furthermore, we propose a new correlation based encoding (CBE) scheme for binary logo image in order to enhance the payload capacity. The result of CBE is also compared with standard run-length encoding (RLE) compression and Huffman schemes. Most of the watermarking algorithms are based on modulating selected transform domain feature of an audio segment in order to embed given watermark bit. In the proposed STBE method instead of modulating feature of each and every segment to embed data, our aim is to retain the default value of this feature for most of the segments. Thus, a high quality of watermarked audio is maintained. Here, the difference between the mean values (Mdiff) of insignificant complex cepstrum transform (CCT) coefficients of down-sampled subsets is selected as a robust feature for embedding. Mdiff values of the frames are changed only when certain conditions are met. Hence, almost 50% of the times, segments are not changed and still STBE can convey watermark information at receiver side. STBE also exhibits a partial restoration feature by which the watermarked audio can be restored partially after extraction of the watermark at detector side. The psychoacoustic model analysis showed that the noise-masking ratio (NMR) of our system is less than -10dB. As amplitude scaling in time domain does not affect selected insignificant CCT coefficients, strong invariance towards amplitude scaling attacks is also proved theoretically. Experimental results reveal that the proposed watermarking scheme maintains high audio quality and are simultaneously robust to general attacks like MP3 compression, amplitude scaling, additive noise, re-quantization, etc.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents speaker normalization approaches for audio search task. Conventional state-of-the-art feature set, viz., Mel Frequency Cepstral Coefficients (MFCC) is known to contain speaker-specific and linguistic information implicitly. This might create problem for speaker-independent audio search task. In this paper, universal warping-based approach is used for vocal tract length normalization in audio search. In particular, features such as scale transform and warped linear prediction are used to compensate speaker variability in audio matching. The advantage of these features over conventional feature set is that they apply universal frequency warping for both the templates to be matched during audio search. The performance of Scale Transform Cepstral Coefficients (STCC) and Warped Linear Prediction Cepstral Coefficients (WLPCC) are about 3% higher than the state-of-the-art MFCC feature sets on TIMIT database.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose apractical, feature-level and score-level fusion approach by combining acoustic and estimated articulatory information for both text independent and text dependent speaker verification. From a practical point of view, we study how to improve speaker verification performance by combining dynamic articulatory information with the conventional acoustic features. On text independent speaker verification, we find that concatenating articulatory features obtained from measured speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves the performance dramatically. However, since directly measuring articulatory data is not feasible in many real world applications, we also experiment with estimated articulatory features obtained through acoustic-to-articulatory inversion. We explore both feature level and score level fusion methods and find that the overall system performance is significantly enhanced even with estimated articulatory features. Such a performance boost could be due to the inter-speaker variation information embedded in the estimated articulatory features. Since the dynamics of articulation contain important information, we included inverted articulatory trajectories in text dependent speaker verification. We demonstrate that the articulatory constraints introduced by inverted articulatory features help to reject wrong password trials and improve the performance after score level fusion. We evaluate the proposed methods on the X-ray Microbeam database and the RSR 2015 database, respectively, for the aforementioned two tasks. Experimental results show that we achieve more than 15% relative equal error rate reduction for both speaker verification tasks. (C) 2015 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The fluid mechanics of water entry is studied through investigating the underwater acoustics and the supercavitation. Underwater acoustic signals in water entry are extensively measured at about 30 different positions by using a PVDF needle hydrophone. From the measurements we obtain (1) the primary shock wave caused by the impact of the blunt body on free surface; (2) the vapor pressure inside the cavity; (3) the secondary shock wave caused by pulling away of the cavity from free surface; and so on. The supercavitation induced by the blunt body is observed by using a digital high-speed video camera as well as the single shot photography. The periodic and 3 dimensional motion of the supercavitation is revealed. The experiment is carried out at room temperature.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In speech recognition systems language model (LMs) are often constructed by training and combining multiple n-gram models. They can be either used to represent different genres or tasks found in diverse text sources, or capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we derive the a posteriori probability for the location of bursts of noise additively superimposed on a Gaussian AR process. The theory is developed to give a sequentially based restoration algorithm suitable for real-time applications. The algorithm is particularly appropriate for digital audio restoration, where clicks and scratches may be modelled as additive bursts of noise. Experiments are carried out on both real audio data and synthetic AR processes and Significant improvements are demonstrated over existing restoration techniques. © 1995 IEEE

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We use reversible jump Markov chain Monte Carlo (MCMC) methods to address the problem of model order uncertainty in autoregressive (AR) time series within a Bayesian framework. Efficient model jumping is achieved by proposing model space moves from the full conditional density for the AR parameters, which is obtained analytically. This is compared with an alternative method, for which the moves are cheaper to compute, in which proposals are made only for new parameters in each move. Results are presented for both synthetic and audio time series.