14 results for Audio indexing

at Indian Institute of Science - Bangalore - India


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Time-frequency analysis of various simulated and experimental signals due to elastic wave scattering from damage is performed using the wavelet transform (WT) and the Hilbert-Huang transform (HHT), and their performance is compared in the context of quantifying the damage. The spectral finite element method is employed for numerical simulation of wave scattering. An analytical study is carried out to examine the effects of higher-order damage parameters on the wave reflected from a damage. Based on this study, error bounds are computed for the signals in the spectral and time-frequency domains. It is shown how such an error bound can provide an estimate of the error in modelling wave propagation in a structure with damage. Measures of damage based on WT and HHT are derived to quantify the damage information hidden in the signal. The aim of this study is to obtain detailed insights into the problems of (1) identifying localised damage, (2) dispersion of multifrequency non-stationary signals after they interact with various types of damage, and (3) quantifying the damage. Sensitivity analysis of the scattered-wave signal based on its time-frequency representation helps to correlate the variation of damage index measures with damage parameters such as damage size and material degradation factors.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Indexing of a decagonal quasicrystal using the scheme utilizing five planar vectors and one perpendicular to them is examined in detail. A method for determining the indices of zone axes that a reciprocal vector would make in a decagonal phase of any periodicity has been proposed. By this method, the location of the zone axes made by any reciprocal vector can be predicted. The orthogonality condition has been simplified for the zone axes containing twofold vectors. The locations of zone axes have also been determined by an alternative method, utilizing spherical trigonometric calculations, which confirm the zone-axis locations given by the indices. The effect of one-dimensional periodicity on the indices and the accuracy of the zone-axis determination is discussed. Rules for the formation of zone axes between several reciprocal vectors and the prediction of all the reciprocal vectors in a zone are evolved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The least path criterion or least path length in the context of redundant basis vector systems is discussed and a mathematical proof is presented of the uniqueness of indices obtained by applying the least path criterion. Though the method has greater generality, this paper concentrates on the two-dimensional decagonal lattice. The order of redundancy is also discussed; this will help eventually to correlate with other redundant but desirable indexing sets.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Communication applications are usually delay-constrained, especially when musicians play together over the Internet. This requires a one-way delay of at most 25 ms, and high audio quality is desired at feasible bit rates. The ultra low delay (ULD) audio coding structure is well suited to this application, and we further investigate the application of multistage vector quantization (MSVQ) to reach a bit rate range below 64 kb/s in a scalable manner. Results at 32 kb/s and 64 kb/s show that the trained-codebook MSVQ performs best, better than KLT normalization followed by a simulated Gaussian MSVQ or a simulated Gaussian MSVQ alone. The results also show that there is only a weak dependence on the training data, and that we indeed converge to the perceptual quality of our previous ULD coder at 64 kb/s.
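To illustrate the multistage idea mentioned in this abstract, here is a minimal sketch of a greedy multistage vector quantizer in which each stage quantizes the residual left by the previous stage. The codebooks are random and include a zero vector purely for illustration; they are assumptions and are not the trained or simulated Gaussian codebooks evaluated in the paper. The function names are hypothetical.

```python
import numpy as np

def msvq_encode(x, codebooks):
    """Greedy multistage VQ: each stage quantizes the residual of the previous one."""
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        # Pick the codeword closest (squared Euclidean distance) to the current residual.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices

def msvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

# Illustrative (assumed) setup: 3 stages, 16 codewords each, i.e. 4 bits per stage.
rng = np.random.default_rng(0)
dim = 8
codebooks = [
    np.vstack([np.zeros(dim), rng.normal(scale=s, size=(15, dim))])
    for s in (1.0, 0.5, 0.25)
]
x = rng.normal(size=dim)
indices = msvq_encode(x, codebooks)
x_hat = msvq_decode(indices, codebooks)
```

Because every stage adds a fixed number of bits, dropping later stages lowers the bit rate at the cost of a coarser reconstruction, which is one way such a scheme can be made scalable.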

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose a parametric stereo coding analysis and synthesis directly in the MDCT domain using an analysis-by-synthesis parameter estimation. The stereo signal is represented by an equalized sum signal and spatialization parameters. The equalized sum signal and the spatialization parameters are obtained by sub-band analysis in the MDCT domain. The de-correlated signal required for the stereo synthesis is also generated in the MDCT domain. A subjective evaluation test using MUSHRA shows that the synthesized stereo signal is perceptually satisfactory and comparable to state-of-the-art parametric coders.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Pre-whitening techniques are employed in blind correlation detection of additive spread spectrum watermarks in audio signals to reduce the host signal interference. A direct deterministic whitening (DDW) scheme is derived in this paper from the frequency domain analysis of the time domain correlation process. Our experimental studies reveal that Savitzky-Golay whitening (SGW), which is otherwise inferior to the DDW technique, performs better when the audio signal is predominantly lowpass. The novelty of this paper lies in exploiting the complementary nature of the two whitening techniques to obtain a hybrid whitening (HbW) scheme. In the hybrid scheme, the DDW and SGW techniques are selectively applied, based on short-time spectral characteristics of the audio signal. The hybrid scheme extends the reliability of watermark detection to a wider range of audio signals.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Indexing of a decagonal quasicrystal using the scheme utilizing five planar vectors and one perpendicular to them is examined in detail. A method for determining the indices of zone axes that a reciprocal vector would make in a decagonal phase of any periodicity has been proposed. By this method, the location of the zone axes made by any reciprocal vector can be predicted. The orthogonality condition has been simplified for the zone axes containing twofold vectors. The locations of zone axes have also been determined by an alternative method, utilizing spherical trigonometric calculations, which confirm the zone-axis locations given by the indices. The effect of one-dimensional periodicity on the indices and the accuracy of the zone-axis determination is discussed. Rules for the formation of zone axes between several reciprocal vectors and the prediction of all the reciprocal vectors in a zone are evolved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose an iterative algorithm to detect transient segments in audio signals. The short-time Fourier transform (STFT) is used to detect rapid local changes in the audio signal. The algorithm has two steps that are applied iteratively: (a) calculate a function of the STFT and (b) build a transient signal. A dynamic thresholding scheme is used to locate the potential positions of transients in the signal. The iterative procedure ensures that genuine transients are built up while localised spectral noise is suppressed by using an energy criterion. The extracted transient signal is later compared to a ground truth dataset. The algorithm performed well on two databases. On the EBU-SQAM database of monophonic sounds, the algorithm achieved an F-measure of 90%, while on our database of polyphonic audio an F-measure of 91% was achieved. This technique is being used as a preprocessing step for a tempo analysis algorithm and a TSR (Transients + Sines + Residue) decomposition scheme.
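As a rough illustration of STFT-based detection with a dynamic threshold, the sketch below uses a single-pass spectral-flux detector with a moving-median threshold. This is a simpler, commonly used technique swapped in for illustration; it is not the iterative algorithm of the abstract, and the frame size, hop, threshold scale, and floor are all assumed values.

```python
import numpy as np

def stft_magnitude(x, frame=256, hop=128):
    """Magnitude STFT with a Hann window (frames along axis 0)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def detect_transient_frames(x, frame=256, hop=128, median_len=11, scale=3.0, floor=0.1):
    """Return indices of frames whose spectral flux exceeds a local, data-dependent threshold."""
    mag = stft_magnitude(x, frame, hop)
    # Spectral flux: summed positive magnitude increase from one frame to the next.
    flux = np.concatenate([[0.0], np.maximum(np.diff(mag, axis=0), 0.0).sum(axis=1)])
    # Dynamic threshold: scaled moving median plus a small floor relative to the peak flux.
    half = median_len // 2
    padded = np.pad(flux, half, mode="edge")
    local_median = np.array([np.median(padded[i:i + median_len]) for i in range(len(flux))])
    threshold = scale * local_median + floor * flux.max()
    return np.flatnonzero(flux > threshold)

# Synthetic test signal (assumed): a steady 1 kHz tone with a short click at sample 4000.
fs = 8000
n = np.arange(8192)
x = 0.1 * np.sin(2 * np.pi * 1000 * n / fs)
x[4000:4032] += 1.0
frames_hit = detect_transient_frames(x)
onset_times = frames_hit * 128 / fs  # frame start times in seconds
```

With a hop of 128 samples, the first frame that contains the click starts at sample 3840 (frame 30), so a detection is expected there and none in the stationary part before it.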

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We address the problem of temporal envelope modeling for transient audio signals. We propose the Gamma distribution function (GDF) as a suitable candidate for modeling the envelope keeping in view some of its interesting properties such as asymmetry, causality, near-optimal time-bandwidth product, controllability of rise and decay, etc. The problem of finding the parameters of the GDF becomes a nonlinear regression problem. We overcome the hurdle by using a logarithmic envelope fit, which reduces the problem to one of linear regression. The logarithmic transformation also has the feature of dynamic range compression. Since temporal envelopes of audio signals are not uniformly distributed, in order to compute the amplitude, we investigate the importance of various loss functions for regression. Based on synthesized data experiments, wherein we have a ground truth, and real-world signals, we observe that the least-squares technique gives reasonably accurate amplitude estimates compared with other loss functions.
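The logarithmic fit described in this abstract can be sketched as follows. The parametrization env(t) = A * t^alpha * exp(-beta * t) is an assumption made for illustration (the abstract does not give the exact form); under it, log env(t) = log A + alpha * log t - beta * t is linear in (log A, alpha, beta), so ordinary least squares applies. The function name and the synthetic parameters are hypothetical.

```python
import numpy as np

def fit_gamma_envelope(t, env):
    """Fit env(t) ~ A * t**alpha * exp(-beta * t) by linear least squares on log(env).

    Requires t > 0 and env > 0 so that the logarithms are defined.
    """
    t = np.asarray(t, dtype=float)
    env = np.asarray(env, dtype=float)
    # Columns correspond to the unknowns log(A), alpha, beta.
    design = np.column_stack([np.ones_like(t), np.log(t), -t])
    coeffs, *_ = np.linalg.lstsq(design, np.log(env), rcond=None)
    log_a, alpha, beta = coeffs
    return np.exp(log_a), alpha, beta

# Noiseless synthetic envelope with known (assumed) parameters.
t = np.linspace(0.001, 0.2, 400)
true_a, true_alpha, true_beta = 50.0, 1.5, 40.0
env = true_a * t**true_alpha * np.exp(-true_beta * t)
a_hat, alpha_hat, beta_hat = fit_gamma_envelope(t, env)
```

Note that least squares in the log domain weights small envelope values relatively more heavily than a fit in the linear domain would, which is one reason the choice of loss function matters for the amplitude estimate.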

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we propose a new state transition based embedding (STBE) technique for audio watermarking with high fidelity. Furthermore, we propose a new correlation based encoding (CBE) scheme for a binary logo image in order to enhance the payload capacity. The result of CBE is also compared with standard run-length encoding (RLE) compression and Huffman schemes. Most watermarking algorithms are based on modulating a selected transform domain feature of an audio segment in order to embed a given watermark bit. In the proposed STBE method, instead of modulating the feature of each and every segment to embed data, our aim is to retain the default value of this feature for most of the segments. Thus, a high quality of watermarked audio is maintained. Here, the difference between the mean values (Mdiff) of insignificant complex cepstrum transform (CCT) coefficients of down-sampled subsets is selected as a robust feature for embedding. Mdiff values of the frames are changed only when certain conditions are met. Hence, almost 50% of the time, segments are not changed and STBE can still convey watermark information at the receiver side. STBE also exhibits a partial restoration feature by which the watermarked audio can be restored partially after extraction of the watermark at the detector side. The psychoacoustic model analysis showed that the noise-masking ratio (NMR) of our system is less than -10 dB. As amplitude scaling in the time domain does not affect the selected insignificant CCT coefficients, strong invariance towards amplitude scaling attacks is also proved theoretically. Experimental results reveal that the proposed watermarking scheme maintains high audio quality and is simultaneously robust to general attacks like MP3 compression, amplitude scaling, additive noise, re-quantization, etc.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents speaker normalization approaches for the audio search task. The conventional state-of-the-art feature set, viz., Mel Frequency Cepstral Coefficients (MFCC), is known to contain speaker-specific and linguistic information implicitly. This might create problems for a speaker-independent audio search task. In this paper, a universal warping-based approach is used for vocal tract length normalization in audio search. In particular, features such as the scale transform and warped linear prediction are used to compensate for speaker variability in audio matching. The advantage of these features over the conventional feature set is that they apply universal frequency warping to both the templates to be matched during audio search. The performance of Scale Transform Cepstral Coefficients (STCC) and Warped Linear Prediction Cepstral Coefficients (WLPCC) is about 3% higher than that of the state-of-the-art MFCC feature sets on the TIMIT database.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Monitoring gas purity is an important aspect of gas recovery stations where air is usually one of the major impurities. Purity monitors of Katherometric type are commercially available for this purpose. Alternatively, we discuss here a helium gas purity monitor based on acoustic resonance of a cavity at audio frequencies. It measures the purity by monitoring the resonant frequency of a cylindrical cavity filled with the gas under test and excited by conventional telephone transducers fixed at the ends. The use of the latter simplifies the design considerably. The paper discusses the details of the resonant cavity and the electronic circuit along with temperature compensation. The unit has been calibrated with helium gas of known purities. The unit has a response time of the order of 10 minutes and measures the gas purity to an accuracy of 0.02%. The unit has been installed in our helium recovery system and is found to perform satisfactorily.
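The principle behind such a monitor can be sketched with an ideal-gas model: the speed of sound depends on the mean molar mass and heat-capacity ratio of the mixture, and the resonant frequency of the cavity scales with the speed of sound. The sketch below rests on several assumptions that are not taken from the paper: a helium-air mixture of ideal gases (air treated as diatomic), the fundamental longitudinal mode f = c / (2L) of a cavity closed at both ends, and a hypothetical cavity length of 0.1 m.

```python
import math

R = 8.314462618          # universal gas constant, J/(mol K)
M_HE = 4.0026e-3         # molar mass of helium, kg/mol
M_AIR = 28.97e-3         # mean molar mass of dry air, kg/mol

def sound_speed(x_he, temp_k):
    """Ideal-gas speed of sound for a helium mole fraction x_he in air."""
    molar_mass = x_he * M_HE + (1 - x_he) * M_AIR
    cp = x_he * 2.5 + (1 - x_he) * 3.5   # molar heat capacity in units of R
    cv = cp - 1.0                        # ideal gas: Cp - Cv = R
    return math.sqrt((cp / cv) * R * temp_k / molar_mass)

def resonant_frequency(x_he, temp_k, length_m):
    """Fundamental longitudinal resonance of a closed-closed cavity (assumed mode)."""
    return sound_speed(x_he, temp_k) / (2 * length_m)

def purity_from_frequency(freq_hz, temp_k, length_m, tol=1e-9):
    """Invert the model by bisection; frequency rises monotonically with helium fraction."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if resonant_frequency(mid, temp_k, length_m) < freq_hz:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with assumed values: 98% helium at 293.15 K in a 0.1 m cavity.
f_meas = resonant_frequency(0.98, 293.15, 0.1)
x_est = purity_from_frequency(f_meas, 293.15, 0.1)
```

In this model the frequency also scales with the square root of the absolute temperature, which is why a practical instrument needs the temperature compensation the abstract mentions.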

Relevance:

10.00% 10.00%

Publisher:

Abstract:

With the availability of a huge amount of video data from various sources, efficient video retrieval tools are increasingly in demand. Since video is multi-modal data, the perception of "relevance" between the user-provided query video (in the case of Query-By-Example video search) and the retrieved video clips is subjective in nature. We present an efficient video retrieval method that takes the user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple but powerful feature weight optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built to demonstrate the performance of the proposed method. The query and database videos are indexed using conventional video features like color, texture, etc. However, we use comprehensive and novel methods of feature representation, and a spatio-temporal distance measure, to retrieve the top M videos that are similar to the query. In the feedback phase, the user's iterative feedback on the previously retrieved videos is used to automatically reformulate the QFV weights (a measure of importance) so that they reflect the user's preference. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA-based RF for user-oriented feature weight optimization distinguishes the proposed method from existing ones. The experimental results show that the proposed RF-based video retrieval exhibits good performance.
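For readers unfamiliar with SPSA, the sketch below shows the basic update: all weights are perturbed simultaneously by a random ±1 vector, and the gradient is estimated from just two loss evaluations per iteration. The loss used here (squared distance to a hypothetical target weight vector) merely stands in for a relevance-feedback objective; it, the gain constants, and the function name are assumptions and not the paper's formulation.

```python
import numpy as np

def spsa_minimize(loss, w0, n_iter=200, a=0.2, c=0.1, big_a=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize `loss` with Simultaneous Perturbation Stochastic Approximation."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1 + big_a) ** alpha      # decaying step size
        c_k = c / (k + 1) ** gamma              # decaying perturbation size
        delta = rng.choice([-1.0, 1.0], size=w.shape)  # random +/-1 perturbation
        # Two loss evaluations give a gradient estimate for every coordinate at once.
        g_hat = (loss(w + c_k * delta) - loss(w - c_k * delta)) / (2.0 * c_k * delta)
        w = w - a_k * g_hat
    return w

# Hypothetical objective: recover target feature weights (e.g. color, texture, ...).
target = np.array([0.1, 0.2, 0.3, 0.4])
loss = lambda w: float(np.sum((w - target) ** 2))
w_start = np.full(4, 0.25)
w_final = spsa_minimize(loss, w_start)
```

The appeal of SPSA in a feedback setting is that the cost per iteration does not grow with the number of weights, since only two evaluations of the objective are needed regardless of dimension.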

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This article discusses the design and development of GRDB (General Purpose Relational Data Base System), which has been implemented on a DEC-1090 system in Pascal. GRDB is a general purpose database system designed to be completely independent of the nature of the data to be handled, since it is not tailored to the specific requirements of any particular enterprise. It can handle different types of data such as variable length records and textual data. Apart from the usual database facilities such as data definition and data manipulation, GRDB supports a User Definition Language (UDL) and a security definition language. These facilities are provided through a SEQUEL-like General Purpose Query Language (GQL). GRDB provides adequate protection facilities up to the relation level. The concept of a “security matrix” is used to provide database protection. A Unique IDentification number (UID) and password are used to ensure user identification and authentication. Static integrity constraints are used to ensure data integrity. Considerable effort has been made to improve the response time through indexing on the data files and query optimisation. GRDB is designed for interactive use, but alternate provision has also been made for its use in batch mode. A typical Air Force application (consisting of data about personnel, inventory control, and maintenance planning) has been used to test GRDB, and it has been found to perform satisfactorily.