995 results for International TV
Abstract:
We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which the instruments that are active at a given time can be identified. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.
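The weight-estimation step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each instrument is a single univariate Student's-t density (rather than a full tMM over audio features), holds those densities fixed, and runs EM only on the mixture weights. The function names, parameters, and toy data are all hypothetical.

```python
import math
import random

def student_t_pdf(x, loc, scale, dof):
    """Density of a univariate Student's-t distribution."""
    z = (x - loc) / scale
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi) - math.log(scale))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(z * z / dof))

def em_mixture_weights(frames, component_pdfs, n_iter=100):
    """EM for mixture weights only; the component densities stay fixed."""
    k = len(component_pdfs)
    weights = [1.0 / k] * k
    # Likelihood of every frame under every fixed component, computed once.
    lik = [[pdf(x) for pdf in component_pdfs] for x in frames]
    for _ in range(n_iter):
        totals = [0.0] * k
        for row in lik:
            # E-step: posterior responsibility of each component for this frame.
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # frame explained by no component; skip it
            for j in range(k):
                totals[j] += joint[j] / norm
        # M-step: new weights are the normalized responsibility totals.
        total = sum(totals)
        weights = [t / total for t in totals]
    return weights

def sample_t(loc, scale, dof, rng):
    """Draw one Student's-t sample as a scaled Gaussian / chi-square ratio."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(dof / 2, 2.0)  # chi-square with `dof` degrees of freedom
    return loc + scale * z / math.sqrt(v / dof)

# Toy setup (assumed): three "instrument" models, only the first and third active.
rng = random.Random(0)
locs = (-6.0, 0.0, 6.0)
component_pdfs = [lambda x, m=m: student_t_pdf(x, m, 1.0, 4.0) for m in locs]
frames = ([sample_t(-6.0, 1.0, 4.0, rng) for _ in range(200)]
          + [sample_t(6.0, 1.0, 4.0, rng) for _ in range(200)])
weights = em_mixture_weights(frames, component_pdfs)
print(weights)
```

In this toy setting the weight of the inactive middle component shrinks toward zero, which is the sense in which the weights indicate activity. Running the same estimate over successive time windows would give one column per window of an activity graph.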
Abstract:
Time-varying linear prediction has been studied in the context of speech signals, in which the auto-regressive (AR) coefficients of the system function are modeled as a linear combination of a set of known bases. Traditionally, least squares minimization is used to estimate the model parameters of the system. Motivated by the sparse nature of the excitation signal for voiced sounds, we explore time-varying linear prediction modeling of speech signals using sparsity constraints. Parameter estimation is posed as an ℓ0-norm minimization problem. The re-weighted ℓ1-norm minimization technique is used to estimate the model parameters. We show that for sparsely excited time-varying systems, the formulation models the underlying system function better than the least squares error minimization approach. Evaluation with synthetic and real speech examples shows that the estimated model parameters track the formant trajectories more closely than the least squares approach.
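One way to write down the formulation sketched in this abstract is given below. The notation is assumed for illustration and is not taken from the paper: $x[n]$ is the speech signal, $e[n]$ the excitation (prediction residual), $p$ the model order, $f_j[n]$ the known basis functions, $c_{kj}$ the expansion coefficients to be estimated, and $\epsilon > 0$ a small constant. The sign convention of the predictor is likewise an assumption.

```latex
% Time-varying AR model with coefficients expanded on known bases
x[n] = -\sum_{k=1}^{p} a_k[n]\, x[n-k] + e[n],
\qquad
a_k[n] = \sum_{j=0}^{q-1} c_{kj}\, f_j[n]

% Sparse-excitation estimate (combinatorial)
\hat{c} = \arg\min_{c} \; \lVert e \rVert_0

% Re-weighted l1 relaxation, iterated over i = 0, 1, 2, \dots
c^{(i+1)} = \arg\min_{c} \sum_{n} w_n^{(i)} \,\bigl| e[n] \bigr|,
\qquad
w_n^{(i+1)} = \frac{1}{\bigl| e^{(i+1)}[n] \bigr| + \epsilon}
```

Because $e[n]$ is linear in the coefficients $c_{kj}$, each re-weighted step is a convex problem. Setting all weights to one and replacing the absolute value with a square recovers the traditional least squares estimate mentioned in the abstract.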
Abstract:
We propose data acquisition from continuous-time signals belonging to the class of real-valued trigonometric polynomials using an event-triggered sampling paradigm. The sampling schemes proposed are level crossing (LC), close-to-extrema LC, and extrema sampling. We analyze the robustness of these schemes to jitter and to bandpass additive Gaussian noise. In general, these sampling schemes result in non-uniformly spaced sample instants. We address signal reconstruction from the acquired data set by imposing a sparsity structure on the signal model, which circumvents the gap and density constraints. The recovery performance of the various schemes is compared with one another and with that of a random sampling scheme. In the proposed approach, both sampling and reconstruction are non-linear operations and, in contrast to the random sampling methodologies proposed in compressive sensing, these techniques may be implemented in practice with low-power circuitry.
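A minimal sketch of level-crossing sampling is shown below, assuming a dense time grid with linear interpolation as a stand-in for an ideal continuous-time comparator. The test signal, the level values, and the function names are hypothetical and chosen only for illustration; the close-to-extrema and extrema schemes, and the sparse reconstruction step, are not covered.

```python
import math

def trig_poly(t, coeffs):
    """Real trigonometric polynomial: sum over k of a_k cos(2 pi k t) + b_k sin(2 pi k t)."""
    return sum(a * math.cos(2 * math.pi * k * t) + b * math.sin(2 * math.pi * k * t)
               for k, (a, b) in enumerate(coeffs))

def level_crossings(signal, levels, t_start=0.0, t_end=1.0, n_grid=10000):
    """Return sorted (time, level) pairs at which the signal crosses a level."""
    samples = []
    dt = (t_end - t_start) / n_grid
    prev_t, prev_x = t_start, signal(t_start)
    for i in range(1, n_grid + 1):
        t = t_start + i * dt
        x = signal(t)
        for level in levels:
            if x == level:
                # The signal lands exactly on the level at a grid point.
                samples.append((t, level))
            elif (prev_x - level) * (x - level) < 0:
                # Sign change within the step: refine by linear interpolation.
                frac = (level - prev_x) / (x - prev_x)
                samples.append((prev_t + frac * dt, level))
        prev_t, prev_x = t, x
    samples.sort()
    return samples

# Toy signal (assumed): sin(2 pi t) + 0.5 cos(4 pi t) over one period.
coeffs = [(0.0, 0.0), (0.0, 1.0), (0.5, 0.0)]
levels = [-0.4, 0.1, 0.6]

def signal(t):
    return trig_poly(t, coeffs)

samples = level_crossings(signal, levels)
gaps = [b[0] - a[0] for a, b in zip(samples, samples[1:])]
for time, level in samples:
    print(f"t = {time:.4f}, level = {level}")
```

Each sample records a time instant together with the level that was crossed, so the amplitude is known exactly and the information is carried by the (non-uniform) timing.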
Abstract:
We formulate the problem of detecting the constituent instruments in a polyphonic music piece as a joint decoding problem. From monophonic data, parametric Gaussian Mixture Hidden Markov Models (GM-HMM) are obtained for each instrument. We propose a method to use these models in a factorial framework, termed Factorial GM-HMM (F-GM-HMM). The states are jointly inferred to explain the evolution of each instrument in the mixture observation sequence. The dependencies are decoupled using a variational inference technique. We show that the joint time evolution of all instruments' states can be captured using F-GM-HMM. We compare the performance of the proposed method with that of the Student's-t mixture model (tMM) and GM-HMM in an existing latent variable framework. Experiments on polyphonies of two to five instruments, with 8 instrument models trained on the RWC dataset and tested on the RWC and TRIOS datasets, show that F-GM-HMM gives an advantage over the other considered models in segments containing co-occurring instruments.