16 results for Sight-reading (Music)

in Indian Institute of Science - Bangalore - India


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Automatic identification of the melody line in a MIDI file is an important step toward taking query-by-humming (QBH) systems to the next level. We present a novel algorithm to identify the melody line in a polyphonic MIDI file, using note pruning followed by track/channel ranking. We use results from musicology to derive simple heuristics for the note pruning stage; discarding "spurious" notes in this way makes the algorithm more robust. A ranking based on the melodic information in each track/channel then lets us choose the melody line accurately. Our algorithm makes no assumptions about performer-specific MIDI parameters, is simple, and identifies the melody line correctly with 97% accuracy. We currently use this algorithm in a QBH system built in our lab.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose a simple speech/music discriminator that uses features based on the HILN (Harmonics, Individual Lines and Noise) model. We tested the strength of the feature set on a standard database of 66 files and obtained an accuracy of around 97%. We also tested it on sung queries and polyphonic music, with very good results. The current algorithm is being used to discriminate between sung queries and played queries (using an instrument such as a flute) for a Query by Humming (QBH) system currently under development in the lab.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the direction of arrival (DOA) estimation problem, we encounter both finite data and insufficient knowledge of the array characterization. It is therefore important to study how subspace-based methods perform under such conditions. We analyze the finite data performance of the multiple signal classification (MUSIC) and minimum norm (min. norm) methods in the presence of sensor gain and phase errors, and derive expressions for the mean square error (MSE) in the DOA estimates. These expressions are first derived assuming an arbitrary array and then simplified for the special case of a uniform linear array with isotropic sensors. When they are further simplified for the case of finite data only and sensor errors only, they reduce to the recent results given in [9-12]. Computer simulations are used to verify the closeness between the predicted and simulated values of the MSE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We analyze the AlApana of a Carnatic music piece without prior knowledge of the singer or the rAga. AlApana is a means to communicate to the audience the flavor, or bhAva, of the rAga through its permitted notes and phrases. The input to our analysis is a recording of the vocal AlApana along with the accompanying instrument. The AdhAra shadja (base note) of the singer for that AlApana is estimated through a stochastic model of note frequencies. Based on the shadja, we identify the notes (swaras) used in the AlApana using a semi-continuous GMM. Using the probabilities of each note interval, we recognize the swaras of the AlApana. For sampurNa rAgas, we can identify the possible rAga based on the swaras. We achieve correct shadja identification, which is crucial to all further steps, in 88.8% of 55 AlApanas. Among them (48 AlApanas of 7 rAgas), we get 91.5% correct swara identification and 62.13% rAga (R) accuracy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose a new paradigm for displaying comments: showing comments alongside parts of the article they correspond to. We evaluate the effectiveness of various approaches for this task and show that a combination of bag of words and topic models performs the best.
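
The bag-of-words half of this idea can be sketched as follows. This is a minimal illustration, not the method evaluated in the abstract: it omits the topic-model component, uses plain cosine similarity without stop-word removal or stemming, and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_paragraph(comment, paragraphs):
    """Return the index of the paragraph most similar to the comment."""
    comment_vec = bag_of_words(comment)
    scores = [cosine(comment_vec, bag_of_words(p)) for p in paragraphs]
    return max(range(len(paragraphs)), key=scores.__getitem__)


# Hypothetical article paragraphs and reader comment.
paragraphs = [
    "Council approved new cycling lanes downtown.",
    "Shop owners fear losing parking spaces.",
]
comment = "Losing parking would hurt every shop."
print(best_paragraph(comment, paragraphs))
```

Here the comment shares three tokens with the second paragraph and none with the first, so it would be displayed alongside the second paragraph.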

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Compressive Sensing (CS) is a new sensing paradigm that permits sampling a signal at its intrinsic information rate, which can be much lower than the Nyquist rate, while guaranteeing good-quality reconstruction for signals that are sparse in a linear transform domain. We explore the application of the CS formulation to music signals. Since music signals comprise both tonal and transient components, we examine several transforms, such as the discrete cosine transform (DCT), the discrete wavelet transform (DWT), the Fourier basis, and non-orthogonal warped transforms, to explore the effectiveness of CS theory and the reconstruction algorithms. We show that, for a given sparsity level, DCT, overcomplete, and warped Fourier dictionaries result in better reconstruction, and that the warped Fourier dictionary gives perceptually better reconstruction. "MUSHRA" test results show that moderate-quality reconstruction is possible at about half the Nyquist sampling rate.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This brief discusses the convergence analysis of proportional navigation (PN) guidance law in the presence of delayed line-of-sight (LOS) rate information. The delay in the LOS rate is introduced by the missile guidance system that uses a low cost sensor to obtain LOS rate information by image processing techniques. A Lyapunov-like function is used to analyze the convergence of the delay differential equation (DDE) governing the evolution of the LOS rate. The time-to-go until which decreasing behaviour of the Lyapunov-like function can be guaranteed is obtained. Conditions on the delay for finite time convergence of the LOS rate are presented for the linearized engagement equation. It is observed that in the presence of line-of-sight rate delay, increasing the effective navigation constant of the PN guidance law deteriorates its performance. Numerical simulations are presented to validate the results.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Music signals comprise atomic notes drawn from a musical scale. The creation of musical sequences often involves splicing the notes in a constrained way, resulting in aesthetically appealing patterns. We develop an approach for music signal representation based on symbolic dynamics by translating the lexicographic rules over a musical scale to constraints on a Markov chain. This source representation is useful for machine-based music synthesis, in a way similar to a musician producing original music. To mathematically quantify the user's listening experience, we study the correlation between the max-entropic rate of a musical scale and the subjective aesthetic component. We present our analysis with examples from the south Indian classical music system.
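
As a rough illustration of the max-entropic rate mentioned above: in symbolic dynamics, the maximum entropy rate of a constrained sequence equals the logarithm of the largest eigenvalue of the 0/1 matrix of allowed transitions. The sketch below estimates it by power iteration. The function name and the toy transition rules are hypothetical and are not taken from the paper; the matrix is assumed to be irreducible and aperiodic so that the iteration converges.

```python
import math


def max_entropy_rate(allowed, iterations=200):
    """Estimate log2 of the largest eigenvalue of a 0/1 transition matrix.

    allowed[i][j] == 1 means note j may follow note i. The result is the
    maximum entropy rate in bits per note. Assumes an irreducible,
    aperiodic matrix (an assumption of this sketch).
    """
    n = len(allowed)
    vec = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        nxt = [sum(allowed[i][j] * vec[j] for j in range(n)) for i in range(n)]
        growth = max(nxt)                 # converges to the largest eigenvalue
        vec = [x / growth for x in nxt]   # renormalise so the largest entry is 1
    return math.log2(growth)


# Toy two-note rule: note 1 may not be repeated immediately.
toy_rules = [
    [1, 1],
    [1, 0],
]
print(max_entropy_rate(toy_rules))
```

For this toy rule the largest eigenvalue is the golden ratio, so the rate is below the 1 bit per note of an unconstrained two-note scale; tighter rules lower the rate further.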

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The ribosomal P-site hosts the peptidyl-tRNAs during translation elongation. Which P-site elements support these tRNA species to maintain codon-anticodon interactions has remained unclear. We investigated the effects of P-site features of methylations of G966, C967, and the conserved C-terminal tail sequence of Ser, Lys, and Arg (SKR) of the S9 ribosomal protein in maintenance of the translational reading frame of an mRNA. We generated Escherichia coli strains deleted for the SKR sequence in S9 ribosomal protein, RsmB (which methylates C967), and RsmD (which methylates G966) and used them to translate LacZ from its +1 and -1 out-of-frame constructs. We show that the S9 SKR tail prevents both the +1 and -1 frameshifts and plays a general role in holding the P-site tRNA/peptidyl-tRNA in place. In contrast, the G966 and C967 methylations did not make a direct contribution to the maintenance of the translational frame of an mRNA. However, deletion of rsmB in the S9 Δ3 background caused significantly increased -1 frameshifting at 37 °C. Interestingly, the effects of the deficiency of C967 methylation were annulled when the E. coli strain was grown at 30 °C, supporting its context-dependent role.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this work, interference alignment at finite powers is considered for a class of Gaussian interference networks with general message demands and line-of-sight (LOS) channels. We assume that each transmitter has one independent message to transmit and that the propagation delays are uniformly distributed between $0$ and $(L - 1)$ $(L > 0)$. If receiver-$j$, $j \in \{1, 2, \ldots, J\}$, requires the message of transmitter-$i$, $i \in \{1, 2, \ldots, K\}$, we say that $(i, j)$ belongs to a connection. A symmetrically connected interference network is defined as a network in which the number of connections required at each transmitter-$i$ is equal to $c_t$ for all $i$ and the number of connections required at each receiver-$j$ is equal to $c_r$ for all $j$, for some fixed positive integers $c_t$ and $c_r$. For such networks with an LOS channel between every transmitter and every receiver, we show that an expected sum-spectral efficiency (in bits/sec/Hz) of at least $\frac{K}{(e + c_1 - 1)(c_t + 1)} \left(\frac{c_t}{c_t + 1}\right)^{c_t} \log_2\left(1 + \min_{(i,j) \in \mathcal{C}} |h_{i,j}|^2 \frac{P}{W N_0}\right)$ can be achieved as the number of transmitters and receivers tends to infinity, i.e., $K, J \to \infty$, where $\mathcal{C}$ denotes the set of all connections, $h_{i,j}$ is the channel gain between transmitter-$i$ and receiver-$j$, $P$ is the average power constraint at each transmitter, $W$ is the bandwidth, and $N_0 W$ is the variance of the Gaussian noise at each receiver. This means that, for an LOS symmetrically connected interference network, at any finite power, the total spectral efficiency can grow linearly with $K$ as $K, J \to \infty$. This is achieved by extending the time domain interference alignment scheme proposed by Grokop et al. for the $k$-user Gaussian interference channel to interference networks.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which the instruments that are active at a given time can be determined. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.
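
To illustrate why the tonic matters for downstream analysis, the sketch below expresses a pitch as an interval above the tonic in cents, folded into one octave, so that melodies by artists with different tonics become comparable. It is a generic illustration and not one of the tonic identification methods reviewed in the paper; the function name and the example frequencies are hypothetical.

```python
import math


def cents_above_tonic(freq_hz, tonic_hz):
    """Interval from the tonic to a pitch, in cents, folded into [0, 1200)."""
    cents = 1200.0 * math.log2(freq_hz / tonic_hz)
    return cents % 1200.0


# Hypothetical singer with a 220 Hz tonic: a pitch at 330 Hz is a 3:2 ratio,
# i.e. roughly a perfect fifth above the tonic.
print(round(cents_above_tonic(330.0, 220.0), 1))
```

Once the tonic is known, the same interval is obtained regardless of the absolute pitch the artist chose, which is what makes tasks such as intonation analysis feasible.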

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Story understanding involves many perceptual and cognitive subprocesses, from perceiving individual words, to parsing sentences, to understanding the relationships among the story characters. We present an integrated computational model of reading that incorporates these and additional subprocesses, simultaneously discovering their fMRI signatures. Our model predicts the fMRI activity associated with reading arbitrary text passages, well enough to distinguish which of two story segments is being read with 74% accuracy. This approach is the first to simultaneously track diverse reading subprocesses during complex story processing and predict the detailed neural representation of diverse story features, ranging from visual word properties to the mention of different story characters and different actions they perform. We construct brain representation maps that replicate many results from a wide range of classical studies that focus each on one aspect of language processing and offer new insights on which type of information is processed by different areas involved in language processing. Additionally, this approach is promising for studying individual differences: it can be used to create single subject maps that may potentially be used to measure reading comprehension and diagnose reading disorders.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Computing the maximum of sensor readings arises in several environmental, health, and industrial monitoring applications of wireless sensor networks (WSNs). We characterize the several novel design trade-offs that arise when green energy harvesting (EH) WSNs, which promise perpetual lifetimes, are deployed for this purpose. The nodes harvest renewable energy from the environment for communicating their readings to a fusion node, which then periodically estimates the maximum. For a randomized transmission schedule in which a pre-specified number of randomly selected nodes transmit in a sensor data collection round, we analyze the mean absolute error (MAE), which is defined as the mean of the absolute difference between the maximum and that estimated by the fusion node in each round. We optimize the transmit power and the number of scheduled nodes to minimize the MAE, both when the nodes have channel state information (CSI) and when they do not. Our results highlight how the optimal system operation depends on the EH rate, availability and cost of acquiring CSI, quantization, and size of the scheduled subset. Our analysis applies to a general class of sensor reading and EH random processes.
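
The MAE defined above can be illustrated with a small Monte Carlo sketch. It is heavily simplified and is not the paper's analysis: energy harvesting, fading, CSI, quantization and transmission failures are all ignored, every scheduled node is assumed to deliver its reading, and readings are assumed to be uniform on [0, 1). The function name and parameters are hypothetical.

```python
import random


def mae_of_max(num_nodes, num_scheduled, rounds=2000, seed=0):
    """Monte Carlo estimate of the mean absolute error in the maximum.

    In each round a random subset of nodes reports its reading and the
    fusion node takes the largest reported value as its estimate.
    """
    rng = random.Random(seed)
    total_error = 0.0
    for _ in range(rounds):
        readings = [rng.random() for _ in range(num_nodes)]
        scheduled = rng.sample(readings, num_scheduled)
        total_error += abs(max(readings) - max(scheduled))
    return total_error / rounds


for k in (2, 5, 10):
    print(k, mae_of_max(10, k))
```

Under these assumptions the error falls as more nodes are scheduled and vanishes when all of them report; the trade-offs studied in the paper arise because, with harvested energy, scheduling more nodes or raising transmit power is not free.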