61 results for independent music
Abstract:
In this paper, we present a machine learning approach for subject-independent human action recognition using a depth camera, emphasizing the importance of depth in the recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. We obtain the 2-D optical flow and use it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space-time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine-to-coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, henceforth referred to as PBL-McRBFN. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers, with every person and action represented in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted using a leave-one-subject-out strategy, and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked on the Video Analytics Lab (VAL) dataset and the Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.
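The leave-one-subject-out protocol mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the `(subject_id, features, label)` tuple layout and the function name `leave_one_subject_out` are assumptions made here for the example.

```python
def leave_one_subject_out(samples):
    """Yield (held_out_subject, train, test) splits.

    `samples` is assumed to be a list of (subject_id, features, label)
    tuples; this layout is hypothetical, chosen only for illustration.
    """
    subjects = sorted({sample[0] for sample in samples})
    for held_out in subjects:
        # Every sample of the held-out subject goes to the test set,
        # so the classifier never sees that person during training.
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    data = [
        ("s1", [0.1, 0.2], "wave"),
        ("s1", [0.3, 0.1], "jump"),
        ("s2", [0.2, 0.2], "wave"),
        ("s3", [0.4, 0.0], "jump"),
    ]
    for subject, train, test in leave_one_subject_out(data):
        print(subject, len(train), len(test))
```

Each subject is held out exactly once, so the number of folds equals the number of subjects rather than a fixed k.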
Abstract:
This paper, for the first time, explores the characteristics of a MOS capacitor controlled by independent double gates, using numerical simulation and analytical modeling, for its possible use as a varactor in RF circuit design. Numerical simulation shows how the quasi-static and non-quasi-static characteristics of the first gate capacitance can be tuned by the second gate bias. The effects of body doping and energy quantization are also discussed in this regard. A semi-empirical quasi-static model is also developed using the existing incomplete Poisson solution of independent double gate transistors. The proposed model, which is valid from accumulation to inversion, is shown to have excellent agreement with numerical simulation for practical bias conditions.
Abstract:
We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which the instruments that are active at a given time can be identified. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.
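The weight-estimation step described here can be illustrated with the standard EM update for mixture weights when the component models are held fixed. The sketch below is generic and is not the EM variant derived in the paper: it assumes the per-frame likelihoods under each instrument model have already been computed (by tMMs or any other density), and the function name and input layout are hypothetical.

```python
def estimate_mixture_weights(likelihoods, n_iter=50):
    """EM for mixture weights with fixed component models.

    likelihoods[n][k] is assumed to hold p(frame n | instrument model k).
    Returns one weight per model; the weights sum to 1.
    """
    n_models = len(likelihoods[0])
    weights = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in likelihoods:
            # E-step: posterior responsibility of each model for this frame.
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # frame explained by no model; skip it
            for k, value in enumerate(joint):
                totals[k] += value / norm
        total = sum(totals)
        if total == 0.0:
            break
        # M-step: new weight = average responsibility over frames.
        weights = [t / total for t in totals]
    return weights


if __name__ == "__main__":
    # Toy data: model 0 explains most frames, model 2 explains none well.
    frames = [[0.9, 0.1, 0.01]] * 8 + [[0.1, 0.8, 0.01]] * 2
    print(estimate_mixture_weights(frames))
```

In this toy run the weight of the third model shrinks toward zero, which mirrors the idea that a near-zero weight indicates an inactive instrument.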
Abstract:
Detection of QRS serves as a first step in many automated ECG analysis techniques. Motivated by the strong similarities between the signal structures of an ECG signal and the integrated linear prediction residual (ILPR) of voiced speech, an algorithm proposed earlier for epoch detection from ILPR is extended to the problem of QRS detection. The ECG signal is pre-processed by high-pass filtering to remove baseline wander and by half-wave rectification to reduce ambiguities. The initial estimates of the QRS are obtained iteratively using a non-linear temporal feature, named the dynamic plosion index, which is suitable for detecting transients in a signal. These estimates are further refined to obtain a higher temporal accuracy. Unlike most high-performance algorithms, this technique does not make use of any threshold or differencing operation. The proposed algorithm is validated on the MIT-BIH database using the standard metrics, and its performance is found to be comparable to that of state-of-the-art algorithms, despite its threshold independence and simple decision logic.
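The two pre-processing steps named in this abstract, high-pass filtering and half-wave rectification, can be sketched as below. The abstract does not specify the filter, so the first-order recursive high-pass and its `alpha` coefficient are assumptions chosen for brevity; the dynamic plosion index itself is not reproduced here.

```python
def high_pass(signal, alpha=0.95):
    """First-order recursive high-pass filter (an assumed, simple choice).

    y[n] = alpha * (y[n-1] + x[n] - x[n-1]); a constant baseline maps to 0.
    """
    if not signal:
        return []
    out = [0.0]
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out


def half_wave_rectify(signal):
    """Keep positive samples and set negative samples to zero."""
    return [max(value, 0.0) for value in signal]


if __name__ == "__main__":
    # Toy signal: constant baseline of 1.0 with a spike every 20 samples.
    ecg = [1.0 + (5.0 if i % 20 == 10 else 0.0) for i in range(60)]
    processed = half_wave_rectify(high_pass(ecg))
    print(processed[8:13])
```

The rectified output retains the upward transient at each spike while the baseline offset and the negative rebound after the spike are suppressed.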
Abstract:
Rice landraces are lineages developed by farmers through artificial selection during the long-term domestication process. Despite huge potential for crop improvement, they are largely understudied in India. Here, we analyse a suite of phenotypic characters from large numbers of Indian landraces comprised of both aromatic and non-aromatic varieties. Our primary aim was to investigate the major determinants of diversity, the strength of segregation among aromatic and non-aromatic landraces as well as that within aromatic landraces. Using principal component analysis, we found that grain length, width and weight, panicle weight and leaf length have the most substantial contribution. Discriminant analysis can effectively distinguish the majority of aromatic from non-aromatic landraces. More interestingly, within aromatic landraces long-grain traditional Basmati and short-grain non-Basmati aromatics remain morphologically well differentiated. The present research emphasizes the general patterns of phenotypic diversity and finds out the most important characters. It also confirms the existence of very unique short-grain aromatic landraces, perhaps carrying signatures of independent origin of an additional aroma quantitative trait locus in the indica group, unlike introgression of specific alleles of the BADH2 gene from the japonica group as in Basmati. We presume that this parallel origin and evolution of aroma in short-grain indica landraces are linked to the long history of rice domestication that involved inheritance of several traits from Oryza nivara, in addition to O. rufipogon. We conclude with a note that the insights from the phenotypic analysis essentially comprise the first part, which will likely be validated with subsequent molecular analysis.
Abstract:
The tonic is a fundamental concept in Indian art music. It is the base pitch that an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.
Abstract:
Using numerical device simulation, we show that the relationship between the surface potentials along the channel in any double gate (DG) MOSFET remains invariant in QS (quasi-static) and NQS (non-quasi-static) conditions for the same terminal voltages. This concept, along with the recently proposed "piecewise charge linearization" technique, is then used to develop the intrinsic NQS charge model for an Independent DG (IDG) MOSFET by solving the governing continuity equation. It is also demonstrated that, unlike the usual MOSFET transcapacitances, the inter-gate transcapacitance of an IDG-MOSFET initially increases with frequency and then saturates, which might find novel analog circuit applications. The proposed NQS model shows good agreement with numerical device simulations and appears to be useful for efficient circuit simulation.
Abstract:
An optimal measurement selection strategy based on incoherence among rows (corresponding to measurements) of the sensitivity (or weight) matrix for near-infrared diffuse optical tomography is proposed. As incoherence among the measurements can be seen as providing maximum independent information into the estimation of optical properties, this provides the high level of optimization required for knowing the independence of a particular measurement from its counterparts. The proposed method was compared with the recently established data-resolution matrix-based approach for optimal choice of independent measurements and shown, using simulated and experimental gelatin phantom data sets, to be superior as it does not require an optimal regularization parameter for providing the same information. (C) 2014 Society of Photo-Optical Instrumentation Engineers (SPIE)
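The notion of incoherence among rows of a sensitivity matrix can be illustrated with the absolute cosine similarity between row pairs. The sketch below is only an illustration under stated assumptions: the greedy selection rule, the `threshold` parameter and the function names are hypothetical, since the abstract does not give the paper's actual selection criterion.

```python
import math


def coherence(row_a, row_b):
    """Absolute cosine similarity between two rows (1.0 means redundant)."""
    dot = sum(a * b for a, b in zip(row_a, row_b))
    norm_a = math.sqrt(sum(a * a for a in row_a))
    norm_b = math.sqrt(sum(b * b for b in row_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return abs(dot) / (norm_a * norm_b)


def select_incoherent_rows(sensitivity, threshold=0.9):
    """Greedily keep rows whose coherence with every kept row is low.

    The greedy rule and the threshold are assumptions for illustration.
    """
    selected = []
    for idx, row in enumerate(sensitivity):
        if not any(row):
            continue  # an all-zero row carries no information
        if all(coherence(row, sensitivity[j]) < threshold for j in selected):
            selected.append(idx)
    return selected


if __name__ == "__main__":
    jacobian = [
        [1.0, 0.0, 0.0],
        [2.0, 0.0, 0.0],  # scaled copy of row 0, fully coherent with it
        [0.0, 1.0, 0.0],
        [1.0, 1.0, 0.0],
    ]
    print(select_incoherent_rows(jacobian, threshold=0.9))
```

Note that this selection uses only the matrix rows themselves, which is consistent with the abstract's point that no regularization parameter is needed.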
Three-dimensional localization of multiple acoustic sources in shallow ocean with non-Gaussian noise
Abstract:
In this paper, a low-complexity algorithm SAGE-USL is presented for 3-dimensional (3-D) localization of multiple acoustic sources in a shallow ocean with non-Gaussian ambient noise, using a vertical and a horizontal linear array of sensors. In the proposed method, noise is modeled as a Gaussian mixture. Initial estimates of the unknown parameters (source coordinates, signal waveforms and noise parameters) are obtained by known/conventional methods, and a generalized expectation maximization algorithm is used to update the initial estimates iteratively. Simulation results indicate that convergence is reached in a small number of (<= 10) iterations. Initialization requires one 2-D search and one 1-D search, and the iterative updates require a sequence of 1-D searches. Therefore the computational complexity of the SAGE-USL algorithm is lower than that of conventional techniques such as 3-D MUSIC by several orders of magnitude. We also derive the Cramer-Rao Bound (CRB) for 3-D localization of multiple sources in a range-independent ocean. Simulation results are presented to show that the root-mean-square localization errors of SAGE-USL are close to the corresponding CRBs and significantly lower than those of 3-D MUSIC. (C) 2014 Elsevier Inc. All rights reserved.
Abstract:
Central to network tomography is the problem of identifiability, the ability to identify internal network characteristics uniquely from end-to-end measurements. This problem is often underconstrained even when internal network characteristics such as link delays are modeled as additive constants. While it is known that the network topology can play a role in determining the extent of identifiability, a fundamental understanding of how to quantify it for a given network is lacking. In this paper, we consider the problem of identifying additive link metrics in an arbitrary undirected network using measurement nodes and establishing paths/cycles between them. For a given placement of measurement nodes, we define and derive the "link rank" of the network: the maximum number of linearly independent cycles/paths that may be established between the measurement nodes. We achieve this in linear time. The link rank helps quantify the exact extent of identifiability in a network. We also develop a quadratic time algorithm to compute a set of cycles/paths that achieves the maximum rank.
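The idea of counting linearly independent paths can be made concrete with a path-link incidence matrix, in which each row is a measured path or cycle and each column is a link. The sketch below computes the rank of such a matrix by exact Gaussian elimination; it is a brute-force illustration and not the linear-time method of the paper, and the function name and toy topology are assumptions.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank of a small matrix via Gauss-Jordan elimination."""
    matrix = [[Fraction(value) for value in row] for row in rows]
    n_cols = len(matrix[0]) if matrix else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next(
            (r for r in range(rank, len(matrix)) if matrix[r][col] != 0), None
        )
        if pivot is None:
            continue
        matrix[rank], matrix[pivot] = matrix[pivot], matrix[rank]
        for r in range(len(matrix)):
            if r != rank and matrix[r][col] != 0:
                factor = matrix[r][col] / matrix[rank][col]
                matrix[r] = [a - factor * b for a, b in zip(matrix[r], matrix[rank])]
        rank += 1
    return rank


if __name__ == "__main__":
    # Hypothetical 3-link network; each row marks the links a path uses.
    paths = [
        [1, 0, 0],  # path over link 0
        [0, 1, 0],  # path over link 1
        [1, 1, 0],  # path over links 0 and 1, the sum of the first two
    ]
    print(matrix_rank(paths))
```

In this toy case the third path adds no new information, and link 2 is never traversed, so only two of the three additive link metrics are constrained by the measurements.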
Abstract:
The temperature (300-973 K) and frequency (100 Hz-10 MHz) response of the dielectric and impedance characteristics of 2BaO-0.5Na2O-2.5Nb2O5-4.5B2O3 glasses and glass nanocrystal composites were studied. The dielectric constant of the glass was found to be almost independent of frequency (100 Hz-10 MHz) and temperature (300-600 K). The temperature coefficient of the dielectric constant was 8 ± 3 ppm/K in the 300-600 K temperature range. The relaxation and conduction phenomena were rationalized using the modulus formalism and the universal AC conductivity exponential power law, respectively. The observed relaxation behavior was found to be thermally activated. The complex impedance data were fitted using the least squares method. Dispersion of the Barium Sodium Niobate (BNN) phase at the nanoscale in a glass matrix resulted in the formation of space charge around the crystal-glass interface, leading to a high value of the effective dielectric constant, especially for the samples heat-treated at higher temperatures. The fabricated glass nanocrystal composites exhibited P versus E hysteresis loops at room temperature, and the remnant polarization (P_r) increased with increasing crystallite size.
Abstract:
In subject-independent acoustic-to-articulatory inversion, the articulatory kinematics of a test subject are estimated assuming that the training corpus does not include data from the test subject. The training corpus in subject-independent inversion (SII) is formed with acoustic and articulatory kinematics data, and the acoustic mismatch between training and test subjects is then estimated by an acoustic normalization using acoustic data drawn from a large pool of speakers called the generic acoustic space (GAS). In this work, we focus on improving the SII performance through better acoustic normalization and adaptation. We propose unsupervised and several supervised ways of clustering GAS for acoustic normalization. We perform an adaptation of acoustic models of GAS using the acoustic data of the training and test subjects in SII. It is found that SII performance significantly improves (approximately 25% relative on average) over the subject-dependent inversion when the acoustic clusters in GAS correspond to phonetic units (or states of 3-state phonetic HMMs) and when the acoustic model built on GAS is adapted to training and test subjects while optimizing the inversion criterion. (C) 2014 Elsevier B.V. All rights reserved.
Abstract:
There is considerable interest in powering and maneuvering nanostructures remotely in fluidic media using noninvasive fuel-free methods, for which small homogeneous magnetic fields are ideally suited. Current strategies include helical propulsion of chiral nanostructures, cilia-like motion of flexible filaments, and surface assisted translation of asymmetric colloidal doublets and magnetic nanorods, in all of which the individual structures are moved in a particular direction that is completely tied to the characteristics of the driving fields. As we show in this paper, when we use appropriate magnetic field configurations and actuation time scales, it is possible to maneuver geometrically identical nanostructures in different directions, and subsequently position them at arbitrary locations with respect to each other. The method reported here requires proximity of the nanomotors to a solid surface, and could be useful in applications that require remote and independent control over individual components in microfluidic environments.
Abstract:
Facial expressions are the most expressive way to display emotions. Many algorithms have been proposed that employ a particular set of people (usually a database) to both train and test their model. This paper focuses on the challenging task of database-independent emotion recognition, which is a generalized case of subject-independent emotion recognition. The emotion recognition system employed in this work is a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). McFIS has two components: a neuro-fuzzy inference system, which is the cognitive component, and a self-regulatory learning mechanism, which is the meta-cognitive component. The meta-cognitive component monitors the knowledge in the neuro-fuzzy inference system and efficiently decides what-to-learn, when-to-learn and how-to-learn from the training samples. For each sample, McFIS decides whether to delete the sample without learning it, use it to add/prune or update the network parameters, or reserve it for future use. This helps the network avoid over-training and, as a result, improves its generalization performance on untrained databases. In this study, we extract pixel-based emotion features from the well-known Japanese Female Facial Expression (JAFFE) and Taiwanese Female Expression Image (TFEID) databases. Two sets of experiments are conducted. First, we study the performance of McFIS on each database individually using a 5-fold cross-validation study. Next, in order to study the generalization performance, McFIS trained on the JAFFE database is tested on TFEID and vice versa. The performance comparison against an SVM classifier in both experiments gives promising results.
Abstract:
Here, we report the hydrothermal synthesis of boron-doped CNPs (B-CNPs) with different sizes and atomic percentages of doping, and with size-independent color tunability from red to blue. The variation in size and atomic percentage of B is achieved simply by varying the reaction time, while the color tunability is obtained by diluting the solution. With dilution, the luminescence spectra are not only blue-shifted but also increase in intensity. The large blue-shift in the emission energy (approximately 1 eV) is believed to be due to the increase in the interparticle distance. The quantum yield at optimum dilution is found to increase with boron doping, though it is very low compared to that of CNPs and nitrogen-doped CNPs. Finally, we show that B-CNPs with a quantum yield of 0.5% can be used for bioimaging applications. (C) 2015 Elsevier Ltd. All rights reserved.