118 results for Speaker Recognition


Relevance:

70.00%

Publisher:

Abstract:

The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous normally distributed features is developed. The three cases of using a) a single feature, b) multiple independent measurements of a single feature, and c) multiple independent features are explored. The number of independent features needed for a reliable personal identification is computed based on the theoretical model and an exploratory study of some speech features.
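The effect of adding independent, normally distributed features can be illustrated with a textbook two-class model. The sketch below is not the theoretical model of the paper; it assumes two equally likely classes, unit-variance independent Gaussian features, and a hypothetical list `separations` giving the distance between the class means for each feature. Under those assumptions the minimum error probability is Q(delta / 2), where delta is the root sum of squares of the separations.

```python
import math


def q_function(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bayes_error(separations):
    """Minimum error probability for two equally likely classes.

    Assumes independent unit-variance Gaussian features; `separations`
    holds the distance between the two class means for each feature.
    """
    # Independent features combine as a root sum of squares.
    delta = math.sqrt(sum(d * d for d in separations))
    return q_function(delta / 2.0)


if __name__ == "__main__":
    # Error probability as identical features (separation 2.0) are added.
    for count in (1, 2, 4, 8):
        print(count, bayes_error([2.0] * count))
```

In this toy setting, the number of features needed for a target error rate can be found by increasing `count` until `bayes_error` falls below the target.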

Relevance:

30.00%

Publisher:

Abstract:

For the problem of speaker adaptation in speech recognition, the performance depends on the availability of adaptation data. In this paper, we have compared several existing speaker adaptation methods, viz. maximum likelihood linear regression (MLLR), eigenvoice (EV), eigenspace-based MLLR (EMLLR), segmental eigenvoice (SEV) and hierarchical eigenvoice (HEV) based methods. We also develop a new method by modifying the existing HEV method for achieving further performance improvement in a limited available data scenario. In the sense of availability of adaptation data, the new modified HEV (MHEV) method is shown to perform better than all the existing methods throughout the range of operation except the case of MLLR at the availability of more adaptation data.

Relevance:

30.00%

Publisher:

Abstract:

We are addressing a new problem of improving automatic speech recognition performance, given multiple utterances of patterns from the same class. We have formulated the problem of jointly decoding K multiple patterns given a single Hidden Markov Model. It is shown that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm followed by the Constrained Multi Pattern Viterbi Algorithm. The new formulation is tested in the context of speaker independent isolated word recognition for both clean and noisy patterns. When 10 percent of speech is affected by a burst noise at -5 dB Signal to Noise Ratio (local), it is shown that joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent, when compared to the single pattern decoding using the Viterbi Algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about 7 percent reduction in error rate.

Relevance:

30.00%

Publisher:

Abstract:

Design of speaker identification schemes for a small number of speakers (around 10) with a high degree of accuracy in a controlled environment is a practical proposition today. When the number of speakers is large (say 50–100), many of these schemes cannot be directly extended, as both recognition error and computation time increase monotonically with population size. The feature selection problem is also complex for such schemes. Though there were earlier attempts to rank order features based on statistical distance measures, it has been observed only recently that the best two independent measurements are not the same as the combination in two's for pattern classification. We propose here a systematic approach to the problem using the decision tree or hierarchical classifier with the following objectives: (1) Design of an optimal policy at each node of the tree given the tree structure, i.e., the tree skeleton and the features to be used at each node. (2) Determination of the optimal feature measurement and decision policy given only the tree skeleton. Applicability of optimization procedures such as dynamic programming in the design of such trees is studied. The experimental results deal with the design of a 50 speaker identification scheme based on this approach.

Relevance:

20.00%

Publisher:

Abstract:

Semi-rigid molecular tweezers 1, 3 and 4 bind picric acid with more than tenfold increment in tetrachloromethane as compared to chloroform.

Relevance:

20.00%

Publisher:

Abstract:

The baculovirus expression system using the Autographa californica nuclear polyhedrosis virus (AcNPV) has been extensively utilized for high-level expression of cloned foreign genes, driven by the strong viral promoters of polyhedrin (polh) and p10 encoding genes. A parallel system using Bombyx mori nuclear polyhedrosis virus (BmNPV) is much less exploited because the choice and variety of BmNPV-based transfer vectors are limited. Using a transient expression assay, we have demonstrated here that the heterologous promoters of the very late genes polh and p10 from AcNPV function as efficiently in BmN cells as the BmNPV promoters. The location of the cloned foreign gene with respect to the promoter sequences was critical for achieving the highest levels of expression, following the order +35 > +1 > -3 > -8 nucleotides (nt) with respect to the polh or p10 start codons. We have successfully generated recombinant BmNPV harboring AcNPV promoters by homeologous recombination between AcNPV-based transfer vectors and BmNPV genomic DNA. Infection of BmN cell lines with recombinant BmNPV showed a temporal expression pattern, reaching very high levels in 60-72 h post infection. The recombinant BmNPV harboring the firefly luciferase-encoding gene under the control of AcNPV polh or p10 promoters, on infection of the silkworm larvae, led to the synthesis of large quantities of luciferase. Such larvae emanated significant luminescence instantaneously on administration of the substrate luciferin, resulting in 'glowing silkworms'. The virus-infected larvae continued to glow for several hours and revealed the most abundant distribution of virus in the fat bodies. In larval expression also, the highest levels were achieved when the reporter gene was located at +35 nt of the polh.

Relevance:

20.00%

Publisher:

Abstract:

An adaptive learning scheme, based on a fuzzy approximation to the gradient descent method for training a pattern classifier using unlabeled samples, is described. The objective function defined for the fuzzy ISODATA clustering procedure is used as the loss function for computing the gradient. Learning is based on simultaneous fuzzy decision making and estimation. It uses conditional fuzzy measures on unlabeled samples. An exponential membership function is assumed for each class, and the parameters constituting these membership functions are estimated, using the gradient, in a recursive fashion. The induced possibility of occurrence of each class is useful for estimation and is computed using 1) the membership of the new sample in that class and 2) the previously computed average possibility of occurrence of the same class. An inductive entropy measure is defined in terms of induced possibility distribution to measure the extent of learning. The method is illustrated with relevant examples.
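For reference, the fuzzy ISODATA (fuzzy c-means) objective mentioned above is usually written as follows. This is the standard textbook form, not a formula taken from the paper; the symbols are assumptions introduced here: $x_k$ are the $n$ samples, $v_i$ the $c$ class prototypes, $u_{ik}$ the membership of sample $k$ in class $i$, and $m > 1$ the fuzziness exponent.

```latex
J_m(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^{m} \, \lVert x_k - v_i \rVert^{2},
\qquad \text{subject to} \quad \sum_{i=1}^{c} u_{ik} = 1 \ \text{ for each } k .
```

A gradient-based scheme of the kind described would differentiate a loss of this form with respect to the class parameters; the paper's exact loss and membership parameterization may differ.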

Relevance:

20.00%

Publisher:

Abstract:

The minimum cost classifier when general cost functions are associated with the tasks of feature measurement and classification is formulated as a decision graph which does not reject class labels at intermediate stages. Noting its complexities, a heuristic procedure to simplify this scheme to a binary decision tree is presented. The optimization of the binary tree in this context is carried out using dynamic programming. This technique is applied to the voiced-unvoiced-silence classification in speech processing.

Relevance:

20.00%

Publisher:

Abstract:

Strychnine was coupled to fluorescein isothiocyanate to mark strychnine binding sites in the spinal cord of the rat. Specific binding of strychnine could be demonstrated in the synaptosomal fraction. Addition of glycine to the strychninised membrane led to a decrease in fluorescence, indicating the same receptor loci.

Relevance:

20.00%

Publisher:

Abstract:

This letter presents the development of simplified algorithms based on Haar functions for signal extraction in relaying signals. These algorithms, being computationally simple, are better suited for microprocessor-based power system protection relaying. They provide accurate estimates of the signal amplitude and phase.

Relevance:

20.00%

Publisher:

Abstract:

The statistical minimum risk pattern recognition problem, when the classification costs are random variables of unknown statistics, is considered. Using medical diagnosis as a possible application, the problem of learning the optimal decision scheme is studied for a two-class two-action case, as a first step. This reduces to the problem of learning the optimum threshold (for taking appropriate action) on the a posteriori probability of one class. A recursive procedure for updating an estimate of the threshold is proposed. The estimation procedure does not require the knowledge of actual class labels of the sample patterns in the design set. The adaptive scheme of using the present threshold estimate for taking action on the next sample is shown to converge, in probability, to the optimum. The results of a computer simulation study of three learning schemes demonstrate the theoretically predictable salient features of the adaptive scheme.

Relevance:

20.00%

Publisher:

Abstract:

We are addressing the problem of jointly using multiple noisy speech patterns for automatic speech recognition (ASR), given that they come from the same class. If the user utters a word K times, the ASR system should try to use the information content in all the K patterns of the word simultaneously and improve its speech recognition accuracy compared to that of the single pattern based speech recognition. To address this problem, we recently proposed a Multi Pattern Dynamic Time Warping (MPDTW) algorithm to align the K patterns by finding the least distortion path between them. A Constrained Multi Pattern Viterbi algorithm was used on this aligned path for isolated word recognition (IWR). In this paper, we explore the possibility of using only the MPDTW algorithm for IWR. We also study the properties of the MPDTW algorithm. We show that using only 2 noisy test patterns (10 percent burst noise at -5 dB SNR) reduces the noisy speech recognition error rate by 37.66 percent when compared to the single pattern recognition using the Dynamic Time Warping algorithm.
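The single-pattern baseline named above, classic Dynamic Time Warping between two sequences, can be sketched as follows. This is the standard two-pattern algorithm, not the MPDTW algorithm of the paper. The function name `dtw_distance`, the scalar frames, and the absolute-difference local cost are assumptions chosen for brevity; real systems compare feature vectors.

```python
import math


def dtw_distance(a, b):
    """Least accumulated distortion between two sequences of numbers.

    Classic two-pattern DTW with steps (i-1, j), (i, j-1), (i-1, j-1)
    and absolute difference as the local cost (an illustrative choice).
    """
    n, m = len(a), len(b)
    # cost[i][j] is the best accumulated cost aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # A time-stretched copy aligns with zero distortion.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

For isolated word recognition with this baseline, a test pattern would be assigned to the reference word with the smallest distance. The multi-pattern method generalizes the alignment from a two-dimensional grid to K patterns.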

Relevance:

20.00%

Publisher:

Abstract:

Database schemes can be viewed as hypergraphs with individual relation schemes corresponding to the edges of a hypergraph. Under this setting, a new class of "acyclic" database schemes was recently introduced and was shown to have a claim to a number of desirable properties. However, unlike the case of ordinary undirected graphs, there are several inequivalent notions of acyclicity of hypergraphs. Of special interest among these are alpha-, beta-, and gamma-degrees of acyclicity, each characterizing an equivalence class of desirable properties for database schemes, represented as hypergraphs. In this paper, two complementary approaches to designing beta-acyclic database schemes are presented. For the first part, a new notion called "independent cycle" is introduced. Based on this, a criterion for beta-acyclicity is developed and is shown equivalent to the existing definitions of beta-acyclicity. From this and the concept of the dual of a hypergraph, an efficient algorithm for testing beta-acyclicity is developed. As for the second part, a procedure is evolved for top-down generation of beta-acyclic schemes and its correctness is established. Finally, extensions and applications of ideas are described.

Relevance:

20.00%

Publisher:

Abstract:

This paper suggests a scheme for classifying online handwritten characters, based on dynamic space warping of strokes within the characters. A method for segmenting components into strokes using velocity profiles is proposed. Each stroke is a simple arbitrary shape and is encoded using three attributes. Correspondence between various strokes is established using Dynamic Space Warping. A distance measure which reliably differentiates between two corresponding simple shapes (strokes) has been formulated, thus obtaining a perceptual distance measure between any two characters. Tests indicate an accuracy of over 85% on two different datasets of characters.