63 results for speaker clustering


Relevance:

20.00%

Publisher:

Abstract:

This paper investigates the problem of speaker identification and verification in noisy conditions, assuming that speech signals are corrupted by environmental noise, but knowledge about the noise characteristics is not available. This research is motivated in part by the potential application of speaker recognition technologies on handheld devices or the Internet. While the technologies promise an additional biometric layer of security to protect the user, the practical implementation of such systems faces many challenges. One of these is environmental noise. Due to the mobile nature of such systems, the noise sources can be highly time-varying and potentially unknown. This raises the requirement for noise robustness in the absence of information about the noise. This paper describes a method that combines multicondition model training and missing-feature theory to model noise with unknown temporal-spectral characteristics. Multicondition training is conducted using simulated noisy data with limited noise variation, providing a “coarse” compensation for the noise, and missing-feature theory is applied to refine the compensation by ignoring noise variation outside the given training conditions, thereby reducing the training and testing mismatch. This paper is focused on several issues relating to the implementation of the new model for real-world applications. These include the generation of multicondition training data to model noisy speech, the combination of different training data to optimize the recognition performance, and the reduction of the model’s complexity. The new algorithm was tested using two databases with simulated and realistic noisy speech data. The first database is a redevelopment of the TIMIT database by rerecording the data in the presence of various noise types, used to test the model for speaker identification with a focus on the varieties of noise. The second database is a handheld-device database collected in realistic noisy conditions, used to further validate the model for real-world speaker verification. The new model is compared to baseline systems and is found to achieve lower error rates.
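The abstract does not say how the simulated multicondition training data are produced. A common way to build such data, shown here only as a minimal sketch, is to add a noise recording to the clean signal at a few fixed signal-to-noise ratios. The function names `mix_at_snr` and `multicondition_set` and the SNR values are assumptions for illustration, not taken from the paper.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` after scaling it to the requested SNR (in dB).

    Both inputs are equal-length sequences of samples. The SNR is defined
    here as the ratio of mean clean power to mean scaled-noise power.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise signal has zero power")
    # Choose the gain so that p_clean / (gain^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def multicondition_set(clean, noise, snrs_db=(20, 15, 10)):
    """Return one noisy copy of `clean` per SNR (hypothetical SNR grid)."""
    return {snr: mix_at_snr(clean, noise, snr) for snr in snrs_db}
```

A model trained on all copies returned by `multicondition_set` sees a limited range of noise levels, which corresponds to the "coarse" compensation the abstract mentions. The missing-feature refinement is not covered by this sketch.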

Relevance:

20.00%

Publisher:

Abstract:

Clustering analysis of data from DNA microarray hybridization studies is an essential task for identifying biologically relevant groups of genes. The attribute cluster algorithm (ACA) has provided an attractive way to group and select meaningful genes. However, ACA needs considerable prior knowledge about the genes to set the number of clusters. In practical applications, if the number of clusters is misspecified, the performance of ACA deteriorates rapidly, and choosing the number correctly is very demanding because so little is known about the genes in advance. We propose the Cooperative Competition Cluster Algorithm (CCCA) in this paper. The algorithm assumes that cooperation and competition exist simultaneously between clusters during clustering. Using this principle of Cooperative Competition, the number of clusters can be found during the clustering process itself. Experimental results on synthetic data and gene expression data are presented. The results show that CCCA can choose the number of clusters automatically and performs very well relative to competing methods.

Relevance:

20.00%

Publisher:

Abstract:

This paper deals with Takagi-Sugeno (TS) fuzzy model identification of nonlinear systems using fuzzy clustering. In particular, an extended fuzzy Gustafson-Kessel (EGK) clustering algorithm, using robust competitive agglomeration (RCA), is developed for automatically constructing a TS fuzzy model from system input-output data. The EGK algorithm can automatically determine the 'optimal' number of clusters from the training data set. It is shown that the EGK approach is relatively insensitive to initialization and is less susceptible to local minima, a benefit derived from its agglomerate property. This issue is often overlooked in the current literature on nonlinear identification using conventional fuzzy clustering. Furthermore, the robust statistical concepts underlying the EGK algorithm help to alleviate the difficulty of cluster identification in the construction of a TS fuzzy model from noisy training data. A new hybrid identification strategy is then formulated, which combines the EGK algorithm with a locally weighted, least-squares method for the estimation of local sub-model parameters. The efficacy of this new approach is demonstrated through function approximation examples and also by application to the identification of an automatic voltage regulation (AVR) loop for a simulated 3 kVA laboratory micro-machine system.
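The last step of the hybrid strategy, estimating local sub-model parameters by locally weighted least squares, can be sketched for a single-input system. The sketch assumes that the fuzzy membership values have already been produced by some clustering step (the EGK algorithm itself is not reproduced here) and that each local sub-model is a line y = a*x + b. The names `weighted_line_fit` and `ts_predict` are illustrative only.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a*x + b.

    `ws` holds one non-negative weight per sample, for example the
    membership of each sample in one fuzzy cluster.
    """
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def ts_predict(x, memberships, models):
    """Takagi-Sugeno output: membership-weighted average of local lines.

    `memberships[i]` is the degree to which `x` belongs to rule i and
    `models[i]` is that rule's (a, b) pair.
    """
    total = sum(memberships)
    return sum(m * (a * x + b) for m, (a, b) in zip(memberships, models)) / total
```

Fitting each rule separately with its own membership weights keeps every sub-model a good local description of the data near its cluster, which is the usual motivation for the locally weighted variant.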

Relevance:

20.00%

Publisher:

Abstract:

The support vector machine (SVM) is a powerful technique for data classification. Despite its sound theoretical foundations and high classification accuracy, the standard SVM is not suitable for classifying large data sets, because its training complexity depends heavily on the size of the data set. This paper presents a novel SVM classification approach for large data sets that uses minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the cluster centers are used for a first round of SVM classification. A second round of SVM classification is then performed on the clusters whose centers are support vectors and on the clusters that contain more than one class. At this stage most of the data have been removed. Several experimental results show that the proposed approach achieves good classification accuracy compared with the classic SVM, while training is significantly faster than with several other SVM classifiers.
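The data-reduction rule between the two SVM rounds can be sketched as a plain selection function. The sketch assumes that the clustering and the first SVM round were run elsewhere and that their results are available as a cluster id per training point and a set of cluster ids whose centers turned out to be support vectors. The name `second_stage_indices` and its parameters are hypothetical.

```python
def second_stage_indices(cluster_of, labels, support_clusters):
    """Return indices of training points kept for the second SVM round.

    cluster_of[i]     cluster id assigned to training point i
    labels[i]         class label of training point i
    support_clusters  ids of clusters whose centers were support vectors
                      in the first round (assumed to be computed elsewhere)
    """
    # Collect the set of class labels that occur in each cluster.
    classes_in = {}
    for c, y in zip(cluster_of, labels):
        classes_in.setdefault(c, set()).add(y)
    # Clusters containing more than one class are always kept.
    mixed = {c for c, ys in classes_in.items() if len(ys) > 1}
    keep = mixed | set(support_clusters)
    return [i for i, c in enumerate(cluster_of) if c in keep]
```

Points in clusters that are pure in class and whose centers are not support vectors are dropped, which is where the reduction in training set size comes from.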