25 results for speech segmentation
in Chinese Academy of Sciences Institutional Repositories Grid Portal
Abstract:
Drawing on descriptive geometry and notions from set theory, this paper redefines basic spatial elements such as curves and surfaces, and presents some fundamental notions of point covering based on high-dimensional space (HDS) point covering theory. It then maps points from part of the speech signals into HDS in order to analyze the distribution of these speech points, the various geometric covering objects for them, and the relationships among those objects. The paper also proposes a new algorithm for speaker-independent continuous digit speech recognition, based on HDS point dynamic searching theory, that requires no endpoint detection or segmentation. First, from the different digit syllables in real continuous digit speech, we establish the covering area in feature space for continuous speech. During recognition, we apply the point covering dynamic searching theory in HDS and obtain satisfactory recognition results. Finally, a comparison with a hidden Markov model (HMM) based method shows that the difference in recognition rate between the two methods decreases slowly as the number of samples increases, and that both recognition rates gradually approach 100% as the number of samples becomes very large. The recognition rate of the HDS point covering method is higher than that of the HMM-based method. We attribute this to the fact that point covering describes the morphological distribution of speech in HDS, whereas the HMM-based method uses only a probability distribution, which we argue is less accurate.
Abstract:
Hard coatings on relatively soft substrates always face the danger of debonding along the interface. Interfacial stresses are considered to be the initial driving force for the interfacial debonding of relatively strongly bonded coatings. First, the interfacial stresses due to the strain mismatch between the coating and the substrate are simulated with the finite element method (FEM). The resulting distribution of interfacial stresses confirms an excessive stress concentration near the interface end. Subsequently, the redistribution of interfacial stresses is calculated for a coating with periodic segmentation cracks. The results indicate that the periodic segmentation cracks greatly alter the distribution of interfacial stresses. To reveal the effect of the spacing of the periodic segmentation cracks on this distribution, different crack densities are modeled within the coating. It is found that the peak values of the interfacial stresses decrease as the crack density increases, i.e. as the spacing of the segmentation cracks is reduced.
Abstract:
The mechanism of the formation of periodic segmentation cracks in a coating plated on a substrate with periodic subsurface inclusions (PSI) is investigated. The internal stress in the coating, and subsequently the strain energy release rate (SERR) of the segmentation cracks, are computed with the finite element method (FEM). The effect of the geometrical parameters of the PSI is also studied. The results indicate that the ratio of the width of the inclusion to the period of the repeated structure has an optimum value, at which the maximum internal tensile stress and SERR arise. On the other hand, the ratio of the maximum thickness of the inclusion to the thickness of the coating has a threshold value, above which a further increase of this ratio has little influence on the internal stress or the SERR.
Abstract:
Channeling/segmentation cracks may arise in a coating subjected to in-plane tensile stress. The interaction between these multiple cracks, that is, the effect of the spacing between two adjacent cracks on the behavior of the channels themselves and of the interface around the interface corners, attracts wide interest. However, if the spacing is greater than a specific magnitude, namely the Critical Spacing (CS), there should be no interaction between such channeling/segmentation cracks. In this study, the mechanism by which the crack spacing affects the interfacial stress around the interface corner is interpreted first. Then the existence of the CS is verified, and the relationship between the CS and the so-called stress transfer length in the coating is established for the plane strain condition. Finally, the dependence of the stress transfer length, and hence of the CS, on the sensitive parameters is investigated with the finite element method and expressed with a simple empirical formula. (C) 2007 Elsevier Ltd. All rights reserved.
Abstract:
For the purpose of human-computer interaction (HCI), a vision-based gesture segmentation approach is proposed. The technique essentially comprises skin color detection and gesture segmentation. The skin color detection employs a skin-color artificial neural network (ANN). To merge and segment the region of interest, we propose a novel mountain algorithm. The details of the approach and the experimental results are provided. The experimental segmentation accuracy is 96.25%. (C) 2003 Society of Photo-Optical Instrumentation Engineers.
Abstract:
A novel spatiotemporal segmentation technique is further developed for extracting uncovered background and moving objects from image sequences. The subsequent motion estimation is then performed only on the regions corresponding to moving objects. The frame difference contrast (FCON) and local variance contrast (LCON), which are related to the temporal and spatial homogeneity of the image sequence, are selected to form the 2-D spatiotemporal entropy. The spatial segmentation threshold is then determined by maximizing the 2-D spatiotemporal entropy, and the temporal segmentation point is selected to minimize the complexity measure for image sequence coding. Since both the temporal and the spatial correlation of an image sequence are exploited, the proposed spatiotemporal segmentation technique can further be used to determine the positions of reference frames adaptively, resulting in a low bit rate. Experimental results show that this segmentation-based coding scheme is more efficient than the usual fixed-size coding algorithms. (C) 1997 Society of Photo-Optical Instrumentation Engineers.
Abstract:
We investigate the use of independent component analysis (ICA) for speech feature extraction in digit speech recognition systems. We observe that this may be true for recognition tasks based on geometrical learning with little training data. In contrast to image processing, phase information is not essential for digit speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions via the Hilbert transform. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digit speech recognition results show promising accuracy. Experiments show that the method based on ICA and geometrical learning outperforms HMM across different numbers of training samples.
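The general idea of removing phase sensitivity through an analytic description can be illustrated with a minimal sketch. It is not the scheme from the paper: it only shows that the envelope of the analytic signal (obtained with a Hilbert transform) is the same for a cosine and a sine of the same frequency. The function names `analytic_signal` and `envelope` and the test signals are hypothetical, and a plain O(n^2) DFT is used so the example needs only the standard library.

```python
import cmath
import math

def analytic_signal(x):
    """Return the analytic signal of a real sequence x (illustrative only)."""
    n = len(x)
    # Forward DFT; O(n^2), which is fine for a short illustration.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue              # keep DC and Nyquist bins unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2      # double the positive frequencies
        else:
            spectrum[k] = 0       # discard the negative frequencies
    # Inverse DFT gives x + j * Hilbert(x).
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def envelope(x):
    """Magnitude of the analytic signal, which does not depend on phase."""
    return [abs(z) for z in analytic_signal(x)]

n = 16
cos_wave = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
sin_wave = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
# Both envelopes are (numerically) constant and equal, despite the phase shift.
print(max(abs(a - b) for a, b in zip(envelope(cos_wave), envelope(sin_wave))))
```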
Abstract:
In this paper, a novel approach to Mandarin speech emotion recognition, based on high dimensional geometry theory, is proposed. Human emotions are classified into 6 archetypal classes: fear, anger, happiness, sadness, surprise and disgust. According to the characteristics of these emotional speech signals, the amplitude, pitch frequency and formant are used as the feature parameters for speech emotion recognition. The new method, called high dimensional geometry theory, is applied for recognition. Compared with the traditional GSVM model, the new method has some advantages. The method is of significant value for future research and applications.
Abstract:
Based on biomimetic pattern recognition theory, we propose a novel speaker-independent keyword-spotting algorithm for continuous speech. Without endpoint detection or segmentation, we obtain the minimum distance curve between the continuous speech samples and every keyword-training net through dynamic searching over the feature-extracted continuous speech. We then count the keywords by examining the valley values and the number of valleys in the curve. Experiments on small-vocabulary continuous speech at various speaking rates yield good recognition results and support the validity of the algorithm.
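The counting step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the minimum distance curve is assumed to be available as a list of floats, and a keyword is assumed to be counted whenever the curve dips below a fixed threshold. The function name `count_valleys`, the threshold and the sample curve are all hypothetical.

```python
def count_valleys(curve, threshold):
    """Count contiguous runs of frames whose distance is below `threshold`.

    Each run is treated as one valley, i.e. one assumed keyword occurrence.
    """
    count = 0
    in_valley = False
    for distance in curve:
        if distance < threshold:
            if not in_valley:     # entering a new valley
                count += 1
                in_valley = True
        else:
            in_valley = False     # left the valley
    return count

# Hypothetical distance curve with two dips below the threshold.
curve = [5.0, 4.2, 1.1, 0.8, 3.9, 4.5, 0.9, 0.7, 1.0, 4.8]
print(count_valleys(curve, threshold=1.5))  # prints 2
```

A fixed threshold is the simplest possible decision rule; how deep a valley must be to count as a keyword is a design choice that the abstract does not specify.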
Abstract:
In speaker-independent speech recognition, the most widespread technology (HMMs, or hidden Markov models) has the disadvantage of requiring not only many more training samples but also a long training time. This paper describes the use of biomimetic pattern recognition (BPR) to recognize Mandarin continuous speech in a speaker-independent manner. A speech database was developed for the study. The vocabulary of the database consists of the names of 15 Chinese dishes, each 4 Chinese words long. Neural networks (NNs) based on the multi-weight neuron (MWN) model are used to train on and recognize the speech sounds. The number of MWNs was investigated to achieve the optimal performance of the NN-based BPR. The system, which is based on BPR and can carry out real-time recognition, reaches a recognition rate of 98.14% for the first option and 99.81% for the first two options for speakers from different provinces of China speaking common Chinese. Experiments were also carried out to evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and BPR for speech recognition. The experimental results show that BPR outperforms CDHMM and DTW, especially when the number of samples is limited.
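For readers unfamiliar with dynamic time warping, one of the baselines named above, here is a minimal sketch of the textbook algorithm. It assumes one-dimensional feature sequences and an absolute-difference local cost; real speech systems compare multi-dimensional feature vectors, and nothing here reflects the configuration used in the paper. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

# The same shape spoken at a different rate aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # prints 0.0
```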
Abstract:
We investigate the use of independent component analysis (ICA) for speech feature extraction in digit speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digit speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digit speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM across different numbers of training samples.
Abstract:
In this paper, we present the HyperSausage Neuron, based on high-dimensional space (HDS), and propose a new algorithm for speaker-independent continuous digit speech recognition. Compared with an HMM-based method, the HyperSausage Neuron method achieves a higher recognition rate.
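The abstract above does not define the HyperSausage Neuron. As a rough geometric illustration only, the sketch below assumes the description commonly given in the biomimetic pattern recognition literature: the neuron covers every point lying within a fixed radius of the line segment joining two training samples. The function names `distance_to_segment` and `in_hypersausage`, the radius and the sample points are hypothetical.

```python
import math

def distance_to_segment(x, a, b):
    """Euclidean distance from point x to the segment from a to b (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    denom = sum(c * c for c in ab)
    # Projection parameter of x onto the line through a and b, clamped to [0, 1].
    t = 0.0 if denom == 0 else sum(p * q for p, q in zip(ax, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, closest)))

def in_hypersausage(x, a, b, radius):
    """True if x lies inside the assumed sausage-shaped covering region."""
    return distance_to_segment(x, a, b) <= radius

a, b = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]   # two hypothetical training samples
print(in_hypersausage([0.5, 0.3, 0.0], a, b, radius=0.5))  # prints True
print(in_hypersausage([0.5, 0.6, 0.0], a, b, radius=0.5))  # prints False
```

Under this assumption, a class would be covered by a chain of such regions, and a test point would be accepted if it falls inside any of them.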
Abstract:
In speaker-independent speech recognition, the most widespread technology (hidden Markov models, HMMs) has the disadvantage of requiring not only many more training samples but also a long training time. This paper describes the use of Biomimetic Pattern Recognition (BPR) to recognize Mandarin speech in a speaker-independent manner. The vocabulary of the system consists of the names of 15 Chinese dishes. Neural networks based on the Multi-Weight Neuron (MWN) model are used to train on and recognize the speech sounds. Experimental results show that the system, which can carry out real-time recognition for speakers from different provinces speaking common Chinese, outperforms HMMs, especially when the number of samples is limited.
Abstract:
We investigate the use of independent component analysis (ICA) for speech feature extraction in digit speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digit speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digit speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM across different numbers of training samples.