18 results for Recognition accuracy
Abstract:
In this paper we introduce a weighted complex network model to investigate and recognize the structures of patterns. The usual approach in pattern recognition models is to describe each pattern as a high-dimensional vector, which, however, is insufficient to express structural information. Thus, a number of methods have been developed to extract structural information, such as the various feature extraction algorithms used in pre-processing steps, or the local receptive fields in convolutional networks. In our model, each pattern is mapped to a weighted complex network whose topology represents the structure of that pattern. From the training samples, we obtain several prototypal complex networks that stand for the general structural characteristics of patterns in different categories. We use these prototypal networks to recognize unknown patterns. This is an attempt to use complex networks in pattern recognition, and our results show its potential for real-world pattern recognition. A spatial parameter is introduced to obtain the optimal recognition accuracy, and it remains constant, insensitive to the number of training samples. We discuss the interesting properties of the prototypal networks. An approximately linear relation is found between the strength and color of vertices, through which we can compare the structural differences between categories. We visualize these prototypal networks to show that their topology indeed represents the common characteristics of patterns. We also show that the asymmetric strength distribution in these prototypal networks brings high robustness to recognition. Our study may also shed light on the mechanism of biological neuronal systems in object recognition.
Abstract:
In recognition-based user interfaces, user satisfaction is determined not only by recognition accuracy but also by the effort required to correct recognition errors. In this paper, we introduce a crossmodal error correction technique that allows users to correct Chinese handwriting recognition errors by speech. The focus of the paper is a multimodal fusion algorithm supporting crossmodal error correction. By fusing handwriting and speech recognition, the algorithm can correct errors in both character extraction and recognition of handwriting. The experimental results indicate that the algorithm is effective and efficient. Moreover, the evaluation shows that the correction technique helps users correct handwriting recognition errors more efficiently than two other error correction techniques.
Abstract:
Video-based facial expression recognition is a challenging problem in computer vision and human-computer interaction. To address this problem, texture features have been extracted and widely used, because they can capture image intensity changes caused by skin deformation. However, existing texture features encounter problems with albedo and lighting variations. To solve both problems, we propose a new texture feature called image ratio features. Compared with previously proposed texture features, e.g., high gradient component features, image ratio features are more robust to albedo and lighting variations. In addition, to further improve facial expression recognition accuracy based on image ratio features, we combine image ratio features with facial animation parameters (FAPs), which describe the geometric motions of facial feature points. The performance evaluation is based on the Carnegie Mellon University Cohn-Kanade database, our own database, and the Japanese Female Facial Expression database. Experimental results show that the proposed image ratio feature is more robust to albedo and lighting variations, and that the combination of image ratio features and FAPs outperforms each feature alone. In addition, we study asymmetric facial expressions based on our own facial expression database and demonstrate the superior performance of our combined expression recognition system.
Abstract:
Concept maps are an important tool for knowledge organization, representation, and sharing. Most current concept map tools do not provide full support for creating and manipulating hand-drawn concept maps, largely because methods for recognizing hand-drawn concept maps are lacking. This paper proposes a structure recognition method. Our algorithm extracts the node blocks and link blocks of a hand-drawn concept map by combining dynamic programming and graph partitioning, and then builds a concept-map structure by relating the extracted nodes and links. We also introduce a structure-based intelligent manipulation technique for hand-drawn concept maps. Evaluation shows that our method achieves high structure recognition accuracy in real time, and that the intelligent manipulation technique is efficient and effective.
Abstract:
Handheld communication devices have recently been developing very fast, gaining users and spreading across application fields, and they have a promising future. This study investigated the acceptance of the multimodal text entry method and the behavioral characteristics of users when using it. Based on the general information processing model of a bimodal system and on human factors studies of the multimodal map system, the present study focused mainly on the hand-speech bimodal text entry method. For acceptance, the study investigated the subjective perception of speech recognition accuracy by means of a Wizard of Oz (WOz) experiment and a questionnaire. Results showed that there was a linear relationship between the speech recognition accuracy and the subjective accuracy. Furthermore, as familiarity increased, the difference between the acceptable accuracy and the subjective accuracy gradually decreased. In addition, the similarity of meaning between the speech recognition output and the correct sentences was an important reference criterion. The second study investigated three aspects of the bimodal text entry method: input, error recovery, and modal shifts. The first experiment aimed to find the behavioral characteristics of users when performing an error recovery task. Results indicated that participants preferred to correct errors by handwriting, regardless of the input modality. The second experiment aimed to discover the behavioral characteristics of users when entering various types of text. Results showed that users preferred speech input in both the word and sentence conditions, a preference that was highly consistent among individuals, while no significant difference was found between handwriting and speech input in the character condition. Participants used the direct strategy more than the jumping strategy to deal with mixed text, especially the Chinese-English mixed type.
The third experiment examined the cognitive load of the different modal shifts; the results suggested that there were significant differences between shifts. Moreover, relatively little time was needed for the shift from speech input to hand input. Based on the main findings, the following implications were discussed. First, when evaluating a speech recognition system, attention should be paid to the fact that speech recognition accuracy is not equal to subjective accuracy. Second, a good way to make a speech input system more acceptable is to train users and give them feedback on accuracy during training, which improves their familiarity with and sensitivity to the system. Third, both universal and individual behavioral patterns should be taken into consideration to improve the error recovery method. Fourth, to ease the learning and use of speech input, its operations should be simpler. Fifth, a more convenient input method should be provided for non-Chinese text entry. Finally, the shifting time between hand input and speech input provides an important parameter for the design of automatically evoked speech recognition systems.
Abstract:
Research on the form processing of CC has mainly concerned the effects of various form properties. In most cases, the studies were conducted after lexical processing was complete. The few studies that addressed the early phases of visual perception focused on feature extraction in character recognition. Until now, no one has proposed studying form processing in the early phases of visual perception of CC. We hold that, because form processing occurs in the early phases of visual perception, it should be studied prelexically. Moreover, visual perception of a CC is a course during which the CC gradually becomes clear, so the effects of form properties should not be an absolute, all-or-none phenomenon. In this study we adopted four methods to investigate form processing in the early phases systematically and by simulation: tachistoscopic repetition, gradually increasing presentation time, gradually enlarging the visual angle, and non-tachistoscopic searching and naming. Under various poor or degraded visual conditions, the instantaneous course of early-phase processing was slowed down and postponed, so that its growth course was laid open to observation. We could capture the characteristics of form processing in the early phases by analyzing reaction speed and recognition accuracy. As visual angle and time increased, clarity improved, and we could identify the relation between the effects of form properties and the improvement in visual clarity. The results were as follows: ① In the early phases of visual perception of CC, there were effects of all kinds of form properties. ② The magnitude of the effects decreased as the visual conditions improved. We introduced the concept of a character's space transparency, together with its algorithm, to explain these effects of form properties.
Furthermore, a model was discussed to help explain why the magnitude of the effects changed as the visual conditions improved. ③ The early phases of visual perception of CC are not the locus of the frequency effect.
Abstract:
We investigate the use of independent component analysis (ICA) for speech feature extraction in digits speech recognition systems. We observe that this may be true for recognition tasks based on geometrical learning with little training data. In contrast to image processing, phase information is not essential for digits speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions via the Hilbert transform. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digits speech recognition results show promising accuracy. Experiments show that the method based on ICA and geometrical learning outperforms HMM across different numbers of training samples.
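The idea of removing phase sensitivity through the Hilbert transform can be illustrated with a minimal sketch. It is not the authors' scheme: the cosine "basis function" and the names `dft`, `analytic_signal`, `envelope`, and `toy_basis` are hypothetical, chosen only for illustration. The sketch builds the analytic signal of a real sequence and takes its magnitude, which stays the same when the phase of the input is shifted.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + i*H[x], where H denotes the discrete Hilbert transform."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return dft([s * g for s, g in zip(spectrum, gain)], inverse=True)


def envelope(x):
    """Magnitude of the analytic signal: a phase-insensitive description."""
    return [abs(z) for z in analytic_signal(x)]


def toy_basis(phase, n=64, cycles=5):
    """Hypothetical stand-in for a basis function: a phase-shifted cosine."""
    return [math.cos(2 * math.pi * cycles * t / n + phase) for t in range(n)]


env_a = envelope(toy_basis(0.0))
env_b = envelope(toy_basis(1.3))
# The two inputs differ only in phase, so their envelopes coincide.
print(max(abs(a - b) for a, b in zip(env_a, env_b)))
```

The naive transform keeps the sketch dependency-free; for real signals an FFT-based routine would be the usual choice.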
Abstract:
In the light of descriptive geometry and notions from set theory, this paper redefines basic elements of space such as curves and surfaces, presents some fundamental notions of point covering based on the high-dimensional space (HDS) point covering theory, and finally maps parts of speech signals to points in HDS in order to analyze the distribution of these speech points in HDS, the various geometric covering objects for speech points, and their relationships. In addition, this paper proposes a new algorithm for speaker-independent continuous digit speech recognition based on the HDS point dynamic searching theory, without end-point detection or segmentation. First, from the different digit syllables in real continuous digit speech, we establish the covering area in feature space for continuous speech. During recognition, we use the point covering dynamic searching theory in HDS and obtain satisfactory recognition results. Finally, in comparison with the HMM (hidden Markov model)-based method, the trend of the results shows that as the number of samples increases, the difference in recognition rate between the two methods slowly decreases, and as the number of samples becomes very large, both recognition rates gradually approach 100%. The results show that the recognition rate of the HDS point covering method is higher than that of the HMM-based method, because point covering describes the morphological distribution of speech in HDS, whereas the HMM-based method describes only a probability distribution, whose accuracy is certainly inferior to that of point covering.
Abstract:
The accurate recognition of cancer subtypes is of great clinical significance. In particular, DNA microarray gene expression technology is applied to diagnosing and recognizing cancer types. This paper proposes a method for recognizing cancer subtypes based on geometrical learning. First, the cancer gene expression profile data were preprocessed and feature genes were selected by a conventional method; then the expression data of the feature genes in the training samples were used to construct convex hulls in the high-dimensional space with the training algorithm of geometrical learning, while the independent test set was tested with the recognition algorithm of geometrical learning. The method was applied to human acute leukemia gene expression data. The accuracy rate reached 100%. The experiments demonstrated its efficiency and feasibility.
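One way to picture recognition with convex hulls in a high-dimensional space is the sketch below. It is an illustration under stated assumptions, not the paper's geometrical learning algorithm: it assumes one hull per class, built directly from the training points, and assigns a test sample to the class whose hull is nearest. The distance is approximated with the Frank-Wolfe method (a technique swapped in here for simplicity), and the names `hull_distance`, `classify`, and the toy 2-D data are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hull_distance(point, vertices, max_iter=1000, tol=1e-12):
    """Approximate Euclidean distance from `point` to the convex hull of
    `vertices`, using Frank-Wolfe iterations with exact line search."""
    y = list(vertices[0])  # current iterate, always inside the hull
    for _ in range(max_iter):
        residual = [yi - pi for yi, pi in zip(y, point)]
        # Linear minimization step: the vertex most aligned with -residual.
        s = min(vertices, key=lambda v: dot(residual, v))
        direction = [si - yi for si, yi in zip(s, y)]
        gap = -dot(residual, direction)  # Frank-Wolfe duality gap, >= 0
        norm_sq = dot(direction, direction)
        if gap <= tol or norm_sq == 0.0:
            break
        step = min(1.0, gap / norm_sq)  # exact line search, clipped to [0, 1]
        y = [yi + step * di for yi, di in zip(y, direction)]
    return math.dist(y, point)


def classify(sample, hulls):
    """Assumed decision rule: pick the class whose hull is closest."""
    return min(hulls, key=lambda label: hull_distance(sample, hulls[label]))


# Toy 2-D training points standing in for feature-gene expression vectors.
hulls = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0), (4.0, 4.0)],
}
print(classify((0.8, 0.9), hulls))
print(classify((3.2, 3.9), hulls))
```

Frank-Wolfe needs only dot products against the training points, so the same code works unchanged in any number of dimensions, although convergence can be slow for samples near a hull boundary.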
Abstract:
We investigate the use of independent component analysis (ICA) for speech feature extraction in digits speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digits speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digits speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM across different numbers of training samples.
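Abstract:
A gas-liquid two-phase flow system is a complex multivariable stochastic process system, and research on the definition of flow regimes, flow-regime transition criteria, and identification methods is a current focus of the multiphase flow discipline. This paper reviews and comments on the state of research related to gas-liquid two-phase flow regimes and their identification, aiming to reflect the status and trends of recent research on these problems.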
Abstract:
The relationships between indentation responses and the Young's modulus of an indented material were investigated by employing dimensional analysis and the finite element method. Three representative tip bluntness geometries were introduced to describe the shape of a real Berkovich indenter. It was demonstrated that for each of these bluntness geometries, a set of approximate indentation relationships correlating the ratio of nominal hardness to reduced Young's modulus, H_n/E_r, and the ratio of elastic work to total work, W_e/W, can be derived. Consequently, a method for Young's modulus measurement, combined with an estimate of its accuracy, was established on the basis of these relationships. The effectiveness of this approach was verified by performing nanoindentation tests on S45C carbon steel and 6061 aluminum alloy and microindentation tests on aluminum single crystal, GCr15 bearing steel, and fused silica.
Abstract:
Based on the rigorous formulation of integral equations for the propagation of light waves at a medium interface, we numerically solve for the random light field scattered from self-affine fractal surface samples. The light intensities produced by the same surface samples are also calculated in Kirchhoff's approximation, and their comparison with the corresponding rigorous results directly shows the accuracy of the approximation. The results indicate that Kirchhoff's approximation has good accuracy for random surfaces with a small roughness value w and a large roughness exponent alpha. For random surfaces with larger w and smaller alpha, the approximation results in considerable errors, and detailed calculations show that the inaccuracy comes from the simplification that the transmitted light field is proportional to the incident field and from the neglect of the light field derivative at the interface.
Abstract:
A visual pattern recognition network and its training algorithm are proposed. The network is constructed from a one-layer morphology network and a two-layer modified Hamming net. This visual network can implement pattern recognition that is invariant with respect to image translation and size projection. After supervised learning takes place, the visual network extracts image features and classifies patterns much as living beings do. Moreover, we set up its optoelectronic architecture for real-time pattern recognition. (C) 1996 Optical Society of America
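For readers unfamiliar with Hamming nets, the sketch below shows the textbook two-layer form: a matching layer that scores each stored prototype by the number of agreeing bits, followed by a MAXNET layer that picks the winner through lateral inhibition. It is not the modified net of the paper and does not include the morphology layer or any invariance handling. The prototypes, the inhibition strength, and the function names are assumptions made for illustration.

```python
def matching_scores(pattern, prototypes):
    """Layer 1: for bipolar (+1/-1) vectors, the score equals
    n minus the Hamming distance to each prototype."""
    n = len(pattern)
    return [(n + sum(p * x for p, x in zip(proto, pattern))) / 2
            for proto in prototypes]


def maxnet(scores, max_iter=1000):
    """Layer 2: winner-take-all by repeated lateral inhibition."""
    m = len(scores)
    epsilon = 1.0 / (2 * m)  # assumed inhibition strength, below 1/m
    activations = list(scores)
    for _ in range(max_iter):
        total = sum(activations)
        activations = [max(0.0, a - epsilon * (total - a)) for a in activations]
        if sum(1 for a in activations if a > 0) <= 1:
            break
    return activations.index(max(activations))


def classify(pattern, prototypes):
    return maxnet(matching_scores(pattern, prototypes))


# Hypothetical stored prototypes (bipolar, length 8).
prototypes = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [-1, -1, 1, 1, 1, 1, -1, -1],
]
# Prototype 1 with its first bit flipped.
noisy = [-1, -1, 1, -1, 1, -1, 1, -1]
print(matching_scores(noisy, prototypes))
print(classify(noisy, prototypes))
```

With exactly tied top scores the inhibition loop cannot separate the candidates, and the function falls back to the first of them.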