29 results for audiovisual speech perception


Relevance:

20.00%

Publisher:

Abstract:

We investigate the use of independent component analysis (ICA) for speech feature extraction in digit speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digit speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time-shift information. The digit speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM across different numbers of training samples.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we present the HyperSausage Neuron, based on High-Dimension Space (HDS), and propose a new algorithm for speaker-independent continuous digit speech recognition. Finally, the HyperSausage Neuron method achieves a higher recognition rate than the HMM-based method.
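The abstract does not define the neuron itself. A minimal sketch, assuming the reading common in the high-dimensional geometric learning literature: the neuron covers every point lying within a fixed radius of the line segment that joins two training samples. The function names `distance_to_segment` and `in_hypersausage` and the `radius` parameter are hypothetical, chosen only for illustration.

```python
import math


def distance_to_segment(x, a, b):
    """Euclidean distance from point x to the segment a-b in any dimension."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    denom = sum(c * c for c in ab)
    # Projection parameter of x onto the line through a and b;
    # a degenerate segment (a == b) is treated as a single point.
    t = 0.0 if denom == 0 else sum(p * q for p, q in zip(ax, ab)) / denom
    # Clamp to [0, 1] so the closest point stays on the segment.
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(x, closest)


def in_hypersausage(x, a, b, radius):
    """Assumed coverage test: x lies within `radius` of segment a-b."""
    return distance_to_segment(x, a, b) <= radius
```

Under this assumption, a class would be covered by a chain of such regions built from its training samples, and a test sample would be assigned to the class whose covering contains it (or lies nearest to it).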

Relevance:

20.00%

Publisher:

Abstract:

In recognition-based user interfaces, user satisfaction is determined not only by recognition accuracy but also by the effort needed to correct recognition errors. In this paper, we introduce a crossmodal error correction technique that allows users to correct Chinese handwriting recognition errors by speech. The focus of the paper is a multimodal fusion algorithm supporting crossmodal error correction. By fusing handwriting and speech recognition, the algorithm can correct errors in both character extraction and handwriting recognition. The experimental results indicate that the algorithm is effective and efficient. Moreover, the evaluation shows that the correction technique helps users correct handwriting recognition errors more efficiently than the other two error correction techniques.

Relevance:

20.00%

Publisher:

Abstract:

An important characteristic of virtual assembly is interaction. Traditional direct manipulation in virtual assembly relies on dynamic collision detection, which is very time-consuming and even impossible in a desktop virtual assembly environment. Feature matching is a critical process in harmonious virtual assembly and is the premise of assembly constraint sensing. This paper puts forward an active object-based feature-matching perception mechanism and a feature-matching interactive computing process, both of which free direct manipulation in virtual assembly from collision detection. They also help the virtual environment better understand user intention and improve interaction performance. Experimental results show that this perception mechanism ensures that users achieve real-time direct manipulation in a desktop virtual environment.

Relevance:

20.00%

Publisher:

Abstract:

It is important to detect the aromaticity of structures during structure elucidation and output. In this paper, an algorithm is proposed that detects the aromaticity of structures by means of a ring identification algorithm. The results show that it can identify most aromatic structures. It has been used as a constraint in the Expert System for the Elucidation of the Structures of Organic Compounds (ESESOC), and good results have been achieved.
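The abstract does not state which aromaticity criterion is applied once a ring has been identified. As a rough sketch only, the following assumes the textbook Hückel 4n+2 rule on a single ring, with the caller supplying each ring atom's pi-electron contribution. The function name `is_huckel_aromatic` and the input encoding are hypothetical, and planarity and fused-ring effects are ignored.

```python
def is_huckel_aromatic(ring_pi_electrons):
    """Assumed criterion: Hückel 4n+2 rule on one already-identified ring.

    ring_pi_electrons lists the pi electrons each ring atom contributes;
    None marks an atom (for example sp3 carbon) that breaks conjugation.
    """
    if any(e is None for e in ring_pi_electrons):
        return False
    total = sum(ring_pi_electrons)
    # 4n + 2 with n >= 0 means total is 2, 6, 10, ...
    return total >= 2 and (total - 2) % 4 == 0
```

For example, a six-membered ring in which every atom contributes one pi electron passes the test, while a four-membered ring with the same per-atom contribution does not.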

Relevance:

20.00%

Publisher:

Abstract:

It is important to identify rings in the process of structure elucidation. In this paper, all rings and the smallest set of smallest rings (SSSR) of a structure are obtained from a two-dimensional connection table. Using this algorithm as a constraint in the ESESOC expert system gives satisfactory results.
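The abstract gives no algorithmic detail, so the following is only an illustrative sketch of two standard building blocks, not the authors' method. The number of rings in an SSSR equals bonds minus atoms plus connected components, and the smallest ring through a given bond can be found by a breadth-first search that avoids that bond. The connection table is assumed to be a list of atom-index pairs; all function names are hypothetical.

```python
from collections import deque


def build_adjacency(num_atoms, bonds):
    """Turn a connection table (list of atom-index pairs) into adjacency sets."""
    adj = {atom: set() for atom in range(num_atoms)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def sssr_size(num_atoms, bonds):
    """Number of rings in an SSSR: bonds - atoms + connected components."""
    adj = build_adjacency(num_atoms, bonds)
    seen = set()
    components = 0
    for start in range(num_atoms):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
    return len(bonds) - num_atoms + components


def smallest_ring_size(num_atoms, bonds, bond):
    """Size of the smallest ring containing `bond`, or None for a chain bond."""
    adj = build_adjacency(num_atoms, bonds)
    a, b = bond
    dist = {a: 0}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if node == a and nb == b:
                continue  # do not use the bond itself
            if nb not in dist:
                dist[nb] = dist[node] + 1
                if nb == b:
                    return dist[nb] + 1  # path bonds plus the closing bond
                queue.append(nb)
    return None
```

Selecting the actual SSSR members additionally requires choosing linearly independent rings in order of increasing size, which this sketch leaves out.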

Relevance:

20.00%

Publisher:

Abstract:

During the development of our ESESOC system (Expert System for the Elucidation of the Structures of Organic Compounds), computer perception of topological symmetry is essential in searching for the canonical description of a molecular structure, removing the irredundant connections in the structure generation process, and specifying the number of peaks in ¹³C- and ¹H-NMR spectra in the structure evaluation process. In the present paper, a new path identifier is introduced and an algorithm for detection of topological symmetry from a connection table is developed by the all-paths method. © 1999 Elsevier Science B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

A new algorithm for computer perception of topological symmetry is proposed. A node library containing various kinds of nodes is built, and the index number of the library is used as the initial atom class identifier (CI) to discriminate the different types of non-hydrogen atoms. The path index (PI) and ring index (RI) are calculated from the CI, and the global topological environment is defined as the sum of PIs and RIs. The topological symmetry can be detected by the iterative calculation of the global topological environment.
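The abstract does not give the formulas for PI and RI, so they cannot be reproduced here. The sketch below substitutes a generic Morgan-style refinement to illustrate the iterative idea only: each atom starts with a class derived from its label and degree, and classes are repeatedly split according to the classes of neighboring atoms until the partition stops changing. The function names and the input encoding (a list of atom labels plus a list of bonds) are assumptions.

```python
def _rank(keys):
    """Map each key to its rank among the distinct keys."""
    order = {key: rank for rank, key in enumerate(sorted(set(keys)))}
    return [order[key] for key in keys]


def symmetry_classes(labels, bonds):
    """Iteratively refine atom classes; equal class means candidate equivalence.

    Generic neighbor-based refinement, not the PI/RI scheme of the abstract.
    """
    n = len(labels)
    adj = [[] for _ in range(n)]
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    # Initial class identifier: atom label plus number of neighbors.
    classes = _rank([(labels[i], len(adj[i])) for i in range(n)])
    while True:
        keys = [
            (classes[i], tuple(sorted(classes[j] for j in adj[i])))
            for i in range(n)
        ]
        refined = _rank(keys)
        # Classes can only split, so an unchanged count means stability.
        if len(set(refined)) == len(set(classes)):
            return refined
        classes = refined
```

Neighbor-based refinement of this kind is known to leave some non-equivalent atoms in the same class for highly regular graphs, which is one reason richer invariants such as path and ring information are used.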

Relevance:

20.00%

Publisher:

Abstract:

For the exhaustive and irredundant generation of candidate structures in ESESOC (Expert System for the Elucidation of the Structures of Organic Compounds), a new algorithm for computer perception of topological equivalence classes of the nodes (non-hydrog

Relevance:

20.00%

Publisher:

Abstract:

Handheld communication devices have recently been developing very fast, with a growing user base and a widening range of applications, and have a promising future. This study investigated the acceptance of the multimodal text entry method and users' behavioral characteristics when using it. Based on the general information processing model of a bimodal system and human factors studies of the multimodal map system, the present study focused mainly on the hand-speech bimodal text entry method. For acceptance, the study investigated the subjective perception of speech recognition accuracy using a Wizard of Oz (WOz) experiment and a questionnaire. Results showed a linear relationship between speech recognition accuracy and subjective accuracy. Furthermore, as familiarity increased, the difference between the acceptable accuracy and the subjective accuracy gradually decreased. In addition, the similarity in meaning between the speech recognition output and the correct sentences was an important reference criterion. The second study investigated three aspects of the bimodal text entry method: input, error recovery, and modal shifts. The first experiment aimed to identify users' behavioral characteristics during error recovery tasks. Results indicated that participants preferred to correct errors by handwriting, regardless of the input modality. The second experiment aimed to identify users' behavioral characteristics when entering various types of text. Results showed that users preferred speech input in both the word and sentence conditions, a preference that was highly consistent among individuals, while no significant difference was found between handwriting and speech input in the character condition. Participants used the direct strategy more than the jumping strategy to deal with mixed text, especially for the Chinese-English mixed type.
The third experiment examined the cognitive load of the different modal shifts; results suggested that there were significant differences between shifts. Moreover, relatively little time was needed for the shift from speech input to hand input. Based on the main findings, the following implications were discussed. First, when evaluating a speech recognition system, attention should be paid to the fact that speech recognition accuracy is not equal to subjective accuracy. Second, to make a speech input system more acceptable, a good method is to train users and provide feedback on accuracy during training, which improves familiarity with and sensitivity to the system. Third, both universal and individual behavioral patterns should be taken into consideration to improve the error recovery method. Fourth, to make speech input easier to learn and use, its operations should be simpler. Fifth, a more convenient input method for non-Chinese text entry should be provided. Finally, the shifting time between hand input and speech input provides an important parameter for the design of automatically evoked speech recognition systems.