968 results for Word recognition.
Abstract:
We address a new problem: improving automatic speech recognition performance given multiple utterances of patterns from the same class. We formulate the problem of jointly decoding K patterns given a single Hidden Markov Model. It is shown that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm, followed by the Constrained Multi Pattern Viterbi Algorithm. The new formulation is tested in the context of speaker-independent isolated word recognition for both clean and noisy patterns. When 10 percent of speech is affected by burst noise at -5 dB Signal to Noise Ratio (local), it is shown that joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent, when compared to the single pattern decoding using the Viterbi Algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about a 7 percent reduction in error rate.
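For orientation, the classic two-sequence dynamic time warping recurrence that multi-pattern alignment methods build on can be sketched as follows. This is only an illustrative sketch of standard DTW on scalar sequences, not the Multi Pattern Dynamic Time Warping algorithm proposed in the abstract; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Return the minimum cumulative alignment cost between two sequences.

    Illustrative standard DTW (hypothetical helper, not the paper's method).
    The local cost is the absolute difference between scalar samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a only, in b only, or in both
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


if __name__ == "__main__":
    # A time-stretched copy of the same pattern aligns at zero cost
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a speech setting the scalar samples would be replaced by feature vectors and the local cost by a vector distance; that substitution is likewise an assumption, not a detail given in the abstract.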
Abstract:
In this article, we aim to reduce the error rate of the online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine classifier. Motivated by the relatively high percentage of occurrence of base consonants in the script, a reevaluation technique is proposed to correct any ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters. Class-specific features derived from these regions aid in reducing the degree of confusion. Thirdly, statistics of specific features are proposed for resolving any confusions in vowel modifiers. The reevaluation approaches are tested on two databases: (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate of the isolated test symbols of the IWFHR database improves by 1.9%. For the word database, the incorporation of the reevaluation step improves the symbol recognition rate by 3.5% (from 88.4 to 91.9%). This, in turn, boosts the word recognition rate by 11.9% (from 65.0 to 76.9%). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.
Abstract:
In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs may be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, as the model is able to take a more contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated digit, visual speech recognition task using comparisons with a baseline HMM system. We first show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Secondly, we compare model performance using various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
Abstract:
A class of twenty-two grade one children was tested to determine their reading levels using the Stanford Diagnostic Reading Achievement Test. Based on these results and teacher input, the students were paired according to reading ability. The students' ages ranged from six years four months to seven years four months at the commencement of the study. Eleven children were assigned to the language experience group and their partners became the text group. Each member of the language experience group generated a list of eight to-be-learned words. The treatment consisted of exposing the student to a given word three times per session for ten sessions, over a period of five days. The dependent variables consisted of word identification speed, word identification accuracy, and word recognition accuracy. Each member of the text group followed the same procedure using his/her partner's list of words. Upon completion of this training, the entire process was repeated with members of the text group from the first part becoming members of the language experience group and vice versa. The results suggest that, generally speaking, language experience words are identified faster than text words but that there is no difference in the rate at which these words are learned. Language experience words may be identified faster because the auditory-semantic information is more readily available in them than in text words. The rate of learning in both types of words, however, may be dictated by the orthography of the to-be-learned word.
Abstract:
This research looked at conditions which result in the development of integrated letter code information in the acquisition of reading vocabulary. Thirty grade three children of normal reading ability acquired new reading words in a Meaning Assigned task and a Letter Comparison task, and worked to increase skill for known reading words in a Copy task. The children were then assessed on their ability to identify the letters in these words. During the test, each stimulus word for each child was exposed for 100 msec., after which the child reported as many of its letters as he or she could. Familiar words, new words, and a single letter identification task served as within-subject controls. Following this, subjects were assessed for word meaning recall of the Meaning Assigned words and word reading times for words in all conditions. The results supported an episodic model of word recognition in which the overlap between the processing operations employed in encoding a word and those required when decoding it affected decoding performance. In particular, the Meaning Assigned and Copy tasks appeared to facilitate letter code accessibility and integration in new and familiar words respectively. Performance in the Letter Comparison task, on the other hand, suggested that subjects can process the elements of a new word without integrating them into its lexical structure. It was concluded that these results favour an episodic model of word recognition.
Abstract:
Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing, and automated banking systems. This paper presents a speaker-independent speech recognition system for Malayalam digits. The system employs Mel frequency cepstrum coefficients (MFCC) as features for signal processing and a Hidden Markov model (HMM) for recognition. The system was trained with 21 male and female voices in the age group of 20 to 40 years and achieved 98.5% word recognition accuracy (94.8% sentence recognition accuracy) on the test set of a continuous digit recognition task.
Abstract:
Federmeier and Benjamin (2005) have suggested that semantic encoding for verbal information in the right hemisphere can be more effective when memory demands are higher. However, other studies (Kanske & Kotz, 2007) also suggest that visual word recognition differs as a function of emotional valence. In this context, the present study was designed to evaluate the effects of retention level upon recognition memory processes for negative and neutral words. The sample consisted of 15 right-handed Portuguese undergraduate students with normal or corrected-to-normal vision. Portuguese concrete negative and neutral words were selected in accordance with known linguistic capabilities of the right hemisphere. The participants performed a visual half-field word presentation task using a continuous recognition memory paradigm. Eye movements were continuously monitored with a Tobii T60 eye-tracker, which showed no significant differences in fixations to negative and neutral words. Reaction times in word recognition suggest an overall advantage of negative words in comparison to neutral words. Further analysis showed faster responses for negative words than for neutral words when they were recognised at longer retention intervals for left-hemisphere encoding. Electrophysiological data from event-related potentials revealed larger P2 amplitude over centro-posterior electrode sites for words studied in the left hemifield, suggesting a priming effect for right-hemisphere encoding. Overall, the data suggest different hemispheric memory strategies for the semantic encoding of negative and neutral words.
Abstract:
This investigation moves beyond the traditional studies of word reading to identify how the production complexity of words affects reading accuracy in an individual with deep dyslexia (JO). We examined JO’s ability to read words aloud while manipulating both the production complexity of the words and the semantic context. The classification of words as either phonetically simple or complex was based on the Index of Phonetic Complexity. The semantic context was varied using a semantic blocking paradigm (i.e., semantically blocked and unblocked conditions). In the semantically blocked condition, words were grouped by semantic categories (e.g., table, sit, seat, couch), whereas in the unblocked condition the same words were presented in a random order. JO’s performance on reading aloud was also compared to her performance on a repetition task using the same items. Results revealed a strong interaction between word complexity and semantic blocking for reading aloud but not for repetition. JO produced the greatest number of errors for phonetically complex words in the semantically blocked condition. This interaction suggests that semantic processes are constrained by output production processes, which are exaggerated when derived from visual rather than auditory targets. This complex relationship between orthographic, semantic, and phonetic processes highlights the need for word recognition models to explicitly account for production processes.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Much of what is known about word recognition in toddlers comes from eyetracking studies. Here we show that the speed and facility with which children recognize words, as revealed in such studies, cannot be attributed to a task-specific, closed-set strategy; rather, children's gaze to referents of spoken nouns reflects successful search of the lexicon. Toddlers' spoken word comprehension was examined in the context of pictures that had two possible names (such as a cup of juice which could be called "cup" or "juice") and pictures that had only one likely name for toddlers (such as "apple"), using a visual world eye-tracking task and a picture-labeling task (n = 77, mean age, 21 months). Toddlers were just as fast and accurate in fixating named pictures with two likely names as pictures with one. If toddlers do name pictures to themselves, the name provides no apparent benefit in word recognition, because there is no cost to understanding an alternative lexical construal of the picture. In toddlers, as in adults, spoken words rapidly evoke their referents.
Abstract:
Free association norms indicate that words are organized into semantic/associative neighborhoods within a larger network of words and links that bind the net together. We present evidence indicating that memory for a recent word event can depend on implicitly and simultaneously activating related words in its neighborhood. Processing a word during encoding primes its network representation as a function of the density of the links in its neighborhood. Such priming increases recall and recognition and can have long-lasting effects when the word is processed in working memory. Evidence for this phenomenon is reviewed in extralist cuing, primed free association, intralist cuing, and single-item recognition tasks. The findings also show that when a related word is presented to cue the recall of a studied word, the cue activates it in an array of related words that distract and reduce the probability of its selection. The activation of the semantic network produces priming benefits during encoding and search costs during retrieval. In extralist cuing, recall is a negative function of cue-to-distracter strength and a positive function of neighborhood density, cue-to-target strength, and target-to-cue strength. We show how four measures derived from the network can be combined and used to predict memory performance. These measures play different roles in different tasks, indicating that the contribution of the semantic network varies with the context provided by the task. We evaluate spreading activation and quantum-like entanglement explanations for the priming effect produced by neighborhood density.
Abstract:
Digital devices like smart phones and tablet computers are becoming commonplace in young children’s lives for play, entertainment, learning and communication. Recently, there has been a great deal of focus on the educational potential of devices like iPads in both formal and informal educational settings. There is now an abundance of educational ‘apps’ available to children, parents, and kindergarten and pre-school teachers that claim to enhance children’s early literacy and numeracy development and creativity. To date, though, there has been very little formal investigation of the educational potential of these devices. This book discusses the impact on children’s learning when iPads were introduced in three very different kindergartens in Brisbane, Australia. Chapters outline how researchers worked with pre-school teachers and parents to explore how iPads can assist with letter and word recognition, the development of oral literacy and talk around play. The book also considers the possibilities for using iPads for creativity and arts education through photography, storytelling, drawing, music creation and audio recording.
Abstract:
This paper describes our participation in the Chinese word segmentation task of CIPS-SIGHAN 2010. We implemented an n-gram mutual information (NGMI) based segmentation algorithm with a mix of features from unsupervised, supervised and dictionary-based segmentation methods. This algorithm is also combined with a simple strategy for out-of-vocabulary (OOV) word recognition. The evaluation for both open and closed training shows encouraging results for our system. The results for OOV word recognition in the closed training evaluation were, however, found to be unsatisfactory.
Abstract:
What helps us determine whether a word is a noun or a verb, without conscious awareness? We report on cues in the way individual English words are spelled, and, for the first time, identify their neural correlates via functional magnetic resonance imaging (fMRI). We used a lexical decision task with trisyllabic nouns and verbs containing orthographic cues that are either consistent or inconsistent with the spelling patterns of words from that grammatical category. Significant linear increases in response times and error rates were observed as orthography became less consistent, paralleled by significant linear decreases in blood oxygen level dependent (BOLD) signal in the left supramarginal gyrus of the left inferior parietal lobule, a brain region implicated in visual word recognition. A similar pattern was observed in the left superior parietal lobule. These findings align with an emergentist view of grammatical category processing which results from sensitivity to multiple probabilistic cues.
Abstract:
We investigated the neural correlates of semantic priming by using event-related fMRI to record blood oxygen level dependent (BOLD) responses while participants performed speeded lexical decisions (word/nonword) on visually presented related versus unrelated prime-target pairs. A long stimulus onset asynchrony of 1000 ms was employed, which allowed for increased controlled processing and selective frequency-based ambiguity priming. Conditions included an ambiguous word prime (e.g. bank) and a target related to its dominant (e.g. money) or subordinate meaning (e.g. river). Compared to an unrelated condition, primed dominant targets were associated with increased activity in the LIFG, the right anterior cingulate and superior temporal gyrus, suggesting postlexical semantic integrative mechanisms, while increased right supramarginal activity for the unrelated condition was consistent with expectancy-based priming. Subordinate targets were not primed and were associated with reduced activity primarily in occipitotemporal regions associated with word recognition, which may be consistent with frequency-based meaning suppression. These findings provide new insights into the neural substrates of semantic priming and the functional-anatomic correlates of lexical ambiguity suppression mechanisms.