964 results for Automatic Visual Word Dictionary Calculation


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest which are mapped to a Hilbert Space representing a visual dictionary which aims at comprising the most discriminative features in a set of images. Notwithstanding, the main problem of such approaches is to find a compact and representative dictionary. Finding such a representative dictionary automatically with no user intervention is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.
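The representation step described in this abstract, mapping each point-of-interest descriptor to a visual word and summarizing the image as word counts, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the dictionary is assumed to be already available (the paper derives it with Optimum-Path Forest clustering, which is not shown here), Euclidean distance is an assumed choice, and the function names `nearest_word` and `bovw_histogram` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_word(descriptor, dictionary):
    """Return the index of the visual word closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda k: dist(descriptor, dictionary[k]))


def bovw_histogram(descriptors, dictionary):
    """Build a normalized bag-of-visual-words histogram for one image.

    descriptors: feature vectors extracted at the image's interest points.
    dictionary:  visual words (cluster prototypes), assumed given.
    """
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [c / total for c in counts]


# Toy data: two visual words and four 2-D descriptors (illustrative only).
toy_dictionary = [(0.0, 0.0), (10.0, 10.0)]
toy_descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0), (11.0, 10.0)]
toy_histogram = bovw_histogram(toy_descriptors, toy_dictionary)
```

The histogram length equals the dictionary size, which is why choosing that size automatically matters for the compactness of the representation.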

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure product quality and reliability. This paper proposes two inspection modules for an automatic solder joint classification system. The “front-end” inspection system includes illumination normalisation, localisation and segmentation. The “back-end” inspection involves the classification of solder joints using the Log Gabor filter and classifier fusion. Five different levels of solder quality with respect to the amount of solder paste have been defined. The Log Gabor filter has been demonstrated to achieve high recognition rates and is resistant to misalignment. This proposed system does not need any special illumination system, and the images are acquired by an ordinary digital camera. This system could contribute to the development of automated non-contact, non-destructive and low cost solder joint quality inspection systems.
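For orientation, the radial frequency response commonly used to define a log-Gabor filter can be written in a few lines. The abstract does not give the filter parameters or the exact formulation used, so the sketch below is an assumption: it uses the widely cited one-dimensional form G(f) = exp(-(ln(f/f0))^2 / (2 (ln(sigma/f0))^2)), and the default bandwidth ratio of 0.55 is an illustrative value, not one taken from the paper.

```python
from math import exp, log


def log_gabor(f, f0, sigma_ratio=0.55):
    """Radial log-Gabor response at frequency f.

    f0:          centre frequency of the filter (must be > 0).
    sigma_ratio: sigma / f0, which controls bandwidth (0 < ratio < 1);
                 0.55 is only an illustrative default.
    The response is defined as zero at f <= 0, so the filter has no DC term.
    """
    if f <= 0:
        return 0.0
    return exp(-(log(f / f0) ** 2) / (2 * log(sigma_ratio) ** 2))


# Response peaks at the centre frequency and is symmetric on a log axis.
peak = log_gabor(0.1, f0=0.1)
octave_above = log_gabor(0.2, f0=0.1)
octave_below = log_gabor(0.05, f0=0.1)
```

The zero response at DC is consistent with the filter being paired with illumination normalisation in the front end, since a filter without a DC component does not respond to a constant brightness offset.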

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this thesis, three main questions were addressed using event-related potentials (ERPs): (1) the timing of lexical semantic access, (2) the influence of "top-down" processes on visual word processing, and (3) the influence of "bottom-up" factors on visual word processing. The timing of lexical semantic access was investigated in two studies using different designs. In Study 1, 14 participants completed two tasks: a standard lexical decision (LD) task which required a word/nonword decision to each target stimulus, and a semantically primed version (LS) of it using the same category of words (e.g., animal) within each block following which participants made a category judgment. In Study 2, another 12 participants performed a standard semantic priming task, where target stimulus words (e.g., nurse) could be either semantically related or unrelated to their primes (e.g., doctor, tree) but the order of presentation was randomized. We found evidence in both ERP studies that lexical semantic access might occur early within the first 200 ms (at about 170 ms for Study 1 and at about 160 ms for Study 2). Our results were consistent with more recent ERP and eye-tracking studies and are in contrast with the traditional research focus on the N400 component. "Top-down" processes, such as a person's expectation and strategic decisions, were possible in Study 1 because of the blocked design, but they were not for Study 2 with a randomized design. Comparing results from the two studies, we found that visual word processing could be affected by a person's expectation and the effect occurred early at a sensory/perceptual stage: a semantic task effect in the P1 component at about 100 ms in the ERP was found in Study 1, but not in Study 2. Furthermore, we found that such "top-down" influence on visual word processing might be mediated through separate mechanisms depending on whether the stimulus was a word or a nonword.
"Bottom-up" factors involve inherent characteristics of particular words, such as bigram frequency (the total frequency of two-letter combinations of a word), word frequency (the frequency of the written form of a word), and neighborhood density (the number of words that can be generated by changing one letter of an original word or nonword). A bigram frequency effect was found when comparing the results from Studies 1 and 2, but it was examined more closely in Study 3. Fourteen participants performed a similar standard lexical decision task but the words and nonwords were selected systematically to provide a greater range in the aforementioned factors. As a result, a total of 18 word conditions were created with 18 nonword conditions matched on neighborhood density and neighborhood frequency. Using multiple regression analyses, we found that the P1 amplitude was significantly related to bigram frequency for both words and nonwords, consistent with results from Studies 1 and 2. In addition, word frequency and neighborhood frequency were also able to influence the P1 amplitude separately for words and for nonwords and there appeared to be a spatial dissociation between the two effects: for words, the word frequency effect in P1 was found at the left electrode site; for nonwords, the neighborhood frequency effect in P1 was found at the right electrode site. The implications of our findings are discussed.
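The two "bottom-up" measures defined in this abstract, bigram frequency and neighborhood density, can be illustrated with a short sketch. The thesis does not state which corpus or counting scheme was used, so everything below is an assumption for illustration: bigram counts are taken over a toy word list (one count per occurrence in a listed word, not corpus token frequencies), and the helper names are hypothetical.

```python
from collections import Counter


def bigrams(word):
    """Return the adjacent two-letter combinations of a letter string."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def bigram_table(lexicon):
    """Count how often each bigram occurs across the lexicon (toy counts)."""
    table = Counter()
    for w in lexicon:
        table.update(bigrams(w))
    return table


def summed_bigram_frequency(item, table):
    """Total frequency of the item's bigrams; works for words and nonwords."""
    return sum(table[b] for b in bigrams(item))


def neighborhood_density(item, lexicon):
    """Number of lexicon words that differ from the item by exactly one letter."""
    return sum(
        1
        for w in lexicon
        if len(w) == len(item) and sum(a != b for a, b in zip(w, item)) == 1
    )


# Toy lexicon (illustrative only).
toy_lexicon = ["cat", "cot", "car", "dog", "cart"]
toy_table = bigram_table(toy_lexicon)
```

Because both functions accept any letter string, the same measures can be computed for nonwords such as "cag", which is what allows word and nonword conditions to be matched on these factors.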

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Forty students from regular grade five classes were divided into two groups of twenty, a good reader group and a poor reader group, on the basis of their reading scores on Canadian Achievement Tests. The subjects took part in four experimental conditions in which they learned lists of pronounceable and unpronounceable pseudowords, some with semantic referents, and responded to questions designed to test visual perceptual learning and lexical and semantic association learning. It was hypothesized that the good reader group would be able to make use of graphemic and phonemic redundancy patterns in order to improve visual perceptual learning and lexical and semantic association learning to a greater extent than would the poor reader group. The data supported this hypothesis, and also indicated that, although the poor readers were less adept at using familiar sound and letter patterns, they were more dependent on such patterns as an aid to visual recognition memory and semantic recall than were the good readers. It was postulated that poor readers are in a double-bind situation of having to choose between using weak graphemic-semantic associations or grapheme-phoneme associations which are also weak and which have hindered them in developing automaticity in reading.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A substantial amount of evidence has been collected to propose an exclusive role for the dorsal visual pathway in the control of guided visual search mechanisms, specifically in the preattentive direction of spatial selection [Vidyasagar, T. R. (1999). A neuronal model of attentional spotlight: Parietal guiding the temporal. Brain Research and Reviews, 30, 66-76; Vidyasagar, T. R. (2001). From attentional gating in macaque primary visual cortex to dyslexia in humans. Progress in Brain Research, 134, 297-312]. Moreover, it has been suggested recently that the dorsal visual pathway is specifically involved in the spatial selection and sequencing required for orthographic processing in visual word recognition. In this experiment we manipulate the demands for spatial processing in a word recognition, lexical decision task by presenting target words in a normal spatial configuration, or where the constituent letters of each word are spatially shifted relative to each other. Accurate word recognition in the Shifted-words condition should demand higher spatial encoding requirements, thereby making greater demands on the dorsal visual stream. Magnetoencephalographic (MEG) neuroimaging revealed a high frequency (35-40 Hz) right posterior parietal activation consistent with dorsal stream involvement occurring between 100 and 300 ms post-stimulus onset, and then again at 200-400 ms. Moreover, this signal was stronger in the shifted word condition, compared to the normal word condition. This result provides neurophysiological evidence that the dorsal visual stream may play an important role in visual word recognition and reading. These results further provide a plausible link between early stage theories of reading, and the magnocellular-deficit theory of dyslexia, which characterises many types of reading difficulty. © 2006 Elsevier Ltd. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We used magnetoencephalography (MEG) to map the spatiotemporal evolution of cortical activity for visual word recognition. We show that for five-letter words, activity in the left hemisphere (LH) fusiform gyrus expands systematically in both the posterior-anterior and medial-lateral directions over the course of the first 500 ms after stimulus presentation. Contrary to what would be expected from cognitive models and hemodynamic studies, the component of this activity that spatially coincides with the visual word form area (VWFA) is not active until around 200 ms post-stimulus, and critically, this activity is preceded by and co-active with activity in parts of the inferior frontal gyrus (IFG, BA44/6). The spread of activity in the VWFA for words does not appear in isolation but is co-active in parallel with spread of activity in anterior middle temporal gyrus (aMTG, BA 21 and 38), posterior middle temporal gyrus (pMTG, BA37/39), and IFG. © 2004 Elsevier Inc. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Different from the first attempts to solve the image categorization problem (often based on global features), recently, several researchers have been tackling this research branch through a new vantage point - using features around locally invariant interest points and visual dictionaries. Although several advances have been made in the visual dictionaries literature in the past few years, a problem we still need to cope with is the calculation of the number of representative words in the dictionary. Therefore, in this paper we introduce a new solution for automatically finding the number of visual words in an N-Way image categorization problem by means of supervised pattern classification based on optimum-path forest. © 2011 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There is evidence that automatic visual attention favors the right side. This study investigated whether this lateral asymmetry interacts with the right hemisphere dominance for visual location processing and left hemisphere dominance for visual shape processing. Volunteers were tested in a location discrimination task and a shape discrimination task. The target stimuli (S2) could occur in the left or right hemifield. They were preceded by an ipsilateral, contralateral or bilateral prime stimulus (S1). The attentional effect produced by the right S1 was larger than that produced by the left S1. This lateral asymmetry was similar between the two tasks suggesting that the hemispheric asymmetries of visual mechanisms do not contribute to it. The finding that it was basically due to a longer reaction time to the left S2 than to the right S2 for the contralateral S1 condition suggests that the inhibitory component of attention is laterally asymmetric.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

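Parts mating is a basic operation of assembly robots. This paper presents a visual guidance method that carries out this operation using binocular stereo vision. By adopting human-computer interaction and drawing on human intelligence, the method improves the accuracy and reliability of image feature extraction and matching, gives the motion parameters of the parts-mating operation intuitively and accurately, and overcomes the computational complexity and poor robustness of automatic vision. It is suitable for robot teleoperation. Experiments show that robot parts mating based on human-computer interaction is entirely feasible under stereo vision guidance.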

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs may be easily adapted to model long range dependencies across an observation sequence. As a result visual word recognition performance can be improved as the model is able to take more of a contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated digit, visual speech recognition task using comparisons with a baseline HMM system. We firstly illustrate that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations being taken into account by each state. Secondly we compare model performances using various levels of video compression on the test set. As far as we are aware this is the first attempted use of HCRFs for visual speech recognition.
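The idea of letting each state take past and future observations into account can be illustrated with a simple context-window step. The abstract does not describe how the HCRF feature functions access neighbouring frames, so this sketch is an assumption: it concatenates each frame with its `w` neighbours on either side, repeating the edge frames as padding, and the function name `stack_context` is hypothetical. It shows only the windowing, not an HCRF.

```python
def stack_context(frames, w):
    """Concatenate each frame with its w past and w future neighbours.

    frames: list of per-frame feature vectors (lists of numbers).
    w:      number of past and future frames to include (w = 0 keeps
            each frame unchanged).
    Edge frames are repeated where the window runs off the sequence.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for k in range(t - w, t + w + 1):
            clamped = min(max(k, 0), n - 1)  # repeat first/last frame at edges
            window.extend(frames[clamped])
        stacked.append(window)
    return stacked


# Toy sequence of three 1-D visual feature frames (illustrative only).
toy_frames = [[1], [2], [3]]
toy_stacked = stack_context(toy_frames, w=1)
```

Each output vector has (2w + 1) times the original dimensionality, so widening the window trades more context against a larger feature space.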