62 results for Bag-of-visual Words
in CentAUR: Central Archive University of Reading - UK
Abstract:
Scene classification based on latent Dirichlet allocation (LDA) is a more general modeling method known as a bag of visual words, in which the construction of a visual vocabulary is a crucial quantization process to ensure success of the classification. A framework is developed using the following new aspects: Gaussian mixture clustering for the quantization process, the use of an integrated visual vocabulary (IVV), which is built as the union of all centroids obtained from the separate quantization process of each class, and the usage of some features, including edge orientation histogram, CIELab color moments, and gray-level co-occurrence matrix (GLCM). The experiments are conducted on IKONOS images with six semantic classes (tree, grassland, residential, commercial/industrial, road, and water). The results show that the use of an IVV increases the overall accuracy (OA) by 11 to 12% and 6% when it is implemented on the selected and all features, respectively. The selected features of CIELab color moments and GLCM provide a better OA than the implementation over CIELab color moment or GLCM as individuals. The latter increases the OA by only ∼2 to 3%. Moreover, the results show that the OA of LDA outperforms the OA of C4.5 and naive Bayes tree by ∼20%. © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE) [DOI: 10.1117/1.JRS.8.083690]
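The integrated visual vocabulary described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: plain k-means is swapped in for the Gaussian mixture clustering the abstract names, the function names (`kmeans`, `build_ivv`, `bovw_histogram`) are hypothetical, and the 2-D feature vectors are toy values made up for the example. Each class is quantized separately, the resulting centroids are pooled into one vocabulary, and an image is then represented by counting how many of its descriptors fall nearest to each visual word.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (stand-in for the GMM clustering named in the abstract)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def build_ivv(features_by_class, k_per_class):
    """Quantize each class separately, then take the union of all centroids."""
    vocab = []
    for label in sorted(features_by_class):
        vocab.extend(kmeans(features_by_class[label], k_per_class))
    return vocab


def bovw_histogram(descriptors, vocab):
    """Count how many descriptors are nearest to each visual word."""
    counts = [0] * len(vocab)
    for d in descriptors:
        word = min(range(len(vocab)), key=lambda i: sq_dist(d, vocab[i]))
        counts[word] += 1
    return counts


# Toy, made-up 2-D features for two of the six classes (illustration only).
features_by_class = {
    "water": [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)],
    "tree": [(5.0, 5.1), (5.1, 5.0), (6.0, 6.1), (6.1, 6.0)],
}
vocab = build_ivv(features_by_class, k_per_class=2)
hist = bovw_histogram([(0.05, 0.05), (5.9, 6.0), (6.0, 6.0)], vocab)
print(len(vocab), hist)
```

The histogram has one bin per pooled centroid, so its length grows with the number of classes times the per-class vocabulary size. In a pipeline like the one in the abstract, such word counts would serve as the "documents" passed to a topic model such as LDA.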
Abstract:
We use a detailed study of the knowledge work around visual representations to draw attention to the multidimensional nature of 'objects'. Objects are variously described in the literatures as relatively stable or in flux; as abstract or concrete; and as used within or across practices. We clarify these dimensions, drawing on and extending the literature on boundary objects, and connecting it with work on epistemic and technical objects. In particular, we highlight the epistemic role of objects, using our observations of knowledge work on an architectural design project to show how, in this setting, visual representations are characterized by a 'lack' or incompleteness that precipitates unfolding. The conceptual design of a building involves a wide range of technical, social and aesthetic forms of knowledge that need to be developed and aligned. We explore how visual representations are used, and how these are meaningful to different stakeholders, eliciting their distinct contributions. As the project evolves and the drawings change, new issues and needs for knowledge work arise. These objects have an 'unfolding ontology' and are constantly in flux, rather than fully formed. We discuss the implications for wider understandings of objects in organizations and for how knowledge work is achieved in practice.
Abstract:
This paper uses the social model of disability to examine visually impaired children's experiences of their housing and neighbourhoods and finds that they did not experience any significant problems with the design of these environments. The source of their problems lay within these environments, and was caused by factors such as the intensity of movement, for example, from flows of traffic. We conclude by discussing the social policy implications of these findings.
Abstract:
During locomotion, retinal flow, gaze angle, and vestibular information can contribute to one's perception of self-motion. Their respective roles were investigated during active steering: Retinal flow and gaze angle were biased by altering the visual information during computer-simulated locomotion, and vestibular information was controlled through use of a motorized chair that rotated the participant around his or her vertical axis. Chair rotation was made appropriate for the steering response of the participant or made inappropriate by rotating a proportion of the veridical amount. Large steering errors resulted from selective manipulation of retinal flow and gaze angle, and the pattern of errors provided strong evidence for an additive model of combination. Vestibular information had little or no effect on steering performance, suggesting that vestibular signals are not integrated with visual information for the control of steering at these speeds.
Abstract:
Previous functional imaging studies have shown that facilitated processing of a visual object on repeated, relative to initial, presentation (i.e., repetition priming) is associated with reductions in neural activity in multiple regions, including fusiform/lateral occipital cortex. Moreover, activity reductions have been found, at diminished levels, when a different exemplar of an object is presented on repetition. In one previous study, the magnitude of diminished priming across exemplars was greater in the right relative to the left fusiform, suggesting greater exemplar specificity in the right. Another previous study, however, observed fusiform lateralization modulated by object viewpoint, but not object exemplar. The present fMRI study sought to determine whether the result of differential fusiform responses for perceptually different exemplars could be replicated. Furthermore, the role of the left fusiform cortex in object recognition was investigated via the inclusion of a lexical/semantic manipulation. Right fusiform cortex showed a significantly greater effect of exemplar change than left fusiform, replicating the previous result of exemplar-specific fusiform lateralization. Right fusiform and lateral occipital cortex were not differentially engaged by the lexical/semantic manipulation, suggesting that their role in visual object recognition is predominantly in the visual discrimination of specific objects. Activation in left fusiform cortex, but not left lateral occipital cortex, was modulated by both exemplar change and lexical/semantic manipulation, with further analysis suggesting a posterior-to-anterior progression between regions involved in processing visuoperceptual and lexical/semantic information about objects. The results are consistent with the view that the right fusiform plays a greater role in processing specific visual form information about objects, whereas the left fusiform is also involved in lexical/semantic processing.
(C) 2003 Elsevier Science (USA). All rights reserved.
Abstract:
Defensive behaviors, such as withdrawing your hand to avoid potentially harmful approaching objects, rely on rapid sensorimotor transformations between visual and motor coordinates. We examined the reference frame for coding visual information about objects approaching the hand during motor preparation. Subjects performed a simple visuomanual task while a task-irrelevant distractor ball rapidly approached a location either near to or far from their hand. After the distractor ball appearance, single pulses of transcranial magnetic stimulation were delivered over the subject's primary motor cortex, eliciting motor evoked potentials (MEPs) in their responding hand. MEP amplitude was reduced when the ball approached near the responding hand, both when the hand was on the left and the right of the midline. Strikingly, this suppression occurred very early, at 70-80 ms after ball appearance, and was not modified by visual fixation location. Furthermore, it was selective for approaching balls, since static visual distractors did not modulate MEP amplitude. Together with additional behavioral measurements, we provide converging evidence for automatic hand-centered coding of visual space in the human brain.
Abstract:
There are still major challenges in the area of automatic indexing and retrieval of multimedia content data for very large multimedia content corpora. Current indexing and retrieval applications still use keywords to index multimedia content and those keywords usually do not provide any knowledge about the semantic content of the data. With the increasing amount of multimedia content, it is inefficient to continue with this approach. In this paper, we describe the project DREAM, which addresses such challenges by proposing a new framework for semi-automatic annotation and retrieval of multimedia based on the semantic content. The framework uses the Topic Map Technology, as a tool to model the knowledge automatically extracted from the multimedia content using an Automatic Labelling Engine. We describe how we acquire knowledge from the content and represent this knowledge using the support of NLP to automatically generate Topic Maps. The framework is described in the context of film post-production.
Abstract:
There are still major challenges in the area of automatic indexing and retrieval of digital data. The main problem arises from the ever increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. Research has been ongoing for a few years in the field of ontological engineering with the aim of using ontologies to add knowledge to information. In this paper we describe the architecture of a system designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval.
Abstract:
This paper describes experiments relating to the perception of the roughness of simulated surfaces via the haptic and visual senses. Subjects used a magnitude estimation technique to judge the roughness of “virtual gratings” presented via a PHANToM haptic interface device, and a standard visual display unit. It was shown that under haptic perception, subjects tended to perceive roughness as decreasing with increased grating period, though this relationship was not always statistically significant. Under visual exploration, the exact relationship between spatial period and perceived roughness was less well defined, though linear regressions provided a reliable approximation to individual subjects’ estimates.
Abstract:
The premotor theory of attention claims that attentional shifts are triggered during response programming, regardless of which response modality is involved. To investigate this claim, event-related brain potentials (ERPs) were recorded while participants covertly prepared a left or right response, as indicated by a precue presented at the beginning of each trial. Cues signalled a left or right eye movement in the saccade task, and a left or right manual response in the manual task. The cued response had to be executed or withheld following the presentation of a Go/Nogo stimulus. Although there were systematic differences between ERPs triggered during covert manual and saccade preparation, lateralised ERP components sensitive to the direction of a cued response were very similar for both tasks, and also similar to the components previously found during cued shifts of endogenous spatial attention. This is consistent with the claim that the control of attention and of covert response preparation are closely linked. N1 components triggered by task-irrelevant visual probes presented during the covert response preparation interval were enhanced when these probes were presented close to the cued response hand in the manual task, and at the saccade target location in the saccade task. This demonstrates that both manual and saccade preparation result in spatially specific modulations of visual processing, in line with the predictions of the premotor theory.
Abstract:
This study explores how children learn the meaning (semantics) and spelling patterns (orthography) of novel words encountered in story context. English-speaking children (N = 88) aged 7 to 8 years read 8 stories and each story contained 1 novel word repeated 4 times. Semantic cues were provided by the story context such that children could infer the meaning of the word (specific context) or the category that the word belonged to (general context). Following story reading, posttests indicated that children showed reliable semantic and orthographic learning. Decoding was the strongest predictor of orthographic learning, indicating that self-teaching via phonological recoding was important for this aspect of word learning. In contrast, oral vocabulary emerged as the strongest predictor of semantic learning.
Abstract:
Future research is required into the prevalence of loneliness, anxiety and depression in adults with visual impairment, and to evaluate the effectiveness of interventions for improving psychosocial well-being such as counselling, peer support and employment programmes.
Abstract:
Dorsolateral prefrontal cortex (DLPFC) is recruited during visual working memory (WM) when relevant information must be maintained in the presence of distracting information. The mechanism by which DLPFC might ensure successful maintenance of the contents of WM is, however, unclear; it might enhance neural maintenance of memory targets or suppress processing of distracters. To adjudicate between these possibilities, we applied time-locked transcranial magnetic stimulation (TMS) during functional MRI, an approach that permits causal assessment of a stimulated brain region's influence on connected brain regions, and evaluated how this influence may change under different task conditions. Participants performed a visual WM task requiring retention of visual stimuli (faces or houses) across a delay during which visual distracters could be present or absent. When distracters were present, they were always from the opposite stimulus category, so that targets and distracters were represented in distinct posterior cortical areas. We then measured whether DLPFC-TMS, administered in the delay at the time point when distracters could appear, would modulate posterior regions representing memory targets or distracters. We found that DLPFC-TMS influenced posterior areas only when distracters were present and, critically, that this influence consisted of increased activity in regions representing the current memory targets. DLPFC-TMS did not affect regions representing current distracters. These results provide a new line of causal evidence for a top-down DLPFC-based control mechanism that promotes successful maintenance of relevant information in WM in the presence of distraction.
Abstract:
Remote transient changes in the environment, such as the onset of visual distractors, impact on the execution of target-directed saccadic eye movements. Studies that have examined the latency of the saccade response have shown conflicting results. When there was an element of target selection, saccade latency increased as the distance between distractor and target increased. In contrast, when target selection is minimized by restricting the target to appear on one axis position, latency has been found to be slowest when the distractor is shown at fixation and reduces as it moves away from this position, rather than from the target. Here we report four experiments examining saccade latency as target and distractor positions are varied. We find support for a dependence of saccade latency on distractor distance both from the target and from fixation: saccade latency was longer when the distractor was shown close to fixation and longer still when it was shown in the opposite location (180°) to the target. We suggest that this is due to inhibitory interactions between the distractor, fixation and the target interfering with fixation disengagement and target selection.