972 results for natural scene perception


Relevance:

100.00%

Abstract:

How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, ranging from global gist to local properties of textures. The model can incrementally learn and predict scene identity by gist information alone and can improve performance through selective attention to scenic textures of progressively smaller size. ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain and countryside) with up to 91.58% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone. Model simulations also show that adjacent textures form higher-order features that are also informative for scene recognition.

Relevance:

100.00%

Abstract:

We investigated the effect of image size on saccade amplitudes. First, in a meta-analysis, relevant results from previous scene perception studies are summarised, suggesting the possibility of a linear relationship between mean saccade amplitude and image size. Forty-eight observers viewed 96 colour scene images scaled to four different sizes, while their eye movements were recorded. Mean and median saccade amplitudes were found to be directly proportional to image size, while the mode of the distribution lay in the range of very short saccades. However, saccade amplitudes expressed as percentages of image size were not constant over the different image sizes; on smaller stimulus images, the relative saccades were found to be larger, and vice versa. In sum, and as far as mean and median saccade amplitudes are concerned, the size of stimulus images is the dominant factor. Other factors, such as image properties, viewing task, or measurement equipment, are only of subordinate importance. Thus, the role of stimulus size has to be reconsidered, in theoretical as well as methodological terms.
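The difference between absolute and relative saccade amplitudes can be made concrete with a short sketch. The amplitude lists and image widths below are invented for illustration and are not data from the study; `summarize` is a hypothetical helper, and measuring amplitudes in degrees against image width is an assumption.

```python
from statistics import mean, median


def summarize(amplitudes_deg, image_width_deg):
    """Summarize saccade amplitudes for one image size.

    Returns the mean and median amplitude, plus the mean amplitude
    expressed as a percentage of the image width.
    """
    if not amplitudes_deg or image_width_deg <= 0:
        raise ValueError("need at least one amplitude and a positive width")
    m = mean(amplitudes_deg)
    return {
        "mean": m,
        "median": median(amplitudes_deg),
        "mean_pct_of_width": 100.0 * m / image_width_deg,
    }


# Invented numbers: the larger image has the larger mean amplitude,
# yet the smaller relative amplitude.
small = summarize([1, 2, 3, 6], image_width_deg=10)
large = summarize([1, 3, 5, 11], image_width_deg=20)
print(small["mean_pct_of_width"], large["mean_pct_of_width"])  # 30.0 25.0
```

With these made-up values the absolute mean grows with image size while the relative amplitude shrinks, which is the pattern the abstract describes.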

Relevance:

100.00%

Abstract:

Color has an unresolved role in the rapid processing of natural scenes. Temporal changes in the color effect might partly account for the debate. Moreover, the distinction between localized and unlocalized information has not been addressed directly in previous color studies. Here we present two experiments that investigate whether color contributes to the categorization of briefly flashed natural images, and whether its contribution is mediated by time and low-level information. By controlling the interval between target and mask stimuli, Experiment 1 tested the hypothesis that color facilitates the early stage of scene perception and that the effect decays in later processing. Experiment 2 examined how randomizing local phase information influenced the advantage of color over gray. Together, the results suggest that color does enhance natural scene categorization at short exposure times. Furthermore, the results imply that the effect of color is stable between 12 and 120 ms and is not accounted for by structures organized by localized information. We therefore conclude that color consistently contributes to rapid scene categorization and that its contribution does not depend on localized information. The present study thus attempts to fill a gap in previous research, and its results contribute to a deeper understanding of the role of color in natural scene perception.

Relevance:

100.00%

Abstract:

The human visual system is adept at detecting and encoding statistical regularities in its spatio-temporal environment. Here we report an unexpected failure of this ability in the context of perceiving inconsistencies in illumination distributions across a scene. Contrary to predictions from previous studies [Enns and Rensink, 1990; Sun and Perona, 1996a, 1996b, 1997], we find that the visual system displays a remarkable lack of sensitivity to illumination inconsistencies, both in experimental stimuli and in images of real scenes. Our results allow us to draw inferences regarding how the visual system encodes illumination distributions across scenes. Specifically, they suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction.

Relevance:

90.00%

Abstract:

Visual perception is not merely the perception of variations in the amount of light reaching the retina. Natural images are also composed of variations in contrast and texture, referred to as second-order information (as opposed to first-order information: luminance). It has been shown in several species that second-order motion (spatiotemporal variation of contrast or texture) is readily detected. Motion detection models such as the Adelson and Bergen energy model cannot explain these results, because second-order motion involves no variation in luminance. Three models explain the detection of second-order motion: filter-rectify-filter circuitry, a feature-tracking mechanism, or simply the existence of early nonlinearities in visual processing. It has also been proposed that second-order visual information is processed by neural circuitry distinct from that which processes first-order information. Many studies, however, refute this theory and agree that there is only a partial separation at low levels. Electrophysiological studies of second-order motion perception have mainly been carried out in monkeys and cats. In the cat, however, only the primary visual areas (17 and 18) have been extensively studied. The involvement in second-order processing of the area dedicated to motion perception, the posteromedial lateral suprasylvian sulcus (PMLS), is not yet known. To address this, we studied the response profiles of PMLS neurons evoked by stimuli whose dynamic component was second-order. Response profiles for second-order motion are very similar to those for first-order motion, although less sensitive. Our data suggest that motion perception by the PMLS is form-cue invariant. In sum, the results show that the PMLS supports more complex processing of second-order motion and are consistent with its privileged role in motion perception.

Relevance:

90.00%

Abstract:

When we actively explore the visual environment, our gaze preferentially selects regions characterized by high contrast and high density of edges, suggesting that the guidance of eye movements during visual exploration is driven to a significant degree by perceptual characteristics of a scene. Converging findings suggest that the selection of the visual target for the upcoming saccade critically depends on a covert shift of spatial attention. However, it is unclear whether attention selects the location of the next fixation uniquely on the basis of global scene structure or additionally on local perceptual information. To investigate the role of spatial attention in scene processing, we examined eye fixation patterns of patients with spatial neglect during unconstrained exploration of natural images and compared these to healthy and brain-injured control participants. We computed luminance, colour, contrast, and edge information contained in image patches surrounding each fixation and evaluated whether they differed from randomly selected image patches. At the global level, neglect patients showed the characteristic ipsilesional shift of the distribution of their fixations. At the local level, patients with neglect and control participants fixated image regions in ipsilesional space that were closely similar with respect to their local feature content. In contrast, when directing their gaze to contralesional (impaired) space neglect patients fixated regions of significantly higher local luminance and lower edge content than controls. These results suggest that intact spatial attention is necessary for the active sampling of local feature content during scene perception.

Relevance:

80.00%

Abstract:

Image annotation is a significant step towards semantic-based image retrieval. Ontology is a popular approach to semantic representation and has been intensively studied for multimedia analysis. However, relations among concepts are seldom used to extract higher-level semantics. Moreover, ontology inference is often crisp. This paper aims to enable sophisticated semantic querying of images, and thus contributes 1) an ontology framework that contains both visual and contextual knowledge, and 2) a probabilistic inference approach for reasoning about high-level concepts based on different sources of information. An experiment on a natural scene database drawn from LabelMe shows encouraging results.

Relevance:

80.00%

Abstract:

This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word by word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
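The seed-and-threshold procedure mentioned above can be illustrated with a generic region-growing sketch. This is not the toolkit's MATLAB code: the function name `region_grow`, the 4-connectivity, and the rule of comparing each pixel to the seed intensity are all assumptions made for illustration.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a region from `seed` over a grayscale image (list of rows).

    A pixel joins the region if it is 4-connected to a pixel already in
    the region and its intensity is within `threshold` of the seed's.
    Returns the set of (row, col) coordinates in the region.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if (inside and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy image: a dark 2x2 blob in the top-left corner on a bright background.
image = [
    [10, 12, 200],
    [11, 13, 210],
    [205, 198, 12],
]
print(sorted(region_grow(image, seed=(0, 0), threshold=5)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Raising or lowering `threshold` changes how far the region spreads, which is the kind of user control the abstract describes. The isolated dark pixel at (2, 2) is excluded because it is not connected to the seed.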

Relevance:

80.00%

Abstract:

How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goal-modulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.

Relevance:

80.00%

Abstract:

The goal of this study is to identify cues for the cognitive process of attention in ancient Greek art, aiming to find confirmation of its possible use by ancient Greek audiences and artists. Evidence of cues that trigger attention’s psychological dispositions was sought through content analysis of image reproductions of ancient Greek sculpture and fine vase painting from the Archaic to the Hellenistic period (ca. 7th to 1st century BC). Through this analysis, it was possible to observe the presence of cues that trigger orientation to the work of art (i.e. amplification, contrast, emotional salience, simplification, symmetry), of a cue that triggers disseminated attention to the parts of the work (i.e. distribution of elements), and of cues that activate selective attention to specific elements in the work of art (i.e. contrast of elements, salient color, central positioning of elements, composition regarding the flow of elements and significant objects). The results support the universality of those dispositions, which are probably connected with basic competencies that are hard-wired in the nervous system and in the cognitive processes.

Relevance:

80.00%

Abstract:

This study evaluates the performance of Bag-of-Visual-Terms (BOV) methods for the automatic classification of digital images from the database of the artist Miquel Planas. These images play a part in the ideation and design of his sculptural work. The task is an interesting challenge, given how difficult scene categorization becomes when scenes differ more in their semantic content than in the objects they contain. We used a kernel-based recognition method introduced by Lazebnik, Schmid and Ponce in 2006. The results are promising: on average, the performance score is approximately 70%. The experiments suggest that automatic image categorization based on computer vision methods can provide objective principles for cataloguing images, and that the results obtained can be applied in different fields of artistic creation.
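The kernel-based method of Lazebnik, Schmid and Ponce (spatial pyramid matching) compares images by histogram intersection over visual-word histograms. The sketch below shows only that single-level building block. The function name and the toy histograms are hypothetical, and the abstract does not state the vocabulary size or pyramid depth used in the study.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection of two equal-length visual-word histograms.

    Each entry counts how often one visual word occurs in an image; the
    kernel sums, word by word, the count the two images have in common.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must use the same vocabulary")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy histograms over a three-word vocabulary.
image_a = [3, 0, 2]
image_b = [1, 4, 2]
print(histogram_intersection(image_a, image_b))  # 3
```

In the full method this quantity is computed at several spatial resolutions and combined with level-dependent weights; that step is omitted here.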

Relevance:

80.00%

Abstract:

Do we view the world differently if it is described to us in figurative rather than literal terms? An answer to this question would reveal something about both the conceptual representation of figurative language and the scope of top-down influences on scene perception. Previous work has shown that participants will look longer at a path region of a picture when it is described with a type of figurative language called fictive motion (The road goes through the desert) rather than without (The road is in the desert). The current experiment provided evidence that such fictive motion descriptions affect eye movements by evoking mental representations of motion. If participants heard contextual information that would hinder actual motion, it influenced how they viewed a picture when it was described with fictive motion. Inspection times and eye movements scanning along the path increased during fictive motion descriptions when the terrain was first described as difficult (The desert is hilly) as compared to easy (The desert is flat); there were no such effects for descriptions without fictive motion. It is argued that fictive motion evokes a mental simulation of motion that is immediately integrated with visual processing, and hence figurative language can have a distinct effect on perception. (c) 2005 Elsevier B.V. All rights reserved.

Relevance:

80.00%

Abstract:

An increasing number of neuroscience experiments are using virtual reality to provide a more immersive and less artificial experimental environment. This is particularly useful to navigation and three-dimensional scene perception experiments. Such experiments require accurate real-time tracking of the observer's head in order to render the virtual scene. Here, we present data on the accuracy of a commonly used six degrees of freedom tracker (Intersense IS900) when it is moved in ways typical of virtual reality applications. We compared the reported location of the tracker with its location computed by an optical tracking method. When the tracker was stationary, the root mean square error in spatial accuracy was 0.64 mm. However, we found that errors increased over ten-fold (up to 17 mm) when the tracker moved at speeds common in virtual reality applications. We demonstrate that the errors we report here are predominantly due to inaccuracies of the IS900 system rather than the optical tracking against which it was compared. (c) 2006 Elsevier B.V. All rights reserved.
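A root mean square error of the kind reported above can be computed from paired position samples as in the following sketch. The abstract does not say whether the error was taken per axis or as a Euclidean distance; the Euclidean form used here is an assumption, and the function name and sample coordinates are hypothetical.

```python
import math


def rms_error(reported, reference):
    """RMS of the Euclidean distances between paired 3D positions.

    `reported` holds tracker positions and `reference` the positions from
    the comparison method, both as equal-length lists of (x, y, z) tuples
    in the same units (for example millimetres).
    """
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty lists of equal length")
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(reported, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Two invented samples, each 5 units away from its reference position.
tracker = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
optical = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(rms_error(tracker, optical))  # 5.0
```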