929 resultados para PASCAL Visual Object Classes (VOC)


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Visual search data are given a unified quantitative explanation by a model of how spatial maps in the parietal cortex and object recognition categories in the inferotemporal cortex deploy attentional resources as they reciprocally interact with visual representations in the prestriate cortex. The model visual representations arc organized into multiple boundary and surface representations. Visual search in the model is initiated by organizing multiple items that lie within a given boundary or surface representation into a candidate search grouping. These items arc compared with object recognition categories to test for matches or mismatches. Mismatches can trigger deeper searches and recursive selection of new groupings until a target object io identified. This search model is algorithmically specified to quantitatively simulate search data using a single set of parameters, as well as to qualitatively explain a still larger data base, including data of Aks and Enns (1992), Bravo and Blake (1990), Chellazzi, Miller, Duncan, and Desimone (1993), Egeth, Viri, and Garbart (1984), Cohen and Ivry (1991), Enno and Rensink (1990), He and Nakayarna (1992), Humphreys, Quinlan, and Riddoch (1989), Mordkoff, Yantis, and Egeth (1990), Nakayama and Silverman (1986), Treisman and Gelade (1980), Treisman and Sato (1990), Wolfe, Cave, and Franzel (1989), and Wolfe and Friedman-Hill (1992). The model hereby provides an alternative to recent variations on the Feature Integration and Guided Search models, and grounds the analysis of visual search in neural models of preattentive vision, attentive object learning and categorization, and attentive spatial localization and orientation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? Consider, for example, a deer moving behind a bush. Here the partially occluded fragments of motion signals available to an observer must be coherently grouped into the motion of a single object. A 3D FORMOTION model comprises five important functional interactions involving the brain’s form and motion systems that address such situations. Because the model’s stages are analogous to areas of the primate visual system, we refer to the stages by corresponding anatomical names. In one of these functional interactions, 3D boundary representations, in which figures are separated from their backgrounds, are formed in cortical area V2. These depth-selective V2 boundaries select motion signals at the appropriate depths in MT via V2-to-MT signals. In another, motion signals in MT disambiguate locally incomplete or ambiguous boundary signals in V2 via MT-to-V1-to-V2 feedback. The third functional property concerns resolution of the aperture problem along straight moving contours by propagating the influence of unambiguous motion signals generated at contour terminators or corners. Here, sparse “feature tracking signals” from, e.g., line ends, are amplified to overwhelm numerically superior ambiguous motion signals along line segment interiors. In the fourth, a spatially anisotropic motion grouping process takes place across perceptual space via MT-MST feedback to integrate veridical feature-tracking and ambiguous motion signals to determine a global object motion percept. The fifth property uses the MT-MST feedback loop to convey an attentional priming signal from higher brain areas back to V1 and V2. The model's use of mechanisms such as divisive normalization, endstopping, cross-orientation inhibition, and longrange cooperation is described. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Early visual cortex (EVC) participates in visual feature memory and the updating of remembered locations across saccades, but its role in the trans-saccadic integration of object features is unknown. We hypothesized that if EVC is involved in updating object features relative to gaze, feature memory should be disrupted when saccades remap an object representation into a simultaneously perturbed EVC site. To test this, we applied transcranial magnetic stimulation (TMS) over functional magnetic resonance imaging-localized EVC clusters corresponding to the bottom left/right visual quadrants (VQs). During experiments, these VQs were probed psychophysically by briefly presenting a central object (Gabor patch) while subjects fixated gaze to the right or left (and above). After a short memory interval, participants were required to detect the relative change in orientation of a re-presented test object at the same spatial location. Participants either sustained fixation during the memory interval (fixation task) or made a horizontal saccade that either maintained or reversed the VQ of the object (saccade task). Three TMS pulses (coinciding with the pre-, peri-, and postsaccade intervals) were applied to the left or right EVC. This had no effect when (a) fixation was maintained, (b) saccades kept the object in the same VQ, or (c) the EVC quadrant corresponding to the first object was stimulated. However, as predicted, TMS reduced performance when saccades (especially larger saccades) crossed the remembered object location and brought it into the VQ corresponding to the TMS site. This suppression effect was statistically significant for leftward saccades and followed a weaker trend for rightward saccades. These causal results are consistent with the idea that EVC is involved in the gaze-centered updating of object features for trans-saccadic memory and perception.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

There is a perception amongst some of those learning computer programming that the principles of object-oriented programming (where behaviour is often encapsulated across multiple class files) can be difficult to grasp, especially when taught through a traditional, didactic ‘talk-and-chalk’ method or in a lecture-based environment.
We propose a non-traditional teaching method, developed for a government funded teaching training project delivered by Queen’s University, we call it bigCode. In this scenario, learners are provided with many printed, poster-sized fragments of code (in this case either Java or C#). The learners sit on the floor in groups and assemble these fragments into the many classes which make-up an object-oriented program.
Early trials indicate that bigCode is an effective method for teaching object-orientation. The requirement to physically organise the code fragments imitates closely the thought processes of a good software developer when developing object-oriented code.
Furthermore, in addition to teaching the principles involved in object-orientation, bigCode is also an extremely useful technique for teaching learners the organisation and structure of individual classes in Java or C# (as well as the organisation of procedural code). The mechanics of organising fragments of code into complete, correct computer programs give the users first-hand practice of this important skill, and as a result they subsequently find it much easier to develop well-structured code on a computer.
Yet, open questions remain. Is bigCode successful only because we have unknowingly predominantly targeted kinesthetic learners? Is bigCode also an effective teaching approach for other forms of learners, such as visual learners? How scalable is bigCode: in its current form can it be used with large class sizes, or outside the classroom?

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In his introduction, Pinna (2010) quoted one of Wertheimer’s observations: “I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of color. Do I have ‘327’? No. I have sky, house, and trees.” This seems quite remarkable, for Max Wertheimer, together with Kurt Koffka and Wolfgang Koehler, was a pioneer of Gestalt Theory: perceptual organisation was tackled considering grouping rules of line and edge elements in relation to figure-ground segregation, i.e., a meaningful object (the figure) as perceived against a complex background (the ground). At the lowest level – line and edge elements – Wertheimer (1923) himself formulated grouping principles on the basis of proximity, good continuation, convexity, symmetry and, often forgotten, past experience of the observer. Rubin (1921) formulated rules for figure-ground segregation using surroundedness, size and orientation, but also convexity and symmetry. Almost a century of research into Gestalt later, Pinna and Reeves (2006) introduced the notion of figurality, meant to represent the integrated set of properties of visual objects, from the principles of grouping and figure-ground to the colour and volume of objects with shading. Pinna, in 2010, went one important step further and studied perceptual meaning, i.e., the interpretation of complex figures on the basis of past experience of the observer. Re-establishing a link to Wertheimer’s rule about past experience, he formulated five propositions, three definitions and seven properties on the basis of observations made on graphically manipulated patterns. For example, he introduced the illusion of meaning by comics-like elements suggesting wind, therefore inducing a learned interpretation. His last figure shows a regular array of squares but with irregular positions on the right side. This pile of (ir)regular squares can be interpreted as the result of an earthquake which destroyed part of an apartment block. This is much more intuitive, direct and economic than describing the complexity of the array of squares.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

La texture est un élément clé pour l’interprétation des images de télédétection à fine résolution spatiale. L’intégration de l’information texturale dans un processus de classification automatisée des images se fait habituellement via des images de texture, souvent créées par le calcul de matrices de co-occurrences (MCO) des niveaux de gris. Une MCO est un histogramme des fréquences d’occurrence des paires de valeurs de pixels présentes dans les fenêtres locales, associées à tous les pixels de l’image utilisée; une paire de pixels étant définie selon un pas et une orientation donnés. Les MCO permettent le calcul de plus d’une dizaine de paramètres décrivant, de diverses manières, la distribution des fréquences, créant ainsi autant d’images texturales distinctes. L’approche de mesure des textures par MCO a été appliquée principalement sur des images de télédétection monochromes (ex. images panchromatiques, images radar monofréquence et monopolarisation). En imagerie multispectrale, une unique bande spectrale, parmi celles disponibles, est habituellement choisie pour générer des images de texture. La question que nous avons posée dans cette recherche concerne justement cette utilisation restreinte de l’information texturale dans le cas des images multispectrales. En fait, l’effet visuel d’une texture est créé, non seulement par l’agencement particulier d’objets/pixels de brillance différente, mais aussi de couleur différente. Plusieurs façons sont proposées dans la littérature pour introduire cette idée de la texture à plusieurs dimensions. Parmi celles-ci, deux en particulier nous ont intéressés dans cette recherche. La première façon fait appel aux MCO calculées bande par bande spectrale et la seconde utilise les MCO généralisées impliquant deux bandes spectrales à la fois. Dans ce dernier cas, le procédé consiste en le calcul des fréquences d’occurrence des paires de valeurs dans deux bandes spectrales différentes. Cela permet, en un seul traitement, la prise en compte dans une large mesure de la « couleur » des éléments de texture. Ces deux approches font partie des techniques dites intégratives. Pour les distinguer, nous les avons appelées dans cet ouvrage respectivement « textures grises » et « textures couleurs ». Notre recherche se présente donc comme une analyse comparative des possibilités offertes par l’application de ces deux types de signatures texturales dans le cas spécifique d’une cartographie automatisée des occupations de sol à partir d’une image multispectrale. Une signature texturale d’un objet ou d’une classe d’objets, par analogie aux signatures spectrales, est constituée d’une série de paramètres de texture mesurés sur une bande spectrale à la fois (textures grises) ou une paire de bandes spectrales à la fois (textures couleurs). Cette recherche visait non seulement à comparer les deux approches intégratives, mais aussi à identifier la composition des signatures texturales des classes d’occupation du sol favorisant leur différentiation : type de paramètres de texture / taille de la fenêtre de calcul / bandes spectrales ou combinaisons de bandes spectrales. Pour ce faire, nous avons choisi un site à l’intérieur du territoire de la Communauté Métropolitaine de Montréal (Longueuil) composé d’une mosaïque d’occupations du sol, caractéristique d’une zone semi urbaine (résidentiel, industriel/commercial, boisés, agriculture, plans d’eau…). Une image du satellite SPOT-5 (4 bandes spectrales) de 10 m de résolution spatiale a été utilisée dans cette recherche. Puisqu’une infinité d’images de texture peuvent être créées en faisant varier les paramètres de calcul des MCO et afin de mieux circonscrire notre problème nous avons décidé, en tenant compte des études publiées dans ce domaine : a) de faire varier la fenêtre de calcul de 3*3 pixels à 21*21 pixels tout en fixant le pas et l’orientation pour former les paires de pixels à (1,1), c'est-à-dire à un pas d’un pixel et une orientation de 135°; b) de limiter les analyses des MCO à huit paramètres de texture (contraste, corrélation, écart-type, énergie, entropie, homogénéité, moyenne, probabilité maximale), qui sont tous calculables par la méthode rapide de Unser, une approximation des matrices de co-occurrences, c) de former les deux signatures texturales par le même nombre d’éléments choisis d’après une analyse de la séparabilité (distance de Bhattacharya) des classes d’occupation du sol; et d) d’analyser les résultats de classification (matrices de confusion, exactitudes, coefficients Kappa) par maximum de vraisemblance pour conclure sur le potentiel des deux approches intégratives; les classes d’occupation du sol à reconnaître étaient : résidentielle basse et haute densité, commerciale/industrielle, agricole, boisés, surfaces gazonnées (incluant les golfs) et plans d’eau. Nos principales conclusions sont les suivantes a) à l’exception de la probabilité maximale, tous les autres paramètres de texture sont utiles dans la formation des signatures texturales; moyenne et écart type sont les plus utiles dans la formation des textures grises tandis que contraste et corrélation, dans le cas des textures couleurs, b) l’exactitude globale de la classification atteint un score acceptable (85%) seulement dans le cas des signatures texturales couleurs; c’est une amélioration importante par rapport aux classifications basées uniquement sur les signatures spectrales des classes d’occupation du sol dont le score est souvent situé aux alentours de 75%; ce score est atteint avec des fenêtres de calcul aux alentours de11*11 à 15*15 pixels; c) Les signatures texturales couleurs offrant des scores supérieurs à ceux obtenus avec les signatures grises de 5% à 10%; et ce avec des petites fenêtres de calcul (5*5, 7*7 et occasionnellement 9*9) d) Pour plusieurs classes d’occupation du sol prises individuellement, l’exactitude dépasse les 90% pour les deux types de signatures texturales; e) une seule classe est mieux séparable du reste par les textures grises, celle de l’agricole; f) les classes créant beaucoup de confusions, ce qui explique en grande partie le score global de la classification de 85%, sont les deux classes du résidentiel (haute et basse densité). En conclusion, nous pouvons dire que l’approche intégrative par textures couleurs d’une image multispectrale de 10 m de résolution spatiale offre un plus grand potentiel pour la cartographie des occupations du sol que l’approche intégrative par textures grises. Pour plusieurs classes d’occupations du sol un gain appréciable en temps de calcul des paramètres de texture peut être obtenu par l’utilisation des petites fenêtres de traitement. Des améliorations importantes sont escomptées pour atteindre des exactitudes de classification de 90% et plus par l’utilisation des fenêtres de calcul de taille variable adaptées à chaque type d’occupation du sol. Une méthode de classification hiérarchique pourrait être alors utilisée afin de séparer les classes recherchées une à la fois par rapport au reste au lieu d’une classification globale où l’intégration des paramètres calculés avec des fenêtres de taille variable conduirait inévitablement à des confusions entre classes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Mémoire numérisé par la Division de la gestion de documents et des archives de l'Université de Montréal

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Understanding how biological visual systems perform object recognition is one of the ultimate goals in computational neuroscience. Among the biological models of recognition the main distinctions are between feedforward and feedback and between object-centered and view-centered. From a computational viewpoint the different recognition tasks - for instance categorization and identification - are very similar, representing different trade-offs between specificity and invariance. Thus the different tasks do not strictly require different classes of models. The focus of the review is on feedforward, view-based models that are supported by psychophysical and physiological data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Human object recognition is generally considered to tolerate changes of the stimulus position in the visual field. A number of recent studies, however, have cast doubt on the completeness of translation invariance. In a new series of experiments we tried to investigate whether positional specificity of short-term memory is a general property of visual perception. We tested same/different discrimination of computer graphics models that were displayed at the same or at different locations of the visual field, and found complete translation invariance, regardless of the similarity of the animals and irrespective of direction and size of the displacement (Exp. 1 and 2). Decisions were strongly biased towards same decisions if stimuli appeared at a constant location, while after translation subjects displayed a tendency towards different decisions. Even if the spatial order of animal limbs was randomized ("scrambled animals"), no deteriorating effect of shifts in the field of view could be detected (Exp. 3). However, if the influence of single features was reduced (Exp. 4 and 5) small but significant effects of translation could be obtained. Under conditions that do not reveal an influence of translation, rotation in depth strongly interferes with recognition (Exp. 6). Changes of stimulus size did not reduce performance (Exp. 7). Tolerance to these object transformations seems to rely on different brain mechanisms, with translation and scale invariance being achieved in principle, while rotation invariance is not.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents recent developments to a vision-based traffic surveillance system which relies extensively on the use of geometrical and scene context. Firstly, a highly parametrised 3-D model is reported, able to adopt the shape of a wide variety of different classes of vehicle (e.g. cars, vans, buses etc.), and its subsequent specialisation to a generic car class which accounts for commonly encountered types of car (including saloon, batchback and estate cars). Sample data collected from video images, by means of an interactive tool, have been subjected to principal component analysis (PCA) to define a deformable model having 6 degrees of freedom. Secondly, a new pose refinement technique using “active” models is described, able to recover both the pose of a rigid object, and the structure of a deformable model; an assessment of its performance is examined in comparison with previously reported “passive” model-based techniques in the context of traffic surveillance. The new method is more stable, and requires fewer iterations, especially when the number of free parameters increases, but shows somewhat poorer convergence. Typical applications for this work include robot surveillance and navigation tasks.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through development of a logic-based inference engine based on Prolog. Threat detection performance is conducted by testing against a range of datasets describing realistic situations and demonstrates a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A large body of research analyzes the runtime execution of a system to extract abstract behavioral views. Those approaches primarily analyze control flow by tracing method execution events or they analyze object graphs of heap snapshots. However, they do not capture how objects are passed through the system at runtime. We refer to the exchange of objects as the object flow, and we claim that object flow is necessary to analyze if we are to understand the runtime of an object-oriented application. We propose and detail Object Flow Analysis, a novel dynamic analysis technique that takes this new information into account. To evaluate its usefulness, we present a visual approach that allows a developer to study classes and components in terms of how they exchange objects at runtime. We illustrate our approach on three case studies.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article presents a novel system and a control strategy for visual following of a 3D moving object by an Unmanned Aerial Vehicle UAV. The presented strategy is based only on the visual information given by an adaptive tracking method based on the color information, which jointly with the dynamics of a camera fixed to a rotary wind UAV are used to develop an Image-based visual servoing IBVS system. This system is focused on continuously following a 3D moving target object, maintaining it with a fixed distance and centered on the image plane. The algorithm is validated on real flights on outdoors scenarios, showing the robustness of the proposed systems against winds perturbations, illumination and weather changes among others. The obtained results indicate that the proposed algorithms is suitable for complex controls task, such object following and pursuit, flying in formation, as well as their use for indoor navigation