935 results for Visual Object Identification Task
Abstract:
Spatial objects may not only be perceived visually but also by touch. We report recent experiments investigating to what extent prior object knowledge acquired in either the haptic or visual sensory modality transfers to a subsequent visual learning task. Results indicate that even mental object representations learnt in one sensory modality may attain a multi-modal quality. These findings seem incompatible with picture-based reasoning schemas but leave open the possibility of modality-specific reasoning mechanisms.
Abstract:
This paper deals with problems in the preprocessing of color images. These include interference suppression (both local interference and noise) and extraction of the object from the background at the stage preceding contour extraction. It was long held that smoothing is inadmissible in segmentation by boundary extraction, but the methods described and the results obtained indicate that noise-suppression methods are worthwhile.
Abstract:
Recent experimental studies have shown that development towards adult performance levels in configural processing in object recognition is delayed through middle childhood. Whilst part changes to animal and artefact stimuli are processed with near-adult levels of accuracy from 7 years of age, relative size changes to stimuli result in a significant decrease in relative performance for participants aged between 7 and 10. Two sets of computational experiments were run using the JIM3 artificial neural network with adult and 'immature' versions to simulate these results. One set progressively decreased the number of neurons involved in the representation of view-independent metric relations within multi-geon objects. A second set of computational experiments involved decreasing the number of neurons that represent view-dependent (nonrelational) object attributes in JIM3's Surface Map. The simulation results that showed the best qualitative match to the empirical data occurred when artificial neurons representing metric-precision relations were entirely eliminated. These results therefore provide further evidence for the late development of relational processing in object recognition and suggest that children in middle childhood may recognise objects without forming structural description representations.
Abstract:
In the visual perception literature, the recognition of faces has often been contrasted with that of non-face objects, in terms of differences with regard to the role of parts, part relations and holistic processing. However, recent evidence from developmental studies has begun to blur this sharp distinction. We review evidence for a protracted development of object recognition that is reminiscent of the well-documented slow maturation observed for faces. The prolonged development manifests itself in a retarded processing of metric part relations as opposed to that of individual parts and offers surprising parallels to developmental accounts of face recognition, even though the interpretation of the data is less clear with regard to holistic processing. We conclude that such results might indicate functional commonalities between the mechanisms underlying the recognition of faces and non-face objects, which are modulated by different task requirements in the two stimulus domains.
Abstract:
To navigate effectively in three-dimensional space, flying insects must approximate distances to nearby objects. Humans are able to use an array of cues to guide depth perception in the visual world. However, some of these cues are not available to insects that are constrained by their rigid eyes and relatively small body size. Flying fruit flies can use motion parallax to gauge the distance of nearby objects, but using this cue becomes a less effective strategy as objects become more remote. Humans are able to infer depth across far distances by comparing the angular distance of an object to the horizon. This study tested if flying fruit flies, like humans, use the relative position of the horizon as a depth cue. Fruit flies in tethered flight were stimulated with a virtual environment that displayed vertical bars of varying elevation relative to a horizon, and their tracking responses were recorded. This study showed that tracking responses of the flies were strongly increased by reducing the apparent elevation of the bar against the horizon, indicating that fruit flies may be able to assess the distance of far off objects in the natural world by comparing them against a visual horizon.
Abstract:
Portions of this research were presented at the Experimental Psychological Society conference at the University of Kent (May, 2014). The first author is supported by a studentship provided by the University of Dundee. This study was conducted as part of the requirements for the degree of Doctor of Philosophy by the first author.
Abstract:
Person re-identification involves recognizing a person across non-overlapping camera views, with different pose, illumination, and camera characteristics. We propose to tackle this problem by training a deep convolutional network to represent a person’s appearance as a low-dimensional feature vector that is invariant to common appearance variations encountered in the re-identification problem. Specifically, a Siamese-network architecture is used to train a feature extraction network using pairs of similar and dissimilar images. We show that use of a novel multi-task learning objective is crucial for regularizing the network parameters in order to prevent over-fitting due to the small size of the training dataset. We complement the verification task, which is at the heart of re-identification, by training the network to jointly perform verification and identification and to recognise attributes related to the clothing and pose of the person in each image. Additionally, we show that our proposed approach performs well even in the challenging cross-dataset scenario, which may better reflect real-world expected performance.
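The abstract above does not state which pairwise loss the Siamese network uses. As a minimal sketch, assuming the widely used contrastive loss as a stand-in (the function names, the margin value, and the toy feature vectors are all hypothetical), training on similar and dissimilar pairs could look like this: similar pairs are pulled together, and dissimilar pairs are pushed apart until they are at least `margin` away.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same, margin=1.0):
    """Contrastive loss for one pair of feature vectors (assumed loss, not
    confirmed by the abstract).

    same=True  : penalise any distance between the two embeddings.
    same=False : penalise the pair only while it is closer than `margin`.
    """
    d = euclidean(u, v)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings of three images.
anchor = [0.0, 0.0]
near = [0.3, 0.4]   # distance 0.5 from the anchor
far = [3.0, 4.0]    # distance 5.0 from the anchor

loss_similar_pair = contrastive_loss(anchor, near, same=True)
loss_close_impostor = contrastive_loss(anchor, near, same=False)
loss_distant_impostor = contrastive_loss(anchor, far, same=False)
```

In the multi-task setting the abstract describes, a pairwise term of this kind would be only one component of the objective, alongside identification and attribute-recognition terms.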
Abstract:
Doctoral program: Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería (Intelligent Systems and Numerical Applications in Engineering)
Abstract:
Four subjects were run in a visual span of apprehension experiment to determine whether second choices made following incorrect first responses are at the chance level, as implied by various high threshold models proposed for this situation. The relationships between response biases on first and second choices, and between first choice biases on trials with two or three possible responses, were also examined in terms of Luce's (1959) choice theory. The results were: (a) second choice performance in this task appears to be determined by response bias alone, i.e., second choices were at the chance level; (b) first and second choice response biases were not related according to Luce's choice axiom; and (c) the choice axiom predicted with reasonable accuracy the relationships between first choice response biases corresponding to trials with different numbers of possible response alternatives. © 1967 Psychonomic Society, Inc.
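Result (c) above rests on a simple property of Luce's choice axiom: the probability of choosing alternative i from a set S is v_i divided by the sum of v_j over S, so the ratio between any two alternatives does not depend on which other alternatives are offered. The sketch below illustrates this with hypothetical response strengths; the values and names are illustrative assumptions and are not taken from the study.

```python
def luce_choice_probs(strengths, available):
    """Choice probabilities under Luce's choice axiom.

    strengths : dict mapping each response alternative to a positive strength
    available : the alternatives offered on this trial
    """
    total = sum(strengths[a] for a in available)
    return {a: strengths[a] / total for a in available}


# Hypothetical response-bias strengths for three response alternatives.
strengths = {"A": 2.0, "B": 1.0, "C": 1.0}

three_way = luce_choice_probs(strengths, ["A", "B", "C"])
two_way = luce_choice_probs(strengths, ["A", "B"])

# The axiom predicts the same A:B ratio with two or with three alternatives.
ratio_three = three_way["A"] / three_way["B"]
ratio_two = two_way["A"] / two_way["B"]
```

Biases estimated on three-alternative trials therefore predict those on two-alternative trials by renormalising over the reduced response set.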
Abstract:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve on the performance of individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by a greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low-level features and attributes. A low-rank attribute embedding is jointly learned within the multi-task learning formulation to embed original binary attributes in a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered.
To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast R-CNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and eliminates a large share of negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
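The greedy optimization of a submodular objective mentioned in the abstract above can be illustrated with a minimal sketch. The thesis's actual objective (an information gain term over the graph, plus rank relationships) is not given here, so the code assumes a facility-location function as a stand-in monotone submodular objective; the similarity matrix and all names are hypothetical.

```python
def facility_location(selected, sim):
    """Stand-in submodular objective: for every image, take its best
    similarity to any selected image, and sum over all images."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_select(sim, k):
    """Pick up to k images, each time adding the one with the largest
    marginal gain in the objective."""
    selected = []
    candidates = list(range(len(sim)))
    for _ in range(min(k, len(sim))):
        base = facility_location(selected, sim)
        best = max(
            candidates,
            key=lambda j: facility_location(selected + [j], sim) - base,
        )
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical pairwise similarities between five database images:
# images 0-1 form one group, images 2-4 another (image 3 is most central).
sim = [
    [1.0, 0.8, 0.1, 0.1, 0.0],
    [0.8, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9, 0.5],
    [0.1, 0.1, 0.9, 1.0, 0.9],
    [0.0, 0.1, 0.5, 0.9, 1.0],
]

chosen = greedy_select(sim, k=2)
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value, which is why such objectives can be optimized efficiently.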
Abstract:
Research shows that deviant sexual preference (DSP) is one of the most central predictors of recidivism in sexual offences. Because legal decision-making and the treatment of sex offenders both require valid, standardized tools for assessing DSP, research in recent years has focused on attention-based methods. The aim of this thesis was to improve attention-based methods by developing a so-called Rapid Serial Visual Presentation (dtRSVP) method for identifying DSP. Before testing samples of sex offenders, we carried out three studies to calibrate the procedures. In these studies we examined whether homosexual and heterosexual men could be differentiated and how easily participants could fake their responses. In addition, we created a new set of standardized stimuli for assessing pedophilic sexual interest (Virtual People Set, VPS). In creating the stimuli, the ethical and legal problems were taken into account as far as possible. When we used dtRSVP as a measurement procedure to discriminate sexual orientation in a sample of homosexual and heterosexual men, we found that sexually relevant stimuli affected information processing in a predictable way. The procedure discriminated well between the sexual preferences of the groups and was difficult to manipulate by faking. When we used dtRSVP to identify deviant sexual preference among convicted sex offenders, we found that they processed sexual stimuli differently from other participants and that these differences were in the expected directions. It was, however, difficult to draw any conclusions about the ability of this method to distinguish between pedophiles and non-pedophiles. Finally, we found that the VPS appears to be a useful stimulus set for experimental research on pedophilic sexual interest.
Abstract:
This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. This proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by a Gaussian Process) serve as the halting criteria in the flattening and sorting tasks, respectively.
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding and Topology Spatial Distance, to describe and quantify generic landmarks (wrinkles and folds). The essence of this proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves 84.7% accuracy on average; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation.
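Of the features named in the abstract above, the Shape Index has a compact closed form: it maps the two principal curvatures of a surface point to a single value in [-1, 1] that describes local shape independently of scale. The sketch below assumes the commonly cited formulation S = (2/π) · arctan((κ2 + κ1) / (κ2 − κ1)) with κ1 ≥ κ2. The sign convention, and the way the thesis adapts the feature to clothing surfaces, are not stated in the abstract and are assumptions here.

```python
import math


def shape_index(k_a, k_b):
    """Shape index from two principal curvatures (assumed convention:
    S = (2/pi) * arctan((k2 + k1) / (k2 - k1)) with k1 >= k2).

    Raises ValueError on planar points, where the index is undefined.
    """
    k1, k2 = max(k_a, k_b), min(k_a, k_b)
    if k1 == 0.0 and k2 == 0.0:
        raise ValueError("shape index is undefined where both curvatures are zero")
    # atan2 evaluates arctan((k2 + k1) / (k2 - k1)) without dividing by zero
    # at umbilic points, where k1 == k2.
    return (2.0 / math.pi) * math.atan2(-(k1 + k2), k1 - k2)


# Hypothetical curvature pairs for a few canonical local shapes.
both_positive = shape_index(1.0, 1.0)     # equal positive curvatures
both_negative = shape_index(-1.0, -1.0)   # equal negative curvatures
saddle = shape_index(1.0, -1.0)           # symmetric saddle
ridge_like = shape_index(1.0, 0.0)        # curved in one direction only
```

Under this convention the two extremes (-1 and +1) correspond to dome-like and bowl-like points, 0 to a symmetric saddle, and ±0.5 to ridge-like or valley-like points such as the wrinkles and folds the abstract mentions; which extreme counts as "convex" depends on the chosen surface normal direction.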