935 results for Visual Object Identification Task


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we consider the task of recognizing epigraphs in images such as photos taken using mobile devices. Given a set of 17,155 photos related to 14,560 epigraphs, we used a k-nearest-neighbor approach to perform the recognition. The contribution of this work is in evaluating state-of-the-art visual object recognition techniques in this specific context. The experimental results show that the Vector of Locally Aggregated Descriptors (VLAD), obtained by aggregating SIFT descriptors, is the best choice for this task.
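The aggregation and nearest-neighbor steps named in this abstract can be illustrated with a minimal sketch. It assumes a codebook of centroids is already available (in practice usually learned with k-means) and uses tiny hand-made 2-D vectors in place of real SIFT descriptors; the function names `vlad` and `nearest_label` are hypothetical and not taken from the paper.

```python
import math

def vlad(descriptors, codebook):
    """Aggregate local descriptors into one L2-normalized VLAD vector."""
    k, d = len(codebook), len(codebook[0])
    residual_sums = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Assign the descriptor to its nearest codebook centroid.
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])),
        )
        # Accumulate the residual (descriptor minus centroid).
        for j in range(d):
            residual_sums[nearest][j] += x[j] - codebook[nearest][j]
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

def nearest_label(query, database, labels):
    """1-NN search by dot product (cosine similarity for unit vectors)."""
    best = max(
        range(len(database)),
        key=lambda i: sum(a * b for a, b in zip(query, database[i])),
    )
    return labels[best]

# Toy data standing in for SIFT descriptors (assumption, not real features).
codebook = [[0.0, 0.0], [10.0, 10.0]]
photo_a = vlad([[1.0, 0.0], [11.0, 10.0]], codebook)
photo_b = vlad([[0.0, 1.0], [10.0, 11.0]], codebook)
query = vlad([[0.9, 0.1], [10.8, 10.2]], codebook)
print(nearest_label(query, [photo_a, photo_b], ["epigraph A", "epigraph B"]))
```

With k > 1 neighbors, the label would typically be chosen by majority vote among the retrieved photos.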

Relevance:

100.00% 100.00%

Publisher:

Abstract:

As we look around a scene, we perceive it as continuous and stable even though each saccadic eye movement changes the visual input to the retinas. How the brain achieves this perceptual stabilization is unknown, but a major hypothesis is that it relies on presaccadic remapping, a process in which neurons shift their visual sensitivity to a new location in the scene just before each saccade. This hypothesis is difficult to test in vivo because complete, selective inactivation of remapping is currently intractable. We tested it in silico with a hierarchical, sheet-based neural network model of the visual and oculomotor system. The model generated saccadic commands to move a video camera abruptly. Visual input from the camera and internal copies of the saccadic movement commands, or corollary discharge, converged at a map-level simulation of the frontal eye field (FEF), a primate brain area known to receive such inputs. FEF output was combined with eye position signals to yield a suitable coordinate frame for guiding arm movements of a robot. Our operational definition of perceptual stability was "useful stability," quantified as continuously accurate pointing to a visual object despite camera saccades. During training, the emergence of useful stability was correlated tightly with the emergence of presaccadic remapping in the FEF. Remapping depended on corollary discharge but its timing was synchronized to the updating of eye position. When coupled to predictive eye position signals, remapping served to stabilize the target representation for continuously accurate pointing. Graded inactivations of pathways in the model replicated, and helped to interpret, previous in vivo experiments. The results support the hypothesis that visual stability requires presaccadic remapping, provide explanations for the function and timing of remapping, and offer testable hypotheses for in vivo studies. 
We conclude that remapping allows for seamless coordinate frame transformations and quick actions despite visual afferent lags. With visual remapping in place for behavior, it may be exploited for perceptual continuity.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a solution to part of the problem of making robotic or semi-robotic digging equipment less dependent on human supervision. A method is described for identifying rocks of a certain size that may affect digging efficiency or require special handling. The process involves three main steps. First, by using range and intensity data from a time-of-flight (TOF) camera, a feature descriptor is used to rank points and separate regions surrounding high-scoring points. This allows a wide range of rocks to be recognized because features can represent a whole or just part of a rock. Second, these points are filtered to extract only points thought to belong to the large object. Finally, a check is carried out to verify that the resultant point cloud actually represents a rock. Results are presented from field testing on piles of fragmented rock. Note to Practitioners: This paper presents an algorithm to identify large boulders in a pile of broken rock as a step towards an autonomous mining dig planner. In mining, piles of broken rock can contain large fragments that may need to be specially handled. To assess rock piles for excavation, we make use of a TOF camera that does not rely on external lighting to generate a point cloud of the rock pile. We then segment large boulders from its surface by using a novel feature descriptor and distinguish between real and false boulder candidates. Preliminary field experiments show promising results with the algorithm performing nearly as well as human test subjects.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
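As a rough illustration of training a kernel classifier one example at a time, the sketch below pairs the histogram intersection kernel with a plain online kernel perceptron. This is a generic stand-in and not the SIKMA algorithm described in the dissertation; the function names and the toy histograms are assumptions made for the example.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))

def score(support, x):
    """Signed kernel expansion over the stored examples."""
    return sum(s_y * intersection_kernel(s_x, x) for s_x, s_y in support)

def train_online(examples, epochs=5):
    """Kernel perceptron: visit (histogram, label) pairs one by one.

    Labels are +1 or -1. An example is stored only when it is misclassified.
    """
    support = []
    for _ in range(epochs):
        for x, y in examples:
            if y * score(support, x) <= 0:
                support.append((x, y))
    return support

def predict(support, x):
    return 1 if score(support, x) > 0 else -1

# Toy two-bin histograms (assumption): positives are heavy in the first bin.
examples = [
    ([0.8, 0.2], 1),
    ([0.2, 0.8], -1),
    ([0.7, 0.3], 1),
    ([0.3, 0.7], -1),
]
model = train_online(examples)
print(predict(model, [0.9, 0.1]), predict(model, [0.1, 0.9]))
```

Unlike this perceptron, whose stored set can grow with the data, the abstract states that the proposed method keeps memory cost from becoming the bottleneck.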

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Fornax Spectroscopic Survey will use the Two degree Field spectrograph (2dF) of the Anglo-Australian Telescope to obtain spectra for a complete sample of all 14000 objects with 16.5 ≤ b_J ≤ 19.7 in a 12 square degree area centred on the Fornax Cluster. The aims of this project include the study of dwarf galaxies in the cluster (both known low surface brightness objects and putative normal surface brightness dwarfs) and a comparison sample of background field galaxies. We will also measure quasars and other active galaxies, any previously unrecognised compact galaxies and a large sample of Galactic stars. By selecting all objects, both stars and galaxies, independent of morphology, we cover a much larger range of surface brightness and scale size than previous surveys. In this paper we first describe the design of the survey. Our targets are selected from UK Schmidt Telescope sky survey plates digitised by the Automated Plate Measuring (APM) facility. We then describe the photometric and astrometric calibration of these data and show that the APM astrometry is accurate enough for use with the 2dF. We also describe a general approach to object identification using cross-correlations which allows us to identify and classify both stellar and galaxy spectra. We present results from the first 2dF field. Redshift distributions and velocity structures are shown for all observed objects in the direction of Fornax, including Galactic stars, galaxies in and around the Fornax Cluster, and the background galaxy population. The velocity data for the stars show the contributions from the different Galactic components, plus a small tail to high velocities. We find no galaxies in the foreground to the cluster in our 2dF field. The Fornax Cluster is clearly defined kinematically. The mean velocity from the 26 cluster members having reliable redshifts is 1560 ± 80 km s^-1. They show a velocity dispersion of 380 ± 50 km s^-1. 
Large-scale structure can be traced behind the cluster to a redshift beyond z = 0.3. Background compact galaxies and low surface brightness galaxies are found to follow the general galaxy distribution.
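The cross-correlation idea mentioned in this abstract can be sketched as follows, assuming that both the observed spectrum and the template are sampled on the same uniform log10-wavelength grid, so that a redshift appears as a pixel shift. The function names, the toy spectra, and the grid step are hypothetical; a real pipeline would also subtract the continuum and filter the spectra first.

```python
def best_lag(spectrum, template, max_lag):
    """Return the pixel lag that maximizes the cross-correlation."""
    def correlation(lag):
        total = 0.0
        for i, t in enumerate(template):
            j = i + lag
            if 0 <= j < len(spectrum):
                total += t * spectrum[j]
        return total
    return max(range(-max_lag, max_lag + 1), key=correlation)

def lag_to_redshift(lag, dlog10_lambda):
    """On a log10-wavelength grid, a lag of n pixels scales wavelengths by 1 + z."""
    return 10 ** (lag * dlog10_lambda) - 1

# Toy template with two features, and the same pattern shifted by 3 pixels.
template = [0.0] * 20
template[5], template[9] = 1.0, 0.5
spectrum = [0.0] * 20
spectrum[8], spectrum[12] = 1.0, 0.5

lag = best_lag(spectrum, template, max_lag=8)
print(lag, lag_to_redshift(lag, dlog10_lambda=1e-4))
```

Repeating the search over several templates (stellar and galaxy) and keeping the one with the strongest correlation peak is one way such a scheme can classify an object as well as measure its velocity.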

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Although several reports have demonstrated physiological and behavioral changes in adult rats due to neonatal immune challenges, little is known about their effects in adolescence. Since neonatal exposure to lipopolysaccharide (LPS) alters the neural substrates involved in cognitive disorders, we tested the hypothesis that it may also alter the response to novel environments in adolescent rats. At 3 and 5 days of age, male Wistar rats received intraperitoneal injections of either vehicle solution or E. coli LPS (0.05 mg/kg) or were left undisturbed. In the mid-adolescent period, between 40 and 46 days of age, the rats were exposed to the following behavioral tests: elevated plus-maze, open-field, novel-object exploration task, hole-board and the modified Porsolt forced swim test. The results showed that, in comparison with control animals, LPS-treated rats exhibited (1) fewer anxiety-related behaviors and enhanced patterns of locomotion and rearing in the plus-maze and the open-field tests, (2) high levels of exploration of both objects in the novel-object task and of corner and central holes in the hole-board test, and (3) more time spent diving, an active behavior in the forced swim test. The present findings suggest that neonatal LPS exposure has long-lasting effects on the behavior profile adolescent rats exhibit in response to novelty. This behavioral pattern, characterized by heightened exploratory activity in novel environments, also suggests that early immune stimulation may contribute to the development of impulsive behavior in adolescent rats. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Previous research using punctuate reaction time and counting tasks has found that the startle eyeblink reflex is sensitive to attentional demands. The present experiment explored whether startle eyeblink is also modulated during a complex continuous task and is sensitive to different levels of mental workload. Participants (N=14) performed a visual horizontal tracking task either alone (single-task condition) or in combination with a visual gauge monitoring task (multiple-task condition) for three minutes. On some task trials, the startle eyeblink reflex was elicited by a noise burst. Results showed that startle eyeblink was attenuated during both tasks and that the attenuation was greater during the multiple-task condition than during the single-task condition. Subjective ratings, endogenous eyeblink rate, heart period, and heart period variability provided convergent validity of the workload manipulations. The findings suggest that the startle eyeblink is sensitive to the workload demands associated with a continuous visual task. The application of startle eyeblink modulation as a workload metric and the possibility that it may be diagnostic of workload demands in different stimulus modalities are discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Relationships between accuracy and speed of decision-making, or speed-accuracy tradeoffs (SAT), have been extensively studied. However, the range of SAT observed varies widely across studies for reasons that are unclear. Several explanations have been proposed, including motivation or incentive for speed vs. accuracy, species and modality but none of these hypotheses has been directly tested. An alternative explanation is that the different degrees of SAT are related to the nature of the task being performed. Here, we addressed this problem by comparing SAT in two odor-guided decision tasks that were identical except for the nature of the task uncertainty: an odor mixture categorization task, where the distinguishing information is reduced by making the stimuli more similar to each other; and an odor identification task in which the information is reduced by lowering the intensity over a range of three log steps. (...)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Integrated master's dissertation in Psychology

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The role of peroxisome proliferator-activated receptor (PPAR)β/δ in the pathogenesis of Alzheimer's disease has only recently been explored through the use of PPARβ/δ agonists. Here we evaluated the effects of PPARβ/δ deficiency on the amyloidogenic pathway and tau hyperphosphorylation. PPARβ/δ-null mice showed cognitive impairment in the object recognition task, accompanied by enhanced DNA-binding activity of NF-κB in the cortex and increased expression of IL-6. In addition, two NF-κB-target genes involved in β-amyloid (Aβ) synthesis and deposition, the β site APP cleaving enzyme 1 (Bace1) and the receptor for advanced glycation endproducts (Rage), respectively, increased in PPARβ/δ-null mice compared to wild type animals. The protein levels of glial fibrillary acidic protein (GFAP) increased in the cortex of PPARβ/δ-null mice, which would suggest the presence of astrogliosis. Finally, tau hyperphosphorylation at Ser199 and enhanced levels of PHF-tau were associated with increased levels of the tau kinases CDK5 and phospho-ERK1/2 in the cortex of PPARβ/δ(-/-) mice. Collectively, our findings indicate that PPARβ/δ deficiency results in cognitive impairment associated with enhanced inflammation, astrogliosis and tau hyperphosphorylation in the cortex.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Introduction: Responses to external stimuli are typically investigated by averaging peri-stimulus electroencephalography (EEG) epochs in order to derive event-related potentials (ERPs) across the electrode montage, under the assumption that signals that are related to the external stimulus are fixed in time across trials. We demonstrate the applicability of a single-trial model based on patterns of scalp topographies (De Lucia et al, 2007) that can be used for ERP analysis at the single-subject level. The model is able to classify new trials (or groups of trials) with minimal a priori hypotheses, using information derived from a training dataset. The features used for the classification (the topography of responses and their latency) can be neurophysiologically interpreted, because a difference in scalp topography indicates a different configuration of brain generators. An above chance classification accuracy on test datasets implicitly demonstrates the suitability of this model for EEG data. Methods: The data analyzed in this study were acquired from two separate visual evoked potential (VEP) experiments. The first entailed passive presentation of checkerboard stimuli to each of the four visual quadrants (hereafter, "Checkerboard Experiment") (Plomp et al, submitted). The second entailed active discrimination of novel versus repeated line drawings of common objects (hereafter, "Priming Experiment") (Murray et al, 2004). Four subjects per experiment were analyzed, using approx. 200 trials per experimental condition. These trials were randomly separated in training (90%) and testing (10%) datasets in 10 independent shuffles. In order to perform the ERP analysis we estimated the statistical distribution of voltage topographies by a Mixture of Gaussians (MofGs), which reduces our original dataset to a small number of representative voltage topographies. 
We then evaluated statistically the degree of presence of these template maps across trials and whether and when this was different across experimental conditions. Based on these differences, single-trials or sets of a few single-trials were classified as belonging to one or the other experimental condition. Classification performance was assessed using the Receiver Operating Characteristic (ROC) curve. Results: For the Checkerboard Experiment, contrasts entailed left vs. right visual field presentations for upper and lower quadrants, separately. The average posterior probabilities, indicating the presence of the computed template maps in time and across trials, revealed significant differences starting at ~60-70 ms post-stimulus. The average ROC curve area across all four subjects was 0.80 and 0.85 for upper and lower quadrants, respectively, and was in all cases significantly higher than chance (unpaired t-test, p<0.0001). In the Priming Experiment, we contrasted initial versus repeated presentations of visual object stimuli. Their posterior probabilities revealed significant differences, which started at 250 ms post-stimulus onset. The classification accuracy rates with single-trial test data were at chance level. We therefore considered sub-averages based on five single trials. We found that for three out of four subjects, classification rates were significantly above chance level (unpaired t-test, p<0.0001). Conclusions: The main advantage of the present approach is that it is based on topographic features that are readily interpretable along neurophysiologic lines. As these maps were previously normalized by the overall strength of the field potential on the scalp, a change in their presence across trials and between conditions necessarily reflects a change in the underlying generator configurations. The temporal periods of statistical difference between conditions were estimated for each training dataset for ten shuffles of the data. 
Across the ten shuffles and in both experiments, we observed a high level of consistency in the temporal periods over which the two conditions differed. With this method we are able to analyze ERPs at the single-subject level providing a novel tool to compare normal electrophysiological responses versus single cases that cannot be considered part of any cohort of subjects. This aspect promises to have a strong impact on both basic and clinical research.
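The evaluation protocol in this abstract (ten random 90%/10% splits, scored by the area under the ROC curve) can be sketched with the standard library alone. The helper names and the seed are hypothetical; the area is computed with the pairwise-comparison (Mann-Whitney) formulation, which equals the ROC curve area, and is not necessarily how the authors computed it.

```python
import random

def roc_auc(scores, labels):
    """Area under the ROC curve: the probability that a positive trial
    outscores a negative one, with ties counted as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def shuffled_splits(n_trials, n_shuffles=10, train_fraction=0.9, seed=0):
    """Yield (train_indices, test_indices) for independent random shuffles."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * n_trials))
    for _ in range(n_shuffles):
        indices = list(range(n_trials))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]

splits = list(shuffled_splits(200))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

An area of 0.5 corresponds to chance-level classification, which is the baseline against which the reported values of 0.80 and 0.85 are compared.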

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce the time and effort spent in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
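The abstract does not say which linguistic cues the algorithm uses, so the following is only a sketch of the general bootstrapping idea under an assumed cue: adjectives joined by "and" are taken to share polarity, while adjectives joined by "but" are taken to have opposite polarity. The function name, the cue, and the example pairs are all hypothetical.

```python
def bootstrap_lexicon(seeds, pairs, max_iters=10):
    """Grow a polarity lexicon from seed adjectives.

    seeds: dict mapping adjective -> +1 (positive) or -1 (negative).
    pairs: (adjective, conjunction, adjective) tuples found in a corpus.
    """
    lexicon = dict(seeds)
    for _ in range(max_iters):
        added = False
        for a, conj, b in pairs:
            if conj not in ("and", "but"):
                continue  # only these two cues are modeled in this sketch
            sign = 1 if conj == "and" else -1  # "but" flips the polarity
            for known, new in ((a, b), (b, a)):
                if known in lexicon and new not in lexicon:
                    lexicon[new] = sign * lexicon[known]
                    added = True
        if not added:
            break  # nothing new was labeled in this pass
    return lexicon

pairs = [
    ("cozy", "and", "clean"),
    ("good", "and", "clean"),
    ("clean", "but", "noisy"),
]
lexicon = bootstrap_lexicon({"good": 1}, pairs)
print(lexicon)
```

Each pass can label adjectives that were unreachable in the previous pass ("cozy" is only labeled once "clean" is known), which is what makes the procedure iterative.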

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Within the framework of a ternary logic, this article focuses on the geographical map perceived as a visual object and on the question of surfaces of representation. We analyse the status of map making versus landscape representation and the relations between a map and a painted picture. A ternary model of pictorial composition, perspective|light/pictorial field, is proposed. The frame of the map, which articulates the space that is cut out and the space that is included, is discussed in a parallel between maps and painted pictures.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We perceive our environment through multiple sensory channels. Nonetheless, research has traditionally focused on the investigation of sensory processing within single modalities. Thus, investigating how our brain integrates multisensory information is of crucial importance for understanding how organisms cope with a constantly changing and dynamic environment. During my thesis I have investigated how multisensory events impact our perception and brain responses, either when auditory-visual stimuli were presented simultaneously or how multisensory events at one point in time impact later unisensory processing. In "Looming signals reveal synergistic principles of multisensory integration" (Cappe, Thelen et al., 2012) we investigated the neuronal substrates involved in motion detection in depth under multisensory vs. unisensory conditions. We have shown that congruent auditory-visual looming (i.e. approaching) signals are preferentially integrated by the brain. Further, we show that early effects under these conditions are relevant for behavior, effectively speeding up responses to these combined stimulus presentations. In "Electrical neuroimaging of memory discrimination based on single-trial multisensory learning" (Thelen et al., 2012), we investigated the behavioral impact of single encounters with meaningless auditory-visual object pairings upon subsequent visual object recognition. In addition to showing that these encounters lead to impaired recognition accuracy upon repeated visual presentations, we have shown that the brain discriminates images as soon as ~100 ms post-stimulus onset according to the initial encounter context. In "Single-trial multisensory memories affect later visual and auditory object recognition" (Thelen et al., in review) we have addressed whether auditory object recognition is affected by single-trial multisensory memories, and whether recognition accuracy of sounds was similarly affected by the initial encounter context as visual objects. 
We found that this is in fact the case. We propose that a common underlying brain network is differentially involved during encoding and retrieval of images and sounds based on our behavioral findings. - We perceive the environment around us through several sensory organs. Previously, research on perception focused on studying the sensory systems independently of one another. However, studying the brain processes that support the integration of multisensory information is of crucial importance for understanding how our brain works in response to a dynamic, perpetually changing world. During my thesis, I therefore studied how multisensory events affect our immediate and/or later perception and how they are processed by our brain. In the study "Looming signals reveal synergistic principles of multisensory integration" (Cappe, Thelen et al., 2012), we examined the neuronal processes involved in motion detection using auditory-visual stimuli presented alone or in combination. Our results showed that our brain preferentially integrates combined auditory-visual stimuli approaching the observer. Moreover, we showed that early effects, observed at the level of the brain response, influence our behavior by speeding up the detection of these stimuli. In the study "Electrical neuroimaging of memory discrimination based on single-trial multisensory learning" (Thelen et al., 2012), we examined the impact that the presentation of an auditory-visual stimulus has on the accuracy of image recognition. We studied how the presentation of a meaningless auditory-visual combination affects, at the behavioral and brain levels, the later recognition of the image. 
The results showed that recognition accuracy for images previously presented with a meaningless sound is lower than that obtained for images presented alone. Moreover, our brain differentiates these two types of stimuli very early in image processing. In the study "Single-trial multisensory memories affect later visual and auditory object recognition" (Thelen et al., in review), we asked whether the accuracy of sound recognition was similarly affected by the presentation of past multisensory events. This was confirmed by our results. We proposed that this similarity may be explained by the differential recruitment of a common neuronal network.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Glucose-dependent insulinotropic polypeptide (GIP) is a key incretin hormone, released from the intestine after a meal, producing a glucose-dependent insulin secretion. The GIP receptor (GIPR) is expressed on pyramidal neurons in the cortex and hippocampus, and GIP is synthesized in a subset of neurons in the brain. However, the role of the GIPR in neuronal signaling is not clear. In this study, we used a mouse strain with GIPR gene deletion (GIPR KO) to elucidate the role of the GIPR in neuronal communication and brain function. Compared with C57BL/6 control mice, GIPR KO mice displayed higher locomotor activity in an open-field task. Impairments of recognition memory and of spatial learning and memory were found in GIPR KO mice in the object recognition task and a spatial water maze task, respectively. In an object location task, no impairment was found. GIPR KO mice also showed impaired synaptic plasticity in paired-pulse facilitation and a block of long-term potentiation in area CA1 of the hippocampus. Moreover, a large decrease in the number of neuronal progenitor cells was found in the dentate gyrus of transgenic mice, although the number of young neurons was not changed. Together the results suggest that GIP receptors play an important role in cognition, neurotransmission, and cell proliferation.