985 results for visual objects
Abstract:
Most existing color-based tracking algorithms use the statistical color information of the object as the tracking cue, without maintaining the spatial structure within a single chromatic image. Recently, research on multilinear algebra has made it possible to preserve spatial structural relationships in a representation of image ensembles. In this paper, a third-order color tensor is constructed to represent the object to be tracked. Considering the influence of environmental changes on tracking, biased discriminant analysis (BDA) is extended to tensor biased discriminant analysis (TBDA) to distinguish the object from the background. At the same time, an incremental scheme for TBDA is developed for online learning of the tensor biased discriminant subspace, which can adapt to appearance variations of both the object and the background. The experimental results show that the proposed method can precisely track objects undergoing large pose, scale and lighting changes, as well as partial occlusion. © 2009 Elsevier B.V.
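As a rough illustration of what a third-order color tensor looks like in practice, the sketch below stores an image patch as a height × width × channel array and computes its mode-n unfoldings, the matricisations on which multilinear subspace methods typically operate. It is not the TBDA algorithm described in the abstract. The function name `unfold`, the nested-list layout, the sample `patch`, and the column ordering are all assumptions made for illustration.

```python
from typing import List

Tensor3 = List[List[List[float]]]  # indexed as tensor[row][col][channel]


def unfold(tensor: Tensor3, mode: int) -> List[List[float]]:
    """Return the mode-`mode` unfolding of a third-order tensor.

    Rows of the result run over the chosen mode; columns enumerate the two
    remaining indices, with the lower-numbered index varying slowest. This
    column ordering is one convention among several (an assumption here).
    """
    n_rows, n_cols, n_chan = len(tensor), len(tensor[0]), len(tensor[0][0])
    if mode == 0:  # one row per image row
        return [[tensor[i][j][k] for j in range(n_cols) for k in range(n_chan)]
                for i in range(n_rows)]
    if mode == 1:  # one row per image column
        return [[tensor[i][j][k] for i in range(n_rows) for k in range(n_chan)]
                for j in range(n_cols)]
    if mode == 2:  # one row per color channel
        return [[tensor[i][j][k] for i in range(n_rows) for j in range(n_cols)]
                for k in range(n_chan)]
    raise ValueError("mode must be 0, 1 or 2")


# Hypothetical 2x2 RGB patch: patch[row][col] = [R, G, B]
patch = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
red, green, blue = unfold(patch, 2)  # each channel flattened, spatial order kept
```

Unlike a color histogram, each unfolding keeps every pixel value at a known spatial position, which is the property the abstract relies on when it contrasts tensor representations with purely statistical color cues.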
Abstract:
When visual sensor networks are composed of cameras that can adjust the zoom factor of their own lenses, one must determine the optimal zoom levels for the cameras for a given task. This gives rise to an important trade-off between image quality and the overlap of the different cameras’ fields of view, which provides redundancy. In an object tracking task, having multiple cameras observe the same area allows for quicker recovery when a camera fails. In contrast, narrow zooms allow for a higher pixel count on regions of interest, leading to increased tracking confidence. In this paper we propose an approach for the self-organisation of redundancy in a distributed visual sensor network, based on decentralised multi-objective online learning using only local information to approximate the global state. We explore the impact of different zoom levels on these trade-offs when tasking omnidirectional cameras, which have a perfect 360-degree view, with keeping track of a varying number of moving objects. We further show how employing decentralised reinforcement learning enables zoom configurations to be achieved dynamically at runtime according to an operator’s preference for maximising either the proportion of objects tracked, the confidence associated with tracking, or redundancy in expectation of camera failure. We show that explicitly taking account of the level of overlap, even based only on local knowledge, improves resilience when cameras fail. Our results illustrate the trade-off between maintaining high confidence and object coverage, and maintaining redundancy in anticipation of future failure. Our approach provides a fully tunable decentralised method for the self-organisation of redundancy in a changing environment, according to an operator’s preferences.
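The idea of tuning zoom to an operator's preference can be pictured with a very small sketch. The code below is not the learning scheme from the paper: it assumes a stateless epsilon-greedy learner per camera as a stand-in, and a weighted sum as the way to combine the three objectives. The names `scalarised_reward` and `ZoomLearner`, the weight tuple, and the assumption that each objective is normalised to [0, 1] are all hypothetical.

```python
import random
from typing import Dict, List, Tuple


def scalarised_reward(coverage: float, confidence: float, redundancy: float,
                      weights: Tuple[float, float, float]) -> float:
    """Weighted sum of three objectives, each assumed to lie in [0, 1].

    `weights` stands for the operator's preference (an assumption here).
    """
    w_cov, w_conf, w_red = weights
    return w_cov * coverage + w_conf * confidence + w_red * redundancy


class ZoomLearner:
    """Epsilon-greedy learner over discrete zoom levels for one camera.

    It uses only the reward this camera observes locally.
    """

    def __init__(self, zoom_levels: List[int], epsilon: float = 0.1,
                 learning_rate: float = 0.5, seed: int = 0) -> None:
        self.values: Dict[int, float] = {z: 0.0 for z in zoom_levels}
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def choose(self) -> int:
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, zoom: int, reward: float) -> None:
        # Move the running estimate for this zoom level towards the reward.
        self.values[zoom] += self.learning_rate * (reward - self.values[zoom])
```

Changing the weight tuple shifts which zoom level ends up with the highest estimated value, which is one simple way to read "tunable according to an operator's preferences".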
Abstract:
Purpose: Dementia is associated with various alterations of the eye and visual function. Over 60% of cases are attributable to Alzheimer's disease and a significant proportion of the remainder to vascular dementia or dementia with Lewy bodies, while frontotemporal dementia and Parkinson's disease dementia are less common. This review describes the oculo-visual problems of these five dementias and the pathological changes which may explain these symptoms. It further discusses clinical considerations to help the clinician care for older patients affected by dementia. Recent findings: Visual problems in dementia include loss of visual acuity, defects in colour vision and visual masking tests, changes in pupillary response to mydriatics, defects in fixation and in smooth and saccadic eye movements, changes in contrast sensitivity function and visual evoked potentials, and disturbance of complex visual functions such as reading ability, visuospatial function, and the naming and identification of objects. Pathological changes have also been reported affecting the crystalline lens, retina, optic nerve, and visual cortex. Clinically, issues such as cataract surgery, correcting the refractive error, quality of life, falls, visual impairment and eye care for dementia have been addressed. Summary: Many visual changes occur across dementias, but the findings are controversial and often based on limited patient numbers, and no single feature can be regarded as diagnostic of any specific dementia. Nevertheless, visual hallucinations may be more characteristic of dementia with Lewy bodies and Parkinson's disease dementia than of Alzheimer's disease or frontotemporal dementia. Differences in saccadic eye movement dysfunction may also help to distinguish Alzheimer's disease from frontotemporal dementia and Parkinson's disease dementia from dementia with Lewy bodies.
Eye care professionals need to keep informed of the growing literature in vision/dementia, be attentive to signs and symptoms suggestive of cognitive impairment, and be able to adapt their practice and clinical interventions to best serve patients with dementia.
Abstract:
Recent experimental studies have shown that development towards adult performance levels in configural processing in object recognition is delayed through middle childhood. Whilst part changes to animal and artefact stimuli are processed with near-adult levels of accuracy from 7 years of age, relative size changes to stimuli result in a significant decrease in relative performance for participants aged between 7 and 10. Two sets of computational experiments were run using the JIM3 artificial neural network, with adult and 'immature' versions, to simulate these results. One set progressively decreased the number of neurons involved in the representation of view-independent metric relations within multi-geon objects. A second set of computational experiments involved decreasing the number of neurons that represent view-dependent (nonrelational) object attributes in JIM3's Surface Map. The best qualitative match to the empirical data occurred when artificial neurons representing metric-precision relations were entirely eliminated. These results therefore provide further evidence for the late development of relational processing in object recognition and suggest that children in middle childhood may recognise objects without forming structural description representations.
Abstract:
Imagining a familiar environment is different from imagining an environmental map, and clinical evidence has demonstrated the existence of double dissociations in brain-damaged patients depending on the contents of mental images. Here, we assessed a large sample of young and old participants by considering their ability to generate different kinds of mental images, namely buildings or common objects. As buildings are environmental stimuli that have an important role in human navigation, we expected that elderly participants would have greater difficulty in generating images of buildings than of common objects. We found that young and older participants differed in generating both buildings and common objects. For young participants there were no differences between buildings and common objects, but older participants found it easier to generate common objects than buildings. Buildings are a special type of visual stimulus because in urban environments they are commonly used as landmarks for navigational purposes. Considering that topographical orientation is one of the abilities most affected in normal and pathological aging, the present data shed some light on the impaired processes underlying human navigation.
Abstract:
Alzheimer's disease (AD) is an important neurodegenerative disorder causing visual problems in the elderly population. The pathology of AD includes the deposition in the brain of abnormal aggregates of β-amyloid (Aβ) in the form of senile plaques (SP) and abnormally phosphorylated tau in the form of neurofibrillary tangles (NFT). A variety of visual problems have been reported in patients with AD, including loss of visual acuity (VA), colour vision and visual fields; changes in pupillary responses to mydriatics; defects in fixation and in smooth and saccadic eye movements; changes in contrast sensitivity and in visual evoked potentials (VEP); and disturbances in complex visual tasks such as reading, visuospatial function, and the naming and identification of objects. In addition, pathological changes have been observed to affect the eye, visual pathway, and visual cortex in AD. To better understand degeneration of the visual cortex in AD, the laminar distribution of the SP and NFT was studied in visual areas V1 and V2 in 18 cases of AD which varied in disease onset and duration. In area V1, the mean density of SP and NFT reached a maximum in lamina III and in laminae II and III respectively. In V2, mean SP density was maximal in laminae III and IV and NFT density in laminae II and III. The densities of SP in lamina I of V1 and NFT in lamina IV of V2 were negatively correlated with patient age. No significant correlations were observed in any cortical lamina between the density of NFT and disease onset or duration. However, in area V2, the densities of SP in lamina II and lamina V were negatively correlated with disease duration and disease onset respectively. In addition, there were several positive correlations between the densities of SP and NFT in V1 and those in area V2.
The data suggest: (1) NFT pathology is greater in area V2 than V1, (2) laminae II/III of V1 and V2 are most affected by the pathology, (3) the formation of SP and NFT in V1 and V2 is interconnected, and (4) the pathology may spread between visual areas via the feed-forward short cortico-cortical connections. © 2012 by Nova Science Publishers, Inc. All rights reserved.
Abstract:
In this study we aim to evaluate the impact of ageing and gender on different visual mental imagery processes. Two hundred and fifty-one participants (130 women and 121 men; age range = 18–77 years) were given an extensive neuropsychological battery including tasks probing the generation, maintenance, inspection, and transformation of visual mental images (Complete Visual Mental Imagery Battery, CVMIB). Our results show that all mental imagery processes with the exception of maintenance are affected by ageing, suggesting that other deficits, such as working memory deficits, could account for this effect. However, the analysis of the transformation process, investigated in terms of mental rotation and mental folding skills, shows a steeper decline in mental rotation, suggesting that age could affect rigid transformations of objects and spare non-rigid transformations. Our study also adds to previous ones in showing gender differences favoring men across the lifespan in the transformation process and, interestingly, it shows a steeper decline in men than in women in inspecting mental images, which could partially account for the mixed results on the effect of ageing on this specific process. We also discuss the possibility of introducing the CVMIB in clinical assessment in the context of theoretical models of mental imagery.
Abstract:
Current reform initiatives recommend that school geometry teaching and learning include the study of three-dimensional geometric objects and provide students with opportunities to use spatial abilities in mathematical tasks. Two ways of using Geometer's Sketchpad (GSP), a dynamic and interactive computer program, in conjunction with manipulatives enable students to investigate and explore geometric concepts, especially when used in a constructivist setting. Research on spatial abilities has focused on visual reasoning to improve visualization skills. This dissertation investigated the hypothesis that connecting visual and analytic reasoning may better improve students' spatial visualization abilities as compared to instruction that makes little or no use of the connection between the two. Data were collected using the Purdue Spatial Visualization Tests (PSVT), administered as a pretest and posttest to a control group and two experimental groups. Sixty-four 10th grade students in three geometry classrooms participated in the study over 6 weeks. Research questions were answered using statistical procedures. An analysis of covariance was used for the quantitative analysis, whereas a description of students' visual-analytic processing strategies was presented using qualitative methods. The quantitative results indicated that there were significant differences by gender, but not by group. However, when analyzing a subsample of 33 participants with pretest scores below the 50th percentile, males in one of the experimental groups significantly benefited from the treatment. A review of previous research also indicated that students with low visualization skills benefited more than those with higher visualization skills. The qualitative results showed that girls were more sophisticated in their visual-analytic processing strategies for solving three-dimensional tasks.
It is recommended that the teaching and learning of spatial visualization start in the middle school, prior to students' more rigorous mathematics exposure in high school. A duration longer than 6 weeks for treatments in similar future research studies is also recommended.
Abstract:
To navigate effectively in three-dimensional space, flying insects must approximate distances to nearby objects. Humans are able to use an array of cues to guide depth perception in the visual world. However, some of these cues are not available to insects, which are constrained by their rigid eyes and relatively small body size. Flying fruit flies can use motion parallax to gauge the distance of nearby objects, but this cue becomes less effective as objects become more remote. Humans are able to infer depth across far distances by comparing the angular distance of an object to the horizon. This study tested whether flying fruit flies, like humans, use the relative position of the horizon as a depth cue. Fruit flies in tethered flight were stimulated with a virtual environment that displayed vertical bars of varying elevation relative to a horizon, and their tracking responses were recorded. The tracking responses of the flies were strongly increased by reducing the apparent elevation of the bar against the horizon, indicating that fruit flies may be able to assess the distance of far-off objects in the natural world by comparing them against a visual horizon.
Abstract:
We constantly deal with a huge amount of dynamic information in the environment in which we live; attention is therefore an indispensable cognitive resource that allows the effective selection of stimuli for our survival. It is thus important to investigate how we process moving stimuli and how attention spreads across space to serve more than one stimulus simultaneously. The behavioural urgency hypothesis suggests that approaching stimuli receive processing priority over receding ones, but some research indicates that this may occur not at an attentional stage but rather as a prioritization of the motor response. There is also considerable controversy in research on attentional focusing: some studies suggest that the focus of attention works like a zoom lens, while others indicate that it can be divided to respond to stimuli in non-contiguous regions. This study used two experiments to investigate attentional prioritization by moving stimuli and how attention is distributed in the presence of distractor stimuli. The first experiment investigated whether the number of motion flows influences information processing. The results indicate prioritization of directed flows over random ones, and of a single flow over a dual flow. The second experiment investigated how attention is distributed in space, using flows as an exogenous cue. The results indicate that the focus of attention works as suggested by the zoom lens model.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
This study examined the properties of ERP effects elicited by unattended (spatially uncued) objects using a short-lag repetition-priming paradigm. Same or different common objects were presented in a yoked prime-probe trial either as intact images or as slightly scrambled (half-split) versions. Behaviourally, only objects in a familiar (intact) view showed priming. An enhanced negativity was observed at parietal and occipito-parietal electrode sites within the time window of the posterior N250 after the repetition of intact, but not split, images. An additional post-hoc N2pc analysis of the prime display supported the view that this result could not be attributed to differences in salience between familiar intact and split views. These results demonstrate that spatially unattended objects undergo visual processing but only if shown in familiar views, indicating a role of holistic processing of objects that is independent of attention.
Abstract:
Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
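To make the ingredients of this training approach concrete, the sketch below shows the histogram intersection kernel together with a generic mistake-driven kernel classifier that consumes training examples one at a time. It is not SIKMA: the class `OnlineKernelPerceptron` is a textbook kernel perceptron used as a stand-in, and unlike the method described above it stores every misclassified example. All names and the toy histograms are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def intersection_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Histogram intersection kernel: the sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


class OnlineKernelPerceptron:
    """Kernel perceptron that processes one labelled example per call."""

    def __init__(self) -> None:
        # Stored (example, label) pairs; labels are +1 or -1.
        self.support: List[Tuple[Sequence[float], int]] = []

    def score(self, x: Sequence[float]) -> float:
        # Signed sum of kernel similarities to the stored examples.
        return sum(label * intersection_kernel(sv, x)
                   for sv, label in self.support)

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.score(x) > 0 else -1

    def partial_fit(self, x: Sequence[float], label: int) -> None:
        # Store the example only if the current model gets it wrong.
        if label * self.score(x) <= 0:
            self.support.append((x, label))


# Hypothetical two-bin histograms, one per class
model = OnlineKernelPerceptron()
model.partial_fit([0.9, 0.1], 1)
model.partial_fit([0.1, 0.9], -1)
```

Because the decision function is a sum of kernel evaluations rather than a single dot product, a classifier of this kind can separate histogram data that a linear classifier cannot, which is the accuracy advantage the abstract refers to.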
Abstract:
This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. The proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, because clothing approximates a non-rigid 2-manifold in 3-space that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by a Gaussian Process) serve as the halting criteria in the flattening and sorting tasks, respectively.
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of the proposed architecture is generic 3D surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves 84.7% accuracy on average; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation.
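Of the features named above, Local Binary Patterns are the simplest to show in a few lines. The sketch below implements the basic 8-neighbour operator on a 2D grid of values, which could be an intensity image or a depth map. It is the standard textbook operator, not the thesis's adaptation to clothing surfaces. The names `lbp_code` and `lbp_histogram`, the clockwise bit order, and the toy image are assumptions for illustration.

```python
from typing import List

# Clockwise neighbour offsets starting at the top-left pixel.
# The starting point and direction are a convention (an assumption here).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: List[List[float]], row: int, col: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(OFFSETS):
        # Set the bit when the neighbour is at least as large as the centre.
        if image[row + d_row][col + d_col] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image: List[List[float]]) -> List[int]:
    """Count LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            hist[lbp_code(image, row, col)] += 1
    return hist


# Hypothetical 3x3 patch: a bright top row above a darker region
image = [
    [5, 5, 5],
    [1, 3, 1],
    [1, 1, 1],
]
code = lbp_code(image, 1, 1)
```

The histogram of codes over a region is what is typically used as the texture descriptor, since it discards exact pixel positions while keeping local contrast patterns.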
Abstract:
This study seeks to explain the modularity of the human mind as a set of modules, thereby contributing to the study of the cognitive sciences. These modules of the mental architecture allow the mind to interpret colour as the result of the visual system and of the wavelengths of the electromagnetic spectrum refracted from objects. In the visual system, signals from the light-sensitive cells known as photoreceptors travel along the optic nerve to the brain, where the perceptual system is located. On this basis, visual search for colour can be studied as a measure of the functioning of the visual system. This is an exploratory study on the objectivity of happiness in children, which explores disjunctive visual search for colour as an objective measure of good mental functioning and of subjective well-being, as a construct of happiness. The sample consisted of a group of 49 non-institutionalized children and a group of 16 institutionalized children, of both sexes. The study used a disjunctive visual search task based on symmetries of colours belonging to the same opponent pair and colours belonging to different opponent pairs. The results suggest that institutionalization does not interfere with mental functioning, and therefore not with the children's subjective well-being.