993 results for Visual Recognition


Relevance:

30.00%

Publisher:

Abstract:

With the progress of computer technology, computers are expected to be more intelligent in their interaction with humans, presenting information according to the user's psychological and physiological characteristics. However, computer users with visual problems may have difficulty perceiving icons, menus, and other graphical information displayed on the screen, which limits the efficiency of their interaction with computers. In this dissertation, a personalized and dynamic image precompensation method was developed to improve the visual performance of computer users with ocular aberrations. The precompensation was applied to graphical targets before they were presented on the screen, aiming to counteract the visual blurring caused by the ocular aberration of the user's eye. A complete and systematic modeling approach to describe the retinal image formation of the computer user was presented, taking advantage of modeling tools such as Zernike polynomials, wavefront aberration, the Point Spread Function and the Modulation Transfer Function. The ocular aberration of the computer user was first measured by a wavefront aberrometer, as a reference for the precompensation model. The dynamic precompensation was then generated from the aberration resized to the pupil diameter, which was monitored in real time. The potential visual benefit of the dynamic precompensation method was explored through software simulation, with the aberration data from a real human subject. An "artificial eye" experiment was conducted by simulating the human eye with a high-definition camera, providing an objective evaluation of image quality after precompensation. In addition, an empirical evaluation with 20 human participants was designed and implemented, involving image recognition tests performed under a more realistic viewing environment of computer use. 
Statistical analysis of the empirical experiment confirmed the effectiveness of the dynamic precompensation method, showing a significant improvement in recognition accuracy. The merit and necessity of the dynamic precompensation were also substantiated by comparing it with static precompensation. The visual benefit of the dynamic precompensation was further confirmed by the subjective assessments collected from the evaluation participants.
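
The modeling chain named in the abstract above (Zernike polynomials, wavefront aberration, Point Spread Function, Modulation Transfer Function) can be written out with the standard Fourier-optics relations. This is only a sketch: the abstract does not give the dissertation's own equations, and the last line assumes a Wiener-style regularized inverse filter as one plausible form of precompensation. The symbols A (pupil aperture), \lambda (wavelength), O (spectrum of the on-screen image) and K (regularization constant) are introduced here for illustration.

```latex
\begin{align}
W(\rho,\theta) &= \sum_{j} c_j\, Z_j(\rho,\theta)
  && \text{wavefront aberration as a Zernike expansion} \\
P(x,y) &= A(x,y)\,\exp\!\left(i\,\tfrac{2\pi}{\lambda}\,W(x,y)\right)
  && \text{generalized pupil function} \\
\mathrm{PSF} &\propto \left|\mathcal{F}\{P\}\right|^{2}
  && \text{point spread function} \\
\mathrm{OTF} &= \mathcal{F}\{\mathrm{PSF}\}, \qquad \mathrm{MTF} = \left|\mathrm{OTF}\right|
  && \text{transfer functions} \\
I_{\text{retina}} &= O \cdot \mathrm{OTF}
  && \text{retinal image spectrum} \\
O_{\text{pre}} &= O \cdot \frac{\mathrm{OTF}^{*}}{\left|\mathrm{OTF}\right|^{2} + K}
  && \text{assumed Wiener-style precompensation}
\end{align}
```

Under this assumption the eye receives O_pre, so the retinal spectrum becomes O |OTF|^2 / (|OTF|^2 + K), which is close to O wherever |OTF|^2 is much larger than K. Zernike coefficients are defined for a specific pupil diameter, which is presumably why the abstract speaks of resizing the aberration as the pupil changes.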

Relevance:

30.00%

Publisher:

Abstract:

Neuroimaging studies of episodic memory, or memory of events from our personal past, have predominantly focused on the medial temporal lobe (MTL). There is growing acknowledgement in the cognitive neuroscience of memory literature, however, that regions outside the MTL can support episodic memory processes. The medial prefrontal cortex (PFC) is one such region garnering increasing interest from researchers. Using behavioral and functional magnetic resonance imaging measures across two studies, this thesis provides evidence of a mnemonic role of the medial PFC. In the first study, participants were scanned while judging the extent to which they agreed or disagreed with the sociopolitical views of unfamiliar individuals. Behavioral tests of associative recognition revealed that participants remembered with high confidence viewpoints previously linked with judgments of strong agreement/disagreement. Neurally, the medial PFC mediated the interaction between high-confidence associative recognition memory and beliefs associated with strong agree/disagree judgments. In an effort to generalize this finding to well-established associative information, the second study investigated associative recognition memory for real-world concepts. Object-scene pairs congruent or incongruent with a preexisting schema were presented to participants in a cued-recall paradigm. Behavioral tests of conceptual and perceptual recognition revealed memory enhancements arising from strong resonance between presented pairs and preexisting schemas. Neurally, the medial PFC tracked increases in visual recall of schema-congruent pairs whereas the MTL tracked increases in visual recall of schema-incongruent pairs. Additionally, ventral areas of the medial PFC tracked conceptual components of visual recall specifically for schema-congruent pairs. These findings are consistent with a recent theoretical proposal of medial PFC contributions to memory for schema-related content. 
Collectively, these studies provide evidence of a role for the medial PFC in associative recognition memory, both for associative information deployed in our daily social interactions and for associations formed over multiple learning episodes. Additionally, this set of findings advances our understanding of the cognitive contributions of the medial PFC beyond its canonical role in processes underlying social cognition.

Relevance:

30.00%

Publisher:

Abstract:

Visual hallucinations seem to be more prevalent in low light, and hallucinators tend to be more prone to false-positive errors in memory tasks. Here we investigated whether the richness of stimuli does indeed affect recognition differently in hallucinating and nonhallucinating participants, and if so whether this difference extends to identifying spatial context. We compared 36 Parkinson's disease (PD) patients with visual hallucinations, 32 PD patients without hallucinations, and 36 age-matched controls on a visual memory task in which color and black-and-white pictures were presented at different locations. Participants had to recognize the pictures among distracters and recall the location of each stimulus. Findings revealed clear differences in performance between the groups. Both PD groups had impaired recognition compared to the controls, but those with hallucinations were significantly more impaired on black-and-white than on color stimuli. In addition, the group with hallucinations was significantly impaired compared to the other two groups on spatial memory. We suggest that not only do PD patients have poorer recognition of pictorial stimuli than controls, but those who present with visual hallucinations also appear to be more heavily reliant on bottom-up sensory input and impaired on spatial ability.

Relevance:

30.00%

Publisher:

Abstract:

In this work, we propose a biologically inspired appearance model for robust visual tracking. Motivated in part by the success of the hierarchical organization of the primary visual cortex (area V1), we establish an architecture consisting of five layers: whitening, rectification, normalization, coding and pooling. The first three layers stem from models developed for object recognition. In this paper, our attention focuses on the coding and pooling layers. In particular, we use a discriminative sparse coding method in the coding layer along with a spatial pyramid representation in the pooling layer, which makes it easier to distinguish the target to be tracked from its background in the presence of appearance variations. An extensive experimental study shows that the proposed method has higher tracking accuracy than several state-of-the-art trackers.
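
The pooling layer mentioned in the abstract above can be illustrated with a minimal spatial pyramid sketch. The abstract does not say which pooling operator or pyramid levels the authors use; the function name `spatial_pyramid_pool`, the max-of-absolute-values operator and the 1x1, 2x2, 4x4 levels are assumptions made here for illustration.

```python
def spatial_pyramid_pool(codes, levels=(1, 2, 4)):
    """Pool a grid of code vectors over a spatial pyramid.

    codes  : H x W grid (list of rows) of K-dimensional coefficient lists,
             e.g. sparse codes computed for each image patch.
    levels : number of cells per side at each pyramid level (assumed values).
    Returns one flat feature list of length K * sum(c * c for c in levels).
    """
    height, width = len(codes), len(codes[0])
    k = len(codes[0][0])
    pooled = []
    for cells in levels:
        for cy in range(cells):
            y0, y1 = cy * height // cells, (cy + 1) * height // cells
            for cx in range(cells):
                x0, x1 = cx * width // cells, (cx + 1) * width // cells
                # Max pooling on absolute values (an assumed choice).
                cell_max = [0.0] * k
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        for i in range(k):
                            cell_max[i] = max(cell_max[i], abs(codes[y][x][i]))
                pooled.extend(cell_max)
    return pooled


# Usage: a 4 x 4 grid of 2-dimensional codes gives 2 * (1 + 4 + 16) features.
grid = [[[float(y * 4 + x), -1.0] for x in range(4)] for y in range(4)]
features = spatial_pyramid_pool(grid)
```

Concatenating coarse and fine cells keeps some spatial layout in the feature vector, which is the usual motivation for a pyramid over a single global pool.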

Relevance:

30.00%

Publisher:

Abstract:

This PhD by publication examines selected practice-based audio-visual works made by the author over a ten-year period, placing them in a critical context. Central to the publications, and the focus of the thesis, is an exploration of the role of sound in the creation of dialectic tension between the audio, the visual and the audience. By first analysing a number of texts (films/videos and key writings) the thesis locates the principal issues and debates around the use of audio in artists’ moving image practice. From this it is argued that asynchronism, first advocated in 1929 by Pudovkin as a response to the advent of synchronised sound, can be used to articulate audio-visual relationships. Central to asynchronism’s application in this paper is a recognition of the propensity for sound and image to adhere, and in visual music for there to be a literal equation of audio with the visual, often married with a quest for the synaesthetic. These elements can either be used in an illusionist fashion, or employed as part of an anti-illusionist strategy for realising dialectic. Using this as a theoretical basis, the paper examines how the publications implement asynchronism, including digital mapping to facilitate innovative reciprocal sound and image combinations, and the asynchronous use of ‘found sound’ from a range of online sources to reframe the moving image. The synthesis of publications and practice demonstrates that asynchronism can both underpin the creation of dialectic, and be an integral component in an audio-visual anti-illusionist methodology.

Relevance:

30.00%

Publisher:

Abstract:

Increasing the size of the training data has been shown to be very effective in many computer vision tasks. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large-scale image sets is demanding in itself: they take up so much memory that it becomes impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be a key to attacking this problem. Efficiency alone is not enough for image understanding, however; the representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques for learning the binary features from supervised image sets (with different types of semantic supervision: class labels, textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
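
To illustrate why binary codes are cheap to store and compare, here is a minimal sketch. The abstract does not describe how its codes are learned; this example swaps in random hyperplane hashing, an unlearned baseline, and the names `make_hyperplanes`, `binary_code` and `hamming` are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(feature, hyperplanes):
    """Pack the signs of the projections of `feature` into one integer."""
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, feature))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")


# Usage: each image feature becomes a 16-bit code; comparing two codes
# costs one XOR and a bit count instead of a floating-point distance.
planes = make_hyperplanes(dim=3, n_bits=16, seed=1)
code_a = binary_code([1.0, 2.0, 3.0], planes)
code_b = binary_code([1.1, 1.9, 3.2], planes)
distance = hamming(code_a, code_b)
```

A learned code, as the abstract proposes, would replace the random hyperplanes with projections trained on class labels or textual descriptions; the storage and comparison steps would stay the same.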

Relevance:

30.00%

Publisher:

Abstract:

This dissertation, written for the master's degree in Languages and Business Relations (Línguas e Relações Empresariais), aims to gauge public perception of Novo Banco's visual identity, in order to understand whether it served effectively as a communication tool in the institution's post-crisis period. This effectiveness concerns the ability of a visual identity to influence a brand positively, promoting its recognition and visibility as well as its differentiation and positive positioning in the public's mind. The dismantling of Banco Espírito Santo, and the subsequent emergence of Novo Banco, carries a strong emotional charge. Therefore, to assess more accurately the impact of the emotional charge attached to Novo Banco, semantic associations with the brand, which reveal the affective meaning the organization holds for the public, were also studied. To meet this objective, contextual and theoretical foundations were laid out, leading to a questionnaire designed to collect data on the perception of the general public. The results suggest that the visual identity was not a sufficiently effective response, since certain of its components do not offer a clear interpretation of what they mean, resulting in low levels of agreement. However, the results also show that much of this ineffectiveness stems from the consequences of the crisis at Banco Espírito Santo, which are still very present in the public's mind.

Relevance:

30.00%

Publisher:

Abstract:

One of the highlights of our academic year is the exhibition of the works of the graduating class in our visual arts degree program. Of course it is very satisfying to see such unambiguous evidence of the accomplishments of our students. But I also find the exhibition invariably inspiring because of the works themselves, which stimulate us to see again as children, with wide-eyed curiosity and wonder, without the need to explain or reduce, with delight, with horror, with recognition. In enabling us to see again in this way, the students have learned a very demanding craft and educated themselves in the history, vocabulary and syntax of visual expression.

Relevance:

30.00%

Publisher:

Abstract:

This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. The proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by Gaussian Processes) serve as the halting criteria in the flattening and sorting tasks, respectively. 
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of the proposed architecture is generic 3D surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves 84.7% accuracy on average; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process-based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation.
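
Of the features listed in the abstract above, Local Binary Patterns are the simplest to show in code. The sketch below is the basic 8-neighbour form on a 2D grid of values (for example a depth map or grey-level image), not the adapted variant used in the thesis, which the abstract does not describe. The function names and the clockwise bit ordering are assumptions.

```python
def lbp_code(patch):
    """Basic 8-neighbour Local Binary Pattern of a 3 x 3 patch.

    Each neighbour contributes one bit: 1 if it is >= the centre value.
    Bits are ordered clockwise from the top-left neighbour (assumed order).
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """LBP code for every interior pixel of a 2D grid (list of rows)."""
    h, w = len(img), len(img[0])
    return [
        [lbp_code([row[x - 1:x + 2] for row in img[y - 1:y + 2]])
         for x in range(1, w - 1)]
        for y in range(1, h - 1)
    ]


def lbp_histogram(img):
    """256-bin histogram of LBP codes, usable as a texture descriptor."""
    hist = [0] * 256
    for row in lbp_image(img):
        for code in row:
            hist[code] += 1
    return hist
```

The histogram of codes over a region, rather than the individual codes, is what typically serves as the texture descriptor fed to a classifier.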