972 results for Visual representation
Abstract:
Recovering position from sensor information is an important problem in mobile robotics, known as localisation. Localisation requires a map or some other description of the environment to give the robot a context in which to interpret sensor data. The mobile robot system under discussion uses an artificial neural representation of position. Building a geometrical map of the environment with a single camera and artificial neural networks is difficult; it is simpler to learn position as a function of the visual input. Usually, when learning images, an intermediate representation is employed. An appropriate starting point for a biologically plausible image representation is the complex cells of the visual cortex, which have invariance properties that appear useful for localisation. The effectiveness of two different complex cell models for localisation is evaluated. Finally, the ability of a simple neural network with single-shot learning to recognise these representations and localise a robot is examined.
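The abstract does not specify which complex-cell models were compared, but the classical energy model (a quadrature pair of Gabor-filter simple cells whose squared outputs are summed) is a standard choice. The sketch below, with illustrative filter sizes and wavelengths of my own choosing, shows the phase (local position) invariance that makes such cells attractive for localisation.

```python
import numpy as np

def gabor(size, wavelength, theta, phase):
    """Gabor filter: an oriented sinusoid under a Gaussian envelope."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * (size / 4) ** 2))
    return env * np.cos(2 * np.pi * xr / wavelength + phase)

def complex_cell(patch, wavelength, theta):
    """Energy model: sum of squared responses of a quadrature pair of
    simple cells; largely invariant to the phase of the stimulus."""
    even = np.sum(patch * gabor(patch.shape[0], wavelength, theta, 0.0))
    odd = np.sum(patch * gabor(patch.shape[0], wavelength, theta, np.pi / 2))
    return even**2 + odd**2

# A vertical grating and a shifted copy evoke nearly identical responses.
j = np.arange(32)
grating = np.cos(2 * np.pi * j / 8)[None, :] * np.ones((32, 1))
shifted = np.cos(2 * np.pi * (j + 2) / 8)[None, :] * np.ones((32, 1))
e_orig = complex_cell(grating, 8, 0.0)
e_shift = complex_cell(shifted, 8, 0.0)
print(abs(e_orig - e_shift) / max(e_orig, e_shift))  # small relative change
```

A simple cell (either filter alone) would respond very differently to the two gratings; the pooled energy barely changes, which is the invariance property the abstract refers to.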
Abstract:
In recent years there has been increasing use of visual methods in ageing research. There are, however, limited reflections on and critical explorations of the implications of using visual methods in research with people in mid to later life. This paper examines key methodological complexities in researching the daily lives of people as they grow older, and the possibilities and limitations of using participant-generated visual diaries. The paper draws on our experiences of an empirical study, which included a sample of 62 women and men aged 50 years and over with different daily routines. Participant-led photography was used to create visual diaries, followed by in-depth, photo-elicitation interviews. The paper critically reflects on the use of visual methods for researching the daily lives of people in mid to later life, and suggests some wider tensions within visual methods that warrant attention. First, we explore the extent to which photography facilitates a ‘collaborative’ research process; second, complexities around capturing the ‘everydayness’ of daily routines are explored; third, the representation and presentation of ‘self’ by participants within their images and interview narratives are examined; and, finally, we highlight particular emotional considerations in visualising daily life.
Abstract:
Copyright © 2016 the authors. This research was supported by National Science Foundation INSPIRE Grant 1248076, awarded to Y.L. and A.M.N.
Abstract:
Increasing the size of the training data in many computer vision tasks has proven very effective. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet, and the number increases every day. Dealing with large-scale image sets is demanding in itself: they take a significant amount of memory, which makes it impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be a key to attacking this problem. Being efficient is not enough for image understanding, however; a representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques to learn the binary features from supervised image sets (with different types of semantic supervision: class labels and textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
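The proposal's learned binary codes are not described here, but a generic sketch shows why binary codes make large-scale visual search cheap: compact storage and fast nearest-neighbour retrieval by Hamming distance. The code below uses random-hyperplane hashing (not the authors' supervised features), with illustrative dimensions throughout.

```python
import numpy as np

def binary_codes(features, hyperplanes):
    """One bit per hyperplane: the sign of the feature's projection."""
    return (features @ hyperplanes.T > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
planes = rng.standard_normal((32, 128))   # 32-bit codes from 128-d features
db = rng.standard_normal((1000, 128))     # stand-in for image descriptors
codes = binary_codes(db, planes)          # 4 bytes per image instead of 512

# A near-duplicate of item 42 should land closest in Hamming space.
query = db[42] + 0.01 * rng.standard_normal(128)
qcode = binary_codes(query[None, :], planes)[0]
dists = [hamming_distance(qcode, c) for c in codes]
```

Similar features tend to fall on the same side of most hyperplanes, so similar images get similar codes; comparing codes is a cheap XOR/popcount rather than a 128-dimensional distance.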
Abstract:
The inter-disciplinarity between music and visual art has been explored by leading theorists and philosophers, though very little exists in the area of the visual interpretation of graphic musical scores. This study looks at how the graphics of musical notation and symbols affect the performer in transforming them into sound, with particular reference to contemporary scores that use non-conventional notation to create an interpretation through suggestion. Other sound-visual relationships are explored, including synaesthesia, temporality and the interconnection between work of art and audience or public. This dissertation aims to be an innovative study of contemporary musical scores, from a musical as well as a visual perspective. Finally, it takes a step further with drawings of my own, directly inspired and motivated by the music.
These no longer fulfil a conventionally notational function for the musician, yet the potential for a re-interpretation is ever-present.
Abstract:
Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not detected early. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions, such as microaneurysms, cotton-wool spots and hard exudates. BoVW makes it possible to bypass the need for pre- and post-processing of the retinographic images, as well as the need for specific ad hoc techniques to identify each type of lesion. An extensive evaluation of the BoVW model was performed using three large retinography datasets (DR1, DR2 and Messidor) with different resolutions, collected by different healthcare personnel. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to use a different algorithm for each lesion, reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and on mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification achieved an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions) under a cross-dataset validation protocol. For detecting cases that require referral within one year, the sparse extraction technique combined with semi-soft coding and max pooling obtained an AUC of 94.2 ± 2.0%, outperforming current methods. These results indicate that, for retinal image classification tasks in clinical practice, BoVW equals and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors.
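As a rough illustration of the mid-level coding step in a BoVW pipeline, the sketch below builds a toy visual-word codebook and pools per-descriptor activations with max pooling. It substitutes a fully soft assignment for the paper's semi-soft coding, random vectors for SURF descriptors, and small illustrative sizes throughout.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Toy k-means codebook (stands in for one learned on SURF descriptors)."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            pts = descriptors[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def encode(descriptors, centers, beta=1.0):
    """Soft coding with max pooling: each descriptor activates visual words
    via a Gaussian kernel; the image keeps, per word, the strongest
    activation over all of its descriptors."""
    d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    act = np.exp(-beta * d**2)
    act /= act.sum(axis=1, keepdims=True)  # soft assignment per descriptor
    return act.max(axis=0)                  # max pooling over descriptors

rng = np.random.default_rng(1)
local = rng.standard_normal((200, 16))  # stand-in for SURF descriptors
words = build_codebook(local, k=8)
bovw = encode(local, words)
print(bovw.shape)  # one mid-level feature per visual word
```

The resulting fixed-length vector is what the maximum-margin classifier consumes, regardless of how many local features each retinal image produced.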
Abstract:
The human nervous system constructs a Euclidean representation of near (personal) space by combining multiple sources of information (cues). We investigated the cues used for the representation of personal space in a patient with visual form agnosia (DF). Our results indicated that DF relies predominantly on binocular vergence information when determining the distance of a target despite the presence of other (retinal) cues. Notably, DF was able to construct a Euclidean representation of personal space from vergence alone. This finding supports previous assertions that vergence provides the nervous system with veridical information for the construction of personal space. The results from the current study, together with those of others, suggest that: (i) the ventral stream is responsible for extracting depth and distance information from monocular retinal cues (i.e. from shading, texture, perspective) and (ii) the dorsal stream has access to binocular information (from horizontal image disparities and vergence). These results also indicate that DF was not able to use size information to gauge target distance, suggesting that an intact temporal cortex is necessary for learned size to influence distance processing. Our findings further suggest that in neurologically intact humans, object information extracted in the ventral pathway is combined with the products of dorsal stream processing for guiding prehension. Finally, we studied the size-distance paradox in visual form agnosia in order to explore the cognitive use of size information. The results of this experiment were consistent with a previous suggestion that the paradox is a cognitive phenomenon.
Abstract:
The perception systems of today's autonomous robots are quite complex. The information from the various sensors, located on different parts of the robot, needs to be related to the robot's or the world's frame of reference. For this, knowledge of the pose (position and rotation) between the sensor frames and the robot frame is a critical factor for the robot's performance. The process of calibrating these rotations and translations is called extrinsic parameter calibration. This dissertation proposes the development of an autonomous calibration method for robots with directional cameras, such as the robots of the ISePorto team. The proposed solution consists in acquiring vision, gyroscope and odometry data while the robot performs a manoeuvre around a target with a known pattern. This information is then processed jointly by an Extended Kalman Filter (EKF), which estimates the parameters needed to relate the robot's sensors to its own frame of reference. The solution was evaluated in several tests, and the results obtained were very similar to those of the manual method used previously, with a significant gain in speed and consistency.
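At the heart of such an EKF is a measurement model relating a known world target to the camera through the unknown extrinsics. The sketch below, with made-up 2-D poses (`T_world_robot`, `T_robot_cam` and `target_world` are all illustrative, not from the dissertation), shows the frame composition the filter would repeatedly evaluate when comparing predicted and observed target positions.

```python
import numpy as np

def se2(x, y, theta):
    """Homogeneous transform for a 2-D pose (x, y, heading)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

# Hypothetical numbers: robot pose in the world, camera extrinsics in the
# robot frame (the unknowns the EKF estimates), and a known world target.
T_world_robot = se2(1.0, 2.0, np.pi / 2)
T_robot_cam = se2(0.1, 0.0, 0.0)
target_world = np.array([1.0, 3.0, 1.0])  # homogeneous coordinates

# Measurement model: where the camera should see the target given the
# current extrinsic estimate.
T_world_cam = T_world_robot @ T_robot_cam
target_cam = np.linalg.inv(T_world_cam) @ target_world
print(np.round(target_cam[:2], 3))  # target in the camera frame, ~(0.9, 0.0)
```

The EKF update compares this prediction against the detected pattern position in the image and adjusts `T_robot_cam` accordingly; the manoeuvre around the target supplies the varied viewpoints needed to make the extrinsics observable.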
Abstract:
Doctoral Program in Computer Science
Abstract:
Master's dissertation in Quality Engineering and Management (Engenharia e Gestão da Qualidade)
Abstract:
Impaired visual search is a hallmark of spatial neglect. When searching for a unique feature (e.g., color), neglect patients often show only slight visual field asymmetries. In contrast, when the target is defined by a combination of features (e.g., color and form) they exhibit a severe deficit of contralesional search. This finding suggests a selective impairment of the serial deployment of spatial attention. Here, we examined this deficit with a preview paradigm. Neglect patients searched for a target defined by the conjunction of shape and color, presented together with varying numbers of distracters. The presentation time was varied such that on some trials participants previewed the target together with same-shape/different-color distracters for 300 or 600 ms prior to the appearance of additional different-shape/same-color distracters. On the remaining trials the target and all distracters were shown simultaneously. Healthy participants exhibited a serial search strategy only when all items were presented simultaneously, whereas in both preview conditions a pop-out effect was observed. Neglect patients showed a similar pattern when the target was presented in the right hemifield. In contrast, when searching for a target in the left hemifield they showed serial search in the no-preview condition, as well as with a preview of 300 ms, and partly even at 600 ms. A control experiment suggested that the failure to fully benefit from item preview was probably independent of accurate perception of time. Our results, when viewed in the context of existing literature, lead us to conclude that the visual search deficit in neglect reflects two additive factors: a biased representation of attentional priority in favor of ipsilesional information and exaggerated capture of attention by ipsilesional abrupt onsets.
Abstract:
Prismatic adaptation has been shown to induce a realignment of visuoproprioceptive representations and to involve parietocerebellar networks. We have investigated in humans how far other types of functions known to involve the parietal cortex are influenced by a brief exposure to prismatic adaptation. Normal subjects underwent an fMRI evaluation before and after a brief session of prismatic adaptation using rightward-deviating prisms for one group, or after an equivalent session using plain glasses for the other group. Activation patterns to three tasks were analyzed: (1) visual detection; (2) visuospatial short-term memory; and (3) verbal short-term memory. Prismatic adaptation-related changes were found bilaterally in the inferior parietal lobule when prisms, but not plain glasses, were used. This effect was driven by selective changes during the visual detection task: after prismatic adaptation, neural activity increased on the left and decreased on the right parietal side. Comparison of activation patterns after prismatic adaptation on the visual detection task demonstrated a significant increase of the ipsilateral field representation in the left inferior parietal lobule and a significant decrease in the right inferior parietal lobule. In conclusion, a brief exposure to prismatic adaptation differentially modulates left and right parietal activation during visual detection, but not during short-term memory. Furthermore, the visuospatial representation within the inferior parietal lobule changes, with a decrease of the ipsilateral hemifield representation on the right and an increase on the left side, thus suggesting a left-hemispheric dominance.
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that makes it possible to take the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity, in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
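Each piecewise planar patch can be recovered from reconstructed terrain points by least squares. The sketch below (synthetic points on a known plane, not the paper's bundle-adjusted features) shows a minimal version of such a fit.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through 3-D points,
    a toy stand-in for fitting one piecewise planar terrain patch."""
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return coeffs  # (a, b, c)

# Synthetic terrain points on the plane z = 0.3x - 0.1y + 2, plus noise.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(500, 2))
z = 0.3 * xy[:, 0] - 0.1 * xy[:, 1] + 2.0 + 0.05 * rng.standard_normal(500)
patch = np.c_[xy, z]

a, b, c = fit_plane(patch)
print(np.round([a, b, c], 2))  # recovered plane coefficients
```

Representing the surface as a handful of such patches keeps the elevation model compact enough to ortho-rectify large seafloor areas.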
Abstract:
The aim of the present study was to determine whether and how rats can use local olfactory cues for spatial orientation. Rats were trained in an eight-arm radial maze under different conditions as defined by the presence or absence of supplementary olfactory cues marking each arm, the availability of distant visuospatial information, and the illumination of the maze (light or darkness). The different visual conditions were designed to dissociate among the effects of light per se and those of visuospatial cues, on the use of olfactory cues for accurate arm choice. Different procedures with modifications of the arrangement of olfactory cues were used to determine if rats formed a representation of the spatial configuration of the olfactory cues and if they could rely on such a representation for accurate arm choice in the radial maze. The present study demonstrated that the use of olfactory cues to direct arm choice in the radial arm maze was critically dependent on the illumination conditions and implied two different modes of processing of olfactory information according to the presence or the absence of light. Olfactory cues were used in an explicit manner and enabled accurate arm choice only in the absence of light. Rats, however, had an implicit memory of the location of the olfactory cues and formed a representation of the spatial position of these cues, whatever the lighting conditions. They did not memorize the spatial configuration of the olfactory cues per se but needed these cues to be linked to the external spatial frame of reference.