979 resultados para Visual image
Resumo:
This thesis deals with Visual Servoing and its strictly connected disciplines like projective geometry, image processing, robotics and non-linear control. More specifically the work addresses the problem to control a robotic manipulator through one of the largely used Visual Servoing techniques: the Image Based Visual Servoing (IBVS). In Image Based Visual Servoing the robot is driven by on-line performing a feedback control loop that is closed directly in the 2D space of the camera sensor. The work considers the case of a monocular system with the only camera mounted on the robot end effector (eye in hand configuration). Through IBVS the system can be positioned with respect to a 3D fixed target by minimizing the differences between its initial view and its goal view, corresponding respectively to the initial and the goal system configurations: the robot Cartesian Motion is thus generated only by means of visual informations. However, the execution of a positioning control task by IBVS is not straightforward because singularity problems may occur and local minima may be reached where the reached image is very close to the target one but the 3D positioning task is far from being fulfilled: this happens in particular for large camera displacements, when the the initial and the goal target views are noticeably different. To overcame singularity and local minima drawbacks, maintaining the good properties of IBVS robustness with respect to modeling and camera calibration errors, an opportune image path planning can be exploited. This work deals with the problem of generating opportune image plane trajectories for tracked points of the servoing control scheme (a trajectory is made of a path plus a time law). The generated image plane paths must be feasible i.e. they must be compliant with rigid body motion of the camera with respect to the object so as to avoid image jacobian singularities and local minima problems. In addition, the image planned trajectories must generate camera velocity screws which are smooth and within the allowed bounds of the robot. We will show that a scaled 3D motion planning algorithm can be devised in order to generate feasible image plane trajectories. Since the paths in the image are off-line generated it is also possible to tune the planning parameters so as to maintain the target inside the camera field of view even if, in some unfortunate cases, the feature target points would leave the camera images due to 3D robot motions. To test the validity of the proposed approach some both experiments and simulations results have been reported taking also into account the influence of noise in the path planning strategy. The experiments have been realized with a 6DOF anthropomorphic manipulator with a fire-wire camera installed on its end effector: the results demonstrate the good performances and the feasibility of the proposed approach.
Resumo:
Quantification of protein expression based on immunohistochemistry (IHC) is an important step in clinical diagnoses and translational tissue-based research. Manual scoring systems are used in order to evaluate protein expression based on staining intensities and distribution patterns. However, visual scoring remains an inherently subjective approach. The aim of our study was to explore whether digital image analysis proves to be an alternative or even superior tool to quantify expression of membrane-bound proteins. We analyzed five membrane-binding biomarkers (HER2, EGFR, pEGFR, β-catenin, and E-cadherin) and performed IHC on tumor tissue microarrays from 153 esophageal adenocarcinomas patients from a single center study. The tissue cores were scored visually applying an established routine scoring system as well as by using digital image analysis obtaining a continuous spectrum of average staining intensity. Subsequently, we compared both assessments by survival analysis as an end point. There were no significant correlations with patient survival using visual scoring of β-catenin, E-cadherin, pEGFR, or HER2. In contrast, the results for digital image analysis approach indicated that there were significant associations with disease-free survival for β-catenin, E-cadherin, pEGFR, and HER2 (P = 0.0125, P = 0.0014, P = 0.0299, and P = 0.0096, respectively). For EGFR, there was a greater association with patient survival when digital image analysis was used compared to when visual scoring was (visual: P = 0.0045, image analysis: P < 0.0001). The results of this study indicated that digital image analysis was superior to visual scoring. Digital image analysis is more sensitive and, therefore, better able to detect biological differences within the tissues with greater accuracy. This increased sensitivity improves the quality of quantification.
Resumo:
A proposal for a model of the primary visual cortex is reported. It is structured with the basis of a simple unit cell able to perform fourteen pairs of different boolean functions corresponding to the two possible inputs. As a first step, a model of the retina is presented. Different types of responses, according to the different possibilities of interconnecting the building blocks, have been obtained. These responses constitute the basis for an initial configuration of the mammalian primary visual cortex. Some qualitative functions, as symmetry or size of an optical input, have been obtained. A proposal to extend this model to some higher functions, concludes the paper.
Resumo:
Vision extracts useful information from images. Reconstructing the three-dimensional structure of our environment and recognizing the objects that populate it are among the most important functions of our visual system. Computer vision researchers study the computational principles of vision and aim at designing algorithms that reproduce these functions. Vision is difficult: the same scene may give rise to very different images depending on illumination and viewpoint. Typically, an astronomical number of hypotheses exist that in principle have to be analyzed to infer a correct scene description. Moreover, image information might be extracted at different levels of spatial and logical resolution dependent on the image processing task. Knowledge of the world allows the visual system to limit the amount of ambiguity and to greatly simplify visual computations. We discuss how simple properties of the world are captured by the Gestalt rules of grouping, how the visual system may learn and organize models of objects for recognition, and how one may control the complexity of the description that the visual system computes.
Resumo:
Retinal image quality is commonly analyzed through parameters inherited from instrumental optics. These parameters are defined for ‘good optics’ so they are hard to translate into visual quality metrics. Instead of using point or artificial functions, we propose a quality index that takes into account properties of natural images. These images usually show strong local correlations that help to interpret the image. Our aim is to derive an objective index that quantifies the quality of vision by taking into account the local structure of the scene, instead of focusing on a particular aberration. As we show, this index highly correlates with visual acuity and allows inter-comparison of natural images around the retina. The usefulness of the index is proven through the analysis of real eyes before and after undergoing corneal surgery, which usually are hard to analyze with standard metrics.
Resumo:
A large part of the new generation of computer numerical control systems has adopted an architecture based on robotic systems. This architecture improves the implementation of many manufacturing processes in terms of flexibility, efficiency, accuracy and velocity. This paper presents a 4-axis robot tool based on a joint structure whose primary use is to perform complex machining shapes in some non-contact processes. A new dynamic visual controller is proposed in order to control the 4-axis joint structure, where image information is used in the control loop to guide the robot tool in the machining task. In addition, this controller eliminates the chaotic joint behavior which appears during tracking of the quasi-repetitive trajectories required in machining processes. Moreover, this robot tool can be coupled to a manipulator robot in order to form a multi-robot platform for complex manufacturing tasks. Therefore, the robot tool could perform a machining task using a piece grasped from the workspace by a manipulator robot. This manipulator robot could be guided by using visual information given by the robot tool, thereby obtaining an intelligent multi-robot platform controlled by only one camera.
Resumo:
The work presented in this thesis is divided into two distinct sections. In the first, the functional neuroimaging technique of Magnetoencephalography (MEG) is described and a new technique is introduced for accurate combination of MEG and MRI co-ordinate systems. In the second part of this thesis, MEG and the analysis technique of SAM are used to investigate responses of the visual system in the context of functional specialisation within the visual cortex. In chapter one, the sources of MEG signals are described, followed by a brief description of the necessary instrumentation for accurate MEG recordings. This chapter is concluded by introducing the forward and inverse problems of MEG, techniques to solve the inverse problem, and a comparison of MEG with other neuroimaging techniques. Chapter two provides an important contribution to the field of research with MEG. Firstly, it is described how MEG and MRI co-ordinate systems are combined for localisation and visualisation of activated brain regions. A previously used co-registration methods is then described, and a new technique is introduced. In a series of experiments, it is demonstrated that using fixed fiducial points provides a considerable improvement in the accuracy and reliability of co-registration. Chapter three introduces the visual system starting from the retina and ending with the higher visual rates. The functions of the magnocellular and the parvocellular pathways are described and it is shown how the parallel visual pathways remain segregated throughout the visual system. The structural and functional organisation of the visual cortex is then described. Chapter four presents strong evidence in favour of the link between conscious experience and synchronised brain activity. The spatiotemporal responses of the visual cortex are measured in response to specific gratings. It is shown that stimuli that induce visual discomfort and visual illusions share their physical properties with those that induce highly synchronised gamma frequency oscillations in the primary visual cortex. Finally chapter five is concerned with localization of colour in the visual cortex. In this first ever use of Synthetic Aperture Magnetometry to investigate colour processing in the visual cortex, it is shown that in response to isoluminant chromatic gratings, the highest magnitude of cortical activity arise from area V2.
Resumo:
Presentation Purpose:To relate structural change to functional change in age-related macular degeneration (AMD) in a cross-sectional population using fundus imaging and the visual field status. Methods:10 degree standard and SWAP visual fields and other standard functional clinical measures were acquired in 44 eyes of 27 patients at various stages of AMD, as well as fundus photographs. Retro-mode SLO images were captured in a subset of 29 eyes of 19 of the patients. Drusen area, measured by automated drusen segmentation software (Smith et al. 2005) was correlated with visual field data. Visual field defect position was compared to the position of the imaged drusen and deposits using custom software. Results:The effect of AMD stage on drusen area within the 6000µm was significant (One-way ANOVA: F = 17.231, p < 0.001), however the trend was not strong across all stages. There were significant linear relationships between visual field parameters and drusen area. The mean deviation (MD) declined by 3.00dB and 3.92dB for each log % drusen area for standard perimetry and SWAP, respectively. The visual field parameters of focal loss displayed the strongest correlations with drusen area. The number of pattern deviation (PD) defects increased by 9.30 and 9.68 defects per log % drusen area for standard perimetry and SWAP, respectively. Weaker correlations were found between drusen area and visual acuity, contrast sensitivity, colour vision and reading speed. 72.6% of standard PD defects and 65.2% of SWAP PD defects coincided with retinal signs of AMD on fundus photography. 67.5% of standard PD defects and 69.7% of SWAP PD defects coincided with deposits on retro-mode images. Conclusions:Perimetry exhibited a stronger relationship with drusen area than other measures of visual function. The structure-function relationship between visual field parameters and drusen area was linear. Overall the indices of focal loss had a stronger correlation with drusen area in SWAP than in standard perimetry. Visual field defects had a high coincidence proportion with retinal manifestations of AMD.Smith R.T. et al. (2005) Arch Ophthalmol 123:200-206.
Resumo:
With the progress of computer technology, computers are expected to be more intelligent in the interaction with humans, presenting information according to the user's psychological and physiological characteristics. However, computer users with visual problems may encounter difficulties on the perception of icons, menus, and other graphical information displayed on the screen, limiting the efficiency of their interaction with computers. In this dissertation, a personalized and dynamic image precompensation method was developed to improve the visual performance of the computer users with ocular aberrations. The precompensation was applied on the graphical targets before presenting them on the screen, aiming to counteract the visual blurring caused by the ocular aberration of the user's eye. A complete and systematic modeling approach to describe the retinal image formation of the computer user was presented, taking advantage of modeling tools, such as Zernike polynomials, wavefront aberration, Point Spread Function and Modulation Transfer Function. The ocular aberration of the computer user was originally measured by a wavefront aberrometer, as a reference for the precompensation model. The dynamic precompensation was generated based on the resized aberration, with the real-time pupil diameter monitored. The potential visual benefit of the dynamic precompensation method was explored through software simulation, with the aberration data from a real human subject. An "artificial eye'' experiment was conducted by simulating the human eye with a high-definition camera, providing objective evaluation to the image quality after precompensation. In addition, an empirical evaluation with 20 human participants was also designed and implemented, involving image recognition tests performed under a more realistic viewing environment of computer use. The statistical analysis results of the empirical experiment confirmed the effectiveness of the dynamic precompensation method, by showing significant improvement on the recognition accuracy. The merit and necessity of the dynamic precompensation were also substantiated by comparing it with the static precompensation. The visual benefit of the dynamic precompensation was further confirmed by the subjective assessments collected from the evaluation participants.
Resumo:
This PhD by publication examines selected practice-based audio-visual works made by the author over a ten-year period, placing them in a critical context. Central to the publications, and the focus of the thesis, is an exploration of the role of sound in the creation of dialectic tension between the audio, the visual and the audience. By first analysing a number of texts (films/videos and key writings) the thesis locates the principal issues and debates around the use of audio in artists’ moving image practice. From this it is argued that asynchronism, first advocated in 1929 by Pudovkin as a response to the advent of synchronised sound, can be used to articulate audio-visual relationships. Central to asynchronism’s application in this paper is a recognition of the propensity for sound and image to adhere, and in visual music for there to be a literal equation of audio with the visual, often married with a quest for the synaesthetic. These elements can either be used in an illusionist fashion, or employed as part of an anti-illusionist strategy for realising dialectic. Using this as a theoretical basis, the paper examines how the publications implement asynchronism, including digital mapping to facilitate innovative reciprocal sound and image combinations, and the asynchronous use of ‘found sound’ from a range of online sources to reframe the moving image. The synthesis of publications and practice demonstrates that asynchronism can both underpin the creation of dialectic, and be an integral component in an audio-visual anti-illusionist methodology.
Resumo:
With the progress of computer technology, computers are expected to be more intelligent in the interaction with humans, presenting information according to the user's psychological and physiological characteristics. However, computer users with visual problems may encounter difficulties on the perception of icons, menus, and other graphical information displayed on the screen, limiting the efficiency of their interaction with computers. In this dissertation, a personalized and dynamic image precompensation method was developed to improve the visual performance of the computer users with ocular aberrations. The precompensation was applied on the graphical targets before presenting them on the screen, aiming to counteract the visual blurring caused by the ocular aberration of the user's eye. A complete and systematic modeling approach to describe the retinal image formation of the computer user was presented, taking advantage of modeling tools, such as Zernike polynomials, wavefront aberration, Point Spread Function and Modulation Transfer Function. The ocular aberration of the computer user was originally measured by a wavefront aberrometer, as a reference for the precompensation model. The dynamic precompensation was generated based on the resized aberration, with the real-time pupil diameter monitored. The potential visual benefit of the dynamic precompensation method was explored through software simulation, with the aberration data from a real human subject. An "artificial eye'' experiment was conducted by simulating the human eye with a high-definition camera, providing objective evaluation to the image quality after precompensation. In addition, an empirical evaluation with 20 human participants was also designed and implemented, involving image recognition tests performed under a more realistic viewing environment of computer use. The statistical analysis results of the empirical experiment confirmed the effectiveness of the dynamic precompensation method, by showing significant improvement on the recognition accuracy. The merit and necessity of the dynamic precompensation were also substantiated by comparing it with the static precompensation. The visual benefit of the dynamic precompensation was further confirmed by the subjective assessments collected from the evaluation participants.
Resumo:
The abundance of visual data and the push for robust AI are driving the need for automated visual sensemaking. Computer Vision (CV) faces growing demand for models that can discern not only what images "represent," but also what they "evoke." This is a demand for tools mimicking human perception at a high semantic level, categorizing images based on concepts like freedom, danger, or safety. However, automating this process is challenging due to entropy, scarcity, subjectivity, and ethical considerations. These challenges not only impact performance but also underscore the critical need for interoperability. This dissertation focuses on abstract concept-based (AC) image classification, guided by three technical principles: situated grounding, performance enhancement, and interpretability. We introduce ART-stract, a novel dataset of cultural images annotated with ACs, serving as the foundation for a series of experiments across four key domains: assessing the effectiveness of the end-to-end DL paradigm, exploring cognitive-inspired semantic intermediaries, incorporating cultural and commonsense aspects, and neuro-symbolic integration of sensory-perceptual data with cognitive-based knowledge. Our results demonstrate that integrating CV approaches with semantic technologies yields methods that surpass the current state of the art in AC image classification, outperforming the end-to-end deep vision paradigm. The results emphasize the role semantic technologies can play in developing both effective and interpretable systems, through the capturing, situating, and reasoning over knowledge related to visual data. Furthermore, this dissertation explores the complex interplay between technical and socio-technical factors. By merging technical expertise with an understanding of human and societal aspects, we advocate for responsible labeling and training practices in visual media. These insights and techniques not only advance efforts in CV and explainable artificial intelligence but also propel us toward an era of AI development that harmonizes technical prowess with deep awareness of its human and societal implications.