979 results for Visual image
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Huge image collections have recently become available. In this scenario, Content-Based Image Retrieval (CBIR) systems have emerged as a promising approach to support image searches. The objective of CBIR systems is to retrieve the images in a collection that are most similar to a given query image, by taking into account visual properties such as texture, color, and shape. In these systems, the effectiveness of the retrieval process depends heavily on the accuracy of the ranking approach. Recently, re-ranking approaches have been proposed to improve the effectiveness of CBIR systems by taking into account the relationships among all images in a given dataset. These approaches typically demand a huge amount of computational power, which hampers their use in practical situations. On the other hand, these methods can be massively parallelized. In this paper, we propose to speed up the computation of the RL-Sim algorithm, a recently proposed image re-ranking approach, by using the computational power of Graphics Processing Units (GPUs). GPUs are emerging as relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. We address the image re-ranking performance challenges by proposing a parallel solution designed to fit the computational model of GPUs. We conducted an experimental evaluation considering different implementations and devices. Experimental results demonstrate that significant performance gains can be obtained. Our approach achieves speedups of 7x over the serial implementation for the overall algorithm and up to 36x on its core steps.
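To illustrate the general idea of re-ranking from relationships among images, the sketch below re-orders each image's ranked list by how much its top-k neighbor set overlaps with that of every other image. It is not the RL-Sim algorithm itself, whose details the abstract does not give; the function names, the overlap measure, and the tie-breaking rule are all assumptions made for illustration.

```python
def top_k(dist_row, k):
    """Indices of the k nearest images for one row of a distance matrix."""
    return sorted(range(len(dist_row)), key=lambda j: dist_row[j])[:k]


def rerank_by_list_overlap(dist, k):
    """Hypothetical one-pass re-ranking based on top-k neighbor-set overlap.

    dist is a square list-of-lists distance matrix. For each image i, the
    other images are re-sorted so that those sharing more top-k neighbors
    with i come first; ties fall back to the original distance.
    """
    n = len(dist)
    neighbors = [set(top_k(dist[i], k)) for i in range(n)]
    new_ranks = []
    for i in range(n):
        # Each row is computed independently of the others, which is the
        # property that makes this kind of method easy to parallelize.
        order = sorted(
            range(n),
            key=lambda j: (-len(neighbors[i] & neighbors[j]), dist[i][j]),
        )
        new_ranks.append(order)
    return new_ranks


# Toy data (made up): four "images" placed on a line at 0, 1, 10 and 11.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]
ranks = rerank_by_list_overlap(dist, k=2)
```

Because every row of the result depends only on the precomputed neighbor sets, the outer loop could in principle be mapped to one GPU thread or block per query image.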
Abstract:
Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest, which are mapped to a Hilbert space representing a visual dictionary that aims to comprise the most discriminative features in a set of images. However, the main problem of such approaches is finding a compact and representative dictionary. Finding such a representative dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation.
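As a minimal sketch of how an image is encoded once a visual dictionary exists, the code below assigns each local descriptor to its nearest visual word and returns a normalized histogram. The dictionary is assumed to be given; the Optimum-Path Forest clustering that the abstract proposes for building it is not reproduced here, and the function name and toy data are hypothetical.

```python
import math


def bovw_histogram(descriptors, dictionary):
    """Encode an image as a normalized histogram of visual-word counts.

    descriptors: feature vectors extracted at the image's points of interest.
    dictionary:  visual words (cluster centers), assumed precomputed.
    """
    counts = [0] * len(dictionary)
    for d in descriptors:
        # Hard assignment: pick the closest word by Euclidean distance.
        nearest = min(
            range(len(dictionary)),
            key=lambda w: math.dist(d, dictionary[w]),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [c / total for c in counts]


# Toy data (made up): a two-word dictionary and four 2-D descriptors.
dictionary = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (11.0, 10.0)]
histogram = bovw_histogram(descriptors, dictionary)
```

The resulting fixed-length histogram is what a classifier would consume, regardless of how many points of interest each image yields.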
Abstract:
The aim of this study was to investigate the reliability of visual and digital methods to assess marginal microleakage in vitro. Materials and Methods: Typical Class V preparations were made in bovine teeth and filled with composite resin. After dye penetration (0.5% basic fuchsin), the teeth were sectioned and the 53 fragments obtained were assessed by visual (stereomicroscope) and digital methods (Image Tool Software®, ITS; University of Texas Health Science Center, San Antonio Dental School, USA). Two calibrated examiners (A and B) evaluated dye penetration by means of a stereomicroscope at ×20 magnification (scores) and by the ITS (millimeters). Intra- and inter-examiner agreement was estimated using Kappa statistics (κ) and the intraclass correlation coefficient (ρ). Results: For the visual method, intra-examiner agreement was almost perfect for examiner A (κA = 0.87) and substantial for examiner B (κB = 0.76). Inter-examiner agreement was almost perfect (κ = 0.84). For the digital method, intra-examiner agreement was almost perfect for both examiners (ρ = 0.99), as was inter-examiner agreement. Conclusion: Visual (stereomicroscope) and digital (ITS) methods showed high levels of intra- and inter-examiner reproducibility when marginal microleakage was assessed.
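For readers unfamiliar with the Kappa statistic, the sketch below computes unweighted Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The abstract does not say which kappa variant was used (a weighted kappa is also common for ordinal scores), so the unweighted form is an assumption, and the scores shown are made up rather than taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical dye-penetration scores from two examiners (not study data).
examiner_a = [0, 0, 1, 1]
examiner_b = [0, 1, 0, 1]
kappa = cohen_kappa(examiner_a, examiner_b)
```

In this toy case the examiners agree on half the items, which is exactly what chance predicts for these marginals, so kappa is 0 even though raw agreement is 50%.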
Abstract:
Relevance feedback approaches have been established as an important tool for interactive search, enabling users to express their needs. However, as the available multimedia collections grow, the user effort required by these methods tends to increase as well, demanding approaches that reduce the need for user interaction. In this context, this paper proposes a semi-supervised learning algorithm for relevance feedback to be used in image retrieval tasks. The proposed semi-supervised algorithm aims at using both supervised and unsupervised approaches simultaneously. While a supervised step is performed using the information collected from the user feedback, an unsupervised step exploits the intrinsic dataset structure, which is represented in terms of ranked lists of images. Several experiments were conducted for different image retrieval tasks involving shape, color, and texture descriptors and different datasets. The proposed approach was also evaluated on multimodal retrieval tasks, considering visual and textual descriptors. Experimental results demonstrate the effectiveness of the proposed approach.
Abstract:
Visual identity is based on a semantic relationship among several signs that make up a coherent system. A bimedia language, formed by text and image, is combined to create an understandable message. This study addresses the use of non-verbal communication in corporate visual identity design projects, contextualizing the role of the designer as a mediator of the corporate informational message to its audiences.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Graduate Program in Arts - IA
Abstract:
Graduate Program in Arts - IA
Abstract:
A single picture provides a largely incomplete representation of the scene one is looking at. It usually reproduces only a limited spatial portion of the scene, determined by the standpoint and the viewing angle, and it contains only instantaneous information. Thus very little can be understood about the geometrical structure of the scene, and the position and orientation of the observer with respect to it also remain hard to guess. When multiple views, taken from different positions in space and time, observe the same scene, a much deeper knowledge is potentially achievable. Understanding inter-view relations enables the construction of a collective representation by fusing the information contained in every single image. Visual reconstruction methods confront the formidable, and still unanswered, challenge of delivering a comprehensive representation of the structure, motion and appearance of a scene from visual information. Multi-view visual reconstruction deals with the inference of relations among multiple views and the exploitation of the revealed connections to attain the best possible representation. This thesis investigates novel methods and applications in the field of visual reconstruction from multiple views. Three main threads of research have been pursued: dense geometric reconstruction, camera pose reconstruction, and sparse geometric reconstruction of deformable surfaces. Dense geometric reconstruction aims at delivering the appearance of a scene at every single point. The construction of a large panoramic image from a set of traditional pictures has been extensively studied in the context of image mosaicing techniques. An original algorithm for sequential registration suitable for real-time applications has been conceived. The integration of the algorithm into a visual surveillance system has led to robust and efficient motion detection with Pan-Tilt-Zoom cameras.
Moreover, an evaluation methodology for quantitatively assessing and comparing image mosaicing algorithms has been devised and made available to the community. Camera pose reconstruction deals with the recovery of the camera trajectory across an image sequence. A novel mosaic-based pose reconstruction algorithm has been conceived that exploits image mosaics and traditional pose estimation algorithms to deliver more accurate estimates. An innovative markerless vision-based human-machine interface has also been proposed, allowing a user to interact with a gaming application by moving a hand-held consumer-grade camera in unstructured environments. Finally, sparse geometric reconstruction refers to the computation of the coarse geometry of an object at a few preset points. In this thesis, an innovative shape reconstruction algorithm for deformable objects has been designed. A cooperation with the Solar Impulse project made it possible to deploy the algorithm in a very challenging real-world scenario, i.e. the accurate measurement of airplane wing deformations.
Abstract:
Visual correspondence is a key computer vision task that aims at identifying projections of the same 3D point into images taken either from different viewpoints or at different time instants. This task has been the subject of intense research activity in recent years in scenarios such as object recognition, motion detection, stereo vision, pattern matching, and image registration. The approaches proposed in the literature typically aim at improving the state of the art by increasing the reliability, the accuracy or the computational efficiency of visual correspondence algorithms. The research work carried out during the Ph.D. course and presented in this dissertation deals with three specific visual correspondence problems: fast pattern matching, stereo correspondence and robust image matching. The dissertation presents original contributions to the theory of visual correspondence, as well as applications dealing with 3D reconstruction and multi-view video surveillance.
Abstract:
Images of a scene, static or dynamic, are generally acquired at different epochs from different viewpoints. They potentially gather information about the whole scene and its relative motion with respect to the acquisition device. Data from different visual sources (different in the spatial or temporal domain) can be fused together to provide a unique, consistent representation of the whole scene, even recovering the third dimension, permitting a more complete understanding of the scene content. Moreover, the pose of the acquisition device can be obtained by estimating the relative motion parameters linking different views, thus providing localization information for automatic guidance purposes. Image registration is based on the use of pattern recognition techniques to match corresponding parts of different views of the acquired scene. Depending on hypotheses or prior information about the sensor model, the motion model and/or the scene model, this information can be used to estimate global or local geometrical mapping functions between different images or different parts of them. These mapping functions contain relative motion parameters between the scene and the sensor(s) and can be used to integrate the information coming from the different sources to build a wider or even augmented representation of the scene. Because of these scene reconstruction and pose estimation capabilities, image registration techniques from multiple views are attracting increasing interest from the scientific and industrial communities. Depending on the application domain, the accuracy, robustness, and computational load of the algorithms are important issues to be addressed, and generally a trade-off among them has to be reached. Moreover, on-line performance is desirable in order to guarantee the direct interaction of the vision device with human actors or control systems.
This thesis follows a general research approach to cope with these issues, almost independently of the scene content, under the constraint of rigid motions. This approach has been motivated by portability to very different domains, which is a very desirable property. A general image registration approach suitable for on-line applications has been devised and assessed through two challenging case studies in different application domains. The first case study concerns scene reconstruction through on-line mosaicing of optical microscopy cell images acquired with non-automated equipment, while manually moving the microscope holder. By registering the images, the field of view of the microscope can be widened, preserving the resolution while reconstructing the whole cell culture and permitting the microscopist to explore the cell culture interactively. In the second case study, the registration of images of the Earth acquired by a camera rigidly mounted on a satellite is used to estimate the satellite's three-dimensional orientation from visual data, for automatic guidance purposes. Critical aspects of these applications are emphasized and the choices adopted are motivated accordingly. Results are discussed in view of promising future developments.
Abstract:
Generic object recognition is an important function of the human visual system, and everybody finds it highly useful in everyday life. For an artificial vision system it is a hard, complex and challenging task, because instances of the same object category can generate very different images depending on variables such as illumination conditions, the pose of the object, the viewpoint of the camera, partial occlusions, and unrelated background clutter. The purpose of this thesis is to develop a system that is able to classify objects in 2D images based on the context and to identify the category to which each object belongs. Given an image, the system can classify it and decide the correct category of the object. A further objective of this thesis is to test the performance and the precision of different supervised Machine Learning algorithms in this specific task of object image categorization. Across different experiments, the implemented application shows good categorization performance despite the difficulty of the problem. However, this project is open to future improvement: newer algorithms could be implemented, or other feature extraction techniques used, to make the system more reliable. The application can be installed in an embedded system and, once trained (training being performed outside the system), can classify objects in real time. The information given by a 3D stereo camera, developed within the Department of Computer Engineering of the University of Bologna, can be used to improve the accuracy of the classification task. The idea is to segment a single object in a scene using the depth provided by the stereo camera and in this way make the classification more accurate.
Abstract:
Presenting visual feedback for image-guided surgery on a monitor requires the surgeon to perform time-consuming comparisons and to divert sight and attention away from the patient. Deficiencies in previously developed augmented reality systems for image-guided surgery have, however, prevented the general acceptance of any one technique as a viable alternative to monitor displays. This work presents an evaluation of the feasibility and versatility of a novel augmented reality approach for the visualisation of surgical planning and navigation data. The approach, which utilises a portable image overlay device, was evaluated during integration into existing surgical navigation systems and during application within simulated navigated surgery scenarios.
Abstract:
Background: Despite the increasingly higher spatial and contrast resolution of CT, nodular lesions are prone to be missed on chest CT. Tinted lenses increase visual acuity and contrast sensitivity by filtering short-wavelength light of solar and artificial origin. Purpose: To test the impact of Gunnar eyewear, image quality (standard versus low-dose CT) and nodule location on the detectability of lung nodules in CT and to compare their individual influence. Material and Methods: A pre-existing database of CT images of patients with lung nodules >5 mm, scanned with standard-dose image quality (150 ref mAs/120 kVp) and lower dose/quality (40 ref mAs/120 kVp), was used. Five radiologists read 60 chest CTs twice: once with Gunnar glasses and once without glasses, with a 1-month break in between. At both read-outs the cases were shown at lower dose or standard dose level to quantify the influence of both variables (eyewear vs. image quality) on nodule sensitivity. Results: The sensitivity of CT for lung nodules increased significantly using Gunnar eyewear for two readers and insignificantly for two other readers. Overall, the mean sensitivity of all radiologists rose significantly from 50% to 53% using the glasses (P value = 0.034). In contrast, sensitivity for lung nodules was not significantly affected by lowering the image quality from 150 to 40 ref mAs. The average sensitivity was 52% at low dose level, which was even 0.7% higher than at standard dose level (P value = 0.40). The factors reader and nodule location (lung segment) had the strongest impact on sensitivity. Conclusion: Sensitivity for lung nodules was significantly enhanced by Gunnar eyewear (+3%), while lower image quality (40 ref mAs) had no impact on nodule sensitivity. Not using the glasses had a bigger impact on sensitivity than lowering the image quality.