919 resultados para Image-Based Visual Hull


Relevância:

50.00% 50.00%

Publicador:

Resumo:

With the progress of computer technology, computers are expected to be more intelligent in the interaction with humans, presenting information according to the user's psychological and physiological characteristics. However, computer users with visual problems may encounter difficulties on the perception of icons, menus, and other graphical information displayed on the screen, limiting the efficiency of their interaction with computers. In this dissertation, a personalized and dynamic image precompensation method was developed to improve the visual performance of the computer users with ocular aberrations. The precompensation was applied on the graphical targets before presenting them on the screen, aiming to counteract the visual blurring caused by the ocular aberration of the user's eye. A complete and systematic modeling approach to describe the retinal image formation of the computer user was presented, taking advantage of modeling tools, such as Zernike polynomials, wavefront aberration, Point Spread Function and Modulation Transfer Function. The ocular aberration of the computer user was originally measured by a wavefront aberrometer, as a reference for the precompensation model. The dynamic precompensation was generated based on the resized aberration, with the real-time pupil diameter monitored. The potential visual benefit of the dynamic precompensation method was explored through software simulation, with the aberration data from a real human subject. An "artificial eye'' experiment was conducted by simulating the human eye with a high-definition camera, providing objective evaluation to the image quality after precompensation. In addition, an empirical evaluation with 20 human participants was also designed and implemented, involving image recognition tests performed under a more realistic viewing environment of computer use. The statistical analysis results of the empirical experiment confirmed the effectiveness of the dynamic precompensation method, by showing significant improvement on the recognition accuracy. The merit and necessity of the dynamic precompensation were also substantiated by comparing it with the static precompensation. The visual benefit of the dynamic precompensation was further confirmed by the subjective assessments collected from the evaluation participants.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

This PhD by publication examines selected practice-based audio-visual works made by the author over a ten-year period, placing them in a critical context. Central to the publications, and the focus of the thesis, is an exploration of the role of sound in the creation of dialectic tension between the audio, the visual and the audience. By first analysing a number of texts (films/videos and key writings) the thesis locates the principal issues and debates around the use of audio in artists’ moving image practice. From this it is argued that asynchronism, first advocated in 1929 by Pudovkin as a response to the advent of synchronised sound, can be used to articulate audio-visual relationships. Central to asynchronism’s application in this paper is a recognition of the propensity for sound and image to adhere, and in visual music for there to be a literal equation of audio with the visual, often married with a quest for the synaesthetic. These elements can either be used in an illusionist fashion, or employed as part of an anti-illusionist strategy for realising dialectic. Using this as a theoretical basis, the paper examines how the publications implement asynchronism, including digital mapping to facilitate innovative reciprocal sound and image combinations, and the asynchronous use of ‘found sound’ from a range of online sources to reframe the moving image. The synthesis of publications and practice demonstrates that asynchronism can both underpin the creation of dialectic, and be an integral component in an audio-visual anti-illusionist methodology.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Au cours des dernières décennies, l’effort sur les applications de capteurs infrarouges a largement progressé dans le monde. Mais, une certaine difficulté demeure, en ce qui concerne le fait que les objets ne sont pas assez clairs ou ne peuvent pas toujours être distingués facilement dans l’image obtenue pour la scène observée. L’amélioration de l’image infrarouge a joué un rôle important dans le développement de technologies de la vision infrarouge de l’ordinateur, le traitement de l’image et les essais non destructifs, etc. Cette thèse traite de la question des techniques d’amélioration de l’image infrarouge en deux aspects, y compris le traitement d’une seule image infrarouge dans le domaine hybride espacefréquence, et la fusion d’images infrarouges et visibles employant la technique du nonsubsampled Contourlet transformer (NSCT). La fusion d’images peut être considérée comme étant la poursuite de l’exploration du modèle d’amélioration de l’image unique infrarouge, alors qu’il combine les images infrarouges et visibles en une seule image pour représenter et améliorer toutes les informations utiles et les caractéristiques des images sources, car une seule image ne pouvait contenir tous les renseignements pertinents ou disponibles en raison de restrictions découlant de tout capteur unique de l’imagerie. Nous examinons et faisons une enquête concernant le développement de techniques d’amélioration d’images infrarouges, et ensuite nous nous consacrons à l’amélioration de l’image unique infrarouge, et nous proposons un schéma d’amélioration de domaine hybride avec une méthode d’évaluation floue de seuil amélioré, qui permet d’obtenir une qualité d’image supérieure et améliore la perception visuelle humaine. Les techniques de fusion d’images infrarouges et visibles sont établies à l’aide de la mise en oeuvre d’une mise en registre précise des images sources acquises par différents capteurs. L’algorithme SURF-RANSAC est appliqué pour la mise en registre tout au long des travaux de recherche, ce qui conduit à des images mises en registre de façon très précise et des bénéfices accrus pour le traitement de fusion. Pour les questions de fusion d’images infrarouges et visibles, une série d’approches avancées et efficaces sont proposés. Une méthode standard de fusion à base de NSCT multi-canal est présente comme référence pour les approches de fusion proposées suivantes. Une approche conjointe de fusion, impliquant l’Adaptive-Gaussian NSCT et la transformée en ondelettes (Wavelet Transform, WT) est propose, ce qui conduit à des résultats de fusion qui sont meilleurs que ceux obtenus avec les méthodes non-adaptatives générales. Une approche de fusion basée sur le NSCT employant la détection comprime (CS, compressed sensing) et de la variation totale (TV) à des coefficients d’échantillons clairsemés et effectuant la reconstruction de coefficients fusionnés de façon précise est proposée, qui obtient de bien meilleurs résultats de fusion par le biais d’une pré-amélioration de l’image infrarouge et en diminuant les informations redondantes des coefficients de fusion. Une procédure de fusion basée sur le NSCT utilisant une technique de détection rapide de rétrécissement itératif comprimé (fast iterative-shrinking compressed sensing, FISCS) est proposée pour compresser les coefficients décomposés et reconstruire les coefficients fusionnés dans le processus de fusion, qui conduit à de meilleurs résultats plus rapidement et d’une manière efficace.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

With the progress of computer technology, computers are expected to be more intelligent in the interaction with humans, presenting information according to the user's psychological and physiological characteristics. However, computer users with visual problems may encounter difficulties on the perception of icons, menus, and other graphical information displayed on the screen, limiting the efficiency of their interaction with computers. In this dissertation, a personalized and dynamic image precompensation method was developed to improve the visual performance of the computer users with ocular aberrations. The precompensation was applied on the graphical targets before presenting them on the screen, aiming to counteract the visual blurring caused by the ocular aberration of the user's eye. A complete and systematic modeling approach to describe the retinal image formation of the computer user was presented, taking advantage of modeling tools, such as Zernike polynomials, wavefront aberration, Point Spread Function and Modulation Transfer Function. The ocular aberration of the computer user was originally measured by a wavefront aberrometer, as a reference for the precompensation model. The dynamic precompensation was generated based on the resized aberration, with the real-time pupil diameter monitored. The potential visual benefit of the dynamic precompensation method was explored through software simulation, with the aberration data from a real human subject. An "artificial eye'' experiment was conducted by simulating the human eye with a high-definition camera, providing objective evaluation to the image quality after precompensation. In addition, an empirical evaluation with 20 human participants was also designed and implemented, involving image recognition tests performed under a more realistic viewing environment of computer use. The statistical analysis results of the empirical experiment confirmed the effectiveness of the dynamic precompensation method, by showing significant improvement on the recognition accuracy. The merit and necessity of the dynamic precompensation were also substantiated by comparing it with the static precompensation. The visual benefit of the dynamic precompensation was further confirmed by the subjective assessments collected from the evaluation participants.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

The abundance of visual data and the push for robust AI are driving the need for automated visual sensemaking. Computer Vision (CV) faces growing demand for models that can discern not only what images "represent," but also what they "evoke." This is a demand for tools mimicking human perception at a high semantic level, categorizing images based on concepts like freedom, danger, or safety. However, automating this process is challenging due to entropy, scarcity, subjectivity, and ethical considerations. These challenges not only impact performance but also underscore the critical need for interoperability. This dissertation focuses on abstract concept-based (AC) image classification, guided by three technical principles: situated grounding, performance enhancement, and interpretability. We introduce ART-stract, a novel dataset of cultural images annotated with ACs, serving as the foundation for a series of experiments across four key domains: assessing the effectiveness of the end-to-end DL paradigm, exploring cognitive-inspired semantic intermediaries, incorporating cultural and commonsense aspects, and neuro-symbolic integration of sensory-perceptual data with cognitive-based knowledge. Our results demonstrate that integrating CV approaches with semantic technologies yields methods that surpass the current state of the art in AC image classification, outperforming the end-to-end deep vision paradigm. The results emphasize the role semantic technologies can play in developing both effective and interpretable systems, through the capturing, situating, and reasoning over knowledge related to visual data. Furthermore, this dissertation explores the complex interplay between technical and socio-technical factors. By merging technical expertise with an understanding of human and societal aspects, we advocate for responsible labeling and training practices in visual media. These insights and techniques not only advance efforts in CV and explainable artificial intelligence but also propel us toward an era of AI development that harmonizes technical prowess with deep awareness of its human and societal implications.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not readily discovered. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions such as microaneurysms, cotton-wool spots and hard exudates. BoVW allows to bypass the need for pre- and post-processing of the retinographic images, as well as the need of specific ad hoc techniques for identification of each type of lesion. An extensive evaluation of the BoVW model, using three large retinograph datasets (DR1, DR2 and Messidor) with different resolution and collected by different healthcare personnel, was performed. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to utilize different algorithms for each lesion reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification was an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions), applying a cross-dataset validation protocol. To assess the accuracy for detecting cases that require referral within one year, the sparse extraction technique associated with semi-soft coding and max pooling obtained an AUC of 94.2 ± 2.0%, outperforming current methods. Those results indicate that, for retinal image classification tasks in clinical practice, BoVW is equal and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The present work reports the porous alumina structures fabrication and their quantitative structural characteristics study based on mathematical morphology analysis by using the SEM images. The algorithm used in this work was implemented in 6.2 MATLAB software. Using the algorithm it was possible to obtain the distribution of maximum, minimum and average radius of the pores in porous alumina structures. Additionally, with the calculus of the area occupied by the pores, it was possible to obtain the porosity of the structures. The quantitative results could be obtained and related to the process fabrication characteristics, showing to be reliable and promising to be used to control the pores formation process. Then, this technique could provide a more accurate determination of pore sizes and pores distribution. (C) 2008 Elsevier Ltd. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Extracting human postural information from video sequences has proved a difficult research question. The most successful approaches to date have been based on particle filtering, whereby the underlying probability distribution is approximated by a set of particles. The shape of the underlying observational probability distribution plays a significant role in determining the success, both accuracy and efficiency, of any visual tracker. In this paper we compare approaches used by other authors and present a cost path approach which is commonly used in image segmentation problems, however is currently not widely used in tracking applications.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

While multimedia data, image data in particular, is an integral part of most websites and web documents, our quest for information so far is still restricted to text based search. To explore the World Wide Web more effectively, especially its rich repository of truly multimedia information, we are facing a number of challenging problems. Firstly, we face the ambiguous and highly subjective nature of defining image semantics and similarity. Secondly, multimedia data could come from highly diversified sources, as a result of automatic image capturing and generation processes. Finally, multimedia information exists in decentralised sources over the Web, making it difficult to use conventional content-based image retrieval (CBIR) techniques for effective and efficient search. In this special issue, we present a collection of five papers on visual and multimedia information management and retrieval topics, addressing some aspects of these challenges. These papers have been selected from the conference proceedings (Kluwer Academic Publishers, ISBN: 1-4020- 7060-8) of the Sixth IFIP 2.6 Working Conference on Visual Database Systems (VDB6), held in Brisbane, Australia, on 29–31 May 2002.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this work, we take advantage of association rule mining to support two types of medical systems: the Content-based Image Retrieval (CBIR) systems and the Computer-Aided Diagnosis (CAD) systems. For content-based retrieval, association rules are employed to reduce the dimensionality of the feature vectors that represent the images and to improve the precision of the similarity queries. We refer to the association rule-based method to improve CBIR systems proposed here as Feature selection through Association Rules (FAR). To improve CAD systems, we propose the Image Diagnosis Enhancement through Association rules (IDEA) method. Association rules are employed to suggest a second opinion to the radiologist or a preliminary diagnosis of a new image. A second opinion automatically obtained can either accelerate the process of diagnosing or to strengthen a hypothesis, increasing the probability of a prescribed treatment be successful. Two new algorithms are proposed to support the IDEA method: to pre-process low-level features and to propose a preliminary diagnosis based on association rules. We performed several experiments to validate the proposed methods. The results indicate that association rules can be successfully applied to improve CBIR and CAD systems, empowering the arsenal of techniques to support medical image analysis in medical systems. (C) 2009 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we propose a method based on association rule-mining to enhance the diagnosis of medical images (mammograms). It combines low-level features automatically extracted from images and high-level knowledge from specialists to search for patterns. Our method analyzes medical images and automatically generates suggestions of diagnoses employing mining of association rules. The suggestions of diagnosis are used to accelerate the image analysis performed by specialists as well as to provide them an alternative to work on. The proposed method uses two new algorithms, PreSAGe and HiCARe. The PreSAGe algorithm combines, in a single step, feature selection and discretization, and reduces the mining complexity. Experiments performed on PreSAGe show that this algorithm is highly suitable to perform feature selection and discretization in medical images. HiCARe is a new associative classifier. The HiCARe algorithm has an important property that makes it unique: it assigns multiple keywords per image to suggest a diagnosis with high values of accuracy. Our method was applied to real datasets, and the results show high sensitivity (up to 95%) and accuracy (up to 92%), allowing us to claim that the use of association rules is a powerful means to assist in the diagnosing task.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An efficient representation method for arbitrarily shaped image segments is proposed. This method includes a smart way to select wavelet basis to approximate the given image segment, with improved image quality and reduced computational load.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An optically addressed read-write sensor based on two stacked p-i-n heterojunctions is analyzed. The device is a two terminal image sensing structure. The charge packets are injected optically into the p-i-n writer and confined at the illuminated regions changing locally the electrical field profile across the p-i-n reader. An optical scanner is used for charge readout. The design allows a continuous readout without the need for pixel-level patterning. The role of light pattern and scanner wavelengths on the readout parameters is analyzed. The optical-to-electrical transfer characteristics show high quantum efficiency, broad spectral response, and reciprocity between light and image signal. A numerical simulation supports the imaging process. A black and white image is acquired with a resolution around 20 mum showing the potentiality of these devices for imaging applications.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In recent works large area hydrogenated amorphous silicon p-i-n structures with low conductivity doped layers were proposed as single element image sensors. The working principle of this type of sensor is based on the modulation, by the local illumination conditions, of the photocurrent generated by a light beam scanning the active area of the device. In order to evaluate the sensor capabilities is necessary to perform a response time characterization. This work focuses on the transient response of such sensor and on the influence of the carbon contents of the doped layers. In order to evaluate the response time a set of devices with different percentage of carbon incorporation in the doped layers is analyzed by measuring the scanner-induced photocurrent under different bias conditions.