932 results for Blurred and noisy images
Abstract:
In this text, we present two stereo-based head-tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift-reduction algorithms with a gradient-based stereo rigid-motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. The precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head-pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm with the normal flow constraint. This new technique is more precise than ICP alone for small movements and noisy depth data, and more robust than the normal flow constraint alone for large movements. We present experiments that test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm that takes advantage of a novel data structure: the linked voxel space. We present experiments that test the accuracy of our approach on 3D face modelling using real-time stereo images.
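The ICP constraint used by the hybrid tracker can be illustrated with a minimal point-to-point ICP iteration. This NumPy sketch (the function name and brute-force matching are illustrative, not the authors' implementation) recovers a rigid motion (R, t) between two point clouds:

```python
import numpy as np

def icp_step(source, target):
    """One point-to-point ICP iteration: pair each source point with its
    nearest target point, then solve for the rigid transform (R, t) that
    best aligns the pairs (Kabsch/SVD solution)."""
    # Brute-force nearest-neighbour correspondences.
    d2 = ((source[:, None, :] - target[None, :, :]) ** 2).sum(-1)
    matched = target[d2.argmin(axis=1)]
    # Centroids and cross-covariance of the paired points.
    sc, mc = source.mean(axis=0), matched.mean(axis=0)
    H = (source - sc).T @ (matched - mc)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mc - R @ sc
    return R, t  # aligned source = source @ R.T + t
```

In a full tracker this step is iterated until convergence; the hybrid method described above additionally mixes in normal flow constraints, which this sketch omits.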
Abstract:
We present a set of techniques that can be used to represent and detect shapes in images. Our methods revolve around a particular shape representation based on the description of objects using triangulated polygons. This representation is similar to the medial axis transform and has important properties from a computational perspective. The first problem we consider is the detection of non-rigid objects in images using deformable models. We present an efficient algorithm to solve this problem in a wide range of situations, and show examples in both natural and medical images. We also consider the problem of learning an accurate non-rigid shape model for a class of objects from examples. We show how to learn good models while constraining them to the form required by the detection algorithm. Finally, we consider the problem of low-level image segmentation and grouping. We describe a stochastic grammar that generates arbitrary triangulated polygons while capturing Gestalt principles of shape regularity. This grammar is used as a prior model over random shapes in a low-level algorithm that detects objects in images.
Abstract:
The aim of this study was to analyze the color alterations, assessed with the CIE L*a*b* system, in digital images of shade guide tabs captured photographically in automatic and manual modes. This study also sought to examine the observers' agreement in quantifying the coordinates. Four Vita Lumin Vacuum shade guide tabs were used: A3.5, B1, B3 and C4. A Canon EOS digital camera was used to record the digital images of the shade tabs, and the images were processed using Adobe Photoshop software. A total of 80 observations (five replicates of each shade according to two observers in two modes, automatic and manual) were obtained, yielding color values of L*, a* and b*. The color difference (ΔE) between the modes was calculated and classified as either clinically acceptable or unacceptable. The results indicated that there was agreement between the two observers in obtaining the L*, a* and b* values for all guides. The B1, B3 and C4 shade tabs had ΔE values classified as clinically acceptable (ΔE = 0.44, ΔE = 2.04 and ΔE = 2.69, respectively), whereas the A3.5 shade tab had a ΔE value classified as clinically unacceptable (ΔE = 4.17), as it presented higher luminosity in the automatic mode (L* = 54.0) than in the manual mode (L* = 50.6). It was concluded that the B1, B3 and C4 shade tabs can be photographed in either digital camera mode (manual or automatic), which was a different finding from that observed for the A3.5 shade tab.
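The ΔE values above are consistent with the standard CIE76 color-difference formula, ΔE = √(ΔL*² + Δa*² + Δb*²); a minimal sketch (the helper name and the sample a*/b* coordinates are illustrative):

```python
import math

def delta_e(lab1, lab2):
    """CIE76 color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

# If only luminosity differs, as reported for the A3.5 tab (the a*/b*
# values here are made up), L* 54.0 vs 50.6 alone yields a ΔE of about 3.4.
automatic = (54.0, 1.2, 15.0)
manual = (50.6, 1.2, 15.0)
difference = delta_e(automatic, manual)  # ≈ 3.4
```

Later ΔE formulas (CIE94, CIEDE2000) weight the terms differently, so which formula a study uses should always be stated.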
Abstract:
In this work we propose a new image inpainting technique that combines texture synthesis, anisotropic diffusion, the transport equation and a new sampling mechanism designed to alleviate the computational burden of the inpainting process. Given an image to be inpainted, anisotropic diffusion is initially applied to generate a cartoon image. A block-based inpainting approach is then applied, combining the cartoon image with a measure based on the transport equation that dictates the priority in which pixels are filled. A sampling region is then defined dynamically so as to constrain the propagation of edges towards image structures while avoiding unnecessary searches during the completion process. Finally, a cartoon-based metric is computed to measure the likeness between target and candidate blocks. Experimental results and comparisons against existing techniques attest to the good performance and flexibility of our technique when dealing with real and synthetic images. © 2013 Elsevier B.V. All rights reserved.
Abstract:
Research on students’ (and teachers’) images of mathematics and mathematicians reveals a number of stereotypical images, most of which are negative. In this paper we present an overview of some of these images and stereotypes and consider the questions: (1) how might the image of mathematics and mathematicians be a problem in mathematics education, and (2) what can be done to remedy the situation? We also consider an outreach project called Windows into Elementary Mathematics, in which mathematicians are interviewed about their perspectives on elementary mathematics topics and the interviews are videotaped and posted online, along with supporting images and interactive content. In this context we consider the questions: (3) what is the Windows project about, and (4) how might it offer an alternate (and perhaps better) image of mathematics and mathematicians? Lastly, we share an example in which activities from the project were used in a math-for-teachers course.
Abstract:
This paper presents a comparison of descriptive statistics obtained for brittle structural lineaments extracted manually from LANDSAT images and from shaded-relief images derived from the SRTM 3 DEM at 1:100,000 and 1:500,000 scales. The selected area is located in southern Brazil and comprises Precambrian rocks and stratigraphic units of the Paraná Basin. The application of this methodology shows that the visual interpretation depends on the kind of remote sensing image. The resulting descriptive statistics obtained for lineaments extracted from the images do not follow the same pattern at the two scales. The main direction obtained for Proterozoic rocks using both image types at the 1:500,000 scale is close to N-S ± 10°, whereas at the 1:100,000 scale N45E was obtained for the shaded-relief SRTM 3 DEM images and N10W for the LANDSAT images. The Paleozoic sediments yielded the most consistent results across the different images and scales (N50W). On the other hand, the Mesozoic igneous rocks showed the greatest differences, with the shaded-relief SRTM 3 DEM images highlighting NE structures and the LANDSAT images highlighting NW structures. The accumulated frequency demonstrated high similarity between products of each image type regardless of scale, indicating that they can be used in multiscale studies. Conversely, major differences were found when comparing data obtained from shaded-relief SRTM 3 DEM images and LANDSAT images at the 1:100,000 scale.
Abstract:
For their survival, humans and animals can rely on motivational systems which are specialized in assessing the valence and imminence of dangers and appetitive cues. The Orienting Response (OR) is a fundamental response pattern that an organism executes whenever a novel or significant stimulus is detected, and has been shown to be consistently modulated by the affective value of a stimulus. However, detecting threatening stimuli and appetitive affordances while they are far away compared to when they are within reach constitutes an obvious evolutionary advantage. Building on the linear relationship between stimulus distance and retinal size, the present research was aimed at investigating the extent to which emotional modulation of distinct processes (action preparation, attentional capture, and subjective emotional state) is affected when reducing the retinal size of a picture. Studies 1-3 examined the effects of picture size on emotional response. Subjective feeling of engagement, as well as sympathetic activation, were modulated by picture size, suggesting that action preparation and subjective experience reflect the combined effects of detecting an arousing stimulus and assessing its imminence. On the other hand, physiological responses which are thought to reflect the amount of attentional resources invested in stimulus processing did not vary with picture size. Studies 4-6 were conducted to substantiate and extend the results of studies 1-3. In particular, it was noted that a decrease in picture size is associated with a loss in the low spatial frequencies of a picture, which might confound the interpretation of the results of studies 1-3. Therefore, emotional and neutral images which were either low-pass filtered or reduced in size were presented, and affective responses were measured. Most effects which were observed when manipulating image size were replicated by blurring pictures. 
However, pictures depicting highly arousing unpleasant contents were associated with a more pronounced decrease in affective modulation when pictures were reduced in size compared to when they were blurred. The present results provide important information for the study of processes involved in picture perception and in the genesis and expression of an emotional response. In particular, the availability of high spatial frequencies might affect the degree of activation of an internal representation of an affectively charged scene, and might modulate subjective emotional state and preparation for action. Moreover, the manipulation of stimulus imminence revealed important effects of stimulus engagement on specific components of the emotional response, and the implications of the present data for some models of emotions have been discussed. In particular, within the framework of a staged model of emotional response, the tactic and strategic role of response preparation and attention allocation to stimuli varying in engaging power has been discussed, considering the adaptive advantages that each might represent in an evolutionary view. Finally, the identification of perceptual parameters that allow affective processing to be carried out has important methodological applications in future studies examining emotional response in basic research or clinical contexts.
Abstract:
Adapting to blurred or sharpened images alters perceived blur of a focused image (M. A. Webster, M. A. Georgeson, & S. M. Webster, 2002). We asked whether blur adaptation results in (a) renormalization of perceived focus or (b) a repulsion aftereffect. Images were checkerboards or 2-D Gaussian noise, whose amplitude spectra had (log-log) slopes from -2 (strongly blurred) to 0 (strongly sharpened). Observers adjusted the spectral slope of a comparison image to match different test slopes after adaptation to blurred or sharpened images. Results did not show repulsion effects but were consistent with some renormalization. Test blur levels at and near a blurred or sharpened adaptation level were matched by more focused slopes (closer to 1/f) but with little or no change in appearance after adaptation to focused (1/f) images. A model of contrast adaptation and blur coding by multiple-scale spatial filters predicts these blur aftereffects and those of Webster et al. (2002). A key proposal is that observers are pre-adapted to natural spectra, and blurred or sharpened spectra induce changes in the state of adaptation. The model illustrates how norms might be encoded and recalibrated in the visual system even when they are represented only implicitly by the distribution of responses across multiple channels.
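Stimuli with a prescribed (log-log) amplitude-spectrum slope, like the blurred (-2), focused (-1, i.e. 1/f) and sharpened (0) images described above, can be synthesised by shaping random-phase noise in the Fourier domain. A sketch under assumed conventions (the function name and parameters are not from the study):

```python
import numpy as np

def sloped_noise(n, slope, rng):
    """n x n noise image whose amplitude spectrum falls off as f**slope
    (slope = -1 gives 1/f 'focused' statistics; -2 blurred; 0 sharpened)."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.sqrt(fx ** 2 + fy ** 2)
    f[0, 0] = 1.0                      # avoid dividing by zero at DC
    amplitude = f ** slope             # target spectral envelope
    phase = np.exp(2j * np.pi * rng.random((n, n)))  # random phases
    return np.fft.ifft2(amplitude * phase).real
```

Varying `slope` continuously between -2 and 0 reproduces the blur/sharpen axis that observers adjusted in the matching task.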
Abstract:
We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circle/line exclusion and word-block-level segmentation. We then align and match characters from a flexibly sized gallery with the segmented regions, using parallelised normalised cross-correlation. Experimental results over the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded and noisy samples, in addition to those with substantial missing data. The algorithm, which achieves an 84.0% classification rate with a false positive rate of 0.16 over the dataset, does not require training samples and generates compelling results compared to the training-based approaches that have used the same benchmark.
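Normalised cross-correlation scores each alignment of a template against an image window by the correlation of their mean-subtracted pixels. This brute-force sketch (not the paper's parallelised implementation) illustrates the idea:

```python
import numpy as np

def normalised_cross_correlation(image, template):
    """Slide `template` over `image` and return the NCC score at each
    valid offset (a 'valid'-mode correlation map with values in [-1, 1])."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + th, j:j + tw]
            w = window - window.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            # A flat (zero-variance) window cannot correlate with anything.
            scores[i, j] = (w * t).sum() / denom if denom > 0 else 0.0
    return scores
```

The location of the maximum score gives the best match; scores near 1 indicate a near-exact match up to brightness and contrast changes, which is what makes NCC robust on unevenly lit scans.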
Abstract:
Images of Africa in the northern hemisphere are generally negative and pessimistic. In spite of instant global communication, why have these images persisted to date? This contribution shall revisit these perceptions and the images embodying them to unearth the motivations and rationale behind them. The central argument, based on a number of narratives and experiences, is that ignorance feeds these images and stereotypes. Furthermore, the positionality of non-African experts and of some groups of African scholars and activists contributes to this culture of ignorance and paternalism. The contribution shall end with an ethical evaluation of the persistence of the images and of the extent of the moral responsibility of the authors and carriers of the racist stereotypes embedded in the images. (DIPF/Orig.)
Abstract:
With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image-sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their life online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image-sharing websites offer manual annotation tools which allow the user to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with < 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), in which one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research the semantic gap still largely exists, and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-occur in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy: e.g. does NY refer to New York or New Year?
This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios. In Part II, we combine textual, content-based (e.g. number of faces present) and contextual (e.g. day of the week taken) signals for tag recommendation purposes, achieving up to a 75% improvement in precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to the (slower-moving) Flickr on which to build recommendation models, showing that similar accuracy can be achieved on these faster-moving, yet entirely textual, datasets. In Part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In Part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates, before (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features, we are able to predict whether an image will become “popular” (or not) with 74% accuracy, using an SVM classifier.
Finally, in Chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
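The precision@5 figures quoted in this abstract follow the standard precision-at-k definition; a minimal sketch (the tag names are hypothetical):

```python
def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommended items that are relevant.
    Short recommendation lists are penalised by still dividing by k."""
    return sum(item in relevant for item in recommended[:k]) / k

# e.g. three of the top five suggested tags are relevant -> precision@5 = 0.6
suggested = ["ny", "timessquare", "night", "food", "dog"]
ground_truth = {"ny", "timessquare", "night"}
p_at_5 = precision_at_k(suggested, ground_truth)
```

Dividing by k rather than by the list length is a common convention that penalises recommenders which return too few tags; which convention a given evaluation uses should be checked.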
Abstract:
Nanotechnology has revolutionised humanity's capability to build microscopic systems by manipulating materials on a molecular and atomic scale. Nanosystems are becoming increasingly smaller and more chemically complex, which increases the demand for microscopic characterisation techniques. Among others, transmission electron microscopy (TEM) is an indispensable tool that is increasingly used to study the structures of nanosystems down to the molecular and atomic scale. However, despite the effectiveness of this tool, it can only provide 2-dimensional projection (shadow) images of the 3D structure, leaving the 3-dimensional information hidden, which can lead to incomplete or erroneous characterisation. One very promising inspection method is electron tomography (ET), which is rapidly becoming an important tool to explore the 3D nano-world. ET provides (sub-)nanometer resolution in all three dimensions of the sample under investigation. However, the fidelity of the ET tomogram achieved by current ET reconstruction procedures remains a major challenge. This thesis addresses the assessment and advancement of electron tomographic methods to enable high-fidelity three-dimensional investigations. A quality-assessment investigation was conducted to provide a quantitative analysis of the main established ET reconstruction algorithms and to study the influence of the experimental conditions on the quality of the reconstructed ET tomogram. Regularly shaped nanoparticles were used as ground truth for this study. It is concluded that the fidelity of the post-reconstruction quantitative analysis and segmentation is limited mainly by the fidelity of the reconstructed ET tomogram. This motivates the development of an improved tomographic reconstruction process. In this thesis, a novel ET method is proposed, named dictionary learning electron tomography (DLET).
DLET is based on the recent mathematical theory of compressed sensing (CS), which exploits the sparsity of ET tomograms to enable accurate reconstruction from undersampled (S)TEM tilt series. DLET learns the sparsifying transform (dictionary) in an adaptive way and reconstructs the tomogram simultaneously from highly undersampled tilt series. In this method, sparsity is applied to overlapping image patches, favouring local structures. Furthermore, the dictionary is adapted to the specific tomogram instance, thereby favouring better sparsity and consequently higher-quality reconstructions. The reconstruction algorithm is based on an alternating procedure that learns the sparsifying dictionary and employs it to remove artifacts and noise in one step, and then restores the tomogram data in the other step. Simulated and real ET experiments on several morphologies are performed with a variety of setups. The reconstruction results validate its efficiency in both noiseless and noisy cases and show that it yields improved reconstruction quality with fast convergence. The proposed method enables the recovery of high-fidelity information without the need to worry about which sparsifying transform to select or whether the images used strictly follow the pre-conditions of a certain transform (e.g. strictly piecewise constant for Total Variation minimisation). This also avoids artifacts that can be introduced by specific sparsifying transforms (e.g. the staircase artifacts that may result when using Total Variation minimisation). Moreover, this thesis shows how reliable elementally sensitive tomography using EELS is possible with the aid of both the appropriate use of dual electron energy loss spectroscopy (DualEELS) and the DLET compressed sensing algorithm, making the best use of the limited data volume and signal-to-noise ratio inherent in core-loss electron energy loss spectroscopy (EELS) from nanoparticles of an industrially important material.
Taken together, the results presented in this thesis demonstrate how high-fidelity ET reconstructions can be achieved using a compressed sensing approach.
Abstract:
This report describes the realization of a system in which an object detection model is implemented, whose aim is to detect the presence of people in images. This system could be used for several applications: for example, it could be carried on board an aircraft or a drone. In this case, the system is designed so that it can be mounted on light/medium-weight helicopters, helping the operator to find people in emergency situations. In the first chapter the use of helicopters for civil protection is analysed and applications similar to this case study are listed. The second chapter describes the choice of the hardware devices used to implement a prototype of a system to collect, analyse and display images. First, the PC necessary to process the images was chosen, based on the characteristics of the algorithms required to run the analysis. Next, a camera compatible with the PC was selected. Finally, the battery pack was chosen taking into account the electrical consumption of the devices. The third chapter illustrates the algorithms used for image analysis. The fourth briefly analyses some of the requirements listed in the regulations that must be taken into account for carrying all the devices on board. The fifth chapter describes the design and modelling, using Solidworks CAD, of the devices and of a prototype case to house them. The sixth chapter discusses additive manufacturing, since the case was printed using this technology. The seventh chapter analyses part of the tests that must be carried out on the equipment to certify it, and reports some simulations that were carried out. The eighth chapter presents the results obtained once the object detection model was loaded onto the image analysis hardware. The ninth chapter discusses conclusions and future applications.
Abstract:
Fifty bursae of Fabricius (BF) were examined by conventional optical microscopy, and digital images were acquired and processed using Matlab® 6.5 software. An Artificial Neural Network (ANN) was generated using Neuroshell® Classifier software, and the optical and digital data were compared. The ANN was able to produce a classification of the digital images comparable to the optical scores, correctly classifying the majority of the follicles and reaching a sensitivity of 89% and a specificity of 96%. When the follicles were scored and grouped in a binary fashion, the sensitivity increased to 90% and the specificity reached its maximum value of 92%. These results demonstrate that the use of digital image analysis and an ANN is a useful tool for the pathological classification of BF lymphoid depletion. In addition, it provides objective results that allow the magnitude of error in diagnosis and classification to be measured, thereby making comparisons between databases feasible.
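The sensitivity and specificity figures reported here follow the usual confusion-matrix definitions; a minimal sketch (the 1 = depleted / 0 = normal label coding is an assumption):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP),
    for binary labels (1 = positive class, 0 = negative class)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity
```

Reporting both quantities, as the abstract does, matters because a classifier can trivially maximise one at the expense of the other.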
Abstract:
Context. Compact groups of galaxies are entities with high densities of galaxies that serve as laboratories to study galaxy interactions, intergalactic star formation and galaxy evolution. Aims. The main goal of this study is to search for young objects in the intragroup medium of seven compact groups of galaxies: HCG 2, 7, 22, 23, 92, 100 and NGC 92, as well as to evaluate the stage of interaction of each group. Methods. We used Fabry-Perot velocity fields and rotation curves together with GALEX NUV and FUV images and optical R-band and HI maps. Results. (i) HCG 7 and HCG 23 are in early stages of interaction; (ii) HCG 2 and HCG 22 are mildly interacting; and (iii) HCG 92, HCG 100 and NGC 92 are in late stages of evolution. We find that all three evolved groups contain populations of young blue objects in the intragroup medium, consistent with ages < 100 Myr, several of which are younger than 10 Myr. We also report the discovery of a tidal dwarf galaxy candidate in the tail of NGC 92. These three groups, besides containing galaxies with peculiar velocity fields, also show extended HI tails. Conclusions. Our results indicate that the advanced stage of evolution of a group, together with the presence of intragroup HI clouds, may lead to star formation in the intragroup medium. A table containing all intergalactic HII regions and tidal dwarf galaxies confirmed to date is appended.