280 results for IMAGE SEQUENCES
Abstract:
This paper proposes a novel approach to video deblocking which performs perceptually adaptive bilateral filtering by considering color, intensity, and motion features in a holistic manner. The method is based on the bilateral filter, an effective smoothing filter that preserves edges. The bilateral filter parameters are adaptive: they avoid over-blurring of texture regions while eliminating blocking artefacts in smooth regions and areas of slow motion content. This is achieved by using a saliency map to control the strength of the filter for each individual point in the image based on its perceptual importance. The experimental results demonstrate that the proposed algorithm is effective in deblocking highly compressed video sequences while avoiding over-blurring of edges and textures in salient regions of the image.
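The idea of letting a saliency map control filter strength can be sketched as follows. This is a minimal illustration and not the paper's algorithm: it assumes a grayscale frame, a precomputed saliency map with values in [0, 1], and a hypothetical linear mapping from saliency to the range parameter `sigma_r` (the abstract does not state how the parameters are adapted).

```python
import math

def adaptive_bilateral(image, saliency, radius=2, sigma_s=2.0,
                       sigma_r_min=2.0, sigma_r_max=30.0):
    """Bilateral filter whose range sigma shrinks as saliency grows.

    image, saliency: equally sized lists of rows; saliency values in [0, 1].
    The linear saliency-to-sigma mapping is an assumption for illustration.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # High saliency -> small sigma_r -> edges and texture are kept.
            sigma_r = sigma_r_max - saliency[y][x] * (sigma_r_max - sigma_r_min)
            centre = image[y][x]
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        # Spatial weight: closeness in the image plane.
                        ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        # Range weight: closeness in intensity.
                        diff = image[yy][xx] - centre
                        wr = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                        num += ws * wr * image[yy][xx]
                        den += ws * wr
            out[y][x] = num / den
    return out
```

With a small intensity step (such as a block boundary), a saliency of 0 smooths across the step, while a saliency of 1 leaves it almost untouched.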
Abstract:
Australasian marsupials include three major radiations, the insectivorous/carnivorous Dasyuromorphia, the omnivorous bandicoots (Peramelemorphia), and the largely herbivorous diprotodontians. Morphologists have generally considered the bandicoots and diprotodontians to be closely related, most prominently because they are both syndactylous (with the 2nd and 3rd pedal digits being fused). Molecular studies have been unable to confirm or reject this Syndactyla hypothesis. Here we present new mitochondrial (mt) genomes from a spiny bandicoot (Echymipera rufescens) and two dasyurids, a fat-tailed dunnart (Sminthopsis crassicaudata) and a northern quoll (Dasyurus hallucatus). By comparing trees derived from pairwise base-frequency differences between taxa with standard (absolute, uncorrected) distance trees, we infer that composition bias among mt protein-coding and RNA sequences is sufficient to mislead tree reconstruction. This can explain incongruence between trees obtained from mt and nuclear data sets. However, after excluding major sources of compositional heterogeneity, both the “reduced-bias” mt and nuclear data sets clearly favor a bandicoot plus dasyuromorphian association, as well as a grouping of kangaroos and possums (Phalangeriformes) among diprotodontians. Notably, alternatives to these groupings could only be confidently rejected by combining the mt and nuclear data. Elsewhere on the tree, Dromiciops appears to be sister to the monophyletic Australasian marsupials, whereas the placement of the marsupial mole (Notoryctes) remains problematic. More generally, we contend that it is desirable to combine mt genome and nuclear sequences for inferring vertebrate phylogeny, but as separately modeled process partitions. This strategy depends on detecting and excluding (or accounting for) major sources of nonhistorical signal, such as from compositional nonstationarity.
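The contrast between composition-based and standard uncorrected distances can be illustrated with a small sketch. The abstract does not specify the exact composition measure, so the Euclidean distance between base-frequency vectors used here is an assumption, and the function names are hypothetical. Sequences are assumed to be aligned, non-empty, and over the bases A, C, G, T.

```python
import math
from collections import Counter

BASES = "ACGT"

def base_frequencies(seq):
    """Relative frequency of each base, ignoring gaps and ambiguity codes."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return [counts[b] / total for b in BASES]

def composition_distance(a, b):
    """Euclidean distance between base-frequency vectors (assumed measure)."""
    fa, fb = base_frequencies(a), base_frequencies(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))

def p_distance(a, b):
    """Uncorrected distance: fraction of aligned sites that differ."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in BASES and y in BASES]
    return sum(x != y for x, y in pairs) / len(pairs)
```

Two sequences can differ at every site yet share the same composition, and vice versa, which is why a tree built from composition alone can be compared against a standard distance tree to gauge how strongly composition bias might drive the result.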
Abstract:
While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.
Abstract:
Affine covariant local image features are a powerful tool for many applications, including matching and calibrating wide baseline images. Local feature extractors that use a saliency map to locate features require adaptation processes in order to extract affine covariant features. The most effective extractors make use of the second moment matrix (SMM) to iteratively estimate the affine shape of local image regions. This paper shows that the Hessian matrix can be used to estimate local affine shape in a similar fashion to the SMM. The Hessian matrix requires significantly less computational effort than the SMM, allowing more efficient affine adaptation. Experimental results indicate that using the Hessian matrix in conjunction with a feature extractor that selects features in regions with high second-order gradients delivers correspondences of equivalent quality in less than 17% of the processing time, compared to the same extractor using the SMM.
Abstract:
This paper presents an image-based visual servoing system that was used to track the atmospheric Earth re-entry of Hayabusa. The primary aim of this ground-based tracking platform was to record the emission spectrum radiating from the superheated gas of the shock layer and the surface of the heat shield during re-entry. To the authors' knowledge, this is the first time that a visual servoing system has successfully tracked a super-orbital re-entry of a spacecraft and recorded its spectral signature. Furthermore, we improved the system by including a simplified dynamic model for feed-forward control and demonstrate improved tracking performance on the International Space Station (ISS). We present comparisons between simulation and experimental results on different target trajectories, including tracking results from Hayabusa and the ISS. The required performance for tracking both spacecraft is demanding when combined with a narrow field of view (FOV). We also briefly discuss the preliminary results obtained from the spectroscopy of Hayabusa's heat shield during re-entry.
Abstract:
In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contrast to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state-of-the-art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
Abstract:
Learning and then recognizing a route, whether travelled during the day or at night, in clear or inclement weather, and in summer or winter, is a challenging task for state-of-the-art algorithms in computer vision and robotics. In this paper, we present a new approach to visual navigation under changing conditions dubbed SeqSLAM. Instead of calculating the single location most likely given a current image, our approach calculates the best candidate matching location within every local navigation sequence. Localization is then achieved by recognizing coherent sequences of these “local best matches”. This approach removes the need for global matching performance by the vision front-end - instead it must only pick the best match within any short sequence of images. The approach is applicable over environment changes that render traditional feature-based techniques ineffective. Using two car-mounted camera datasets we demonstrate the effectiveness of the algorithm and compare it to one of the most successful feature-based SLAM algorithms, FAB-MAP. The perceptual change in the datasets is extreme: repeated traverses through environments during the day and then in the middle of the night, at times separated by months or years and in opposite seasons, and in clear weather and extremely heavy rain. While the feature-based method fails, the sequence-based algorithm is able to match trajectory segments at 100% precision with recall rates of up to 60%.
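The core idea of matching coherent sequences, as opposed to single images, can be sketched in a few lines. This is a simplified illustration and not the published SeqSLAM implementation: it assumes both traverses move at the same speed (one reference frame per query frame), uses a plain sum of absolute differences between tiny image vectors, and omits any contrast normalisation. All function names are hypothetical.

```python
def frame_difference(a, b):
    """Sum of absolute differences between two low-resolution frames."""
    return sum(abs(x - y) for x, y in zip(a, b))

def difference_matrix(query, reference):
    """diff[q][r] = difference between query frame q and reference frame r."""
    return [[frame_difference(q, r) for r in reference] for q in query]

def best_sequence_match(diff, seq_len):
    """Find the reference start index whose diagonal run of seq_len frames
    best matches the last seq_len query frames.

    Assumes equal speed on both traverses, so the search follows straight
    diagonals through the difference matrix. Returns (ref_start, score).
    """
    n_query, n_ref = len(diff), len(diff[0])
    q0 = n_query - seq_len
    best = None
    for r0 in range(n_ref - seq_len + 1):
        score = sum(diff[q0 + k][r0 + k] for k in range(seq_len))
        if best is None or score < best[1]:
            best = (r0, score)
    return best
```

Because the score accumulates over a whole run of frames, no individual frame has to be a globally unambiguous match; it only needs to contribute to a run that is more coherent than its alternatives.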
Abstract:
With the rising popularity of anime amongst animation students, audiences and scholars around the world, it has become increasingly important to critically analyse anime as being more than a ‘limited’ form of animation, and thematically as encompassing more than super robots and pocket monsters. Frames of Anime: Culture and Image-Building charts the development of Japanese animation from its indigenous roots within a native culture, through Japan’s experience of modernity and the impact of the Second World War. This text is the result of a rigorous study that recognises the heterogeneous and polymorphous background of anime. As such, Tze-Yue has adopted an ‘interdisciplinary and transnational’ (p. 7) approach to her enquiry, drawing upon face-to-face interviews, on-site visits and biographical writings of animators. Tze-Yue delineates anime from other forms of animation by linking its visual style to pre-modern Japanese art forms and demonstrating the connection it shares with an indigenous folk system of beliefs. Via the identification of traditional Japanese art forms and their visual connectedness to Japanese animation, Tze-Yue shows that the Japanese were already heavily engaged in what was destined to become anime once technology had enabled its production. Tze-Yue’s efforts to connect traditional Japanese art forms, and their artistic elements, to contemporary anime reveal that the Japanese already had a rich culture of visual storytelling that pre-dates modern animation. She identifies the Japanese form of the magic lantern at the turn of the 19th century, utsushi-e, as the pre-modern ancestor of Japanese animation, describing it as ‘Edo anime’ (p. 43). Along with utsushi-e, the Edo period also saw the woodblock print, ukiyo-e, being produced for the rising middle class (p. 32). Highlighting the ‘resurfacing’ of ‘realist’ approaches to Japanese art in ukiyo-e, Tze-Yue demonstrates the visual connection of ukiyo-e and anime in the …
Abstract:
The ubiquity of multimodality in hypermedia environments is undeniable. Bezemer and Kress (2008) have argued that writing has been displaced by image as the central mode for representation. Given the current technical affordances of digital technology and user-friendly interfaces that enable the ease of multimodal design, the conspicuous absence of images in certain domains of cyberspace is deserving of critical analysis. In this presentation, I examine the politics of discourses implicit within hypertextual spaces, drawing textual examples from a higher education website. I critically examine the role of writing and other modes of production used in what Fairclough (1993) refers to as discourses of marketisation in higher education, tracing four pervasive discourses of teaching and learning in the current economy: i) materialization, ii) personalization, iii) technologisation, and iv) commodification (Fairclough, 1999). Each of these arguments is supported by the critical analysis of multimodal texts. The first is a podcast highlighting the new architectonic features of a university learning space. The second is a podcast and transcript of a university Open Day interview with prospective students. The third is a time-lapse video showing the construction of a new science and engineering precinct. These three multimodal texts contrast with a final web-based text that exhibits a predominance of writing and the powerful absence or silencing of the image. I connect the weightiness of words and the function of monomodality in the commodification of discourses, and its resistance to the multimodal affordances of web-based technologies, and how this is used to establish particular sets of subject positions and ideologies that readers are constrained to occupy.
Applying principles of critical language study by theorists that include Fairclough, Kress, Lemke, and others whose semiotic analysis of texts focuses on the connections between language, power, and ideology, I demonstrate how the denial of image and the privileging of written words in the multimodality of cyberspace is an ideological effect to accentuate the dominance of the institution.
Abstract:
Topographic structural complexity of a reef is highly correlated with coral growth rates, coral cover and overall levels of biodiversity, and is therefore integral in determining ecological processes. Modeling these processes commonly includes measures of rugosity obtained from a wide range of different survey techniques that often fail to capture rugosity at different spatial scales. Here we show that accurate estimates of rugosity can be obtained from video footage captured using underwater video cameras (i.e., monocular video). To demonstrate the accuracy of our method, we compared the results to in situ measurements of a 2 m x 20 m area of forereef from Glovers Reef atoll in Belize. Sequential pairs of images were used to compute fine-scale bathymetric reconstructions of the reef substrate from which precise measurements of rugosity and reef topographic structural complexity can be derived across multiple spatial scales. To achieve accurate bathymetric reconstructions from uncalibrated monocular video, the position of the camera for each image in the video sequence and the intrinsic parameters (e.g., focal length) must be computed simultaneously. We show that these parameters can often be determined when the data exhibits parallax-type motion, and that rugosity and reef complexity can be accurately computed from existing video sequences taken from any type of underwater camera from any reef habitat or location. This technique provides an infinite array of possibilities for future coral reef research by providing a cost-effective and automated method of determining structural complexity and rugosity in both new and historical video surveys of coral reefs.
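Once a bathymetric reconstruction is available, rugosity at several spatial scales can be derived from it. The sketch below uses the common profile definition (surface contour length divided by straight-line planar length) on a single depth transect; the abstract does not state which rugosity formula the authors use, so this definition, the function name, and the subsampling approach to scale are assumptions.

```python
import math

def profile_rugosity(heights, spacing=1.0, step=1):
    """Rugosity of a height profile: contour length / planar length.

    heights: evenly spaced height samples along a transect.
    spacing: horizontal distance between consecutive samples.
    step:    keep every step-th sample to measure at a coarser scale.
    """
    samples = heights[::step]
    if len(samples) < 2:
        raise ValueError("need at least two samples at this scale")
    dx = spacing * step
    # Length of the polyline that follows the surface.
    surface = sum(math.hypot(dx, b - a) for a, b in zip(samples, samples[1:]))
    planar = dx * (len(samples) - 1)
    return surface / planar
```

A flat profile gives a rugosity of 1. A sawtooth profile such as `[0, 1, 0, 1, 0]` gives about 1.414 at `step=1` but exactly 1 at `step=2`, which shows how a single-scale survey can miss structure that is present at a finer scale.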
Abstract:
Teleradiology allows medical images to be transmitted over electronic networks for clinical interpretation, and for improved healthcare access, delivery and standards. Although such remote transmission of images is raising various new and complex legal and ethical issues, including image retention and fraud, privacy, and malpractice liability, considerations of the security measures used in teleradiology remain unchanged. Addressing this problem warrants investigating the security measures for their relative functional limitations and the scope for considering them further. In this paper, starting with various security and privacy standards, the security requirements of medical images as well as expected threats in teleradiology are reviewed. This will make it possible to determine the limitations of the conventional measures used against the expected threats. Further, we thoroughly study the utilization of digital watermarking for teleradiology. Following the key attributes and roles of various watermarking parameters, justification for watermarking over conventional security measures is made in terms of their various objectives, properties, and requirements. We also outline the main objectives of medical image watermarking for teleradiology, and provide recommendations on suitable watermarking techniques and their characterization. Finally, concluding remarks and directions for future research are presented.
Abstract:
Person re-identification involves recognising individuals in different locations across a network of cameras and is a challenging task due to a large number of varying factors such as pose (both subject and camera) and ambient lighting conditions. Existing databases do not adequately capture these variations, making evaluations of proposed techniques difficult. In this paper, we present a new challenging multi-camera surveillance database designed for the task of person re-identification. This database consists of 150 unscripted sequences of subjects travelling in a building environment through up to eight camera views, appearing from various angles and in varying illumination conditions. A flexible XML-based evaluation protocol is provided to allow a highly configurable evaluation setup, enabling a variety of scenarios relating to pose and lighting conditions to be evaluated. A baseline person re-identification system consisting of colour, height and texture models is demonstrated on this database.
Abstract:
Purpose: Arbitrary numbers of corneal confocal microscopy images have been used for analysis of corneal subbasal nerve parameters under the implicit assumption that these are a representative sample of the central corneal nerve plexus. The purpose of this study is to present a technique for quantifying the number of random central corneal images required to achieve an acceptable level of accuracy in the measurement of corneal nerve fiber length and branch density. Methods: Every possible combination of 2 to 16 images (where 16 was deemed the true mean) of the central corneal subbasal nerve plexus, not overlapping by more than 20%, was assessed for nerve fiber length and branch density in 20 subjects with type 2 diabetes and varying degrees of functional nerve deficit. Mean ratios were calculated to allow comparisons between and within subjects. Results: In assessing nerve branch density, eight randomly chosen images not overlapping by more than 20% produced an average that was within 30% of the true mean 95% of the time. A similar sampling strategy of five images was within 13% of the true mean 80% of the time for corneal nerve fiber length. Conclusions: The “sample combination analysis” presented here can be used to determine the sample size required for a desired level of accuracy of quantification of corneal subbasal nerve parameters. This technique may have applications in other biological sampling studies.
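The sampling idea in the Methods can be sketched as an exhaustive enumeration: for a given sample size k, take every combination of k per-image measurements and count how often the sample mean lands within a chosen tolerance of the mean of the full set. This is a minimal sketch under stated assumptions: it ignores the 20% image-overlap constraint, works on one subject's measurements at a time, and the function name and parameters are hypothetical.

```python
from itertools import combinations

def fraction_within(values, k, tolerance):
    """Fraction of all k-image combinations whose mean lies within
    `tolerance` (relative, e.g. 0.3 for 30%) of the full-set mean.

    values: one measurement per image (e.g. nerve fiber length).
    The full-set mean stands in for the "true mean", as in the abstract.
    """
    true_mean = sum(values) / len(values)
    hits = total = 0
    for combo in combinations(values, k):
        total += 1
        if abs(sum(combo) / k - true_mean) <= tolerance * abs(true_mean):
            hits += 1
    return hits / total
```

For 16 images the largest enumeration is the 12,870 combinations at k = 8, so the exhaustive approach stays cheap. Scanning k from 2 upward and picking the smallest k whose fraction reaches the desired level (for example 0.95) gives the required number of images for that subject.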