994 results for Input image


Relevance:

30.00%

Publisher:

Abstract:

Whole-image descriptors have recently been shown to be remarkably robust to perceptual change, especially compared to local features. However, whole-image-based localization systems typically rely on heuristic methods for determining appropriate matching thresholds in a particular environment. These environment-specific tuning requirements and the lack of a meaningful interpretation of these arbitrary thresholds limit the general applicability of these systems. In this paper we present a Bayesian probability model for whole-image descriptors that can be seamlessly integrated into localization systems designed for probabilistic visual input. We demonstrate this method using CAT-Graph, an appearance-based visual localization system originally designed for a FAB-MAP-style probabilistic input. We show that using whole-image descriptors as visual input extends CAT-Graph's functionality to environments that experience a greater amount of perceptual change. We also present a method of estimating whole-image probability models in an online manner, removing the need for a prior training phase. We show that this online, automated training method can perform comparably to pre-trained, manually tuned local descriptor methods.
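Illustrative sketch (Python; my own assumption of how such a probability model could look, not the paper's exact formulation): difference scores between whole-image descriptors are modelled under "same place" and "different place" hypotheses as Gaussians, whose parameters could be estimated online from recent scores, and Bayes' rule then converts a new score into a match probability instead of relying on an arbitrary threshold.

import numpy as np

def match_probability(score, mu_match, sigma_match, mu_diff, sigma_diff,
                      prior_match=0.5):
    """Convert a whole-image difference score into a match probability
    under a hypothetical two-Gaussian (match / non-match) model."""
    def gaussian(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    like_match = gaussian(score, mu_match, sigma_match)
    like_diff = gaussian(score, mu_diff, sigma_diff)
    evidence = prior_match * like_match + (1 - prior_match) * like_diff
    return prior_match * like_match / evidence

# Example with made-up statistics: low difference scores are typical of true matches.
p = match_probability(score=0.12, mu_match=0.1, sigma_match=0.05,
                      mu_diff=0.4, sigma_diff=0.1)
print(f"P(same place | score) = {p:.3f}")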

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we present a machine learning approach to measuring the visual quality of JPEG-coded images. The features for predicting the perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity, and background luminance. Image quality assessment involves estimating the functional relationship between HVS features and subjective test scores. The quality of the compressed images is obtained without referring to their original images (a 'no-reference' metric). Here, the problem of quality estimation is transformed into a classification problem and solved using the extreme learning machine (ELM) algorithm. In ELM, the input weights and bias values are randomly chosen and the output weights are analytically calculated. The generalization performance of the ELM algorithm for classification problems with an imbalance in the number of samples per quality class depends critically on the input weights and bias values. Hence, we propose two schemes, namely the k-fold selection scheme (KS-ELM) and the real-coded genetic algorithm (RCGA-ELM), to select the input weights and bias values such that the generalization performance of the classifier is maximized. Results indicate that the proposed schemes significantly improve the performance of the ELM classifier under imbalanced conditions for image quality assessment. The experimental results show that the visual quality estimated by the proposed RCGA-ELM emulates the mean opinion score very well. The experimental results are compared with an existing JPEG no-reference image quality metric and the full-reference structural similarity image quality metric.
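Illustrative sketch (Python): the basic ELM training step described above, with random input weights and biases and output weights computed analytically via the Moore-Penrose pseudoinverse. The paper's KS-ELM and RCGA-ELM schemes additionally search over these random choices; that search is not reproduced here, and the feature and class dimensions below are hypothetical.

import numpy as np

def train_elm(X, Y, n_hidden=50, seed=0):
    """Basic extreme learning machine: hidden-layer weights are random,
    output weights are solved analytically with the pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random bias values
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ Y                  # analytic output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy usage: 4 HVS features per image, 5 one-hot quality classes.
X = np.random.rand(200, 4)
Y = np.eye(5)[np.random.randint(0, 5, 200)]
W, b, beta = train_elm(X, Y)
predicted_class = predict_elm(X, W, b, beta).argmax(axis=1)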

Relevance:

30.00%

Publisher:

Abstract:

Existing approaches to digital halftoning of images are based primarily on thresholding. We propose a general framework for image halftoning where some function of the output halftone tracks another function of the input gray-tone. This approach is shown to unify most existing algorithms and to provide useful insights. Further, the new interpretation allows us to remedy problems in existing algorithms, such as error diffusion, and subsequently to achieve halftones having superior quality. The general nature of the proposed method is an advantage, since it offers a wide choice of the three filters and the update rule. An interesting product of this framework is that equally good, or better, halftones can be obtained by thresholding a noise process instead of the image itself.
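Illustrative sketch (Python): the framework above subsumes threshold-based schemes such as error diffusion; for reference, here is the textbook Floyd-Steinberg variant, in which diffusing the quantization error makes the local mean of the output halftone track the input gray-tone (this is the standard algorithm, not the paper's generalized scheme).

import numpy as np

def floyd_steinberg(gray):
    """Classic error-diffusion halftoning of an 8-bit grayscale image."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0    # threshold the pixel
            out[y, x] = new
            err = old - new                       # diffuse the error forward
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:
                img[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                img[y + 1, x + 1] += err * 1 / 16
    return out.astype(np.uint8)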

Relevance:

30.00%

Publisher:

Abstract:

Fundación Zain is developing new built heritage assessment protocols. The goal is to objectivize and standardize the analysis and decision process that leads to determining the degree of protection of built heritage in the Basque Country. The ultimate step in this objectivization and standardization effort will be the development of an information and communication technology (ICT) tool for the assessment of built heritage. This paper presents the groundwork carried out to make this tool possible: the automatic, image-based delineation of stone masonry. This is a necessary first step in the development of the tool, as the built heritage that will be assessed consists of stone masonry construction, and many of the features analyzed can be characterized according to the geometry and arrangement of the stones. Much of the assessment is carried out through visual inspection. Thus, this process will be automated by applying image processing to digital images of the elements under inspection. The principal contribution of this paper is the proposed automatic delineation framework. The other contribution is the performance evaluation of this delineation as the input to a classifier for a geometrically characterized feature of a built heritage object. The element chosen to perform this evaluation is the stone arrangement of masonry walls. The validity of the proposed framework is assessed on real images of masonry walls.
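Illustrative sketch (Python with OpenCV; a generic pipeline of my own, not the paper's delineation framework): mortar joints could be detected as edges, closed morphologically, and the enclosed regions labelled as candidate stones whose geometry and arrangement would then feed a classifier.

import cv2
import numpy as np

def delineate_stones(image_path):
    """Rough sketch: treat edge pixels as candidate mortar joints, bridge
    gaps, and label the enclosed regions as candidate stones."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)                    # suppress surface texture
    edges = cv2.Canny(gray, 50, 150)                            # candidate joint pixels
    kernel = np.ones((5, 5), np.uint8)
    joints = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)   # bridge small gaps
    n_labels, labels = cv2.connectedComponents(cv2.bitwise_not(joints))
    return labels, n_labels - 1   # label 0 corresponds to the joint network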

Relevance:

30.00%

Publisher:

Abstract:

Our study of a novel technique for adaptive image sequence coding is reported. The number of reference frames and the intervals between them are adjusted to improve the temporal compensability of the input video. The bits are distributed more efficiently on different frame types according to temporal and spatial complexity of the image scene. Experimental results show that this dynamic group-of-picture (GOP) structure coding scheme is not only feasible but also better than the conventional fixed GOP method in terms of perceptual quality and SNR. (C) 1996 Society of Photo-Optical Instrumentation Engineers.
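Illustrative sketch (Python; a toy stand-in with a hypothetical threshold, not the paper's rate-allocation scheme): a new reference (I) frame is started whenever the inter-frame difference suggests poor temporal compensability, or when a maximum GOP length is reached.

import numpy as np

def adaptive_gop_boundaries(frames, max_gop=30, change_threshold=12.0):
    """Return the frame indices at which a new GOP should begin."""
    boundaries = [0]
    last_i_frame = 0
    for i in range(1, len(frames)):
        # Mean absolute difference as a crude measure of temporal complexity.
        mad = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if mad > change_threshold or i - last_i_frame >= max_gop:
            boundaries.append(i)
            last_i_frame = i
    return boundaries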

Relevance:

30.00%

Publisher:

Abstract:

The hybrid opto-digital joint transform correlator (HODJTC) is effective for image motion measurement, but it differs from the traditional joint transform correlator in that it has only one optical transform, and the joint power spectrum is directly input to a digital processing unit that computes the image shift. The local cross-correlation image can be obtained directly by adopting a local Fourier transform operator. After the pixel-level location of the cross-correlation peak is initially obtained, an up-sampling technique is introduced to relocate the peak with even higher accuracy. With a signal-to-noise ratio >= 20 dB, an up-sampling factor k >= 10, and a maximum image shift <= 60 pixels, the root-mean-square error of the motion measurement can be kept below 0.05 pixels.
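Illustrative sketch (Python): the digital peak-relocation step can be approximated with a standard DFT-upsampling registration routine such as scikit-image's phase_cross_correlation; this stands in for, but is not, the paper's opto-digital pipeline, and the simulated shift below is arbitrary.

import numpy as np
from scipy.ndimage import shift as apply_shift
from skimage.registration import phase_cross_correlation

rng = np.random.default_rng(0)
reference = rng.random((256, 256))
moved = apply_shift(reference, shift=(3.3, -1.7))   # simulate a sub-pixel image shift

# upsample_factor=10 refines the correlation peak to roughly 0.1-pixel
# resolution, analogous to the up-sampling relocation step described above.
estimated_shift, error, _ = phase_cross_correlation(reference, moved,
                                                    upsample_factor=10)
print(estimated_shift)   # magnitude ~= (3.3, 1.7); sign follows the library's convention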

Relevance:

30.00%

Publisher:

Abstract:

This paper consists of two major parts. First, we present the outline of a simple approach to a very-low-bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input, as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.
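Illustrative sketch (Python): one component mentioned above, finding the nearest neighbors in a database of example patches for a new input, in its simplest brute-force form (the feature vectors here are hypothetical placeholders).

import numpy as np

def nearest_examples(query, database, k=3):
    """Return indices of the k example patches whose feature vectors are
    closest to the query (Euclidean distance)."""
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)[:k]

# Toy usage: 1000 stored example patches described by 64-D feature vectors.
database = np.random.rand(1000, 64)
query = np.random.rand(64)
best = nearest_examples(query, database, k=3)
# The retrieved patches of interest could then be blended to synthesize a new image.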

Relevance:

30.00%

Publisher:

Abstract:

We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure" model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
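Illustrative sketch (Python): the inference step under a plain PCA prior; the paper uses a mixture of probabilistic PCA models, so this single-subspace version with made-up dimensions is only a schematic of how missing structure parameters can be recovered from observed contour dimensions.

import numpy as np

def infer_missing(x_obs, obs_idx, mean, components):
    """Fit latent coefficients to the observed dimensions only, then
    reconstruct the full vector (contour features + structure parameters)."""
    A = components[:, obs_idx]                    # (n_components, n_observed)
    b = x_obs - mean[obs_idx]
    z, *_ = np.linalg.lstsq(A.T, b, rcond=None)   # least-squares latent estimate
    return mean + z @ components

# Toy setup: 100-D joint vectors (80 contour dims + 20 structure dims),
# with the structure dimensions missing at test time.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 100))
mean = data.mean(axis=0)
U, S, Vt = np.linalg.svd(data - mean, full_matrices=False)
components = Vt[:10]                              # 10 principal components
obs_idx = np.arange(80)                           # observed contour dimensions
x_full = infer_missing(data[0, obs_idx], obs_idx, mean, components)
structure_estimate = x_full[80:]                  # inferred structure parameters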

Relevance:

30.00%

Publisher:

Abstract:

This poster is based on the following paper: C. Kwan and M. Betke. Camera Canvas: Image editing software for people with disabilities. In Proceedings of the 14th International Conference on Human Computer Interaction (HCI International 2011), Orlando, Florida, July 2011.

Relevance:

30.00%

Publisher:

Abstract:

Camera Canvas is an image editing software package for users with severe disabilities that limit their mobility. It is specially designed for Camera Mouse, a camera-based mouse-substitute input system. Users can manipulate images through various head movements, tracked by Camera Mouse. The system is also fully usable with traditional mouse or touch-pad input. Designing the system, we studied the requirements and solutions for image editing and content creation using Camera Mouse. Experiments with 20 subjects, each testing Camera Canvas with Camera Mouse as the input mechanism, showed that users found the software easy to understand and operate. User feedback was taken into account to make the software more usable and the interface more intuitive. We suggest that the Camera Canvas software makes important progress in providing a new medium of utility and creativity in computing for users with severe disabilities.

Relevance:

30.00%

Publisher:

Abstract:

A non-linear supervised learning architecture, the Specialized Mapping Architecture (SMA), and its application to articulated body pose reconstruction from single monocular images are described. The architecture is formed by a number of specialized mapping functions, each with the purpose of mapping certain portions (connected or not) of the input space, together with a feedback matching process. A probabilistic model for the architecture is described, along with a mechanism for learning its parameters. The learning problem is approached using a maximum likelihood estimation framework; we present Expectation Maximization (EM) algorithms for two different instances of the likelihood probability. Performance is characterized by estimating human body postures from low-level visual features, showing promising results.
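Illustrative sketch (Python; a greatly simplified stand-in for the SMA learning step, not the paper's model): EM for a mixture of linear mapping functions from visual features x to pose vectors y, where the E-step computes responsibilities from Gaussian residuals and the M-step refits each specialized mapping by weighted least squares.

import numpy as np

def em_mixture_of_mappings(X, Y, n_maps=3, n_iter=50, seed=0):
    """EM for a mixture of linear mappings y ~ W_k^T [x, 1] with Gaussian noise."""
    rng = np.random.default_rng(seed)
    n, d_in = X.shape
    d_out = Y.shape[1]
    Xb = np.hstack([X, np.ones((n, 1))])                 # append bias column
    W = [rng.normal(scale=0.1, size=(d_in + 1, d_out)) for _ in range(n_maps)]
    pi = np.full(n_maps, 1.0 / n_maps)
    sigma2 = np.ones(n_maps)

    for _ in range(n_iter):
        # E-step: responsibility of each mapping function for each sample.
        logp = np.zeros((n, n_maps))
        for k in range(n_maps):
            sq = np.sum((Y - Xb @ W[k]) ** 2, axis=1)
            logp[:, k] = (np.log(pi[k]) - 0.5 * sq / sigma2[k]
                          - 0.5 * d_out * np.log(2 * np.pi * sigma2[k]))
        logp -= logp.max(axis=1, keepdims=True)
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)

        # M-step: weighted refit of each specialized mapping.
        for k in range(n_maps):
            w = R[:, k]
            Xw = Xb * w[:, None]
            W[k] = np.linalg.lstsq(Xw.T @ Xb, Xw.T @ Y, rcond=None)[0]
            resid = Y - Xb @ W[k]
            sigma2[k] = np.sum(w * np.sum(resid ** 2, axis=1)) / (w.sum() * d_out) + 1e-8
            pi[k] = w.mean()
    return W, pi, sigma2

# Toy usage: 10-D visual features mapped to 20-D pose vectors.
X, Y = np.random.rand(300, 10), np.random.rand(300, 20)
W, pi, sigma2 = em_mixture_of_mappings(X, Y)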

Relevance:

30.00%

Publisher:

Abstract:

Under natural viewing conditions, small movements of the eye, head, and body prevent the maintenance of a steady direction of gaze. It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. However, it is unclear whether the physiological motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. This study examines the impact of fixational instability on the statistics of the visual input to the retina and on the structure of neural activity in the early visual system. We show that fixational instability introduces a component in the retinal input signals that, in the presence of natural images, lacks spatial correlations. This component strongly influences neural activity in a model of the LGN. It decorrelates cell responses even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counterbalance the power-law spectrum of natural images. A decorrelation of neural activity at the early stages of the visual system has been proposed to be beneficial for discarding statistical redundancies in the input signals. The results of this study suggest that fixational instability might contribute to establishing efficient representations of natural stimuli.

Relevance:

30.00%

Publisher:

Abstract:

© 2015 IEEE. In virtual reality applications, there is an aim to provide real-time graphics that run at high refresh rates. However, there are many situations in which this is not possible due to simulation or rendering issues. When running at low frame rates, several aspects of the user experience are affected. For example, each frame is displayed for an extended period of time, causing a high-persistence image artifact. The effect of this artifact is that movement may lose continuity, and the image jumps from one frame to another. In this paper, we discuss our initial exploration of the effects of high-persistence frames caused by low refresh rates and compare them to high frame rates and to a technique we developed to mitigate the effects of low frame rates. In this technique, the low-frame-rate simulation images are displayed with low persistence by blanking out the display during the extra time each image would otherwise be displayed. In order to isolate the visual effects, we constructed a simulator for low- and high-persistence displays that does not affect input latency. A controlled user study comparing the three conditions for the tasks of 3D selection and navigation was conducted. Results indicate that the low-persistence display technique may not negatively impact user experience or performance as compared to the high-persistence case. Directions for future work on the use of low-persistence displays for low-frame-rate situations are discussed.
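Illustrative sketch (Python; hypothetical frame rates): the presentation schedule for a single low-frame-rate simulation frame, with and without the blanking technique described above.

def presentation_schedule(sim_fps, display_fps, low_persistence):
    """List what each display refresh shows during one simulation frame.
    Low persistence shows the rendered frame for one refresh and blanks the
    display for the remaining refreshes; high persistence repeats the frame."""
    refreshes_per_frame = round(display_fps / sim_fps)
    if low_persistence:
        return ["frame"] + ["blank"] * (refreshes_per_frame - 1)
    return ["frame"] * refreshes_per_frame

# Example: a 15 fps simulation on a 90 Hz display.
print(presentation_schedule(15, 90, low_persistence=False))  # the same frame 6 times
print(presentation_schedule(15, 90, low_persistence=True))   # 1 frame, then 5 blanks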

Relevance:

30.00%

Publisher:

Abstract:

A new front-end image processing chip is presented for real-time small-object detection. It has been implemented using a 0.6 µm, 3.3 V CMOS technology and operates on 10-bit input data at 54 megasamples per second. It occupies an area of 12.9 mm × 13.6 mm (including pads), dissipates 1.5 W, has 92 I/O pins, and is to be housed in a 160-pin ceramic quad flat-pack. It performs both one- and two-dimensional FIR filtering and a multilayer perceptron (MLP) neural network function using a reconfigurable array of 21 multiplication-accumulation cells, which corresponds to a window size of 7×3. The chip can cope with images of 2047 pixels per line and can be cascaded to cope with larger window sizes. The chip performs two billion fixed-point multiplications and additions per second.
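Illustrative sketch (Python with NumPy/SciPy; hypothetical weights): a software analogue of the chip's datapath, i.e. a 7×3-window FIR filter (21 multiply-accumulate operations per output pixel, matching the 21 MAC cells) and a small MLP evaluated on the same window.

import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
image = rng.random((64, 64))

# Two-dimensional FIR filtering with a 3-line by 7-pixel kernel.
fir_kernel = rng.normal(size=(3, 7))
filtered = convolve2d(image, fir_kernel, mode='same', boundary='symm')

def mlp_window_response(window, W1, b1, W2, b2):
    """Tiny multilayer perceptron applied to a flattened 3x7 window."""
    h = np.tanh(window.ravel() @ W1 + b1)
    return h @ W2 + b2

W1, b1 = rng.normal(size=(21, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)
window = image[30:33, 28:35]          # one 3-line by 7-pixel window
score = mlp_window_response(window, W1, b1, W2, b2)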