873 results for human visual masking


Relevance:

80.00%

Publisher:

Abstract:

The main goal of this research is to design an efficient compression algorithm for fingerprint images. The wavelet transform technique is the principal tool used to reduce interpixel redundancies and to obtain a parsimonious representation of these images. A specific fixed decomposition structure is designed for use by the wavelet packet in order to save on computation, transmission, and storage costs. This decomposition structure is based on an analysis of the information packing performance of several decompositions, the two-dimensional power spectral density, the effect of each frequency band on the reconstructed image, and human visual sensitivities. This fixed structure is found to provide the "most" suitable representation for fingerprints, according to the chosen criteria. Different compression techniques are used for different subbands, based on their observed statistics. The decision is based on the effect of each subband on the reconstructed image according to the mean-square criterion as well as the sensitivities of human vision.

To design an efficient quantization algorithm, a precise model for the distribution of the wavelet coefficients is developed. The model is based on the generalized Gaussian distribution. A least squares algorithm on a nonlinear function of the distribution model's shape parameter is formulated to estimate the model parameters. A noise-shaping bit allocation procedure is then used to assign the bit rate among subbands. To obtain high compression ratios, vector quantization is used. In this work, lattice vector quantization (LVQ) is chosen because of its superior performance over other types of vector quantizers. The structure of a lattice quantizer is determined by its parameters, known as the truncation level and scaling factor. In lattice-based compression algorithms reported in the literature, the lattice structure is commonly predetermined, leading to a non-optimized quantization approach. In this research, a new technique for determining the lattice parameters is proposed. In the lattice structure design, no assumption about the lattice parameters is made and no training or multi-quantizing is required. The design is based on minimizing the quantization distortion by adapting to the statistical characteristics of the source in each subimage.

Since LVQ is a multidimensional generalization of uniform quantizers, it produces minimum distortion for inputs with uniform distributions. In order to take advantage of the properties of LVQ and its fast implementation, while accounting for the i.i.d. nonuniform distribution of wavelet coefficients, the piecewise-uniform pyramid LVQ algorithm is proposed. The proposed algorithm quantizes almost all source vectors without the need to project them onto the lattice's outermost shell, while properly maintaining a small codebook size. It also resolves the wedge region problem commonly encountered with sharply distributed random sources. These represent some of the drawbacks of the algorithm proposed by Barlaud [26]. The proposed algorithm handles all types of lattices, not only cubic lattices, as opposed to the algorithms developed by Fischer [29] and Jeong [42]. Furthermore, no training or multi-quantizing (to determine lattice parameters) is required, as opposed to Powell's algorithm [78]. For coefficients with high-frequency content, the positive-negative mean algorithm is proposed to improve the resolution of reconstructed images.

For coefficients with low-frequency content, a lossless predictive compression scheme is used to preserve the quality of reconstructed images. A method to reduce the bit requirements of the necessary side information is also introduced. Lossless entropy coding techniques are subsequently used to remove coding redundancy. The algorithms result in high-quality reconstructed images with better compression ratios than other available algorithms. To evaluate the proposed algorithms, objective and subjective performance comparisons with other available techniques are presented. The quality of the reconstructed images is important for reliable identification. Enhancement and feature extraction on the reconstructed images are also investigated in this research. A structure-based feature extraction algorithm is proposed in which the unique properties of fingerprint textures are used to enhance the images and improve the fidelity of their characteristic features. The ridges are extracted from enhanced grey-level foreground areas based on the local ridge dominant directions. The proposed ridge extraction algorithm properly preserves the natural shape of grey-level ridges as well as the precise locations of the features, as opposed to the ridge extraction algorithm in [81]. Furthermore, it is fast and operates only on foreground regions, as opposed to the adaptive floating average thresholding process in [68]. Spurious features are subsequently eliminated using the proposed post-processing scheme.
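
As a concrete illustration of the coefficient-modelling step described above, the sketch below fits a generalized Gaussian to a wavelet subband. It uses the standard moment-matching estimator (the ratio E|X|/sqrt(E[X^2]) as a nonlinear function of the shape parameter) rather than the thesis's least-squares formulation, so the function names and the bracketing interval are assumptions for illustration.

    # Sketch: generalized Gaussian fit for one wavelet subband, via the
    # moment ratio E|X| / sqrt(E[X^2]) = G(2/b) / sqrt(G(1/b) * G(3/b)),
    # where G is the gamma function and b the shape parameter.
    import numpy as np
    from scipy.special import gamma
    from scipy.optimize import brentq

    def ggd_shape(coeffs):
        """Estimate the GGD shape parameter b (2 = Gaussian, 1 = Laplacian)."""
        x = np.asarray(coeffs, dtype=float).ravel()
        r = np.mean(np.abs(x)) / np.sqrt(np.mean(x**2))  # observed moment ratio
        f = lambda b: gamma(2.0 / b) / np.sqrt(gamma(1.0 / b) * gamma(3.0 / b)) - r
        return brentq(f, 0.05, 5.0)  # bracket assumes a ratio in the typical range

    def ggd_scale(coeffs, b):
        """Scale parameter alpha from the sample variance and the shape b."""
        var = np.var(np.asarray(coeffs, dtype=float))
        return np.sqrt(var * gamma(1.0 / b) / gamma(3.0 / b))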

Relevance:

80.00%

Publisher:

Abstract:

Detection of Regions of Interest (ROIs) in a video leads to more efficient utilization of bandwidth, because any ROIs in a given frame can be encoded at higher quality than the rest of that frame with little or no degradation of quality as perceived by viewers. Consequently, it is not necessary to uniformly encode the whole video at high quality. One approach to determining ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground truth saliency maps to measure the effectiveness of ROI detection by considering the role of user experience during the labelling of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors demonstrate the effectiveness of this approach to validating visual saliency in video. The paper also provides the datasets associated with the experiments.
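
As a minimal illustration of scoring a detector against such ground truth (the paper's own evaluation protocol may differ), the sketch below computes the linear correlation coefficient between a predicted saliency map and a ground-truth map; the function name is an assumption.

    # Sketch: linear correlation coefficient (CC) between two saliency maps.
    import numpy as np

    def saliency_cc(predicted_map, ground_truth_map):
        """Pearson correlation between predicted and ground-truth saliency, in [-1, 1]."""
        p = np.asarray(predicted_map, dtype=float).ravel()
        g = np.asarray(ground_truth_map, dtype=float).ravel()
        p = (p - p.mean()) / (p.std() + 1e-12)  # normalise each map
        g = (g - g.mean()) / (g.std() + 1e-12)
        return float(np.mean(p * g))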

Relevance:

80.00%

Publisher:

Abstract:

Remote networked collaboration around business process model documentation suffers from many communication problems. The aim of this project is to address some of these problems by using digital 3D representations of human visual cues. Results from this project increased our understanding of the role and effects of visual cues in remote collaboration, specifically for validating business process models. This thesis proposes technology designs to support such cues across a distance, combining qualitative and quantitative methods of analysis to assess the impact of these cues on the communication, coordination, and performance of a team collaborating remotely.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we present a machine learning approach to measuring the visual quality of JPEG-coded images. The features for predicting perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity, and background luminance. Image quality assessment involves estimating the functional relationship between HVS features and subjective test scores. The quality of a compressed image is obtained without reference to its original image (a 'no-reference' metric). Here, the problem of quality estimation is transformed into a classification problem and solved using the extreme learning machine (ELM) algorithm. In ELM, the input weights and bias values are randomly chosen and the output weights are analytically calculated. The generalization performance of the ELM algorithm for classification problems with an imbalance in the number of samples per quality class depends critically on the input weights and bias values. Hence, we propose two schemes, the k-fold selection scheme (KS-ELM) and the real-coded genetic algorithm (RCGA-ELM), to select the input weights and bias values such that the generalization performance of the classifier is maximized. Results indicate that the proposed schemes significantly improve the performance of the ELM classifier under imbalanced conditions for image quality assessment. The experimental results show that the visual quality estimated by the proposed RCGA-ELM emulates the mean opinion score very well. The experimental results are compared with an existing JPEG no-reference image quality metric and a full-reference structural similarity image quality metric.
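
For readers unfamiliar with ELM, the sketch below shows the core of the algorithm as described above: randomly chosen input weights and biases, sigmoid hidden units, and output weights computed analytically via a pseudoinverse. The class name and hyperparameters are illustrative assumptions; the proposed KS-ELM and RCGA-ELM additionally select the random weights and biases, which this sketch omits.

    # Sketch: a basic extreme learning machine classifier.
    import numpy as np

    class ELMClassifier:
        def __init__(self, n_hidden=50, seed=0):
            self.n_hidden = n_hidden
            self.rng = np.random.default_rng(seed)

        def fit(self, X, y):
            T = np.eye(int(y.max()) + 1)[y]                   # one-hot targets
            self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
            self.b = self.rng.normal(size=self.n_hidden)      # random weights/biases
            H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid hidden layer
            self.beta = np.linalg.pinv(H) @ T                 # analytic output weights
            return self

        def predict(self, X):
            H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))
            return np.argmax(H @ self.beta, axis=1)           # predicted quality class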

Relevance:

80.00%

Publisher:

Abstract:

The human visual system has adapted to function in different lighting environments and responds to contrast rather than to the amount of light as such. On the one hand, this ensures constancy of perception: for example, white paper looks white both in bright sunlight and in dim moonlight, because contrast is invariant to changes in overall light level. On the other hand, the brightness of surfaces has to be reconstructed from the contrast signal, because no signal from the surfaces as such is conveyed to the visual cortex. In the visual cortex, the visual image is decomposed into local features by spatial filters that are selective for spatial frequency, orientation, and phase. Currently it is not known, however, how these features are subsequently integrated to form objects and object surfaces. In this thesis, the integration mechanisms of achromatic surfaces were studied by psychophysically measuring the spatial frequency and orientation tuning of brightness perception. In addition, the effect of textures on the spread of brightness and the effect of the phase of the inducing stimulus on brightness were measured. The novel findings of the thesis are that (1) a narrow spatial frequency band, independent of stimulus size and complexity, mediates brightness information; (2) figure-ground brightness illusions are narrowly tuned for orientation; (3) texture borders, without any luminance difference, are able to block the spread of brightness; and (4) edges and even- and odd-symmetric Gabors have a similar antagonistic effect on brightness. The narrow spatial frequency tuning suggests that only a subpopulation of neurons in V1 is involved in brightness perception. The independence of stimulus size and complexity indicates that the narrow tuning reflects hard-wired processing in the visual system. Further, it seems that figure-ground segregation and the mechanisms integrating contrast polarities are closely related to the low-level mechanisms of brightness perception. In conclusion, the results of the thesis suggest that a subpopulation of neurons in the visual cortex selectively integrates information from different contrast polarities to reconstruct surface brightness.
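
The cortical spatial filters referred to above are commonly modelled as Gabor functions; the sketch below builds one such filter, selective for spatial frequency, orientation, and phase. The parameterization is a generic textbook one, not taken from the thesis.

    # Sketch: a 2-D Gabor filter (Gaussian envelope times sinusoidal carrier).
    import numpy as np

    def gabor(size, freq, theta, phase, sigma):
        """Gabor patch: freq in cycles/pixel, theta and phase in radians."""
        r = np.arange(size) - size // 2
        x, y = np.meshgrid(r, r)
        xr = x * np.cos(theta) + y * np.sin(theta)            # rotated coordinate
        envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))  # Gaussian envelope
        carrier = np.cos(2.0 * np.pi * freq * xr + phase)     # oriented grating
        return envelope * carrier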

Relevance:

80.00%

Publisher:

Abstract:

Information visualization is a process of constructing a visual presentation of abstract quantitative data. The characteristics of visual perception enable humans to recognize patterns, trends and anomalies inherent in the data with little effort in a visual display. Such properties of the data are likely to be missed in a purely text-based presentation. Visualizations are therefore widely used in contemporary business decision support systems. Visual user interfaces called dashboards are tools for reporting the status of a company and its business environment to facilitate business intelligence (BI) and performance management activities. In this study, we examine the research on the principles of human visual perception and information visualization as well as the application of visualization in a business decision support system. A review of current BI software products reveals that the visualizations included in them are often quite ineffective in communicating important information. Based on the principles of visual perception and information visualization, we summarize a set of design guidelines for creating effective visual reporting interfaces.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we present a growing and pruning radial basis function (GAP-RBF) network based no-reference (NR) image quality model for JPEG-coded images. The quality of the images is estimated without reference to their original images. The features for predicting perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity, and background luminance. Image quality estimation involves computing the functional relationship between HVS features and subjective test scores. Here, the problem of quality estimation is transformed into a function approximation problem and solved using a GAP-RBF network. The GAP-RBF network uses a sequential learning algorithm to approximate the functional relationship. The computational complexity and memory requirements of the GAP-RBF algorithm are lower than those of batch learning algorithms. Also, the GAP-RBF algorithm finds a compact image quality model and does not require retraining when new image samples are presented. Experimental results show that the GAP-RBF image quality model emulates the mean opinion score (MOS). The subjective test results of the proposed metric are compared with a JPEG no-reference image quality index as well as a full-reference structural similarity image quality index, and it is observed to outperform both.
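
As a rough illustration of sequential growing RBF learning (heavily simplified relative to the actual GAP-RBF algorithm, whose growing and pruning decisions are based on neuron significance), the sketch below adds a hidden unit only when the prediction error is large and the input is novel; all names and thresholds are illustrative assumptions.

    # Sketch: minimal sequential RBF learning with a growth criterion.
    import numpy as np

    class MinimalGrowingRBF:
        def __init__(self, width=1.0, e_min=0.1, d_min=0.5):
            self.width, self.e_min, self.d_min = width, e_min, d_min
            self.centers, self.weights = [], []

        def predict(self, x):
            if not self.centers:
                return 0.0
            d = np.linalg.norm(np.array(self.centers) - x, axis=1)
            return float(np.dot(self.weights, np.exp(-(d / self.width) ** 2)))

        def observe(self, x, y):
            """One sequential step: grow a neuron if the error is large and x is novel."""
            err = y - self.predict(x)
            novel = (not self.centers or
                     np.min(np.linalg.norm(np.array(self.centers) - x, axis=1)) > self.d_min)
            if abs(err) > self.e_min and novel:
                self.centers.append(np.asarray(x, dtype=float))
                self.weights.append(err)  # new unit absorbs the residual error
            # (parameter updates and pruning of insignificant units omitted)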

Relevance:

80.00%

Publisher:

Abstract:

An attempt is made to present some challenging problems (mainly to the technically minded researcher) in the development of computational models for certain visual processes that are executed with apparently deceptive ease by the human visual system. However, in the interest of simplicity (and with a nonmathematical audience in mind), the presentation is almost completely devoid of mathematical formalism. Some of the findings in biological vision are presented in order to provoke approaches to their computational modelling. The development of ideas is not complete, and the vast literature on biological and computational vision cannot be reviewed here. A related but rather specific aspect of computational vision, namely the detection of edges, has been discussed by Zucker, who brings out some of the difficulties experienced in the classical approaches. Space limitations preclude any detailed analysis of even the elementary aspects of information processing in biological vision. However, the main purpose of the present paper is to highlight some of the fascinating problems in the frontier area of mathematically modelling the human visual system.

Relevance:

80.00%

Publisher:

Abstract:

Digital human modeling (DHM) involves modeling the structure, form, and functional capabilities of human users for ergonomics simulation. This paper presents the application of geometric procedures for investigating the characteristics of human visual capabilities, which are particularly important in this context. Using the cone of unrestricted directions through the pupil on a tessellated head model as the geometric interpretation of the clinical field-of-view (FoV), the computed FoVs are experimentally validated. FoVs are then recomputed by estimating the pupil movement for a given gaze direction using Listing's Law, and significant variation of the FoV is observed with variation in gaze direction. A novel cube-grid representation, which integrates the unit-cube representation of directions with the enhanced slice representation, is introduced for fast and exact point classification in point-visibility analysis for a given FoV. Computing the containment frequency of every grid cell for a given set of FoVs enables the determination of percentile-based FoV contours, which estimate the visual performance of a given population. This new concept makes visibility analysis more meaningful from an ergonomics point of view. The algorithms are fast enough to support interactive analysis of reasonably complex scenes on a typical desktop computer.
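
As a toy version of the point classification step, the sketch below tests whether a point lies inside a field of view approximated as a circular cone about the gaze direction. The paper's method instead uses the cone of unrestricted directions on a tessellated head model together with the cube-grid structure, so this is an assumption-laden simplification.

    # Sketch: point-in-FoV test against a circular cone of vision.
    import numpy as np

    def in_fov(point, eye, gaze, half_angle_deg):
        """True if `point` is within the cone from `eye` along `gaze`."""
        v = np.asarray(point, dtype=float) - np.asarray(eye, dtype=float)
        v /= np.linalg.norm(v)                 # direction to the point
        g = np.asarray(gaze, dtype=float)
        g /= np.linalg.norm(g)                 # gaze direction
        return bool(np.dot(v, g) >= np.cos(np.radians(half_angle_deg)))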

Relevance:

80.00%

Publisher:

Abstract:

373 p. : ill., graphs, photographs, tables

Relevance:

80.00%

Publisher:

Abstract:

Transcranial magnetic stimulation (TMS) is a technique that stimulates the brain using a magnetic coil placed on the scalp. Since it can be applied to humans non-invasively while directly interfering with neural electrical activity, it is potentially a good tool for studying the direct relationship between perceptual experience and neural activity. However, it has been difficult to produce a clear perceptible phenomenon with TMS of sensory areas, especially using a single magnetic pulse. The biophysical mechanisms of magnetic stimulation of single neurons have also been poorly understood.

In the psychophysical part of this thesis, perceptual phenomena induced by TMS of the human visual cortex are demonstrated as results of interactions with visual inputs. We first introduce a method to create a hole, or scotoma, in a flashed, large-field visual pattern using single-pulse TMS. Spatial aspects of the interactions are explored through the distortion of the scotoma, which depends on the visual pattern and can be luminance-defined or illusory. Its similarity to the distortion of afterimages is also discussed. Temporal interactions are demonstrated in the filling-in of the scotoma with temporally adjacent visual features, as well as in the effective suppression of transient visual features. Paired-pulse TMS is also shown to produce different brightness modulations in transient and sustained visual stimuli.

In the biophysical part, we first develop a biophysical theory to simulate the effect of magnetic stimulation on an arbitrary neuronal structure. Computer simulations are performed on cortical neuron models with realistic structure and channels, combined with a current injection that simulates magnetic stimulation. The simulation results account for general and basic characteristics of the macroscopic effects of TMS, including our psychophysical findings, such as a long inhibitory effect, dependence on the background activity, and dependence on the direction of the induced electric field.

The perceptual effects and the cortical neuron model presented here provide foundations for the study of the relationship between perception and neural activity. Further insights would be obtained from extending our model to neuronal networks and from psychophysical studies based on predictions of the biophysical model.

Relevance:

80.00%

Publisher:

Abstract:

Inspired by the human visual cognition mechanism, this paper first presents a scene classification method based on an improved standard model feature. Compared with state-of-the-art approaches to scene classification, the proposed method is more robust, more selective, and of lower complexity. These advantages are demonstrated by two sets of experiments on both our own database and standard public ones. Furthermore, occlusion and disorder problems in scene classification for video surveillance are also studied for the first time in this paper.

Relevance:

80.00%

Publisher:

Abstract:

This report describes the implementation of a theory of edge detection proposed by Marr and Hildreth (1979). According to this theory, the image is first processed independently through a set of different-size filters whose shape is the Laplacian of a Gaussian, ∇²G. Zero-crossings in the output of these filters mark the positions of intensity changes at different resolutions. Information about these zero-crossings is then used to derive a full symbolic description of changes in intensity in the image, called the raw primal sketch. The theory is closely tied to early processing in the human visual system. In this report, we first examine the critical properties of the initial filters used in the edge detection process, from both a theoretical and a practical standpoint. The implementation is then used as a test bed for exploring aspects of the human visual system, in particular acuity and hyperacuity. Finally, we present some preliminary results concerning the relationship between zero-crossings detected at different resolutions, and some observations relevant to the process by which the human visual system integrates descriptions of intensity changes obtained at different resolutions.
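
A compact sketch of the scheme described above, assuming SciPy's gaussian_laplace filter as the ∇²G operator: filter the image at a given scale and mark sign changes between neighbouring pixels as zero-crossings.

    # Sketch: Marr-Hildreth edge detection via LoG zero-crossings.
    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def zero_crossings(image, sigma):
        """Boolean map of zero-crossings in the LoG-filtered image at scale sigma."""
        log = gaussian_laplace(np.asarray(image, dtype=float), sigma)
        s = np.sign(log)
        zc = np.zeros(s.shape, dtype=bool)
        zc[:-1, :] |= (s[:-1, :] * s[1:, :]) < 0   # sign change between rows
        zc[:, :-1] |= (s[:, :-1] * s[:, 1:]) < 0   # sign change between columns
        return zc

    # Zero-crossing maps at several resolutions, as in the raw primal sketch:
    # edges = [zero_crossings(img, s) for s in (1.0, 2.0, 4.0)]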

Relevance:

80.00%

Publisher:

Abstract:

Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for use in estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on an assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion from moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20°) for wide-field biological vision. In the present paper, we show that an algorithm based on a generalization of the log-polar architecture, termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignment of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more correct analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages using synthetic optical flow maps as well as natural image sequences.
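
For concreteness, the sketch below performs plain log-polar sampling around a fixation point, i.e. the architecture that the log-dipolar sensor generalizes; the grid sizes and nearest-neighbour lookup are illustrative assumptions.

    # Sketch: log-polar sampling of a grayscale image around a fixation point.
    import numpy as np

    def log_polar_sample(image, center, n_rings=32, n_wedges=64, r_min=1.0):
        """Sample `image` on a log-polar grid centred at `center` = (row, col)."""
        h, w = image.shape
        r_max = min(h, w) / 2.0
        radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
        angles = 2.0 * np.pi * np.arange(n_wedges) / n_wedges
        rr = center[0] + radii[:, None] * np.sin(angles)[None, :]
        cc = center[1] + radii[:, None] * np.cos(angles)[None, :]
        rr = np.clip(np.rint(rr).astype(int), 0, h - 1)   # nearest-neighbour lookup
        cc = np.clip(np.rint(cc).astype(int), 0, w - 1)
        return image[rr, cc]                              # shape (n_rings, n_wedges)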