91 resultados para saliency


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Visual information processing in brain proceeds in both serial and parallel fashion throughout various functionally distinct hierarchically organised cortical areas. Feedforward signals from retina and hierarchically lower cortical levels are the major activators of visual neurons, but top-down and feedback signals from higher level cortical areas have a modulating effect on neural processing. My work concentrates on visual encoding in hierarchically low level cortical visual areas in human brain and examines neural processing especially in cortical representation of visual field periphery. I use magnetoencephalography and functional magnetic resonance imaging to measure neuromagnetic and hemodynamic responses during visual stimulation and oculomotor and cognitive tasks from healthy volunteers. My thesis comprises six publications. Visual cortex forms a great challenge for modeling of neuromagnetic sources. My work shows that a priori information of source locations are needed for modeling of neuromagnetic sources in visual cortex. In addition, my work examines other potential confounding factors in vision studies such as light scatter inside the eye which may result in erroneous responses in cortex outside the representation of stimulated region, and eye movements and attention. I mapped cortical representations of peripheral visual field and identified a putative human homologue of functional area V6 of the macaque in the posterior bank of parieto-occipital sulcus. My work shows that human V6 activates during eye-movements and that it responds to visual motion at short latencies. These findings suggest that human V6, like its monkey homologue, is related to fast processing of visual stimuli and visually guided movements. I demonstrate that peripheral vision is functionally related to eye-movements and connected to rapid stream of functional areas that process visual motion. In addition, my work shows two different forms of top-down modulation of neural processing in the hierachically lowest cortical levels; one that is related to dorsal stream activation and may reflect motor processing or resetting signals that prepare visual cortex for change in the environment and another local signal enhancement at the attended region that reflects local feed-back signal and may perceptionally increase the stimulus saliency.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Energy-based direct methods for transient stability analysis are potentially useful both as offline tools for planning purposes as well as for online security assessment. In this paper, a novel structure-preserving energy function (SPEF) is developed using the philosophy of structure-preserving model for the system and detailed generator model including flux decay, transient saliency, automatic voltage regulator (AVR), exciter and damper winding. A simpler and yet general expression for the SPEF is also derived which can simplify the computation of the energy function. The system equations and the energy function are derived using the centre-of-inertia (COI) formulation and the system loads are modelled as arbitrary functions of the respective bus voltages. Application of the proposed SPEF to transient stability evaluation of power systems is illustrated with numerical examples.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes the field oriented control of a salient pole wound field synchronous machine in stator flux coordinates. The procedure for derivation of flux linkage equations along any general rotating axes including stator flux axes is given. The stator flux equations are used to identify the cross-coupling occurring between the axes due to saliency in the machine. The coupling terms are canceled as feedforward terms in the generation of references for current controllers to achieve good decoupling during transients. The design of current controller for stator-flux-oriented control is presented. This paper proposes the method of extending rotor flux closed loop observer for sensorless control of wound field synchronous machine. This paper also proposes a new sensorless control by using stator flux closed loop observer and estimation of torque angle using stator current components in stator flux coordinates. Detailed experimental results from a sensorless 15.8 hp salient pole wound field synchronous machine drive are presented to demonstrate the performance of the proposed control strategy from a low speed of 0.8 Hz to 50 Hz.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Regions in video streams attracting human interest contribute significantly to human understanding of the video. Being able to predict salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions and smart insertion of dialog (closed-caption text) into the video stream can significantly be improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency into yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical-and algorithm-based method gaze buffering is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Salient object detection has become an important task in many image processing applications. The existing approaches exploit background prior and contrast prior to attain state of the art results. In this paper, instead of using background cues, we estimate the foreground regions in an image using objectness proposals and utilize it to obtain smooth and accurate saliency maps. We propose a novel saliency measure called `foreground connectivity' which determines how tightly a pixel or a region is connected to the estimated foreground. We use the values assigned by this measure as foreground weights and integrate these in an optimization framework to obtain the final saliency maps. We extensively evaluate the proposed approach on two benchmark databases and demonstrate that the results obtained are better than the existing state of the art approaches.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis presents a biologically plausible model of an attentional mechanism for forming position- and scale-invariant representations of objects in the visual world. The model relies on a set of control neurons to dynamically modify the synaptic strengths of intra-cortical connections so that information from a windowed region of primary visual cortex (Vl) is selectively routed to higher cortical areas. Local spatial relationships (i.e., topography) within the attentional window are preserved as information is routed through the cortex, thus enabling attended objects to be represented in higher cortical areas within an object-centered reference frame that is position and scale invariant. The representation in V1 is modeled as a multiscale stack of sample nodes with progressively lower resolution at higher eccentricities. Large changes in the size of the attentional window are accomplished by switching between different levels of the multiscale stack, while positional shifts and small changes in scale are accomplished by translating and rescaling the window within a single level of the stack. The control signals for setting the position and size of the attentional window are hypothesized to originate from neurons in the pulvinar and in the deep layers of visual cortex. The dynamics of these control neurons are governed by simple differential equations that can be realized by neurobiologically plausible circuits. In pre-attentive mode, the control neurons receive their input from a low-level "saliency map" representing potentially interesting regions of a scene. During the pattern recognition phase, control neurons are driven by the interaction between top-down (memory) and bottom-up (retinal input) sources. The model respects key neurophysiological, neuroanatomical, and psychophysical data relating to attention, and it makes a variety of experimentally testable predictions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis addresses a series of topics related to the question of how people find the foreground objects from complex scenes. With both computer vision modeling, as well as psychophysical analyses, we explore the computational principles for low- and mid-level vision.

We first explore the computational methods of generating saliency maps from images and image sequences. We propose an extremely fast algorithm called Image Signature that detects the locations in the image that attract human eye gazes. With a series of experimental validations based on human behavioral data collected from various psychophysical experiments, we conclude that the Image Signature and its spatial-temporal extension, the Phase Discrepancy, are among the most accurate algorithms for saliency detection under various conditions.

In the second part, we bridge the gap between fixation prediction and salient object segmentation with two efforts. First, we propose a new dataset that contains both fixation and object segmentation information. By simultaneously presenting the two types of human data in the same dataset, we are able to analyze their intrinsic connection, as well as understanding the drawbacks of today’s “standard” but inappropriately labeled salient object segmentation dataset. Second, we also propose an algorithm of salient object segmentation. Based on our novel discoveries on the connections of fixation data and salient object segmentation data, our model significantly outperforms all existing models on all 3 datasets with large margins.

In the third part of the thesis, we discuss topics around the human factors of boundary analysis. Closely related to salient object segmentation, boundary analysis focuses on delimiting the local contours of an object. We identify the potential pitfalls of algorithm evaluation for the problem of boundary detection. Our analysis indicates that today’s popular boundary detection datasets contain significant level of noise, which may severely influence the benchmarking results. To give further insights on the labeling process, we propose a model to characterize the principles of the human factors during the labeling process.

The analyses reported in this thesis offer new perspectives to a series of interrelating issues in low- and mid-level vision. It gives warning signs to some of today’s “standard” procedures, while proposing new directions to encourage future research.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A common approach to visualise multidimensional data sets is to map every data dimension to a separate visual feature. It is generally assumed that such visual features can be judged independently from each other. However, we have recently shown that interactions between features do exist [Hannus et al. 2004; van den Berg et al. 2005]. In those studies, we first determined individual colour and size contrast or colour and orientation contrast necessary to achieve a fixed level of discrimination performance in single feature search tasks. These contrasts were then used in a conjunction search task in which the target was defined by a combination of a colour and a size or a colour and an orientation. We found that in conjunction search, despite the matched feature discriminability, subjects significantly more often chose an item with the correct colour than one with correct size or orientation. This finding may have consequences for visualisation: the saliency of information coded by objects' size or orientation may change when there is a need to simultaneously search for colour that codes another aspect of the information. In the present experiment, we studied whether a colour bias can also be found in a more complex and continuous task, Subjects had to search for a target in a node-link diagram consisting of SO nodes, while their eye movements were being tracked, Each node was assigned a random colour and size (from a range of 10 possible values with fixed perceptual distances). We found that when we base the distances on the mean threshold contrasts that were determined in our previous experiments, the fixated nodes tend to resemble the target colour more than the target size (Figure 1a). This indicates that despite the perceptual matching, colour is judged with greater precision than size during conjunction search. We also found that when we double the size contrast (i.e. the distances between the 10 possible node sizes), this effect disappears (Figure 1b). Our findings confirm that the previously found decrease in salience of other features during colour conjunction search is also present in more complex (more 'visualisation- realistic') visual search tasks. The asymmetry in visual search behaviour can be compensated for by manipulating step sizes (perceptual distances) within feature dimensions. Our results therefore also imply that feature hierarchies are not completely fixed and may be adapted to the requirements of a particular visualisation. Copyright © 2005 by the Association for Computing Machinery, Inc.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis shows how to detect boundaries on the basis of motion information alone. The detection is performed in two stages: (i) the local estimation of motion discontinuities and of the visual flowsfield; (ii) the extraction of complete boundaries belonging to differently moving objects. For the first stage, three new methods are presented: the "Bimodality Tests,'' the "Bi-distribution Test,'' and the "Dynamic Occlusion Method.'' The second stage consists of applying the "Structural Saliency Method,'' by Sha'ashua and Ullman to extract complete and unique boundaries from the output of the first stage. The developed methods can successfully segment complex motion sequences.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goalmodulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Small salient-pole machines, in the range 30 kVA to 2 MVA, are often used in distributed generators, which in turn are likely to form the major constituent of power generation in power system islanding schemes or microgrids. In addition to power system faults, such as short-circuits, islanding contains an inherent risk of out-of-synchronism re-closure onto the main power system. To understand more fully the effect of these phenomena on a small salient-pole alternator, the armature and field currents from tests conducted on a 31.5 kVA machine are analysed. This study demonstrates that by resolving the voltage difference between the machine terminals and bus into direct and quadrature axis components, interesting properties of the transient currents are revealed. The presence of saliency and short time-constants cause intriguing differences between machine events such as out-of-phase synchronisations and sudden three-phase short-circuits.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Painterly rendering has been linked to computer vision, but we propose to link it to human vision because perception and painting are two processes that are interwoven. Recent progress in developing computational models allows to establish this link. We show that completely automatic rendering can be obtained by applying four image representations in the visual system: (1) colour constancy can be used to correct colours, (2) coarse background brightness in combination with colour coding in cytochrome-oxidase blobs can be used to create a background with a big brush, (3) the multi-scale line and edge representation provides a very natural way to render fi ner brush strokes, and (4) the multi-scale keypoint representation serves to create saliency maps for Focus-of-Attention, and FoA can be used to render important structures. Basic processes are described, renderings are shown, and important ideas for future research are discussed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

End-stopped cells in cortical area V1, which combine out- puts of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representa- tion, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Al- though a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Keypoints (junctions) provide important information for focus-of-attention (FoA) and object categorization/recognition. In this paper we analyze the multi-scale keypoint representation, obtained by applying a linear and quasi-continuous scaling to an optimized model of cortical end-stopped cells, in order to study its importance and possibilities for developing a visual, cortical architecture.We show that keypoints, especially those which are stable over larger scale intervals, can provide a hierarchically structured saliency map for FoA and object recognition. In addition, the application of non-classical receptive field inhibition to keypoint detection allows to distinguish contour keypoints from texture (surface) keypoints.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention. Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions.