123 results for Visual identification
Abstract:
It is commonly believed that visual short-term memory (VSTM) consists of a fixed number of "slots" in which items can be stored. An alternative theory in which memory resource is a continuous quantity distributed over all items seems to be refuted by the appearance of guessing in human responses. Here, we introduce a model in which resource is not only continuous but also variable across items and trials, causing random fluctuations in encoding precision. We tested this model against previous models using two VSTM paradigms and two feature dimensions. Our model accurately accounts for all aspects of the data, including apparent guessing, and outperforms slot models in formal model comparison. At the neural level, variability in precision might correspond to variability in neural population gain and doubly stochastic stimulus representation. Our results suggest that VSTM resource is continuous and variable rather than discrete and fixed and might explain why subjective experience of VSTM is not all or none.
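The contrast between fixed and variable precision can be illustrated with a small simulation. This is a minimal sketch under assumptions, not the authors' model: it treats the von Mises concentration parameter as the precision, and it assumes a gamma distribution for the trial-to-trial variability, which the abstract does not specify. The function names and parameter values are hypothetical.

```python
import math
import random

def wrap(angle):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def simulate_errors(n_trials, mean_kappa, shape=None, rng=None):
    """Draw circular estimation errors around a true value of 0.

    shape=None -> the same precision on every trial.
    shape=k    -> precision drawn per trial from a gamma distribution
                  with shape k and mean mean_kappa (an assumption).
    """
    rng = rng or random.Random(0)
    errors = []
    for _ in range(n_trials):
        if shape is None:
            kappa = mean_kappa
        else:
            kappa = rng.gammavariate(shape, mean_kappa / shape)
        errors.append(wrap(rng.vonmisesvariate(0.0, kappa)))
    return errors

def large_error_fraction(errors, cutoff=math.pi / 2):
    """Fraction of trials whose absolute error exceeds the cutoff."""
    return sum(abs(e) > cutoff for e in errors) / len(errors)

fixed = simulate_errors(20000, mean_kappa=8.0, rng=random.Random(1))
variable = simulate_errors(20000, mean_kappa=8.0, shape=1.0, rng=random.Random(1))
print(large_error_fraction(fixed), large_error_fraction(variable))
```

With the same mean precision, the variable-precision run produces noticeably more large errors, because some trials happen to receive very low precision. Those trials look like guesses even though no item was ever dropped from memory.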
Abstract:
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
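The idea that pooling population responses yields "compulsory averaging" can be sketched in a few lines. This is an illustrative toy, not the model from the paper: the tuning curve shape, the number of units, the equal pooling weights, and the population-vector readout are all assumptions chosen for simplicity.

```python
import math

def tuning(stimulus_deg, preferred_deg, kappa=2.0):
    """Von Mises-shaped orientation tuning curve (period 180 degrees)."""
    delta = math.radians(2.0 * (stimulus_deg - preferred_deg))
    return math.exp(kappa * (math.cos(delta) - 1.0))

def decode_integrated(orientations_deg, weights, n_units=36):
    """Pool weighted population responses, then read out one orientation."""
    preferred = [180.0 * i / n_units for i in range(n_units)]
    # Each unit sums its responses to all stimuli inside the integration field.
    pooled = [
        sum(w * tuning(s, p) for s, w in zip(orientations_deg, weights))
        for p in preferred
    ]
    # Population-vector readout on the doubled angle (orientation is 180-periodic).
    x = sum(r * math.cos(math.radians(2.0 * p)) for r, p in zip(pooled, preferred))
    y = sum(r * math.sin(math.radians(2.0 * p)) for r, p in zip(pooled, preferred))
    return (math.degrees(math.atan2(y, x)) / 2.0) % 180.0

alone = decode_integrated([70.0], [1.0])
crowded = decode_integrated([70.0, 100.0, 100.0], [1.0, 1.0, 1.0])
print(round(alone, 1), round(crowded, 1))
```

An isolated 70 degree target is decoded as about 70 degrees, whereas the same target pooled with two 100 degree flankers is decoded close to the mean of the three orientations. In this sketch, lowering the flanker weights (for example, to mimic larger spacing) moves the estimate back toward the target.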
Abstract:
Visual information is difficult to search and interpret when the density of the displayed information is high or the layout is chaotic. Visual information that exhibits such properties is generally referred to as being "cluttered." Clutter should be avoided in information visualizations and interface design in general because it can severely degrade task performance. Although previous studies have identified computable correlates of clutter (such as local feature variance and edge density), understanding of why humans perceive some scenes as being more cluttered than others remains limited. Here, we explore an account of clutter that is inspired by findings from visual perception studies. Specifically, we test the hypothesis that the so-called "crowding" phenomenon is an important constituent of clutter. We constructed an algorithm to predict visual clutter in arbitrary images by estimating the perceptual impairment due to crowding. After verifying that this model can reproduce crowding data, we tested whether it can also predict clutter. We found that its predictions correlate well with both subjective clutter assessments and search performance in cluttered scenes. These results suggest that crowding and clutter may indeed be closely related concepts and suggest avenues for further research.
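Edge density, one of the computable clutter correlates mentioned above, can be approximated very simply. The sketch below is not the crowding-based algorithm described in the abstract. It is a hypothetical baseline that assumes a grayscale image stored as nested lists, central-difference gradients, and an arbitrary threshold.

```python
def edge_density(image, threshold=0.2):
    """Fraction of interior pixels whose gradient magnitude exceeds threshold.

    `image` is a list of equal-length rows of grayscale values in [0, 1].
    """
    rows, cols = len(image), len(image[0])
    edges = 0
    total = 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the horizontal and vertical gradient.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges += 1
            total += 1
    return edges / total if total else 0.0

flat = [[0.5] * 8 for _ in range(8)]                               # uniform gray
one_edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]               # single boundary
stripes = [[float((c // 2) % 2) for c in range(8)] for _ in range(8)]  # many boundaries
print(edge_density(flat), edge_density(one_edge), edge_density(stripes))
```

The uniform image scores zero, the single boundary scores low, and the striped image scores highest, which matches the intuition that denser edge content reads as more cluttered.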
On the generality of crowding: visual crowding in size, saturation, and hue compared to orientation.
Abstract:
Perception of peripherally viewed shapes is impaired when surrounded by similar shapes. This phenomenon is commonly referred to as "crowding". Although studied extensively for perception of characters (mainly letters) and, to a lesser extent, for orientation, little is known about whether and how crowding affects perception of other features. Nevertheless, current crowding models suggest that the effect should be rather general and thus not restricted to letters and orientation. Here, we report on a series of experiments investigating crowding in the following elementary feature dimensions: size, hue, and saturation. Crowding effects in these dimensions were benchmarked against those in the orientation domain. Our primary finding is that all features studied show clear signs of crowding. First, identification thresholds increase with decreasing mask spacing. Second, for all tested features, critical spacing appears to be roughly half the viewing eccentricity and independent of stimulus size, a property previously proposed as the hallmark of crowding. Interestingly, although critical spacings are highly comparable, crowding magnitude differs across features: Size crowding is almost as strong as orientation crowding, whereas the effect is much weaker for saturation and hue. We suggest that future theories and models of crowding should be able to accommodate these differences in crowding effects.
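The reported rule of thumb, that critical spacing is roughly half the viewing eccentricity, can be written as a one-line helper. This is a rough sketch: the factor of 0.5 is taken from the abstract's "roughly half", and the function names and the hard cutoff are simplifying assumptions.

```python
def critical_spacing(eccentricity_deg, factor=0.5):
    """Approximate critical spacing in degrees of visual angle."""
    return factor * eccentricity_deg

def is_crowded(flanker_spacing_deg, eccentricity_deg, factor=0.5):
    """True when flankers fall inside the approximate critical spacing."""
    return flanker_spacing_deg < critical_spacing(eccentricity_deg, factor)

# A target at 10 degrees eccentricity with flankers 3 or 6 degrees away.
print(critical_spacing(10.0), is_crowded(3.0, 10.0), is_crowded(6.0, 10.0))
```

Because the abstract reports that crowding magnitude differs across features while critical spacing does not, a fuller treatment would keep this spacing rule shared and vary only the strength of the effect per feature.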
Abstract:
While searching for objects, we combine information from multiple visual modalities. Classical theories of visual search assume that features are processed independently prior to an integration stage. Based on this, one would predict that features that are equally discriminable in single feature search should remain so in conjunction search. We test this hypothesis by examining whether search accuracy in feature search predicts accuracy in conjunction search. Subjects searched for objects combining color and orientation or size; eye movements were recorded. Prior to the main experiment, we matched feature discriminability, making sure that in feature search, 70% of saccades were likely to go to the correct target stimulus. In contrast to this symmetric single feature discrimination performance, the conjunction search task showed an asymmetry in feature discrimination performance: In conjunction search, a similar percentage of saccades went to the correct color as in feature search but much less often to the correct orientation or size. Therefore, accuracy in feature search is a good predictor of accuracy in conjunction search for color but not for size and orientation. We propose two explanations for the presence of such asymmetries in conjunction search: the use of conjunctively tuned channels and differential crowding effects for different features.
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
Abstract:
Looking for a target in a visual scene becomes more difficult as the number of stimuli increases. In a signal detection theory view, this is due to the cumulative effect of noise in the encoding of the distractors, and potentially on top of that, to an increase of the noise (i.e., a decrease of precision) per stimulus with set size, reflecting divided attention. It has long been argued that human visual search behavior can be accounted for by the first factor alone. While such an account seems to be adequate for search tasks in which all distractors have the same, known feature value (i.e., are maximally predictable), we recently found a clear effect of set size on encoding precision when distractors are drawn from a uniform distribution (i.e., when they are maximally unpredictable). Here we interpolate between these two extreme cases to examine which of the two conclusions holds more generally as distractor statistics are varied. In one experiment, we vary the level of distractor heterogeneity; in another we dissociate distractor homogeneity from predictability. In all conditions in both experiments, we found a strong decrease in precision with increasing set size, suggesting that precision being independent of set size is the exception rather than the rule.
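The first factor, cumulative distractor noise at a fixed per-item precision, can be demonstrated with a short simulation. This is a minimal sketch under assumptions, not the authors' model or task: it uses a target localization task, Gaussian measurement noise, and a maximum-of-outputs decision rule, and all names and parameter values are hypothetical.

```python
import random

def localization_accuracy(set_size, sigma=1.0, signal=2.0, n_trials=20000, seed=0):
    """Proportion correct when the observer picks the largest noisy measurement."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        # The target has mean `signal`; every distractor has mean 0.
        target = rng.gauss(signal, sigma)
        distractors = [rng.gauss(0.0, sigma) for _ in range(set_size - 1)]
        # The response is correct only if no distractor exceeds the target.
        if all(target > d for d in distractors):
            correct += 1
    return correct / n_trials

for n in (2, 4, 8):
    print(n, localization_accuracy(n))
```

Accuracy falls as set size grows even though `sigma` never changes, because every added distractor is another chance for noise to exceed the target. The second factor described above would correspond to passing a larger `sigma` at larger set sizes, which lowers accuracy further.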
Abstract:
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.
Abstract:
Strategic planning can be an arduous and complex task; and, once a plan has been devised, it is often quite a challenge to effectively communicate the principal missions and key priorities to the array of different stakeholders. The communication challenge can be addressed through the application of a clearly and concisely designed visualisation of the strategic plan - to that end, this paper proposes the use of a roadmapping framework to structure a visual canvas. The canvas provides a template in the form of a single composite visual output that essentially allows a 'plan-on-a-page' to be generated. Such a visual representation provides a high-level depiction of the future context, end-state capabilities and the system-wide transitions needed to realise the strategic vision. To demonstrate this approach, an illustrative case study based on the Australian Government's Defence White Paper and the Royal Australian Navy's fleet plan will be presented. The visual plan plots the in-service upgrades for addressing the capability shortfalls and gaps in the Navy's fleet as it transitions from its current configuration to its future end-state vision. It also provides a visualisation of project timings in terms of the decision gates (approval, service release) and specific phases (proposal, contract, delivery) together with how these projects are rated against the key performance indicators relating to the technology acquisition process and associated management activities. © 2013 Taylor & Francis.