131 results for visual music
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly, and we learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely paralleled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
Abstract:
Looking for a target in a visual scene becomes more difficult as the number of stimuli increases. In a signal detection theory view, this is due to the cumulative effect of noise in the encoding of the distractors and, potentially on top of that, to an increase of the noise (i.e., a decrease of precision) per stimulus with set size, reflecting divided attention. It has long been argued that human visual search behavior can be accounted for by the first factor alone. While such an account seems to be adequate for search tasks in which all distractors have the same, known feature value (i.e., are maximally predictable), we recently found a clear effect of set size on encoding precision when distractors are drawn from a uniform distribution (i.e., when they are maximally unpredictable). Here we interpolate between these two extreme cases to examine which of the two conclusions holds more generally as distractor statistics are varied. In one experiment, we vary the level of distractor heterogeneity; in another, we dissociate distractor homogeneity from predictability. In all conditions in both experiments, we found a strong decrease of precision with increasing set size, suggesting that precision being independent of set size is the exception rather than the rule.
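The first factor named in this abstract, the cumulative effect of distractor noise at fixed per-item precision, can be illustrated with a small calculation. The sketch below is not the authors' model: it assumes a yes/no detection task, independent unit-variance Gaussian noise on every item, a target shifted by a hypothetical `d_prime` of 2, equal prior probability of target presence, and a max-rule observer who responds "present" when the largest measurement exceeds a criterion. All function names are illustrative.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_rule_accuracy(set_size, d_prime, criterion):
    """Proportion correct for a max-rule observer (assumed decision rule)."""
    # Target absent: correct only if all N noise-only measurements stay below the criterion.
    correct_rejection = phi(criterion) ** set_size
    # Target present: a miss occurs if the target and all N-1 distractors stay below it.
    miss = phi(criterion - d_prime) * phi(criterion) ** (set_size - 1)
    # Equal priors on target presence (assumption).
    return 0.5 * ((1.0 - miss) + correct_rejection)

def best_accuracy(set_size, d_prime=2.0):
    """Accuracy at the best criterion on a coarse grid from -3 to 6."""
    criteria = [i * 0.01 for i in range(-300, 601)]
    return max(max_rule_accuracy(set_size, d_prime, c) for c in criteria)

# Per-item precision (d_prime) is held fixed; only the set size changes.
accuracies = {n: best_accuracy(n) for n in (1, 2, 4, 8)}
for n, acc in accuracies.items():
    print(f"set size {n}: accuracy {acc:.3f}")
```

Under these assumptions accuracy falls as set size grows even though the precision of each item never changes, which is why a drop in performance alone does not show that precision depends on set size.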
Abstract:
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state, which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.
Abstract:
Strategic planning can be an arduous and complex task; and, once a plan has been devised, it is often quite a challenge to effectively communicate the principal missions and key priorities to the array of different stakeholders. The communication challenge can be addressed through the application of a clearly and concisely designed visualisation of the strategic plan - to that end, this paper proposes the use of a roadmapping framework to structure a visual canvas. The canvas provides a template in the form of a single composite visual output that essentially allows a 'plan-on-a-page' to be generated. Such a visual representation provides a high-level depiction of the future context, end-state capabilities and the system-wide transitions needed to realise the strategic vision. To demonstrate this approach, an illustrative case study based on the Australian Government's Defence White Paper and the Royal Australian Navy's fleet plan will be presented. The visual plan plots the in-service upgrades for addressing the capability shortfalls and gaps in the Navy's fleet as it transitions from its current configuration to its future end-state vision. It also provides a visualisation of project timings in terms of the decision gates (approval, service release) and specific phases (proposal, contract, delivery) together with how these projects are rated against the key performance indicators relating to the technology acquisition process and associated management activities. © 2013 Taylor & Francis.
Abstract:
The human motor system is remarkably proficient in the online control of visually guided movements, adjusting to changes in the visual scene within 100 ms [1-3]. This is achieved through a set of highly automatic processes [4] translating visual information into representations suitable for motor control [5, 6]. For this to be accomplished, visual information pertaining to target and hand needs to be identified and linked to the appropriate internal representations during the movement. Meanwhile, other visual information must be filtered out, which is especially demanding in visually cluttered natural environments. If selection of relevant sensory information for online control were achieved by visual attention, its limited capacity [7] would substantially constrain the efficiency of visuomotor feedback control. Here we demonstrate that both exogenously and endogenously cued attention facilitate the processing of visual target information [8], but not of visual hand information. Moreover, distracting visual information is more efficiently filtered out during the extraction of hand information than of target information. Our results therefore suggest the existence of a dedicated visuomotor binding mechanism that links the hand representation in visual and motor systems.
Abstract:
Relative (comparative) attributes are promising for thematic ranking of visual entities, which also aids in recognition tasks. However, attribute rank learning often requires a substantial amount of relational supervision, which is highly tedious and apparently impractical for real-world applications. In this paper, we introduce the Semantic Transform, which, under minimal supervision, adaptively finds a semantic feature space along with a class ordering that is related in the best possible way. Such a semantic space is found for every attribute category. To relate the classes under weak supervision, the class ordering needs to be refined according to a cost function in an iterative procedure. This problem is ideally NP-hard, and we thus propose a constrained search tree formulation for it. Driven by the adaptive semantic feature space representation, our model achieves the best results to date for all of the tasks of relative, absolute and zero-shot classification on two popular datasets. © 2013 IEEE.
Abstract:
Experimental research in biology has uncovered a number of different ways in which flying insects use cues derived from optical flow for navigational purposes, such as safe landing, obstacle avoidance and dead reckoning. In this study, we use a synthetic methodology to gain additional insights into the navigation behavior of bees. Specifically, we focus on the mechanisms of course stabilization behavior and the visually mediated odometer, using a biological model of a motion detector for long-range goal-directed navigation in a 3D environment. Performance tests of the proposed navigation method are conducted using a blimp-type flying robot platform in uncontrolled indoor environments. The results show that the proposed mechanism can be used for goal-directed navigation. Further analysis is also conducted in order to enhance the navigation performance of autonomous aerial vehicles. © 2003 Elsevier B.V. All rights reserved.
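The abstract does not say which motion detector model was used. As a rough illustration of how a correlation-type detector could feed a visual odometer, the sketch below assumes a Hassenstein-Reichardt style correlator with a pure time delay, two photoreceptors viewing a drifting sinusoidal pattern, and an odometer that simply integrates the detector output over time. The function names and all parameter values are hypothetical.

```python
import math

def correlator_response(phase_lag, omega=2.0 * math.pi, delay=0.05,
                        dt=0.001, steps=2000):
    """Output of an assumed delay-and-correlate motion detector over time.

    phase_lag > 0 means the pattern reaches receptor A before receptor B.
    """
    delay_steps = round(delay / dt)
    # Signals seen by the two receptors as the sinusoidal pattern drifts past.
    a = [math.sin(omega * i * dt) for i in range(steps)]
    b = [math.sin(omega * i * dt - phase_lag) for i in range(steps)]
    # Each half multiplies one delayed input with the other undelayed input;
    # subtracting the mirror-symmetric half yields a direction-signed output.
    return [a[i - delay_steps] * b[i] - a[i] * b[i - delay_steps]
            for i in range(delay_steps, steps)]

def odometer(phase_lag, dt=0.001):
    """Integrate the detector output over time (a crude distance estimate)."""
    return sum(correlator_response(phase_lag, dt=dt)) * dt

forward = odometer(math.pi / 4)    # pattern drifting from A to B
backward = odometer(-math.pi / 4)  # pattern drifting from B to A
print(forward, backward)
```

For a sinusoidal input the output of this detector is constant in time and equal to sin(phase_lag) * sin(omega * delay), so its sign follows the direction of motion and its integral grows with the duration of the motion. The response also depends on the spatial and temporal frequency of the pattern, so this is a sketch of the principle and not a calibrated distance measure.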