861 results for Bag-of-visual Words


Relevance:

100.00%

Abstract:

Visual noise insensitivity is important to audio visual speech recognition (AVSR). Visual noise can take on a number of forms such as varying frame rate, occlusion, lighting or speaker variabilities. The use of a high dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Preliminary results are presented demonstrating performance above the catastrophic fusion boundary for our confidence measure irrespective of the type of visual noise presented to it. Our experiments were restricted to small vocabulary applications.

Relevance:

100.00%

Abstract:

The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction, or multi-channel beamforming is not known. This is an important question to be answered especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset and results show that synchronous HMM-based audio-visual fusion can outperform traditional single as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.

Relevance:

100.00%

Abstract:

In “Thinking Feeling” a camera zooms in and around an animated constellation of words. There are ten words, each repeated one hundred times. The individual words independently pulse and orbit an invisible nucleus. The slow movements of the words and camera are reinforced by an airy, synthesised soundtrack. Over time, various phrasal combinations form and dissolve on screen. A bit like forcing oneself to sleep, “Thinking Feeling” picks at that fine line between controlling and letting go of thoughts. It creates small mantric loops that slip in and out of focus, playing with the liminal zones between the conscious and unconscious, between language and sensation, between gripping and releasing, and between calm and irritation.

Relevance:

100.00%

Abstract:

Intuitively, any ‘bag of words’ approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. First, the term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document’s initial distribution. A secondary contribution is to investigate the practical application of this representation when queries become increasingly verbose. In the experiments (based on Lemur’s search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than those of more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
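
The idea of replacing an initial term distribution with the stationary distribution of a Markov chain can be illustrated with a minimal sketch. This is not the paper's implementation: the toy co-occurrence counts, the function names, and the additive smoothing (used here only to make every transition probability positive, which guarantees an ergodic chain) are all assumptions made for illustration.

```python
def transition_matrix(counts, smoothing=1e-3):
    """Row-normalise smoothed co-occurrence counts into transition probabilities.

    The additive smoothing is an illustrative assumption: it makes every entry
    positive, so the resulting chain is irreducible and aperiodic.
    """
    rows = []
    for row in counts:
        smoothed = [c + smoothing for c in row]
        total = sum(smoothed)
        rows.append([c / total for c in smoothed])
    return rows


def stationary_distribution(transition, start=None, tol=1e-12, max_iter=100_000):
    """Approximate the stationary distribution by power iteration."""
    n = len(transition)
    dist = list(start) if start is not None else [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the chain: new_j = sum_i dist_i * P[i][j]
        nxt = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical co-occurrence counts for a three-term vocabulary.
counts = [
    [0, 3, 1],
    [3, 0, 2],
    [1, 2, 0],
]
P = transition_matrix(counts)
pi = stationary_distribution(P)
```

Starting the iteration from a different initial distribution, for example `stationary_distribution(P, start=[1.0, 0.0, 0.0])`, converges to the same vector, which is the independence from the initial state that the abstract refers to.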

Relevance:

100.00%

Abstract:

In this paper, we present a new algorithm for boosting visual template recall performance through a process of visual expectation. Visual expectation dynamically modifies the recognition thresholds of learnt visual templates based on recently matched templates, improving the recall of sequences of familiar places while keeping precision high, without any feedback from a mapping backend. We demonstrate the performance benefits of visual expectation using two 17 kilometer datasets gathered in an outdoor environment at two times separated by three weeks. The visual expectation algorithm provides up to a 100% improvement in recall. We also combine the visual expectation algorithm with the RatSLAM SLAM system and show how the algorithm enables successful mapping
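
A rough sketch of how recently matched templates could relax recognition thresholds is given below. The abstract does not specify the update rule, so everything here is a hypothetical illustration: the function name, the two threshold values, the `window` parameter, and the assumption that templates are indexed in the order they were learnt along a route.

```python
def match_with_expectation(scores, last_match, base_threshold=0.8,
                           relaxed_threshold=0.6, window=2):
    """Return the index of the best accepted template, or None.

    scores: similarity of the current view to each learnt template
            (higher is better), ordered along the route (assumption).
    last_match: index of the most recently matched template, or None.
    Templates up to `window` positions after the last match are treated as
    expected and are accepted at the relaxed threshold (assumption).
    """
    best, best_score = None, float("-inf")
    for idx, score in enumerate(scores):
        expected = last_match is not None and 0 < idx - last_match <= window
        threshold = relaxed_threshold if expected else base_threshold
        if score >= threshold and score > best_score:
            best, best_score = idx, score
    return best


# A weak match (0.7) is rejected on its own but accepted when it is expected.
scores = [0.1, 0.7, 0.2]
without_context = match_with_expectation(scores, last_match=None)
with_context = match_with_expectation(scores, last_match=0)
```

In this toy setup, unexpected templates still have to clear the stricter base threshold, which is one way recall on familiar sequences could rise without giving up precision elsewhere.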

Relevance:

100.00%

Abstract:

An interactive installation with full body interface, digital projection, multi-touch sensitive screen surfaces, interactive 3D gaming software, motorised dioramas, 4.1 spatial sound and new furniture forms, investigating the cultural dimensions of sustainability through the lens of 'time'. “Time is change, time is finitude. Humans are a finite species. Every decision we make today brings that end closer, or alternatively pushes it further away. Nothing can be neutral” (Tony Fry).

DETAILS: Finitude (Mallee: Time) is a major new media/sculptural hybrid work premiered in 2011 in version 1 at the Ka-rama Motel for the Mildura Palimpsest #8 ('Collaborators and Saboteurs'). Each participant/viewer lies comfortably on their back on the double bed of Room 22. Directly above them, supported by a wooden structure not unlike a house frame, is a semi-transparent Perspex screen that displays projected 3D imagery and is simultaneously sensitive to the lightest of finger touches. Depending upon the ever-changing qualities of the projected image on this screen, the participant can see through its surface to a series of physical dioramas suspended above, lit by subtle LED spotlighting. This diorama consists of a slowly rotating series of physical environments, which also include several animatronic components, allowing the realtime composition of whimsical ‘landscapes’ of both 'real' and 'virtual' media. Through subtle, non-didactic touch-sensitive interactivity the participant then has influence over the 3D graphic imagery, the physical movements of the diorama and the 4.1 immersive soundscape, creating an uncanny blend of physical and virtual media. Five speakers positioned around the room deliver a rich interactive soundscape that responds both audibly and physically to interactions.

VERSION 1, CONTEXT/THEORY: Finitude (Mallee: Time) is Version 1 of a series of presentations during 2012-14. This version has been inspired through a series of recent visits and residencies in the SW Victoria Mallee country. Further drawing on recent writings by postcolonial author Paul Carter, the work is envisaged as an evolving ‘personal topography’ of place-discovery. By contrasting and melding readily available generalisations of the Mallee regions’ rational surfaces, climatic maps and ecological systems with what Carter calls “a fine capillary system of interconnected words, places, memories and sensations” generated through my own idiosyncratic research processes, Finitude (Mallee: Time) invokes a “dark writing” of place through outside eyes: an approach that avoids concentration upon what 'everyone else knows', to instead imagine and develop a sense of how things might be. This basis in re-imagining and re-invention becomes the vehicle for the work’s more fundamental intention as a meditative re-imagination of 'time' (and region) as finite resources. Towards this end, every object, process and idea in the work is re-thought as having its own ‘time component’ or ‘residue’ that becomes deposited into our 'collective future'. Thought of this way, Finitude (Mallee: Time) suggests the poverty of predominant images of time as ‘mechanism’ to instead envisage time as a plastic cyclical medium that we can each choose to ‘give to’ or ‘take away from’ our future. Put another way, time has become finitude.

Relevance:

100.00%

Abstract:

Power relations and small and medium-sized enterprise strategies for capturing value in global production networks: visual effects (VFX) service firms in the Hollywood film industry, Regional Studies. This paper provides insights into the way in which non-lead firms manoeuvre in global value chains in the pursuit of a larger share of revenue and how power relations affect these manoeuvres. It examines the nature of value capture and power relations in the global supply of visual effects (VFX) services and the range of strategies VFX firms adopt to capture higher value in the global value chain. The analysis is based on a total of thirty-six interviews with informants in the industry in Australia, the United Kingdom and Canada, and a database of VFX credits for 3323 visual products for 640 VFX firms.

Relevance:

100.00%

Abstract:

Existing algebraic analyses of the ZUC cipher indicate that the cipher should be secure against algebraic attacks. In this paper, we present an alternative algebraic analysis method for the ZUC stream cipher, where a combiner is used to represent the nonlinear function and to derive equations representing the cipher. Using this approach, the initial states of ZUC can be recovered from 2^97 observed words of keystream, with a complexity of 2^282 operations. This method is more successful when applied to a modified version of ZUC, where the number of output words per clock is increased. If the cipher outputs 120 bits of keystream per clock, the attack can succeed with 2^19 observed keystream bits and 2^47 operations. Therefore, the security of ZUC against algebraic attack could be significantly reduced if its throughput were to be increased for efficiency.

Relevance:

100.00%

Abstract:

Purpose: To determine how high and low contrast visual acuities are affected by blur caused by crossed-cylinder lenses. Method: Crossed-cylinder lenses of power zero (no added lens), +0.12 DS/-0.25 DC, +0.25 DS/-0.50 DC and +0.37 DS/-0.75 DC were placed over the correcting lenses of the right eyes of eight subjects. Negative cylinder axes used were 15-180 degrees in 15 degree steps for the two higher crossed-cylinders and 30-180 degrees in 30 degree steps for the lowest crossed-cylinder. Targets were single lines of letters based on the Bailey-Lovie chart. Successively smaller lines were read until the subject could not read any of the letters correctly. Two contrasts were used: high (100%) and low (10%). The screen luminance of 100 cd/m², together with the room lighting, gave pupil sizes of 4.5 to 6 mm. Results: High contrast visual acuities were better than low contrast visual acuities by 0.1 to 0.2 log unit (1 to 2 chart lines) for the no added lens condition. Based on comparing the average of visual acuities for the 0.75 D crossed-cylinder with the best visual acuity for a given contrast and subject, the rates of change of visual acuity per unit blur strength were similar for high contrast (0.34 ± 0.05 logMAR/D) and low contrast (0.37 ± 0.09 logMAR/D). There were considerable asymmetry effects, with the average loss in visual acuity across the two contrasts and the 0.50 D/0.75 D crossed-cylinders doubling between the 165° and 60° negative cylinder axes. The loss of visual acuity with 0.75 D crossed-cylinders was approximately twice that occurring for defocus of the same blur strength. Conclusion: Small levels of crossed-cylinder blur (≤0.75 D) produce losses in visual acuity that are dependent on the cylinder axis. 0.75 D crossed-cylinders produce losses in visual acuity that are twice those produced by defocus of the same blur strength.

Relevance:

100.00%

Abstract:

Purpose. The Useful Field of View (UFOV®) test has been shown to be highly effective in predicting crash risk among older adults. An important question which we examined in this study is whether this association is due to the ability of the UFOV to predict difficulties in attention-demanding driving situations that involve either visual or auditory distracters. Methods. Participants included 92 community-living adults (mean age 73.6 ± 5.4 years; range 65-88 years) who completed all three subtests of the UFOV involving assessment of visual processing speed (subtest 1), divided attention (subtest 2), and selective attention (subtest 3); driving safety risk was also classified using the UFOV scoring system. Driving performance was assessed separately on a closed-road circuit while driving under three conditions: no distracters, visual distracters, and auditory distracters. Driving outcome measures included road sign recognition, hazard detection, gap perception, time to complete the course, and performance on the distracter tasks. Results. Those rated as safe on the UFOV (safety rating categories 1 and 2), as well as those responding faster than the recommended cut-off on the selective attention subtest (350 msec), performed significantly better in terms of overall driving performance and also experienced less interference from distracters. Of the three UFOV subtests, the selective attention subtest best predicted overall driving performance in the presence of distracters. Conclusions. Older adults who were rated as higher risk on the UFOV, particularly on the selective attention subtest, demonstrated the poorest driving performance in the presence of distracters. This finding suggests that the selective attention subtest of the UFOV may be differentially more effective in predicting driving difficulties in situations of divided attention, which are commonly associated with crashes.

Relevance:

100.00%

Abstract:

In this video, a male voice recites a script comprised entirely of jokes. Words flash on screen in time with the spoken words. Sometimes the two sets of words match, and sometimes they differ. This work examines processes of signification. It emphasizes disruption and disconnection as fundamental and generative operations in making meaning. Extending on post-structural and deconstructionist ideas, this work questions the relationship between written and spoken words. By deliberately confusing the signifying structures of jokes and narratives, it questions the sites and mechanisms of comprehension, humour and signification.

Relevance:

100.00%

Abstract:

This article provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process. We then present a taxonomy of visual servo control systems. The two major classes of systems, position-based and image-based systems, are then discussed in detail. Since any visual servo system must be capable of tracking image features in a sequence of images, we also include an overview of feature-based and correlation-based methods for tracking. We conclude the tutorial with a number of observations on the current directions of the research field of visual servo control.

Relevance:

100.00%

Abstract:

Purpose: To determine the effect of moderate levels of refractive blur and simulated cataracts on nighttime pedestrian conspicuity in the presence and absence of headlamp glare. Methods: The ability to recognize pedestrians at night was measured in 28 young adults (M=27.6 years) under three visual conditions: normal vision, refractive blur and simulated cataracts; mean acuity was 20/40 or better in all conditions. Pedestrian recognition distances were recorded while participants drove an instrumented vehicle along a closed road course at night. Pedestrians wore one of three clothing conditions and oncoming headlamps were present for 16 participants and absent for 12 participants. Results: Simulated visual impairment and glare significantly reduced the frequency with which drivers recognized pedestrians and the distance at which the drivers first recognized them. Simulated cataracts were significantly more disruptive than blur even though photopic visual acuity levels were matched. With normal vision, drivers responded to pedestrians at 3.6x and 5.5x longer distances on average than for the blur or cataract conditions, respectively. Even in the presence of visual impairment and glare, pedestrians were recognized more often and at longer distances when they wore a “biological motion” reflective clothing configuration than when they wore a reflective vest or black clothing. Conclusions: Drivers’ ability to recognize pedestrians at night is degraded by common visual impairments even when the drivers’ mean visual acuity meets licensing requirements. To maximize drivers’ ability to see pedestrians, drivers should wear their optimum optical correction, and cataract surgery should be performed early enough to avoid potentially dangerous reductions in visual performance.

Relevance:

100.00%

Abstract:

It would be a rare thing to visit an early years setting or classroom in Australia that does not display examples of young children’s artworks. This practice serves to give schools a particular ‘look’, but is no guarantee of quality art education. The Australian National Review of Visual Arts Education (NRVE) (2009) has called for changes to visual art education in schools. The planned new National Curriculum includes the arts (music, dance, drama, media and visual arts) as one of the five learning areas. Research shows that it is the classroom teacher that makes the difference, and teacher education has a large part to play in reforms to art education. This paper provides an account of one foundation unit of study (Unit 1) for first year university students enrolled in a 4-year Bachelor degree program who are preparing to teach in the early years (0–8 years). To prepare pre-service teachers to meet the needs of children in the 21st century, Unit 1 blends old and new ways of seeing art, child and pedagogy. Claims for the effectiveness of this model are supported with evidence-based research, conducted over the six years of iterations and ongoing development of Unit 1.

Relevance:

100.00%

Abstract:

While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.