994 results for "visual interpretation"
Abstract:
This paper presents visual detection and classification of light vehicles and personnel on a mine site. We capitalise on the rapid advances of ConvNet-based object recognition but highlight that a naive black-box approach results in a significant number of false positives. In particular, the lack of domain-specific training data and the unique landscape of a mine site cause a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, region proposals enable localisation of objects of interest and a reduction in computation. Our system was tested on over 10 km of real mine site data and was able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.
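The abstract gives no detail on how the k-means background model is combined with the ConvNet, so the following is only a minimal sketch under stated assumptions: each detection carries a hypothetical `feature` vector, background-only images have been reduced to the same feature space, and a detection is rejected when it lies within a hypothetical `radius` of any background centroid. The names `kmeans`, `is_background`, and `filter_detections` are illustrative and do not come from the paper.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over tuples; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def is_background(feature, centroids, radius):
    """Assumed rule: a feature is background if it is near any centroid."""
    return min(math.dist(feature, c) for c in centroids) <= radius


def filter_detections(detections, centroids, radius):
    """Drop ConvNet detections whose feature vector looks like background."""
    return [d for d in detections if not is_background(d["feature"], centroids, radius)]


# Toy 2-D features standing in for descriptors of background-only images.
background = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
centroids = kmeans(background, k=2)

detections = [
    {"label": "person", "feature": (0.1, 0.2)},    # resembles background
    {"label": "vehicle", "feature": (10.0, 0.0)},  # far from any background cluster
]
kept = filter_detections(detections, centroids, radius=1.0)
```

In this toy setup only the detection that lies far from every background cluster survives, which illustrates how a background model could suppress false positives without retraining the detector.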
Abstract:
Uncorrected refractive error, including astigmatism, is a leading cause of reversible visual impairment. While the ability to perform vision-related daily activities is reduced when people are not optimally corrected, only limited research has investigated the impact of uncorrected astigmatism. Given that the capacity to perform vision-related daily activities involves integration of a range of visual and cognitive cues, this research examined the impact of simulated astigmatism on visual tasks that also involved cognitive input. The research also examined whether the higher levels of complexity inherent in Chinese characters make them more susceptible to the effects of astigmatism. The effects of different powers of astigmatism, as well as astigmatism at different axes, were investigated in order to determine the minimum level of astigmatism that resulted in a decrement in visual performance.
Abstract:
Due to the numerous possibilities of voicing concerns and the flood of data we are exposed to, local issues are sometimes at risk of being overlooked. This study explores Local Commons, a design intervention in public space that combines situated digital and tangible media in order to engage communities in contributing and debating different perspectives on a given local issue. The intervention invited the community to submit images of their perspectives on the issue, which were displayed on a public screen. Via tangible buttons in front of the screen, community members could then agree or disagree with the displayed perspectives, creating a space for deliberation. In a user study, we were specifically interested in testing three aspects of our intervention, which are discussed in this paper: the difference that situatedness, visual content, and tangible interaction can make to urban community engagement.
Abstract:
This research investigated the visual demands in modern primary school classrooms and also the impact of common refractive anomalies on a child's ability to perform academic-related tasks. The results showed that relatively high levels of visual acuity, contrast demand and sustained accommodative-convergence are required to perform optimally in the modern classroom environment. It was also demonstrated that relatively low magnitudes of uncorrected refractive error may have a detrimental impact on children's ability to perform academic-related activities at school, with sustained near work further exacerbating this effect. These findings have important implications for both eye care practitioners and education authorities.
Abstract:
This project developed a visual strategy and graphic outcomes to communicate the results of a scientific collaborative project to the Mackay community. During 2013 and 2014 a team from CSIRO engaged with the community in Mackay to collaboratively develop a set of strategies to improve the management of the Great Barrier Reef. The result of this work was a 300+ page scientific report that needed to be translated and summarised for the general community. The aim of this project was to strategically synthesise information contained in the report and to design and produce an outcome to be distributed to the participant community. By working with the CSIRO researchers, an action toolkit was developed, with twelve cards and a booklet. Each card represented the story behind a certain local management issue and the actions that the participants suggested should be taken in order to improve management of the Reef. During the design synthesis it was identified that every management issue referred to the need to develop some sort of "educational campaign" for the area. That was then translated into an underlying action to support all other actions proposed in the toolkit.
Abstract:
A large range of underground mining equipment makes use of compliant hydraulic arms for tasks such as rock-bolting, rock breaking, explosive charging and shotcreting. This paper describes a laboratory-model electro-hydraulic manipulator that is used to prototype novel control and sensing techniques. The research is aimed at improving the safety and productivity of these mining tasks through automation, in particular the application of closed-loop visual positioning of the machine's end-effector.
Abstract:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system and across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
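The clustering step described above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the per-segment log-likelihoods against the speech and non-speech GMMs are taken as given, the dissimilarity between two segments is assumed to be the Euclidean distance between their score pairs (the abstract does not specify the measure), and the cluster with the larger mean speech-minus-non-speech margin is assumed to be the speech cluster. The function names are hypothetical.

```python
import math
from itertools import combinations


def complete_linkage(scores, n_clusters=2):
    """Agglomerative clustering where the distance between two clusters
    is the largest pairwise distance between their members."""
    clusters = [[i] for i in range(len(scores))]

    def linkage(a, b):
        return max(math.dist(scores[i], scores[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest complete-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return clusters


def label_speech(scores):
    """Return one boolean per segment, True where it is labelled speech."""
    clusters = complete_linkage(scores, 2)

    def margin(cluster):
        # Mean of (speech log-likelihood - non-speech log-likelihood).
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    speech = max(clusters, key=margin)
    return [i in speech for i in range(len(scores))]


# Hypothetical (speech, non-speech) log-likelihood pairs for five segments.
segment_scores = [(-10, -30), (-12, -28), (-31, -9), (-29, -11), (-11, -29)]
labels = label_speech(segment_scores)
```

Because the clusters are formed from the recording's own segments, the decision boundary adapts to each recording instead of relying on a fixed likelihood threshold.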
Abstract:
Our aim was to make a quantitative comparison of the response of the different visual cortical areas to selective stimulation of the two different cone-opponent pathways [long- and medium-wavelength (L/M)- and short-wavelength (S)-cone-opponent] and the achromatic pathway under equivalent conditions. The appropriate stimulus-contrast metric for the comparison of colour and achromatic sensitivity is unknown, however, and so a secondary aim was to investigate whether equivalent fMRI responses of each cortical area are predicted by stimulus contrast matched in multiples of detection threshold that approximately equates for visibility, or direct (cone) contrast matches in which psychophysical sensitivity is uncorrected. We found that the fMRI response across the two colour and achromatic pathways is not well predicted by threshold-scaled stimuli (perceptual visibility) but is better predicted by cone contrast, particularly for area V1. Our results show that the early visual areas (V1, V2, V3, VP and hV4) all have robust responses to colour. No area showed an overall colour preference, however, until anterior to V4 where we found a ventral occipital region that has a significant preference for chromatic stimuli, indicating a functional distinction from earlier areas. We found that all of these areas have a surprisingly strong response to S-cone stimuli, at least as great as the L/M response, suggesting a relative enhancement of the S-cone cortical signal. We also identified two areas (V3A and hMT+) with a significant preference for achromatic over chromatic stimuli, indicating a functional grouping into a dorsal pathway with a strong magnocellular input.
Abstract:
The paper critiques the focus of creative industries policy on capability development of small and medium sized firms and the provision of regional incentives. It analyses factors affecting the competitiveness and sustainability of the games development industry and visual effects suppliers to feature films. Interviews with participants in these industries highlight the need for policy instruments to take into consideration the structure and organization of global markets and the power of lead multinational corporations. We show that although forms of economic governance in these industries may allow sustainable value capture, they are interrupted by bottlenecks in which ferocious competition among suppliers is confronted by comparatively little competition among the lead firms. We argue that current approaches to creative industries policy aimed at building self-sustaining creative industries are unlikely to be sufficient because of the globalized nature of the industries. Rather, we argue that a more profitable approach is likely to require supporting diversification of the industries as ‘feeders’ into other areas of the economy.
Abstract:
Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross-database training of synchronous hidden Markov models (SHMMs) to make use of large, publicly available external audio databases in addition to the relatively small given audio-visual database. In this work, the cross-database training approach is improved by performing an additional audio adaptation step, which enables audio-visual SHMMs to benefit from audio observations of the external audio models before the visual modality is added to them. The proposed approach outperforms the baseline cross-database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.
Abstract:
Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. In order to provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information in the form of lip movements of the speaker in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.
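To make the idea of approximate matching at the search stage concrete, here is a minimal sketch. It assumes a simplification the abstract does not make: the index is a single best phone sequence, not multiple hypotheses per segment, and the match criterion is plain Levenshtein distance over fixed-length windows. The function names, the `max_errors` parameter, and the phone strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def search_term(indexed_phones, term_phones, max_errors=1):
    """Return (start, distance) for each window within max_errors of the term."""
    n = len(term_phones)
    hits = []
    for start in range(len(indexed_phones) - n + 1):
        d = edit_distance(indexed_phones[start:start + n], term_phones)
        if d <= max_errors:
            hits.append((start, d))
    return hits


# A recognised phone string containing one exact and one near match.
index = "dh ax k ae t s ae t".split()
term = "k ae t".split()
hits = search_term(index, term, max_errors=1)
```

Tolerating a small number of phone errors lets the search recover terms that the recogniser indexed imperfectly, at the cost of more false alarms as `max_errors` grows.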
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them to our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and the internal audio models by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.
Abstract:
In a book seeking to redraw the boundaries between interdisciplinary and transnational modernisms, this chapter contributes to the reorientation in modernist studies by revisiting "primitivism." While no one freely identifies as "primitive," the spectre of primitivism was a magnet of attraction as well as of critical refusal. It resided on the knife-edge of envy and denunciation, and served for the projection of alternate imaginative utopias as well as the worst forms of racial chauvinism. This chapter asserts that primitivism endures as a provocation as much as a utopian aspiration, but it also provides a different understanding of cultures on the "periphery", which is how Antipodean art history has understood itself. The spectre of primitivism amplifies the quandaries of modernist cultures, both alerting one to the aesthetic alternatives to those cultures and highlighting the fate of traditional culture pitted against them. It also suggests the quandaries of a peripheral modernity.
Abstract:
This paper presents the results of a research project aimed at examining the capabilities and challenges of two distinct but not mutually exclusive approaches to in-service bridge assessment: visual inspection and installed monitoring systems. In this study, the intended functionality of both approaches was evaluated on its ability to identify potential structural damage and to provide decision-making support. Inspection and monitoring are compared in terms of their functional performance, cost, and barriers (real and perceived) to implementation. Both methods have strengths and weaknesses across the metrics analyzed, and it is likely that a hybrid evaluation technique that adopts both approaches will optimize efficiency of condition assessment and ultimately lead to better decision making.
Abstract:
Informed by Kristeva's formulation of affect and Winnicott's Holding Environment, this practice-led visual art project is an exploration into how sensitivity to the physical sensation of trembling can sustain a creative practice. Building upon this is a further enquiry into what the significance of the affective experience of trembling is for an ethics of affect in contemporary art. I have done this through object and video-based installations informed by my own experience of trembling. This has been further informed by the work of artists like Louise Bourgeois, Dennis Del Favero and Willie Doherty. The creative outcomes contribute to the discourse around ethical responses to affect by extending and developing on the works of these artists.