501 results for PASCAL Visual Object Classes (VOC)
Abstract:
This paper introduces an improved line tracker using IMU and vision data for visual servoing tasks. We utilize an Image Jacobian which relates the motion of a line feature to the corresponding camera movements. These camera motions are estimated using an IMU. We demonstrate the performance of the proposed method in challenging environments: maximum angular rate ~160°/s, acceleration ~6 m/s², and cluttered outdoor scenes. Simulation and a quantitative tracking performance comparison with the Visual Servoing Platform (ViSP) are also presented.
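To make the image-Jacobian idea concrete, here is a minimal Python sketch (not the paper's implementation) that propagates an image line parameterised by (rho, theta) through a gyro-measured camera rotation, using only the rotational columns of the classical line-feature interaction matrix from the visual-servoing literature, so no depth or plane parameters are needed; the step size and example values are assumptions.

```python
# Minimal sketch, not the paper's implementation: predict how an image line
# (rho, theta) moves under a camera rotation measured by a gyro, using only the
# rotational block of the classical line-feature image Jacobian.
import numpy as np

def predict_line_feature(rho, theta, omega, dt):
    """Propagate (rho, theta) over one step of pure rotation omega = (wx, wy, wz) in rad/s."""
    # Rotational block of the interaction matrix for s = (rho, theta); the
    # translational columns (which need the 3D line's supporting plane) are ignored here.
    L_rot = np.array([
        [(1.0 + rho**2) * np.sin(theta), -(1.0 + rho**2) * np.cos(theta), 0.0],  # rho-dot
        [-rho * np.cos(theta),           -rho * np.sin(theta),           -1.0],  # theta-dot
    ])
    drho, dtheta = L_rot @ np.asarray(omega)
    return rho + drho * dt, theta + dtheta * dt

# Example with hypothetical values: a 160 deg/s roll rate over a 10 ms IMU step.
rho, theta = predict_line_feature(0.2, np.deg2rad(30.0), (0.0, 0.0, np.deg2rad(160.0)), 0.01)
```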
Abstract:
In this paper we present a novel place recognition algorithm inspired by recent discoveries in human visual neuroscience. The algorithm combines intolerant but fast low-resolution whole-image matching with highly tolerant sub-image patch matching processes. The approach does not require prior training and works on single images (although we use a cohort normalization score to exploit temporal frame information), alleviating the need for either a velocity signal or an image sequence and differentiating it from current state-of-the-art methods. We demonstrate the algorithm on the challenging Alderley sunny day – rainy night dataset, which has previously only been solved by integrating over image sequences 320 frames long. The system is able to achieve 21.24% recall at 100% precision, matching drastically different day and night-time images of places while successfully rejecting match hypotheses between highly aliased images of different places. The results provide a new benchmark for single-image, condition-invariant place recognition.
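As an illustration of the two stages named above, the sketch below pairs a fast, intolerant low-resolution whole-image comparison with a cohort-normalized match score, so a match is accepted only if it stands out against the other candidates; the thumbnail size, sum-of-absolute-differences distance and z-score style normalization are assumptions, not the paper's algorithm.

```python
# Illustrative single-image matching sketch (not the paper's algorithm).
import numpy as np

def low_res_descriptor(image, size=(32, 24)):
    """Downsample a grayscale image (assumed larger than the thumbnail) by block
    averaging and contrast-normalise the result."""
    h, w = image.shape
    ys = np.linspace(0, h, size[1] + 1, dtype=int)
    xs = np.linspace(0, w, size[0] + 1, dtype=int)
    thumb = np.array([[image[ys[i]:ys[i+1], xs[j]:xs[j+1]].mean()
                       for j in range(size[0])] for i in range(size[1])])
    return (thumb - thumb.mean()) / (thumb.std() + 1e-6)

def cohort_normalized_match(query, database_images):
    """Return (best_index, normalized_score) for the query against the database."""
    q = low_res_descriptor(query)
    dists = np.array([np.abs(q - low_res_descriptor(d)).mean() for d in database_images])
    best = int(np.argmin(dists))
    cohort = np.delete(dists, best)            # all other candidates form the cohort
    score = (cohort.mean() - dists[best]) / (cohort.std() + 1e-6)
    return best, score                          # higher score = more distinctive match
```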
Abstract:
Ongoing innovation in digital animation and visual effects technologies has provided new opportunities for stories to be visually rendered in ways never before possible. Films featuring animation and visual effects continue to perform well at the box office, proving to be highly profitable projects. The Avengers (Whedon, 2012) holds the current record for opening weekend sales, having accrued $207,438,708 USD on its opening weekend and $623,357,910 USD gross at the time of writing. Life of Pi (Lee, 2012) has grossed $608,791,063 USD at the time of writing (Box Office Mojo, 2013). With so much creative potential and a demonstrable ability to generate a large amount of revenue, the animation and visual effects industry – otherwise known as the Post, Digital and Visual Effects (PDV) industry – has become significant to the future growth and stability of the Australian film industry as a whole.
Abstract:
Long-term autonomy in robotics requires perception systems that are resilient to unusual but realistic conditions that will eventually occur during extended missions. For example, unmanned ground vehicles (UGVs) need to be capable of operating safely in adverse and low-visibility conditions, such as at night or in the presence of smoke. The key to a resilient UGV perception system lies in the use of multiple sensor modalities, e.g., operating at different frequencies of the electromagnetic spectrum, to compensate for the limitations of a single sensor type. In this paper, visual and infrared imaging are combined in a Visual-SLAM algorithm to achieve localization. We propose to evaluate the quality of data provided by each sensor modality prior to data combination. This evaluation is used to discard low-quality data, i.e., data most likely to induce large localization errors. In this way, perceptual failures are anticipated and mitigated. An extensive experimental evaluation is conducted on data sets collected with a UGV in a range of environments and adverse conditions, including the presence of smoke (obstructing the visual camera), fire, extreme heat (saturating the infrared camera), low-light conditions (dusk), and at night with sudden variations of artificial light. A total of 240 trajectory estimates are obtained using five different variations of data sources and data combination strategies in the localization method. In particular, the proposed approach for selective data combination is compared to methods using a single sensor type or combining both modalities without preselection. We show that the proposed framework allows for camera-based localization resilient to a large range of low-visibility conditions.
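The selective data combination described above can be pictured with a small sketch: each modality's frame receives a quality score and low-quality frames are discarded before being passed to the localisation pipeline. The quality callable and thresholds below are placeholders, not the paper's rules.

```python
# Sketch of quality-gated selection between visual and infrared frames
# (assumed thresholds, not the paper's exact strategy).
def select_frames(visual_frame, infrared_frame, quality, v_thresh=0.5, i_thresh=0.5):
    """Return the list of frames judged usable for the Visual-SLAM front end.

    `quality` is any callable mapping a frame to a score in [0, 1]; the thresholds
    here are illustrative placeholders.
    """
    selected = []
    if quality(visual_frame) >= v_thresh:
        selected.append(("visual", visual_frame))
    if quality(infrared_frame) >= i_thresh:
        selected.append(("infrared", infrared_frame))
    return selected  # may be empty: the back end then relies on motion prediction only
```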
Abstract:
This work aims to contribute to the reliability and integrity of the perceptual systems of unmanned ground vehicles (UGVs). A method is proposed to evaluate the quality of sensor data prior to its use in a perception system, by applying a quality metric to heterogeneous sensor data such as visual and infrared camera images. The concept is illustrated specifically with sensor data that is evaluated prior to its use in a standard SIFT feature extraction and matching technique. The method is then evaluated using various experimental data sets that were collected from a UGV in challenging environmental conditions, represented by the presence of airborne dust and smoke. In the first series of experiments, a motionless vehicle observes a 'reference' scene; the method is then extended to the case of a moving vehicle by compensating for its motion. This paper shows that it is possible to anticipate the degradation of a perception algorithm by evaluating the input data prior to any actual execution of the algorithm.
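A minimal sketch of the evaluate-before-use idea around a standard SIFT pipeline follows; the variance-of-Laplacian sharpness score and the threshold are illustrative stand-ins for the paper's quality metric, and OpenCV with SIFT support (cv2.SIFT_create, OpenCV >= 4.4) is assumed.

```python
# Sketch only: gate a standard SIFT extraction/matching step on a pre-computed
# image-quality score. The sharpness proxy and threshold are placeholders,
# not the metric proposed in the paper. Requires OpenCV >= 4.4 and NumPy.
import cv2

def sharpness(gray):
    """Variance of the Laplacian: a common proxy for blur / loss of detail."""
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def match_if_usable(img_a, img_b, min_sharpness=50.0):
    """Run SIFT matching only if both grayscale images pass the quality check."""
    if sharpness(img_a) < min_sharpness or sharpness(img_b) < min_sharpness:
        return None  # anticipate degradation: skip matching on low-quality input
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return []
    knn = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
    # Lowe's ratio test keeps only distinctive correspondences.
    return [p[0] for p in knn if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
```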
Abstract:
This paper proposes an experimental study of quality metrics that can be applied to visual and infrared images acquired from cameras onboard an unmanned ground vehicle (UGV). The relevance of existing metrics in this context is discussed and a novel metric is introduced. Selected metrics are evaluated on data collected by a UGV in clear and challenging environmental conditions, represented in this paper by the presence of airborne dust or smoke. An example of application is given with monocular SLAM estimating the pose of the UGV while smoke is present in the environment. It is shown that the proposed novel quality metric can be used to anticipate situations where the quality of the pose estimate will be significantly degraded due to the input image data. This leads to decisions such as advantageously switching between data sources (e.g. using infrared images instead of visual images).
Abstract:
This work aims to contribute to the reliability and integrity of the perceptual systems of autonomous ground vehicles. Information-theoretic metrics to evaluate the quality of sensor data are proposed and applied to visual and infrared camera images. The contribution of the proposed metrics to the discrimination of challenging conditions is discussed and illustrated with the presence of airborne dust and smoke.
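To make the information-theoretic flavour concrete, here is a small sketch computing per-image Shannon entropy and the mutual information between co-registered visual and infrared images from intensity histograms; these are generic estimators, not necessarily the metrics proposed in the work.

```python
# Generic histogram-based information-theoretic measures (illustrative only).
import numpy as np

def entropy(gray, bins=64):
    """Shannon entropy (bits) of an image's intensity histogram."""
    counts, _ = np.histogram(gray, bins=bins)
    p = counts.astype(float) / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(gray_a, gray_b, bins=64):
    """I(A;B) = H(A) + H(B) - H(A,B), estimated from a joint intensity histogram."""
    joint, _, _ = np.histogram2d(gray_a.ravel(), gray_b.ravel(), bins=bins)
    p = joint / joint.sum()
    p = p[p > 0]
    h_joint = float(-(p * np.log2(p)).sum())
    return entropy(gray_a, bins) + entropy(gray_b, bins) - h_joint
```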
Abstract:
Objectives: To investigate the relationship between two assessments to quantify delayed onset muscle soreness [DOMS]: visual analog scale [VAS] and pressure pain threshold [PPT]. Methods: Thirty-one healthy young men [25.8 ± 5.5 years] performed 10 sets of six maximal eccentric contractions of the elbow flexors with their non-dominant arm. Before and one to four days after the exercise, muscle pain perceived upon palpation of the biceps brachii at three sites [5, 9 and 13 cm above the elbow crease] was assessed by VAS with a 100 mm line [0 = no pain, 100 = extremely painful], and PPT of the same sites was determined by an algometer. Changes in VAS and PPT over time were compared amongst three sites by a two-way repeated measures analysis of variance, and the relationship between VAS and PPT was analyzed using a Pearson product-moment correlation. Results: The VAS increased one to four days after exercise and peaked two days post-exercise, while the PPT decreased most one day post-exercise and remained below baseline for four days following exercise [p < 0.05]. No significant difference among the three sites was found for VAS [p = 0.62] or PPT [p = 0.45]. The magnitude of change in VAS did not significantly correlate with that of PPT [r = −0.20, p = 0.28]. Conclusion: These results suggest that the level of muscle pain is not region-specific, at least among the three sites investigated in the study, and VAS and PPT provide different information about DOMS, indicating that VAS and PPT represent different aspects of pain.
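The correlation step described above can be sketched as follows; the random placeholder data, units and peak-change definition are assumptions (only the shapes mirror the study: 31 participants, 4 post-exercise days), and SciPy's pearsonr provides the Pearson product-moment correlation.

```python
# Sketch of the VAS/PPT correlation analysis with placeholder data.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
vas_post = rng.uniform(10, 80, size=(31, 4))                 # mm on the 100 mm scale; baseline = 0
ppt_base = rng.uniform(250, 400, size=31)                    # hypothetical baseline PPT values
ppt_post = ppt_base[:, None] - rng.uniform(20, 120, size=(31, 4))

vas_change = vas_post.max(axis=1)                            # peak soreness per participant
ppt_change = (ppt_post - ppt_base[:, None]).min(axis=1)      # largest PPT drop (negative)

r, p = pearsonr(vas_change, ppt_change)                      # Pearson product-moment correlation
print(f"r = {r:.2f}, p = {p:.3f}")
```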
Abstract:
Object classification is plagued by the issue of session variation. Session variation describes any variation that makes one instance of an object look different to another, for instance due to pose or illumination changes. Recent work in the challenging task of face verification has shown that session variability modelling provides a mechanism to overcome some of these limitations. However, for computer vision purposes, it has only been applied in the limited setting of face verification. In this paper we propose a local region-based inter-session variability (ISV) modelling approach, termed Local ISV, so that local session variations can be modelled, and apply it to challenging real-world data. We then demonstrate the efficacy of this technique on a challenging real-world fish image database which includes images taken underwater, providing significant real-world session variations. This Local ISV approach provides a relative performance improvement of, on average, 23% on the challenging MOBIO, Multi-PIE and SCface face databases. It also provides a relative performance improvement of 35% on our challenging fish image dataset.
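As a rough illustration only: ISV models a sample's supervector as an identity component plus a session offset lying in a low-dimensional subspace U, and compensation removes the estimated session term. The least-squares sketch below ignores the GMM statistics and MAP estimation used in real ISV systems; all dimensions are hypothetical.

```python
# Heavily simplified illustration of session-variability compensation.
import numpy as np

def isv_compensate(supervector, ubm_mean, U):
    """Estimate the session factor x minimising ||s - m - U x|| and remove U x."""
    x, *_ = np.linalg.lstsq(U, supervector - ubm_mean, rcond=None)
    return supervector - U @ x

# Hypothetical dimensions: 512-dimensional supervectors, 10 session directions.
rng = np.random.default_rng(0)
U = rng.standard_normal((512, 10))
m = rng.standard_normal(512)
s = m + U @ rng.standard_normal(10) + 0.1 * rng.standard_normal(512)
s_compensated = isv_compensate(s, m, U)
```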
Abstract:
This practice-led research project investigates how new postcolonial conditions require new methods of critique to fully engage with the nuances of real world, 'lived' experiences. Framed by key aspects of postcolonial theory, this project examines contemporary artists' contributions to investigations of identity, race, ethnicity, otherness and diaspora, as well as questions of locality, nationality, and transnationality. Approaching these issues through the lens of my own experience as an artist and subject, it results in a body of creative work and a written exegesis that creatively and critically examine the complexities, ambiguities and ambivalences of the contemporary postcolonial condition.
Abstract:
This article presents a content analysis of music in tourism TV commercials from 95 regions and countries to identify their general acoustic characteristics. The objective is to offer a general guideline for the postproduction of tourism TV commercials. It is found that tourism TV commercials tend to be produced at a faster tempo, with beats per minute close to 120, which is rarely found in general TV commercials. To compensate for the faster tempo (increased aural information load), fewer scenes (longer duration per scene) were edited into the footage. Production recommendations and future research are presented.
Abstract:
My practice-led research explores and maps workflows for generating experimental creative work involving inertia-based motion capture technology. Motion capture has often been used as a way to bridge animation and dance, resulting in abstracted visual outcomes. In early works this process was largely done by rotoscoping, reference footage and mechanical forms of motion capture. With the evolution of technology, optical and inertial forms of motion capture are now more accessible and able to accurately capture a larger range of complex movements. The creative work titled “Contours in Motion” was the first in a series of studies on captured motion data used to generate experimental visual forms that reverberate in space and time. The source or ‘seed’ comes from using an Xsens MVN - Inertial Motion Capture system to capture spontaneous dance movements, with the visual generation conducted through a customised dynamics simulation. The aim of the creative work was to diverge from the standard practice of using particle systems and/or a simple re-targeting of the motion data to drive a 3D character as a means of producing abstracted visual forms. To facilitate this divergence, a virtual dynamic object was tethered to a selection of data points from a captured performance. The properties of the dynamic object were then adjusted to balance the influence of the human movement data with the influence of computer-based randomization. The resulting outcome was a visual form that surpassed simple data visualization to project the intent of the performer's movements into a visual shape itself. The reported outcomes from this investigation have contributed to a larger study on the use of motion capture in the generative arts, furthering the understanding of and generating theories on practice.
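The tethering described above can be pictured with a small dynamics sketch: a virtual point is pulled toward a chosen motion-capture marker by a spring-damper force while random forces perturb it, so the output is shaped by the performance rather than a direct re-targeting. The stiffness, damping and noise values are assumptions, not the settings used in Contours in Motion.

```python
# Illustrative spring-damper tether to a captured marker trajectory.
import numpy as np

def simulate_tether(marker_positions, stiffness=8.0, damping=2.0, noise=0.5, dt=1/60):
    """marker_positions: (T, 3) array holding one captured marker's trajectory."""
    rng = np.random.default_rng(1)
    pos = np.array(marker_positions[0], dtype=float)
    vel = np.zeros(3)
    trail = []
    for target in marker_positions:
        force = stiffness * (target - pos) - damping * vel + noise * rng.standard_normal(3)
        vel += force * dt          # semi-implicit Euler integration
        pos += vel * dt
        trail.append(pos.copy())
    return np.array(trail)         # the generated form, offset in space and time
```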
Abstract:
Covertly tracking mobile targets, either animal or human, in previously unmapped outdoor natural environments using off-road robotic platforms requires both visual and acoustic stealth. Whilst the use of robots for stealthy surveillance is not new, the majority only consider navigation for visual covertness. However, most fielded robotic systems have a non-negligible acoustic footprint arising from the onboard sensors, motors, computers and cooling systems, and also from the wheels interacting with the terrain during motion. This time-varying acoustic signature can jeopardise any visual covertness and needs to be addressed in any stealthy navigation strategy. In previous work, we addressed the initial concepts for acoustically masking a tracking robot's movements as it travels between observation locations selected to minimise its detectability by a dynamic natural target while ensuring continuous visual tracking of the target. This work extends the overall concept by examining the utility of real-time acoustic signature self-assessment and by exploiting shadows as hiding locations within a combined visual and acoustic stealth framework.
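One way to picture a combined visual and acoustic stealth criterion is a per-candidate detectability cost like the sketch below; the cost form, the shadow bonus, the spherical-spreading attenuation and the weights are all assumptions for illustration, not the authors' planner.

```python
# Illustrative detectability cost for choosing the next observation location.
import math

def detectability_cost(candidate, target, in_shadow, robot_noise_db, ambient_db,
                       has_line_of_sight, w_visual=1.0, w_acoustic=1.0):
    if not has_line_of_sight(candidate, target):
        return math.inf                          # cannot keep tracking the target from here
    visual = 0.2 if in_shadow(candidate) else 1.0
    dist = math.dist(candidate, target)
    # Spherical spreading: roughly 6 dB attenuation per doubling of distance (1 m reference).
    received_db = robot_noise_db - 20.0 * math.log10(max(dist, 1.0))
    acoustic = max(0.0, received_db - ambient_db)  # audible margin above masking noise
    return w_visual * visual + w_acoustic * acoustic
```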
Abstract:
In this workshop proposal I discuss a case study of a physical computing environment named Talk2Me. This work was exhibited in February 2006 at The Block, Brisbane as an interactive installation in the early stages of its development. The major artefact in this work is a 10 metre wide by 3 metre high light-permeable white dome. There are other technologies and artefacts contained within the dome that make up this interactive environment. The dome artefact has impacted heavily on the design process, including shaping the types of interactions involved, the kinds of technologies employed, and the choice of other artefacts. In this workshop paper, I chart some of the various iterations Talk2Me has undergone in the design process.