902 resultados para Visual Information


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentence under the auditory only conditions. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias.Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated catarcts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper shows initial results in deploying the biologically inspired Simultaneous Localisation and Mapping system, RatSLAM, in an outdoor environment. RatSLAM has been widely tested in indoor environments on the task of producing topologically coherent maps based on a fusion of odometric and visual information. This paper details the changes required to deploy RatSLAM on a small tractor equipped with odometry and an omnidirectional camera. The principal changes relate to the vision system, with others required for RatSLAM to use omnidirectional visual data. The initial results from mapping around a 500 m loop are promising, with many improvements still to be made.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Biological inspiration has produced some successful solutions for estimation of self motion from visual information. In this paper we present the construction of a unique new camera, inspired by the compound eye of insects. The hemispherical nature of the compound eye has some intrinsically valuable properties in producing optical flow fields that are suitable for egomotion estimation in six degrees of freedom. The camera that we present has the added advantage of being lightweight and low cost, making it suitable for a range of mobile robot applications. We present some initial results that show the effectiveness of our egomotion estimation algorithm and the image capture capability of the hemispherical camera.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Simultaneous Localization And Mapping (SLAM) is one of the major challenges in mobile robotics. Probabilistic techniques using high-end range finding devices are well established in the field, but recent work has investigated vision only approaches. This paper presents a method for generating approximate rotational and translation velocity information from a single vehicle-mounted consumer camera, without the computationally expensive process of tracking landmarks. The method is tested by employing it to provide the odometric and visual information for the RatSLAM system while mapping a complex suburban road network. RatSLAM generates a coherent map of the environment during an 18 km long trip through suburban traffic at speeds of up to 60 km/hr. This result demonstrates the potential of ground based vision-only SLAM using low cost sensing and computational hardware.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

To detect and annotate the key events of live sports videos, we need to tackle the semantic gaps of audio-visual information. Previous work has successfully extracted semantic from the time-stamped web match reports, which are synchronized with the video contents. However, web and social media articles with no time-stamps have not been fully leveraged, despite they are increasingly used to complement the coverage of major sporting tournaments. This paper aims to address this limitation using a novel multimodal summarization framework that is based on sentiment analysis and players' popularity. It uses audiovisual contents, web articles, blogs, and commentators' speech to automatically annotate and visualize the key events and key players in a sports tournament coverage. The experimental results demonstrate that the automatically generated video summaries are aligned with the events identified from the official website match reports.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study investigated the Kinaesthetic Fusion Effect (KFE) first described by Craske and Kenny in 1981. The current study did not replicate these findings. Participants did not perceive any reduction in the sagittal separation of a button pressed by the index finger of one arm and a probe touching the other, following repeated exposure to the tactile stimuli present on both unseen arms. This study’s failure to replicate the widely-cited KFE as described by Craske et al. (1984) suggests that it may be contingent on several aspects of visual information, especially the availability of a specific visual reference, the role of instructions regarding gaze direction, and the potential use of a line of sight strategy when referring felt positions to an interposed surface. In addition, a foreshortening effect was found; this may result from a line-of-sight judgment and represent a feature of the reporting method used. The transformed line of sight data were regressed against the participant reported values, resulting in a slope of 1.14 (right arm) and 1.11 (left arm), and r > 0.997 for each. The study also provides additional evidence that mis-perceptions of the mediolateral position of the limbs specifically their separation and consistent with notions of Gestalt grouping, is somewhat labile and can be influenced by active motions causing touch of one limb by the other. Finally, this research will benefit future studies that require participants to report the perceived locations of the unseen limbs.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As the popularity of video as an information medium rises, the amount of video content that we produce and archive keeps growing. This creates a demand for shorter representations of videos in order to assist the task of video retrieval. The traditional solution is to let humans watch these videos and write textual summaries based on what they saw. This summarisation process, however, is time-consuming. Moreover, a lot of useful audio-visual information contained in the original video can be lost. Video summarisation aims to turn a full-length video into a more concise version that preserves as much information as possible. The problem of video summarisation is to minimise the trade-off between how concise and how representative a summary is. There are also usability concerns that need to be addressed in a video summarisation scheme. To solve these problems, this research aims to create an automatic video summarisation framework that combines and improves on existing video summarisation techniques, with the focus on practicality and user satisfaction. We also investigate the need for different summarisation strategies in different kinds of videos, for example news, sports, or TV series. Finally, we develop a video summarisation system based on the framework, which is validated by subjective and objective evaluation. The evaluation results shows that the proposed framework is effective for creating video skims, producing high user satisfaction rate and having reasonably low computing requirement. We also demonstrate that the techniques presented in this research can be used for visualising video summaries in the form web pages showing various useful information, both from the video itself and from external sources.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study investigated the Kinaesthetic Fusion Effect (KFE) first described by Craske and Kenny in 1981. The current study did not replicate these findings following a change in the reporting method used by participants. Participants did not perceive any reduction in the sagittal separation of a button pressed by the index finger of one arm and a probe touching the other, following repeated exposure to the tactile stimuli present on both unseen arms. This study’s failure to replicate the widely-cited KFE as described by Craske et al. (1984) suggests that it may be contingent on several aspects of visual information, especially the availability of a specific visual reference, the role of instructions regarding gaze direction, and the potential use of a line of sight strategy when referring felt positions to an interposed surface. In addition, a foreshortening effect was found; this may result from a line-of-sight judgment and represent a feature of the reporting method used. Finally, this research will benefit future studies that require participants to report the perceived locations of the unseen limbs.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study investigated the Kinaesthetic Fusion Effect (KFE) first described by Craske and Kenny in 1981. In Experiment 1 the study did not replicate these findings following a change in the reporting method used by participants. Participants did not perceive any reduction in the sagittal separation of a button pressed by the index finger of one arm and a probe touching the other, following repeated exposure to the tactile stimuli present on both unseen arms. This study’s failure to replicate the widely-cited KFE as described by Craske et al. (1984) suggests that it may be contingent on several aspects of visual information, especially the availability of a specific visual reference, the role of instructions regarding gaze direction, and the potential use of a line of sight strategy when referring felt positions to an interposed surface. In addition, a foreshortening effect was found; this may result from a line-of-sight judgment and represent a feature of the reporting method used. Finally, this research will benefit future studies that require participants to report the perceived locations of the unseen limbs. Experiment 2 investigated the KFE when the visual reference was removed and participants made reports of touched position, blindfolded. A number of interesting outcomes arose from this change and may provide clarification to the phenomena.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches make use of features to estimate motion. Conversely, the strategy we propose is based on a MR (Multi- Resolution) implementation of an image registration technique (Inverse Compositional Image Alignment ICIA) using direct methods. An on-board camera in a downwards-looking configuration, and the assumption of planar scenes, are the bases of the algorithm. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around the 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the previous known estimation of the vehicles’ state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimations are based only on the vision-based motion estimations. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, being two of those stages considered critical: take-off and landing. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimations. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, that demonstrate that the visual estimation can be used to provide a good approximation of the vehicle’s state when it is required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimation of the vehicle’s state for more than one minute, at real-time frame rates based, only on visual information.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Motivated by the growing interest in unmanned aerial system’s applications in indoor and outdoor settings and the standardisation of visual sensors as vehicle payload. This work presents a collision avoidance approach based on omnidirectional cameras that does not require the estimation of range between two platforms to resolve a collision encounter. It will achieve a minimum separation between the two vehicles involved by maximising the view-angle given by the omnidirectional sensor. Only visual information is used to achieve avoidance under a bearing-only visual servoing approach. We provide theoretical problem formulation, as well as results from real flight using small quadrotors.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This research explores music in space, as experienced through performing and music-making with interactive systems. It explores how musical parameters may be presented spatially and displayed visually with a view to their exploration by a musician during performance. Spatial arrangements of musical components, especially pitches and harmonies, have been widely studied in the literature, but the current capabilities of interactive systems allow the improvisational exploration of these musical spaces as part of a performance practice. This research focuses on quantised spatial organisation of musical parameters that can be categorised as grid music systems (GMSs), and interactive music systems based on them. The research explores and surveys existing and historical uses of GMSs, and develops and demonstrates the use of a novel grid music system designed for whole body interaction. Grid music systems provide plotting of spatialised input to construct patterned music on a two-dimensional grid layout. GMSs are navigated to construct a sequence of parametric steps, for example a series of pitches, rhythmic values, a chord sequence, or terraced dynamic steps. While they are conceptually simple when only controlling one musical dimension, grid systems may be layered to enable complex and satisfying musical results. These systems have proved a viable, effective, accessible and engaging means of music-making for the general user as well as the musician. GMSs have been widely used in electronic and digital music technologies, where they have generally been applied to small portable devices and software systems such as step sequencers and drum machines. This research shows that by scaling up a grid music system, music-making and musical improvisation are enhanced, gaining several advantages: (1) Full body location becomes the spatial input to the grid. The system becomes a partially immersive one in four related ways: spatially, graphically, sonically and musically. (2) Detection of body location by tracking enables hands-free operation, thereby allowing the playing of a musical instrument in addition to “playing” the grid system. (3) Visual information regarding musical parameters may be enhanced so that the performer may fully engage with existing spatial knowledge of musical materials. The result is that existing spatial knowledge is overlaid on, and combined with, music-space. Music-space is a new concept produced by the research, and is similar to notions of other musical spaces including soundscape, acoustic space, Smalley's “circumspace” and “immersive space” (2007, 48-52), and Lotis's “ambiophony” (2003), but is rather more textural and “alive”—and therefore very conducive to interaction. Music-space is that space occupied by music, set within normal space, which may be perceived by a person located within, or moving around in that space. Music-space has a perceivable “texture” made of tensions and relaxations, and contains spatial patterns of these formed by musical elements such as notes, harmonies, and sounds, changing over time. The music may be performed by live musicians, created electronically, or be prerecorded. Large-scale GMSs have the capability not only to interactively display musical information as music representative space, but to allow music-space to co-exist with it. Moving around the grid, the performer may interact in real time with musical materials in music-space, as they form over squares or move in paths. Additionally he/she may sense the textural matrix of the music-space while being immersed in surround sound covering the grid. The HarmonyGrid is a new computer-based interactive performance system developed during this research that provides a generative music-making system intended to accompany, or play along with, an improvising musician. This large-scale GMS employs full-body motion tracking over a projected grid. Playing with the system creates an enhanced performance employing live interactive music, along with graphical and spatial activity. Although one other experimental system provides certain aspects of immersive music-making, currently only the HarmonyGrid provides an environment to explore and experience music-space in a GMS.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Signal-degrading speckle is one factor that can reduce the quality of optical coherence tomography images. We demonstrate the use of a hierarchical model-based motion estimation processing scheme based on an affine-motion model to reduce speckle in optical coherence tomography imaging, by image registration and the averaging of multiple B-scans. The proposed technique is evaluated against other methods available in the literature. The results from a set of retinal images show the benefit of the proposed technique, which provides an improvement in signal-to-noise ratio of the square root of the number of averaged images, leading to clearer visual information in the averaged image. The benefits of the proposed technique are also explored in the case of ocular anterior segment imaging.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Rats are superior to the most advanced robots when it comes to creating and exploiting spatial representations. A wild rat can have a foraging range of hundreds of meters, possibly kilometers, and yet the rodent can unerringly return to its home after each foraging mission, and return to profitable foraging locations at a later date (Davis, et al., 1948). The rat runs through undergrowth and pipes with few distal landmarks, along paths where the visual, textural, and olfactory appearance constantly change (Hardy and Taylor, 1980; Recht, 1988). Despite these challenges the rat builds, maintains, and exploits internal representations of large areas of the real world throughout its two to three year lifetime. While algorithms exist that allow robots to build maps, the questions of how to maintain those maps and how to handle change in appearance over time remain open. The robotic approach to map building has been dominated by algorithms that optimise the geometry of the map based on measurements of distances to features. In a robotic approach, measurements of distance to features are taken with range-measuring devices such as laser range finders or ultrasound sensors, and in some cases estimates of depth from visual information. The features are incorporated into the map based on previous readings of other features in view and estimates of self-motion. The algorithms explicitly model the uncertainty in measurements of range and the measurement of self-motion, and use probability theory to find optimal solutions for the geometric configuration of the map features (Dissanayake, et al., 2001; Thrun and Leonard, 2008). Some of the results from the application of these algorithms have been impressive, ranging from three-dimensional maps of large urban strucutures (Thrun and Montemerlo, 2006) to natural environments (Montemerlo, et al., 2003).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Unmanned Aerial Vehicles (UAVs) industry is a fast growing sector. Nowadays, the market offers numerous possibilities for off-the-shelf UAVs such as quadrotors or fixed-wings. Until UAVs demonstrate advance capabilities such as autonomous collision avoidance they will be segregated and restricted to flight in controlled environments. This work presents a visual fuzzy servoing system for obstacle avoidance using UAVs. To accomplish this task we used the visual information from the front camera. Images are processed off-board and the result send to the Fuzzy Logic controller which then send commands to modify the orientation of the aircraft. Results from flight test are presented with a commercial off-the-shelf platform.