179 results for Visual odometry
Abstract:
In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs can be readily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, since the model takes a more contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated-digit, visual speech recognition task, with comparisons against a baseline HMM system. Firstly, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Secondly, we compare model performance at various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
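The gain from widening each state's observation window can be illustrated with a simple feature-stacking sketch. The function name, edge-padding strategy, and dimensions below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def window_features(frames, past=2, future=2):
    """Stack each frame with its `past` preceding and `future`
    following frames (edges padded by repeating the first/last frame),
    so a per-frame model can condition on local temporal context --
    the kind of observation window the abstract describes each HCRF
    state taking into account. Illustrative sketch, not the paper's code."""
    frames = np.asarray(frames, dtype=float)
    padded = np.concatenate([
        np.repeat(frames[:1], past, axis=0),
        frames,
        np.repeat(frames[-1:], future, axis=0),
    ])
    width = past + 1 + future
    return np.stack([padded[t:t + width].ravel() for t in range(len(frames))])

# 5 frames of 3-dim visual features -> 5 windows of 15 dims each.
X = np.arange(15).reshape(5, 3)
W = window_features(X, past=2, future=2)
```

Widening `past`/`future` grows the per-state feature vector, trading more context against more parameters to estimate.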
Abstract:
Accurate estimates of the time-to-contact (TTC) of approaching objects are crucial for survival. We used an ecologically valid driving simulation to compare and contrast the neural substrates of egocentric (head-on approach) and allocentric (lateral approach) TTC tasks in a fully factorial, event-related fMRI design. Compared to colour control tasks, both egocentric and allocentric TTC tasks activated left ventral premotor cortex/frontal operculum and inferior parietal cortex, the same areas that have previously been implicated in temporal attentional orienting. Despite differences in visual and cognitive demands, both TTC and temporal orienting paradigms encourage the use of temporally predictive information to guide behaviour, suggesting these areas may form a core network for temporal prediction. We also demonstrated that the temporal derivative of the perceptual index tau (tau-dot) held predictive value for making collision judgements and varied inversely with activity in primary visual cortex (V1). Specifically, V1 activity increased with the increasing likelihood of reporting a collision, suggesting top-down attentional modulation of early visual processing areas as a function of subjective collision. Finally, egocentric viewpoints provoked a response bias for reporting collisions, rather than no-collisions, reflecting increased caution for head-on approaches. Associated increases in SMA activity suggest motor preparation mechanisms were engaged, despite the perceptual nature of the task.
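For context, tau and tau-dot follow the standard definitions from the time-to-contact literature (tau is the ratio of an object's optical angular size to its rate of expansion) rather than anything specific to this study. A minimal sketch, assuming a constant-velocity head-on approach and small visual angles:

```python
import numpy as np

def tau_from_angular_size(theta, dt):
    """Estimate time-to-contact tau(t) = theta(t) / theta'(t) and its
    time derivative (tau-dot) from a sampled optical angular size
    theta(t). A hedged illustration of the standard definitions; the
    study's own stimulus parameters are not given in the abstract."""
    theta = np.asarray(theta, dtype=float)
    theta_dot = np.gradient(theta, dt)   # numerical d(theta)/dt
    tau = theta / theta_dot
    tau_dot = np.gradient(tau, dt)
    return tau, tau_dot

# Constant-velocity approach: distance shrinks linearly, so tau falls
# linearly toward zero and tau-dot is approximately -1.
dt = 0.01
t = np.arange(0.0, 4.0, dt)
distance = 100.0 - 20.0 * t            # closing at 20 m/s from 100 m
theta = 2.0 / distance                 # small-angle size of a 2 m object
tau, tau_dot = tau_from_angular_size(theta, dt)
```

Under this geometry tau directly specifies time-to-contact without requiring distance or speed to be known separately, which is why tau-dot can carry predictive value for collision judgements.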
Abstract:
In this survey, we summarize developments in visual cryptography (VC) since its inception in 1994, introduce the main research topics in the area, and outline current problems and possible solutions. We also examine directions and trends for future VC work, along with possible VC applications.
Abstract:
In this journal article, we take multiple secrets into consideration and generate a single key share for all of them; each secret is then shared using this key share. The secrets are recovered by superimposing the key share on the combined share at different locations using the proposed scheme. We also discuss and illustrate how to embed a share of visual cryptography into halftone and colour images; the remaining share is used as a key share in order to perform the decryption. It is also worth noting that no information regarding the secrets is leaked in any of our proposed schemes. We provide the corresponding results in this paper.
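The superimposition step relies on the classic property of visual cryptography that stacking printed transparencies acts as a pixelwise OR. A minimal sketch of the basic Naor-Shamir (2, 2) scheme for a one-dimensional secret (this is background illustration, not the multi-secret scheme the article proposes, whose details the abstract does not give):

```python
import random

def make_shares(secret_bits):
    """Split a binary secret (0 = white, 1 = black) into two shares.
    Each secret pixel expands into a pair of subpixels per share: a
    white pixel gets identical subpixel pairs in both shares, a black
    pixel gets complementary pairs. Either share alone is a uniformly
    random pattern, so it leaks nothing about the secret."""
    share1, share2 = [], []
    for bit in secret_bits:
        pattern = random.choice([(0, 1), (1, 0)])
        share1.append(pattern)
        share2.append(pattern if bit == 0 else (1 - pattern[0], 1 - pattern[1]))
    return share1, share2

def stack(share1, share2):
    """Simulate physical superposition: a subpixel is dark if it is
    dark in either transparency (bitwise OR)."""
    return [(a[0] | b[0], a[1] | b[1]) for a, b in zip(share1, share2)]

secret = [0, 1, 1, 0, 1]
s1, s2 = make_shares(secret)
stacked = stack(s1, s2)
# Black pixels reconstruct as fully dark (1, 1); white pixels keep one
# light subpixel, so the eye sees the secret as a contrast difference.
recovered = [1 if p == (1, 1) else 0 for p in stacked]
```

Decryption needs no computation at all, only stacking, which is what makes VC attractive for low-tech settings.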
Abstract:
OBJECTIVE:
To elucidate the contribution of environmental versus genetic factors to the significant losses in visual function associated with normal aging.
DESIGN:
A classical twin study.
PARTICIPANTS:
Forty-two twin pairs (21 monozygotic and 21 dizygotic; age 57-75 years) with normal visual acuity recruited through the Australian Twin Registry.
METHODS:
Cone function was evaluated by establishing absolute cone contrast thresholds to flicker (4 and 14 Hz) and to isoluminant red and blue colors under steady-state adaptation. Adaptation dynamics were determined for both cones and rods. Bootstrap resampling was used to obtain robust intrapair correlations for each parameter.
MAIN OUTCOME MEASURES:
Psychophysical thresholds and adaptational time constants.
RESULTS:
The intrapair correlations for all color and flicker thresholds, as well as cone absolute threshold, were significantly higher in monozygotic compared with dizygotic twin pairs (P<0.05). Rod absolute thresholds (P = 0.28) and rod and cone recovery rate (P = 0.83; P = 0.79, respectively) did not show significant differences between monozygotic and dizygotic twins in their intrapair correlations, indicating that steady-state cone thresholds and flicker thresholds have a marked genetic contribution, in contrast with rod thresholds and adaptive processes, which are influenced more by environmental factors over a lifetime.
CONCLUSIONS:
Genes and the environment contribute differently to important neuronal processes in the retina, and hence to the roles these processes may play in the decline of visual function with age. Consequently, retinal structures involved in rod thresholds and adaptive processes may be responsive to appropriate environmental manipulation. Because the functions tested are commonly impaired in the early stages of age-related macular degeneration, which is known to have a multifactorial etiology, this study supports the view that pathogenic pathways early in the disease may be altered by appropriate environmental intervention.
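The bootstrap step mentioned in the Methods can be sketched generically. The estimator (Pearson correlation over twin pairs), resample count, and example data below are all assumptions, since the abstract does not specify them:

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def bootstrap_intrapair_r(pairs, n_boot=2000, seed=0):
    """Resample twin pairs with replacement and return the bootstrap
    distribution of the intrapair correlation. A generic sketch of the
    procedure named in the Methods; the study's exact estimator and
    resampling scheme are not given in the abstract."""
    rng = random.Random(seed)
    rs = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        rs.append(pearson([a for a, _ in sample], [b for _, b in sample]))
    return rs

# Hypothetical thresholds for 21 twin pairs (twin 1, twin 2).
pairs = [(1.0 + 0.05 * i, 1.0 + 0.05 * i + 0.02 * (-1) ** i) for i in range(21)]
rs = bootstrap_intrapair_r(pairs)
lo, hi = sorted(rs)[50], sorted(rs)[1949]   # ~95% percentile interval
```

Comparing such bootstrap intervals between monozygotic and dizygotic groups is one standard way to test whether intrapair correlations differ significantly.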
Abstract:
Rapid orientating movements of the eyes are believed to be controlled ballistically. The mechanism underlying this control is thought to involve a comparison between the desired displacement of the eye and an estimate of its actual position (obtained from the integration of the eye velocity signal). This study shows, however, that under certain circumstances fast gaze movements may be controlled quite differently and may involve mechanisms which use visual information to guide movements prospectively. Subjects were required to make large gaze shifts in yaw towards a target whose location and motion were unknown prior to movement onset. Six of those tested demonstrated remarkable accuracy when making gaze shifts towards a target that appeared during their ongoing movement. In fact their level of accuracy was not significantly different from that shown when they performed a 'remembered' gaze shift to a known stationary target (F(3,15) = 0.15, p > 0.05). The lack of a stereotypical relationship between the skew of the gaze velocity profile and movement duration indicates that on-line modifications were being made. It is suggested that a fast route from the retina to the superior colliculus could account for this behaviour and that models of oculomotor control need to be updated.
Abstract:
In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker-independent, isolated-digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented for the baseline feature-based systems and for the combined modelling technique. We show that the two feature-based techniques achieve similar levels of performance when used independently; however, significant improvements can be achieved by combining them. In particular, we report a relative Word Error Rate improvement in excess of 17% over our best baseline system.