42 resultados para Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features
em Aston University Research Archive
Resumo:
This review describes the oculo-visual problems likely to be encountered in Parkinson's disease (PD) with special reference to three questions: (1) are there visual symptoms characteristic of the prodromal phase of PD, (2) is PD dementia associated with specific visual changes, and (3) can visual symptoms help in the differential diagnosis of the parkinsonian syndromes, viz. PD, progressive supranuclear palsy (PSP), dementia with Lewy bodies (DLB), multiple system atrophy (MSA), and corticobasal degeneration (CBD)? Oculo-visual dysfunction in PD can involve visual acuity, dynamic contrast sensitivity, colour discrimination, pupil reactivity, eye movement, motion perception, and visual processing speeds. In addition, disturbance of visuo-spatial orientation, facial recognition problems, and chronic visual hallucinations may be present. Prodromal features of PD may include autonomic system dysfunction potentially affecting pupil reactivity, abnormal colour vision, abnormal stereopsis associated with postural instability, defects in smooth pursuit eye movements, and deficits in visuo-motor adaptation, especially when accompanied by idiopathic rapid eye movement (REM) sleep behaviour disorder. PD dementia is associated with the exacerbation of many oculo-visual problems but those involving eye movements, visuo-spatial function, and visual hallucinations are most characteristic. Useful diagnostic features in differentiating the parkinsonian symptoms are the presence of visual hallucinations, visuo-spatial problems, and variation in saccadic eye movement dysfunction.
Resumo:
Recently, hemispherical asymmetries have been demonstrated for primary visual processing suggesting that basic spatiotemporal features of the stimulus may play a role in the lateralisation effects that have been observed in the human brain. However, to our knowledge no studies have reported hemispheric differences using magnetoencephalography (MEG). Hence, the objective of this study was to determine whether MEG could detect hemispherical asymmetry to the onset of a checkerboard pattern.
Resumo:
The Octopus Automated Perimeter was validated in a comparative study and found to offer many advantages in the assessment of the visual field. The visual evoked potential was investigated in an extensive study using a variety of stimulus parameters to simulate hemianopia and central visual field defects. The scalp topography was recorded topographically and a technique to compute the source derivation of the scalp potential was developed. This enabled clarification of the expected scalp distribution to half field stimulation using different electrode montages. The visual evoked potential following full field stimulation was found to be asymmetrical around the midline with a bias over the left occiput particularly when the foveal polar projections of the occipital cortex were preferentially stimulated. The half field response reflected the distribution asymmetry. Masking of the central 3° resulted in a response which was approximately symmetrical around the midline but there was no evidence of the PNP-complex. A method for visual field quantification was developed based on the neural representation of visual space (Drasdo and Peaston 1982) in an attempt to relate visual field depravation with the resultant visual evoked potentials. There was no form of simple, diffuse summation between the scalp potential and the cortical generators. It was, however, possible to quantify the degree of scalp potential attenuation for M-scaled full field stimuli. The results obtained from patients exhibiting pre-chiasmal lesions suggested that the PNP-complex is not scotomatous in nature but confirmed that it is most likely to be related to specific diseases (Harding and Crews 1982). There was a strong correlation between the percentage information loss of the visual field and the diagnostic value of the visual evoked potential in patients exhibiting chiasmal lesions.
Resumo:
There have been two main approaches to feature detection in human and computer vision - luminance-based and energy-based. Bars and edges might arise from peaks of luminance and luminance gradient respectively, or bars and edges might be found at peaks of local energy, where local phases are aligned across spatial frequency. This basic issue of definition is important because it guides more detailed models and interpretations of early vision. Which approach better describes the perceived positions of elements in a 3-element contour-alignment task? We used the class of 1-D images defined by Morrone and Burr in which the amplitude spectrum is that of a (partially blurred) square wave and Fourier components in a given image have a common phase. Observers judged whether the centre element (eg ±458 phase) was to the left or right of the flanking pair (eg 0º phase). Lateral offset of the centre element was varied to find the point of subjective alignment from the fitted psychometric function. This point shifted systematically to the left or right according to the sign of the centre phase, increasing with the degree of blur. These shifts were well predicted by the location of luminance peaks and other derivative-based features, but not by energy peaks which (by design) predicted no shift at all. These results on contour alignment agree well with earlier ones from a more explicit feature-marking task, and strongly suggest that human vision does not use local energy peaks to locate basic first-order features. [Supported by the Wellcome Trust (ref: 056093)]
Resumo:
Background: During last decade the use of ECG recordings in biometric recognition studies has increased. ECG characteristics made it suitable for subject identification: it is unique, present in all living individuals, and hard to forge. However, in spite of the great number of approaches found in literature, no agreement exists on the most appropriate methodology. This study aimed at providing a survey of the techniques used so far in ECG-based human identification. Specifically, a pattern recognition perspective is here proposed providing a unifying framework to appreciate previous studies and, hopefully, guide future research. Methods: We searched for papers on the subject from the earliest available date using relevant electronic databases (Medline, IEEEXplore, Scopus, and Web of Knowledge). The following terms were used in different combinations: electrocardiogram, ECG, human identification, biometric, authentication and individual variability. The electronic sources were last searched on 1st March 2015. In our selection we included published research on peer-reviewed journals, books chapters and conferences proceedings. The search was performed for English language documents. Results: 100 pertinent papers were found. Number of subjects involved in the journal studies ranges from 10 to 502, age from 16 to 86, male and female subjects are generally present. Number of analysed leads varies as well as the recording conditions. Identification performance differs widely as well as verification rate. Many studies refer to publicly available databases (Physionet ECG databases repository) while others rely on proprietary recordings making difficult them to compare. As a measure of overall accuracy we computed a weighted average of the identification rate and equal error rate in authentication scenarios. Identification rate resulted equal to 94.95 % while the equal error rate equal to 0.92 %. Conclusions: Biometric recognition is a mature field of research. Nevertheless, the use of physiological signals features, such as the ECG traits, needs further improvements. ECG features have the potential to be used in daily activities such as access control and patient handling as well as in wearable electronics applications. However, some barriers still limit its growth. Further analysis should be addressed on the use of single lead recordings and the study of features which are not dependent on the recording sites (e.g. fingers, hand palms). Moreover, it is expected that new techniques will be developed using fiducials and non-fiducial based features in order to catch the best of both approaches. ECG recognition in pathological subjects is also worth of additional investigations.
Resumo:
Human object recognition is considered to be largely invariant to translation across the visual field. However, the origin of this invariance to positional changes has remained elusive, since numerous studies found that the ability to discriminate between visual patterns develops in a largely location-specific manner, with only a limited transfer to novel visual field positions. In order to reconcile these contradicting observations, we traced the acquisition of categories of unfamiliar grey-level patterns within an interleaved learning and testing paradigm that involved either the same or different retinal locations. Our results show that position invariance is an emergent property of category learning. Pattern categories acquired over several hours at a fixed location in either the peripheral or central visual field gradually become accessible at new locations without any position-specific feedback. Furthermore, categories of novel patterns presented in the left hemifield are distinctly faster learnt and better generalized to other locations than those learnt in the right hemifield. Our results suggest that during learning initially position-specific representations of categories based on spatial pattern structure become encoded in a relational, position-invariant format. Such representational shifts may provide a generic mechanism to achieve perceptual invariance in object recognition.
Resumo:
A substantial amount of evidence has been collected to propose an exclusive role for the dorsal visual pathway in the control of guided visual search mechanisms, specifically in the preattentive direction of spatial selection [Vidyasagar, T. R. (1999). A neuronal model of attentional spotlight: Parietal guiding the temporal. Brain Research and Reviews, 30, 66-76; Vidyasagar, T. R. (2001). From attentional gating in macaque primary visual cortex to dyslexia in humans. Progress in Brain Research, 134, 297-312]. Moreover, it has been suggested recently that the dorsal visual pathway is specifically involved in the spatial selection and sequencing required for orthographic processing in visual word recognition. In this experiment we manipulate the demands for spatial processing in a word recognition, lexical decision task by presenting target words in a normal spatial configuration, or where the constituent letters of each word are spatially shifted relative to each other. Accurate word recognition in the Shifted-words condition should demand higher spatial encoding requirements, thereby making greater demands on the dorsal visual stream. Magnetoencephalographic (MEG) neuroimaging revealed a high frequency (35-40 Hz) right posterior parietal activation consistent with dorsal stream involvement occurring between 100 and 300 ms post-stimulus onset, and then again at 200-400 ms. Moreover, this signal was stronger in the shifted word condition, compared to the normal word condition. This result provides neurophysiological evidence that the dorsal visual stream may play an important role in visual word recognition and reading. These results further provide a plausible link between early stage theories of reading, and the magnocellular-deficit theory of dyslexia, which characterises many types of reading difficulty. © 2006 Elsevier Ltd. All rights reserved.
Resumo:
We used magnetoencephalography (MEG) to map the spatiotemporal evolution of cortical activity for visual word recognition. We show that for five-letter words, activity in the left hemisphere (LH) fusiform gyrus expands systematically in both the posterior-anterior and medial-lateral directions over the course of the first 500 ms after stimulus presentation. Contrary to what would be expected from cognitive models and hemodynamic studies, the component of this activity that spatially coincides with the visual word form area (VWFA) is not active until around 200 ms post-stimulus, and critically, this activity is preceded by and co-active with activity in parts of the inferior frontal gyrus (IFG, BA44/6). The spread of activity in the VWFA for words does not appear in isolation but is co-active in parallel with spread of activity in anterior middle temporal gyrus (aMTG, BA 21 and 38), posterior middle temporal gyrus (pMTG, BA37/39), and IFG. © 2004 Elsevier Inc. All rights reserved.
Resumo:
We report an extension of the procedure devised by Weinstein and Shanks (Memory & Cognition 36:1415-1428, 2008) to study false recognition and priming of pictures. Participants viewed scenes with multiple embedded objects (seen items), then studied the names of these objects and the names of other objects (read items). Finally, participants completed a combined direct (recognition) and indirect (identification) memory test that included seen items, read items, and new items. In the direct test, participants recognized pictures of seen and read items more often than new pictures. In the indirect test, participants' speed at identifying those same pictures was improved for pictures that they had actually studied, and also for falsely recognized pictures whose names they had read. These data provide new evidence that a false-memory induction procedure can elicit memory-like representations that are difficult to distinguish from "true" memories of studied pictures. © 2012 Psychonomic Society, Inc.
Resumo:
Recent experimental studies have shown that development towards adult performance levels in configural processing in object recognition is delayed through middle childhood. Whilst partchanges to animal and artefact stimuli are processed with similar to adult levels of accuracy from 7 years of age, relative size changes to stimuli result in a significant decrease in relative performance for participants aged between 7 and 10. Two sets of computational experiments were run using the JIM3 artificial neural network with adult and 'immature' versions to simulate these results. One set progressively decreased the number of neurons involved in the representation of view-independent metric relations within multi-geon objects. A second set of computational experiments involved decreasing the number of neurons that represent view-dependent (nonrelational) object attributes in JIM3's Surface Map. The simulation results which show the best qualitative match to empirical data occurred when artificial neurons representing metric-precision relations were entirely eliminated. These results therefore provide further evidence for the late development of relational processing in object recognition and suggest that children in middle childhood may recognise objects without forming structural description representations.
Resumo:
The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.
Resumo:
The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.
Resumo:
Parkinson's disease (PD) is a common disorder of middle-aged and elderly people, in which there is degeneration of the extra-pyramidal motor system. In some patients, the disease is associated with a range of visual signs and symptoms, including defects in visual acuity, colour vision, the blink reflex, pupil reactivity, saccadic and smooth pursuit movements and visual evoked potentials. In addition, there may be psychophysical changes, disturbances of complex visual functions such as visuospatial orientation and facial recognition, and chronic visual hallucinations. Some of the treatments associated with PD may have adverse ocular reactions. If visual problems are present, they can have an important effect on overall motor function, and quality of life of patients can be improved by accurate diagnosis and correction of such defects. Moreover, visual testing is useful in separating PD from other movement disorders with visual symptoms, such as dementia with Lewy bodies (DLB), multiple system atrophy (MSA) and progressive supranuclear palsy (PSP). Although not central to PD, visual signs and symptoms can be an important though obscure aspect of the disease and should not be overlooked.
Resumo:
This paper presents a case study of the use of a visual interactive modelling system to investigate issues involved in the management of a hospital ward. Visual Interactive Modelling systems are seen to offer the learner the opportunity to explore operational management issues from a varied perspective and to provide an interactive system in which the learner receives feedback on the consequences of their actions. However to maximise the potential learning experience for a student requires the recognition that they require task structure which helps them to understand the concepts involved. These factors can be incorporated into the visual interactive model by providing an interface customised to guide the student through the experimentation. Recent developments of VIM systems in terms of their connectivity with the programming language Visual Basic facilitates this customisation.
Resumo:
According to some models of visual selective attention, objects in a scene activate corresponding neural representations, which compete for perceptual awareness and motor behavior. During a visual search for a target object, top-down control exerted by working memory representations of the target's defining properties resolves competition in favor of the target. These models, however, ignore the existence of associative links among object representations. Here we show that such associations can strongly influence deployment of attention in humans. In the context of visual search, objects associated with the target were both recalled more often and recognized more accurately than unrelated distractors. Notably, both target and associated objects competitively weakened recognition of unrelated distractors and slowed responses to a luminance probe. Moreover, in a speeded search protocol, associated objects rendered search both slower and less accurate. Finally, the first saccades after onset of the stimulus array were more often directed toward associated than control items.