941 results for Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features


Relevance:

50.00% 50.00%

Publisher:

Abstract:

In behavior reminiscent of the responsiveness of human infants to speech, young songbirds innately recognize and prefer to learn the songs of their own species. The acoustic and physiological bases for innate recognition were investigated in fledgling white-crowned sparrows lacking song experience. A behavioral test revealed that the complete conspecific song was not essential for innate recognition: songs composed of single white-crowned sparrow phrases and songs played in reverse elicited vocal responses as strongly as did normal song. In all cases, these responses surpassed those to other species’ songs. Although auditory neurons in the song nucleus HVc and the underlying neostriatum of fledglings did not prefer conspecific song over foreign song, some neurons responded strongly to particular phrase types characteristic of white-crowned sparrows and, thus, could contribute to innate song recognition.

Abstract:

The influence of temporal association on the representation and recognition of objects was investigated. Observers were shown sequences of novel faces in which the identity of the face changed as the head rotated. As a result, observers showed a tendency to treat the views as if they were of the same person. Additional experiments revealed that this was only true if the training sequences depicted head rotations rather than jumbled views; in other words, the sequence had to be spatially as well as temporally smooth. Results suggest that we are continuously associating views of objects to support later recognition, and that we do so not only on the basis of the physical similarity, but also the correlated appearance in time of the objects.

Abstract:

Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loudspeakers to give easy accessibility to increasingly intelligent machines.

Abstract:

Deep brain stimulation (DBS) provides significant therapeutic benefit for movement disorders such as Parkinson’s disease (PD). Current DBS devices lack real-time feedback (and are thus open loop), and stimulation parameters are adjusted during scheduled visits with a clinician. A closed-loop DBS system may reduce power consumption and side effects by adjusting stimulation parameters based on the patient’s behavior. Thus, behavior detection is a major step in designing such systems. Various physiological signals can be used to recognize behaviors. The subthalamic nucleus (STN) local field potential (LFP) is a strong candidate signal for neural feedback because it can be recorded from the stimulation lead and does not require additional sensors. This thesis proposes novel detection and classification techniques for behavior recognition based on deep brain LFP. Behavior detection from such signals is a vital step in developing the next generation of closed-loop DBS devices. LFP recordings from 13 subjects are used in this study to design and evaluate our method. Recordings were performed during surgery, and the subjects were asked to perform various behavioral tasks. Various techniques are used to understand how the behaviors modulate the STN. One method studies the time-frequency patterns in the STN LFP during the tasks. Another method measures the temporal inter-hemispheric connectivity of the STN as well as the connectivity between the STN and the prefrontal cortex (PFC). Experimental results demonstrate that different behaviors create different modulation patterns in the STN and its connectivity. We use these patterns as features to classify behaviors. A method for single-trial recognition of the patient’s current task is proposed. This method uses wavelet coefficients as features and a support vector machine (SVM) as the classifier for recognition of a selection of behaviors: speech, motor, and random.
The proposed method is 82.4% accurate for binary classification and 73.2% accurate for classifying three tasks. As the next step, a practical behavior detection method that detects behaviors asynchronously is proposed. This method does not use any a priori knowledge of behavior onsets and is capable of asynchronously detecting the finger movements of PD patients. Our study indicates that there is a motor-modulated inter-hemispheric connectivity between LFP signals recorded bilaterally from the STN. We use a non-linear regression method to measure this inter-hemispheric connectivity and to detect the finger movements. Our experimental results using STN LFP recorded from eight patients with PD demonstrate that this is a promising approach for behavior detection and for developing novel closed-loop DBS systems.
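
As a rough illustration of the "wavelet coefficients as features, then a classifier" idea in this abstract, the following is a minimal sketch under stated assumptions, not the thesis's method. The abstract does not name the wavelet family or the feature layout, so a Haar transform and per-level energies are assumed here. A nearest-centroid rule stands in for the SVM to keep the example dependency-free. The trials are synthetic sinusoids, not LFP data, and every name (`haar_band_energies`, `make_trial`, `task_a`, `task_b`) is hypothetical.

```python
import math
import random


def haar_band_energies(signal, levels=3):
    """Mean energy of the Haar detail coefficients at each level,
    followed by the mean energy of the final approximation.
    Assumes len(signal) is at least 2 ** levels."""
    approx = list(signal)
    features = []
    for _ in range(levels):
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features


def fit_centroids(feature_rows, labels):
    """Average the feature vectors of each class (stand-in for SVM training)."""
    sums, counts = {}, {}
    for row, label in zip(feature_rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((c - x) ** 2 for c, x in zip(centroids[label], row)),
    )


def make_trial(period, rng, n=64):
    """Synthetic single trial: a sinusoid with random phase plus Gaussian noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(2.0 * math.pi * i / period + phase) + rng.gauss(0.0, 0.1)
            for i in range(n)]


# Hypothetical two-class setup: slow oscillation vs. fast oscillation.
PERIODS = {"task_a": 32, "task_b": 4}

rng = random.Random(0)
train_rows, train_labels = [], []
for label, period in PERIODS.items():
    for _ in range(10):
        train_rows.append(haar_band_energies(make_trial(period, rng)))
        train_labels.append(label)

centroids = fit_centroids(train_rows, train_labels)
new_trial = make_trial(PERIODS["task_a"], rng)
print(predict(centroids, haar_band_energies(new_trial)))
```

Per-level energies are used instead of raw coefficients so that the features do not depend on where in the oscillation a trial happens to start. Whether the original work made a similar choice is not stated in the abstract.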

Abstract:

Parkinson disease is mainly characterized by the degeneration of dopaminergic neurons in the central nervous system, including the retina. Different interrelated molecular mechanisms underlying Parkinson disease-associated neuronal death have been put forward in the brain, including oxidative stress and mitochondrial dysfunction. Systemic injection of the proneurotoxin 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP) into monkeys elicits the appearance of a parkinsonian syndrome, including morphological and functional impairments in the retina. However, the intracellular events leading to derangement of dopaminergic and other retinal neurons in MPTP-treated animal models have not so far been investigated. Here we have used a comparative proteomics approach to identify proteins differentially expressed in the retina of MPTP-treated monkeys. Proteins were solubilized from the neural retinas of control and MPTP-treated animals, labelled separately with two different cyanine fluorophores and run pairwise on 2D DIGE gels. Out of >700 protein spots resolved and quantified, 36 were found to exhibit statistically significant differences in their expression levels, of at least ±1.4-fold, in the parkinsonian monkey retina compared with controls. Most of these spots were excised from preparative 2D gels, trypsinized and subjected to MALDI-TOF MS and LC-MS/MS analyses. Data obtained were used for protein sequence database interrogation, and 15 different proteins were successfully identified, of which 13 were underexpressed and 2 overexpressed. These proteins were involved in key cellular functional pathways such as glycolysis and mitochondrial electron transport, neuronal protection against stress and survival, and phototransduction processes.
These functional categories underscore that alterations in energy metabolism, neuroprotective mechanisms and signal transduction are involved in MPTP-induced neuronal degeneration in the retina, similar to the mechanisms thought to underlie neuronal death in the Parkinson’s diseased brain and in neurodegenerative diseases of the retina proper.

Abstract:

Drawing from ethnographic, empirical, and historical/cultural perspectives, we examine the extent to which visual aspects of music contribute to the communication that takes place between performers and their listeners. First, we introduce a framework for understanding how media and genres shape aural and visual experiences of music. Second, we present case studies of two performances, and describe the relation between visual and aural aspects of performance. Third, we report empirical evidence that visual aspects of performance reliably influence perceptions of musical structure (pitch-related features) and affective interpretations of music. Finally, we trace new and old media trajectories of aural and visual dimensions of music, and highlight how our conceptions, perceptions and appreciation of music are intertwined with technological innovation and media deployment strategies.

Abstract:

Recognising the laterality of a pictured hand involves making an initial decision and confirming that choice by mentally moving one's own hand to match the picture. This depends on an intact body schema. Because patients with complex regional pain syndrome type 1 (CRPS1) take longer to recognise a hand's laterality when it corresponds to their affected hand, it has been proposed that nociceptive input disrupts the body schema. However, chronic pain is associated with physiological and psychosocial complexities that may also explain the results. In three studies, we investigated whether the effect is simply due to nociceptive input. Study one evaluated the temporal and perceptual characteristics of acute hand pain elicited by intramuscular injection of hypertonic saline into the thenar eminence. In studies two and three, subjects performed a hand laterality recognition task before, during, and after acute experimental hand pain, and experimental elbow pain, respectively. During hand pain and during elbow pain, when the laterality of the pictured hand corresponded to the painful side, there was no effect on response time (RT). That suggests that nociceptive input alone is not sufficient to disrupt the working body schema. In contrast to patients with CRPS1, when the laterality of the pictured hand corresponded to the non-painful hand, RT increased by approximately 380 ms (95% confidence interval 190 ms-590 ms). The results highlight the differences between acute and chronic pain and may reflect a bias in information processing in acute pain toward the affected part.

Abstract:

In the past, the accuracy of facial approximations has been assessed by resemblance ratings (i.e., the comparison of a facial approximation directly to a target individual) and recognition tests (e.g., the comparison of a facial approximation to a photo array of faces including foils and a target individual). Recently, several research studies have indicated that recognition tests hold major strengths in contrast to resemblance ratings. However, resemblance ratings remain popularly employed and/or are given weighting when judging facial approximations, thus indicating that no consensus has been reached. This study aims to further investigate the matter by comparing the results of resemblance ratings and recognition tests for two facial approximations which clearly differed in their morphological appearance. One facial approximation was constructed by an experienced practitioner privy to the appearance of the target individual (the practitioner had direct access to an antemortem frontal photograph during face construction), while the other facial approximation was constructed by a novice under blind conditions. Both facial approximations, whilst clearly morphologically different, were given similar resemblance scores even though the recognition tests produced vastly different results. One facial approximation was correctly recognized almost without exception while the other was not correctly recognized above chance rates. These results suggest that resemblance ratings are insensitive measures of the accuracy of facial approximations and lend further weight to the use of recognition tests in facial approximation assessment. (c) 2006 Elsevier Ireland Ltd. All rights reserved.

Abstract:

Marked phenotypic variation has been reported in pyramidal cells in the primate cerebral cortex. The extent and systematic nature of these specializations suggest that they are important for specialized aspects of cortical processing. However, it remains unknown whether regional variations in the pyramidal cell phenotype are unique to primates or are widespread amongst mammalian species. In the present study we determined the receptive fields of neurons in striate and extrastriate visual cortex, and quantified pyramidal cell structure in these cortical regions, in the diurnal, large-brained, South American rodent Dasyprocta primnolopha. We found evidence for a first, second and third visual area (V1, V2 and V3, respectively) forming a lateral progression from the occipital pole to the temporal pole. Pyramidal cell structure became increasingly complex through these areas, suggesting that regional specialization in pyramidal cell phenotype is not restricted to primates. However, cells in V1, V2 and V3 of the agouti were considerably more spinous than their counterparts in primates, suggesting that different evolutionary and developmental influences may act on cortical microcircuitry in rodents and primates. (c) 2006 Elsevier B.V. All rights reserved.

Abstract:

PURPOSE. The driving environment is becoming increasingly complex, including both visual and auditory distractions within the in-vehicle and external driving environments. This study was designed to investigate the effect of visual and auditory distractions on a performance measure that has been shown to be related to driving safety, the useful field of view. METHODS. A laboratory study recorded the useful field of view in 28 young visually normal adults (mean 22.6 ± 2.2 years). The useful field of view was measured in the presence and absence of visual distracters (of the same angular subtense as the target) and with three levels of auditory distraction (none, listening only, listening and responding). RESULTS. Central errors increased significantly (P < 0.05) in the presence of auditory but not visual distracters, while peripheral errors increased in the presence of both visual and auditory distracters. Peripheral errors increased with eccentricity and were greatest in the inferior region in the presence of distracters. CONCLUSIONS. Visual and auditory distracters reduce the extent of the useful field of view, and these effects are exacerbated in inferior and peripheral locations. This result has significant ramifications for road safety in an increasingly complex in-vehicle and driving environment.

Abstract:

This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU speech database system with the aim of labelling departures from native-speaker pronunciation. An analysis of prosodic features was made using the ToBI framework. The results show many phonetic and prosodic processes which make non-native speakers’ speech distinct from that of native speakers. The corpus-based methodology of analysing foreign accent may have implications for the evaluation of non-native accent, accented speech recognition and computer-assisted pronunciation learning.