941 results for Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features
Abstract:
"Es tracta d'un projecte dividit en dues parts independents però complementàries, realitzades per autors diferents. Aquest document conté originàriament altre material i/o programari només consultable a la Biblioteca de Ciència i Tecnologia"
Abstract:
Brittle cornea syndrome (BCS) is an autosomal recessive disorder characterised by extreme corneal thinning and fragility. Corneal rupture can therefore occur either spontaneously or following minimal trauma in affected patients. Two genes, ZNF469 and PRDM5, have now been identified, in which causative pathogenic mutations collectively account for the condition in nearly all patients with BCS ascertained to date. Therefore, effective molecular diagnosis is now available for affected patients, and those at risk of being heterozygous carriers for BCS. We have previously identified mutations in ZNF469 in 14 families (in addition to 6 reported by others in the literature), and in PRDM5 in 8 families (with 1 further family now published by others). Clinical features include extreme corneal thinning with rupture, high myopia, blue sclerae, deafness of mixed aetiology with hypercompliant tympanic membranes, and variable skeletal manifestations. Corneal rupture may be the presenting feature of BCS, and it is possible that this may be incorrectly attributed to non-accidental injury. Mainstays of management include the prevention of ocular rupture by provision of protective polycarbonate spectacles, careful monitoring of visual and auditory function, and assessment for skeletal complications such as developmental dysplasia of the hip. Effective management depends upon appropriate identification of affected individuals, which may be challenging given the phenotypic overlap of BCS with other connective tissue disorders.
Abstract:
A right-handed man developed a sudden, transient amnestic syndrome associated with bilateral hemorrhage of the hippocampi, probably due to Urbach-Wiethe disease. By the third month, despite significant hippocampal structural damage on imaging, only a milder degree of retrograde and anterograde amnesia persisted on detailed neuropsychological examination. On systematic testing of recognition of facial and vocal expressions of emotion, we found an impairment of the vocal perception of fear, but not of other emotions such as joy, sadness and anger. Such selective impairment of fear perception was not present in the recognition of facial expressions of emotion. Thus, emotional perception varies according to the different aspects of emotions and the modality of presentation (faces versus voices). This is consistent with the idea that there may be multiple emotion systems. The study of emotional perception in this unique case of bilateral involvement of the hippocampus suggests that this structure may play a critical role in the recognition of fear in vocal expression, possibly dissociated from that of other emotions and from that of fear in facial expression. Given recent data suggesting that the amygdala plays a role in the recognition of fear in the auditory as well as the visual modality, this could suggest that the hippocampus may be part of the auditory pathway of fear recognition.
Abstract:
Several features that can be extracted from digital images of the sky, and that can be useful for cloud-type classification of such images, are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image, and others are computed from the image once cloudy pixels have been distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely a total sky imager (TSI) and a whole sky camera (WSC), which are placed in two different areas of the world (Toowoomba, Australia, and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion of the future directions of this research is also presented, regarding both the use of other features and the use of other classification techniques.
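As an illustration of the kind of pipeline this abstract describes, the sketch below extracts a few texture, spectral and cloud-cover features from a sky image in Python. The specific thresholds, the red/blue cloud criterion and the feature set are assumptions for illustration, not the paper's exact features.

    import numpy as np

    def extract_sky_features(rgb):
        """rgb: HxWx3 float array in [0, 1] of a sky image."""
        red, blue = rgb[..., 0], rgb[..., 2]
        gray = rgb.mean(axis=2)
        # Statistical texture measures on the intensity image.
        mean, std = gray.mean(), gray.std()
        smoothness = 1.0 - 1.0 / (1.0 + gray.var())
        # Fourier-based feature: share of energy at high spatial frequencies.
        spec = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
        h, w = spec.shape
        yy, xx = np.ogrid[:h, :w]
        radius = np.hypot(yy - h / 2, xx - w / 2)
        high_freq = spec[radius > min(h, w) / 8].sum() / spec.sum()
        # Cloud cover from a red/blue ratio test (clear sky is strongly blue).
        cloud_fraction = np.mean(red / (blue + 1e-6) > 0.6)
        return np.array([mean, std, smoothness, high_freq, cloud_fraction])

Feature vectors such as these could then be fed to any standard classifier trained against the visually inspected labels.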
Abstract:
The current state of empirical investigations refers to consciousness as an all-or-none phenomenon. However, a recent theoretical account opens up this perspective by proposing a partial level (between nil and full) of conscious perception. In the well-studied case of single-word reading, short-lived exposure can trigger incomplete word-form recognition wherein letters fall short of forming a whole word in one's conscious perception thereby hindering word-meaning access and report. Hence, the processing from incomplete to complete word-form recognition straightforwardly mirrors a transition from partial to full-blown consciousness. We therefore hypothesized that this putative functional bottleneck to consciousness (i.e. the perceptual boundary between partial and full conscious perception) would emerge at a major key hub region for word-form recognition during reading, namely the left occipito-temporal junction. We applied a real-time staircase procedure and titrated subjective reports at the threshold between partial (letters) and full (whole word) conscious perception. This experimental approach allowed us to collect trials with identical physical stimulation, yet reflecting distinct perceptual experience levels. Oscillatory brain activity was monitored with magnetoencephalography and revealed that the transition from partial-to-full word-form perception was accompanied by alpha-band (7-11 Hz) power suppression in the posterior left occipito-temporal cortex. This modulation of rhythmic activity extended anteriorly towards the visual word form area (VWFA), a region whose selectivity for word-forms in perception is highly debated. The current findings provide electrophysiological evidence for a functional bottleneck to consciousness thereby empirically instantiating a recently proposed partial perspective on consciousness. Moreover, the findings provide an entirely new outlook on the functioning of the VWFA as a late bottleneck to full-blown conscious word-form perception.
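For readers unfamiliar with adaptive staircases, here is a minimal sketch of a 1-up/1-down procedure that titrates exposure duration at the boundary between partial (letters) and full (whole word) reports. The step size, limits and trial count are illustrative assumptions; the study's real-time implementation is not described at this level of detail.

    def staircase(get_report, start_ms=50.0, step_ms=8.0, n_trials=120):
        """get_report(duration_ms) -> True if the whole word was reported."""
        duration, history = start_ms, []
        for _ in range(n_trials):
            saw_word = get_report(duration)
            history.append((duration, saw_word))
            # 1-up/1-down rule: converge on the 50% "whole word" threshold,
            # i.e. the boundary between partial and full perception.
            duration += -step_ms if saw_word else step_ms
            duration = max(10.0, min(duration, 200.0))  # clamp to a sane range
        return history

Holding the stimulation near this threshold is what yields trials with identical physical input but different perceptual experience levels.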
Abstract:
Evidence from neuropsychological and activation studies (Clarke et al., 2000; Maeder et al., 2000) suggests that sound recognition and localisation are processed by two anatomically and functionally distinct cortical networks. We report here on a patient who had an interruption of auditory information, and we show: i) the effects of this interruption on cortical auditory processing; and ii) the effect of workload on the activation pattern. A 36-year-old man suffered a small left mesencephalic haemorrhage due to a cavernous angioma; the left inferior colliculus was resected in the surgical approach to the vascular malformation. In the acute stage, the patient complained of auditory hallucinations and of auditory loss in the right ear, while tonal audiometry was normal. At 12 months, auditory recognition, auditory localisation (assessed by ITD and IID cues) and auditory motion perception were normal (Clarke et al., 2000), while verbal dichotic listening was deficient on the right side. Sound recognition and sound localisation activation patterns were investigated with fMRI, using a passive and an active paradigm. In normal subjects, distinct cortical networks were involved in sound recognition and localisation, in both the passive and the active paradigm (Maeder et al., 2000a, 2000b). Passive listening to environmental and spatial stimuli, as compared to rest, strongly activated the right auditory cortex but failed to activate the left primary auditory cortex. The specialised networks for sound recognition and localisation could not be visualised on the right and only minimally on the left convexity. A very different activation pattern was obtained in the active condition, where a motor response was required. Workload not only increased the activation of the right auditory cortex, but also allowed the activation of the left primary auditory cortex. The specialised networks for sound recognition and localisation were almost completely present in both hemispheres. These results show that increasing the workload can i) help to recruit cortical regions in the auditory deafferented hemisphere; and ii) lead to processing of auditory information within specific cortical networks.
References:
Clarke et al. (2000). Neuropsychologia 38: 797-807.
Maeder et al. (2000a). Neuroimage 11: S52.
Maeder et al. (2000b). Neuroimage 11: S33.
Abstract:
Single-trial encounters with multisensory stimuli affect both memory performance and early-latency brain responses to visual stimuli. Whether and how auditory cortices support memory processes based on single-trial multisensory learning is unknown and may differ qualitatively and quantitatively from comparable processes within visual cortices due to purported differences in memory capacities across the senses. We recorded event-related potentials (ERPs) as healthy adults (n = 18) performed a continuous recognition task in the auditory modality, discriminating initial (new) from repeated (old) sounds of environmental objects. Initial presentations were either unisensory or multisensory; the latter entailed synchronous presentation of a semantically congruent or a meaningless image. Repeated presentations were exclusively auditory, thus differing only according to the context in which the sound was initially encountered. Discrimination abilities (indexed by d') were increased for repeated sounds that were initially encountered with a semantically congruent image versus sounds initially encountered with either a meaningless or no image. Analyses of ERPs within an electrical neuroimaging framework revealed that early stages of auditory processing of repeated sounds were affected by prior single-trial multisensory contexts. These effects followed from significantly reduced activity within a distributed network, including the right superior temporal cortex, suggesting an inverse relationship between brain activity and behavioural outcome on this task. The present findings demonstrate how auditory cortices contribute to long-term effects of multisensory experiences on auditory object discrimination. We propose a new framework for the efficacy of multisensory processes to impact both current multisensory stimulus processing and unisensory discrimination abilities later in time.
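As background on the reported measure, below is a minimal sketch of the d' sensitivity index under standard signal detection conventions; the log-linear correction is a common choice assumed here, not necessarily the one used in the study.

    from statistics import NormalDist

    def d_prime(hits, misses, false_alarms, correct_rejections):
        """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
        z = NormalDist().inv_cdf
        # Log-linear correction keeps rates away from 0 and 1.
        hit_rate = (hits + 0.5) / (hits + misses + 1.0)
        fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
        return z(hit_rate) - z(fa_rate)

    # e.g. d_prime(80, 20, 30, 70): higher values mean better old/new
    # discrimination for that encoding condition.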
Abstract:
Increasing evidence suggests that working memory and perceptual processes are dynamically interrelated due to modulating activity in overlapping brain networks. However, the direct influence of working memory on the spatio-temporal brain dynamics of behaviorally relevant intervening information remains unclear. To investigate this issue, subjects performed a visual proximity grid perception task under three different visual-spatial working memory (VSWM) load conditions. VSWM load was manipulated by asking subjects to memorize the spatial locations of 6 or 3 disks. The grid was always presented between the encoding and recognition of the disk pattern. As a baseline condition, grid stimuli were presented without a VSWM context. VSWM load altered both perceptual performance and neural networks active during intervening grid encoding. Participants performed faster and more accurately on a challenging perceptual task under high VSWM load as compared to the low load and the baseline condition. Visual evoked potential (VEP) analyses identified changes in the configuration of the underlying sources in one particular period occurring 160-190 ms post-stimulus onset. Source analyses further showed an occipito-parietal down-regulation concurrent to the increased involvement of temporal and frontal resources in the high VSWM context. Together, these data suggest that cognitive control mechanisms supporting working memory may selectively enhance concurrent visual processing related to an independent goal. More broadly, our findings are in line with theoretical models implicating the engagement of frontal regions in synchronizing and optimizing mnemonic and perceptual resources towards multiple goals.
Abstract:
Multisensory memory traces established via single-trial exposures can impact subsequent visual object recognition. This impact appears to depend on the meaningfulness of the initial multisensory pairing, implying that multisensory exposures establish distinct object representations that are accessible during later unisensory processing. Multisensory contexts may be particularly effective in influencing auditory discrimination, given the purportedly inferior recognition memory in this sensory modality. The possibility of this generalization and the equivalence of effects when memory discrimination was being performed in the visual vs. auditory modality were at the focus of this study. First, we demonstrate that visual object discrimination is affected by the context of prior multisensory encounters, replicating and extending previous findings by controlling for the probability of multisensory contexts during initial as well as repeated object presentations. Second, we provide the first evidence that single-trial multisensory memories impact subsequent auditory object discrimination. Auditory object discrimination was enhanced when initial presentations entailed semantically congruent multisensory pairs and was impaired after semantically incongruent multisensory encounters, compared to sounds that had been encountered only in a unisensory manner. Third, the impact of single-trial multisensory memories upon unisensory object discrimination was greater when the task was performed in the auditory vs. visual modality. Fourth, there was no evidence for correlation between effects of past multisensory experiences on visual and auditory processing, suggestive of largely independent object processing mechanisms between modalities. We discuss these findings in terms of the conceptual short term memory (CSTM) model and predictive coding. Our results suggest differential recruitment and modulation of conceptual memory networks according to the sensory task at hand.
Abstract:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides the cores of objects, which are used to acquire a more accurate object model. Growing these cores by means of specific active regions then allows accurate recognition of known regions. Next, a general segmentation stage provides the segmentation of unknown regions by a bottom-up strategy. Finally, the last stage attempts a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal.
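One possible reading of the top-down stage is sketched below: per-pixel class posteriors are formed from learned appearance and absolute-location models, and only high-confidence pixels are kept as object cores for the subsequent region growing. The model interfaces and the confidence threshold are hypothetical, not the paper's exact formulation.

    import numpy as np

    def pixel_classification(image, appearance_models, location_priors, thresh=0.9):
        """image: HxWx3 array. appearance_models maps each class to a callable
        returning HxW appearance likelihoods; location_priors maps each class
        to an HxW prior map. Returns HxW labels, -1 where no class is
        confident (the confident pixels form the object cores)."""
        classes = list(appearance_models)
        post = np.stack([appearance_models[c](image) * location_priors[c]
                         for c in classes])            # unnormalised posteriors
        post = post / (post.sum(axis=0, keepdims=True) + 1e-12)
        labels = post.argmax(axis=0)
        confident = post.max(axis=0) > thresh          # keep only object cores
        return np.where(confident, labels, -1)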
Abstract:
Local features are used in many computer vision tasks, including visual object categorization, content-based image retrieval and object recognition, to name a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of extracted local features, the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis, the performance of local feature detectors and descriptors is evaluated for the object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched for using random image pairs of the same class. The goal of this thesis is to find out which detector and descriptor methods are best for such a task in terms of detector repeatability and descriptor matching rate.
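A minimal sketch of one evaluation step is given below, assuming OpenCV's SIFT as the detector/descriptor and Lowe's ratio test for matching; these are common conventions, not necessarily the thesis's exact protocol, and the matching-rate definition is an assumption.

    import cv2

    def matching_rate(img_a, img_b, ratio=0.8):
        """img_a, img_b: grayscale uint8 images of the same object class."""
        sift = cv2.SIFT_create()
        kp_a, des_a = sift.detectAndCompute(img_a, None)
        kp_b, des_b = sift.detectAndCompute(img_b, None)
        if des_a is None or des_b is None:
            return 0.0
        pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
        # Lowe's ratio test rejects ambiguous nearest-neighbour matches.
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        return len(good) / max(len(kp_a), 1)

Averaging this rate over many random same-class pairs gives one of the per-descriptor scores the thesis compares.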
Abstract:
During a possible loss-of-coolant accident in BWRs, a large amount of steam will be released from the reactor pressure vessel to the suppression pool. Steam will condense in the suppression pool, causing dynamic and structural loads on the pool. The formation and break-up of bubbles can be measured by visual observation using a suitable pattern recognition algorithm. The aim of this study was to improve the preliminary pattern recognition algorithm, developed by Vesa Tanskanen in his doctoral dissertation, using MATLAB. Video material from the PPOOLEX test facility, recorded during thermal stratification and mixing experiments, was used as a reference in the development of the algorithm. The developed algorithm consists of two parts: the pattern recognition of the bubbles and the analysis of the recognized bubble images. The bubble recognition works well, but some errors appear due to the complex structure of the pool. The results of the image analysis were reasonable. The volume and the surface area of the bubbles were not evaluated. Chugging frequencies calculated using the FFT fitted well with the oscillation frequencies measured in the experiments. The pattern recognition algorithm works in the conditions it was designed for. If the measurement configuration is changed, some modifications will have to be made. Numerous improvements are proposed for the future 3D equipment.
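The two algorithm stages could look roughly like the following sketch (in Python rather than the study's MATLAB): threshold-based bubble segmentation per frame, then an FFT of the bubble-area time series to estimate the dominant chugging frequency. The threshold and frame rate are illustrative assumptions.

    import numpy as np

    def bubble_area_series(frames, thresh=0.5):
        """frames: iterable of 2-D grayscale arrays in [0, 1]."""
        # Dark pixels are taken to belong to the bubble region.
        return np.array([(f < thresh).sum() for f in frames])

    def chugging_frequency(areas, fps=250.0):
        areas = areas - areas.mean()               # remove the DC component
        spectrum = np.abs(np.fft.rfft(areas))
        freqs = np.fft.rfftfreq(len(areas), d=1.0 / fps)
        return freqs[spectrum.argmax()]            # dominant oscillation in Hz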
Abstract:
Motivated by a recently proposed biologically inspired face recognition approach, we investigated the relation between human behavior and a computational model based on Fourier-Bessel (FB) spatial patterns. We measured human recognition performance of FB filtered face images using an 8-alternative forced-choice method. Test stimuli were generated by converting the images from the spatial to the FB domain, filtering the resulting coefficients with a band-pass filter, and finally taking the inverse FB transformation of the filtered coefficients. The performance of the computational models was tested using a simulation of the psychophysical experiment. In the FB model, face images were first filtered by simulated V1- type neurons and later analyzed globally for their content of FB components. In general, there was a higher human contrast sensitivity to radially than to angularly filtered images, but both functions peaked at the 11.3-16 frequency interval. The FB-based model presented similar behavior with regard to peak position and relative sensitivity, but had a wider frequency band width and a narrower response range. The response pattern of two alternative models, based on local FB analysis and on raw luminance, strongly diverged from the human behavior patterns. These results suggest that human performance can be constrained by the type of information conveyed by polar patterns, and consequently that humans might use FB-like spatial patterns in face processing.
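To make the stimulus-generation idea concrete, here is a toy sketch of band-pass filtering in an FB-like domain: project the image onto Bessel-based basis functions, keep only a band of radial frequencies, and reconstruct. It uses cosine (even) angular terms only and simplified normalisation, so it is an assumption-laden illustration rather than the authors' transform.

    import numpy as np
    from scipy.special import jv, jn_zeros

    def fb_bandpass(img, m_max=8, n_max=20, band=(11, 16)):
        """Band-pass a square image over radial FB indices band[0]..band[1]."""
        img = np.asarray(img, dtype=float)
        h, w = img.shape
        yy, xx = np.mgrid[:h, :w]
        r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
        theta = np.arctan2(yy - h / 2, xx - w / 2)
        inside = (r <= 1.0).astype(float)          # unit-disc support
        out = np.zeros_like(img)
        for m in range(m_max + 1):
            for n, root in enumerate(jn_zeros(m, n_max), start=1):
                if not band[0] <= n <= band[1]:
                    continue                       # outside the pass band
                basis = jv(m, root * r) * np.cos(m * theta) * inside
                coeff = (img * basis).sum() / (basis ** 2).sum()
                out += coeff * basis
        return out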
Abstract:
The visualization of tools and manipulable objects activates motor-related areas in the cortex, facilitating possible actions toward them. This pattern of activity may underlie the phenomenon of object affordance. Some cortical motor neurons are also covertly activated during the recognition of body parts such as hands. One hypothesis is that different subpopulations of motor neurons in the frontal cortex are activated in each motor program; for example, canonical neurons in the premotor cortex are responsible for the affordance of visual objects, while mirror neurons support the motor imagery triggered during handedness recognition. However, the question remains whether these subpopulations work independently. This hypothesis can be tested with a manual reaction time (MRT) task using a priming paradigm to evaluate whether the view of a manipulable object interferes with the motor imagery of the subject's hand. The MRT provides a measure of the time course of information processing in the brain and allows indirect evaluation of cognitive processes. Our results suggest that canonical and mirror neurons work together to create a motor plan involving hand movements to facilitate successful object manipulation.
Abstract:
Many industrial applications need object recognition and tracking capabilities. The algorithms developed for these purposes are computationally expensive. Yet real-time performance, high accuracy and low power consumption are essential measures of such systems. When all these requirements are combined, hardware acceleration of these algorithms becomes a feasible solution. The purpose of this study is to analyze the current state of these hardware acceleration solutions: which algorithms have been implemented in hardware, and what modifications have been made to adapt these algorithms to hardware.