873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI
Abstract:
Proceedings of the International Conference on Computer Vision Theory and Applications, 361-365, 2013, Barcelona, Spain
Abstract:
The robotics community is concerned with the ability to evaluate and compare results from researchers in areas such as visual perception and multi-robot cooperative behavior. To accomplish that task, this paper proposes a real-time indoor visual ground-truth system capable of providing accuracy at least one order of magnitude better than the precision of the algorithm to be evaluated. A multi-camera architecture is proposed under the ROS (Robot Operating System) framework to estimate the 3D position of objects, and the implementation and results are contextualized in the RoboCup Middle Size League scenario.
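As a rough illustration of the core operation such a multi-camera ground-truth system performs, the sketch below triangulates one object's 3D position from two calibrated views with OpenCV. The intrinsics, baseline, and pixel coordinates are hypothetical placeholders, not values from the paper.

```python
import numpy as np
import cv2

# Shared intrinsics for two identical cameras (hypothetical values)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# 3x4 projection matrices P = K [R | t]; second camera offset by a 0.5 m baseline
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# One matched pixel observation of the same object in each view (2xN arrays)
pts1 = np.array([[340.0], [250.0]])
pts2 = np.array([[260.0], [250.0]])

# Linear triangulation; the result is in homogeneous coordinates (4xN)
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).ravel()  # dehomogenize to a metric 3D position
print("Estimated 3D position (m):", X)
```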
Abstract:
Dissertation submitted to obtain the Master's degree in Biomedical Engineering
Abstract:
In the early nineties, Mark Weiser wrote a series of seminal papers that introduced the concept of Ubiquitous Computing. According to Weiser, computers require too much attention from the user, drawing focus away from the tasks at hand. Instead of being the centre of attention, computers should be so natural that they would vanish into the human environment, becoming not only truly pervasive but also effectively invisible and unobtrusive to the user. This calls not only for smaller, cheaper and lower-power computers, but also for equally convenient display solutions that can be harmoniously integrated into our surroundings. With the advent of Printed Electronics, new ways to link the physical and the digital worlds became available. By combining common printing techniques such as inkjet printing with electro-optical functional inks, it is becoming possible not only to mass-produce extremely thin, flexible and cost-effective electronic circuits but also to introduce electronic functionalities into products where they were previously unavailable. Indeed, Printed Electronics is enabling the creation of novel sensing and display elements for interactive devices, free of form-factor constraints. At the same time, the rising availability and affordability of digital fabrication technologies, namely 3D printers, to the average consumer is fostering a new industrial (digital) revolution and the democratisation of innovation. Nowadays, end-users are already able to custom-design and manufacture their own physical products on demand, according to their own needs. In the future, they will be able to fabricate interactive digital devices with user-specific form and functionality from the comfort of their homes. This thesis explores how task-specific, low-computation, interactive devices capable of presenting dynamic visual information can be created using Printed Electronics technologies, following an approach based on the ideals behind Personal Fabrication. Focus is given to the use of printed electrochromic displays as a medium for delivering dynamic digital information. Several approaches are highlighted and categorised according to the architecture of the displays. Furthermore, a pictorial computation model based on extended cellular automata principles is used to programme dynamic simulation models into matrix-based electrochromic displays. Envisaged applications include the modelling of physical, chemical, biological and environmental phenomena.
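To make the pictorial computation idea concrete, here is a minimal sketch of a synchronous cellular automaton driving a small matrix display. Conway's life rule stands in for the thesis's extended rules, and the 8x8 grid size is an assumption for illustration only.

```python
import numpy as np

def step(grid):
    # One synchronous update: count the 8 neighbours of every cell
    # (periodic boundaries via np.roll), then apply the local rule.
    n = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    # Conway's life rule, a stand-in for the thesis's extended rules
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(int)

# Each cell maps to one electrochromic pixel of a matrix display
display = np.random.randint(0, 2, size=(8, 8))
for _ in range(10):
    display = step(display)  # each new frame would be driven to the pixels
```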
Abstract:
Advances in information technology (e.g., digital video cameras, ubiquitous sensors) have made the automatic detection of human behaviors from video a recent and active research topic. In this paper, we perform a systematic review of the recent literature on this topic, from 2000 to 2014, covering a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research in designing automatic visual human behavior detection systems.
Abstract:
Brittle cornea syndrome (BCS) is an autosomal recessive disorder characterised by extreme corneal thinning and fragility. Corneal rupture can therefore occur either spontaneously or following minimal trauma in affected patients. Two genes, ZNF469 and PRDM5, have now been identified, in which causative pathogenic mutations collectively account for the condition in nearly all patients with BCS ascertained to date. Therefore, effective molecular diagnosis is now available for affected patients, and those at risk of being heterozygous carriers for BCS. We have previously identified mutations in ZNF469 in 14 families (in addition to 6 reported by others in the literature), and in PRDM5 in 8 families (with 1 further family now published by others). Clinical features include extreme corneal thinning with rupture, high myopia, blue sclerae, deafness of mixed aetiology with hypercompliant tympanic membranes, and variable skeletal manifestations. Corneal rupture may be the presenting feature of BCS, and it is possible that this may be incorrectly attributed to non-accidental injury. Mainstays of management include the prevention of ocular rupture by provision of protective polycarbonate spectacles, careful monitoring of visual and auditory function, and assessment for skeletal complications such as developmental dysplasia of the hip. Effective management depends upon appropriate identification of affected individuals, which may be challenging given the phenotypic overlap of BCS with other connective tissue disorders.
Abstract:
OBJECTIVE: The study tests the hypothesis that intramodal visual binding is disturbed in schizophrenia and should be detectable in all illness stages as a stable trait marker. METHOD: Three groups of patients (rehospitalized chronic schizophrenic patients, first-admitted schizophrenic patients, and schizotypal patients believed to be suffering from a pre-schizophrenic prodrome) and a group of normal control subjects were tested on three tasks targeting visual 'binding' abilities (the Müller-Lyer illusion and two figure detection tasks), in addition to control parameters such as reaction time, visual selective attention, Raven's test, and two conventional cortical tasks: spatial working memory (SWM) and a global/local test. RESULTS: Chronic patients had a decreased performance on the binding tests. Unexpectedly, the prodromal group exhibited enhanced Gestalt extraction on these tests compared both to schizophrenic patients and to healthy subjects. Furthermore, chronic schizophrenia was associated with poor performance on the cortical tests of SWM and global/local processing and on Raven's test. This association appears to be mediated by, or linked to, the chronicity of the illness. CONCLUSION: The study confirms a variety of neurocognitive deficits in schizophrenia which, however, in this sample seem to be linked to chronicity of illness. Nevertheless, certain aspects of visual processing concerned with Gestalt extraction deserve attention as potential vulnerability or prodrome indicators. The initial hypothesis of the study is rejected.
Abstract:
Impaired visual search is a hallmark of spatial neglect. When searching for a unique feature (e.g., color), neglect patients often show only slight visual field asymmetries. In contrast, when the target is defined by a combination of features (e.g., color and form), they exhibit a severe deficit of contralesional search. This finding suggests a selective impairment of the serial deployment of spatial attention. Here, we examined this deficit with a preview paradigm. Neglect patients searched for a target defined by the conjunction of shape and color, presented together with varying numbers of distracters. The presentation time was varied such that on some trials participants previewed the target together with same-shape/different-color distracters for 300 or 600 ms prior to the appearance of additional different-shape/same-color distracters. On the remaining trials the target and all distracters were shown simultaneously. Healthy participants exhibited a serial search strategy only when all items were presented simultaneously, whereas in both preview conditions a pop-out effect was observed. Neglect patients showed a similar pattern when the target was presented in the right hemifield. In contrast, when searching for a target in the left hemifield they showed serial search in the no-preview condition, as well as with a preview of 300 ms, and partly even at 600 ms. A control experiment suggested that the failure to fully benefit from item preview was probably independent of accurate time perception. Our results, viewed in the context of the existing literature, lead us to conclude that the visual search deficit in neglect reflects two additive factors: a biased representation of attentional priority in favor of ipsilesional information and exaggerated capture of attention by ipsilesional abrupt onsets.
Abstract:
Positioning a robot with respect to objects using data provided by a camera is a well-known technique called visual servoing. In order to perform a task, the object must exhibit visual features that can be extracted from different points of view. Visual servoing is therefore object-dependent, since it relies on the object's appearance, and the positioning task becomes impossible in the presence of non-textured objects or objects for which extracting visual features is too complex or too costly. This paper proposes a solution to this limitation, which is inherent to current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem: a coded light pattern is projected onto the scene, providing robust visual features independently of the object's appearance.
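For context, below is a minimal sketch of the classic image-based visual servoing law this line of work builds on, v = -lambda * L^+ * (s - s*). The point features, depths, and gain are hypothetical; under the paper's approach, the features would come from the projected coded-light pattern rather than the object's own texture.

```python
import numpy as np

def interaction_matrix(x, y, Z):
    # Interaction matrix of a normalized image point (x, y) at depth Z,
    # relating the point's image velocity to the 6-DOF camera twist.
    return np.array([
        [-1.0 / Z,       0.0, x / Z,     x * y, -(1 + x**2),   y],
        [      0.0, -1.0 / Z, y / Z,  1 + y**2,      -x * y,  -x],
    ])

def ibvs_velocity(points, desired, depths, lam=0.5):
    # Classic IBVS control law: stack per-point interaction matrices,
    # then apply v = -lambda * pinv(L) @ (s - s*).
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(points, depths)])
    e = (np.asarray(points) - np.asarray(desired)).ravel()
    return -lam * np.linalg.pinv(L) @ e

# Hypothetical example: drive four pattern points toward desired positions
v = ibvs_velocity(points=[(0.1, 0.1), (-0.1, 0.1), (0.1, -0.1), (-0.1, -0.1)],
                  desired=[(0.12, 0.1), (-0.08, 0.1), (0.12, -0.1), (-0.08, -0.1)],
                  depths=[1.0, 1.0, 1.0, 1.0])
print("Camera twist (vx, vy, vz, wx, wy, wz):", v)
```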
Abstract:
Functional magnetic resonance imaging studies have indicated that efficient feature search (FS) and inefficient conjunction search (CS) activate partially distinct frontoparietal cortical networks. However, it remains a matter of debate whether the differences in these networks reflect differences in early processing during FS and CS. In addition, the relationship between the differences in the networks and spatial shifts of attention also remains unknown. We examined these issues by applying a spatio-temporal analysis method to high-resolution visual event-related potentials (ERPs) and investigated how spatio-temporal activation patterns differ between FS and CS tasks. Within the first 450 msec after stimulus onset, scalp potential distributions (ERP maps) revealed 7 different electric field configurations for each search task. Configuration changes occurred simultaneously in the two tasks, suggesting that contributing processes were not significantly delayed in one task compared to the other. Despite this high spatial and temporal correlation, two ERP maps (120-190 and 250-300 msec) differed between the FS and CS. Lateralized distributions were observed only in the ERP map at 250-300 msec for the FS. This distribution corresponds to that previously described as the N2pc component (a negativity in the time range of the N2 complex over posterior electrodes of the hemisphere contralateral to the target hemifield), which has been associated with the focusing of attention onto potential target items in the search display. Thus, our results indicate that the cortical networks involved in feature and conjunction search differ in part as early as 120 msec after stimulus onset and that the differences between the networks employed during the early stages of FS and CS are not necessarily caused by spatial attention shifts.
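As a rough illustration of how the N2pc described above is conventionally quantified, the sketch below computes a contralateral-minus-ipsilateral difference at posterior electrodes over the 250-300 msec window. The channel names, data layout, and function are assumptions for illustration, not the authors' analysis pipeline.

```python
import numpy as np

def n2pc_amplitude(erp, times, left_ch="PO7", right_ch="PO8",
                   target_side="left"):
    # erp: dict mapping channel name -> 1-D voltage array;
    # times: matching array of latencies in seconds.
    # N2pc = contralateral minus ipsilateral voltage at posterior
    # sites, averaged over the 250-300 msec window reported above.
    win = (times >= 0.250) & (times <= 0.300)
    contra, ipsi = ((right_ch, left_ch) if target_side == "left"
                    else (left_ch, right_ch))
    return (erp[contra][win] - erp[ipsi][win]).mean()
```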
Abstract:
Empirical investigations have to date treated consciousness as an all-or-none phenomenon. However, a recent theoretical account broadens this perspective by proposing a partial level (between nil and full) of conscious perception. In the well-studied case of single-word reading, short-lived exposure can trigger incomplete word-form recognition, wherein letters fall short of forming a whole word in one's conscious perception, thereby hindering word-meaning access and report. Hence, the processing from incomplete to complete word-form recognition straightforwardly mirrors a transition from partial to full-blown consciousness. We therefore hypothesized that this putative functional bottleneck to consciousness (i.e. the perceptual boundary between partial and full conscious perception) would emerge at a major hub region for word-form recognition during reading, namely the left occipito-temporal junction. We applied a real-time staircase procedure and titrated subjective reports at the threshold between partial (letters) and full (whole word) conscious perception. This experimental approach allowed us to collect trials with identical physical stimulation, yet reflecting distinct levels of perceptual experience. Oscillatory brain activity was monitored with magnetoencephalography and revealed that the transition from partial to full word-form perception was accompanied by alpha-band (7-11 Hz) power suppression in the posterior left occipito-temporal cortex. This modulation of rhythmic activity extended anteriorly towards the visual word form area (VWFA), a region whose selectivity for word-forms in perception is highly debated. The current findings provide electrophysiological evidence for a functional bottleneck to consciousness, thereby empirically instantiating a recently proposed partial perspective on consciousness. Moreover, the findings provide an entirely new outlook on the functioning of the VWFA as a late bottleneck to full-blown conscious word-form perception.
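For illustration, here is a minimal sensor-level sketch of the kind of alpha-band (7-11 Hz) power estimate reported above, using a Welch power spectral density. The study's actual source-space MEG analysis is more involved; this function is only a stand-in under assumed inputs.

```python
import numpy as np
from scipy.signal import welch

def alpha_power(signal, fs, band=(7.0, 11.0)):
    # Welch PSD of a single sensor time series, then power summed
    # over the alpha band reported in the study (7-11 Hz).
    f, psd = welch(signal, fs=fs, nperseg=int(2 * fs))
    mask = (f >= band[0]) & (f <= band[1])
    return psd[mask].sum() * (f[1] - f[0])  # approximate band power

# Hypothetical usage: one 2-second epoch sampled at 1000 Hz
epoch = np.random.randn(2000)
print("Alpha-band power:", alpha_power(epoch, fs=1000.0))
```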
Abstract:
Localization, the ability of a mobile robot to estimate its position within its environment, is a key capability for autonomous operation of any mobile robot. This thesis presents a system for coarse, global indoor localization of a mobile robot based on visual information. The system is based on image matching and uses SIFT features as natural landmarks. Features extracted from training images are stored in a database for later use in localization. During localization, an image of the scene is captured using the robot's on-board camera, features are extracted from the image, and the best match is searched for in the database. Feature matching is done using the k-d tree algorithm. Experimental results showed that localization accuracy increases with the number of features in the training database, while, on the other hand, an increasing number of features tended to have a negative impact on computational time. For some parts of the environment the error rate was relatively high, due to a strong correlation between features taken from those places and features found elsewhere in the environment.
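A minimal sketch of the SIFT-plus-k-d-tree matching pipeline the thesis describes, using OpenCV's FLANN-based matcher. The filenames and parameter values (number of trees, checks, ratio threshold) are illustrative assumptions, not the thesis's settings.

```python
import cv2

sift = cv2.SIFT_create()

# Offline: extract SIFT landmarks from a training image of the environment
# and keep the descriptors as the database ("training.png" is hypothetical)
train_img = cv2.imread("training.png", cv2.IMREAD_GRAYSCALE)
kp_db, desc_db = sift.detectAndCompute(train_img, None)

# FLANN matcher backed by k-d trees (algorithm 1 = KDTREE index)
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))

# Online: match features from the robot's current camera view
query_img = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)
kp_q, desc_q = sift.detectAndCompute(query_img, None)
matches = flann.knnMatch(desc_q, desc_db, k=2)

# Lowe's ratio test keeps only distinctive matches; the training image
# with the most surviving matches would give the coarse location estimate
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
print(f"{len(good)} good matches against the database")
```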
Abstract:
Single-trial encounters with multisensory stimuli affect both memory performance and early-latency brain responses to visual stimuli. Whether and how auditory cortices support memory processes based on single-trial multisensory learning is unknown and may differ qualitatively and quantitatively from comparable processes within visual cortices, due to purported differences in memory capacities across the senses. We recorded event-related potentials (ERPs) as healthy adults (n = 18) performed a continuous recognition task in the auditory modality, discriminating initial (new) from repeated (old) sounds of environmental objects. Initial presentations were either unisensory or multisensory; the latter entailed synchronous presentation of a semantically congruent or a meaningless image. Repeated presentations were exclusively auditory, thus differing only in the context in which the sound was initially encountered. Discrimination abilities (indexed by d') were increased for repeated sounds that were initially encountered with a semantically congruent image versus sounds initially encountered with either a meaningless or no image. Analyses of ERPs within an electrical neuroimaging framework revealed that early stages of auditory processing of repeated sounds were affected by prior single-trial multisensory contexts. These effects followed from significantly reduced activity within a distributed network, including the right superior temporal cortex, suggesting an inverse relationship between brain activity and behavioural outcome on this task. The present findings demonstrate how auditory cortices contribute to long-term effects of multisensory experiences on auditory object discrimination. We propose a new framework for how multisensory processes shape both ongoing multisensory stimulus processing and unisensory discrimination abilities later in time.
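For reference, the d' sensitivity index used above to quantify discrimination can be computed as follows; the response counts and the log-linear correction are illustrative, not taken from the study.

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    # Sensitivity index d' = Z(hit rate) - Z(false-alarm rate).
    # The log-linear correction keeps rates away from exactly 0 or 1.
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hr) - norm.ppf(far)

print(d_prime(40, 10, 12, 38))  # hypothetical response counts
```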
Abstract:
Increasing evidence suggests that working memory and perceptual processes are dynamically interrelated due to modulating activity in overlapping brain networks. However, the direct influence of working memory on the spatio-temporal brain dynamics of behaviorally relevant intervening information remains unclear. To investigate this issue, subjects performed a visual proximity-grid perception task under three different visual-spatial working memory (VSWM) load conditions. VSWM load was manipulated by asking subjects to memorize the spatial locations of 6 or 3 disks. The grid was always presented between the encoding and recognition of the disk pattern. As a baseline condition, grid stimuli were presented without a VSWM context. VSWM load altered both perceptual performance and the neural networks active during intervening grid encoding. Participants performed faster and more accurately on a challenging perceptual task under high VSWM load compared to the low-load and baseline conditions. Visual evoked potential (VEP) analyses identified changes in the configuration of the underlying sources in one particular period, 160-190 ms post-stimulus onset. Source analyses further showed an occipito-parietal down-regulation concurrent with the increased involvement of temporal and frontal resources in the high-VSWM context. Together, these data suggest that cognitive control mechanisms supporting working memory may selectively enhance concurrent visual processing related to an independent goal. More broadly, our findings are in line with theoretical models implicating the engagement of frontal regions in synchronizing and optimizing mnemonic and perceptual resources towards multiple goals.
Abstract:
Multisensory memory traces established via single-trial exposures can impact subsequent visual object recognition. This impact appears to depend on the meaningfulness of the initial multisensory pairing, implying that multisensory exposures establish distinct object representations that are accessible during later unisensory processing. Multisensory contexts may be particularly effective in influencing auditory discrimination, given the purportedly inferior recognition memory in this sensory modality. The possibility of this generalization, and the equivalence of effects when memory discrimination was performed in the visual vs. the auditory modality, were the focus of this study. First, we demonstrate that visual object discrimination is affected by the context of prior multisensory encounters, replicating and extending previous findings by controlling for the probability of multisensory contexts during initial as well as repeated object presentations. Second, we provide the first evidence that single-trial multisensory memories impact subsequent auditory object discrimination. Auditory object discrimination was enhanced when initial presentations entailed semantically congruent multisensory pairs and was impaired after semantically incongruent multisensory encounters, compared to sounds that had been encountered only in a unisensory manner. Third, the impact of single-trial multisensory memories upon unisensory object discrimination was greater when the task was performed in the auditory vs. the visual modality. Fourth, there was no evidence of correlation between the effects of past multisensory experiences on visual and auditory processing, suggestive of largely independent object processing mechanisms between modalities. We discuss these findings in terms of the conceptual short-term memory (CSTM) model and predictive coding. Our results suggest differential recruitment and modulation of conceptual memory networks according to the sensory task at hand.