8 results for Visual Object Recognition

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance: 100.00%

Abstract:

Introduction and aims of the research. Nitric oxide (NO) and endocannabinoids (eCBs) are major retrograde messengers involved in synaptic plasticity (long-term potentiation, LTP, and long-term depression, LTD) in many brain areas (including hippocampus and neocortex), as well as in learning and memory processes. NO is synthesized by NO synthase (NOS) in response to increased cytosolic Ca2+ and mainly exerts its functions through soluble guanylate cyclase (sGC) and cGMP production. The main target of cGMP is the cGMP-dependent protein kinase (PKG). Activity-dependent release of eCBs in the CNS leads to the activation of the Gαi/o-coupled cannabinoid receptor 1 (CB1) at both glutamatergic and inhibitory synapses. The perirhinal cortex (Prh) is a multimodal associative cortex of the temporal lobe, critically involved in visual recognition memory, and LTD is proposed to be the cellular correlate underlying this form of memory. Cholinergic neurotransmission has been shown to play a critical role in both visual recognition memory and LTD in the Prh. Moreover, visual recognition memory is one of the main cognitive functions impaired in the early stages of Alzheimer’s disease. The main aim of my research was to investigate the role of NO and eCBs in synaptic plasticity in rat Prh and in visual recognition memory. Part of this research was dedicated to the study of synaptic transmission and plasticity in a murine model (Tg2576) of Alzheimer’s disease.

Methods. Field potential recordings: extracellular field potential recordings were carried out in horizontal Prh slices from Sprague-Dawley or Dark Agouti juvenile (P21-35) rats. LTD was induced with a single train of 3000 pulses delivered at 5 Hz (10 min), or via bath application of carbachol (CCh; 50 μM) for 10 min. LTP was induced by theta-burst stimulation (TBS). In addition, input/output curves and 5Hz-LTD experiments were carried out in Prh slices from 3-month-old Tg2576 mice and littermate controls. Behavioural experiments: the spontaneous novel object exploration task was performed in adult Dark Agouti rats bilaterally cannulated in the Prh. Drugs or vehicle (saline) were infused directly into the Prh 15 min before training to verify the role of nNOS and CB1 in visual recognition memory acquisition. Object recognition memory was tested 20 min and 24 h after the end of the training phase.

Results. Electrophysiological experiments in Prh slices from juvenile rats showed that 5Hz-LTD is due to the activation of the NOS/sGC/PKG pathway, whereas CCh-LTD relies on NOS/sGC but not PKG activation. By contrast, NO does not appear to be involved in LTP in this preparation. Furthermore, I found that eCBs are involved in LTP induction, but not in basal synaptic transmission, 5Hz-LTD or CCh-LTD. Behavioural experiments demonstrated that blockade of nNOS impairs rat visual recognition memory tested at 24 h, but not at 20 min; by contrast, blockade of CB1 did not affect visual recognition memory acquisition at either time point. In 3-month-old Tg2576 mice, deficits in basal synaptic transmission and 5Hz-LTD were observed compared to littermate controls.

Conclusions. The results obtained in Prh slices from juvenile rats indicate that NO and CB1 play a role in the induction of LTD and LTP, respectively. These results are confirmed by the observation that nNOS, but not CB1, is involved in visual recognition memory acquisition. The preliminary results obtained in the murine model of Alzheimer’s disease indicate that deficits in synaptic transmission and plasticity occur very early in the Prh; further investigations are required to characterize the molecular mechanisms underlying these deficits.

Relevance: 100.00%

Abstract:

Visual correspondence is a key computer vision task that aims at identifying projections of the same 3D point into images taken either from different viewpoints or at different time instants. This task has been the subject of intense research activity in recent years in scenarios such as object recognition, motion detection, stereo vision, pattern matching and image registration. The approaches proposed in the literature typically aim at improving the state of the art by increasing the reliability, the accuracy or the computational efficiency of visual correspondence algorithms. The research work carried out during the Ph.D. course and presented in this dissertation deals with three specific visual correspondence problems: fast pattern matching, stereo correspondence and robust image matching. The dissertation presents original contributions to the theory of visual correspondence, as well as applications dealing with 3D reconstruction and multi-view video surveillance.
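As a generic illustration of the kind of similarity score on which fast pattern matching and stereo correspondence build (zero-mean normalized cross-correlation; this is not code from the dissertation, and it is deliberately unoptimized):

import numpy as np

def ncc_match(image, template):
    """Exhaustive template matching by zero-mean normalized
    cross-correlation (ZNCC). Returns the top-left corner of the
    best-matching window and its score in [-1, 1]. Pass float arrays."""
    H, W = image.shape
    h, w = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())
    best_score, best_pos = -1.0, (0, 0)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            win = image[y:y + h, x:x + w]
            wz = win - win.mean()
            denom = np.sqrt((wz * wz).sum()) * t_norm
            if denom == 0:
                continue  # flat window: correlation undefined
            score = float((wz * t).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score

img = np.random.rand(64, 64)
tpl = img[20:30, 12:24].copy()
print(ncc_match(img, tpl))  # -> ((20, 12), 1.0) at the exact match

Exhaustive search of this kind is exactly what fast pattern matching techniques accelerate, for instance by bounding the achievable score of a candidate window and pruning it early.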

Relevance: 90.00%

Abstract:

The research activity carried out during the PhD course focused on the development of mathematical models of some cognitive processes and on their validation by means of data available in the literature, with a double aim: i) to achieve a better interpretation and explanation of the great amount of data obtained on these processes with different methodologies (electrophysiological recordings in animals; neuropsychological, psychophysical and neuroimaging studies in humans); ii) to exploit model predictions and results to guide future research and experiments. In particular, the research activity focused on two different projects: 1) the development of networks of neural oscillators, in order to investigate the mechanisms of synchronization of neural oscillatory activity during cognitive processes such as object recognition, memory, language and attention; 2) the mathematical modelling of multisensory integration processes (e.g. visual-acoustic), which occur in several cortical and subcortical regions (in particular in a subcortical structure named the Superior Colliculus, SC) and which are fundamental for orienting motor and attentive responses to external-world stimuli. This activity was realized in collaboration with the Center for Studies and Researches in Cognitive Neuroscience of the University of Bologna (in Cesena) and the Department of Neurobiology and Anatomy of the Wake Forest University School of Medicine (NC, USA).

PART 1. The representation of objects in a number of cognitive functions, like perception and recognition, involves distributed processes in different cortical areas. One of the main neurophysiological questions concerns how the correlation between these disparate areas is realized, so that the features of the same object are grouped together (binding problem) while the properties belonging to different, simultaneously present objects are kept segregated (segmentation problem). Different theories have been proposed to address these questions (Barlow, 1972). One of the most influential is the so-called “assembly coding” theory, postulated by Singer (2003), according to which 1) an object is well described by a few fundamental properties, processed in different and distributed cortical areas; 2) recognition of the object is realized by means of the simultaneous activation of the cortical areas representing its different features; 3) groups of properties belonging to different objects are kept separated in the time domain. In Chapter 1.1 and Chapter 1.2 we present two neural network models for object recognition based on the “assembly coding” hypothesis. These models are networks of Wilson-Cowan oscillators which exploit: i) two high-level “Gestalt rules” (the similarity and previous-knowledge rules) to realize the functional link between elements of different cortical areas representing properties of the same object (binding problem); ii) the synchronization of neural oscillatory activity in the γ-band (30-100 Hz) to segregate in time the representations of different objects simultaneously present (segmentation problem). These models are able to recognize and reconstruct multiple simultaneous external objects, even in difficult cases (some wrong or missing features, shared features, superimposed noise). In Chapter 1.3 the previous models are extended to realize a semantic memory, in which sensory-motor representations of objects are linked with words. To this aim, the previously developed network, devoted to the representation of objects as collections of sensory-motor features, is reciprocally linked with a second network devoted to the representation of words (lexical network). Synapses linking the two networks are trained via a time-dependent Hebbian rule during a training period in which individual objects are presented together with the corresponding words. Simulation results demonstrate that, during the retrieval phase, the network can deal with the simultaneous presence of objects (from sensory-motor inputs) and words (from linguistic inputs), can correctly associate objects with words, and can segment objects even in the presence of incomplete information. Moreover, the network can realize some semantic links among words representing objects with shared features. These results support the idea that semantic memory can be described as an integrated process, whose content is retrieved by the co-activation of different multimodal regions. In perspective, extended versions of this model may be used to test conceptual theories, and to provide a quantitative assessment of existing data (for instance concerning patients with neural deficits).

PART 2. The ability of the brain to integrate information from different sensory channels is fundamental to the perception of the external world (Stein et al., 1993). It is well documented that a number of extraprimary areas have neurons capable of such a task; one of the best known of these is the superior colliculus (SC). This midbrain structure receives auditory, visual and somatosensory inputs from different subcortical and cortical areas, and is involved in the control of orientation to external events (Wallace et al., 1993). SC neurons respond to each of these sensory inputs separately, but are also capable of integrating them (Stein et al., 1993), so that the response to combined multisensory stimuli is greater than the response to the individual component stimuli (enhancement). This enhancement is proportionately greater when the modality-specific paired stimuli are weaker (the principle of inverse effectiveness). Several studies have shown that the capability of SC neurons to engage in multisensory integration requires inputs from cortex, primarily the anterior ectosylvian sulcus (AES) but also the rostral lateral suprasylvian sulcus (rLS). If these cortical inputs are deactivated, the response of SC neurons to cross-modal stimulation is no different from that evoked by the most effective of its individual component stimuli (Jiang et al., 2001). This phenomenon can be better understood through mathematical models, which can place the mass of data accumulated about this phenomenon and its underlying circuitry into a coherent theoretical structure. In Chapter 2.1 a simple neural network model of this structure is presented; the model is able to reproduce a large number of SC behaviours, such as multisensory enhancement, multisensory and unisensory depression, and inverse effectiveness. In Chapter 2.2 this model is improved by incorporating more neurophysiological knowledge about the neural circuitry underlying SC multisensory integration, in order to suggest possible physiological mechanisms through which it is effected. This endeavour was realized in collaboration with Professor B.E. Stein and Doctor B. Rowland during the six-month period spent at the Department of Neurobiology and Anatomy of the Wake Forest University School of Medicine (NC, USA), within the Marco Polo Project. The model includes four distinct unisensory areas devoted to a topological representation of external stimuli. Two of them represent subregions of the AES (FAES, an auditory area, and AEV, a visual area) and send descending inputs to the ipsilateral SC; the other two represent subcortical areas (one auditory and one visual) projecting ascending inputs to the same SC. Different competitive mechanisms, realized by means of populations of interneurons, are used in the model to reproduce the different behaviour of SC neurons under cortical activation and deactivation. The model, with a single set of parameters, is able to mimic the behaviour of SC multisensory neurons in response to very different stimulus conditions (multisensory enhancement, inverse effectiveness, within- and cross-modal suppression of spatially disparate stimuli), with cortex functional or deactivated, and with a particular type of membrane receptor (NMDA receptors) active or inhibited. All these results agree with the data reported in Jiang et al. (2001) and in Binns and Salt (1996). The model suggests that non-linearities in neural responses and in synaptic (excitatory and inhibitory) connections can explain the fundamental aspects of multisensory integration, and provides a biologically plausible hypothesis about the underlying circuitry.
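As a concrete aside on the quantities involved (not taken from the dissertation): multisensory enhancement is commonly quantified as the percentage gain of the combined response over the best unisensory response, and inverse effectiveness then shows up directly in the numbers, as in this minimal Python sketch with illustrative firing rates:

def enhancement_index(combined, best_unisensory):
    """Percentage multisensory enhancement: gain of the combined
    (e.g. audio-visual) response over the best unisensory response."""
    return 100.0 * (combined - best_unisensory) / best_unisensory

# Inverse effectiveness: weaker unisensory responses yield
# proportionately larger enhancement (numbers are illustrative).
print(enhancement_index(30.0, 10.0))    # weak stimuli   -> 200.0 %
print(enhancement_index(110.0, 100.0))  # strong stimuli -> 10.0 %

Deactivating the cortical inputs, as described above, corresponds to the combined response collapsing onto the best unisensory one, i.e. an index near zero.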

Relevance: 80.00%

Abstract:

The term Ambient Intelligence (AmI) refers to a vision of the future of the information society where smart electronic environments are sensitive and responsive to the presence of people and their activities (context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday activities, tasks and rituals in an easy, natural way, using information and intelligence hidden in the network connecting these devices. This promotes the creation of pervasive environments that improve the quality of life of the occupants and enhance the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require excellent cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms.

Since a large number of fixed and mobile embedded sensors is deployed into the environment, Wireless Sensor Networks (WSNs) are one of the most relevant enabling technologies for AmI. WSNs are complex systems made up of a number of sensor nodes, which can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low-power computational unit (microcontrollers, FPGAs, etc.), a wireless communication unit, one or more sensors and some form of energy supply (either batteries or energy-scavenging modules). WSNs promise to revolutionize the interactions between the real physical world and human beings. Low cost, low computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challenges must be addressed. Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements, which enables them to reduce the interference on the sensed physical phenomena and allows easy, low-cost deployment. They have limited processing speed, storage capacity and communication bandwidth, which must be used efficiently to increase the degree of local “understanding” of the observed phenomena.

A particular case of sensor nodes are video sensors. This topic holds strong interest for a wide range of contexts such as military, security, robotics and, most recently, consumer applications. Vision sensors are extremely effective for medium- to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, which must be heavily processed before transmission due to the scarce bandwidth of radio interfaces. In particular, in video surveillance it has been shown that source-side compression is mandatory due to limited bandwidth and delay constraints. Moreover, there is ample opportunity for performing higher-level processing functions, such as object recognition, which have the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something “interesting” is detected). The energy cost of image processing must, however, be carefully minimized. Imaging can thus play an important role in sensing devices for ambient intelligence: computer vision can, for instance, be used for recognising persons and objects and for recognising behaviour such as illness or rioting. Having a wireless camera as a camera mote opens the way for distributed scene analysis: multiple cameras see more than a single one, and a camera system that can observe a scene from multiple directions is able to overcome occlusion problems and can describe objects in their true 3D appearance. Real-time realizations of these approaches are a recently opened field of research.

In this thesis we pay attention to the realities of the hardware/software technologies and of the design needed to realize systems for distributed monitoring, attempting to propose solutions to open issues and to fill the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics, outlined below. Although the design of a sensor network and of its sensor nodes is strictly application-dependent, a number of constraints should almost always be considered. Among them:
• Small form factor, to reduce node intrusiveness.
• Low power consumption, to reduce battery size and to extend node lifetime.
• Low cost, to enable widespread diffusion.
These limitations typically result in the adoption of low-power, low-cost devices such as low-power microcontrollers with a few kilobytes of RAM and tens of kilobytes of program memory, on which only simple data processing algorithms can be implemented. However, the overall computational power of the WSN can be very large, since the network presents a high degree of parallelism that can be exploited through the adoption of ad-hoc techniques. Furthermore, through the fusion of information from the dense mesh of sensors, even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas: Low Power Video Sensor Nodes and Video Processing Algorithms, and Multimodal Surveillance.

Low Power Video Sensor Nodes and Video Processing Algorithms. In comparison to scalar sensors, such as temperature, pressure, humidity, velocity and acceleration sensors, vision sensors generate much higher-bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSN limits for video sensor nodes. We have designed and developed wireless video sensor nodes focusing on small size and on flexibility of reuse in different applications. The video nodes target a different design point: portability (on-board power supply, wireless communication) and a scanty power budget (500 mW), while still providing a prominent level of intelligence, namely sophisticated classification algorithms and a high level of reconfigurability. We developed two different video sensor nodes: the device architecture of the first is based on a low-cost, low-power FPGA+microcontroller system-on-chip; the second is based on an ARM9 processor. Both systems, designed within the above-mentioned power envelope, can operate continuously with a Li-Polymer battery pack and a solar panel. Novel low-power, low-cost video sensor nodes are presented which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally. Featuring such intelligence, these nodes would be able to cope with tasks such as the recognition of unattended bags in airports or of persons carrying potentially dangerous objects, which normally require a human operator. Vision algorithms for object detection and acquisition, such as human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection, are implemented, described and illustrated on real-world data.

Multimodal Surveillance. In several setups the use of wired video cameras may not be possible; for this reason, building an energy-efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy-level-dependent trigger to the video camera and the wireless module. Such an approach has been shown to extend node lifetime and can enable continuous operation of the node. Being low-cost, passive (thus low-power) and of limited form factor, PIR sensors are well suited to WSN applications. Moreover, aggressive power management policies are essential for achieving long-term operation of standalone distributed cameras. We have used an adaptive controller based on Model Predictive Control (MPC) to improve system performance, outperforming naive power management policies.
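A back-of-the-envelope sketch of why PIR-triggered duty cycling pays off (the 500 mW active budget comes from the node design above; the sleep power, duty cycle and battery capacity are hypothetical illustration values):

def avg_power_mw(p_active_mw, p_sleep_mw, duty_cycle):
    """Average power draw of a duty-cycled node: camera and radio run
    at p_active_mw for a fraction `duty_cycle` of the time (e.g. only
    on PIR triggers) and sleep at p_sleep_mw otherwise."""
    return duty_cycle * p_active_mw + (1 - duty_cycle) * p_sleep_mw

def lifetime_hours(battery_mwh, p_avg_mw):
    """Idealized battery lifetime, ignoring solar intake and losses."""
    return battery_mwh / p_avg_mw

p_avg = avg_power_mw(p_active_mw=500.0, p_sleep_mw=5.0, duty_cycle=0.05)
print(p_avg)                              # 29.75 mW average
print(lifetime_hours(2000 * 3.7, p_avg))  # ~249 h on a 2000 mAh / 3.7 V pack

Model Predictive Control enters when the duty cycle itself is chosen on-line, e.g. to match the battery's state of charge and the predicted solar intake.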

Relevance: 30.00%

Abstract:

This thesis deals with Visual Servoing and the disciplines strictly connected to it, such as projective geometry, image processing, robotics and non-linear control. More specifically, the work addresses the problem of controlling a robotic manipulator through one of the most widely used Visual Servoing techniques: Image Based Visual Servoing (IBVS). In Image Based Visual Servoing the robot is driven by an on-line feedback control loop that is closed directly in the 2D space of the camera sensor. The work considers the case of a monocular system with a single camera mounted on the robot end-effector (eye-in-hand configuration). Through IBVS the system can be positioned with respect to a fixed 3D target by minimizing the difference between its initial view and its goal view, corresponding respectively to the initial and goal system configurations: the robot's Cartesian motion is thus generated only by means of visual information. However, the execution of a positioning control task by IBVS is not straightforward, because singularity problems may occur and local minima may be reached where the current image is very close to the target one but the 3D positioning task is far from being fulfilled; this happens in particular for large camera displacements, when the initial and goal target views are noticeably different. To overcome singularity and local-minima drawbacks, while maintaining the good robustness of IBVS with respect to modeling and camera calibration errors, suitable image path planning can be exploited. This work deals with the problem of generating suitable image-plane trajectories for the tracked points of the servoing control scheme (a trajectory being a path plus a time law). The generated image-plane paths must be feasible, i.e. they must be compliant with the rigid-body motion of the camera with respect to the object, so as to avoid image Jacobian singularities and local-minima problems. In addition, the planned image trajectories must generate camera velocity screws which are smooth and within the allowed bounds of the robot. We show that a scaled 3D motion planning algorithm can be devised in order to generate feasible image-plane trajectories. Since the paths in the image are generated off-line, it is also possible to tune the planning parameters so as to keep the target inside the camera field of view, even in unfortunate cases in which the target feature points would otherwise leave the camera image due to the 3D robot motion. To test the validity of the proposed approach, both experimental and simulation results are reported, also taking into account the influence of noise on the path planning strategy. The experiments were realized with a 6-DOF anthropomorphic manipulator with a FireWire camera installed on its end-effector: the results demonstrate the good performance and the feasibility of the proposed approach.
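For reference, the textbook IBVS law that the above builds on (a minimal numpy sketch of the classic formulation for point features, not the planning scheme proposed in the thesis; function names and the gain value are illustrative):

import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction (image Jacobian) matrix of one point feature with
    normalized image coordinates (x, y) and depth Z, relating the
    feature velocity to the 6-DOF camera velocity screw (v, w)."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_control(features, goal_features, depths, gain=0.5):
    """Classic IBVS law v_c = -lambda * L^+ (s - s*): stack the
    interaction matrices of all tracked points and feed the image
    error back through the pseudo-inverse."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    e = (np.asarray(features) - np.asarray(goal_features)).ravel()
    return -gain * np.linalg.pinv(L) @ e  # 6-vector camera velocity screw

With at least three non-collinear points the stacked matrix is generically full rank; the singular and local-minimum configurations mentioned above are exactly where this simple law breaks down, which is what motivates the image path planning developed in the thesis.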

Relevance: 30.00%

Abstract:

The ability to integrate into a unified percept sensory inputs that derive from different sensory modalities but relate to the same external event is called multisensory integration, and it might represent an efficient mechanism of sensory compensation when a sensory modality is damaged by a cortical lesion. This hypothesis is discussed in the present dissertation. Experiment 1 explored the role of the superior colliculus (SC) in multisensory integration by testing patients with collicular lesions, patients with subcortical lesions not involving the SC, and healthy control subjects in a multisensory task. The results revealed that patients with collicular lesions, paralleling the evidence of animal studies, demonstrated a loss of multisensory enhancement, in contrast with control subjects, providing the first lesion evidence in humans of the essential role of the SC in mediating audio-visual integration. Experiment 2 investigated the role of cortex in mediating multisensory integrative effects, inducing virtual lesions by inhibitory theta-burst stimulation over temporo-parietal cortex, occipital cortex and posterior parietal cortex, and demonstrating that only temporo-parietal cortex is causally involved in modulating the integration of audio-visual stimuli at the same spatial location. Given the involvement of the retino-colliculo-extrastriate pathway in mediating audio-visual integration, the functional sparing of this circuit in hemianopic patients is extremely relevant from the perspective of a multisensory-based approach to the recovery of unisensory deficits. Experiment 3 demonstrated the spared functional activity of this circuit in a group of hemianopic patients, revealing implicit recognition of the fearful content of unseen visual stimuli (i.e. affective blindsight), an ability mediated by the retino-colliculo-extrastriate pathway and its connections with the amygdala. Finally, Experiment 4 provided evidence that systematic audio-visual stimulation is effective in inducing long-lasting clinical improvements in patients with visual field defects, and revealed that the activity of the spared retino-colliculo-extrastriate pathway is responsible for the observed clinical amelioration, as suggested by the greater improvement, found in tasks highly demanding in terms of spatial orienting, in patients with cortical lesions limited to the occipital cortex compared to patients with lesions extending to other cortical areas. Overall, the present results indicate that multisensory integration is mediated by the retino-colliculo-extrastriate pathway and that systematic audio-visual stimulation, activating this spared neural circuit, is able to affect orientation towards the blind field in hemianopic patients and might therefore constitute an effective and innovative approach for the rehabilitation of unisensory visual impairments.

Relevance: 30.00%

Abstract:

Visual tracking is the problem of estimating, given a video sequence depicting a target, some variables related to that target. Visual tracking is key to the automation of many tasks, such as visual surveillance, autonomous robot or vehicle navigation, and automatic video indexing in multimedia databases. Despite many years of research, long-term tracking of generic targets in real-world scenarios is still an open problem. The main contribution of this thesis is the definition of effective algorithms that can foster a general solution to visual tracking by letting the tracker adapt to changing working conditions. In particular, we propose to adapt two crucial components of visual trackers: the transition model and the appearance model. The less general but widespread case of tracking from a static camera is also considered, and a novel change detection algorithm robust to sudden illumination changes is proposed. Based on this, a principled adaptive framework to model the interaction between Bayesian change detection and recursive Bayesian trackers is introduced. Finally, the problem of automatic tracker initialization is considered; in particular, a novel solution for the categorization of 3D data is presented. The novel category recognition algorithm is based on a novel 3D descriptor that is shown to achieve state-of-the-art performance in several surface matching applications.
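To make the “transition model” component concrete, here is a minimal numpy sketch of a recursive Bayesian tracker in its simplest linear-Gaussian form (a Kalman filter with a constant-velocity transition model, a textbook baseline rather than the adaptive scheme of the thesis; all names and parameter values are illustrative):

import numpy as np

def kalman_predict(x, P, F, Q):
    """Prediction step: the transition model (F, Q) propagates the
    state estimate; this is the component an adaptive tracker tunes."""
    return F @ x, F @ P @ F.T + Q

def kalman_update(x, P, z, H, R):
    """Correction step with a measurement z (e.g. a detected position)."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Constant-velocity transition model for a 2D target, state (px, py, vx, vy).
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # only position is measured
Q = 0.01 * np.eye(4)   # process noise: how much the motion model is trusted
R = 1.0 * np.eye(2)    # measurement noise: how much detections are trusted

x, P = np.zeros(4), np.eye(4)
for z in [np.array([1.0, 0.5]), np.array([2.1, 0.9])]:  # toy detections
    x, P = kalman_predict(x, P, F, Q)
    x, P = kalman_update(x, P, z, H, R)

Adapting the transition model then amounts to re-estimating F and Q on-line (or replacing the linear-Gaussian machinery altogether, e.g. with a particle filter) as the target's motion changes.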

Relevance: 30.00%

Abstract:

Lesions to the primary geniculo-striate visual pathway cause blindness in the contralesional visual field. Nevertheless, previous studies have suggested that patients with visual field defects may still be able to implicitly process the affective valence of unseen emotional stimuli (affective blindsight) through alternative visual pathways bypassing the striate cortex. These alternative pathways may also allow the exploitation of multisensory (audio-visual) integration mechanisms, such that auditory stimulation can enhance visual detection of stimuli which would otherwise go undetected when presented alone (crossmodal blindsight). The present dissertation investigated implicit emotional processing and multisensory integration when conscious visual processing is prevented by real or virtual lesions to the geniculo-striate pathway, in order to further clarify both the nature of these residual processes and the functional aspects of the underlying neural pathways. The present experimental evidence demonstrates that alternative subcortical visual pathways allow implicit processing of the emotional content of facial expressions in the absence of cortical processing. However, this residual ability is limited to fearful expressions. This finding suggests the existence of a subcortical system specialised in detecting danger signals on the basis of coarse visual cues, thus allowing the early recruitment of fight-or-flight behavioural responses even before conscious and detailed recognition of potential threats can take place. Moreover, the present dissertation extends the knowledge about crossmodal blindsight phenomena by showing that, unlike visual detection, visual orientation discrimination cannot be crossmodally enhanced by sound in the absence of a functional striate cortex. This finding demonstrates, on the one hand, that the striate cortex plays a causal role in the crossmodal enhancement of visual orientation sensitivity and, on the other hand, that subcortical visual pathways bypassing the striate cortex, despite affording audio-visual integration processes that improve simple visual abilities such as detection, cannot mediate multisensory enhancement of more complex visual functions, such as orientation discrimination.