Biblioteca Digital

12 resultados para audio-visual automatic speech recognition

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna

Neural basis of visual deficits recovery: visual residual functions and multisensory integration.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The ability of integrating into a unified percept sensory inputs deriving from different sensory modalities, but related to the same external event, is called multisensory integration and might represent an efficient mechanism of sensory compensation when a sensory modality is damaged by a cortical lesion. This hypothesis has been discussed in the present dissertation. Experiment 1 explored the role of superior colliculus (SC) in multisensory integration, testing patients with collicular lesions, patients with subcortical lesions not involving the SC and healthy control subjects in a multisensory task. The results revealed that patients with collicular lesions, paralleling the evidence of animal studies, demonstrated a loss of multisensory enhancement, in contrast with control subjects, providing the first lesional evidence in humans of the essential role of SC in mediating audio-visual integration. Experiment 2 investigated the role of cortex in mediating multisensory integrative effects, inducing virtual lesions by inhibitory theta-burst stimulation on temporo-parietal cortex, occipital cortex and posterior parietal cortex, demonstrating that only temporo-parietal cortex was causally involved in modulating the integration of audio-visual stimuli at the same spatial location. Given the involvement of the retino-colliculo-extrastriate pathway in mediating audio-visual integration, the functional sparing of this circuit in hemianopic patients is extremely relevant in the perspective of a multisensory-based approach to the recovery of unisensory defects. Experiment 3 demonstrated the spared functional activity of this circuit in a group of hemianopic patients, revealing the presence of implicit recognition of the fearful content of unseen visual stimuli (i.e. affective blindsight), an ability mediated by the retino-colliculo-extrastriate pathway and its connections with amygdala. Finally, Experiment 4 provided evidence that a systematic audio-visual stimulation is effective in inducing long-lasting clinical improvements in patients with visual field defect and revealed that the activity of the spared retino-colliculo-extrastriate pathway is responsible of the observed clinical amelioration, as suggested by the greater improvement observed in patients with cortical lesions limited to the occipital cortex, compared to patients with lesions extending to other cortical areas, found in tasks high demanding in terms of spatial orienting. Overall, the present results indicated that multisensory integration is mediated by the retino-colliculo-extrastriate pathway and that a systematic audio-visual stimulation, activating this spared neural circuit, is able to affect orientation towards the blind field in hemianopic patients and, therefore, might constitute an effective and innovative approach for the rehabilitation of unisensory visual impairments.

Veja mais

Residual visual processing following real or virtual lesions to primary visual pathways

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Lesions to the primary geniculo-striate visual pathway cause blindness in the contralesional visual field. Nevertheless, previous studies have suggested that patients with visual field defects may still be able to implicitly process the affective valence of unseen emotional stimuli (affective blindsight) through alternative visual pathways bypassing the striate cortex. These alternative pathways may also allow exploitation of multisensory (audio-visual) integration mechanisms, such that auditory stimulation can enhance visual detection of stimuli which would otherwise be undetected when presented alone (crossmodal blindsight). The present dissertation investigated implicit emotional processing and multisensory integration when conscious visual processing is prevented by real or virtual lesions to the geniculo-striate pathway, in order to further clarify both the nature of these residual processes and the functional aspects of the underlying neural pathways. The present experimental evidence demonstrates that alternative subcortical visual pathways allow implicit processing of the emotional content of facial expressions in the absence of cortical processing. However, this residual ability is limited to fearful expressions. This finding suggests the existence of a subcortical system specialised in detecting danger signals based on coarse visual cues, therefore allowing the early recruitment of flight-or-fight behavioural responses even before conscious and detailed recognition of potential threats can take place. Moreover, the present dissertation extends the knowledge about crossmodal blindsight phenomena by showing that, unlike with visual detection, sound cannot crossmodally enhance visual orientation discrimination in the absence of functional striate cortex. This finding demonstrates, on the one hand, that the striate cortex plays a causative role in crossmodally enhancing visual orientation sensitivity and, on the other hand, that subcortical visual pathways bypassing the striate cortex, despite affording audio-visual integration processes leading to the improvement of simple visual abilities such as detection, cannot mediate multisensory enhancement of more complex visual functions, such as orientation discrimination.

Veja mais

Il ruolo del collicolo superiore nell'orientamento spaziale

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis was aimed at verifying the role of the superior colliculus (SC) in human spatial orienting. To do so, subjects performed two experimental tasks that have been shown to involve SC’s activation in animals, that is a multisensory integration task (Experiment 1 and 2) and a visual target selection task (Experiment 3). To investigate this topic in humans, we took advantage of neurophysiological finding revealing that retinal S-cones do not send projections to the collicular and magnocellular pathway. In the Experiment 1, subjects performed a simple reaction-time task in which they were required to respond as quickly as possible to any sensory stimulus (visual, auditory or bimodal audio-visual). The visual stimulus could be an S-cone stimulus (invisible to the collicular and magnocellular pathway) or a long wavelength stimulus (visible to the SC). Results showed that when using S-cone stimuli, RTs distribution was simply explained by probability summation, indicating that the redundant auditory and visual channels are independent. Conversely, with red long-wavelength stimuli, visible to the SC, the RTs distribution was related to nonlinear neural summation, which constitutes evidence of integration of different sensory information. We also demonstrate that when AV stimuli were presented at fixation, so that the spatial orienting component of the task was reduced, neural summation was possible regardless of stimulus color. Together, these findings provide support for a pivotal role of the SC in mediating multisensory spatial integration in humans, when behavior involves spatial orienting responses. Since previous studies have shown an anatomical asymmetry of fibres projecting to the SC from the hemiretinas, the Experiment 2 was aimed at investigating temporo-nasal asymmetry in multisensory integration. To do so, subjects performed monocularly the same task shown in the Experiment 1. When spatially coincident audio-visual stimuli were visible to the SC (i.e. red stimuli), the RTE depended on a neural coactivation mechanism, suggesting an integration of multisensory information. When using stimuli invisible to the SC (i.e. purple stimuli), the RTE depended only on a simple statistical facilitation effect, in which the two sensory stimuli were processed by independent channels. Finally, we demonstrate that the multisensory integration effect was stronger for stimuli presented to the temporal hemifield than to the nasal hemifield. Taken together, these findings suggested that multisensory stimulation can be differentially effective depending on specific stimulus parameters. The Experiment 3 was aimed at verifying the role of the SC in target selection by using a color-oddity search task, comprising stimuli either visible or invisible to the collicular and magnocellular pathways. Subjects were required to make a saccade toward a target that could be presented alone or with three distractors of another color (either S-cone or long-wavelength). When using S-cone distractors, invisible to the SC, localization errors were similar to those observed in the distractor-free condition. Conversely, with long-wavelength distractors, visible to the SC, saccadic localization error and variability were significantly greater than in either the distractor-free condition or the S-cone distractors condition. Our results clearly indicate that the SC plays a direct role in visual target selection in humans. Overall, our results indicate that the SC plays an important role in mediating spatial orienting responses both when required covert (Experiments 1 and 2) and overt orienting (Experiment 3).

Veja mais

Effetti dell'integrazione visuo-acustica in pazienti con disturbo di campo visivo

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Human brain is provided with a flexible audio-visual system, which interprets and guides responses to external events according to spatial alignment, temporal synchronization and effectiveness of unimodal signals. The aim of the present thesis was to explore the possibility that such a system might represent the neural correlate of sensory compensation after a damage to one sensory pathway. To this purpose, three experimental studies have been conducted, which addressed the immediate, short-term and long-term effects of audio-visual integration on patients with Visual Field Defect (VFD). Experiment 1 investigated whether the integration of stimuli from different modalities (cross-modal) and from the same modality (within-modal) have a different, immediate effect on localization behaviour. Patients had to localize modality-specific stimuli (visual or auditory), cross-modal stimulus pairs (visual-auditory) and within-modal stimulus pairs (visual-visual). Results showed that cross-modal stimuli evoked a greater improvement than within modal stimuli, consistent with a Bayesian explanation. Moreover, even when visual processing was impaired, cross-modal stimuli improved performance in an optimal fashion. These findings support the hypothesis that the improvement derived from multisensory integration is not attributable to simple target redundancy, and prove that optimal integration of cross-modal signals occurs in processing stage which are not consciously accessible. Experiment 2 examined the possibility to induce a short term improvement of localization performance without an explicit knowledge of visual stimulus. Patients with VFD and patients with neglect had to localize weak sounds before and after a brief exposure to a passive cross-modal stimulation, which comprised spatially disparate or spatially coincident audio-visual stimuli. After exposure to spatially disparate stimuli in the affected field, only patients with neglect exhibited a shifts of auditory localization toward the visual attractor (the so called Ventriloquism After-Effect). In contrast, after adaptation to spatially coincident stimuli, both neglect and hemianopic patients exhibited a significant improvement of auditory localization, proving the occurrence of After Effect for multisensory enhancement. These results suggest the presence of two distinct recalibration mechanisms, each mediated by a different neural route: a geniculo-striate circuit and a colliculus-extrastriate circuit respectively. Finally, Experiment 3 verified whether a systematic audio-visual stimulation could exert a long-lasting effect on patients’ oculomotor behaviour. Eye movements responses during a visual search task and a reading task were studied before and after visual (control) or audio-visual (experimental) training, in a group of twelve patients with VFD and twelve controls subjects. Results showed that prior to treatment, patients’ performance was significantly different from that of controls in relation to fixations and saccade parameters; after audiovisual training, all patients reported an improvement in ocular exploration characterized by fewer fixations and refixations, quicker and larger saccades, and reduced scanpath length. Similarly, reading parameters were significantly affected by the training, with respect to specific impairments observed in left and right hemisphere–damaged patients. The present findings provide evidence that a systematic audio-visual stimulation may encourage a more organized pattern of visual exploration with long lasting effects. In conclusion, results from these studies clearly demonstrate that the beneficial effects of audio-visual integration can be retained in absence of explicit processing of visual stimulus. Surprisingly, an improvement of spatial orienting can be obtained not only when a on-line response is required, but also after either a brief or a long adaptation to audio-visual stimulus pairs, so suggesting the maintenance of mechanisms subserving cross-modal perceptual learning after a damage to geniculo-striate pathway. The colliculus-extrastriate pathway, which is spared in patients with VFD, seems to play a pivotal role in this sensory compensation.

Veja mais

On-line adaptive visual tracking

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Visual tracking is the problem of estimating some variables related to a target given a video sequence depicting the target. Visual tracking is key to the automation of many tasks, such as visual surveillance, robot or vehicle autonomous navigation, automatic video indexing in multimedia databases. Despite many years of research, long term tracking in real world scenarios for generic targets is still unaccomplished. The main contribution of this thesis is the definition of effective algorithms that can foster a general solution to visual tracking by letting the tracker adapt to mutating working conditions. In particular, we propose to adapt two crucial components of visual trackers: the transition model and the appearance model. The less general but widespread case of tracking from a static camera is also considered and a novel change detection algorithm robust to sudden illumination changes is proposed. Based on this, a principled adaptive framework to model the interaction between Bayesian change detection and recursive Bayesian trackers is introduced. Finally, the problem of automatic tracker initialization is considered. In particular, a novel solution for categorization of 3D data is presented. The novel category recognition algorithm is based on a novel 3D descriptors that is shown to achieve state of the art performances in several applications of surface matching.

Veja mais

Role of nitric oxide and endocannabinoids in synaptic plasticity in the perirhinal cortex and in visual recognition memory

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Introduction and aims of the research Nitric oxide (NO) and endocannabinoids (eCBs) are major retrograde messengers, involved in synaptic plasticity (long-term potentiation, LTP, and long-term depression, LTD) in many brain areas (including hippocampus and neocortex), as well as in learning and memory processes. NO is synthesized by NO synthase (NOS) in response to increased cytosolic Ca2+ and mainly exerts its functions through soluble guanylate cyclase (sGC) and cGMP production. The main target of cGMP is the cGMP-dependent protein kinase (PKG). Activity-dependent release of eCBs in the CNS leads to the activation of the Gαi/o-coupled cannabinoid receptor 1 (CB1) at both glutamatergic and inhibitory synapses. The perirhinal cortex (Prh) is a multimodal associative cortex of the temporal lobe, critically involved in visual recognition memory. LTD is proposed to be the cellular correlate underlying this form of memory. Cholinergic neurotransmission has been shown to play a critical role in both visual recognition memory and LTD in Prh. Moreover, visual recognition memory is one of the main cognitive functions impaired in the early stages of Alzheimer’s disease. The main aim of my research was to investigate the role of NO and ECBs in synaptic plasticity in rat Prh and in visual recognition memory. Part of this research was dedicated to the study of synaptic transmission and plasticity in a murine model (Tg2576) of Alzheimer’s disease. Methods Field potential recordings. Extracellular field potential recordings were carried out in horizontal Prh slices from Sprague-Dawley or Dark Agouti juvenile (p21-35) rats. LTD was induced with a single train of 3000 pulses delivered at 5 Hz (10 min), or via bath application of carbachol (Cch; 50 μM) for 10 min. LTP was induced by theta-burst stimulation (TBS). In addition, input/output curves and 5Hz-LTD were carried out in Prh slices from 3 month-old Tg2576 mice and littermate controls. Behavioural experiments. The spontaneous novel object exploration task was performed in intra-Prh bilaterally cannulated adult Dark Agouti rats. Drugs or vehicle (saline) were directly infused into the Prh 15 min before training to verify the role of nNOS and CB1 in visual recognition memory acquisition. Object recognition memory was tested at 20 min and 24h after the end of the training phase. Results Electrophysiological experiments in Prh slices from juvenile rats showed that 5Hz-LTD is due to the activation of the NOS/sGC/PKG pathway, whereas Cch-LTD relies on NOS/sGC but not PKG activation. By contrast, NO does not appear to be involved in LTP in this preparation. Furthermore, I found that eCBs are involved in LTP induction, but not in basal synaptic transmission, 5Hz-LTD and Cch-LTD. Behavioural experiments demonstrated that the blockade of nNOS impairs rat visual recognition memory tested at 24 hours, but not at 20 min; however, the blockade of CB1 did not affect visual recognition memory acquisition tested at both time points specified. In three month-old Tg2576 mice, deficits in basal synaptic transmission and 5Hz-LTD were observed compared to littermate controls. Conclusions The results obtained in Prh slices from juvenile rats indicate that NO and CB1 play a role in the induction of LTD and LTP, respectively. These results are confirmed by the observation that nNOS, but not CB1, is involved in visual recognition memory acquisition. The preliminary results obtained in the murine model of Alzheimer’s disease indicate that deficits in synaptic transmission and plasticity occur very early in Prh; further investigations are required to characterize the molecular mechanisms underlying these deficits.

Veja mais

Fingerprint Recognition: Enhancement, Feature Extraction and Automatic Evaluation of Algorithms

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The identification of people by measuring some traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis is focused on improving fingerprint recognition systems considering three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. An effective extraction of salient fingerprint features depends on the quality of the input fingerprint. If the fingerprint is very noisy, we are not able to detect a reliable set of features. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands like wildfire toward low-quality ones. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. The fingerprint orientation extraction is improved following two directions. First, after the introduction of a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, pointing out the role of pre- and post- processing, we show how to improve the extraction. Second, the introduction of a new hybrid orientation extraction method, which follows an adaptive scheme, allows to improve significantly the orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules and therefore an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that determine an actual progress in the state-of-the-art. The lack of a publicly available framework to compare fingerprint orientation extraction algorithms, motivates the introduction of a new benchmark area called FOE (including fingerprints and manually-marked orientation ground-truth) along with fingerprint matching benchmarks in the FVC-onGoing framework. The success of such framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.

Veja mais

Automatic code generation: from process algebraic architectural descriptions to multithreaded java programs

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Process algebraic architectural description languages provide a formal means for modeling software systems and assessing their properties. In order to bridge the gap between system modeling and system im- plementation, in this thesis an approach is proposed for automatically generating multithreaded object-oriented code from process algebraic architectural descriptions, in a way that preserves – under certain assumptions – the properties proved at the architectural level. The approach is divided into three phases, which are illustrated by means of a running example based on an audio processing system. First, we develop an architecture-driven technique for thread coordination management, which is completely automated through a suitable package. Second, we address the translation of the algebraically-specified behavior of the individual software units into thread templates, which will have to be filled in by the software developer according to certain guidelines. Third, we discuss performance issues related to the suitability of synthesizing monitors rather than threads from software unit descriptions that satisfy specific constraints. In addition to the running example, we present two case studies about a video animation repainting system and the implementation of a leader election algorithm, in order to summarize the whole approach. The outcome of this thesis is the implementation of the proposed approach in a translator called PADL2Java and its integration in the architecture-centric verification tool TwoTowers.

Veja mais

Methodologies for visual correspondence

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Visual correspondence is a key computer vision task that aims at identifying projections of the same 3D point into images taken either from different viewpoints or at different time instances. This task has been the subject of intense research activities in the last years in scenarios such as object recognition, motion detection, stereo vision, pattern matching, image registration. The approaches proposed in literature typically aim at improving the state of the art by increasing the reliability, the accuracy or the computational efficiency of visual correspondence algorithms. The research work carried out during the Ph.D. course and presented in this dissertation deals with three specific visual correspondence problems: fast pattern matching, stereo correspondence and robust image matching. The dissertation presents original contributions to the theory of visual correspondence, as well as applications dealing with 3D reconstruction and multi-view video surveillance.

Veja mais

Creazione e sviluppo di corpora multimediali. Nuove metodologie di ricerca nella traduzione audiovisiva

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The construction and use of multimedia corpora has been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. As a matter of fact, in order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Provided that one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes. The second chapter focuses diachronically on various theories’ contribution to the application of Corpus Linguistics’ methods and tools to the field of Translation Studies (i.e. Descriptive Translation Studies, Polysystem Theory). In particular, we discussed how the use of corpora can favourably help reduce the gap existing between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics in regard to the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate Translation. The third chapter reviews Forlixt 1's main developing steps, from a technical (IT design principles, data query functions) and methodological point of view, by laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database in a pedagogical perspective. The fourth chapter lists specific compilation criteria retained, as well as statistics of the two sub-corpora, by presenting data broken down by language pair (French-Italian and German-Italian) and genre (cinema’s comedies, television’s soapoperas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus. The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. After having reviewed the main translation techniques at work, an exhaustive account of possible causes for their non-translation was equally provided. By way of conclusion, the discussion of results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis allowed us to underline possible advantages and drawbacks related to the adoption of a corpus-based approach to AVT studies. Even though possible updating and improvement were proposed in order to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument, allowing to carry out empirically-sound contrastive studies that may be usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.

Veja mais

Algorithms for on-line image registration from multiple views

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Images of a scene, static or dynamic, are generally acquired at different epochs from different viewpoints. They potentially gather information about the whole scene and its relative motion with respect to the acquisition device. Data from different (in the spatial or temporal domain) visual sources can be fused together to provide a unique consistent representation of the whole scene, even recovering the third dimension, permitting a more complete understanding of the scene content. Moreover, the pose of the acquisition device can be achieved by estimating the relative motion parameters linking different views, thus providing localization information for automatic guidance purposes. Image registration is based on the use of pattern recognition techniques to match among corresponding parts of different views of the acquired scene. Depending on hypotheses or prior information about the sensor model, the motion model and/or the scene model, this information can be used to estimate global or local geometrical mapping functions between different images or different parts of them. These mapping functions contain relative motion parameters between the scene and the sensor(s) and can be used to integrate accordingly informations coming from the different sources to build a wider or even augmented representation of the scene. Accordingly, for their scene reconstruction and pose estimation capabilities, nowadays image registration techniques from multiple views are increasingly stirring up the interest of the scientific and industrial community. Depending on the applicative domain, accuracy, robustness, and computational payload of the algorithms represent important issues to be addressed and generally a trade-off among them has to be reached. Moreover, on-line performance is desirable in order to guarantee the direct interaction of the vision device with human actors or control systems. This thesis follows a general research approach to cope with these issues, almost independently from the scene content, under the constraint of rigid motions. This approach has been motivated by the portability to very different domains as a very desirable property to achieve. A general image registration approach suitable for on-line applications has been devised and assessed through two challenging case studies in different applicative domains. The first case study regards scene reconstruction through on-line mosaicing of optical microscopy cell images acquired with non automated equipment, while moving manually the microscope holder. By registering the images the field of view of the microscope can be widened, preserving the resolution while reconstructing the whole cell culture and permitting the microscopist to interactively explore the cell culture. In the second case study, the registration of terrestrial satellite images acquired by a camera integral with the satellite is utilized to estimate its three-dimensional orientation from visual data, for automatic guidance purposes. Critical aspects of these applications are emphasized and the choices adopted are motivated accordingly. Results are discussed in view of promising future developments.

Veja mais

Oblique ionograms automatic scaling and eikonal based ray tracing

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A method for automatic scaling of oblique ionograms has been introduced. This method also provides a rejection procedure for ionograms that are considered to lack sufficient information, depicting a very good success rate. Observing the Kp index of each autoscaled ionogram, can be noticed that the behavior of the autoscaling program does not depend on geomagnetic conditions. The comparison between the values of the MUF provided by the presented software and those obtained by an experienced operator indicate that the procedure developed for detecting the nose of oblique ionogram traces is sufficiently efficient and becomes much more efficient as the quality of the ionograms improves. These results demonstrate the program allows the real-time evaluation of MUF values associated with a particular radio link through an oblique radio sounding. The automatic recognition of a part of the trace allows determine for certain frequencies, the time taken by the radio wave to travel the path between the transmitter and receiver. The reconstruction of the ionogram traces, suggests the possibility of estimating the electron density between the transmitter and the receiver, from an oblique ionogram. The showed results have been obtained with a ray-tracing procedure based on the integration of the eikonal equation and using an analytical ionospheric model with free parameters. This indicates the possibility of applying an adaptive model and a ray-tracing algorithm to estimate the electron density in the ionosphere between the transmitter and the receiver An additional study has been conducted on a high quality ionospheric soundings data set and another algorithm has been designed for the conversion of an oblique ionogram into a vertical one, using Martyn's theorem. This allows a further analysis of oblique soundings, throw the use of the INGV Autoscala program for the automatic scaling of vertical ionograms.

Veja mais

12 resultados para audio-visual automatic speech recognition

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna

Filtro por publicador