37 resultados para Visual Odometry,Transformer,Deep learning
Resumo:
In highly urbanized coastal lowlands, effective site characterization is crucial for assessing seismic risk. It requires a comprehensive stratigraphic analysis of the shallow subsurface, coupled with the precise assessment of the geophysical properties of buried deposits. In this context, late Quaternary paleovalley systems, shallowly buried fluvial incisions formed during the Late Pleistocene sea-level fall and filled during the Holocene sea-level rise, are crucial for understanding seismic amplification due to their soft sediment infill and sharp lithologic contrasts. In this research, we conducted high-resolution stratigraphic analyses of two regions, the Pescara and Manfredonia areas along the Adriatic coastline of Italy, to delineate the geometries and facies architecture of two paleovalley systems. Furthermore, we carried out geophysical investigations to characterize the study areas and perform seismic response analyses. We tested the microtremor-based horizontal-to-vertical spectral ratio as a mapping tool to reconstruct the buried paleovalley geometries. We evaluated the relationship between geological and geophysical data and identified the stratigraphic surfaces responsible for the observed resonances. To perform seismic response analysis of the Pescara paleovalley system, we integrated the stratigraphic framework with microtremor and shear wave velocity measurements. The seismic response analysis highlights strong seismic amplifications in frequency ranges that can interact with a wide variety of building types. Additionally, we explored the applicability of artificial intelligence in performing facies analysis from borehole images. We used a robust dataset of high-resolution digital images from continuous sediment cores of Holocene age to outline a novel, deep-learning-based approach for performing automatic semantic segmentation directly on core images, leveraging the power of convolutional neural networks. We propose an automated model to rapidly characterize sediment cores, reproducing the sedimentologist's interpretation, and providing guidance for stratigraphic correlation and subsurface reconstructions.
Resumo:
Nowadays robotic applications are widespread and most of the manipulation tasks are efficiently solved. However, Deformable-Objects (DOs) still represent a huge limitation for robots. The main difficulty in DOs manipulation is dealing with the shape and dynamics uncertainties, which prevents the use of model-based approaches (since they are excessively computationally complex) and makes sensory data difficult to interpret. This thesis reports the research activities aimed to address some applications in robotic manipulation and sensing of Deformable-Linear-Objects (DLOs), with particular focus to electric wires. In all the works, a significant effort was made in the study of an effective strategy for analyzing sensory signals with various machine learning algorithms. In the former part of the document, the main focus concerns the wire terminals, i.e. detection, grasping, and insertion. First, a pipeline that integrates vision and tactile sensing is developed, then further improvements are proposed for each module. A novel procedure is proposed to gather and label massive amounts of training images for object detection with minimal human intervention. Together with this strategy, we extend a generic object detector based on Convolutional-Neural-Networks for orientation prediction. The insertion task is also extended by developing a closed-loop control capable to guide the insertion of a longer and curved segment of wire through a hole, where the contact forces are estimated by means of a Recurrent-Neural-Network. In the latter part of the thesis, the interest shifts to the DLO shape. Robotic reshaping of a DLO is addressed by means of a sequence of pick-and-place primitives, while a decision making process driven by visual data learns the optimal grasping locations exploiting Deep Q-learning and finds the best releasing point. The success of the solution leverages on a reliable interpretation of the DLO shape. For this reason, further developments are made on the visual segmentation.
Resumo:
The wide use of e-technologies represents a great opportunity for underserved segments of the population, especially with the aim of reintegrating excluded individuals back into society through education. This is particularly true for people with different types of disabilities who may have difficulties while attending traditional on-site learning programs that are typically based on printed learning resources. The creation and provision of accessible e-learning contents may therefore become a key factor in enabling people with different access needs to enjoy quality learning experiences and services. Another e-learning challenge is represented by m-learning (which stands for mobile learning), which is emerging as a consequence of mobile terminals diffusion and provides the opportunity to browse didactical materials everywhere, outside places that are traditionally devoted to education. Both such situations share the need to access materials in limited conditions and collide with the growing use of rich media in didactical contents, which are designed to be enjoyed without any restriction. Nowadays, Web-based teaching makes great use of multimedia technologies, ranging from Flash animations to prerecorded video-lectures. Rich media in e-learning can offer significant potential in enhancing the learning environment, through helping to increase access to education, enhance the learning experience and support multiple learning styles. Moreover, they can often be used to improve the structure of Web-based courses. These highly variegated and structured contents may significantly improve the quality and the effectiveness of educational activities for learners. For example, rich media contents allow us to describe complex concepts and process flows. Audio and video elements may be utilized to add a “human touch” to distance-learning courses. Finally, real lectures may be recorded and distributed to integrate or enrich on line materials. A confirmation of the advantages of these approaches can be seen in the exponential growth of video-lecture availability on the net, due to the ease of recording and delivering activities which take place in a traditional classroom. Furthermore, the wide use of assistive technologies for learners with disabilities injects new life into e-learning systems. E-learning allows distance and flexible educational activities, thus helping disabled learners to access resources which would otherwise present significant barriers for them. For instance, students with visual impairments have difficulties in reading traditional visual materials, deaf learners have trouble in following traditional (spoken) lectures, people with motion disabilities have problems in attending on-site programs. As already mentioned, the use of wireless technologies and pervasive computing may really enhance the educational learner experience by offering mobile e-learning services that can be accessed by handheld devices. This new paradigm of educational content distribution maximizes the benefits for learners since it enables users to overcome constraints imposed by the surrounding environment. While certainly helpful for users without disabilities, we believe that the use of newmobile technologies may also become a fundamental tool for impaired learners, since it frees them from sitting in front of a PC. In this way, educational activities can be enjoyed by all the users, without hindrance, thus increasing the social inclusion of non-typical learners. While the provision of fully accessible and portable video-lectures may be extremely useful for students, it is widely recognized that structuring and managing rich media contents for mobile learning services are complex and expensive tasks. Indeed, major difficulties originate from the basic need to provide a textual equivalent for each media resource composing a rich media Learning Object (LO). Moreover, tests need to be carried out to establish whether a given LO is fully accessible to all kinds of learners. Unfortunately, both these tasks are truly time-consuming processes, depending on the type of contents the teacher is writing and on the authoring tool he/she is using. Due to these difficulties, online LOs are often distributed as partially accessible or totally inaccessible content. Bearing this in mind, this thesis aims to discuss the key issues of a system we have developed to deliver accessible, customized or nomadic learning experiences to learners with different access needs and skills. To reduce the risk of excluding users with particular access capabilities, our system exploits Learning Objects (LOs) which are dynamically adapted and transcoded based on the specific needs of non-typical users and on the barriers that they can encounter in the environment. The basic idea is to dynamically adapt contents, by selecting them from a set of media resources packaged in SCORM-compliant LOs and stored in a self-adapting format. The system schedules and orchestrates a set of transcoding processes based on specific learner needs, so as to produce a customized LO that can be fully enjoyed by any (impaired or mobile) student.
Resumo:
Introduction and aims of the research Nitric oxide (NO) and endocannabinoids (eCBs) are major retrograde messengers, involved in synaptic plasticity (long-term potentiation, LTP, and long-term depression, LTD) in many brain areas (including hippocampus and neocortex), as well as in learning and memory processes. NO is synthesized by NO synthase (NOS) in response to increased cytosolic Ca2+ and mainly exerts its functions through soluble guanylate cyclase (sGC) and cGMP production. The main target of cGMP is the cGMP-dependent protein kinase (PKG). Activity-dependent release of eCBs in the CNS leads to the activation of the Gαi/o-coupled cannabinoid receptor 1 (CB1) at both glutamatergic and inhibitory synapses. The perirhinal cortex (Prh) is a multimodal associative cortex of the temporal lobe, critically involved in visual recognition memory. LTD is proposed to be the cellular correlate underlying this form of memory. Cholinergic neurotransmission has been shown to play a critical role in both visual recognition memory and LTD in Prh. Moreover, visual recognition memory is one of the main cognitive functions impaired in the early stages of Alzheimer’s disease. The main aim of my research was to investigate the role of NO and ECBs in synaptic plasticity in rat Prh and in visual recognition memory. Part of this research was dedicated to the study of synaptic transmission and plasticity in a murine model (Tg2576) of Alzheimer’s disease. Methods Field potential recordings. Extracellular field potential recordings were carried out in horizontal Prh slices from Sprague-Dawley or Dark Agouti juvenile (p21-35) rats. LTD was induced with a single train of 3000 pulses delivered at 5 Hz (10 min), or via bath application of carbachol (Cch; 50 μM) for 10 min. LTP was induced by theta-burst stimulation (TBS). In addition, input/output curves and 5Hz-LTD were carried out in Prh slices from 3 month-old Tg2576 mice and littermate controls. Behavioural experiments. The spontaneous novel object exploration task was performed in intra-Prh bilaterally cannulated adult Dark Agouti rats. Drugs or vehicle (saline) were directly infused into the Prh 15 min before training to verify the role of nNOS and CB1 in visual recognition memory acquisition. Object recognition memory was tested at 20 min and 24h after the end of the training phase. Results Electrophysiological experiments in Prh slices from juvenile rats showed that 5Hz-LTD is due to the activation of the NOS/sGC/PKG pathway, whereas Cch-LTD relies on NOS/sGC but not PKG activation. By contrast, NO does not appear to be involved in LTP in this preparation. Furthermore, I found that eCBs are involved in LTP induction, but not in basal synaptic transmission, 5Hz-LTD and Cch-LTD. Behavioural experiments demonstrated that the blockade of nNOS impairs rat visual recognition memory tested at 24 hours, but not at 20 min; however, the blockade of CB1 did not affect visual recognition memory acquisition tested at both time points specified. In three month-old Tg2576 mice, deficits in basal synaptic transmission and 5Hz-LTD were observed compared to littermate controls. Conclusions The results obtained in Prh slices from juvenile rats indicate that NO and CB1 play a role in the induction of LTD and LTP, respectively. These results are confirmed by the observation that nNOS, but not CB1, is involved in visual recognition memory acquisition. The preliminary results obtained in the murine model of Alzheimer’s disease indicate that deficits in synaptic transmission and plasticity occur very early in Prh; further investigations are required to characterize the molecular mechanisms underlying these deficits.
Resumo:
Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this purpose, decentralized learning decouples model training from the need of directly accessing raw data, by alternating on-device training and periodic communications. The ability of distilling knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis proposes an extensive overview of decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it on federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state-of-the-art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a mean for optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision on relevant future research directions close the manuscript.
Resumo:
In medicine, innovation depends on a better knowledge of the human body mechanism, which represents a complex system of multi-scale constituents. Unraveling the complexity underneath diseases proves to be challenging. A deep understanding of the inner workings comes with dealing with many heterogeneous information. Exploring the molecular status and the organization of genes, proteins, metabolites provides insights on what is driving a disease, from aggressiveness to curability. Molecular constituents, however, are only the building blocks of the human body and cannot currently tell the whole story of diseases. This is why nowadays attention is growing towards the contemporary exploitation of multi-scale information. Holistic methods are then drawing interest to address the problem of integrating heterogeneous data. The heterogeneity may derive from the diversity across data types and from the diversity within diseases. Here, four studies conducted data integration using customly designed workflows that implement novel methods and views to tackle the heterogeneous characterization of diseases. The first study devoted to determine shared gene regulatory signatures for onco-hematology and it showed partial co-regulation across blood-related diseases. The second study focused on Acute Myeloid Leukemia and refined the unsupervised integration of genomic alterations, which turned out to better resemble clinical practice. In the third study, network integration for artherosclerosis demonstrated, as a proof of concept, the impact of network intelligibility when it comes to model heterogeneous data, which showed to accelerate the identification of new potential pharmaceutical targets. Lastly, the fourth study introduced a new method to integrate multiple data types in a unique latent heterogeneous-representation that facilitated the selection of important data types to predict the tumour stage of invasive ductal carcinoma. The results of these four studies laid the groundwork to ease the detection of new biomarkers ultimately beneficial to medical practice and to the ever-growing field of Personalized Medicine.
Resumo:
The abundance of visual data and the push for robust AI are driving the need for automated visual sensemaking. Computer Vision (CV) faces growing demand for models that can discern not only what images "represent," but also what they "evoke." This is a demand for tools mimicking human perception at a high semantic level, categorizing images based on concepts like freedom, danger, or safety. However, automating this process is challenging due to entropy, scarcity, subjectivity, and ethical considerations. These challenges not only impact performance but also underscore the critical need for interoperability. This dissertation focuses on abstract concept-based (AC) image classification, guided by three technical principles: situated grounding, performance enhancement, and interpretability. We introduce ART-stract, a novel dataset of cultural images annotated with ACs, serving as the foundation for a series of experiments across four key domains: assessing the effectiveness of the end-to-end DL paradigm, exploring cognitive-inspired semantic intermediaries, incorporating cultural and commonsense aspects, and neuro-symbolic integration of sensory-perceptual data with cognitive-based knowledge. Our results demonstrate that integrating CV approaches with semantic technologies yields methods that surpass the current state of the art in AC image classification, outperforming the end-to-end deep vision paradigm. The results emphasize the role semantic technologies can play in developing both effective and interpretable systems, through the capturing, situating, and reasoning over knowledge related to visual data. Furthermore, this dissertation explores the complex interplay between technical and socio-technical factors. By merging technical expertise with an understanding of human and societal aspects, we advocate for responsible labeling and training practices in visual media. These insights and techniques not only advance efforts in CV and explainable artificial intelligence but also propel us toward an era of AI development that harmonizes technical prowess with deep awareness of its human and societal implications.