907 results for Visual Odometry, Transformer, Deep learning
Abstract:
Neural representations (NR) have emerged in the last few years as a powerful tool to represent signals from several domains, such as images, 3D shapes, or audio. Indeed, deep neural networks have been shown to be capable of approximating continuous functions that describe a given signal with theoretically infinite resolution. This finding makes it possible to obtain representations whose memory footprint is fixed and decoupled from the resolution at which the underlying signal can be sampled, something that is not possible with traditional discrete representations, e.g., grids of pixels for images or voxels for 3D shapes. During the last two years, many techniques have been proposed to improve the ability of NR to approximate high-frequency details and to make the optimization procedures required to obtain NR less demanding in terms of both time and data, motivating many researchers to deploy NR as the main form of data representation in complex pipelines. Following this line of research, we first show that NR can precisely approximate Unsigned Distance Functions, providing an effective way to represent garments, which feature open 3D surfaces and unknown topology. Then, we present a pipeline to obtain, in a few minutes, a compact Neural Twin® of a given object by exploiting recent advances in modeling neural radiance fields. Furthermore, we take a step toward adopting NR as a standalone representation by considering the possibility of performing downstream tasks by directly processing the NR weights. We first show that deep neural networks can be compressed into compact latent codes. Then, we show how this technique can be exploited to perform deep learning on implicit neural representations (INR) of 3D shapes by looking only at the weights of the networks.
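As a rough illustration of the core idea above (not the thesis pipeline), the following sketch fits a small coordinate MLP to the unsigned distance function of a toy open surface; the architecture, sampling scheme, and loss are illustrative assumptions.

```python
# Minimal sketch: fit a small MLP to an Unsigned Distance Function (UDF).
# The architecture, sampling scheme, and loss are illustrative assumptions,
# not the pipeline described in the thesis.
import torch
import torch.nn as nn

class UDFNet(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # a UDF is non-negative
        )

    def forward(self, xyz):
        return self.net(xyz).squeeze(-1)

def ground_truth_udf(xyz):
    # Toy open surface: the unit disk lying in the plane z = 0.
    radial = torch.clamp(xyz[:, :2].norm(dim=1) - 1.0, min=0.0)
    return torch.sqrt(radial ** 2 + xyz[:, 2] ** 2)

model = UDFNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for step in range(1000):
    pts = torch.rand(4096, 3) * 2 - 1            # sample points in [-1, 1]^3
    loss = nn.functional.mse_loss(model(pts), ground_truth_udf(pts))
    opt.zero_grad(); loss.backward(); opt.step()
```

The memory footprint of the representation is the weight count of the MLP, fixed regardless of how densely the distance field is later queried.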
Abstract:
The study of ancient, undeciphered scripts presents unique challenges that depend both on the nature of the problem and on the peculiarities of each writing system. In this thesis, I present two computational approaches tailored to two different tasks and writing systems. The first of these methods is aimed at the decipherment of the Linear A fraction signs, in order to discover their numerical values. This is achieved with a combination of constraint programming, ad-hoc metrics, and paleographic considerations. The second main contribution of this thesis regards the creation of an unsupervised deep learning model that uses drawings of signs from ancient writing systems to learn to distinguish different graphemes in the vector space. This system, which is based on techniques used in the field of computer vision, is adapted to the study of ancient writing systems by incorporating information about sequences into the model, mirroring what is often done in natural language processing. To develop this model, the Cypriot Greek Syllabary is used as a target, since this is a deciphered writing system. Finally, this unsupervised model is adapted to the undeciphered Cypro-Minoan and used to answer open questions about this script. In particular, by reconstructing multiple allographs that are not agreed upon by paleographers, it supports the idea that Cypro-Minoan is a single script and not a collection of three scripts, as has been proposed in the literature. These results on two different tasks show that computational methods can be applied to undeciphered scripts, despite the relatively small amount of available data, paving the way for further advances in paleography using these methods.
Abstract:
This thesis focuses on automating the time-consuming task of manually counting activated neurons in fluorescent microscopy images, which is used to study the mechanisms underlying torpor. The traditional method of manual annotation can introduce bias and delay the outcome of experiments, so the author investigates a deep-learning-based procedure to automate this task. The author explores two state-of-the-art convolutional neural network (CNN) architectures, the UNet and ResUNet families, and uses a counting-by-segmentation strategy to provide a justification for the objects considered during the counting process. The author also explores a weakly supervised learning strategy that exploits only dot annotations. The author quantifies the advantages, in terms of data reduction and counting performance, obtainable with a transfer-learning approach and, specifically, a fine-tuning procedure. The author released the dataset used for the supervised use case and all the pre-trained models, and designed a web application to share both the counting pipeline developed in this work and the models pre-trained on the analyzed dataset.
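For illustration, the counting-by-segmentation idea can be reduced to thresholding a predicted probability map and counting connected components; the sketch below is a minimal stand-in, with the threshold and minimum blob area chosen arbitrarily rather than taken from the thesis.

```python
# Minimal sketch of counting-by-segmentation: a segmentation network
# (e.g., a UNet) predicts a per-pixel probability of "activated neuron";
# counting then reduces to labeling connected components above a threshold.
# Threshold and minimum blob area are illustrative assumptions.
import numpy as np
from scipy import ndimage

def count_from_probability_map(prob_map, threshold=0.5, min_area=20):
    mask = prob_map > threshold                       # binarize the prediction
    labels, n_components = ndimage.label(mask)        # connected components
    sizes = ndimage.sum(mask, labels, range(1, n_components + 1))
    return int(np.sum(sizes >= min_area))             # discard tiny blobs

# Usage on a dummy probability map standing in for a real network output:
dummy = np.random.rand(512, 512)
print(count_from_probability_map(dummy, threshold=0.95, min_area=5))
```

Because each counted object corresponds to a labeled region, the count can be justified visually by overlaying the labeled mask on the original image.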
Abstract:
The cation chloride cotransporters (CCCs) represent a vital family of ion transporters, with several members implicated in significant neurological disorders. Specifically, conditions such as cerebrospinal fluid accumulation, epilepsy, Down’s syndrome, Asperger’s syndrome, and certain cancers have been attributed to various CCCs. This thesis delves into these pharmacological targets using advanced computational methodologies. I primarily employed GPU-accelerated all-atom molecular dynamics simulations, deep learning-based collective variables, enhanced sampling methods, and custom Python scripts for comprehensive simulation analyses. Our research predominantly centered on KCC1 and NKCC1 transporters. For KCC1, I examined its equilibrium dynamics in the presence/absence of an inhibitor and assessed the functional implications of different ion loading states. In contrast, our work on NKCC1 revealed its unique alternating access mechanism, termed the rocking-bundle mechanism. I identified a previously unobserved occluded state and demonstrated the transporter's potential for water permeability under specific conditions. Furthermore, I confirmed the actual water flow through its permeable states. In essence, this thesis leverages cutting-edge computational techniques to deepen our understanding of the CCCs, a family of ion transporters with profound clinical significance.
Abstract:
Quantitative Susceptibility Mapping (QSM) is an advanced magnetic resonance technique that can quantify in vivo biomarkers of pathology, such as alterations in iron and myelin concentration. It allows for the comparison of magnetic susceptibility properties within and between different subject groups. In this thesis, the QSM acquisition and processing pipeline is discussed, together with clinical and methodological applications of QSM to neurodegeneration. In designing the studies, significant emphasis was placed on the reproducibility and interpretability of results. The first project focuses on the investigation of cortical regions in amyotrophic lateral sclerosis. By examining various histogram susceptibility properties, a pattern of increased iron content was revealed in patients with amyotrophic lateral sclerosis compared to controls and other neurodegenerative disorders. Moreover, there was a correlation between susceptibility and upper motor neuron impairment, particularly in patients experiencing rapid disease progression. Similarly, in the second application, QSM was used to examine cortical and sub-cortical areas in individuals with myotonic dystrophy type 1. The thalamus and brainstem were identified as structures of interest, with relevant correlations with clinical and laboratory data such as neurological evaluations and sleep records. In the third project, a robust pipeline for assessing the reliability of radiomic susceptibility-based features was implemented within a cohort of patients with multiple sclerosis and healthy controls. Lastly, a deep learning super-resolution model was applied to QSM images of healthy controls. The employed model demonstrated excellent generalization abilities and outperformed traditional up-sampling methods, without requiring customized re-training. Across the three disorders investigated, it was evident that QSM is capable of distinguishing between patient groups and healthy controls while establishing correlations between imaging measurements and clinical data. These studies lay the foundation for future research, with the ultimate goal of achieving earlier and less invasive diagnoses of neurodegenerative disorders within the context of personalized medicine.
Abstract:
Embedded systems are increasingly integral to daily life, improving and facilitating the efficiency of modern Cyber-Physical Systems, which provide access to sensor data and actuators. As modern architectures become increasingly complex and heterogeneous, their optimization becomes a challenging task. Additionally, ensuring platform security is important to avoid harm to individuals and assets. This study primarily addresses challenges in contemporary Embedded Systems, focusing on platform optimization and security enforcement. The initial section of this study delves into the application of machine learning methods to efficiently determine the optimal number of cores for a parallel RISC-V cluster so as to minimize energy consumption, using static source code analysis. Results demonstrate that automated platform configuration is viable, although there is a moderate performance trade-off when relying solely on static features. The second part addresses the problem of heterogeneous device mapping, which involves assigning tasks to the most suitable computational device in a heterogeneous platform for optimal runtime. The contribution of this section lies in the introduction of novel pre-processing techniques, along with a Siamese-network training framework, that enhance the classification performance of DeepLLVM, an advanced approach for task mapping. Importantly, the proposed approaches are independent of the specific deep-learning model used. Finally, this research work addresses issues concerning the binary exploitation of software running on modern Embedded Systems. It proposes an architecture to implement Control-Flow Integrity in embedded platforms with a Root-of-Trust, aiming to enhance security guarantees with limited hardware modifications. The approach involves extending the architecture of a modern RISC-V platform for autonomous vehicles with a side-channel communication mechanism that relays control-flow changes executed by the process running on the host core to the Root-of-Trust. This approach has limited impact on performance and is effective in enhancing the security of embedded platforms.
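As a hedged illustration of the Siamese training idea mentioned above (not the DeepLLVM pipeline itself), the sketch below embeds pairs of kernel feature vectors with a shared encoder and applies a contrastive loss; the encoder, feature size, and margin are assumptions.

```python
# Minimal sketch of a Siamese setup for heterogeneous device mapping:
# a shared encoder embeds two kernel representations, and a contrastive
# loss pulls together embeddings of kernels that map to the same device.
# Encoder, feature size, and margin are illustrative assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim=128, emb_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, emb_dim))

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, same_device, margin=1.0):
    dist = torch.norm(z1 - z2, dim=1)
    pos = same_device * dist.pow(2)                                     # pull together
    neg = (1 - same_device) * torch.clamp(margin - dist, min=0).pow(2)  # push apart
    return (pos + neg).mean()

encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
x1, x2 = torch.randn(16, 128), torch.randn(16, 128)   # paired kernel features
same_device = torch.randint(0, 2, (16,)).float()      # 1 if same target device
loss = contrastive_loss(encoder(x1), encoder(x2), same_device)
opt.zero_grad(); loss.backward(); opt.step()
```

A classifier for the actual device assignment can then be trained on top of the learned embeddings, which is independent of the specific downstream model.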
Abstract:
In highly urbanized coastal lowlands, effective site characterization is crucial for assessing seismic risk. It requires a comprehensive stratigraphic analysis of the shallow subsurface, coupled with the precise assessment of the geophysical properties of buried deposits. In this context, late Quaternary paleovalley systems, shallowly buried fluvial incisions formed during the Late Pleistocene sea-level fall and filled during the Holocene sea-level rise, are crucial for understanding seismic amplification due to their soft sediment infill and sharp lithologic contrasts. In this research, we conducted high-resolution stratigraphic analyses of two regions, the Pescara and Manfredonia areas along the Adriatic coastline of Italy, to delineate the geometries and facies architecture of two paleovalley systems. Furthermore, we carried out geophysical investigations to characterize the study areas and perform seismic response analyses. We tested the microtremor-based horizontal-to-vertical spectral ratio as a mapping tool to reconstruct the buried paleovalley geometries. We evaluated the relationship between geological and geophysical data and identified the stratigraphic surfaces responsible for the observed resonances. To perform seismic response analysis of the Pescara paleovalley system, we integrated the stratigraphic framework with microtremor and shear wave velocity measurements. The seismic response analysis highlights strong seismic amplifications in frequency ranges that can interact with a wide variety of building types. Additionally, we explored the applicability of artificial intelligence in performing facies analysis from borehole images. We used a robust dataset of high-resolution digital images from continuous sediment cores of Holocene age to outline a novel, deep-learning-based approach for performing automatic semantic segmentation directly on core images, leveraging the power of convolutional neural networks. We propose an automated model to rapidly characterize sediment cores, reproducing the sedimentologist's interpretation, and providing guidance for stratigraphic correlation and subsurface reconstructions.
Abstract:
In recent years, natural language processing has undergone a major evolution, driven mainly by parallel advances in deep learning. With architectures growing exponentially in size and ever more comprehensive training corpora, neural models are now able to generate text that is indistinguishable from human writing. However, accurate predictions on complex tasks are matched by evaluation metrics that frequently lag behind, unable to capture the semantic nuances or the evaluation dimensions required. This gap still motivates the adoption of human evaluation as the standard methodology, but the pervasive nature of text on the Web makes evident the need for automatic, scalable systems that are efficient in both time and cost. This thesis proposes an analysis of the main state-of-the-art metrics for evaluating pre-trained models, from the most popular ones such as Rouge to those that in turn exploit models to evaluate text. Furthermore, a new library, named Blanche, is introduced, aimed at collecting in a single environment the implementations of the main contributions available today, facilitating their use by developers and researchers. Finally, Blanche is applied to a broad-spectrum evaluation of the generative results obtained in a real case study centered on the verbalization of biomedical events reported in the scientific literature. Particular attention is devoted to the handling of abstractiveness, an increasingly crucial and challenging aspect of evaluation.
Abstract:
The COVID-19 disease associated with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has posed a serious threat to public health and the global economy since its discovery in China in December 2019. Researchers have carried out numerous studies; in particular, the application of epidemiological models built from the collected data has allowed the prediction of several scenarios for the short-to-medium-term development of the disease. The objectives of this thesis revolve around three aspects: the available data on COVID-19, compartmental mathematical models, with particular attention to the SEIJDHR model that includes vaccinations, and the use of physics-informed neural networks (PINNs), a new deep-learning-based approach that brings the first two aspects together. These three aspects are first examined individually in the first three chapters of this work, and the PINNs are then applied to the SEIJDHR model. Finally, the fourth chapter reports relevant fragments of the Python code used and the numerical results obtained. In particular, it presents plots of the short-to-medium-term predictions, obtained by feeding in data on the daily numbers of positive, hospitalized, and deceased cases, first for the city of New York and then for Italy. Moreover, in investigating the predictive part concerning the Italian data, a critical point was identified in the function modeling the hospitalization rate; numerous experiments were therefore carried out to control these predictions.
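For illustration only, the sketch below shows the general PINN recipe on a plain SIR model rather than the SEIJDHR model of the thesis: a network maps time to compartment sizes, and the loss combines the ODE residuals with a data-fitting term; the parameters and the stand-in observations are assumptions.

```python
# Minimal sketch of a physics-informed neural network (PINN) for a
# compartmental epidemic model. For brevity the residual uses a plain SIR
# system instead of SEIJDHR; beta, gamma, and the stand-in data are
# illustrative assumptions.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 3), nn.Softplus())   # outputs S, I, R >= 0

beta, gamma, N = 0.3, 0.1, 1.0
t = torch.linspace(0, 100, 200).reshape(-1, 1).requires_grad_(True)
observed_I = 0.1 * torch.exp(-((t.detach() - 30) / 15) ** 2)  # stand-in data

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    S, I, R = net(t).unbind(dim=1)
    dS = torch.autograd.grad(S.sum(), t, create_graph=True)[0].squeeze()
    dI = torch.autograd.grad(I.sum(), t, create_graph=True)[0].squeeze()
    dR = torch.autograd.grad(R.sum(), t, create_graph=True)[0].squeeze()
    # Residuals of dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I
    physics = ((dS + beta * S * I / N) ** 2 +
               (dI - beta * S * I / N + gamma * I) ** 2 +
               (dR - gamma * I) ** 2).mean()
    data = ((I - observed_I.squeeze()) ** 2).mean()    # fit reported infections
    loss = physics + data
    opt.zero_grad(); loss.backward(); opt.step()
```

In the thesis setting, the data term would instead be fed with the reported daily counts of positive, hospitalized, and deceased cases, and unknown model parameters can be made trainable alongside the network weights.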
Abstract:
In recent years, we have witnessed great changes in the industrial environment as a result of the innovations introduced by Industry 4.0, especially in the integration of the Internet of Things, Automation, and Robotics in the manufacturing field. The project presented in this thesis lies within this innovation context and describes the implementation of an Image Recognition application focused on the automotive field. The project aims at helping the supply chain operator perform an effective and efficient check of the homologation tags present on vehicles. The user's contribution consists of taking a picture of the tag; the application then automatically, by exploiting Amazon Web Services, returns the result of the check regarding the correctness of the tag, its correct positioning within the vehicle, and the presence of faults or defects on the tag. To implement this application we combined two IoT platforms widely used in the industrial field: Amazon Web Services (AWS) and ThingWorx. AWS exploits Convolutional Neural Networks to perform Text Detection and Image Recognition, while PTC ThingWorx manages the user interface and the data manipulation.
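One possible way to realize the text-detection step on AWS is through Amazon Rekognition via boto3, as sketched below; the region, file name, and the overall integration with ThingWorx are placeholders, not the exact setup used in the project.

```python
# Minimal sketch of tag text detection with Amazon Rekognition via boto3.
# Assumes AWS credentials are configured; the region and file name are
# placeholders, and the ThingWorx side of the project is not reproduced here.
import boto3

rekognition = boto3.client("rekognition", region_name="eu-west-1")

def read_tag_text(photo_bytes):
    """Return the text lines Rekognition detects on a homologation tag photo."""
    response = rekognition.detect_text(Image={"Bytes": photo_bytes})
    return [d["DetectedText"] for d in response["TextDetections"]
            if d["Type"] == "LINE"]

with open("tag_photo.jpg", "rb") as f:
    print(read_tag_text(f.read()))
```

The detected lines can then be compared against the expected homologation data to flag incorrect or defective tags.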
Abstract:
Deep Learning has radically transformed the world of Machine Learning, improving the state of the art in several fields ranging from computer vision to natural language processing. Going beyond classification problems, in recent years generative applications have led to the creation of realistic images and literary texts. The world of music has not been exempt from a multitude of experiments in the same vein, with results that are still immature yet potentially interesting. This thesis discusses the application of a model belonging to the Deep Learning family to the generation of symbolic music.
Abstract:
Correctness of information gathered in production environments is an essential part of quality assurance processes in many industries, and this task is often performed by human operators who visually take annotations at various steps of the production flow. Depending on the task performed, the correlation between exactly where the information is gathered and what it represents is more often than not lost in the process. The lack of labeled data places a major constraint on the application of deep neural networks to object detection tasks; moreover, supervised training of deep models requires a large amount of data to be available. Reaching an adequately large collection of labeled images through classic data annotation techniques is an exhausting and costly task, not always suitable for every scenario. A possible solution is to generate synthetic data that replicates the real data and use it to fine-tune a deep neural network trained on one or more source domains to a different target domain. The purpose of this thesis is to present a real case scenario where the provided data were both scarce and missing the required annotations. Subsequently, a possible approach is presented in which synthetic data has been generated to address those issues while serving as a training base for deep neural networks for object detection, capable of working on images taken in production-like environments. Lastly, the thesis compares performance across different types of synthetic data and convolutional neural networks used as backbones for the model.
Abstract:
State-of-the-art NLP systems are generally based on the assumption that the underlying models are provided with vast datasets to train on. However, especially when working in multi-lingual contexts, datasets are often scarce, so more research should be carried out in this field. This thesis investigates the benefits of introducing an additional training step when fine-tuning NLP models, named Intermediate Training, which could be exploited to augment the data used for the training phase. The Intermediate Training step is applied by training models on NLP tasks that are not strictly related to the target task, aiming to verify whether the models are able to leverage the knowledge learned from such tasks. Furthermore, in order to better analyze the synergies between different categories of NLP tasks, the experiments have also been extended to Multi-Task Training, in which the model is trained on multiple tasks at the same time.
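A minimal sketch of the Intermediate Training idea follows: fine-tune a shared encoder on an auxiliary task first, then reuse its weights when fine-tuning on the target task; the toy encoder, heads, and random data stand in for the pre-trained transformer models and real datasets actually used.

```python
# Minimal sketch of Intermediate Training: the shared encoder is first
# fine-tuned on an auxiliary NLP task, then reused for the (possibly
# low-resource) target task. Encoder, heads, and data are toy stand-ins.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())   # stand-in for a PLM
intermediate_head = nn.Linear(128, 3)   # e.g., an auxiliary-task classifier
target_head = nn.Linear(128, 2)         # target-task classifier

def fine_tune(head, features, labels, epochs=3):
    params = list(encoder.parameters()) + list(head.parameters())
    opt = torch.optim.Adam(params, lr=1e-4)
    for _ in range(epochs):
        logits = head(encoder(features))
        loss = nn.functional.cross_entropy(logits, labels)
        opt.zero_grad(); loss.backward(); opt.step()

# 1) intermediate task (abundant data), 2) target task (scarce data)
fine_tune(intermediate_head, torch.randn(512, 300), torch.randint(0, 3, (512,)))
fine_tune(target_head, torch.randn(64, 300), torch.randint(0, 2, (64,)))
```

Multi-Task Training differs in that both heads are optimized jointly against a combined loss instead of sequentially.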
Abstract:
Activation functions within neural networks play a crucial role in Deep Learning, since they allow the network to learn complex and non-trivial patterns in the data. However, the ability to approximate non-linear functions is a significant limitation when implementing neural networks on a quantum computer to solve typical machine learning tasks. The main burden lies in the unitarity constraint of quantum operators, which forbids non-linearity and poses a considerable obstacle to developing such non-linear functions in a quantum setting. Nevertheless, several attempts have been made in the literature to tackle the realization of quantum activation functions. Recently, the idea of QSplines has been proposed to approximate a non-linear activation function by implementing the quantum version of spline functions. Yet, QSplines suffer from various drawbacks. Firstly, the final function estimation requires a post-processing step; thus, the value of the activation function is not directly available as a quantum state. Secondly, QSplines need many error-corrected qubits and very long quantum circuits to be executed. These constraints do not allow the adoption of QSplines on near-term quantum devices and limit their generalization capabilities. This thesis aims to overcome these limitations by leveraging hybrid quantum-classical computation. In particular, a few different methods for Variational Quantum Splines are proposed and implemented, to pave the way for the development of complete quantum activation functions and unlock the full potential of quantum neural networks in the field of quantum machine learning.
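As a drastically simplified, classically simulated toy (not the Variational Quantum Splines proposed in the thesis), the sketch below trains the parameters of a single-qubit circuit RY(wx + b)|0> so that its probability of measuring |1> approximates a sigmoid activation; the ansatz, target function, and optimizer are assumptions.

```python
# Toy illustration of the hybrid quantum-classical idea: a single-qubit
# variational circuit RY(w*x + b)|0> whose probability of measuring |1>
# is fitted to a sigmoid. Classically simulated and drastically simplified;
# not the Variational Quantum Splines of the thesis.
import numpy as np
from scipy.optimize import minimize

xs = np.linspace(-6, 6, 121)
target = 1.0 / (1.0 + np.exp(-xs))            # activation to approximate

def circuit_output(params, x):
    w, b = params
    theta = w * x + b                          # angle encoding of the input
    return np.sin(theta / 2.0) ** 2            # P(|1>) after RY(theta)|0>

def loss(params):
    return np.mean((circuit_output(params, xs) - target) ** 2)

result = minimize(loss, x0=np.array([0.5, 1.5]), method="Nelder-Mead")
print("fitted (w, b):", result.x, "MSE:", result.fun)
```

The appeal of the variational route is that the classical optimizer handles the fitting while the circuit stays shallow, which is what makes near-term devices a plausible target.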
Abstract:
Artificial intelligence is undoubtedly one of the hottest topics in computer science today, constantly evolving and expanding into new sectors. This project work combines the aforementioned topic with the world of social networks, which by now are an integral part of everyone's daily life. The current state of the art of neural networks, in particular generative adversarial networks, is analyzed, and the main types of social networks are examined. On this basis, a complete social network system will be built in which a GAN plays the leading role, exploiting the most interesting technologies currently available. The system will be available both as a mobile application and as a website, and it will introduce gamification elements to increase user interaction.