860 resultados para computer vision,machine learning,centernet,volleyball,sports
Resumo:
Dopo lo sviluppo dei primi casi di Covid-19 in Cina nell’autunno del 2019, ad inizio 2020 l’intero pianeta è precipitato in una pandemia globale che ha stravolto le nostre vite con conseguenze che non si vivevano dall’influenza spagnola. La grandissima quantità di paper scientifici in continua pubblicazione sul coronavirus e virus ad esso affini ha portato alla creazione di un unico dataset dinamico chiamato CORD19 e distribuito gratuitamente. Poter reperire informazioni utili in questa mole di dati ha ulteriormente acceso i riflettori sugli information retrieval systems, capaci di recuperare in maniera rapida ed efficace informazioni preziose rispetto a una domanda dell'utente detta query. Di particolare rilievo è stata la TREC-COVID Challenge, competizione per lo sviluppo di un sistema di IR addestrato e testato sul dataset CORD19. Il problema principale è dato dal fatto che la grande mole di documenti è totalmente non etichettata e risulta dunque impossibile addestrare modelli di reti neurali direttamente su di essi. Per aggirare il problema abbiamo messo a punto nuove soluzioni self-supervised, a cui abbiamo applicato lo stato dell'arte del deep metric learning e dell'NLP. Il deep metric learning, che sta avendo un enorme successo soprattuto nella computer vision, addestra il modello ad "avvicinare" tra loro immagini simili e "allontanare" immagini differenti. Dato che sia le immagini che il testo vengono rappresentati attraverso vettori di numeri reali (embeddings) si possano utilizzare le stesse tecniche per "avvicinare" tra loro elementi testuali pertinenti (e.g. una query e un paragrafo) e "allontanare" elementi non pertinenti. Abbiamo dunque addestrato un modello SciBERT con varie loss, che ad oggi rappresentano lo stato dell'arte del deep metric learning, in maniera completamente self-supervised direttamente e unicamente sul dataset CORD19, valutandolo poi sul set formale TREC-COVID attraverso un sistema di IR e ottenendo risultati interessanti.
Resumo:
Acoustic Emission (AE) monitoring can be used to detect the presence of damage as well as determine its location in Structural Health Monitoring (SHM) applications. Information on the time difference of the signal generated by the damage event arriving at different sensors is essential in performing localization. This makes the time of arrival (ToA) an important piece of information to retrieve from the AE signal. Generally, this is determined using statistical methods such as the Akaike Information Criterion (AIC) which is particularly prone to errors in the presence of noise. And given that the structures of interest are surrounded with harsh environments, a way to accurately estimate the arrival time in such noisy scenarios is of particular interest. In this work, two new methods are presented to estimate the arrival times of AE signals which are based on Machine Learning. Inspired by great results in the field, two models are presented which are Deep Learning models - a subset of machine learning. They are based on Convolutional Neural Network (CNN) and Capsule Neural Network (CapsNet). The primary advantage of such models is that they do not require the user to pre-define selected features but only require raw data to be given and the models establish non-linear relationships between the inputs and outputs. The performance of the models is evaluated using AE signals generated by a custom ray-tracing algorithm by propagating them on an aluminium plate and compared to AIC. It was found that the relative error in estimation on the test set was < 5% for the models compared to around 45% of AIC. The testing process was further continued by preparing an experimental setup and acquiring real AE signals to test on. Similar performances were observed where the two models not only outperform AIC by more than a magnitude in their average errors but also they were shown to be a lot more robust as compared to AIC which fails in the presence of noise.
Resumo:
Reinforcement Learning is an increasingly popular area of Artificial Intelligence. The applications of this learning paradigm are many, but its application in mobile computing is in its infancy. This study aims to provide an overview of current Reinforcement Learning applications on mobile devices, as well as to introduce a new framework for iOS devices: Swift-RL Lib. This new Swift package allows developers to easily support and integrate two of the most common RL algorithms, Q-Learning and Deep Q-Network, in a fully customizable environment. All processes are performed on the device, without any need for remote computation. The framework was tested in different settings and evaluated through several use cases. Through an in-depth performance analysis, we show that the platform provides effective and efficient support for Reinforcement Learning for mobile applications.
Resumo:
Social interactions have been the focus of social science research for a century, but their study has recently been revolutionized by novel data sources and by methods from computer science, network science, and complex systems science. The study of social interactions is crucial for understanding complex societal behaviours. Social interactions are naturally represented as networks, which have emerged as a unifying mathematical language to understand structural and dynamical aspects of socio-technical systems. Networks are, however, highly dimensional objects, especially when considering the scales of real-world systems and the need to model the temporal dimension. Hence the study of empirical data from social systems is challenging both from a conceptual and a computational standpoint. A possible approach to tackling such a challenge is to use dimensionality reduction techniques that represent network entities in a low-dimensional feature space, preserving some desired properties of the original data. Low-dimensional vector space representations, also known as network embeddings, have been extensively studied, also as a way to feed network data to machine learning algorithms. Network embeddings were initially developed for static networks and then extended to incorporate temporal network data. We focus on dimensionality reduction techniques for time-resolved social interaction data modelled as temporal networks. We introduce a novel embedding technique that models the temporal and structural similarities of events rather than nodes. Using empirical data on social interactions, we show that this representation captures information relevant for the study of dynamical processes unfolding over the network, such as epidemic spreading. We then turn to another large-scale dataset on social interactions: a popular Web-based crowdfunding platform. We show that tensor-based representations of the data and dimensionality reduction techniques such as tensor factorization allow us to uncover the structural and temporal aspects of the system and to relate them to geographic and temporal activity patterns.
Resumo:
A densely built environment is a complex system of infrastructure, nature, and people closely interconnected and interacting. Vehicles, public transport, weather action, and sports activities constitute a manifold set of excitation and degradation sources for civil structures. In this context, operators should consider different factors in a holistic approach for assessing the structural health state. Vibration-based structural health monitoring (SHM) has demonstrated great potential as a decision-supporting tool to schedule maintenance interventions. However, most excitation sources are considered an issue for practical SHM applications since traditional methods are typically based on strict assumptions on input stationarity. Last-generation low-cost sensors present limitations related to a modest sensitivity and high noise floor compared to traditional instrumentation. If these devices are used for SHM in urban scenarios, short vibration recordings collected during high-intensity events and vehicle passage may be the only available datasets with a sufficient signal-to-noise ratio. While researchers have spent efforts to mitigate the effects of short-term phenomena in vibration-based SHM, the ultimate goal of this thesis is to exploit them and obtain valuable information on the structural health state. First, this thesis proposes strategies and algorithms for smart sensors operating individually or in a distributed computing framework to identify damage-sensitive features based on instantaneous modal parameters and influence lines. Ordinary traffic and people activities become essential sources of excitation, while human-powered vehicles, instrumented with smartphones, take the role of roving sensors in crowdsourced monitoring strategies. The technical and computational apparatus is optimized using in-memory computing technologies. Moreover, identifying additional local features can be particularly useful to support the damage assessment of complex structures. Thereby, smart coatings are studied to enable the self-sensing properties of ordinary structural elements. In this context, a machine-learning-aided tomography method is proposed to interpret the data provided by a nanocomposite paint interrogated electrically.
Resumo:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility to merge machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code where programming languages are treated instead of natural languages while different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications which can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: the Programming Language Identification (PLI) and the Software Defect Prediction (SDP) for source codes. Programming language identification is commonly needed in program comprehension and it is usually performed directly by developers. However, when it comes at big scales, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text and image-based learning approaches) and different models are created paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve related procedures. Here, two models have been built and analyzed to detect some of the commonest bugs and errors at different code granularity levels (file and method levels). Exploited data and models’ architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both PLI and SDP tasks while differences and similarities concerning other related works are discussed.
Resumo:
Deep learning methods are extremely promising machine learning tools to analyze neuroimaging data. However, their potential use in clinical settings is limited because of the existing challenges of applying these methods to neuroimaging data. In this study, first a data leakage type caused by slice-level data split that is introduced during training and validation of a 2D CNN is surveyed and a quantitative assessment of the model’s performance overestimation is presented. Second, an interpretable, leakage-fee deep learning software written in a python language with a wide range of options has been developed to conduct both classification and regression analysis. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data where the cognitive performance of 58 patients measured by five neuropsychological tests is predicted using a multi-input CNN model taking brain image and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, DTI-derived features MD and FA produced the best prediction outcome of the TMT-A score which is consistent with the existing literature. In a second study, an interpretable deep learning system aimed at 1) classifying Alzheimer disease and healthy subjects 2) examining the neural correlates of the disease that causes a cognitive decline in AD patients using CNN visualization tools and 3) highlighting the potential of interpretability techniques to capture a biased deep learning model is developed. Structural magnetic resonance imaging (MRI) data of 200 subjects was used by the proposed CNN model which was trained using a transfer learning-based approach producing a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobe showing the cerebral cortex atrophy were highlighted by the visualization tools.
Resumo:
The study of ancient, undeciphered scripts presents unique challenges, that depend both on the nature of the problem and on the peculiarities of each writing system. In this thesis, I present two computational approaches that are tailored to two different tasks and writing systems. The first of these methods is aimed at the decipherment of the Linear A afraction signs, in order to discover their numerical values. This is achieved with a combination of constraint programming, ad-hoc metrics and paleographic considerations. The second main contribution of this thesis regards the creation of an unsupervised deep learning model which uses drawings of signs from ancient writing system to learn to distinguish different graphemes in the vector space. This system, which is based on techniques used in the field of computer vision, is adapted to the study of ancient writing systems by incorporating information about sequences in the model, mirroring what is often done in natural language processing. In order to develop this model, the Cypriot Greek Syllabary is used as a target, since this is a deciphered writing system. Finally, this unsupervised model is adapted to the undeciphered Cypro-Minoan and it is used to answer open questions about this script. In particular, by reconstructing multiple allographs that are not agreed upon by paleographers, it supports the idea that Cypro-Minoan is a single script and not a collection of three script like it was proposed in the literature. These results on two different tasks shows that computational methods can be applied to undeciphered scripts, despite the relatively low amount of available data, paving the way for further advancement in paleography using these methods.
Resumo:
The discovery of new materials and their functions has always been a fundamental component of technological progress. Nowadays, the quest for new materials is stronger than ever: sustainability, medicine, robotics and electronics are all key assets which depend on the ability to create specifically tailored materials. However, designing materials with desired properties is a difficult task, and the complexity of the discipline makes it difficult to identify general criteria. While scientists developed a set of best practices (often based on experience and expertise), this is still a trial-and-error process. This becomes even more complex when dealing with advanced functional materials. Their properties depend on structural and morphological features, which in turn depend on fabrication procedures and environment, and subtle alterations leads to dramatically different results. Because of this, materials modeling and design is one of the most prolific research fields. Many techniques and instruments are continuously developed to enable new possibilities, both in the experimental and computational realms. Scientists strive to enforce cutting-edge technologies in order to make progress. However, the field is strongly affected by unorganized file management, proliferation of custom data formats and storage procedures, both in experimental and computational research. Results are difficult to find, interpret and re-use, and a huge amount of time is spent interpreting and re-organizing data. This also strongly limit the application of data-driven and machine learning techniques. This work introduces possible solutions to the problems described above. Specifically, it talks about developing features for specific classes of advanced materials and use them to train machine learning models and accelerate computational predictions for molecular compounds; developing method for organizing non homogeneous materials data; automate the process of using devices simulations to train machine learning models; dealing with scattered experimental data and use them to discover new patterns.
Resumo:
Machine Learning makes computers capable of performing tasks typically requiring human intelligence. A domain where it is having a considerable impact is the life sciences, allowing to devise new biological analysis protocols, develop patients’ treatments efficiently and faster, and reduce healthcare costs. This Thesis work presents new Machine Learning methods and pipelines for the life sciences focusing on the unsupervised field. At a methodological level, two methods are presented. The first is an “Ab Initio Local Principal Path” and it is a revised and improved version of a pre-existing algorithm in the manifold learning realm. The second contribution is an improvement over the Import Vector Domain Description (one-class learning) through the Kullback-Leibler divergence. It hybridizes kernel methods to Deep Learning obtaining a scalable solution, an improved probabilistic model, and state-of-the-art performances. Both methods are tested through several experiments, with a central focus on their relevance in life sciences. Results show that they improve the performances achieved by their previous versions. At the applicative level, two pipelines are presented. The first one is for the analysis of RNA-Seq datasets, both transcriptomic and single-cell data, and is aimed at identifying genes that may be involved in biological processes (e.g., the transition of tissues from normal to cancer). In this project, an R package is released on CRAN to make the pipeline accessible to the bioinformatic Community through high-level APIs. The second pipeline is in the drug discovery domain and is useful for identifying druggable pockets, namely regions of a protein with a high probability of accepting a small molecule (a drug). Both these pipelines achieve remarkable results. Lastly, a detour application is developed to identify the strengths/limitations of the “Principal Path” algorithm by analyzing Convolutional Neural Networks induced vector spaces. This application is conducted in the music and visual arts domains.
Resumo:
Riding the wave of recent groundbreaking achievements, artificial intelligence (AI) is currently the buzzword on everybody’s lips and, allowing algorithms to learn from historical data, Machine Learning (ML) emerged as its pinnacle. The multitude of algorithms, each with unique strengths and weaknesses, highlights the absence of a universal solution and poses a challenging optimization problem. In response, automated machine learning (AutoML) navigates vast search spaces within minimal time constraints. By lowering entry barriers, AutoML emerged as promising the democratization of AI, yet facing some challenges. In data-centric AI, the discipline of systematically engineering data used to build an AI system, the challenge of configuring data pipelines is rather simple. We devise a methodology for building effective data pre-processing pipelines in supervised learning as well as a data-centric AutoML solution for unsupervised learning. In human-centric AI, many current AutoML tools were not built around the user but rather around algorithmic ideas, raising ethical and social bias concerns. We contribute by deploying AutoML tools aiming at complementing, instead of replacing, human intelligence. In particular, we provide solutions for single-objective and multi-objective optimization and showcase the challenges and potential of novel interfaces featuring large language models. Finally, there are application areas that rely on numerical simulators, often related to earth observations, they tend to be particularly high-impact and address important challenges such as climate change and crop life cycles. We commit to coupling these physical simulators with (Auto)ML solutions towards a physics-aware AI. Specifically, in precision farming, we design a smart irrigation platform that: allows real-time monitoring of soil moisture, predicts future moisture values, and estimates water demand to schedule the irrigation.
Resumo:
Trying to explain to a robot what to do is a difficult undertaking, and only specific types of people have been able to do so far, such as programmers or operators who have learned how to use controllers to communicate with a robot. My internship's goal was to create and develop a framework that would make that easier. The system uses deep learning techniques to recognize a set of hand gestures, both static and dynamic. Then, based on the gesture, it sends a command to a robot. To be as generic as feasible, the communication is implemented using Robot Operating System (ROS). Furthermore, users can add new recognizable gestures and link them to new robot actions; a finite state automaton enforces the users' input verification and correct action sequence. Finally, the users can create and utilize a macro to describe a sequence of actions performable by a robot.
Resumo:
Correctness of information gathered in production environments is an essential part of quality assurance processes in many industries, this task is often performed by human resources who visually take annotations in various steps of the production flow. Depending on the performed task the correlation between where exactly the information is gathered and what it represents is more than often lost in the process. The lack of labeled data places a great boundary on the application of deep neural networks aimed at object detection tasks, moreover supervised training of deep models requires a great amount of data to be available. Reaching an adequate large collection of labeled images through classic techniques of data annotations is an exhausting and costly task to perform, not always suitable for every scenario. A possible solution is to generate synthetic data that replicates the real one and use it to fine-tune a deep neural network trained on one or more source domains to a different target domain. The purpose of this thesis is to show a real case scenario where the provided data were both in great scarcity and missing the required annotations. Sequentially a possible approach is presented where synthetic data has been generated to address those issues while standing as a training base of deep neural networks for object detection, capable of working on images taken in production-like environments. Lastly, it compares performance on different types of synthetic data and convolutional neural networks used as backbones for the model.
Resumo:
Il ruolo dell’informatica è diventato chiave del funzionamento del mondo moderno, ormai sempre più in progressiva digitalizzazione di ogni singolo aspetto della vita dell’individuo. Con l’aumentare della complessità e delle dimensioni dei programmi, il rilevamento di errori diventa sempre di più un’attività difficile e che necessita l’impiego di tempo e risorse. Meccanismi di analisi del codice sorgente tradizionali sono esistiti fin dalla nascita dell’informatica stessa e il loro ruolo all’interno della catena produttiva di un team di programmatori non è mai stato cosi fondamentale come lo è tuttora. Questi meccanismi di analisi, però, non sono esenti da problematiche: il tempo di esecuzione su progetti di grandi dimensioni e la percentuale di falsi positivi possono, infatti, diventare un importante problema. Per questi motivi, meccanismi fondati su Machine Learning, e più in particolare Deep Learning, sono stati sviluppati negli ultimi anni. Questo lavoro di tesi si pone l’obbiettivo di esplorare e sviluppare un modello di Deep Learning per il riconoscimento di errori in un qualsiasi file sorgente scritto in linguaggio C e C++.
Resumo:
Activation functions within neural networks play a crucial role in Deep Learning since they allow to learn complex and non-trivial patterns in the data. However, the ability to approximate non-linear functions is a significant limitation when implementing neural networks in a quantum computer to solve typical machine learning tasks. The main burden lies in the unitarity constraint of quantum operators, which forbids non-linearity and poses a considerable obstacle to developing such non-linear functions in a quantum setting. Nevertheless, several attempts have been made to tackle the realization of the quantum activation function in the literature. Recently, the idea of the QSplines has been proposed to approximate a non-linear activation function by implementing the quantum version of the spline functions. Yet, QSplines suffers from various drawbacks. Firstly, the final function estimation requires a post-processing step; thus, the value of the activation function is not available directly as a quantum state. Secondly, QSplines need many error-corrected qubits and a very long quantum circuits to be executed. These constraints do not allow the adoption of the QSplines on near-term quantum devices and limit their generalization capabilities. This thesis aims to overcome these limitations by leveraging hybrid quantum-classical computation. In particular, a few different methods for Variational Quantum Splines are proposed and implemented, to pave the way for the development of complete quantum activation functions and unlock the full potential of quantum neural networks in the field of quantum machine learning.