11 resultados para Natural language processing (Computer science) -- TFC
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
In the last decades, Artificial Intelligence has witnessed multiple breakthroughs in deep learning. In particular, purely data-driven approaches have opened to a wide variety of successful applications due to the large availability of data. Nonetheless, the integration of prior knowledge is still required to compensate for specific issues like lack of generalization from limited data, fairness, robustness, and biases. In this thesis, we analyze the methodology of integrating knowledge into deep learning models in the field of Natural Language Processing (NLP). We start by remarking on the importance of knowledge integration. We highlight the possible shortcomings of these approaches and investigate the implications of integrating unstructured textual knowledge. We introduce Unstructured Knowledge Integration (UKI) as the process of integrating unstructured knowledge into machine learning models. We discuss UKI in the field of NLP, where knowledge is represented in a natural language format. We identify UKI as a complex process comprised of multiple sub-processes, different knowledge types, and knowledge integration properties to guarantee. We remark on the challenges of integrating unstructured textual knowledge and bridge connections with well-known research areas in NLP. We provide a unified vision of structured knowledge extraction (KE) and UKI by identifying KE as a sub-process of UKI. We investigate some challenging scenarios where structured knowledge is not a feasible prior assumption and formulate each task from the point of view of UKI. We adopt simple yet effective neural architectures and discuss the challenges of such an approach. Finally, we identify KE as a form of symbolic representation. From this perspective, we remark on the need of defining sophisticated UKI processes to verify the validity of knowledge integration. To this end, we foresee frameworks capable of combining symbolic and sub-symbolic representations for learning as a solution.
Resumo:
One of the most visionary goals of Artificial Intelligence is to create a system able to mimic and eventually surpass the intelligence observed in biological systems including, ambitiously, the one observed in humans. The main distinctive strength of humans is their ability to build a deep understanding of the world by learning continuously and drawing from their experiences. This ability, which is found in various degrees in all intelligent biological beings, allows them to adapt and properly react to changes by incrementally expanding and refining their knowledge. Arguably, achieving this ability is one of the main goals of Artificial Intelligence and a cornerstone towards the creation of intelligent artificial agents. Modern Deep Learning approaches allowed researchers and industries to achieve great advancements towards the resolution of many long-standing problems in areas like Computer Vision and Natural Language Processing. However, while this current age of renewed interest in AI allowed for the creation of extremely useful applications, a concerningly limited effort is being directed towards the design of systems able to learn continuously. The biggest problem that hinders an AI system from learning incrementally is the catastrophic forgetting phenomenon. This phenomenon, which was discovered in the 90s, naturally occurs in Deep Learning architectures where classic learning paradigms are applied when learning incrementally from a stream of experiences. This dissertation revolves around the Continual Learning field, a sub-field of Machine Learning research that has recently made a comeback following the renewed interest in Deep Learning approaches. This work will focus on a comprehensive view of continual learning by considering algorithmic, benchmarking, and applicative aspects of this field. This dissertation will also touch on community aspects such as the design and creation of research tools aimed at supporting Continual Learning research, and the theoretical and practical aspects concerning public competitions in this field.
Resumo:
Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this purpose, decentralized learning decouples model training from the need of directly accessing raw data, by alternating on-device training and periodic communications. The ability of distilling knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis proposes an extensive overview of decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it on federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state-of-the-art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a mean for optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision on relevant future research directions close the manuscript.
Resumo:
The study of ancient, undeciphered scripts presents unique challenges, that depend both on the nature of the problem and on the peculiarities of each writing system. In this thesis, I present two computational approaches that are tailored to two different tasks and writing systems. The first of these methods is aimed at the decipherment of the Linear A afraction signs, in order to discover their numerical values. This is achieved with a combination of constraint programming, ad-hoc metrics and paleographic considerations. The second main contribution of this thesis regards the creation of an unsupervised deep learning model which uses drawings of signs from ancient writing system to learn to distinguish different graphemes in the vector space. This system, which is based on techniques used in the field of computer vision, is adapted to the study of ancient writing systems by incorporating information about sequences in the model, mirroring what is often done in natural language processing. In order to develop this model, the Cypriot Greek Syllabary is used as a target, since this is a deciphered writing system. Finally, this unsupervised model is adapted to the undeciphered Cypro-Minoan and it is used to answer open questions about this script. In particular, by reconstructing multiple allographs that are not agreed upon by paleographers, it supports the idea that Cypro-Minoan is a single script and not a collection of three script like it was proposed in the literature. These results on two different tasks shows that computational methods can be applied to undeciphered scripts, despite the relatively low amount of available data, paving the way for further advancement in paleography using these methods.
Resumo:
The rapid progression of biomedical research coupled with the explosion of scientific literature has generated an exigent need for efficient and reliable systems of knowledge extraction. This dissertation contends with this challenge through a concentrated investigation of digital health, Artificial Intelligence, and specifically Machine Learning and Natural Language Processing's (NLP) potential to expedite systematic literature reviews and refine the knowledge extraction process. The surge of COVID-19 complicated the efforts of scientists, policymakers, and medical professionals in identifying pertinent articles and assessing their scientific validity. This thesis presents a substantial solution in the form of the COKE Project, an initiative that interlaces machine reading with the rigorous protocols of Evidence-Based Medicine to streamline knowledge extraction. In the framework of the COKE (“COVID-19 Knowledge Extraction framework for next-generation discovery science”) Project, this thesis aims to underscore the capacity of machine reading to create knowledge graphs from scientific texts. The project is remarkable for its innovative use of NLP techniques such as a BERT + bi-LSTM language model. This combination is employed to detect and categorize elements within medical abstracts, thereby enhancing the systematic literature review process. The COKE project's outcomes show that NLP, when used in a judiciously structured manner, can significantly reduce the time and effort required to produce medical guidelines. These findings are particularly salient during times of medical emergency, like the COVID-19 pandemic, when quick and accurate research results are critical.
Resumo:
Using Big Data and Natural Language Processing (NLP) tools, this dissertation investigates the narrative strategies that atypical actors can leverage to deal with the adverse reactions they often elicit. Extensive research shows that atypical actors, those who fail to abide by established contextual standards and norms, are subject to skepticism and face a higher risk of rejection. Indeed, atypical actors combine features and behaviors in unconventional ways, thereby generating confusion in the audience and instilling doubts about their propositions' legitimacy. However, the same atypicality is often cited as the precursor to socio-cultural innovation and a strategic act to expand the capacity for delivering valued goods and services. Contextualizing the conditions under which atypicality is celebrated or punished has been a significant theoretical challenge for scholars interested in reconciling this tension. Nevertheless, prior work has focused on audience side factors or on actor-side characteristics that are only scantily under an actor's control (e.g., status and reputation). This dissertation demonstrates that atypical actors can use strategically crafted narratives to mitigate against the audience’s negative response. In particular, when atypical actors evoke conventional features in their story, they are more likely to overcome the illegitimacy discount usually applied to them. Moreover, narratives become successful navigational devices for atypicality when atypical actors use a more abstract language. This simplifies classification and provides the audience with more flexibility to interpret and understand them.
Resumo:
The treatment of the Cerebral Palsy (CP) is considered as the “core problem” for the whole field of the pediatric rehabilitation. The reason why this pathology has such a primary role, can be ascribed to two main aspects. First of all CP is the form of disability most frequent in childhood (one new case per 500 birth alive, (1)), secondarily the functional recovery of the “spastic” child is, historically, the clinical field in which the majority of the therapeutic methods and techniques (physiotherapy, orthotic, pharmacologic, orthopedic-surgical, neurosurgical) were first applied and tested. The currently accepted definition of CP – Group of disorders of the development of movement and posture causing activity limitation (2) – is the result of a recent update by the World Health Organization to the language of the International Classification of Functioning Disability and Health, from the original proposal of Ingram – A persistent but not unchangeable disorder of posture and movement – dated 1955 (3). This definition considers CP as a permanent ailment, i.e. a “fixed” condition, that however can be modified both functionally and structurally by means of child spontaneous evolution and treatments carried out during childhood. The lesion that causes the palsy, happens in a structurally immature brain in the pre-, peri- or post-birth period (but only during the firsts months of life). The most frequent causes of CP are: prematurity, insufficient cerebral perfusion, arterial haemorrhage, venous infarction, hypoxia caused by various origin (for example from the ingestion of amniotic liquid), malnutrition, infection and maternal or fetal poisoning. In addition to these causes, traumas and malformations have to be included. The lesion, whether focused or spread over the nervous system, impairs the whole functioning of the Central Nervous System (CNS). As a consequence, they affect the construction of the adaptive functions (4), first of all posture control, locomotion and manipulation. The palsy itself does not vary over time, however it assumes an unavoidable “evolutionary” feature when during growth the child is requested to meet new and different needs through the construction of new and different functions. It is essential to consider that clinically CP is not only a direct expression of structural impairment, that is of etiology, pathogenesis and lesion timing, but it is mainly the manifestation of the path followed by the CNS to “re”-construct the adaptive functions “despite” the presence of the damage. “Palsy” is “the form of the function that is implemented by an individual whose CNS has been damaged in order to satisfy the demands coming from the environment” (4). Therefore it is only possible to establish general relations between lesion site, nature and size, and palsy and recovery processes. It is quite common to observe that children with very similar neuroimaging can have very different clinical manifestations of CP and, on the other hand, children with very similar motor behaviors can have completely different lesion histories. A very clear example of this is represented by hemiplegic forms, which show bilateral hemispheric lesions in a high percentage of cases. The first section of this thesis is aimed at guiding the interpretation of CP. First of all the issue of the detection of the palsy is treated from historical viewpoint. Consequently, an extended analysis of the current definition of CP, as internationally accepted, is provided. The definition is then outlined in terms of a space dimension and then of a time dimension, hence it is highlighted where this definition is unacceptably lacking. The last part of the first section further stresses the importance of shifting from the traditional concept of CP as a palsy of development (defect analysis) towards the notion of development of palsy, i.e., as the product of the relationship that the individual however tries to dynamically build with the surrounding environment (resource semeiotics) starting and growing from a different availability of resources, needs, dreams, rights and duties (4). In the scientific and clinic community no common classification system of CP has so far been universally accepted. Besides, no standard operative method or technique have been acknowledged to effectively assess the different disabilities and impairments exhibited by children with CP. CP is still “an artificial concept, comprising several causes and clinical syndromes that have been grouped together for a convenience of management” (5). The lack of standard and common protocols able to effectively diagnose the palsy, and as a consequence to establish specific treatments and prognosis, is mainly because of the difficulty to elevate this field to a level based on scientific evidence. A solution aimed at overcoming the current incomplete treatment of CP children is represented by the clinical systematic adoption of objective tools able to measure motor defects and movement impairments. A widespread application of reliable instruments and techniques able to objectively evaluate both the form of the palsy (diagnosis) and the efficacy of the treatments provided (prognosis), constitutes a valuable method able to validate care protocols, establish the efficacy of classification systems and assess the validity of definitions. Since the ‘80s, instruments specifically oriented to the analysis of the human movement have been advantageously designed and applied in the context of CP with the aim of measuring motor deficits and, especially, gait deviations. The gait analysis (GA) technique has been increasingly used over the years to assess, analyze, classify, and support the process of clinical decisions making, allowing for a complete investigation of gait with an increased temporal and spatial resolution. GA has provided a basis for improving the outcome of surgical and nonsurgical treatments and for introducing a new modus operandi in the identification of defects and functional adaptations to the musculoskeletal disorders. Historically, the first laboratories set up for gait analysis developed their own protocol (set of procedures for data collection and for data reduction) independently, according to performances of the technologies available at that time. In particular, the stereophotogrammetric systems mainly based on optoelectronic technology, soon became a gold-standard for motion analysis. They have been successfully applied especially for scientific purposes. Nowadays the optoelectronic systems have significantly improved their performances in term of spatial and temporal resolution, however many laboratories continue to use the protocols designed on the technology available in the ‘70s and now out-of-date. Furthermore, these protocols are not coherent both for the biomechanical models and for the adopted collection procedures. In spite of these differences, GA data are shared, exchanged and interpreted irrespectively to the adopted protocol without a full awareness to what extent these protocols are compatible and comparable with each other. Following the extraordinary advances in computer science and electronics, new systems for GA no longer based on optoelectronic technology, are now becoming available. They are the Inertial and Magnetic Measurement Systems (IMMSs), based on miniature MEMS (Microelectromechanical systems) inertial sensor technology. These systems are cost effective, wearable and fully portable motion analysis systems, these features gives IMMSs the potential to be used both outside specialized laboratories and to consecutive collect series of tens of gait cycles. The recognition and selection of the most representative gait cycle is then easier and more reliable especially in CP children, considering their relevant gait cycle variability. The second section of this thesis is focused on GA. In particular, it is firstly aimed at examining the differences among five most representative GA protocols in order to assess the state of the art with respect to the inter-protocol variability. The design of a new protocol is then proposed and presented with the aim of achieving gait analysis on CP children by means of IMMS. The protocol, named ‘Outwalk’, contains original and innovative solutions oriented at obtaining joint kinematic with calibration procedures extremely comfortable for the patients. The results of a first in-vivo validation of Outwalk on healthy subjects are then provided. In particular, this study was carried out by comparing Outwalk used in combination with an IMMS with respect to a reference protocol and an optoelectronic system. In order to set a more accurate and precise comparison of the systems and the protocols, ad hoc methods were designed and an original formulation of the statistical parameter coefficient of multiple correlation was developed and effectively applied. On the basis of the experimental design proposed for the validation on healthy subjects, a first assessment of Outwalk, together with an IMMS, was also carried out on CP children. The third section of this thesis is dedicated to the treatment of walking in CP children. Commonly prescribed treatments in addressing gait abnormalities in CP children include physical therapy, surgery (orthopedic and rhizotomy), and orthoses. The orthotic approach is conservative, being reversible, and widespread in many therapeutic regimes. Orthoses are used to improve the gait of children with CP, by preventing deformities, controlling joint position, and offering an effective lever for the ankle joint. Orthoses are prescribed for the additional aims of increasing walking speed, improving stability, preventing stumbling, and decreasing muscular fatigue. The ankle-foot orthosis (AFO), with a rigid ankle, are primarily designed to prevent equinus and other foot deformities with a positive effect also on more proximal joints. However, AFOs prevent the natural excursion of the tibio-tarsic joint during the second rocker, hence hampering the natural leaning progression of the whole body under the effect of the inertia (6). A new modular (submalleolar) astragalus-calcanear orthosis, named OMAC, has recently been proposed with the intention of substituting the prescription of AFOs in those CP children exhibiting a flat and valgus-pronated foot. The aim of this section is thus to present the mechanical and technical features of the OMAC by means of an accurate description of the device. In particular, the integral document of the deposited Italian patent, is provided. A preliminary validation of OMAC with respect to AFO is also reported as resulted from an experimental campaign on diplegic CP children, during a three month period, aimed at quantitatively assessing the benefit provided by the two orthoses on walking and at qualitatively evaluating the changes in the quality of life and motor abilities. As already stated, CP is universally considered as a persistent but not unchangeable disorder of posture and movement. Conversely to this definition, some clinicians (4) have recently pointed out that movement disorders may be primarily caused by the presence of perceptive disorders, where perception is not merely the acquisition of sensory information, but an active process aimed at guiding the execution of movements through the integration of sensory information properly representing the state of one’s body and of the environment. Children with perceptive impairments show an overall fear of moving and the onset of strongly unnatural walking schemes directly caused by the presence of perceptive system disorders. The fourth section of the thesis thus deals with accurately defining the perceptive impairment exhibited by diplegic CP children. A detailed description of the clinical signs revealing the presence of the perceptive impairment, and a classification scheme of the clinical aspects of perceptual disorders is provided. In the end, a functional reaching test is proposed as an instrumental test able to disclosure the perceptive impairment. References 1. Prevalence and characteristics of children with cerebral palsy in Europe. Dev Med Child Neurol. 2002 Set;44(9):633-640. 2. Bax M, Goldstein M, Rosenbaum P, Leviton A, Paneth N, Dan B, et al. Proposed definition and classification of cerebral palsy, April 2005. Dev Med Child Neurol. 2005 Ago;47(8):571-576. 3. Ingram TT. A study of cerebral palsy in the childhood population of Edinburgh. Arch. Dis. Child. 1955 Apr;30(150):85-98. 4. Ferrari A, Cioni G. The spastic forms of cerebral palsy : a guide to the assessment of adaptive functions. Milan: Springer; 2009. 5. Olney SJ, Wright MJ. Cerebral Palsy. Campbell S et al. Physical Therapy for Children. 2nd Ed. Philadelphia: Saunders. 2000;:533-570. 6. Desloovere K, Molenaers G, Van Gestel L, Huenaerts C, Van Campenhout A, Callewaert B, et al. How can push-off be preserved during use of an ankle foot orthosis in children with hemiplegia? A prospective controlled study. Gait Posture. 2006 Ott;24(2):142-151.
Resumo:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility to merge machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code where programming languages are treated instead of natural languages while different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications which can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: the Programming Language Identification (PLI) and the Software Defect Prediction (SDP) for source codes. Programming language identification is commonly needed in program comprehension and it is usually performed directly by developers. However, when it comes at big scales, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text and image-based learning approaches) and different models are created paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve related procedures. Here, two models have been built and analyzed to detect some of the commonest bugs and errors at different code granularity levels (file and method levels). Exploited data and models’ architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both PLI and SDP tasks while differences and similarities concerning other related works are discussed.
Resumo:
In the framework of industrial problems, the application of Constrained Optimization is known to have overall very good modeling capability and performance and stands as one of the most powerful, explored, and exploited tool to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success is to be found in the remarkable effort put in the last decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, the technological advancements lead to a data wealth never seen before and increasingly push towards methods able to extract useful knowledge from them; among the data-driven methods, Machine Learning techniques appear to be one of the most promising, thanks to its successes in domains like Image Recognition, Natural Language Processes and playing games, but also the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs and constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce a novel and general algorithm devised to inject knowledge into learning models through constraints, Moving Target. In the last part of the thesis, two applications stemming from real-world projects and done in collaboration with Optit will be presented.
Resumo:
Neural representations (NR) have emerged in the last few years as a powerful tool to represent signals from several domains, such as images, 3D shapes, or audio. Indeed, deep neural networks have been shown capable of approximating continuous functions that describe a given signal with theoretical infinite resolution. This finding allows obtaining representations whose memory footprint is fixed and decoupled from the resolution at which the underlying signal can be sampled, something that is not possible with traditional discrete representations, e.g., grids of pixels for images or voxels for 3D shapes. During the last two years, many techniques have been proposed to improve the capability of NR to approximate high-frequency details and to make the optimization procedures required to obtain NR less demanding both in terms of time and data requirements, motivating many researchers to deploy NR as the main form of data representation for complex pipelines. Following this line of research, we first show that NR can approximate precisely Unsigned Distance Functions, providing an effective way to represent garments that feature open 3D surfaces and unknown topology. Then, we present a pipeline to obtain in a few minutes a compact Neural Twin® for a given object, by exploiting the recent advances in modeling neural radiance fields. Furthermore, we move a step in the direction of adopting NR as a standalone representation, by considering the possibility of performing downstream tasks by processing directly the NR weights. We first show that deep neural networks can be compressed into compact latent codes. Then, we show how this technique can be exploited to perform deep learning on implicit neural representations (INR) of 3D shapes, by only looking at the weights of the networks.
Resumo:
Riding the wave of recent groundbreaking achievements, artificial intelligence (AI) is currently the buzzword on everybody’s lips and, allowing algorithms to learn from historical data, Machine Learning (ML) emerged as its pinnacle. The multitude of algorithms, each with unique strengths and weaknesses, highlights the absence of a universal solution and poses a challenging optimization problem. In response, automated machine learning (AutoML) navigates vast search spaces within minimal time constraints. By lowering entry barriers, AutoML emerged as promising the democratization of AI, yet facing some challenges. In data-centric AI, the discipline of systematically engineering data used to build an AI system, the challenge of configuring data pipelines is rather simple. We devise a methodology for building effective data pre-processing pipelines in supervised learning as well as a data-centric AutoML solution for unsupervised learning. In human-centric AI, many current AutoML tools were not built around the user but rather around algorithmic ideas, raising ethical and social bias concerns. We contribute by deploying AutoML tools aiming at complementing, instead of replacing, human intelligence. In particular, we provide solutions for single-objective and multi-objective optimization and showcase the challenges and potential of novel interfaces featuring large language models. Finally, there are application areas that rely on numerical simulators, often related to earth observations, they tend to be particularly high-impact and address important challenges such as climate change and crop life cycles. We commit to coupling these physical simulators with (Auto)ML solutions towards a physics-aware AI. Specifically, in precision farming, we design a smart irrigation platform that: allows real-time monitoring of soil moisture, predicts future moisture values, and estimates water demand to schedule the irrigation.