11 resultados para Language Learning

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Les recherches relatives à l'utilisation des TICE se concentrent fréquemment soit sur la dimension cognitive, sur la dimension linguistique ou sur la dimension culturelle. Le plus souvent, les recherches empiriques se proposent d'évaluer les effets directs des TICE sur les performances langagières des apprenants. En outre, les recherches, surtout en psychologie cognitive, sont le plus souvent effectuées en laboratoire. C'est pourquoi le travail présenté dans cette thèse se propose d'inscrire l'utilisation des TICE dans une perspective écologique, et de proposer une approche intégrée pour l'analyse des pratiques effectives aussi bien en didactique des langues qu'en didactique de la traduction. En ce qui concerne les aspects cognitifs, nous recourons à un concept apprécié des praticiens, celui de stratégies d'apprentissage. Les quatre premiers chapitres de la présente thèse sont consacrés à l'élaboration du cadre théorique dans lequel nous inscrivons notre recherche. Nous aborderons en premier lieu les aspects disciplinaires, et notamment l’interdisciplinarité de nos deux champs de référence. Ensuite nous traiterons les stratégies d'apprentissage et les stratégies de traduction. Dans un troisième mouvement, nous nous efforcerons de définir les deux compétences visées par notre recherche : la production écrite et la traduction. Dans un quatrième temps, nous nous intéresserons aux modifications introduites par les TICE dans les pratiques d'enseignement et d'apprentissage de ces deux compétences. Le cinquième chapitre a pour objet la présentation, l'analyse des données recueillies auprès de groupes d'enseignants et d'étudiants de la section de français de la SSLMIT. Il s’agira dans un premier temps, de présenter notre corpus. Ensuite nous procéderons à l’analyse des données. Enfin, nous présenterons, après une synthèse globale, des pistes didactiques et scientifiques à même de prolonger notre travail.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the last decades, Artificial Intelligence has witnessed multiple breakthroughs in deep learning. In particular, purely data-driven approaches have opened to a wide variety of successful applications due to the large availability of data. Nonetheless, the integration of prior knowledge is still required to compensate for specific issues like lack of generalization from limited data, fairness, robustness, and biases. In this thesis, we analyze the methodology of integrating knowledge into deep learning models in the field of Natural Language Processing (NLP). We start by remarking on the importance of knowledge integration. We highlight the possible shortcomings of these approaches and investigate the implications of integrating unstructured textual knowledge. We introduce Unstructured Knowledge Integration (UKI) as the process of integrating unstructured knowledge into machine learning models. We discuss UKI in the field of NLP, where knowledge is represented in a natural language format. We identify UKI as a complex process comprised of multiple sub-processes, different knowledge types, and knowledge integration properties to guarantee. We remark on the challenges of integrating unstructured textual knowledge and bridge connections with well-known research areas in NLP. We provide a unified vision of structured knowledge extraction (KE) and UKI by identifying KE as a sub-process of UKI. We investigate some challenging scenarios where structured knowledge is not a feasible prior assumption and formulate each task from the point of view of UKI. We adopt simple yet effective neural architectures and discuss the challenges of such an approach. Finally, we identify KE as a form of symbolic representation. From this perspective, we remark on the need of defining sophisticated UKI processes to verify the validity of knowledge integration. To this end, we foresee frameworks capable of combining symbolic and sub-symbolic representations for learning as a solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the framework of industrial problems, the application of Constrained Optimization is known to have overall very good modeling capability and performance and stands as one of the most powerful, explored, and exploited tool to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success is to be found in the remarkable effort put in the last decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, the technological advancements lead to a data wealth never seen before and increasingly push towards methods able to extract useful knowledge from them; among the data-driven methods, Machine Learning techniques appear to be one of the most promising, thanks to its successes in domains like Image Recognition, Natural Language Processes and playing games, but also the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs and constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce a novel and general algorithm devised to inject knowledge into learning models through constraints, Moving Target. In the last part of the thesis, two applications stemming from real-world projects and done in collaboration with Optit will be presented.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the most visionary goals of Artificial Intelligence is to create a system able to mimic and eventually surpass the intelligence observed in biological systems including, ambitiously, the one observed in humans. The main distinctive strength of humans is their ability to build a deep understanding of the world by learning continuously and drawing from their experiences. This ability, which is found in various degrees in all intelligent biological beings, allows them to adapt and properly react to changes by incrementally expanding and refining their knowledge. Arguably, achieving this ability is one of the main goals of Artificial Intelligence and a cornerstone towards the creation of intelligent artificial agents. Modern Deep Learning approaches allowed researchers and industries to achieve great advancements towards the resolution of many long-standing problems in areas like Computer Vision and Natural Language Processing. However, while this current age of renewed interest in AI allowed for the creation of extremely useful applications, a concerningly limited effort is being directed towards the design of systems able to learn continuously. The biggest problem that hinders an AI system from learning incrementally is the catastrophic forgetting phenomenon. This phenomenon, which was discovered in the 90s, naturally occurs in Deep Learning architectures where classic learning paradigms are applied when learning incrementally from a stream of experiences. This dissertation revolves around the Continual Learning field, a sub-field of Machine Learning research that has recently made a comeback following the renewed interest in Deep Learning approaches. This work will focus on a comprehensive view of continual learning by considering algorithmic, benchmarking, and applicative aspects of this field. This dissertation will also touch on community aspects such as the design and creation of research tools aimed at supporting Continual Learning research, and the theoretical and practical aspects concerning public competitions in this field.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Deep learning methods are extremely promising machine learning tools to analyze neuroimaging data. However, their potential use in clinical settings is limited because of the existing challenges of applying these methods to neuroimaging data. In this study, first a data leakage type caused by slice-level data split that is introduced during training and validation of a 2D CNN is surveyed and a quantitative assessment of the model’s performance overestimation is presented. Second, an interpretable, leakage-fee deep learning software written in a python language with a wide range of options has been developed to conduct both classification and regression analysis. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data where the cognitive performance of 58 patients measured by five neuropsychological tests is predicted using a multi-input CNN model taking brain image and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, DTI-derived features MD and FA produced the best prediction outcome of the TMT-A score which is consistent with the existing literature. In a second study, an interpretable deep learning system aimed at 1) classifying Alzheimer disease and healthy subjects 2) examining the neural correlates of the disease that causes a cognitive decline in AD patients using CNN visualization tools and 3) highlighting the potential of interpretability techniques to capture a biased deep learning model is developed. Structural magnetic resonance imaging (MRI) data of 200 subjects was used by the proposed CNN model which was trained using a transfer learning-based approach producing a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobe showing the cerebral cortex atrophy were highlighted by the visualization tools.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Creativity seems mysterious; when we experience a creative spark, it is difficult to explain how we got that idea, and we often recall notions like ``inspiration" and ``intuition" when we try to explain the phenomenon. The fact that we are clueless about how a creative idea manifests itself does not necessarily imply that a scientific explanation cannot exist. We are unaware of how we perform certain tasks, such as biking or language understanding, but we have more and more computational techniques that can replicate and hopefully explain such activities. We should understand that every creative act is a fruit of experience, society, and culture. Nothing comes from nothing. Novel ideas are never utterly new; they stem from representations that are already in mind. Creativity involves establishing new relations between pieces of information we had already: then, the greater the knowledge, the greater the possibility of finding uncommon connections, and the more the potential to be creative. In this vein, a beneficial approach to a better understanding of creativity must include computational or mechanistic accounts of such inner procedures and the formation of the knowledge that enables such connections. That is the aim of Computational Creativity: to develop computational systems for emulating and studying creativity. Hence, this dissertation focuses on these two related research areas: discussing computational mechanisms to generate creative artifacts and describing some implicit cognitive processes that can form the basis for creative thoughts.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this purpose, decentralized learning decouples model training from the need of directly accessing raw data, by alternating on-device training and periodic communications. The ability of distilling knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis proposes an extensive overview of decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it on federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state-of-the-art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a mean for optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision on relevant future research directions close the manuscript.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The development of Next Generation Sequencing promotes Biology in the Big Data era. The ever-increasing gap between proteins with known sequences and those with a complete functional annotation requires computational methods for automatic structure and functional annotation. My research has been focusing on proteins and led so far to the development of three novel tools, DeepREx, E-SNPs&GO and ISPRED-SEQ, based on Machine and Deep Learning approaches. DeepREx computes the solvent exposure of residues in a protein chain. This problem is relevant for the definition of structural constraints regarding the possible folding of the protein. DeepREx exploits Long Short-Term Memory layers to capture residue-level interactions between positions distant in the sequence, achieving state-of-the-art performances. With DeepRex, I conducted a large-scale analysis investigating the relationship between solvent exposure of a residue and its probability to be pathogenic upon mutation. E-SNPs&GO predicts the pathogenicity of a Single Residue Variation. Variations occurring on a protein sequence can have different effects, possibly leading to the onset of diseases. E-SNPs&GO exploits protein embeddings generated by two novel Protein Language Models (PLMs), as well as a new way of representing functional information coming from the Gene Ontology. The method achieves state-of-the-art performances and is extremely time-efficient when compared to traditional approaches. ISPRED-SEQ predicts the presence of Protein-Protein Interaction sites in a protein sequence. Knowing how a protein interacts with other molecules is crucial for accurate functional characterization. ISPRED-SEQ exploits a convolutional layer to parse local context after embedding the protein sequence with two novel PLMs, greatly surpassing the current state-of-the-art. All methods are published in international journals and are available as user-friendly web servers. They have been developed keeping in mind standard guidelines for FAIRness (FAIR: Findable, Accessible, Interoperable, Reusable) and are integrated into the public collection of tools provided by ELIXIR, the European infrastructure for Bioinformatics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms; on the other hand, to study the behavior of people, their opinions and habits. Presented here are three lines of research in which an attempt was made to solve real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally (Chapter 3) we reconstructed the network of retweets among COVID-19 themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users’ attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlighted the importance of the work done in light of the main contemporary challenges for epidemiological surveillance. In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The rapid progression of biomedical research coupled with the explosion of scientific literature has generated an exigent need for efficient and reliable systems of knowledge extraction. This dissertation contends with this challenge through a concentrated investigation of digital health, Artificial Intelligence, and specifically Machine Learning and Natural Language Processing's (NLP) potential to expedite systematic literature reviews and refine the knowledge extraction process. The surge of COVID-19 complicated the efforts of scientists, policymakers, and medical professionals in identifying pertinent articles and assessing their scientific validity. This thesis presents a substantial solution in the form of the COKE Project, an initiative that interlaces machine reading with the rigorous protocols of Evidence-Based Medicine to streamline knowledge extraction. In the framework of the COKE (“COVID-19 Knowledge Extraction framework for next-generation discovery science”) Project, this thesis aims to underscore the capacity of machine reading to create knowledge graphs from scientific texts. The project is remarkable for its innovative use of NLP techniques such as a BERT + bi-LSTM language model. This combination is employed to detect and categorize elements within medical abstracts, thereby enhancing the systematic literature review process. The COKE project's outcomes show that NLP, when used in a judiciously structured manner, can significantly reduce the time and effort required to produce medical guidelines. These findings are particularly salient during times of medical emergency, like the COVID-19 pandemic, when quick and accurate research results are critical.