20 results for blended learning methods
Abstract:
Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, and natural language processing. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of the gathered data raises privacy concerns when they are stored and processed in centralized locations. To this end, decentralized learning decouples model training from the need to access raw data directly, by alternating on-device training and periodic communications. The ability to distill knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, handling the statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis presents an extensive overview of the decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it in federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state of the art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions.
Finally, this Thesis presents a novel perspective on the use of Knowledge Distillation as a means of optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision of relevant future research directions closes the manuscript.
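The alternation of on-device training and periodic communication described in this abstract can be illustrated with a minimal sketch of federated averaging on a one-parameter linear model. This is not the thesis's own method: the model, the client datasets, the learning rate, and all function names are assumptions chosen only to keep the example small. Raw data never leaves a client; only the locally updated parameter is sent back and averaged.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one device


def local_update(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few epochs of gradient descent on one device's own data for y ~ w * x."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, clients: List[Dataset]) -> float:
    """One communication round: local training, then a size-weighted average."""
    total = sum(len(d) for d in clients)
    updates = [(local_update(w_global, d), len(d)) for d in clients]
    return sum(w * n for w, n in updates) / total


def train(clients: List[Dataset], rounds: int = 30) -> float:
    """Repeat communication rounds starting from w = 0."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, clients)
    return w


# Hypothetical clients with different amounts of data, all generated by y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (2.5, 7.5)],
]
w_final = train(clients)
print(w_final)
```

Weighting each update by the client's sample count is one common aggregation choice. With strongly heterogeneous client data, the kind of setting the abstract highlights, such plain averaging may degrade.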
Abstract:
The thesis of this paper is based on the assumption that the socio-economic system in which we are living is characterised by three great trends: growing attention to the promotion of human capital; extremely rapid technological progress, based above all on information and communication technologies (ICT); and the establishment of new production and organizational set-ups. These transformation processes pose a concrete challenge to the training sector, which is called upon to satisfy the demand for new skills that need to be developed and disseminated. Hence the growing interest that the various training sub-systems devote to the issues of lifelong learning and distance learning. In such a context, so-called e-learning acquires a central role. The first chapter proposes a theoretical reference framework for the transformations that are shaping post-industrial society. It analyzes key issues such as: how work is changing, the evolution of organizational set-ups and the introduction of the learning organization, the advent of the knowledge society and of knowledge companies, the innovation of training processes, and the key role of ICT in the new training and learning systems. The second chapter focuses on e-learning as an effective training model in response to the need for constant learning that is emerging in the knowledge society. This chapter starts with a reflection on the importance of lifelong learning and introduces the key topics of this thesis, i.e. distance learning (DL) and the didactic methodology called e-learning. It continues with an analysis of the various theoretical and technical aspects of e-learning. In particular, it delves into the theme of e-learning as an integrated and constant training environment, characterized by customized programmes and collaborative learning, didactic assistance, and constant monitoring of the results.
Thus, all the aspects of e-learning are examined: the actors and the new professionals, the virtual communities as learning subjects, the organization of contents in learning objects, the conformity to international standards, the integrated platforms, and so on. The third chapter, which concludes the theoretical-interpretative part, starts with a short presentation of the current international e-learning market, aimed at understanding its peculiarities and its current trends. Finally, we focus on some important regulatory aspects related to the strong impulse given first by the European Commission, and then by Italian governments, to the development and diffusion of e-learning. The second part of the thesis (chapters 4, 5 and 6) focuses on field research, which aims to define the Italian scenario for e-learning. In particular, we have examined key topics such as: the challenges of training and the instruments to face such challenges; the new didactic methods and technologies for lifelong learning; the level of diffusion of e-learning in Italy; the relation between classroom training and online training; and the main success factors as well as the most critical aspects of the introduction of e-learning in the various learning environments. As far as the methodological aspects are concerned, we have combined qualitative and quantitative analysis. A background analysis has been done to collect the statistical data available on this topic, as well as the research previously carried out in this area. The main source of data is the Observatory on e-learning of Aitech-Assinform, which covers the 2000s and four areas of implementation (firms, public administration, universities, schools): the thesis has reviewed the results of the last three available surveys, offering a comparative interpretation of them.
We have then carried out an in-depth empirical examination of two case studies (a large firm and a Graduate School), selected for the excellence they have achieved, which can therefore be considered advanced and emblematic experiences.
Abstract:
Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied to modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems, and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from an ultrasonic transducer, and automatic image segmentation and classification of prostate ultrasound scans. In the former application, a stochastic model of the radio frequency signal measured from an ultrasonic transducer is derived. This model is then employed to develop, in a statistical framework, a regularized deconvolution procedure for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then used to segment the images into regions of interest by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different regions of interest. In the context of neural activity signals, an example of a bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primates. The presented model aims to mimic the abstract functionality of a cell population in the 7a parietal region of primates during the execution of learned behavioural tasks.
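As a rough illustration of what a regularized deconvolution procedure can look like, the sketch below recovers a sparse signal from its blurred measurement by minimising a least-squares data term plus a quadratic (Tikhonov) penalty with gradient descent. This is a generic stand-in, not the procedure derived in the thesis: the pulse `h`, the test signal, the penalty weight `lam`, and the step size are all assumptions. Under Gaussian assumptions on noise and signal, a quadratic penalty of this kind can be read as a MAP estimate, which is one way a statistical model leads to a regularizer.

```python
from typing import List


def convolve(x: List[float], h: List[float]) -> List[float]:
    """Full discrete convolution, standing in for the transducer's blurring."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def correlate(r: List[float], h: List[float], n: int) -> List[float]:
    """Adjoint of convolve: maps a residual back to signal space (length n)."""
    return [sum(r[i + j] * hj for j, hj in enumerate(h)) for i in range(n)]


def deconvolve(y: List[float], h: List[float], n: int,
               lam: float = 1e-3, lr: float = 0.1, iters: int = 2000) -> List[float]:
    """Minimise ||h * x - y||^2 + lam * ||x||^2 by plain gradient descent."""
    x = [0.0] * n
    for _ in range(iters):
        residual = [a - b for a, b in zip(convolve(x, h), y)]
        grad = correlate(residual, h, n)
        x = [xi - lr * 2.0 * (gi + lam * xi) for xi, gi in zip(x, grad)]
    return x


# Hypothetical smoothing pulse and sparse reflectivity signal.
h = [0.25, 0.5, 0.25]
x_true = [0.0, 0.0, 1.0, 0.0, 0.0, -0.5, 0.0, 0.0]
y = convolve(x_true, h)

x_hat = deconvolve(y, h, n=len(x_true))
residual_energy = sum((a - b) ** 2 for a, b in zip(convolve(x_hat, h), y))
print([round(v, 2) for v in x_hat], residual_energy)
```

The penalty weight trades resolution against noise amplification: a larger `lam` gives a smoother, more stable estimate, while a smaller one sharpens the result at the cost of sensitivity to noise.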
Abstract:
Machine learning comprises a series of techniques for the automatic extraction of meaningful information from large collections of noisy data. In many real-world applications, data is naturally represented in structured form. Since traditional machine learning methods deal with vectorial information, they require an a priori form of preprocessing. Among the learning techniques for dealing with structured data, kernel methods are recognized as having a strong theoretical background and as being effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree-structured data, two issues become relevant: kernels for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too little information to make correct predictions on unseen data. In fact, it tends to produce a discriminating function that behaves like the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernels, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required in both the learning and classification phases. Such complexity can sometimes prevent the application of the kernel in scenarios involving large amounts of data. This thesis proposes three contributions for resolving the above issues of kernels for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions.
Specifically, we propose to encode the input trees with an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. A convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution aims to reduce the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Directed Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.
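The counting idea behind convolution kernels for trees, and the sparsity problem described in this abstract, can be illustrated with a minimal sketch of a subtree kernel, which counts pairs of identical complete subtrees in two trees. This is a textbook-style illustration under assumptions, not the kernels proposed in the thesis: trees are written as hypothetical `(label, children)` tuples, child order is significant, and labels are assumed to contain no parentheses.

```python
from collections import Counter
from typing import List, Tuple

Tree = Tuple[str, List["Tree"]]  # (node label, ordered list of child trees)


def subtree_counts(tree: Tree) -> Counter:
    """Count every complete subtree of `tree`, keyed by a canonical string."""
    counts: Counter = Counter()

    def visit(node: Tree) -> str:
        label, children = node
        # Identical subtrees produce identical keys, so each distinct
        # structure is stored once, however often it occurs.
        key = "(" + label + "".join(visit(c) for c in children) + ")"
        counts[key] += 1
        return key

    visit(tree)
    return counts


def subtree_kernel(t1: Tree, t2: Tree) -> int:
    """Number of pairs (one subtree from each tree) that are identical."""
    c1, c2 = subtree_counts(t1), subtree_counts(t2)
    return sum(n * c2[key] for key, n in c1.items())


def np_() -> Tree:
    """Hypothetical noun-phrase subtree reused in the examples."""
    return ("NP", [("D", []), ("N", [])])


t1: Tree = ("S", [np_(), ("VP", [("V", []), np_()])])
t2: Tree = ("S", [np_(), ("VP", [("V", [])])])
t3: Tree = ("X", [("Y", []), ("Z", [])])  # labels disjoint from t1 and t2

print(subtree_kernel(t1, t2))  # → 7
print(subtree_kernel(t1, t3))  # → 0
```

The second call shows sparsity in miniature: when node labels come from a large domain, most pairs of trees share no subtree at all and the kernel is zero. Storing each distinct subtree once under a shared key is also the intuition behind representing shared substructures compactly.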
Abstract:
In many application domains, data can be naturally represented as graphs. When applying analytical solutions to a given problem is infeasible, machine learning techniques can be a viable way to solve it. Classical machine learning techniques are defined for data represented in vectorial form. Recently, some of them have been extended to deal directly with structured data. Among those techniques, kernel methods have shown promising results in terms of both computational complexity and predictive performance. Kernel methods avoid an explicit mapping to a vectorial form by relying on kernel functions, which, informally, compute a similarity measure between two entities. However, the definition of good kernels for graphs is a challenging problem because of the difficulty of finding a good tradeoff between computational complexity and expressiveness. Another problem we face is learning on data streams, where a potentially unbounded sequence of data is generated by some sources. There are three main contributions in this thesis. The first contribution is the definition of a new family of kernels for graphs based on Directed Acyclic Graphs (DAGs). We analyzed two kernels from this family, achieving state-of-the-art results in terms of both computational cost and classification performance on real-world datasets. The second contribution consists in making the application of learning algorithms to streams of graphs feasible. Moreover, we defined a principled approach to memory management. The third contribution is the application of machine learning techniques for structured data to non-coding RNA function prediction. In this setting, the secondary structure is thought to carry relevant information. However, existing methods that consider the secondary structure have prohibitively high computational complexity. We propose to apply kernel methods to this domain, obtaining state-of-the-art results.