43 resultados para Online learning, prediction with expert advice, combinato rial prediction, easy data
em Aston University Research Archive
Resumo:
We study the dynamics of on-line learning in multilayer neural networks where training examples are sampled with repetition and where the number of examples scales with the number of network weights. The analysis is carried out using the dynamical replica method aimed at obtaining a closed set of coupled equations for a set of macroscopic variables from which both training and generalization errors can be calculated. We focus on scenarios whereby training examples are corrupted by additive Gaussian output noise and regularizers are introduced to improve the network performance. The dependence of the dynamics on the noise level, with and without regularizers, is examined, as well as that of the asymptotic values obtained for both training and generalization errors. We also demonstrate the ability of the method to approximate the learning dynamics in structurally unrealizable scenarios. The theoretical results show good agreement with those obtained by computer simulations.
Resumo:
This research explored how a more student-directed learning design can support the creation of togetherness and belonging in a community of distance learners in formal higher education. Postgraduate students in a New Zealand School of Education experienced two different learning tasks as part of their online distance learning studies. The tasks centered around two online asynchronous discussions each for the same period of time and with the same group of students, but following two different learning design principles. All messages were analyzed using a twostep analysis process, content analysis and social network analysis. Although the findings showed a balance of power between the tutor and the students in the first high e-moderated activity, a better pattern of group interaction and community feeling was found in the low e-moderated activity. The paper will discuss the findings in terms of the implications for learning design and the role of the tutor.
Resumo:
We present and analyze three different online algorithms for learning in discrete Hidden Markov Models (HMMs) and compare their performance with the Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of the generalization error we draw learning curves in simplified situations and compare the results. The performance for learning drifting concepts of one of the presented algorithms is analyzed and compared with the Baldi-Chauvin algorithm in the same situations. A brief discussion about learning and symmetry breaking based on our results is also presented. © 2006 American Institute of Physics.
Resumo:
The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem, as a function of the PAC parameters. The exact value of sample-size requirement is however less well-known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
Resumo:
This paper presents a model for measuring personal knowledge development in online learning environments. It is based on Nonaka‘s SECI model of organisational knowledge creation. It is argued that Socialisation is not a relevant mode in the context of online learning and was therefore not covered in the measurement instrument. Therefore, the remaining three of SECI‘s knowledge conversion modes, namely Externalisation, Combination, and Internalisation were used and a measurement instrument was created which also examines the interrelationships between the three modes. Data was collected using an online survey, in which online learners report on their experiences of personal knowledge development in online learning environments. In other words, the instrument measures the magnitude of online learners‘ Externalisation and combination activities as well as their level of internalisation, which is the outcome of their personal knowledge development in online learning.
Resumo:
This paper reports on an investigation with first year undergraduate Product Design and Management students within a School of Engineering. The students at the time of this investigation had studied fundamental engineering science and mathematics for one semester. The students were given an open ended, ill formed problem which involved designing a simple bridge to cross a river. They were given a talk on problem solving and given a rubric to follow, if they chose to do so. They were not given any formulae or procedures needed in order to resolve the problem. In theory, they possessed the knowledge to ask the right questions in order to make assumptions but, in practice, it turned out they were unable to link their a priori knowledge to resolve this problem. They were able to solve simple beam problems when given closed questions. The results show they were unable to visualise a simple bridge as an augmented beam problem and ask pertinent questions and hence formulate appropriate assumptions in order to offer resolutions.
Resumo:
We study the effect of two types of noise, data noise and model noise, in an on-line gradient-descent learning scenario for general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units. Data is then corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise on the evolution of order parameters and the generalization error in various phases of the learning process.
Resumo:
Online learning is discussed from the viewpoint of Bayesian statistical inference. By replacing the true posterior distribution with a simpler parametric distribution, one can define an online algorithm by a repetition of two steps: An update of the approximate posterior, when a new example arrives, and an optimal projection into the parametric family. Choosing this family to be Gaussian, we show that the algorithm achieves asymptotic efficiency. An application to learning in single layer neural networks is given.
Resumo:
The impact and use of information and communication technology on learning outcomes for accounting students is not well understood. This study investigates the impact of design features of Blackboard 1 used as aWeb-based Learning Environment (WBLE) in teaching undergraduate accounting students. Specifically, this investigation reports on a number of Blackboard design features (e.g. delivery of lecture notes, announcements, online assessment and model answers) used to deliver learning materials regarded as necessary to enhance learning outcomes. Responses from 369 on-campus students provided data to develop a regression model that seeks to explain enhanced participation and mental effort. The final regression shows that student satisfaction with the use of a WBLE is associated with five design features or variables. These include usefulness and availability of lecture notes, online assessment, model answers, and online chat.
Resumo:
Customer Base Analysis is perhaps the first stage of analysis in customer value, aiming to predict purchase frequency and customer lifecycle. An important part of the customer purchase frequency and its retention has to do with the service upgrade. Many models have tried to predict purchase frequency as well as upgrading. The comparison of these models seems important to provide academics with a picture of the current situation. The purpose of this research is to evaluate how models can predict service upgrade among a customer database of an online DVD rental company and suggest an alternative based on data mining techniques and data on historical transactions.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, for improved understanding of a complex large high-dimensional dataset, there is a need for an effective projection of such a dataset onto a lower-dimension (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with the projection results from other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Resumo:
Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experiential results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
Resumo:
This thesis introduces a flexible visual data exploration framework which combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user to explore and understand effectively large multi-dimensional datasets. The advantage of such a framework to other techniques currently available to the domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows them to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms to develop a guided mixture of local experts algorithm which provides robust prediction and a model to estimate feature saliency simultaneously with the training of a projection algorithm.Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process which is suitable for an intuition and domain knowledge driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach which estimates feature saliency simultaneously with the training of a visualisation model.
Resumo:
We developed a parallel strategy for learning optimally specific realizable rules by perceptrons, in an online learning scenario. Our result is a generalization of the Caticha–Kinouchi (CK) algorithm developed for learning a perceptron with a synaptic vector drawn from a uniform distribution over the N-dimensional sphere, so called the typical case. Our method outperforms the CK algorithm in almost all possible situations, failing only in a denumerable set of cases. The algorithm is optimal in the sense that it saturates Bayesian bounds when it succeeds.