837 results for Learning with noise
Abstract:
We study the effect of two types of noise, data noise and model noise, in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labeled by a two-layer teacher network with an arbitrary number of hidden units. Data is then corrupted by Gaussian noise affecting either the output or the model itself. We examine the effect of both types of noise on the evolution of order parameters and the generalization error in various phases of the learning process.
Abstract:
Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2008] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, classification-calibrated loss and prove that it is SLN-robust. The loss avoids the Long and Servedio [2008] result by virtue of being negatively unbounded. The loss is a modification of the hinge loss, where one does not clamp at zero; hence, we call it the unhinged loss. We show that the optimal unhinged solution is equivalent to that of a strongly regularised SVM, and is the limiting solution for any convex potential; this implies that strong l2 regularisation makes most standard learners SLN-robust. Experiments confirm the unhinged loss’ SLN-robustness.
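The unhinged loss described above can be illustrated with a minimal NumPy sketch. It assumes a linear scorer and an L2 penalty of strength `lam` (a hypothetical parameter name); under that assumption the regularised unhinged objective has the closed-form minimiser `mean(y * x) / lam`. The data, the flip rate of 0.3, and all function names are illustrative and are not taken from the paper.

```python
import numpy as np

def unhinged_loss(y, score):
    """Hinge loss without the clamp at zero, so it is unbounded below."""
    return 1.0 - y * score

def unhinged_minimiser(X, y, lam=1.0):
    """Closed-form minimiser of mean(1 - y * X @ w) + (lam / 2) * ||w||^2."""
    return (y[:, None] * X).mean(axis=0) / lam

def flip_labels(y, rho, rng):
    """Symmetric label noise: flip each label independently with probability rho."""
    flips = rng.random(y.shape[0]) < rho
    return np.where(flips, -y, y)

rng = np.random.default_rng(0)
n, d = 20000, 5
w_true = rng.normal(size=d)          # hypothetical ground-truth direction
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true)

w_clean = unhinged_minimiser(X, y)
w_noisy = unhinged_minimiser(X, flip_labels(y, 0.3, rng))

# Label flips shrink the solution but should leave its direction nearly unchanged.
cosine = w_clean @ w_noisy / (np.linalg.norm(w_clean) * np.linalg.norm(w_noisy))
agreement = np.mean(np.sign(X @ w_clean) == np.sign(X @ w_noisy))
print(f"cosine similarity: {cosine:.4f}, prediction agreement: {agreement:.4f}")
```

In expectation, flipping labels with rate rho below 1/2 multiplies `mean(y * x)` by `(1 - 2 * rho)`, which rescales the solution without rotating it; this is the intuition the sketch is meant to convey.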
Abstract:
In this paper, we derive an EM algorithm for nonlinear state space models. We use it to estimate jointly the neural network weights, the model uncertainty and the noise in the data. In the E-step we apply a forward-backward Rauch-Tung-Striebel smoother to compute the network weights. For the M-step, we derive expressions to compute the model uncertainty and the measurement noise. We find that the method is intrinsically very powerful, simple and stable.
Abstract:
It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that, for the purposes of network training, the regularization term can be reduced to a positive definite form which involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
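The equivalence between training with noise and regularization is easiest to check in the special case of a linear model, which is an assumption of this sketch and not the general network setting of the paper. For a linear map with isotropic Gaussian input noise of standard deviation `sigma`, the expected sum-of-squares error equals the clean error plus `sigma**2 * ||w||**2`, a first-derivative (Tikhonov-style) penalty. All names and values below are illustrative.

```python
import numpy as np

def clean_error(w, X, t):
    """Mean squared error of the linear model X @ w on noise-free inputs."""
    return np.mean((X @ w - t) ** 2)

def tikhonov_error(w, X, t, sigma):
    """Clean error plus the penalty implied by input noise.

    For a linear map the derivatives dy/dx_i are just w_i, so the
    first-derivative penalty reduces to sigma^2 * ||w||^2.
    """
    return clean_error(w, X, t) + sigma ** 2 * np.dot(w, w)

def noisy_error(w, X, t, sigma, rng, repeats=2000):
    """Monte Carlo estimate of the error when inputs are corrupted by noise."""
    total = 0.0
    for _ in range(repeats):
        noise = rng.normal(scale=sigma, size=X.shape)
        total += np.mean(((X + noise) @ w - t) ** 2)
    return total / repeats

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
t = X @ np.array([1.0, -2.0, 0.5])   # hypothetical targets
w = np.array([0.8, -1.7, 0.6])       # hypothetical current weights
sigma = 0.3

estimated = noisy_error(w, X, t, sigma, rng)
analytic = tikhonov_error(w, X, t, sigma)
print(f"noisy-input error: {estimated:.4f}, regularized error: {analytic:.4f}")
```

The two printed values should agree up to Monte Carlo error, which suggests why minimizing the regularized error directly can stand in for averaging over noisy inputs.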
Abstract:
We study the effect of regularization in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labelled by a two-layer teacher network with an arbitrary number of hidden units, and the labels may be corrupted by Gaussian output noise. We examine the effect of weight decay regularization on the dynamical evolution of the order parameters and generalization error in various phases of the learning process, in both noiseless and noisy scenarios.
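The student-teacher scenario above can be simulated directly. The following is a rough sketch under several assumptions that the abstract does not state: a `tanh` hidden activation, hidden-to-output weights fixed to one, pre-activations scaled by `1 / sqrt(N)`, and arbitrary values for the learning rate, weight decay and noise level. It follows the generalization error empirically and does not compute the order-parameter dynamics analysed in the paper.

```python
import numpy as np

def committee_output(W, x):
    """Two-layer network with unit output weights: sum_j tanh(W_j . x / sqrt(N))."""
    return np.tanh(W @ x / np.sqrt(x.shape[0])).sum()

def generalization_error(J, B, X_test):
    """Half the mean squared gap between student J and noise-free teacher B."""
    scale = np.sqrt(X_test.shape[1])
    student = np.tanh(X_test @ J.T / scale).sum(axis=1)
    teacher = np.tanh(X_test @ B.T / scale).sum(axis=1)
    return 0.5 * np.mean((student - teacher) ** 2)

def train_online(J, B, steps, lr, weight_decay, noise_std, rng):
    """On-line gradient descent: each example is drawn fresh and used once."""
    N = J.shape[1]
    for _ in range(steps):
        x = rng.normal(size=N)
        # Teacher label corrupted by Gaussian output noise.
        label = committee_output(B, x) + noise_std * rng.normal()
        h = J @ x / np.sqrt(N)
        error = np.tanh(h).sum() - label
        # Gradient of 0.5 * error^2 with respect to each student weight vector.
        grad = np.outer(error * (1.0 - np.tanh(h) ** 2), x) / np.sqrt(N)
        J = J - lr * (grad + weight_decay * J)
    return J

rng = np.random.default_rng(0)
N, K, M = 50, 2, 2                       # input size, student units, teacher units
B = rng.normal(size=(M, N))              # hypothetical teacher weights
J0 = 0.01 * rng.normal(size=(K, N))      # small random student initialisation
X_test = rng.normal(size=(5000, N))

err_before = generalization_error(J0, B, X_test)
J = train_online(J0, B, steps=20000, lr=0.3, weight_decay=1e-4,
                 noise_std=0.1, rng=rng)
err_after = generalization_error(J, B, X_test)
print(f"generalization error before: {err_before:.4f}, after: {err_after:.4f}")
```

Varying `weight_decay` and `noise_std` in this sketch gives a crude way to see how regularization and output noise interact during training.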
Abstract:
Over the last two decades, the notion of teacher leadership has emerged as a key concept in both the teaching and leadership literature. While researchers have not reached consensus regarding a definition, there has been some agreement that teacher leadership can operate at both a formal and informal level in schools and that it includes leadership of an instructional, organisational and professional development nature (York-Barr & Duke, 2004). Teacher leadership is a construct that tends not to be applied to pre-service teachers as interns, but is more often connected with the professional role of mentors who collaborate with them as they make the transition to being a beginning teacher. We argue that teacher leadership should be recognised as a professional and career goal during this formative learning phase and that interns should be expected to overtly demonstrate signs, albeit early ones, of leadership in instruction and other professional areas of development. The aim of this paper is to explore the extent to which teacher education interns at one university in Queensland reported on activities that may be deemed to be ‘teacher leadership.’ The research approach used in this study was an examination of 145 reflective reports written in 2008 by final Bachelor of Education (primary) pre-service teachers. These reports recorded the pre-service teachers’ perceptions of their professional learning with a school-based mentor in response to four outcomes of internship that were scaffolded by their mentor or initiated by them. These outcomes formed the bases of our research questions into the professional learning of the interns and included, ‘increased knowledge and capacity to teach within the total world of work as a teacher;’ ‘to work autonomously and interdependently’; to make ‘growth in critical reflectivity’, and the ‘ability to initiate professional development with the mentoring process’. 
Using the constant comparative method of Strauss and Corbin (1998), key categories of experiences emerged. These categories were then identified as belonging to a main meta-category labelled ‘teacher leadership.’ Our research findings revealed that five dimensions of teacher leadership – effective practice in schools; school curriculum work; professional development of colleagues; parent and community involvement; and contributions to the profession – were evident in the written reports by interns. Not surprisingly, the mentor/intern relationship was the main vehicle for enabling the intern to learn about teaching and leadership. The paper concludes with some key implications for developers of preservice education programmes regarding the need for teacher leadership to be part of the discourse of these programmes.
Abstract:
Automatic recognition of semantic information such as salient objects and mid-level concepts in images remains a challenging task. Since real-world objects tend to exist in a context within their environment, computer vision researchers have increasingly incorporated contextual information to improve object recognition. In this paper, we present a method to build a visual contextual ontology from salient object descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm is also proposed, based on ontology relations and probabilistic inference. Unlike most existing work, we specifically explore how to combine ontology representation, contextual knowledge and probabilistic inference. The experiments show that image annotation results are improved on the LabelMe dataset.
Abstract:
Research on analogies in science education has focussed on student interpretation of teacher and textbook analogies, psychological aspects of learning with analogies and structured approaches for teaching with analogies. Few studies have investigated how analogies might be pivotal in students’ growing participation in chemical discourse. To study analogies in this way requires a sociocultural perspective on learning that focuses on ways in which language, signs, symbols and practices mediate participation in chemical discourse. This study reports research findings from a teacher-research study of two analogy-writing activities in a chemistry class. The study began with a theoretical model, Third Space, which informed analyses and interpretation of data. Third Space was operationalized into two sub-constructs called Dialogical Interactions and Hybrid Discourses. The aims of this study were to investigate sociocultural aspects of learning chemistry with analogies in order to identify classroom activities where students generate Dialogical Interactions and Hybrid Discourses, and to refine the operationalization of Third Space. These aims were addressed through three research questions. The research questions were studied through an instrumental case study design. The study was conducted in my Year 11 chemistry class at City State High School for the duration of one semester. Data were generated through a range of data collection methods and analysed through discourse analysis using the Dialogical Interactions and Hybrid Discourse sub-constructs as coding categories. Results indicated that student interactions differed between analogical activities and mathematical problem-solving activities. Specifically, students drew on discourses other than school chemical discourse to construct analogies, and their growing participation in chemical discourse was tracked using the Third Space model as an interpretive lens.
Results of this study led to modification of the theoretical model adopted at the beginning of the study to a new model called Merged Discourse. Merged Discourse represents the mutual relationship that formed during analogical activities between the Analog Discourse and the Target Discourse. This model can be used for interpreting and analysing classroom discourse centred on analogical activities from sociocultural perspectives. That is, it can be used to code classroom discourse to reveal students’ growing participation with chemical (or scientific) discourse consistent with sociocultural perspectives on learning.
Abstract:
Gen Y students are digital natives (Prensky 2001) who learn in complex and diverse ways, with a variety of learning styles apparent in any given course. This paper proposes a Web 2.0 conceptual learning solution, online student videos, to respond to the different learning styles that exist in the classroom.
Abstract:
We consider the problem of prediction with expert advice in the setting where a forecaster is presented with several online prediction tasks. Instead of competing against the best expert separately on each task, we assume the tasks are related, and thus we expect that a few experts will perform well on the entire set of tasks. That is, our forecaster would like, on each task, to compete against the best expert chosen from a small set of experts. While we describe the “ideal” algorithm and its performance bound, we show that the computation required for this algorithm is as hard as computation of a matrix permanent. We present an efficient algorithm based on mixing priors, and prove a bound that is nearly as good for the sequential task presentation case. We also consider a harder case where the task may change arbitrarily from round to round, and we develop an efficient approximate randomized algorithm based on Markov chain Monte Carlo techniques.
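For orientation, the single-task baseline that the abstract contrasts against ("competing against the best expert separately on each task") can be sketched with the classical exponentially weighted forecaster. This is not the multitask mixing-prior or MCMC algorithm of the paper; it is the standard single-task method, shown here with losses assumed to lie in [0, 1] and with synthetic data in which one expert is made better on average.

```python
import numpy as np

def exponential_weights(losses, eta):
    """Run the exponentially weighted forecaster on a (T, K) loss matrix.

    Returns the forecaster's cumulative expected loss and the
    cumulative loss of every expert.
    """
    T, K = losses.shape
    cumulative = np.zeros(K)
    forecaster_loss = 0.0
    for t in range(T):
        # Weights decay exponentially in past loss; subtracting the minimum
        # only improves numerical stability and does not change p.
        w = np.exp(-eta * (cumulative - cumulative.min()))
        p = w / w.sum()
        forecaster_loss += p @ losses[t]
        cumulative += losses[t]
    return forecaster_loss, cumulative

rng = np.random.default_rng(0)
T, K = 2000, 10
losses = rng.random((T, K))
losses[:, 3] *= 0.5                      # expert 3 is better on average

eta = np.sqrt(8.0 * np.log(K) / T)       # standard tuning for losses in [0, 1]
total, cumulative = exponential_weights(losses, eta)
regret = total - cumulative.min()
bound = np.sqrt(T * np.log(K) / 2.0)     # classical regret bound for this tuning
print(f"regret: {regret:.2f}, bound: {bound:.2f}")
```

Running one such forecaster per task ignores any relationship between tasks, which is the gap the multitask setting in the abstract is designed to address.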
Abstract:
The paper "The Importance of Convexity in Learning with Squared Loss" gave a lower bound on the sample complexity of learning with quadratic loss using a nonconvex function class. The proof contains an error. We show that the lower bound is true under a stronger condition that holds for many cases of interest.