6 resultados para Teaching-learning of the language
em Indian Institute of Science - Bangalore - Índia
Resumo:
Multiaction learning automata which update their action probabilities on the basis of the responses they get from an environment are considered in this paper. The automata update the probabilities according to whether the environment responds with a reward or a penalty. Learning automata are said to possess ergodicity of the mean if the mean action probability is the state probability (or unconditional probability) of an ergodic Markov chain. In an earlier paper [11] we considered the problem of a two-action learning automaton being ergodic in the mean (EM). The family of such automata was characterized completely by proving the necessary and sufficient conditions for automata to be EM. In this paper, we generalize the results of [11] and obtain necessary and sufficient conditions for the multiaction learning automaton to be EM. These conditions involve two families of probability updating functions. It is shown that for the automaton to be EM the two families must be linearly dependent. The vector defining the linear dependence is the only vector parameter which controls the rate of convergence of the automaton. Further, the technique for reducing the variance of the limiting distribution is discussed. Just as in the two-action case, it is shown that the set of absolutely expedient schemes and the set of schemes which possess ergodicity of the mean are mutually disjoint.
Resumo:
A new clustering technique, based on the concept of immediato neighbourhood, with a novel capability to self-learn the number of clusters expected in the unsupervized environment, has been developed. The method compares favourably with other clustering schemes based on distance measures, both in terms of conceptual innovations and computational economy. Test implementation of the scheme using C-l flight line training sample data in a simulated unsupervized mode has brought out the efficacy of the technique. The technique can easily be implemented as a front end to established pattern classification systems with supervized learning capabilities to derive unified learning systems capable of operating in both supervized and unsupervized environments. This makes the technique an attractive proposition in the context of remotely sensed earth resources data analysis wherein it is essential to have such a unified learning system capability.
Resumo:
We propose two algorithms for Q-learning that use the two-timescale stochastic approximation methodology. The first of these updates Q-values of all feasible state–action pairs at each instant while the second updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A proof of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms on an application of routing in communication networks are presented on a few different settings.
Resumo:
We consider the problem of Probably Ap-proximate Correct (PAC) learning of a bi-nary classifier from noisy labeled exam-ples acquired from multiple annotators(each characterized by a respective clas-sification noise rate). First, we consider the complete information scenario, where the learner knows the noise rates of all the annotators. For this scenario, we derive sample complexity bound for the Mini-mum Disagreement Algorithm (MDA) on the number of labeled examples to be ob-tained from each annotator. Next, we consider the incomplete information sce-nario, where each annotator is strategic and holds the respective noise rate as a private information. For this scenario, we design a cost optimal procurement auc-tion mechanism along the lines of Myer-son’s optimal auction design framework in a non-trivial manner. This mechanism satisfies incentive compatibility property,thereby facilitating the learner to elicit true noise rates of all the annotators.