14 results for Multimedia Learning Simulation

at Indian Institute of Science - Bangalore - India


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The statistical minimum risk pattern recognition problem, when the classification costs are random variables of unknown statistics, is considered. Using medical diagnosis as a possible application, the problem of learning the optimal decision scheme is studied for a two-class, two-action case, as a first step. This reduces to the problem of learning the optimum threshold (for taking appropriate action) on the a posteriori probability of one class. A recursive procedure for updating an estimate of the threshold is proposed. The estimation procedure does not require knowledge of the actual class labels of the sample patterns in the design set. The adaptive scheme of using the present threshold estimate for taking action on the next sample is shown to converge, in probability, to the optimum. The results of a computer simulation study of three learning schemes demonstrate the theoretically predictable salient features of the adaptive scheme.
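
As a rough illustration of why the two-class, two-action problem reduces to a threshold on the posterior, the sketch below computes that threshold when the mean costs are known. This is an assumption made for illustration only: it is not the recursive, label-free estimation procedure of the abstract, and the names `optimal_threshold`, `choose_action`, and the cost symbols `c11`..`c22` are hypothetical.

```python
def optimal_threshold(c11, c12, c21, c22):
    """Threshold on P(class 1 | x) above which action 1 has lower expected cost.

    c_ij is the mean cost of taking action i when the true class is j.
    Assumes each action is cheaper for its own class: c11 < c21 and c22 < c12.
    """
    if not (c11 < c21 and c22 < c12):
        raise ValueError("expected c11 < c21 and c22 < c12")
    # Action 1 is preferred when
    #   c11*p + c12*(1 - p) < c21*p + c22*(1 - p),
    # which rearranges to p > (c12 - c22) / ((c21 - c11) + (c12 - c22)).
    return (c12 - c22) / ((c21 - c11) + (c12 - c22))


def choose_action(posterior_class1, threshold):
    """Return 1 if action 1 is preferred for this posterior, else 2."""
    return 1 if posterior_class1 > threshold else 2


# Hypothetical costs: a wrong action 2 on class 1 costs 3, a wrong action 1 on class 2 costs 1.
threshold = optimal_threshold(c11=0.0, c12=1.0, c21=3.0, c22=0.0)
```

With symmetric costs the threshold is 0.5; making one kind of error more expensive moves it away from 0.5, which is the quantity the abstract's adaptive scheme has to learn.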

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The problem of learning correct decision rules to minimize the probability of misclassification is a long-standing problem of supervised learning in pattern recognition. The problem of learning such optimal discriminant functions is considered for the class of problems where the statistical properties of the pattern classes are completely unknown. The problem is posed as a game with common payoff played by a team of mutually cooperating learning automata. This essentially results in a probabilistic search through the space of classifiers. The approach is also inherently capable of learning discriminant functions that are nonlinear in their parameters. A learning algorithm is presented for the team and convergence is established. It is proved that the team can obtain the optimal classifier to an arbitrary approximation. Simulation results with a few examples are presented where the team learns the optimal classifier.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The concept of a “mutualistic teacher” is introduced for unsupervised learning of the mean vectors of the components of a mixture of multivariate normal densities, when the number of classes is also unknown. The unsupervised learning problem is formulated here as a multi-stage quasi-supervised problem incorporating a cluster approach. The mutualistic teacher creates a quasi-supervised environment at each stage by picking out “mutual pairs” of samples and assigning identical (but unknown) labels to the individuals of each mutual pair. The number of classes, if not specified, can be determined at an intermediate stage. The risk in assigning identical labels to the individuals of mutual pairs is estimated. Results of some simulation studies are presented.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Bandwidth allocation for multimedia applications in case of network congestion and failure poses technical challenges due to the bursty and delay-sensitive nature of the applications. The growth of multimedia services on the Internet and the development of agent technology have led us to investigate new techniques for resolving the bandwidth issues in multimedia communications. Agent technology is emerging as a flexible, promising solution for network resource management and QoS (Quality of Service) control in a distributed environment. In this paper, we propose an adaptive bandwidth allocation scheme for multimedia applications by deploying static and mobile agents. It is a run-time allocation scheme that functions at the network nodes. This technique adaptively finds an alternate patch-up route for every congested/failed link and reallocates the bandwidth for the affected multimedia applications. The designed method has been tested (analytically and by simulation) with various network sizes and conditions. The results are presented to assess the performance and effectiveness of the approach. This work also demonstrates some of the benefits of agent-based schemes in providing flexibility, adaptability, software reusability, and maintainability. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We develop a simulation-based algorithm for finite-horizon Markov decision processes with finite state and action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In many problems of decision making under uncertainty, the system has to acquire knowledge of its environment and learn the optimal decision through its experience. Such problems may also involve the system having to arrive at the globally optimal decision when, at each instant, only a subset of the entire set of possible alternatives is available. These problems can be successfully modelled and analysed by learning automata. In this paper an estimator learning algorithm, which maintains estimates of the reward characteristics of the random environment, is presented for an automaton with a changing number of actions. A learning automaton using the new scheme is shown to be ε-optimal. The simulation results demonstrate the fast convergence properties of the new algorithm. The results of this study can be extended to the design of other types of estimator algorithms with good convergence properties.
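
To make the idea of an estimator algorithm concrete, here is a minimal sketch of a pursuit-style learning automaton: it keeps running estimates of each action's reward probability and moves its action probabilities toward the action with the best estimate. It assumes a fixed action set and a Bernoulli reward environment, so it is a generic illustration rather than the changing-action-set algorithm of the abstract; `pursuit_automaton` and all of its parameters are hypothetical names.

```python
import random


def pursuit_automaton(reward_probs, steps=5000, rate=0.01, init_samples=10, seed=0):
    """Run a pursuit-style estimator automaton and return its action probabilities.

    reward_probs[a] is the (unknown to the automaton) probability that
    action a is rewarded by the simulated environment.
    """
    rng = random.Random(seed)
    n = len(reward_probs)

    def pull(action):
        # Simulated random environment: reward 1 with the action's probability.
        return 1.0 if rng.random() < reward_probs[action] else 0.0

    # Initialise the reward estimates by trying every action a few times.
    counts = [init_samples] * n
    totals = [sum(pull(a) for _ in range(init_samples)) for a in range(n)]
    probs = [1.0 / n] * n

    for _ in range(steps):
        action = rng.choices(range(n), weights=probs)[0]
        reward = pull(action)
        counts[action] += 1
        totals[action] += reward

        # Pursue the action whose current reward estimate is highest.
        estimates = [totals[a] / counts[a] for a in range(n)]
        best = max(range(n), key=lambda a: estimates[a])
        probs = [(1 - rate) * p + (rate if a == best else 0.0)
                 for a, p in enumerate(probs)]
    return probs


final_probs = pursuit_automaton([0.2, 0.8])
```

A smaller `rate` makes the automaton more cautious, which is the usual trade-off between speed and the probability of locking onto a suboptimal action.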

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A feedforward network composed of units of teams of parameterized learning automata is considered as a model of a reinforcement learning system. The internal state vector of each learning automaton is updated using an algorithm consisting of a gradient-following term and a random perturbation term. It is shown that the algorithm weakly converges to a solution of the Langevin equation, implying that the algorithm globally maximizes an appropriate function. The algorithm is decentralized, and the units do not have any information exchange during updating. Simulation results on common payoff games and pattern recognition problems show that reasonable rates of convergence can be obtained.
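
The combination of a gradient-following term and a random perturbation term can be pictured with a generic discretized Langevin step. The sketch below is a plain Euler-type update under assumed names (`langevin_step`, `step`, `noise_scale`); it is not the automata update rule of the paper, whose exact form the abstract does not give.

```python
import math
import random


def langevin_step(theta, grad, step, noise_scale, rng=random):
    """One update: theta + step * grad(theta) + noise_scale * sqrt(step) * N(0, 1).

    grad is a callable returning the gradient of the function to maximize.
    """
    g = grad(theta)
    return [
        t + step * gi + noise_scale * math.sqrt(step) * rng.gauss(0.0, 1.0)
        for t, gi in zip(theta, g)
    ]


def grad_f(theta):
    # Gradient of the hypothetical objective f(x) = -(x - 1)^2.
    return [-2.0 * (theta[0] - 1.0)]


theta = [0.0]
rng = random.Random(0)
for _ in range(1000):
    theta = langevin_step(theta, grad_f, step=0.01, noise_scale=0.1, rng=rng)
```

Setting `noise_scale` to zero recovers ordinary gradient ascent; the noise term is what lets the iterate escape local maxima.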

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper, we investigate the use of reinforcement learning (RL) techniques for the problem of determining dynamic prices in an electronic retail market. As representative models, we consider a single-seller market and a two-seller market, and formulate the dynamic pricing problem in a setting that easily generalizes to markets with more than two sellers. We first formulate the single-seller dynamic pricing problem in the RL framework and solve the problem using the Q-learning algorithm through simulation. Next we model the two-seller dynamic pricing problem as a Markovian game and formulate the problem in the RL framework. We solve this problem using actor-critic algorithms through simulation. We believe our approach to solving these problems is a promising way of setting dynamic prices in multi-agent environments. We illustrate the methodology with two examples of typical retail markets.
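
For readers unfamiliar with Q-learning through simulation, the following sketch shows the standard tabular update applied to a toy single-seller market. The market model is entirely hypothetical (the state is whether the previous customer bought, and a customer buys with probability 1 - price / 10); the abstract does not describe the paper's state, demand, or reward definitions, and `q_update` and `simulate_pricing` are illustrative names.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step, applied in place to q[state][action]."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def simulate_pricing(prices, steps=10000, epsilon=0.1, seed=0):
    """Learn a Q-table for a toy market by epsilon-greedy simulation.

    Assumed model: state 1 means the previous customer bought, state 0
    means they did not; the reward is the price if the sale happens.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(prices) for _ in range(2)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(prices))  # explore
        else:
            action = max(range(len(prices)), key=lambda a: q[state][a])  # exploit
        sold = rng.random() < 1 - prices[action] / 10
        reward = float(prices[action]) if sold else 0.0
        next_state = 1 if sold else 0
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


q_table = simulate_pricing([1, 3, 5, 7, 9])
```

The greedy price for each state is then the index of the largest entry in the corresponding row of `q_table`.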

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents the design and implementation of a learning controller for Automatic Generation Control (AGC) in power systems based on a reinforcement learning (RL) framework. In contrast to the recent RL scheme for AGC proposed by us, the present method permits handling of power system variables such as Area Control Error (ACE) and deviations from scheduled frequency and tie-line flows as continuous variables. (In the earlier scheme, these variables had to be quantized into finitely many levels.) The optimal control law is arrived at in the RL framework by making use of a Q-learning strategy. Since the state variables are continuous, we propose the use of Radial Basis Function (RBF) neural networks to compute the Q-values for a given input state. Since, in this application, we cannot provide training data appropriate for the standard supervised learning framework, a reinforcement learning algorithm is employed to train the RBF network. We also employ a novel exploration strategy, based on a Learning Automata algorithm, for generating training samples during Q-learning. The proposed scheme, in addition to being simple to implement, inherits all the attractive features of an RL scheme, such as model-independent design, flexibility in control objective specification, and robustness. Two implementations of the proposed approach are presented. Through simulation studies the attractiveness of this approach is demonstrated.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A new class of reinforcement schemes for learning automata is introduced that uses estimates of the random characteristics of the environment. The algorithms are shown to converge in probability to the optimal choice of actions. Simulation results are presented, and applications to an environment with multiple learning processes are suggested.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper, we consider an intrusion detection application for Wireless Sensor Networks. We study the problem of scheduling the sleep times of the individual sensors, where the objective is to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to Fuemmeler and Veeravalli (IEEE Trans Signal Process 56(5), 2091-2101, 2008). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation. Feature-based representations and function approximation are necessary to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation architecture for the Q-values) is updated in an on-policy temporal difference algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model, which is useful in settings where the latter is not known.
Our simulation results on a synthetic 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to recent prior work.
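
The "one-simulation" simultaneous perturbation estimate mentioned above can be sketched generically as follows: all parameters are perturbed at once by random signs, the cost is evaluated a single time, and that one value is divided by each perturbation component. This is the textbook one-measurement form under assumed names (`one_sim_spsa_gradient`, `cost`, `delta`); how the paper embeds it in its two-timescale algorithm is not shown here.

```python
import random


def one_sim_spsa_gradient(cost, theta, delta, perturbation=None, rng=random):
    """One-measurement SPSA estimate: g_i = cost(theta + delta * D) / (delta * D_i).

    Each D_i is +1 or -1. A single estimate is very noisy; averaged over the
    random signs it approximates the gradient of cost for small delta.
    """
    if perturbation is None:
        perturbation = [rng.choice((-1.0, 1.0)) for _ in theta]
    shifted = [t + delta * d for t, d in zip(theta, perturbation)]
    value = cost(shifted)  # the single simulation / cost evaluation
    return [value / (delta * d) for d in perturbation]


def linear_cost(x):
    # Hypothetical cost used only to exercise the estimator.
    return x[0] + 2.0 * x[1]


estimate = one_sim_spsa_gradient(linear_cost, [1.0, 1.0], delta=0.5,
                                 perturbation=[1.0, -1.0])
```

The appeal of this form is that the number of cost evaluations per update does not grow with the number of parameters.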

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this paper is to allocate the 'sleep time' of the individual sensors in an intrusion detection application so that the energy consumption of the sensors is reduced, while keeping the tracking error to a minimum. We propose two novel reinforcement learning (RL) based algorithms that attempt to minimize a certain long-run average cost objective. Both our algorithms incorporate feature-based representations to handle the curse of dimensionality associated with the underlying partially-observable Markov decision process (POMDP). Further, the feature selection scheme used in our algorithms intelligently manages the energy cost and tracking cost factors, which in turn assists the search for the optimal sleeping policy. We also extend these algorithms to a setting where the intruder's mobility model is not known by incorporating a stochastic iterative scheme for estimating the mobility model. The simulation results on a synthetic 2-d network setting are encouraging.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Three-dimensional (3-D) full-wave electromagnetic simulation using the method of moments (MoM) under the framework of fast solver algorithms like the fast multipole method (FMM) is often bottlenecked by the speed of convergence of the Krylov-subspace-based iterative process. This is primarily because the electric field integral equation (EFIE) matrix, even with cutting-edge preconditioning techniques, often exhibits bad spectral properties arising from frequency or geometry-based ill-conditioning, which cause iterative solvers to converge slowly or occasionally stagnate. In this communication, a novel technique to expedite the convergence of the MoM matrix solution at a specific frequency is proposed, by extracting and applying eigenvectors from a previously solved neighboring frequency in an augmented generalized minimum residual (AGMRES) iterative framework. This technique can be applied in unison with any preconditioner. Numerical results demonstrate up to 40% speed-up in convergence using the proposed Eigen-AGMRES method.