842 resultados para Reinforcement Learning,Deep Neural Networks,Python,Stable Baseline,Gym


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Soil information is needed for managing the agricultural environment. The aim of this study was to apply artificial neural networks (ANNs) for the prediction of soil classes using orbital remote sensing products, terrain attributes derived from a digital elevation model and local geology information as data sources. This approach to digital soil mapping was evaluated in an area with a high degree of lithologic diversity in the Serra do Mar. The neural network simulator used in this study was JavaNNS and the backpropagation learning algorithm. For soil class prediction, different combinations of the selected discriminant variables were tested: elevation, declivity, aspect, curvature, curvature plan, curvature profile, topographic index, solar radiation, LS topographic factor, local geology information, and clay mineral indices, iron oxides and the normalized difference vegetation index (NDVI) derived from an image of a Landsat-7 Enhanced Thematic Mapper Plus (ETM+) sensor. With the tested sets, best results were obtained when all discriminant variables were associated with geological information (overall accuracy 93.2 - 95.6 %, Kappa index 0.924 - 0.951, for set 13). Excluding the variable profile curvature (set 12), overall accuracy ranged from 93.9 to 95.4 % and the Kappa index from 0.932 to 0.948. The maps based on the neural network classifier were consistent and similar to conventional soil maps drawn for the study area, although with more spatial details. The results show the potential of ANNs for soil class prediction in mountainous areas with lithological diversity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work focuses on the prediction of the two main nitrogenous variables that describe the water quality at the effluent of a Wastewater Treatment Plant. We have developed two kind of Neural Networks architectures based on considering only one output or, in the other hand, the usual five effluent variables that define the water quality: suspended solids, biochemical organic matter, chemical organic matter, total nitrogen and total Kjedhal nitrogen. Two learning techniques based on a classical adaptative gradient and a Kalman filter have been implemented. In order to try to improve generalization and performance we have selected variables by means genetic algorithms and fuzzy systems. The training, testing and validation sets show that the final networks are able to learn enough well the simulated available data specially for the total nitrogen

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many classification systems rely on clustering techniques in which a collection of training examples is provided as an input, and a number of clusters c1,...cm modelling some concept C results as an output, such that every cluster ci is labelled as positive or negative. Given a new, unlabelled instance enew, the above classification is used to determine to which particular cluster ci this new instance belongs. In such a setting clusters can overlap, and a new unlabelled instance can be assigned to more than one cluster with conflicting labels. In the literature, such a case is usually solved non-deterministically by making a random choice. This paper presents a novel, hybrid approach to solve this situation by combining a neural network for classification along with a defeasible argumentation framework which models preference criteria for performing clustering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many species are able to learn to associate behaviours with rewards as this gives fitness advantages in changing environments. Social interactions between population members may, however, require more cognitive abilities than simple trial-and-error learning, in particular the capacity to make accurate hypotheses about the material payoff consequences of alternative action combinations. It is unclear in this context whether natural selection necessarily favours individuals to use information about payoffs associated with nontried actions (hypothetical payoffs), as opposed to simple reinforcement of realized payoff. Here, we develop an evolutionary model in which individuals are genetically determined to use either trial-and-error learning or learning based on hypothetical reinforcements, and ask what is the evolutionarily stable learning rule under pairwise symmetric two-action stochastic repeated games played over the individual's lifetime. We analyse through stochastic approximation theory and simulations the learning dynamics on the behavioural timescale, and derive conditions where trial-and-error learning outcompetes hypothetical reinforcement learning on the evolutionary timescale. This occurs in particular under repeated cooperative interactions with the same partner. By contrast, we find that hypothetical reinforcement learners tend to be favoured under random interactions, but stable polymorphisms can also obtain where trial-and-error learners are maintained at a low frequency. We conclude that specific game structures can select for trial-and-error learning even in the absence of costs of cognition, which illustrates that cost-free increased cognition can be counterselected under social interactions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stochastic learning processes for a specific feature detector are studied. This technique is applied to nonsmooth multilayer neural networks requested to perform a discrimination task of order 3 based on the ssT-block¿ssC-block problem. Our system proves to be capable of achieving perfect generalization, after presenting finite numbers of examples, by undergoing a phase transition. The corresponding annealed theory, which involves the Ising model under external field, shows good agreement with Monte Carlo simulations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dans le domaine des neurosciences computationnelles, l'hypothèse a été émise que le système visuel, depuis la rétine et jusqu'au cortex visuel primaire au moins, ajuste continuellement un modèle probabiliste avec des variables latentes, à son flux de perceptions. Ni le modèle exact, ni la méthode exacte utilisée pour l'ajustement ne sont connus, mais les algorithmes existants qui permettent l'ajustement de tels modèles ont besoin de faire une estimation conditionnelle des variables latentes. Cela nous peut nous aider à comprendre pourquoi le système visuel pourrait ajuster un tel modèle; si le modèle est approprié, ces estimé conditionnels peuvent aussi former une excellente représentation, qui permettent d'analyser le contenu sémantique des images perçues. Le travail présenté ici utilise la performance en classification d'images (discrimination entre des types d'objets communs) comme base pour comparer des modèles du système visuel, et des algorithmes pour ajuster ces modèles (vus comme des densités de probabilité) à des images. Cette thèse (a) montre que des modèles basés sur les cellules complexes de l'aire visuelle V1 généralisent mieux à partir d'exemples d'entraînement étiquetés que les réseaux de neurones conventionnels, dont les unités cachées sont plus semblables aux cellules simples de V1; (b) présente une nouvelle interprétation des modèles du système visuels basés sur des cellules complexes, comme distributions de probabilités, ainsi que de nouveaux algorithmes pour les ajuster à des données; et (c) montre que ces modèles forment des représentations qui sont meilleures pour la classification d'images, après avoir été entraînés comme des modèles de probabilités. Deux innovations techniques additionnelles, qui ont rendu ce travail possible, sont également décrites : un algorithme de recherche aléatoire pour sélectionner des hyper-paramètres, et un compilateur pour des expressions mathématiques matricielles, qui peut optimiser ces expressions pour processeur central (CPU) et graphique (GPU).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe an adaptive, mid-level approach to the wireless device power management problem. Our approach is based on reinforcement learning, a machine learning framework for autonomous agents. We describe how our framework can be applied to the power management problem in both infrastructure and ad~hoc wireless networks. From this thesis we conclude that mid-level power management policies can outperform low-level policies and are more convenient to implement than high-level policies. We also conclude that power management policies need to adapt to the user and network, and that a mid-level power management framework based on reinforcement learning fulfills these requirements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantages of the robustness and modularity in competitive approaches as well as optimized trajectories in cooperative ones. This paper shows the feasibility of applying this hybrid method with a 3D-navigation to an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning. A continuous Q-learning implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results obtained show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Aquesta tesi proposa l'ús d'un seguit de tècniques pel control a alt nivell d'un robot autònom i també per l'aprenentatge automàtic de comportaments. L'objectiu principal de la tesis fou el de dotar d'intel·ligència als robots autònoms que han d'acomplir unes missions determinades en entorns desconeguts i no estructurats. Una de les premisses tingudes en compte en tots els passos d'aquesta tesis va ser la selecció d'aquelles tècniques que poguessin ésser aplicades en temps real, i demostrar-ne el seu funcionament amb experiments reals. El camp d'aplicació de tots els experiments es la robòtica submarina. En una primera part, la tesis es centra en el disseny d'una arquitectura de control que ha de permetre l'assoliment d'una missió prèviament definida. En particular, la tesis proposa l'ús de les arquitectures de control basades en comportaments per a l'assoliment de cada una de les tasques que composen la totalitat de la missió. Una arquitectura d'aquest tipus està formada per un conjunt independent de comportaments, els quals representen diferents intencions del robot (ex.: "anar a una posició", "evitar obstacles",...). Es presenta una recerca bibliogràfica sobre aquest camp i alhora es mostren els resultats d'aplicar quatre de les arquitectures basades en comportaments més representatives a una tasca concreta. De l'anàlisi dels resultats se'n deriva que un dels factors que més influeixen en el rendiment d'aquestes arquitectures, és la metodologia emprada per coordinar les respostes dels comportaments. Per una banda, la coordinació competitiva és aquella en que només un dels comportaments controla el robot. Per altra banda, en la coordinació cooperativa el control del robot és realitza a partir d'una fusió de totes les respostes dels comportaments actius. La tesis, proposa un esquema híbrid d'arquitectura capaç de beneficiar-se dels principals avantatges d'ambdues metodologies. En una segona part, la tesis proposa la utilització de l'aprenentatge per reforç per aprendre l'estructura interna dels comportaments. Aquest tipus d'aprenentatge és adequat per entorns desconeguts i el procés d'aprenentatge es realitza al mateix temps que el robot està explorant l'entorn. La tesis presenta també un estat de l'art d'aquest camp, en el que es detallen els principals problemes que apareixen en utilitzar els algoritmes d'aprenentatge per reforç en aplicacions reals, com la robòtica. El problema de la generalització és un dels que més influeix i consisteix en permetre l'ús de variables continues sense augmentar substancialment el temps de convergència. Després de descriure breument les principals metodologies per generalitzar, la tesis proposa l'ús d'una xarxa neural combinada amb l'algoritme d'aprenentatge per reforç Q_learning. Aquesta combinació proporciona una gran capacitat de generalització i una molt bona disposició per aprendre en tasques de robòtica amb exigències de temps real. No obstant, les xarxes neurals són aproximadors de funcions no-locals, el que significa que en treballar amb un conjunt de dades no homogeni es produeix una interferència: aprendre en un subconjunt de l'espai significa desaprendre en la resta de l'espai. El problema de la interferència afecta de manera directa en robòtica, ja que l'exploració de l'espai es realitza sempre localment. L'algoritme proposat en la tesi té en compte aquest problema i manté una base de dades representativa de totes les zones explorades. Així doncs, totes les mostres de la base de dades s'utilitzen per actualitzar la xarxa neural, i per tant, l'aprenentatge és homogeni. Finalment, la tesi presenta els resultats obtinguts amb la arquitectura de control basada en comportaments i l'algoritme d'aprenentatge per reforç. Els experiments es realitzen amb el robot URIS, desenvolupat a la Universitat de Girona, i el comportament après és el seguiment d'un objecte mitjançant visió per computador. La tesi detalla tots els dispositius desenvolupats pels experiments així com les característiques del propi robot submarí. Els resultats obtinguts demostren la idoneïtat de les propostes en permetre l'aprenentatge del comportament en temps real. En un segon apartat de resultats es demostra la capacitat de generalització de l'algoritme d'aprenentatge mitjançant el "benchmark" del "cotxe i la muntanya". Els resultats obtinguts en aquest problema milloren els resultats d'altres metodologies, demostrant la millor capacitat de generalització de les xarxes neurals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

More than thirty years ago, Amari and colleagues proposed a statistical framework for identifying structurally stable macrostates of neural networks from observations of their microstates. We compare their stochastic stability criterion with a deterministic stability criterion based on the ergodic theory of dynamical systems, recently proposed for the scheme of contextual emergence and applied to particular inter-level relations in neuroscience. Stochastic and deterministic stability criteria for macrostates rely on macro-level contexts, which make them sensitive to differences between different macro-levels.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present the initial results using an artificial neural network to predict the onset of Parkinson's Disease tremors in a human subject. Data for the network was obtained from implanted deep brain electrodes. A tuned artificial neural network was shown to be able to identify the pattern of the onset tremor from these real time recordings.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we consider the possibility of using an artificial neural network to accurately identify the onset of Parkinson’s Disease tremors in human subjects. Data for the network is obtained by means of deep brain implantation in the human brain. Results presented have been obtained from a practical study (i.e. real not simulated data) but should be regarded as initial trials to be discussed further. It can be seen that a tuned artificial neural network can act as an extremely effective predictor in these circumstances.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The possibility of using a radial basis function neural network (RBFNN) to accurately recognise and predict the onset of Parkinson’s disease tremors in human subjects is discussed in this paper. The data for training the RBFNN are obtained by means of deep brain electrodes implanted in a Parkinson disease patient’s brain. The effectiveness of a RBFNN is initially demonstrated by a real case study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Self-organizing neural networks have been implemented in a wide range of application areas such as speech processing, image processing, optimization and robotics. Recent variations to the basic model proposed by the authors enable it to order state space using a subset of the input vector and to apply a local adaptation procedure that does not rely on a predefined test duration limit. Both these variations have been incorporated into a new feature map architecture that forms an integral part of an Hybrid Learning System (HLS) based on a genetic-based classifier system. Problems are represented within HLS as objects characterized by environmental features. Objects controlled by the system have preset targets set against a subset of their features. The system's objective is to achieve these targets by evolving a behavioural repertoire that efficiently explores and exploits the problem environment. Feature maps encode two types of knowledge within HLS — long-term memory traces of useful regularities within the environment and the classifier performance data calibrated against an object's feature states and targets. Self-organization of these networks constitutes non-genetic-based (experience-driven) learning within HLS. This paper presents a description of the HLS architecture and an analysis of the modified feature map implementing associative memory. Initial results are presented that demonstrate the behaviour of the system on a simple control task.