842 results for Reinforcement Learning, Deep Neural Networks, Python, Stable Baseline, Gym
Abstract:
We analyze the average performance of a general class of learning algorithms for the nondeterministic polynomial time complete problem of rule extraction by a binary perceptron. The examples are generated by a rule implemented by a teacher network of similar architecture. A variational approach is used in trying to identify the potential energy that leads to the largest generalization in the thermodynamic limit. We restrict our search to algorithms that always satisfy the binary constraints. A replica symmetric ansatz leads to a learning algorithm which presents a phase transition in violation of an information theoretical bound. Stability analysis shows that this is due to a failure of the replica symmetric ansatz and the first step of replica symmetry breaking (RSB) is studied. The variational method does not determine a unique potential but it allows construction of a class with a unique minimum within each first order valley. Members of this class improve on the performance of Gibbs algorithm but fail to reach the Bayesian limit in the low generalization phase. They even fail to reach the performance of the best binary, an optimal clipping of the barycenter of version space. We find a trade-off between a good low performance and early onset of perfect generalization. Although the RSB may be locally stable we discuss the possibility that it fails to be the correct saddle point globally. ©2000 The American Physical Society.
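The teacher-student setting in this abstract can be illustrated with a minimal sketch. It is not the variational algorithm studied in the paper: it swaps in a much simpler clipped Hebbian student, and the sizes `n` and `p`, the seed, and the function name `clipped_hebb_overlap` are assumptions chosen only for illustration. For isotropic inputs, a perceptron student with normalized overlap R with the teacher has generalization error arccos(R)/π.

```python
import math
import random

def clipped_hebb_overlap(n=101, p=2000, seed=0):
    """Return (overlap R, generalization error) for a clipped Hebbian student."""
    rng = random.Random(seed)
    # Teacher: binary perceptron with weights in {-1, +1}.
    teacher = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    hebb = [0.0] * n
    for _ in range(p):
        # Gaussian input labelled by the sign of the teacher's local field.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        field = sum(t * xi for t, xi in zip(teacher, x))
        label = 1.0 if field >= 0 else -1.0
        # Hebbian accumulation of label-weighted inputs.
        for i in range(n):
            hebb[i] += label * x[i]
    # Clipping: keep only the sign so the student satisfies the binary constraint.
    student = [1.0 if h >= 0 else -1.0 for h in hebb]
    overlap = sum(s * t for s, t in zip(student, teacher)) / n
    # For isotropic inputs the generalization error is arccos(R) / pi.
    gen_error = math.acos(max(-1.0, min(1.0, overlap))) / math.pi
    return overlap, gen_error

overlap, gen_error = clipped_hebb_overlap()
print(f"overlap R = {overlap:.3f}, generalization error = {gen_error:.3f}")
```

The student here always satisfies the binary constraint, as the algorithms in the abstract do, but it makes no claim to reach the performance levels discussed there.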
Abstract:
Autonomous robots must be able to learn and maintain models of their environments. In this context, the present work considers techniques for classifying and extracting features from images, combined with artificial neural networks, for use in the mapping and localization system of the mobile robot of the Laboratory of Automation and Evolutive Computer (LACE). To do this, the robot uses a sensory system composed of ultrasound sensors and a catadioptric vision system formed by a camera and a conical mirror. The mapping system is composed of three modules. Two of them are presented in this paper: the classifier module and the characterizer module. The first uses a hierarchical neural network to perform the classification; the second uses techniques for extracting image attributes and recognizing invariant patterns extracted from the set of place images. The neural network of the classifier module is structured in two layers, reason and intuition, and is trained to classify each place explored by the robot into one of four predefined classes. The final result of the exploration is the construction of a topological map of the explored environment. Results obtained by simulating both modules of the mapping system are presented in this paper. © 2008 IEEE.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Geosciences and Environment - IGCE
Automatic identification of first breaks in seismic traces using a feedforward neural network
Abstract:
Despite the technological advances in seismic prospecting, with 2D and 3D surveys now routine and a significant increase in the volume of data, identifying the arrival times of the direct seismic wave (the first break), which travels directly from the shot point to the position of the geophone arrays, still depends on visual assessment by the seismic interpreter. This dissertation belongs to the field of seismic processing and seeks an efficient method for computationally simulating the visual behavior of the seismic interpreter by automating the decision-making processes involved in identifying first breaks in a seismic trace. The ultimate aim is to reserve the interpreter's intuitive knowledge for the complex cases, where that knowledge is put to best use. Recent advances in neurocomputing technology have produced techniques that make it possible to simulate the qualitative aspects of the visual processes of seismic identification or interpretation, with results of acceptable quality. Artificial neural networks are one implementation of neurocomputing technology and were initially developed by neurobiologists as computational models of the human nervous system. They differ from conventional computational techniques in their ability to adapt or learn through repeated exposure to examples, in their tolerance to missing components of the data, and in their robustness when handling data contaminated by noise. The method presented here applies artificial neural networks to the identification of first breaks in seismic traces, based on a suitable architecture for a feedforward artificial neural network trained with the error backpropagation algorithm.
The artificial neural network is understood here as a computational simulation of the intuitive decision-making process carried out by the seismic interpreter when identifying first breaks in seismic traces. The applicability, efficiency, and limitations of this approach are evaluated on synthetic data obtained from ray theory.
Abstract:
In general, space structures and lightweight robotic manipulators share an inherent characteristic: flexibility. This characteristic makes the system dynamics much more complex and complicates stability analysis and control. Very lightweight robotic arms with high speed and limited power must therefore account for the control of vibration caused by flexibility. For this reason, a control strategy is needed that controls not only the rigid mode but also the vibration modes of the flexible robotic arm. Artificial neural networks (ANNs), in turn, are identified as a subfield of artificial intelligence. They currently constitute a theory for the study of complex phenomena and represent a new tool in information processing technology, owing to characteristics such as parallel processing, learning capability, nonlinear mapping, and generalization capability. Accordingly, this study uses ANNs for the identification and control of a robotic arm with flexible links. This thesis presents the dynamic modeling of robotic arms with flexible links, 1D in the horizontal plane and 2D in the vertical plane under gravity, respectively. Reduced dynamic models are obtained with the Newton-Euler formalism, and the finite element method (FEM) is used to discretize the elastic displacements based on elementary beam theory. In addition, two control strategies have been developed to eliminate the vibrations caused by the flexibility of the robotic arm with flexible links. First, a neural feedforward (NFF) controller is used to obtain the inverse dynamics of the flexible robotic arm and to compute the joint torque. Second, to obtain positioning accuracy...
Abstract:
Ceramic parts are increasingly replacing metal parts because of their excellent physical, chemical and mechanical properties; however, these same properties make them difficult to manufacture by traditional machining methods. The developments carried out in this work are used to estimate tool wear during the grinding of advanced ceramics. The learning process was fed with data collected from a surface grinding machine with a tangential diamond wheel and alumina ceramic test specimens, in three cutting configurations: depths of cut of 120 µm, 70 µm and 20 µm. The grinding wheel speed was 35 m/s and the table speed 2.3 m/s. Four neural models were evaluated: Multilayer Perceptron, Radial Basis Function, Generalized Regression Neural Networks and the Adaptive Neuro-Fuzzy Inference System. The models' performance evaluation routines were executed automatically, testing all possible combinations of inputs, number of neurons, number of layers, and spread. The computational results reveal that the neural models were highly successful in estimating tool wear, since the errors were lower than 4%.
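The exhaustive testing of configurations mentioned in this abstract can be sketched as a plain enumeration. This is only an illustration: the abstract does not name the input signals or list the values tried, so the input names, neuron counts and layer counts below are hypothetical, and the spread parameter is left out for brevity.

```python
from itertools import combinations, product

def enumerate_configs(inputs, neuron_counts, layer_counts):
    """Yield every (input subset, neurons, layers) configuration."""
    for size in range(1, len(inputs) + 1):
        # Every non-empty subset of the candidate inputs.
        for subset in combinations(inputs, size):
            # Every pairing of neuron count and layer count.
            for neurons, layers in product(neuron_counts, layer_counts):
                yield {"inputs": subset, "neurons": neurons, "layers": layers}

# Hypothetical candidate values, not taken from the paper.
configs = list(
    enumerate_configs(["signal_a", "signal_b", "signal_c"], [5, 10, 20], [1, 2])
)
print(len(configs))
```

Each configuration would then be trained and scored, keeping the one with the lowest error; the number of configurations grows quickly, which is why such routines are usually automated.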
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
In this study, an effective microbial consortium for the biodegradation of phenol was grown under different operational conditions, and the effects of phosphate concentration (1.4 g L⁻¹, 2.8 g L⁻¹, 4.2 g L⁻¹), temperature (25 °C, 30 °C, 35 °C), agitation (150 rpm, 200 rpm, 250 rpm) and pH (6, 7, 8) on phenol degradation were investigated, whereupon an artificial neural network (ANN) model was developed in order to predict degradation. The learning, recall and generalization characteristics of neural networks were studied using data from the phenol degradation system. The efficiency of the model generated by the ANN was then tested and compared with the experimental results obtained. In both cases, the results corroborate the idea that aeration and temperature are crucial to increasing the efficiency of biodegradation.
Abstract:
Semi-supervised learning techniques have gained increasing attention in the machine learning community, as a result of two main factors: (1) the available data is exponentially increasing; (2) the task of data labeling is cumbersome and expensive, involving human experts in the process. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally applied for unsupervised learning. Changes have been made in the process of modularity maximization in a way to adapt the model to propagate labels throughout the network. Furthermore, a network reduction technique is introduced, as well as an extensive analysis of its impact on the network. Computer simulations are performed for artificial and real-world databases, providing a numerical quantitative basis for the performance of the proposed method.
Abstract:
Semisupervised learning is a machine learning approach that is able to employ both labeled and unlabeled samples in the training process. In this paper, we propose a semisupervised data classification model based on a combined random-preferential walk of particles in a network (graph) constructed from the input dataset. The particles of the same class cooperate among themselves, while the particles of different classes compete with each other to propagate class labels to the whole network. A rigorous model definition is provided via a nonlinear stochastic dynamical system and a mathematical analysis of its behavior is carried out. A numerical validation presented in this paper confirms the theoretical predictions. An interesting feature brought by the competitive-cooperative mechanism is that the proposed model can achieve good classification rates while exhibiting low computational complexity order in comparison to other network-based semisupervised algorithms. Computer simulations conducted on synthetic and real-world datasets reveal the effectiveness of the model.
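A combined random-preferential walk of the kind named in this abstract can be sketched as a convex mixture of two transition rules. The details below are assumptions rather than the paper's definition: the mixing parameter `pref_weight`, the per-node `domination` scores (how strongly the particle's class already dominates each node) and the function name are hypothetical, and the sketch assumes the current node has at least one neighbour with a positive domination score.

```python
def transition_probs(adjacency, domination, node, pref_weight=0.6):
    """Mix a uniform random walk with a preferential walk from `node`."""
    neighbours = [j for j, a in enumerate(adjacency[node]) if a]
    # Random term: every neighbour is equally likely.
    rand = {j: 1.0 / len(neighbours) for j in neighbours}
    # Preferential term: proportional to the class domination at each neighbour.
    total = sum(domination[j] for j in neighbours)
    pref = {j: domination[j] / total for j in neighbours}
    # Convex combination of the two rules.
    return {
        j: (1.0 - pref_weight) * rand[j] + pref_weight * pref[j]
        for j in neighbours
    }

# Small star graph: node 0 is linked to nodes 1, 2 and 3.
adjacency = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
domination = [0.5, 0.8, 0.1, 0.1]
probs = transition_probs(adjacency, domination, node=0)
print(probs)
```

Setting `pref_weight` to 0 gives a pure random walk, while 1 gives a purely preferential one that keeps the particle inside territory its class already dominates.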
Abstract:
Semi-supervised learning is a classification paradigm in which just a few labeled instances are available for the training process. To overcome this small amount of initial label information, the information provided by the unlabeled instances is also considered. In this paper, we propose a nature-inspired semi-supervised learning technique based on attraction forces. Instances are represented as points in a k-dimensional space, and the movement of data points is modeled as a dynamical system. As the system runs, data items with the same label cooperate with each other, and data items with different labels compete among them to attract unlabeled points by applying a specific force function. In this way, all unlabeled data items can be classified when the system reaches its stable state. Stability analysis for the proposed dynamical system is performed and some heuristics are proposed for parameter setting. Simulation results show that the proposed technique achieves good classification results on artificial data sets and is comparable to well-known semi-supervised techniques using benchmark data sets.
Abstract:
This thesis addresses the problem of scaling reinforcement learning to high-dimensional and complex tasks. Reinforcement learning here refers to a class of learning methods based on approximate dynamic programming that is applied in particular in artificial intelligence and can be used for the autonomous control of simulated agents or real hardware robots in dynamic and uncertain environments. To this end, regression on samples is used to determine a function that solves an "optimality equation" (Bellman) and from which approximately optimal decisions can be derived. A major hurdle is the dimensionality of the state space, which is often high and therefore poorly suited to traditional grid-based approximation methods. The goal of this thesis is to make reinforcement learning applicable to (in principle arbitrarily) high-dimensional problems by means of nonparametric function approximation (more precisely, regularization networks). Regularization networks are a generalization of ordinary basis function networks that parameterize the sought solution by the data, so that the explicit choice of nodes/basis functions is no longer needed and the "curse of dimensionality" can be circumvented for high-dimensional inputs. At the same time, regularization networks are linear approximators, which are technically easy to handle and for which the existing convergence results of reinforcement learning remain valid (unlike, for example, feed-forward neural networks). All these theoretical advantages are offset, however, by a very practical problem: the computational cost of using regularization networks inherently scales as O(n^3), where n is the number of data points.
This is particularly problematic because in reinforcement learning the learning process takes place online: the samples are generated by an agent/robot while it interacts with the environment. Adjustments to the solution must therefore be made immediately and with little computational effort. The contribution of this thesis is accordingly divided into two parts. In the first part, we formulate an efficient learning algorithm for regularization networks for solving general regression tasks, tailored specifically to the requirements of online learning. Our approach is based on the procedure of recursive least squares, but can insert not only new data but also new basis functions into the existing model in constant time. This is made possible by the "subset of regressors" approximation, in which the kernel is approximated by a strongly reduced selection of training data, and by a greedy selection procedure that picks these basis elements directly from the data stream at run time. In the second part, we transfer this algorithm to approximate policy evaluation by means of least-squares-based temporal-difference learning and integrate this building block into an overall system for autonomously learning optimal behavior. Overall, we develop a highly data-efficient method that is particularly suited to learning problems from robotics with continuous and high-dimensional state spaces and stochastic state transitions. We do not depend on a model of the environment, work largely independently of the dimension of the state space, achieve convergence with relatively few agent-environment interactions, and, thanks to the efficient online algorithm, can also operate in time-critical real-time applications.
We demonstrate the performance of our approach on two realistic and complex application examples: the RoboCup-Keepaway problem and the control of a (simulated) octopus arm.
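The recursive least-squares idea that this thesis builds on can be sketched for a plain linear model with a fixed basis. This is a simplification, not the thesis's algorithm: it does not add basis functions online or use the subset-of-regressors kernel approximation, and the polynomial basis `features`, the target weights `w_true` and the initial scale `1e6` are assumptions made for illustration. The point is that each new sample costs a fixed amount of work that does not depend on how many samples came before.

```python
def rls_update(w, P, x, y):
    """One recursive least-squares step for a linear model y ~ w . x."""
    d = len(x)
    # P x (P is symmetric, so this also gives x^T P).
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(d))
    gain = [v / denom for v in Px]
    # Prediction error on the new sample before the update.
    err = y - sum(w[i] * x[i] for i in range(d))
    w_new = [w[i] + gain[i] * err for i in range(d)]
    # Rank-one downdate of the inverse regularized Gram matrix.
    P_new = [[P[i][j] - gain[i] * Px[j] for j in range(d)] for i in range(d)]
    return w_new, P_new

def features(s):
    """Hypothetical fixed basis: 1, s, s^2."""
    return [1.0, s, s * s]

w_true = [1.0, -2.0, 0.5]
w = [0.0, 0.0, 0.0]
# A large initial P corresponds to a weak ridge penalty (here 1e-6).
P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]
for t in range(21):
    s = t / 5.0 - 2.0
    x = features(s)
    y = sum(wi * xi for wi, xi in zip(w_true, x))  # noiseless target
    w, P = rls_update(w, P, x, y)
print([round(v, 3) for v in w])
```

Because the samples arrive one at a time and the model is never refit from scratch, the same update could be driven by an agent's interaction stream; the batch alternative would solve a linear system whose cost grows with the amount of data.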
Abstract:
In the research field of artificial intelligence, and particularly in machine learning, a whole range of methods inspired by biological models has become established. The most prominent representatives of such methods are evolutionary algorithms on the one hand and artificial neural networks on the other. This thesis is concerned with the development of a machine learning system that combines characteristics of both paradigms: the Hybrid Learning Classifier System (HCS) is developed on the basis of the real-valued eXtended Learning Classifier System (XCS), which contains a genetic algorithm as its learning mechanism, and Growing Neural Gas (GNG). Like XCS, HCS uses a genetic algorithm to evolve a population of classifiers, that is, rules of the form [IF condition THEN action], where the condition specifies the region of the state space of a learning problem in which a classifier is applicable. In XCS the condition usually specifies an axis-parallel hyperrectangle, which often does not allow an adequate partitioning of the state space. In HCS, by contrast, the conditions of the classifiers are described by weight vectors like those of the neurons in GNG. Each classifier is applicable in its cell of the Voronoi decomposition of the state space induced by the HCS population, so the state space can be partitioned more flexibly than in XCS. The use of weight vectors furthermore makes it possible to employ a mechanism derived from the neuron adaptation procedure of GNG as a second learning method alongside the genetic algorithm. Whereas learning in XCS is purely evolutionary, that is, it proceeds only by creating new classifiers, this enables HCS to adapt and improve existing classifiers. To evaluate HCS, various learning experiments are carried out with it.
The performance of the approach is demonstrated on a series of learning problems from the areas of classification, function approximation, and the learning of actions in an interactive learning environment.
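The Voronoi-style matching described in this abstract amounts to a nearest-weight-vector lookup, which a minimal sketch can illustrate. The classifier records, their weight vectors and actions, and the function name `matching_classifier` are hypothetical; Euclidean distance is assumed, and the genetic algorithm and the GNG-derived adaptation step are not shown.

```python
def matching_classifier(weight_vectors, state):
    """Return the index of the classifier whose weight vector is nearest to `state`."""
    def sq_dist(w):
        # Squared Euclidean distance between a weight vector and the state.
        return sum((wi - si) ** 2 for wi, si in zip(w, state))
    return min(range(len(weight_vectors)), key=lambda i: sq_dist(weight_vectors[i]))

# Hypothetical population: each rule reads [IF state in my Voronoi cell THEN action].
classifiers = [
    {"weights": (0.0, 0.0), "action": "left"},
    {"weights": (1.0, 0.0), "action": "right"},
    {"weights": (0.5, 1.0), "action": "up"},
]
idx = matching_classifier([c["weights"] for c in classifiers], (0.9, 0.2))
print(classifiers[idx]["action"])
```

Since a state belongs to the cell of its nearest weight vector, moving a weight vector reshapes the neighbouring cells as well, which is what allows existing classifiers to be adapted rather than only replaced.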
Abstract:
In this paper two models for the simulation of glucose-insulin metabolism of children with Type 1 diabetes are presented. The models are based on the combined use of Compartmental Models (CMs) and artificial Neural Networks (NNs). Data from children with Type 1 diabetes, stored in a database, have been used as input to the models. The data are taken from four children with Type 1 diabetes and contain information about glucose levels from a continuous glucose monitoring system, insulin intake and food intake, along with the corresponding times. The influence of administered insulin on plasma insulin concentration, as well as the effect of food intake on glucose input into the blood from the gut, are estimated from the CMs. The outputs of the CMs, along with previous glucose measurements, are fed to an NN, which provides short-term prediction of glucose values. For comparison, two different NN architectures have been tested: a Feed-Forward NN (FFNN) trained with the back-propagation algorithm with adaptive learning rate and momentum, and a Recurrent NN (RNN) trained with the Real Time Recurrent Learning (RTRL) algorithm. The results indicate that the best prediction performance can be achieved by the use of the RNN.