978 results for Learning behavior
Abstract:
If we are to understand how we can build machines capable of broad-purpose learning and reasoning, we must first aim to build systems that can represent, acquire, and reason about the kinds of commonsense knowledge that we humans have about the world. This endeavor suggests steps such as identifying the kinds of knowledge people commonly have about the world, constructing suitable knowledge representations, and exploring the mechanisms that people use to make judgments about the everyday world. In this work, I contribute to these goals by proposing an architecture for a system that can learn commonsense knowledge about the properties and behavior of objects in the world. The architecture described here augments previous machine learning systems in four ways: (1) it relies on a seven-dimensional notion of context, built from information recently given to the system, to learn and reason about objects' properties; (2) it has multiple methods that it can use to reason about objects, so that when one method fails, it can fall back on others; (3) it illustrates the usefulness of reasoning about objects by thinking about their similarity to other, better-known objects, and by inferring properties of objects from the categories that they belong to; and (4) it represents an attempt to build an autonomous learner and reasoner that sets its own goals for learning about the world and deduces new facts by reflecting on its acquired knowledge. This thesis describes this architecture, as well as a first implementation that can learn from sentences such as "A blue bird flew to the tree" and "The small bird flew to the cage" that birds can fly. One of the main contributions of this work lies in suggesting a further set of salient ideas about how we can build broader-purpose commonsense artificial learners and reasoners.
Abstract:
We discuss a formulation for active example selection for function learning problems. This formulation is obtained by adapting Fedorov's optimal experiment design to the learning problem. We specifically show how to analytically derive example selection algorithms for certain well-defined function classes. We then explore the behavior and sample complexity of such active learning algorithms. Finally, we view object detection as a special case of function learning and show how our formulation reduces to a useful heuristic for choosing examples that reduce the generalization error.
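To make the idea of active example selection concrete, here is a minimal sketch of one textbook instance of optimal experiment design: greedy variance-based selection for a straight-line model y = w0 + w1*x. It is not the paper's derivation. The candidate pool, the ridge term, and all function names are assumptions introduced only for illustration.

```python
# Hypothetical sketch: greedily pick the input whose prediction is currently
# most uncertain under a linear model with features (1, x).

def features(x):
    return (1.0, x)

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def predictive_variance(info, x):
    # Variance of the prediction at x, up to the noise scale:
    # phi(x)^T * info^{-1} * phi(x)
    inv = inverse_2x2(info)
    p = features(x)
    return sum(p[i] * inv[i][j] * p[j] for i in range(2) for j in range(2))

def add_example(info, x):
    # Adding an example at x adds phi(x) phi(x)^T to the information matrix.
    p = features(x)
    return tuple(
        tuple(info[i][j] + p[i] * p[j] for j in range(2)) for i in range(2)
    )

def select_examples(candidates, n, ridge=1e-3):
    # The small ridge term (an assumption) keeps the matrix invertible
    # before any example has been chosen.
    info = ((ridge, 0.0), (0.0, ridge))
    chosen = []
    for _ in range(n):
        best = max(candidates, key=lambda x: predictive_variance(info, x))
        chosen.append(best)
        info = add_example(info, best)
    return chosen

candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
chosen = select_examples(candidates, 4)
```

For a straight-line model the selection alternates between the two ends of the interval, which matches the classical result that the most informative samples for a line lie at the extremes of the input range.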
Abstract:
We introduce basic behaviors as primitives for control and learning in situated, embodied agents interacting in complex domains. We propose methods for selecting, formally specifying, algorithmically implementing, empirically evaluating, and combining behaviors from a basic set. We also introduce a general methodology for automatically constructing higher-level behaviors by learning to select from this set. Based on a formulation of reinforcement learning using conditions, behaviors, and shaped reinforcement, our approach makes behavior selection learnable in noisy, uncertain environments with stochastic dynamics. All described ideas are validated with groups of up to 20 mobile robots performing safe-wandering, following, aggregation, dispersion, homing, flocking, foraging, and learning to forage.
Abstract:
This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantage of the robustness and modularity of competitive approaches as well as the optimized trajectories of cooperative ones. This paper shows the feasibility of applying this hybrid method to the 3D navigation of an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning. A continuous Q-learning algorithm implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors.
Abstract:
The purpose of this paper is to propose a Neural-Q_learning approach designed for online learning of simple and reactive robot behaviors. In this approach, the Q_function is generalized by a multi-layer neural network, allowing the use of continuous states and actions. The algorithm uses a database of the most recent learning samples to accelerate and guarantee convergence. Each Neural-Q_learning function represents an independent, reactive and adaptive behavior which maps sensory states to robot control actions. A group of these behaviors constitutes a reactive control scheme designed to fulfill simple missions. The paper centers on the description of the Neural-Q_learning based behaviors, showing their performance with an underwater robot in a target-following task. Real experiments demonstrate the convergence and stability of the learning system, pointing out its suitability for online robot learning. Advantages and limitations are discussed.
Abstract:
Reinforcement learning (RL) is a very suitable technique for robot learning, as it can learn in unknown environments and in real time. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function, allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes convergence. The term semi-online refers to the fact that the algorithm uses both current and past learning samples. Nevertheless, the algorithm is able to learn in real time while the robot is interacting with the environment. The paper shows simulated results with the "mountain-car" benchmark and real results with an underwater robot in a target-following behavior.
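The two modifications described above (a function approximator for the Q function, plus re-use of stored learning samples) can be sketched roughly as follows. This is an illustration under stated assumptions, not the SONQL algorithm itself: a linear approximator over one-hot features stands in for the neural network, a fixed-size first-in-first-out buffer stands in for the database of representative samples, and the five-state corridor task and all names are hypothetical.

```python
import random
from collections import deque

N_STATES, GOAL = 5, 4          # hypothetical corridor: states 0..4, goal at 4
ACTIONS = (0, 1)               # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9        # learning rate and discount (assumed values)

def features(s):
    # One-hot encoding; a neural network would replace this linear model.
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]

def q_value(w, s, a):
    return sum(wi * xi for wi, xi in zip(w[a], features(s)))

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    done = s2 == GOAL
    return s2, (1.0 if done else 0.0), done

def update(w, sample):
    # Classic Q-learning update applied to the approximator's weights.
    s, a, r, s2, done = sample
    target = r if done else r + GAMMA * max(q_value(w, s2, b) for b in ACTIONS)
    error = target - q_value(w, s, a)
    for i, xi in enumerate(features(s)):
        w[a][i] += ALPHA * error * xi

def train(episodes=50, buffer_size=20, seed=0):
    rng = random.Random(seed)
    w = [[0.0] * N_STATES for _ in ACTIONS]
    buffer = deque(maxlen=buffer_size)   # stand-in for the sample database
    for _ in range(episodes):
        s, done, steps = 0, False, 0
        while not done and steps < 200:
            a = rng.choice(ACTIONS)      # purely exploratory behavior policy
            s2, r, done = step(s, a)
            buffer.append((s, a, r, s2, done))
            for sample in buffer:        # learn from current and past samples
                update(w, sample)
            s, steps = s2, steps + 1
    return w

weights = train()
greedy = [max(ACTIONS, key=lambda a: q_value(weights, s, a)) for s in range(GOAL)]
```

Replaying the stored samples after every real step means each interaction with the environment is used many times, which is the sense in which a sample database can speed up learning while still running alongside the robot.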
Abstract:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach when using RL has been to apply value-function-based algorithms, the system detailed here is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy-based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target-reaching task.
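As a rough illustration of what it means to approximate a policy directly instead of a value function, the sketch below runs gradient ascent on the expected reward of a softmax policy. It is a toy under stated assumptions, not the paper's algorithm: the problem has a single state and two actions, the rewards are made up, and the exact gradient is used in place of sampled estimates so that the run is deterministic.

```python
import math

def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def expected_reward(theta, rewards):
    return sum(p * r for p, r in zip(softmax(theta), rewards))

def policy_gradient_step(theta, rewards, lr):
    # For a softmax policy, dJ/dtheta_k = pi_k * (r_k - J),
    # where J is the expected reward under the current policy.
    pi = softmax(theta)
    j = sum(p * r for p, r in zip(pi, rewards))
    return [t + lr * p * (r - j) for t, p, r in zip(theta, pi, rewards)]

rewards = [0.2, 1.0]   # hypothetical rewards for the two actions
theta = [0.0, 0.0]     # policy parameters, initially indifferent
for _ in range(500):
    theta = policy_gradient_step(theta, rewards, lr=1.0)
policy = softmax(theta)
```

The parameters `theta` belong to the policy alone; no value function is stored or estimated, and the policy shifts its probability mass toward the better action as training proceeds.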
Abstract:
This document presents a study of managers' attitudes toward the adoption of e-learning as a work tool in organizations in Bogotá. A survey of 101 managers was conducted using convenience sampling, with the aim of identifying their attitudes toward the use of e-learning and its influence within the organization. The results show that managers' attitudes influence the use of e-learning tools, as well as the actions that promote their use and the attitudes of employees; in addition, beliefs related to the appropriation of e-learning tools, and the factors that facilitate their use, were found to influence managers' attitudes. These findings are drawn from analyses in which the results were compared against the empirical studies identified and the theoretical framework developed.
Abstract:
Abstract based on that of the publication
Abstract:
With the advent of the digital era, web applications have become an inevitable part of our lives. We use the web to manage even financially or ethically sensitive issues. For this reason, the exploration of information-seeking behavior is an exciting area of research. The current study provides insight into information-seeking behavior using a classic ‘Find the Difference’ game. 50 university students between the ages of 19 and 26 participated in the study. Eye movement data were recorded with a Tobii T120 device. Participants carried out 4 consecutive tasks. Each task included two pictures side by side with 7 hidden differences. After finishing the tasks, participants were asked to repeat the game with the same picture set. This data collection methodology allows the evaluation of learning curves. Additionally, participants were asked about their hand preference. For the purpose of analysis, the following metrics were applied: task times (including saccades), fixation count and fixation duration (without saccades). The right- and left-hand sides of each picture were selected as AOIs (Areas of Interest) to detect side preference in connection with hand preference. Results suggest a significant difference between male and female participants in aggregated task times (male 58.37 s versus female 68.37 s), in the number of fixations and fixation duration (female participants apparently have fewer but longer fixations), and in the distribution of fixations between AOIs. Using eye-tracking data, the current paper highlights similarities and differences in information acquisition strategies and reveals gender- and education-dependent (Arts vs. Sciences) characteristics of interaction.
Abstract:
A representative community sample of primiparous depressed women and a nondepressed control group were assessed while in interaction with their infants at 2 months postpartum. At 3 months, infants were assessed on the Still-face perturbation of face-to-face interaction, and a subsample completed an Instrumental Learning paradigm. Compared to nondepressed women, depressed mothers' interactions were both less contingent and less affectively attuned to infant behavior. Postnatal depression did not adversely affect the infant's performance in either the Still-face perturbation or the Instrumental Learning assessment. Maternal responsiveness in interactions at 2 months predicted the infant's performance in the Instrumental Learning assessment but not in the Still-face perturbation. The implications of these findings for theories of infant cognitive and emotional development are discussed.
Abstract:
This article explores young infants' ability to learn new words in situations providing tightly controlled social and salience cues to their reference. Four experiments investigated whether, given two potential referents, 15-month-olds would attach novel labels to (a) an image toward which a digital recording of a face turned and gazed, (b) a moving image versus a stationary image, (c) a moving image toward which the face gazed, and (d) a gazed-on image versus a moving image. Infants successfully used the recorded gaze cue to form new word-referent associations and also showed learning in the salience condition. However, their behavior in the salience condition and in the experiments that followed suggests that, rather than basing their judgments of the words' reference on the mere presence or absence of the referent's motion, infants were strongly biased to attend to the consistency with which potential referents moved when a word was heard. (c) 2006 Elsevier Inc. All rights reserved.
Abstract:
We focus on the learning dynamics in multiproduct price-setting markets, where firms use past strategies and performance to adapt to the corresponding equilibrium.
Abstract:
Cognitive functions such as attention and memory are known to be impaired in End Stage Renal Disease (ESRD), but the sites of the neural changes underlying these impairments are uncertain. Patients and controls took part in a latent learning task, which had previously shown a dissociation between patients with Parkinson’s disease and those with medial temporal damage. ESRD patients (n=24) and age and education-matched controls (n=24) were randomly assigned to either an exposed or unexposed condition. In Phase 1 of the task, participants learned that a cue (word) on the back of a schematic head predicted that the subsequently seen face would be smiling. For the exposed (but not unexposed) condition, an additional (irrelevant) colour cue was shown during presentation. In Phase 2, a different association, between colour and facial expression, was learned. Instructions were the same for each phase: participants had to predict whether the subsequently viewed face was going to be happy or sad. No difference in error rate between the groups was found in Phase 1, suggesting that patients and controls learned at a similar rate. However, in Phase 2, a significant interaction was found between group and condition, with exposed controls performing significantly worse than unexposed (therefore demonstrating learned irrelevance). In contrast, exposed patients made a similar number of errors to unexposed in Phase 2. The pattern of results in ESRD was different from that previously found in Parkinson’s disease, suggesting a different neural origin.