834 results for Reward based model
Abstract:
In this comment, problems associated with an oversimplified FDTD-based model used for trapping force calculation in the recent papers "Computation of the optical trapping force using an FDTD based technique" [Opt. Express 13, 3707 (2005)] and "Rigorous time domain simulation of momentum transfer between light and microscopic particles in optical trapping" [Opt. Express 12, 2220 (2004)] are discussed. A more rigorous model using the Poynting vector is also presented.
Abstract:
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic plasticity. On the other hand, the theory of reinforcement learning provides a framework for reward-based learning. Recent models of reward-modulated spike-timing-dependent plasticity have made first steps towards bridging the gap between the two approaches, but faced two problems. First, reinforcement learning is typically formulated in a discrete framework, ill-adapted to the description of natural situations. Second, biologically plausible models of reward-modulated spike-timing-dependent plasticity require precise calculation of the reward prediction error, yet it remains to be shown how this can be computed by neurons. Here we propose a solution to these problems by extending the continuous temporal difference (TD) learning of Doya (2000) to the case of spiking neurons in an actor-critic network operating in continuous time, and with continuous state and action representations. In our model, the critic learns to predict expected future rewards in real time. Its activity, together with actual rewards, conditions the delivery of a neuromodulatory TD signal to itself and to the actor, which is responsible for action choice. In simulations, we show that such an architecture can solve a Morris water-maze-like navigation task, in a number of trials consistent with reported animal performance. We also use our model to solve the acrobot and the cartpole problems, two complex motor control tasks. Our model provides a plausible way of computing reward prediction error in the brain. Moreover, the analytically derived learning rule is consistent with experimental evidence for dopamine-modulated spike-timing-dependent plasticity.
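The reward prediction error at the centre of this abstract can be illustrated with a minimal sketch. It assumes the continuous-time TD error in the form usually attributed to Doya (2000), delta(t) = r(t) - V(t)/tau + dV/dt, and approximates the derivative with a finite difference. The function name, the time step `dt`, and the time constant `tau` are illustrative assumptions; the spiking actor-critic network itself is not reproduced here.

```python
def continuous_td_error(reward, value_now, value_prev, dt, tau):
    """Finite-difference estimate of delta(t) = r(t) - V(t)/tau + dV/dt.

    reward     -- instantaneous reward r(t)
    value_now  -- critic's value estimate V(t)
    value_prev -- critic's value estimate V(t - dt)
    dt         -- time step used for the derivative (assumed, in seconds)
    tau        -- discount time constant (assumed, in seconds)
    """
    dv_dt = (value_now - value_prev) / dt
    return reward - value_now / tau + dv_dt


# Constant reward r = 2 with a critic that already predicts V = tau * r = 10:
# the prediction is exact, so the TD error vanishes.
print(continuous_td_error(2.0, 10.0, 10.0, dt=0.01, tau=5.0))
```

When the critic's estimate is too low for the reward actually received, the error is positive, which is the sign such a neuromodulatory signal would need in order to strengthen the responsible synapses.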
Abstract:
Recent experiments have shown that spike-timing-dependent plasticity is influenced by neuromodulation. We derive theoretical conditions for successful learning of reward-related behavior for a large class of learning rules where Hebbian synaptic plasticity is conditioned on a global modulatory factor signaling reward. We show that all learning rules in this class can be separated into a term that captures the covariance of neuronal firing and reward and a second term that represents the influence of unsupervised learning. The unsupervised term, which is, in general, detrimental for reward-based learning, can be suppressed if the neuromodulatory signal encodes the difference between the reward and the expected reward, but only if the expected reward is calculated for each task and stimulus separately. If several tasks are to be learned simultaneously, the nervous system needs an internal critic that is able to predict the expected reward for arbitrary stimuli. We show that, with a critic, reward-modulated spike-timing-dependent plasticity is capable of learning motor trajectories with a temporal resolution of tens of milliseconds. The relation to temporal difference learning, the relevance of block-based learning paradigms, and the limitations of learning with a critic are discussed.
Abstract:
Copyright © 2014 John Wiley & Sons, Ltd. A field programmable gate array (FPGA) based model predictive controller for two phases of spacecraft rendezvous is presented. Linear time-varying prediction models are used to accommodate elliptical orbits, and a variable prediction horizon is used to facilitate finite time completion of the longer range manoeuvres, whilst a fixed and receding prediction horizon is used for fine-grained tracking at close range. The resulting constrained optimisation problems are solved using a primal-dual interior point algorithm. The majority of the computational demand is in solving a system of simultaneous linear equations at each iteration of this algorithm. To accelerate these operations, a custom circuit is implemented, using a combination of MathWorks HDL Coder and Xilinx System Generator for DSP, and used as a peripheral to a MicroBlaze soft-core processor on the FPGA, on which the remainder of the system is implemented. Certain logic that can be hard-coded for fixed sized problems is implemented to be configurable online, in order to accommodate the varying problem sizes associated with the variable prediction horizon. The system is demonstrated in closed-loop by linking the FPGA with a simulation of the spacecraft dynamics running in Simulink on a PC, using Ethernet. Timing comparisons indicate that the custom implementation is substantially faster than pure embedded software-based interior point methods running on the same MicroBlaze and could be competitive with a pure custom hardware implementation.
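Abstract:
TRISO-Model (tridimensional integrated software development model) is a three-dimensional integrated software development methodology proposed to handle the complexity and dynamics of software development. Within it, maintaining semantic consistency among the multidimensional models and reusing the operations common to model applications create a need for model management based on consistent semantics. This paper presents MDA-MMMethod (MDA based model management method), a method for model management based on MDA (model driven architecture). By applying MDA's four-layer model management structure and managing models and metamodels on the common semantic foundation provided by MOF (meta object facility), the core MDA standard, MDA-MMMethod supports the reuse of implementations of the various MDA model operation standards in TRISO-Model applications. A corresponding support system, MDA-MMSystem (MDA based model management system), was developed and applied in SoftPM project practice. Compared with traditional methods, the development efficiency of model applications improved significantly while development costs fell. Finally, an application example of model merging is presented.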
Abstract:
This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.
Abstract:
Choosing the right or the best option is often a demanding and challenging task for the user (e.g., a customer in an online retailer) when there are many available alternatives. In fact, the user rarely knows which offering will provide the highest value. To reduce the complexity of the choice process, automated recommender systems generate personalized recommendations. These recommendations take into account the preferences collected from the user in an explicit (e.g., letting users express their opinion about items) or implicit (e.g., studying some behavioral features) way. Such systems are widespread; research indicates that they increase the customers' satisfaction and lead to higher sales. Preference handling is one of the core issues in the design of every recommender system. This kind of system often aims at guiding users in a personalized way to interesting or useful options in a large space of possible options. Therefore, it is important for them to catch and model the user's preferences as accurately as possible. In this thesis, we develop a comparative preference-based user model to represent the user's preferences in conversational recommender systems. This type of user model allows the recommender system to capture several preference nuances from the user's feedback. We show that, when applied to conversational recommender systems, the comparative preference-based model is able to guide the user towards the best option while the system is interacting with her. We empirically test and validate the suitability and the practical computational aspects of the comparative preference-based user model and the related preference relations by comparing them to a sum of weights-based user model and the related preference relations. 
Product configuration, scheduling a meeting, and the construction of autonomous agents are among several artificial intelligence tasks that involve a process of constrained optimization, that is, optimization of behavior or options subject to given constraints with regard to a set of preferences. When solving a constrained optimization problem, pruning techniques, such as the branch and bound technique, aim to direct the search towards the best assignments, thus allowing the bounding functions to prune more branches in the search tree. Several constrained optimization problems may exhibit dominance relations. These dominance relations can be particularly useful in constrained optimization problems as they can suggest new ways (rules) of pruning non-optimal solutions. Such pruning methods can achieve dramatic reductions in the search space while looking for optimal solutions. A number of constrained optimization problems can model the user's preferences using comparative preferences. In this thesis, we develop a set of pruning rules used in the branch and bound technique to efficiently solve this kind of optimization problem. More specifically, we show how to generate newly defined pruning rules from a dominance algorithm that refers to a set of comparative preferences. These rules include pruning approaches (and combinations of them) which can drastically prune the search space. They mainly reduce the number of (expensive) pairwise comparisons performed during the search while guiding constrained optimization algorithms to find optimal solutions. Our experimental results show that the pruning rules that we have developed and their different combinations have varying impact on the performance of the branch and bound technique.
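As an illustration of how a bounding function prunes branches of a search tree, here is a minimal sketch of textbook branch and bound on a toy 0/1 knapsack problem. It is not the thesis's method: the comparative-preference dominance rules are replaced by a plain numeric bound, and the function name, the item values, and the capacity are all hypothetical.

```python
def branch_and_bound(values, weights, capacity):
    """Maximise total value subject to a weight capacity (toy 0/1 knapsack)."""
    n = len(values)
    best = {"value": 0, "choice": []}

    def optimistic_bound(index, value):
        # Upper bound: pretend every remaining item still fits.
        return value + sum(values[index:])

    def search(index, value, weight, chosen):
        if weight > capacity:
            return  # infeasible assignment, abandon this branch
        if value > best["value"]:
            best["value"], best["choice"] = value, list(chosen)
        if index == n:
            return
        if optimistic_bound(index, value) <= best["value"]:
            return  # prune: this subtree cannot beat the incumbent
        chosen.append(index)  # branch 1: take item `index`
        search(index + 1, value + values[index], weight + weights[index], chosen)
        chosen.pop()          # branch 2: skip item `index`
        search(index + 1, value, weight, chosen)

    search(0, 0, 0, [])
    return best["value"], best["choice"]


print(branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # (220, [1, 2])
```

A dominance-based rule would slot in at the same place as the bound check: a partial assignment is discarded as soon as some already-found solution is known to be at least as good on every comparison that matters.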
Abstract:
BACKGROUND: Computer simulations are of increasing importance in modeling biological phenomena. Their purpose is to predict behavior and guide future experiments. The aim of this project is to model the early immune response to vaccination by an agent based immune response simulation that incorporates realistic biophysics and intracellular dynamics, and which is sufficiently flexible to accurately model the multi-scale nature and complexity of the immune system, while maintaining the high performance critical to scientific computing. RESULTS: The Multiscale Systems Immunology (MSI) simulation framework is an object-oriented, modular simulation framework written in C++ and Python. The software implements a modular design that allows for flexible configuration of components and initialization of parameters, thus allowing simulations to be run that model processes occurring over different temporal and spatial scales. CONCLUSION: MSI addresses the need for a flexible and high-performing agent based model of the immune system.
Abstract:
INTRODUCTION: We previously reported models that characterized the synergistic interaction between remifentanil and sevoflurane in blunting responses to verbal and painful stimuli. This preliminary study evaluated the ability of these models to predict a return of responsiveness during emergence from anesthesia and a response to tibial pressure when patients required analgesics in the recovery room. We hypothesized that model predictions would be consistent with observed responses. We also hypothesized that under non-steady-state conditions, accounting for the lag time between sevoflurane effect-site concentration (Ce) and end-tidal (ET) concentration would improve predictions. METHODS: Twenty patients received a sevoflurane, remifentanil, and fentanyl anesthetic. Two model predictions of responsiveness were recorded at emergence: an ET-based and a Ce-based prediction. Similarly, 2 predictions of a response to noxious stimuli were recorded when patients first required analgesics in the recovery room. Model predictions were compared with observations with graphical and temporal analyses. RESULTS: While patients were anesthetized, model predictions indicated a high likelihood that patients would be unresponsive (≥ 99%). However, after termination of the anesthetic, models exhibited a wide range of predictions at emergence (1%-97%). Although wide, the Ce-based predictions of responsiveness were better distributed over a percentage ranking of observations than the ET-based predictions. For the ET-based model, 45% of the patients awoke within 2 min of the 50% model predicted probability of unresponsiveness and 65% awoke within 4 min. For the Ce-based model, 45% of the patients awoke within 1 min of the 50% model predicted probability of unresponsiveness and 85% awoke within 3.2 min. Predictions of a response to a painful stimulus in the recovery room were similar for the Ce- and ET-based models.
DISCUSSION: Results confirmed, in part, our study hypothesis; accounting for the lag time between Ce and ET sevoflurane concentrations improved model predictions of responsiveness but had no effect on predicting a response to a noxious stimulus in the recovery room. These models may be useful in predicting events of clinical interest but large-scale evaluations with numerous patients are needed to better characterize model performance.
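The lag between end-tidal and effect-site concentration mentioned above is commonly described by a first-order equilibration, dCe/dt = ke0 * (ET - Ce). The sketch below integrates that equation with a simple Euler step to show why Ce trails ET during washout. The abstract does not state which lag model or rate constant the study used, so the equation form, the function name, and the ke0 value are assumptions for illustration only.

```python
def effect_site_trace(et_values, dt_min, ke0_per_min, ce_start):
    """Euler integration of dCe/dt = ke0 * (ET - Ce).

    et_values   -- end-tidal concentrations sampled every dt_min minutes
    dt_min      -- sampling interval in minutes
    ke0_per_min -- equilibration rate constant (hypothetical value)
    ce_start    -- effect-site concentration before the first sample
    """
    ce = ce_start
    trace = []
    for et in et_values:
        ce += ke0_per_min * (et - ce) * dt_min
        trace.append(ce)
    return trace


# Washout: the end-tidal reading drops to zero at once, but Ce decays gradually.
trace = effect_site_trace([0.0] * 10, dt_min=0.1, ke0_per_min=0.5, ce_start=2.0)
print([round(c, 3) for c in trace])
```

Under these assumptions an ET-based prediction would declare the drug gone immediately, while the Ce-based prediction still sees a residual effect, which is the kind of difference the study evaluates at emergence.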
Abstract:
Dascalu, M., Trausan-Matu, S., McNamara, D.S., & Dessus, P. (2015). ReaderBench – Automated Evaluation of Collaboration based on Cohesion and Dialogism. International Journal of Computer-Supported Collaborative Learning, 10(4), 395–423. doi: 10.1007/s11412-015-9226-y
Abstract:
The identification of non-linear systems using only observed finite datasets has become a mature research area over the last two decades. A class of linear-in-the-parameter models with universal approximation capabilities has been intensively studied and widely used due to the availability of many linear-learning algorithms and their inherent convergence conditions. This article presents a systematic overview of basic research on model selection approaches for linear-in-the-parameter models. One of the fundamental problems in non-linear system identification is to find the minimal model with the best model generalisation performance from observational data only. The important concepts in achieving good model generalisation used in various non-linear system-identification algorithms are first reviewed, including Bayesian parameter regularisation and model selection criteria based on cross validation and experimental design. A significant advance in machine learning has been the development of the support vector machine as a means for identifying kernel models based on the structural risk minimisation principle. The developments in convex optimisation-based model construction algorithms, including the support vector regression algorithms, are outlined. Input selection algorithms and on-line system identification algorithms are also included in this review. Finally, some industrial applications of non-linear models are discussed.
Abstract:
One of the most influential explanations of voting behaviour is based on economic factors: when the economy is doing well, voters reward the incumbent government and when the economy is doing badly, voters punish the incumbent. This reward-punishment model is thought to be particularly appropriate at second order contests such as European Parliament elections. Yet operationalising this economic voting model using citizens' perceptions of economic performance may suffer from endogeneity problems if citizens' perceptions are in fact a function of their party preferences rather than being a cause of their party preferences. Thus, this article models a 'strict' version of economic voting in which citizens' economic perceptions are purged of partisan effects, and only that portion of citizens' economic perceptions that is caused by the real world economy is used as a predictor of voting. Using data on voting at the 2004 European Parliament elections for 23 European Union electorates, the article finds some, but limited, evidence for economic voting that is dependent on both voter sophistication and clarity of responsibility for the economy within any country. First, only politically sophisticated voters' subjective economic assessments are in fact grounded in economic reality. Second, the portion of subjective economic assessments that is a function of the real world economy is a significant predictor of voting only in single party government contexts where there can be a clear attribution of responsibility. For coalition government contexts, the article finds essentially no impact of the real economy via economic perceptions on vote choice, at least at European Parliament elections.
Abstract:
In a recently published study, Sloutsky and Fisher [Sloutsky, V. M., & Fisher, A. V. (2004a). When development and learning decrease memory: Evidence against category-based induction in children. Psychological Science, 15, 553-558; Sloutsky, V. M., & Fisher, A. V. (2004b). Induction and categorization in young children: A similarity-based model. Journal of Experimental Psychology: General, 133, 166-188.] demonstrated that children have better memory for the items that they generalise to than do adults. On the basis of this finding, they claim that children and adults use different mechanisms for inductive generalisations: whereas adults focus on shared category membership, children project properties on the basis of perceptual similarity. Sloutsky & Fisher attribute children's enhanced recognition memory to the more detailed processing required by this similarity-based mechanism. In Experiment 1 we show that children look at the stimulus items for longer than adults. In Experiment 2 we demonstrate that, although children remain capable of making accurate inferences when given just 250 ms to inspect the items, their subsequent memory for those items decreases significantly. These findings suggest that there are no necessary conclusions to be drawn from Sloutsky & Fisher's results about developmental differences in generalisation strategy. © 2007 Elsevier B.V. All rights reserved.
Abstract:
Objective: The objective of this study was to examine the relationship of the job strain model and the effort-reward imbalance model with heavy drinking.
Abstract:
Homology modeling was used to build 3D models of the N-methyl-D-aspartate (NMDA) receptor glycine binding site on the basis of an X-ray structure of the water-soluble AMPA-sensitive receptor. The docking of agonists and antagonists to these models was used to reveal binding modes of ligands and to explain known structure-activity relationships. Two types of quantitative models, 3D-QSAR/CoMFA and a regression model based on docking energies, were built for antagonists (derivatives of 4-hydroxy-2-quinolone, quinoxaline-2,3-dione, and related compounds). The CoMFA steric and electrostatic maps were superimposed on the homology-based model, and a close correspondence was marked. The derived computational models have permitted the evaluation of the structural features crucial for high glycine binding site affinity and are important for the design of new ligands.