6 results for Reward

in CaltechTHESIS


Relevance:

10.00%

Publisher:

Abstract:

Epidural electrostimulation holds great potential for improving therapy for patients with spinal cord injury (SCI) (Harkema et al., 2011). Further promising results from combined therapies using electrostimulation have also been recently obtained (e.g., van den Brand et al., 2012). The devices being developed to deliver the stimulation are highly flexible, capable of delivering any individual stimulus among a combinatorially large set of stimuli (Gad et al., 2013). While this extreme flexibility is very useful for ensuring that the device can deliver an appropriate stimulus, the challenge of choosing good stimuli is quite substantial, even for expert human experimenters. To develop a fully implantable, autonomous device which can provide useful therapy, it is necessary to design an algorithmic method for choosing the stimulus parameters. Such a method can be used in a clinical setting by caregivers who are not experts in the neurostimulator's use, and can allow the system to adapt autonomously between visits to the clinic. To create such an algorithm, this dissertation pursues the general class of active learning algorithms that includes Gaussian Process Upper Confidence Bound (GP-UCB, Srinivas et al., 2010), developing the Gaussian Process Batch Upper Confidence Bound (GP-BUCB, Desautels et al., 2012) and Gaussian Process Adaptive Upper Confidence Bound (GP-AUCB) algorithms. This dissertation develops new theoretical bounds for the performance of these and similar algorithms, empirically assesses these algorithms against a number of competitors in simulation, and applies a variant of the GP-BUCB algorithm in closed loop to control SCI therapy via epidural electrostimulation in four live rats. The algorithm was tasked with maximizing the amplitude of evoked potentials in the rats' left tibialis anterior muscle.
These experiments show that the algorithm is capable of directing these experiments sensibly, finding effective stimuli in all four animals. Further, in direct competition with an expert human experimenter, the algorithm produced superior performance in terms of average reward and comparable or superior performance in terms of maximum reward. These results indicate that variants of GP-BUCB may be suitable for autonomously directing SCI therapy.
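
As a rough illustration of the upper-confidence-bound idea this abstract builds on, the following minimal sketch picks the next stimulus by maximizing posterior mean plus a multiple of posterior standard deviation. It is a basic GP-UCB-style rule, not the thesis's GP-BUCB or GP-AUCB algorithms. The squared-exponential kernel, the scalar stimulus parameter, the noise level, the fixed `beta`, and all function names and data are assumptions made for the example.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def posterior(xs, ys, x, noise=1e-2):
    """GP posterior mean and standard deviation at x, with a zero prior mean."""
    if not xs:
        return 0.0, 1.0
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    alpha = solve(K, ys)   # K^{-1} y
    w = solve(K, k)        # K^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x, x) - sum(ki * wi for ki, wi in zip(k, w)), 0.0)
    return mean, math.sqrt(var)

def ucb_choice(xs, ys, candidates, beta=4.0):
    """Return the candidate maximizing mean + sqrt(beta) * std."""
    def score(x):
        m, s = posterior(xs, ys, x)
        return m + math.sqrt(beta) * s
    return max(candidates, key=score)

# Hypothetical data: three stimuli already tried and their measured responses.
tried = [0.0, 1.0, 2.0]
responses = [0.1, 0.8, 0.3]
candidates = [i * 0.25 for i in range(17)]  # grid on [0, 4]
next_stimulus = ucb_choice(tried, responses, candidates)
```

A larger `beta` favors stimuli whose response is still uncertain, while `beta=0` reduces the rule to choosing the best posterior mean.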

Relevance:

10.00%

Publisher:

Abstract:

Humans are particularly adept at modifying their behavior in accordance with changing environmental demands. Through various mechanisms of cognitive control, individuals are able to tailor actions to fit complex short- and long-term goals. The research described in this thesis uses functional magnetic resonance imaging to characterize the neural correlates of cognitive control at two levels of complexity: response inhibition and self-control in intertemporal choice. First, we examined changes in neural response associated with increased experience and skill in response inhibition; successful response inhibition was associated with decreased neural response over time in the right ventrolateral prefrontal cortex, a region widely implicated in cognitive control, providing evidence for increased neural efficiency with learned automaticity. We also examined a more abstract form of cognitive control using intertemporal choice. In two experiments, we identified putative neural substrates for individual differences in temporal discounting, or the tendency to prefer immediate to delayed rewards. Using dynamic causal models, we characterized the neural circuit between ventromedial prefrontal cortex, an area involved in valuation, and dorsolateral prefrontal cortex, a region implicated in self-control in intertemporal and dietary choice, and found that connectivity from dorsolateral prefrontal cortex to ventromedial prefrontal cortex increases at the time of choice, particularly when delayed rewards are chosen. Moreover, estimates of the strength of connectivity predicted out-of-sample individual rates of temporal discounting, suggesting a neurocomputational mechanism for variation in the ability to delay gratification. Next, we interrogated the hypothesis that individual differences in temporal discounting are in part explained by the ability to imagine future reward outcomes. 
Using a novel paradigm, we imaged neural response during the imagining of primary rewards, and identified negative correlations between activity in regions associated with the processing of both real and imagined rewards (lateral orbitofrontal cortex and ventromedial prefrontal cortex, respectively) and the individual temporal discounting parameters estimated in the previous experiment. These data suggest that individuals who are better able to represent reward outcomes neurally are less susceptible to temporal discounting. Together, these findings provide further insight into the role of the prefrontal cortex in implementing cognitive control, and propose neurobiological substrates for individual variation.
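
To make the notion of an individual temporal discounting parameter concrete, here is a small sketch using the hyperbolic model V = A / (1 + kD), a common choice in the intertemporal choice literature. The abstract does not say which discounting model the thesis fits, so the functional form, the function names, and the numbers are assumptions for illustration only.

```python
def discounted_value(amount, delay_days, k):
    """Subjective value of a reward under hyperbolic discounting (assumed model)."""
    return amount / (1.0 + k * delay_days)

def prefers_delayed(immediate, delayed, delay_days, k):
    """True if the discounted delayed reward is worth more than the immediate one."""
    return discounted_value(delayed, delay_days, k) > immediate

# Hypothetical choice: 50 now versus 100 in 30 days, for two discount rates.
patient_choice = prefers_delayed(50, 100, 30, k=0.01)    # shallow discounting
impulsive_choice = prefers_delayed(50, 100, 30, k=0.1)   # steep discounting
```

A larger `k` corresponds to steeper discounting, that is, a stronger tendency to take the immediate reward.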

Relevance:

10.00%

Publisher:

Abstract:

Modern robots are increasingly expected to function in uncertain and dynamically challenging environments, often in proximity with humans. In addition, wide-scale adoption of robots requires on-the-fly adaptability of software for diverse applications. These requirements strongly suggest the need to adopt formal representations of high-level goals and safety specifications, especially as temporal logic formulas. This approach allows for the use of formal verification techniques for controller synthesis that can give guarantees for safety and performance. Robots operating in unstructured environments also face limited sensing capability. Correctly inferring a robot's progress toward a high-level goal can be challenging.

This thesis develops new algorithms for synthesizing discrete controllers in partially known environments under specifications represented as linear temporal logic (LTL) formulas. It is inspired by recent developments in finite abstraction techniques for hybrid systems and motion planning problems. The robot and its environment are assumed to have a finite abstraction as a Partially Observable Markov Decision Process (POMDP), which is a powerful model class capable of representing a wide variety of problems. However, synthesizing controllers that satisfy LTL goals over POMDPs is a challenging problem which has received only limited attention.

This thesis proposes tractable, approximate algorithms for the control synthesis problem using Finite State Controllers (FSCs). The use of FSCs to control finite POMDPs allows the closed system to be analyzed as a finite global Markov chain. The thesis explicitly shows how the transient and steady-state behavior of the global Markov chain can be related to two different criteria with respect to satisfaction of LTL formulas. First, the maximization of the probability of LTL satisfaction is related to an optimization problem over a parametrization of the FSC. An analytic computation of the gradients is derived, which allows the use of first-order optimization techniques.
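
The construction of a global Markov chain from a POMDP and an FSC can be sketched as follows. This assumes one common convention, in which the controller node chooses an action stochastically and then moves to a new node based on the observation received; the thesis may parametrize the FSC differently. The array layouts, the function name, and the toy numbers are hypothetical.

```python
from itertools import product

def global_chain(T, Z, psi, omega):
    """Transition matrix over (state, node) pairs for a POMDP run under an FSC.

    T[s][a][s2]     : probability of moving from state s to s2 under action a
    Z[s2][o]        : probability of observing o in state s2
    psi[g][a]       : probability that controller node g emits action a
    omega[g][o][g2] : probability that node g moves to g2 after observing o
    """
    nS, nA, nO, nG = len(T), len(T[0]), len(Z[0]), len(psi)
    pairs = list(product(range(nS), range(nG)))
    P = [[0.0] * len(pairs) for _ in pairs]
    for i, (s, g) in enumerate(pairs):
        for j, (s2, g2) in enumerate(pairs):
            # Marginalize over the hidden action and observation.
            P[i][j] = sum(
                psi[g][a] * T[s][a][s2] * Z[s2][o] * omega[g][o][g2]
                for a in range(nA) for o in range(nO)
            )
    return pairs, P

# Toy model: 2 states, 2 actions, 2 observations, 2 controller nodes.
T = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.0, 1.0]]]
Z = [[0.8, 0.2],
     [0.3, 0.7]]
psi = [[1.0, 0.0],
       [0.0, 1.0]]                      # node g deterministically plays action g
omega = [[[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 0.0], [0.0, 1.0]]]      # next node equals the observation index

pairs, P = global_chain(T, Z, psi, omega)
```

Each row of `P` sums to one, so standard Markov chain tools (absorption probabilities, stationary distributions) apply directly to the product chain.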

The second criterion encourages rapid and frequent visits to a restricted set of states over infinite executions. It is formulated as a constrained optimization problem with a discounted long-term reward objective by the novel utilization of a fundamental equation for Markov chains, the Poisson equation. A new constrained policy iteration technique is proposed to solve the resulting dynamic program, which also provides a way to escape local maxima.

The algorithms proposed in the thesis are applied to the task planning and execution challenges faced during the DARPA Autonomous Robotic Manipulation - Software challenge.

Relevance:

10.00%

Publisher:

Abstract:

Nicotinic acetylcholine receptors (nAChRs) are pentameric, ligand-gated cation channels found throughout the central and peripheral nervous system, whose endogenous ligand is acetylcholine, but which can also be acted upon by nicotine. The subunit compositions of nAChRs determine their physiological and pharmacological properties, with different subunits expressed in different combinations or areas throughout the brain. The behavioral and physiological effects of nicotine are elicited by its agonistic and desensitizing actions selectively on neuronal nAChRs. The midbrain is of particular interest due to its population of nAChRs expressed on dopaminergic neurons, which are important for reward and reinforcement, and possibly contribute to nicotine dependence. The α6-subunit is found on dopaminergic neurons but very few other regions of the brain, making it an interesting drug target. We assayed a novel nicotinic agonist, called TI-299423 or TC299, for its possible selectivity for α6-containing nAChRs. Our goal was to isolate the role of α6-containing nAChRs in nicotine reward and reinforcement, and provide insight into the search for more effective smoking cessation compounds. This was done using a variety of in vitro and behavioral assays, aimed dually at understanding TI-299423’s exact mechanism of action and its downstream effects. Additionally, we looked at the effects of another compound, menthol, on nicotine reward. Understanding how reward is generated in the cholinergic system and how that is modulated by other compounds contributes to a better understanding of our complex neural circuitry and provides insight for the future development of therapeutics.

Relevance:

10.00%

Publisher:

Abstract:

Nicotinic acetylcholine receptors are pentameric ligand-gated ion channels mediating fast synaptic transmission throughout the peripheral and central nervous systems. They have been implicated in various processes related to cognitive functions, learning and memory, arousal, reward, motor control and analgesia. Therefore, these receptors present alluring potential therapeutic targets for the treatment of pain, epilepsy, Alzheimer’s disease, Parkinson’s disease, Tourette’s syndrome, schizophrenia, anxiety, depression and nicotine addiction. The work detailed in this thesis focuses on binding studies of neuronal nicotinic receptors and aims to further our knowledge of subtype specific functional and structural information.

Chapter 1 is an introductory chapter describing the structure and function of nicotinic acetylcholine receptors as well as the methodologies used for the dissertation work described herein. There are several different subtypes of nicotinic acetylcholine receptors known to date, and the subtle variations in their structure and function present a challenging area of study. The work presented in this thesis deals specifically with the α4β2 subtype of nicotinic acetylcholine receptor. This subtype assembles into two closely related stoichiometries, termed throughout this thesis A3B2 and A2B3 after their respective subunit compositions. Chapter 2 describes binding studies of select nicotinic agonists on A3B2 and A2B3 receptors determined by whole-cell recording. Three key binding interactions, a cation-π and two hydrogen bonds, were probed for four nicotinic agonists: acetylcholine, nicotine, the smoking cessation drug varenicline (Chantix®), and the related natural product cytisine.

Results from the binding studies presented in Chapter 2 show that the major difference in binding of these four agonists to A3B2 and A2B3 receptors lies in one of the two hydrogen bond interactions where the agonist acts as the hydrogen bond acceptor and the backbone NH of a conserved leucine residue in the receptor acts as the hydrogen bond donor. Chapter 3 focuses on studying the effect of modulating the hydrogen bond acceptor ability of nicotine and epibatidine on A3B2 receptor function determined by whole-cell recording. Finally, Chapter 4 describes single-channel recording studies of varenicline binding to A2B3 and A3B2 receptors.

Relevance:

10.00%

Publisher:

Abstract:

A person living in an industrialized society has almost no choice but to receive information daily with negative implications for himself or others. His attention will often be drawn to the ups and downs of economic indicators or the alleged misdeeds of leaders and organizations. Reacting to new information is central to economics, but economics typically ignores the affective aspect of the response, for example, of stress or anger. These essays present the results of considering how the affective aspect of the response can influence economic outcomes.

The first chapter presents an experiment in which individuals were presented with information about various non-profit organizations and allowed to take actions that rewarded or punished those organizations. When social interaction was introduced into this environment, an asymmetry between rewarding and punishing appeared. The net effects of punishment became greater and more variable, whereas the effects of reward were unchanged. The individuals were more strongly influenced by negative social information and used that information to target unpopular organizations. These behaviors contributed to an increase in inequality among the outcomes of the organizations.

The second and third chapters present empirical studies of reactions to negative information about local economic conditions. Economic factors are among the most prevalent stressors, and stress is known to have numerous negative effects on health. These chapters document localized, transient effects of the announcement of information about large-scale job losses. News of mass layoffs and shutdowns of large military bases is found to decrease birth weights and gestational ages among babies born in the affected regions. The effect magnitudes are close to those estimated in similar studies of disasters.