960 results for Attribute Assignment
Abstract:
Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So in using a single reinforcement signal to modulate synaptic plasticity a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time and reward is only delivered at the last action, as is the case in any type of board-game. The third is the inspection game that has been studied in neuroeconomics. It only has a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.
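The temporal half of the problem described above can be illustrated with a generic reward-modulated eligibility trace. The following is a minimal sketch under stated assumptions, not the rule from the paper: the function name, the `local_terms` input (a stand-in for whatever local pre/post quantity a synapse records), the exponential decay, and all parameter values are hypothetical, and the population feedback signal that handles the spatial component is left out.

```python
import math


def simulate_synapse(local_terms, rewards, tau=10.0, lr=0.1, w0=0.0):
    """Return the final weight of one synapse after a sequence of time steps.

    local_terms[t] -- hypothetical local plasticity signal at step t
    rewards[t]     -- scalar reinforcement delivered at step t
    tau            -- assumed decay time constant of the trace, in steps
    """
    decay = math.exp(-1.0 / tau)
    trace = 0.0
    w = w0
    for x, r in zip(local_terms, rewards):
        # The trace is a decaying memory of past local activity.
        trace = decay * trace + x
        # The weight changes only when reward arrives, in proportion
        # to what the trace still remembers.
        w += lr * r * trace
    return w


# A local event at step 0 is credited by a reward that arrives at step 4.
w = simulate_synapse([1, 0, 0, 0, 0], [0, 0, 0, 0, 1], tau=2.0)
```

Because the trace decays, a reward credits recent synaptic events more strongly than old ones; without any reward the weight stays at `w0`.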
Abstract:
We present a model for plasticity induction in reinforcement learning which is based on a cascade of synaptic memory traces. In the cascade of these so-called eligibility traces, presynaptic input is first correlated with postsynaptic events, next with the behavioral decisions and finally with the external reinforcement. A population of leaky integrate and fire neurons endowed with this plasticity scheme is studied by simulation on different tasks. For operant conditioning with delayed reinforcement, learning succeeds even when the delay is so large that the delivered reward reflects the appropriateness, not of the immediately preceding response, but of a decision made earlier on in the stimulus-decision sequence. So the proposed model does not rely on the temporal contiguity between decision and pertinent reward and thus provides a viable means of addressing the temporal credit assignment problem. In the same task, learning speeds up with increasing population size, showing that the plasticity cascade simultaneously addresses the spatial problem of assigning credit to the different population neurons. Simulations on other tasks, such as sequential decision making, serve to highlight the robustness of the proposed scheme and, further, contrast its performance to that of temporal difference based approaches to reinforcement learning.
Abstract:
In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.
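The three correlation stages named above (presynaptic input with postsynaptic events, then with decisions, then with reinforcement) can be pictured as chained traces. This is only an illustrative sketch: the `Cascade` class, the `step` function, the decay factors and the learning rate are all assumptions made for the example, and the actual model uses leaky integrate and fire dynamics that are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Cascade:
    """Hypothetical per-synapse state for a two-trace cascade."""
    e_post: float = 0.0  # stage 1: pre input correlated with post events
    e_dec: float = 0.0   # stage 2: stage 1 correlated with the decision
    weight: float = 0.0  # stage 3: stage 2 correlated with reward


def step(c, pre, post, decision, reward, d_post=0.5, d_dec=0.9, lr=0.1):
    """Advance the cascade by one time step (all constants are assumed)."""
    c.e_post = d_post * c.e_post + pre * post
    c.e_dec = d_dec * c.e_dec + c.e_post * decision
    c.weight += lr * reward * c.e_dec
    return c


c = Cascade()
step(c, pre=1.0, post=1.0, decision=1.0, reward=0.0)  # activity, no reward yet
step(c, pre=0.0, post=0.0, decision=0.0, reward=1.0)  # delayed reward arrives
```

The slower decay of the later stage is what lets a delayed reward still reach the synapse in this toy version.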
Abstract:
An optimizing compiler's internal representation fundamentally affects the clarity, efficiency and feasibility of the optimization algorithms employed by the compiler. Static Single Assignment (SSA), a state-of-the-art program representation, has great advantages but can still be improved. This dissertation explores the domain of single assignment beyond SSA and presents two novel program representations: Future Gated Single Assignment (FGSA) and Recursive Future Predicated Form (RFPF). Both FGSA and RFPF embed control flow and data flow information, enabling efficient traversal of program information and thus leading to better and simpler optimizations. We introduce the future value concept, the design basis of both FGSA and RFPF, which permits a consumer instruction to be encountered before the producer of its source operand(s) in a control flow setting. We show that FGSA is efficiently computable by using a series of T1/T2/TR transformations, yielding an expected linear time algorithm that combines the construction of the pruned single assignment form with liveness analysis for both reducible and irreducible graphs. The approach yields an average reduction of 7.7%, with a maximum of 67%, in the number of gating functions compared to the pruned SSA form on the SPEC2000 benchmark suite. We present a solid and near-optimal framework for performing the inverse transformation from single assignment programs. We demonstrate the importance of unrestricted code motion and present RFPF. We develop algorithms which enable instruction movement in acyclic as well as cyclic regions, and show how easily optimizations such as Partial Redundancy Elimination can be performed on RFPF.
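For readers unfamiliar with the single assignment property that FGSA and RFPF build on, the sketch below renames a straight-line program so that every variable is assigned exactly once. It shows plain SSA renaming only and makes no attempt at FGSA, RFPF, gating functions or control flow; the tuple instruction format and the `to_ssa` name are assumptions made for illustration.

```python
def to_ssa(instructions):
    """Rename a straight-line program into single assignment form.

    Each instruction is a hypothetical (dest, op, args) tuple. String
    arguments are variable names; anything else is treated as a constant.
    Variables read before any assignment get version 0 (program inputs).
    """
    version = {}
    renamed = []

    def use(arg):
        if not isinstance(arg, str):
            return arg  # constants are left untouched
        return f"{arg}_{version.get(arg, 0)}"

    for dest, op, args in instructions:
        # Rename the uses first, so "x = x * 2" reads the old x.
        new_args = tuple(use(a) for a in args)
        version[dest] = version.get(dest, 0) + 1
        renamed.append((f"{dest}_{version[dest]}", op, new_args))
    return renamed


program = [
    ("x", "add", ("a", 1)),
    ("x", "mul", ("x", 2)),
    ("y", "add", ("x", "a")),
]
ssa = to_ssa(program)
```

Once branches and loops are present, renaming alone is not enough: values arriving from different paths must be merged, which is the role of the gating functions whose count the abstract reports.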
Abstract:
OBJECTIVE: The aim of this study was to estimate intra- and post-operative risk using the American Society of Anaesthesiologists (ASA) classification, which is an important predictor of an intervention and of the entire operating programme. STUDY DESIGN: In this retrospective study, 4435 consecutive patients undergoing elective and emergency surgery at the Gynaecological Clinic of the University Hospital of Zurich were included. The ASA classification for pre-operative risk assessment was determined by an anaesthesiologist after a thorough physical examination. We observed several pre-, intra- and post-operative parameters, such as age, body-mass-index, duration of anaesthesia, duration of surgery, blood loss, duration of post-operative stay, complicated post-operative course, morbidity and mortality. The different risk factors were investigated with a multiple linear regression model for the log-transformed duration of hospitalisation. RESULTS: Age and obesity were responsible for a higher ASA classification. ASA grade correlates with the duration of anaesthesia and the duration of the surgery itself. There was a significant difference in blood loss between ASA grades I (113+/-195 ml) and III (222+/-470 ml) and between classes II (176+/-432 ml) and III. The duration of post-operative hospitalisation could also be correlated with ASA class: ASA class I=1.7+/-3.0 days, ASA class II=3.6+/-4.3 days, ASA class III=6.8+/-8.2 days, and ASA class IV=6.2+/-3.9 days. The mean post-operative in-hospital stay was 2.5+/-4.0 days without complications, and 8.7+/-6.7 days with post-operative complications. The multiple linear regression model showed that the ASA classification was not the only variable carrying important information about the duration of hospitalisation. Parameters such as age, class of diagnosis, post-operative complications, etc. also have an influence on the duration of hospitalisation.
CONCLUSION: This study shows that the ASA classification can be used as a good predictor, available early, for the planning of an intervention in gynaecological surgery. The ASA classification helps the surgeon to assess the peri-operative risk profile, from which important information can be derived for the planning of the operation programme.
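As a reading aid, a multiple linear regression on a log-transformed outcome generally has the form below. The choice and coding of covariates here is an assumption based only on the parameters the abstract names (the ASA class, for instance, would more likely enter as a categorical variable); the study's actual model specification is not given.

```latex
\log(D_i) = \beta_0
          + \beta_1\,\mathrm{ASA}_i
          + \beta_2\,\mathrm{age}_i
          + \beta_3\,\mathrm{complication}_i
          + \dots + \varepsilon_i
```

Here $D_i$ denotes the duration of hospitalisation of patient $i$ and $\varepsilon_i$ the error term. In a model of this form, a one-unit increase in covariate $k$, with the others held fixed, multiplies the typical duration by $e^{\beta_k}$.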
Abstract:
A one-pot, general synthesis of highly functionalized quateraryls through carbanion-induced, base-catalyzed ring transformation of 5,6-diaryl-2H-pyran-2-ones and core-substituted phenylacetones is delineated. These conversions were found to give diversely functionalized benzenes bearing peripheral aryl rings, some of which possess inherent atropisomerism. For one of the quateraryls, taken as an example, the respective atropo-enantiomers were resolved by HPLC on a chiral phase, and their absolute axial configurations were assigned by LC-CD coupling in combination with semiempirical CNDO/S and TDDFT CD calculations. This synthetic approach offers, in a transition metal-free environment, high flexibility in the construction of quateraryls with the desired conformational freedom along the molecular axis, which may help in exploring and developing new potential ligands for asymmetric synthesis.
Abstract:
Open source software projects are multi-collaborative works incorporating the contributions of numerous developers who, in spite of publishing their code under a public license such as GPL, Apache or BSD, retain the copyright in their contributions. Having multiple copyright-owners can make the steering of a project difficult, if not impossible, as there is no ultimate authority able to take decisions relating to the maintenance and use of the project. This predicament can be remedied by centring the dispersed copyrights in a single authority via contributor agreements. Whether to introduce contributor agreements, and if so in which form, is a pressing question for many emerging, but also for established projects. The current paper provides an insight into the ethos of different projects and their reason for adopting or rejecting particular contributor agreements. It further examines the exact set-up of the contributor agreements used and concludes that smart drafting can blur the difference between CAAs and CLAs to a considerable extent, manoeuvring them into a legal grey area. To avoid costly litigation to test the legal enforceability of individual clauses, this paper proposes the establishment of an international committee comprised of developers, product managers and lawyers interested in finding a common terminology that may serve as a foundation for every contributor agreement.