9 results for Credit events correlation

in BORIS: Bern Open Repository and Information System - Bern - Switzerland


Relevance:

40.00%

Publisher:

Abstract:

Measurements of spin correlation in top quark pair production are presented using data collected with the ATLAS detector at the LHC with proton-proton collisions at a center-of-mass energy of 7 TeV, corresponding to an integrated luminosity of 4.6 fb⁻¹. Events are selected in final states with two charged leptons and at least two jets and in final states with one charged lepton and at least four jets. Four different observables sensitive to different properties of the top quark pair production mechanism are used to extract the correlation between the top and antitop quark spins. Some of these observables are measured for the first time. The measurements are in good agreement with the Standard Model prediction at next-to-leading-order accuracy.

Relevance:

30.00%

Publisher:

Abstract:

Learning by reinforcement is important in shaping animal behavior, and in particular in behavioral decision making. Such decision making is likely to involve the integration of many synaptic events in space and time. However, when a single reinforcement signal is used to modulate synaptic plasticity, as suggested in classical reinforcement learning algorithms, a twofold problem arises. Different synapses will have contributed differently to the behavioral decision, and even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike-time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward, but also by a population feedback signal. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference (TD) based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task, the reward is delayed beyond the last action, with unrelated stimuli and actions appearing in between. The second task involves an action sequence which is itself extended in time, and reward is only delivered at the last action, as is the case in any type of board game. The third task is the inspection game that has been studied in neuroeconomics, where an inspector tries to prevent a worker from shirking. Applying our algorithm to this game yields a learning behavior which is consistent with behavioral data from humans and monkeys, which themselves reveal properties of a mixed Nash equilibrium. The examples show that our neuronal implementation of reward-based learning copes with delayed and stochastic reward delivery, and also with the learning of mixed strategies in two-opponent games.
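
The role of a synaptic eligibility trace in temporal credit assignment can be illustrated with a minimal sketch. This is not the plasticity rule from the abstract: it is a generic reward-modulated update for a single synapse, and the function name `weight_change`, the exponential decay, and all parameter values are assumptions chosen for illustration.

```python
import math


def weight_change(event_steps, reward_step, reward, n_steps,
                  dt=1.0, tau_e=20.0, lr=0.1):
    """Weight change of one synapse over one trial (illustrative only).

    event_steps: set of time steps at which the synapse was active
                 (hypothetical stand-in for pre/post coincidences).
    reward_step: time step at which the scalar reward arrives.
    tau_e:       assumed decay time constant of the eligibility trace.
    """
    decay = math.exp(-dt / tau_e)
    trace = 0.0
    dw = 0.0
    for step in range(n_steps):
        trace *= decay                 # the trace fades between events
        if step in event_steps:
            trace += 1.0               # synaptic activity tags the synapse
        if step == reward_step:
            dw += lr * reward * trace  # delayed reward reads out the trace
    return dw


early = weight_change({0}, reward_step=10, reward=1.0, n_steps=11)
late = weight_change({8}, reward_step=10, reward=1.0, n_steps=11)
print(early, late)
```

Under these assumptions, activity closer in time to the reward receives more credit than earlier activity, and a synapse that was never active receives none.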

Relevance:

30.00%

Publisher:

Abstract:

Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So in using a single reinforcement signal to modulate synaptic plasticity a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time and reward is only delivered at the last action, as is the case in any type of board-game. The third is the inspection game that has been studied in neuroeconomics. It only has a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.

Relevance:

30.00%

Publisher:

Abstract:

We present a model for plasticity induction in reinforcement learning which is based on a cascade of synaptic memory traces. In the cascade of these so-called eligibility traces, presynaptic input is first correlated with postsynaptic events, next with the behavioral decisions and finally with the external reinforcement. A population of leaky integrate-and-fire neurons endowed with this plasticity scheme is studied by simulation on different tasks. For operant conditioning with delayed reinforcement, learning succeeds even when the delay is so large that the delivered reward reflects the appropriateness, not of the immediately preceding response, but of a decision made earlier on in the stimulus-decision sequence. So the proposed model does not rely on the temporal contiguity between decision and pertinent reward and thus provides a viable means of addressing the temporal credit assignment problem. In the same task, learning speeds up with increasing population size, showing that the plasticity cascade simultaneously addresses the spatial problem of assigning credit to the different population neurons. Simulations on other tasks such as sequential decision making serve to highlight the robustness of the proposed scheme and, further, contrast its performance to that of temporal difference based approaches to reinforcement learning.

Relevance:

30.00%

Publisher:

Abstract:

In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.
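
The three-stage ordering described in the abstract (presynaptic input correlated with postsynaptic events, then with the decision, then with reinforcement) can be sketched as two chained traces followed by a reward readout. This is only an illustration of the idea of a cascade, not the model from the paper: the function name `cascade_weight_change`, the exponential decays, the per-step input format, and the parameter values are all assumptions.

```python
import math


def cascade_weight_change(steps, dt=1.0, tau1=10.0, tau2=50.0, lr=0.1):
    """Weight change of one synapse driven by a two-trace cascade (illustrative).

    steps: iterable of (coincidence, decision, reward) tuples, one per
           time step; all three are hypothetical scalar signals.
    tau1:  assumed time constant of the fast synaptic trace.
    tau2:  assumed time constant of the slower decision-gated trace.
    """
    d1 = math.exp(-dt / tau1)
    d2 = math.exp(-dt / tau2)
    e1 = 0.0  # stage 1: presynaptic input correlated with postsynaptic events
    e2 = 0.0  # stage 2: stage-1 trace correlated with the behavioral decision
    dw = 0.0
    for coincidence, decision, reward in steps:
        e1 = e1 * d1 + coincidence
        e2 = e2 * d2 + decision * e1
        dw += lr * reward * e2  # stage 3: correlation with reinforcement
    return dw


# Synaptic event, then a decision, then a reward, on consecutive steps.
trial = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(cascade_weight_change(trial))
```

In this sketch a reward changes the weight only if a decision occurred while the first trace was still nonzero, which is one simple way to express that credit passes through each stage in turn.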

Relevance:

30.00%

Publisher:

Abstract:

Upper Jurassic (Kimmeridgian)–Upper Cretaceous (Cenomanian) inner platform carbonates in the Western Taurides are composed of metre-scale upward-shallowing cyclic deposits (parasequences) and important karstic surfaces capping some of the cycles. Peritidal cycles (shallow subtidal facies capped by tidal-flat laminites or fenestrate limestones) are regressive- and transgressive-prone (upward-deepening followed by upward-shallowing facies trends). Subtidal cycles are of two types and indicate incomplete shallowing. Submerged subtidal cycles are composed of deeper subtidal facies overlain by shallow subtidal facies. Exposed subtidal cycles consist of deeper subtidal facies overlain by shallow subtidal facies that are capped by features indicative of prolonged subaerial exposure. Subtidal facies occur characteristically in the Jurassic, while peritidal cycles are typical for the Lower Cretaceous of the region. Within the foraminiferal and dasyclad algal biostratigraphic framework, four karst breccia levels are recognized as the boundaries of major second-order cycles, introduced for the first time in this study. These levels correspond to the Kimmeridgian–Portlandian boundary, mid-Early Valanginian, mid-Early Aptian and mid-Cenomanian and represent important sea level falls which affected the distribution of foraminiferal fauna and dasyclad flora of the Taurus carbonate platform. Within the Kimmeridgian–Cenomanian interval 26 third-order sequences (types 1 and 2) are recognized. These sequences are the records of eustatic sea level fluctuations rather than the records of local tectonic events because the boundaries of the sequences representing 1–4 Ma intervals are correlative with global sea level falls. Third-order sequences and metre-scale cyclic deposits are the major units used for long-distance, high-resolution sequence stratigraphic correlation in the Western Taurides. Metre-scale cyclic deposits (parasequences) in the Cretaceous show genetical stacking patterns within third-order sequences and correspond to fourth-order sequences representing 100–200 ka. These cycles are possibly the E2 signal (126 ka) of the orbital eccentricity cycles of the Milankovitch band. The slight deviation of values, calculated for parasequences, from the mean value of eccentricity cycles can be explained by the currently imprecise geochronology established in the Cretaceous and missed sea level oscillations when the platform lay above fluctuating sea level.

Relevance:

30.00%

Publisher:

Abstract:

Correct predictions of future blood glucose levels in individuals with Type 1 Diabetes (T1D) can be used to provide early warning of upcoming hypo-/hyperglycemic events and thus to improve the patient's safety. To increase prediction accuracy and efficiency, various approaches have been proposed which combine multiple predictors to produce superior results compared to single predictors. Three methods for model fusion are presented and comparatively assessed. Data from 23 T1D subjects under sensor-augmented pump (SAP) therapy were used in two adaptive data-driven models (an autoregressive model with output correction, cARX, and a recurrent neural network, RNN). Data fusion techniques based on i) Dempster-Shafer Evidential Theory (DST), ii) Genetic Algorithms (GA), and iii) Genetic Programming (GP) were used to merge the complementary performances of the prediction models. The fused output is used in a warning algorithm to issue alarms of upcoming hypo-/hyperglycemic events. The fusion schemes showed improved performance with lower root mean square errors, lower time lags, and higher correlation. In the warning algorithm, median daily false alarms (DFA) of 0.25%, and 100% correct alarms (CA) were obtained for both event types. The detection times (DT) before occurrence of events were 13.0 and 12.1 min respectively for hypo-/hyperglycemic events. Compared to the cARX and RNN models, and a linear fusion of the two, the proposed fusion schemes represent a significant improvement.
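
The abstract names root mean square error as one evaluation metric and a linear fusion of the two predictors as a baseline. A minimal sketch of both follows. The function names, the fixed fusion weight, and the glucose values are illustrative assumptions, not data or code from the study, and the DST, GA and GP fusion schemes are not shown.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def linear_fusion(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two predictors (weight_a is an assumed constant)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Made-up glucose values in mg/dL, for illustration only.
reference = [100.0, 110.0, 120.0]
model_a = [104.0, 114.0, 124.0]  # hypothetical predictor that reads high
model_b = [96.0, 106.0, 116.0]   # hypothetical predictor that reads low
fused = linear_fusion(model_a, model_b)

print(rmse(model_a, reference), rmse(model_b, reference), rmse(fused, reference))
```

With these made-up values the two predictors err in opposite directions, so their average has a lower error than either one. That is the intuition behind merging complementary predictors. Real predictors would rarely cancel this cleanly.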

Relevance:

30.00%

Publisher:

Abstract:

Distributions sensitive to the underlying event in QCD jet events have been measured with the ATLAS detector at the LHC, based on 37 pb⁻¹ of proton–proton collision data collected at a centre-of-mass energy of 7 TeV. Charged-particle mean pT and densities of all-particle ET and charged-particle multiplicity and pT have been measured in regions azimuthally transverse to the hardest jet in each event. These are presented both as one-dimensional distributions and with their mean values as functions of the leading-jet transverse momentum from 20 to 800 GeV. The correlation of charged-particle mean pT with charged-particle multiplicity is also studied, and the ET densities include the forward rapidity region; these features provide extra data constraints for Monte Carlo modelling of colour reconnection and beam-remnant effects respectively. For the first time, underlying event observables have been computed separately for inclusive jet and exclusive dijet event selections, allowing more detailed study of the interplay of multiple partonic scattering and QCD radiation contributions to the underlying event. Comparisons to the predictions of different Monte Carlo models show a need for further model tuning, but the standard approach is found to generally reproduce the features of the underlying event in both types of event selection.