72 resultados para Reinforcement Learning,resource-constrained devices,iOS devices,on-device machine learning
Resumo:
The paper deals with batch scheduling problems in process industries where final products arise from several successive chemical or physical transformations of raw materials using multi–purpose equipment. In batch production mode, the total requirements of intermediate and final products are partitioned into batches. The production start of a batch at a given level requires the availability of all input products. We consider the problem of scheduling the production of given batches such that the makespan is minimized. Constraints like minimum and maximum time lags between successive production levels, sequence–dependent facility setup times, finite intermediate storages, production breaks, and time–varying manpower contribute to the complexity of this problem. We propose a new solution approach using models and methods of resource–constrained project scheduling, which (approximately) solves problems of industrial size within a reasonable amount of time.
Resumo:
To compare the prediction of hip fracture risk of several bone ultrasounds (QUS), 7062 Swiss women > or =70 years of age were measured with three QUSs (two of the heel, one of the phalanges). Heel QUSs were both predictive of hip fracture risk, whereas the phalanges QUS was not. INTRODUCTION: As the number of hip fracture is expected to increase during these next decades, it is important to develop strategies to detect subjects at risk. Quantitative bone ultrasound (QUS), an ionizing radiation-free method, which is transportable, could be interesting for this purpose. MATERIALS AND METHODS: The Swiss Evaluation of the Methods of Measurement of Osteoporotic Fracture Risk (SEMOF) study is a multicenter cohort study, which compared three QUSs for the assessment of hip fracture risk in a sample of 7609 elderly ambulatory women > or =70 years of age. Two QUSs measured the heel (Achilles+; GE-Lunar and Sahara; Hologic), and one measured the heel (DBM Sonic 1200; IGEA). The Cox proportional hazards regression was used to estimate the hazard of the first hip fracture, adjusted for age, BMI, and center, and the area under the ROC curves were calculated to compare the devices and their parameters. RESULTS: From the 7609 women who were included in the study, 7062 women 75.2 +/- 3.1 (SD) years of age were prospectively followed for 2.9 +/- 0.8 years. Eighty women reported a hip fracture. A decrease by 1 SD of the QUS variables corresponded to an increase of the hip fracture risk from 2.3 (95% CI, 1.7, 3.1) to 2.6 (95% CI, 1.9, 3.4) for the three variables of Achilles+ and from 2.2 (95% CI, 1.7, 3.0) to 2.4 (95% CI, 1.8, 3.2) for the three variables of Sahara. Risk gradients did not differ significantly among the variables of the two heel QUS devices. On the other hand, the phalanges QUS (DBM Sonic 1200) was not predictive of hip fracture risk, with an adjusted hazard risk of 1.2 (95% CI, 0.9, 1.5), even after reanalysis of the digitalized data and using different cut-off levels (1700 or 1570 m/s). CONCLUSIONS: In this elderly women population, heel QUS devices were both predictive of hip fracture risk, whereas the phalanges QUS device was not.
Resumo:
INTRODUCTION In this in-vitro study, we aimed to investigate the predictability of the expected amount of stripping using 3 common stripping devices on premolars. METHODS One hundred eighty extracted premolars were mounted and aligned in silicone. Tooth mobility was tested with Periotest (Medizintechnik Gulden, Modautal, Germany) (8.3 ± 2.8 units). The selected methods for interproximal enamel reduction were hand-pulled strips (Horico, Hapf Ringleb & Company, Berlin, Germany), oscillating segmental disks (O-drive-OD 30; KaVo Dental, Biberach, Germany), and motor-driven abrasive strips (Orthofile; SDC Switzerland, Lugano-Grancia, Switzerland). With each device, the operator intended to strip 0.1, 0.2, 0.3, or 0.4 mm on the mesial side of 15 teeth. The teeth were scanned before and after stripping with a 3-dimensional laser scanner. Superposition and measurement of stripped enamel on the most mesial point of the tooth were conducted with Viewbox software (dHal Software, Kifissia, Greece). The Wilcoxon signed rank test and the Kruskal-Wallis test were applied; statistical significance was set at alpha ≤ 0.05. RESULTS Large variations between the intended and the actual amounts of stripped enamel, and between stripping procedures, were observed. Significant differences were found at 0.1 mm of intended stripping (P ≤ 0.05) for the hand-pulled method and at 0.4 mm of intended stripping (P ≤ 0.001 to P = 0.05) for all methods. For all scenarios of enamel reduction, the actual amount of stripping was less than the predetermined and expected amount of stripping. The Kruskal-Wallis analysis showed no significant differences between the 3 methods. CONCLUSIONS There were variations in the stripped amounts of enamel, and the stripping technique did not appear to be a significant predictor of the actual amount of enamel reduction. In most cases, actual stripping was less than the intended amount of enamel reduction.
Resumo:
BACKGROUND Efficiently performed basic life support (BLS) after cardiac arrest is proven to be effective. However, cardiopulmonary resuscitation (CPR) is strenuous and rescuers' performance declines rapidly over time. Audio-visual feedback devices reporting CPR quality may prevent this decline. We aimed to investigate the effect of various CPR feedback devices on CPR quality. METHODS In this open, prospective, randomised, controlled trial we compared three CPR feedback devices (PocketCPR, CPRmeter, iPhone app PocketCPR) with standard BLS without feedback in a simulated scenario. 240 trained medical students performed single rescuer BLS on a manikin for 8min. Effective compression (compressions with correct depth, pressure point and sufficient decompression) as well as compression rate, flow time fraction and ventilation parameters were compared between the four groups. RESULTS Study participants using the PocketCPR performed 17±19% effective compressions compared to 32±28% with CPRmeter, 25±27% with the iPhone app PocketCPR, and 35±30% applying standard BLS (PocketCPR vs. CPRmeter p=0.007, PocketCPR vs. standard BLS p=0.001, others: ns). PocketCPR and CPRmeter prevented a decline in effective compression over time, but overall performance in the PocketCPR group was considerably inferior to standard BLS. Compression depth and rate were within the range recommended in the guidelines in all groups. CONCLUSION While we found differences between the investigated CPR feedback devices, overall BLS quality was suboptimal in all groups. Surprisingly, effective compression was not improved by any CPR feedback device compared to standard BLS. All feedback devices caused substantial delay in starting CPR, which may worsen outcome.
Resumo:
Learning by reinforcement is important in shaping animal behavior, and in particular in behavioral decision making. Such decision making is likely to involve the integration of many synaptic events in space and time. However, using a single reinforcement signal to modulate synaptic plasticity, as suggested in classical reinforcement learning algorithms, a twofold problem arises. Different synapses will have contributed differently to the behavioral decision, and even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike-time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward, but also by a population feedback signal. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference (TD) based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task, the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second task involves an action sequence which is itself extended in time and reward is only delivered at the last action, as it is the case in any type of board-game. The third task is the inspection game that has been studied in neuroeconomics, where an inspector tries to prevent a worker from shirking. Applying our algorithm to this game yields a learning behavior which is consistent with behavioral data from humans and monkeys, revealing themselves properties of a mixed Nash equilibrium. The examples show that our neuronal implementation of reward based learning copes with delayed and stochastic reward delivery, and also with the learning of mixed strategies in two-opponent games.
Resumo:
Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So in using a single reinforcement signal to modulate synaptic plasticity a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time and reward is only delivered at the last action, as is the case in any type of board-game. The third is the inspection game that has been studied in neuroeconomics. It only has a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.