807 results for Bandit problems


Relevance:

60.00%

Abstract:

We show that if performance measures in a stochastic scheduling problem satisfy a set of so-called partial conservation laws (PCL), which extend previously studied generalized conservation laws (GCL), then the problem is solved optimally by a priority-index policy for an appropriate range of linear performance objectives, where the optimal indices are computed by a one-pass adaptive-greedy algorithm based on Klimov's. We further apply this framework to investigate the indexability property of restless bandits introduced by Whittle, obtaining the following results: (1) we identify a class of restless bandits (PCL-indexable) which are indexable; membership in this class is tested through a single run of the adaptive-greedy algorithm, which also computes the Whittle indices when the test is positive; this provides a tractable sufficient condition for indexability; (2) we further identify the class of GCL-indexable bandits, which includes classical bandits, having the property that they are indexable under any linear reward objective. The analysis is based on the so-called achievable region method, as the results follow from new linear programming formulations for the problems investigated.

Relevance:

60.00%

Abstract:

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of n (where n is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
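As a rough illustration of how a priority-index policy like the one in this abstract operates once indices are available, the sketch below activates, at each decision epoch, the m bandits whose current states carry the largest index. The function name `select_active`, the table layout, and the numeric index values are all hypothetical; the abstract derives its indices from optimal dual variables of the first-order relaxation, which this sketch does not compute.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical index table: index[(bandit, state)] -> priority value.
# The numbers used below are made up for illustration only; they are NOT
# derived from any LP relaxation.
IndexTable = Dict[Tuple[int, int], float]


def select_active(states: Sequence[int], index: IndexTable, m: int) -> List[int]:
    """Return (sorted) the m bandits whose current states have the largest indices."""
    ranked = sorted(
        range(len(states)),
        key=lambda k: index[(k, states[k])],
        reverse=True,
    )
    return sorted(ranked[:m])


# Three two-state bandits with illustrative index values.
example_index: IndexTable = {
    (0, 0): 0.2, (0, 1): 0.9,
    (1, 0): 0.5, (1, 1): 0.1,
    (2, 0): 0.7, (2, 1): 0.4,
}

# Bandits 0, 1, 2 are currently in states 1, 0, 1; activate two of them.
chosen = select_active(states=[1, 0, 1], index=example_index, m=2)
```

The policy itself is only a lookup and a sort per epoch; in this sketch all of the modelling effort is assumed to have gone into producing the index table beforehand.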

Relevance:

60.00%

Abstract:

The thesis comprises three essays in applied microeconomics. Using models of learning and network externalities, it studies the behavior of economic agents in different situations. The first essay examines the use of natural resources under uncertainty and learning. Several authors have addressed the subject, but here we study a learning model in which the agents who consume the resource do not hold the same prior beliefs. The second essay addresses the generic problem faced, for example, by a research fund wishing to choose the best among several researchers of different generations and levels of experience. The third essay studies a particular model of business organization known as multi-level marketing. The first chapter is entitled "Renewable Resource Consumption in a Learning Environment with Heterogeneous beliefs". We use a learning model with heterogeneous beliefs to study the exploitation of a natural resource under uncertainty. Two types of learning must be distinguished here: adaptive learning and learning proper. These two terms are borrowed from Koulovatianos et al. (2009). We show that, compared with adaptive learning, learning has a negative impact on total consumption by all users of the resource. Individually, however, some users may consume more of the resource under learning than under adaptive learning. Indeed, under learning, consumers face two types of incentives not to consume the resource (and therefore to invest): the own incentive, which always has a negative effect on consumption of the resource, and the heterogeneous incentive, whose effect may be positive or negative.
The overall effect of learning on individual consumption therefore depends on the sign and magnitude of the heterogeneous incentive. Moreover, using the absolute and relative variations in consumption following a change in beliefs, it emerges that users tend to converge toward a common decision. The second chapter is entitled "A Perpetual Search for Talent across Overlapping Generations". Using a dynamic overlapping-generations model, we study how a research fund should proceed to select the best researchers to finance. Researchers do not have the same "seniority" in research activity. For an optimal decision, the research fund must rely on both the seniority and the past work of researchers who have applied for a research grant. It should be more favorable to young researchers with regard to the requirements to be met in order to be funded. This work is also a contribution to the analysis of bandit problems. Here, instead of attempting to compute an index, we propose ranking and progressively eliminating researchers by comparing them pairwise. The third chapter is entitled "Paradox about the Multi-Level Marketing (MLM)". For several decades, a particular form of company has become increasingly common in which the product is marketed through distributors. Each distributor can sell the product and/or recruit other distributors for the company. A distributor earns profits on his own sales and also receives commissions on the sales of the distributors he has recruited. This is multi-level marketing (MLM). The structure of this type of company is often described by critics as a pyramid scheme, a scam, and therefore unsustainable.
But the promoters of multi-level marketing reject these allegations, arguing that the purpose of MLMs is to sell, not to recruit. The payoffs and rules of the game are such that distributors have more incentive to sell the product than to recruit. However, if this argument is valid, a paradox arises. Why would a distributor who truly wants to sell the product and make a profit recruit other individuals who will operate in the same market as he does? How can one understand an agent recruiting people who could become his competitors, when it is well established that every entrepreneur avoids and even fights competition? This chapter addresses this type of question. To explain the paradox, we use the intrinsic structure of MLM organizations. In reality, to be able to sell well, the distributor must recruit. The commissions earned through recruitment confer selling power, in the sense that they enable the recruiter to offer a competitive price for the product he wishes to sell. Moreover, MLMs have a structure similar to that of multi-sided markets in the sense of Rochet and Tirole (2003, 2006) and Weyl (2010). Recruitment has an external effect on sales and sales have an external effect on recruitment, and all of this is managed by the promoter of the organization. Thus, if the promoter does not take these externalities into account when setting the various commissions, agents may turn more or less toward recruitment.

Relevance:

40.00%

Abstract:

We present a polyhedral framework for establishing general structural properties on optimal solutions of stochastic scheduling problems, where multiple job classes vie for service resources: the existence of an optimal priority policy in a given family, characterized by a greedoid (whose feasible class subsets may receive higher priority), where optimal priorities are determined by class-ranking indices, under restricted linear performance objectives (partial indexability). This framework extends that of Bertsimas and Niño-Mora (1996), which explained the optimality of priority-index policies under all linear objectives (general indexability). We show that, if performance measures satisfy partial conservation laws (with respect to the greedoid), which extend previous generalized conservation laws, then the problem admits a strong LP relaxation over a so-called extended greedoid polytope, which has strong structural and algorithmic properties. We present an adaptive-greedy algorithm (which extends Klimov's) taking as input the linear objective coefficients, which (1) determines whether the optimal LP solution is achievable by a policy in the given family; and (2) if so, computes a set of class-ranking indices that characterize optimal priority policies in the family. In the special case of project scheduling, we show that, under additional conditions, the optimal indices can be computed separately for each project (index decomposition). We further apply the framework to the important restless bandit model (two-action Markov decision chains), obtaining new index policies, that extend Whittle's (1988), and simple sufficient conditions for their validity. These results highlight the power of polyhedral methods (the so-called achievable region approach) in dynamic and stochastic optimization.

Relevance:

40.00%

Abstract:

We present a polyhedral framework for establishing general structural properties on optimal solutions of stochastic scheduling problems, where multiple job classes vie for service resources: the existence of an optimal priority policy in a given family, characterized by a greedoid (whose feasible class subsets may receive higher priority), where optimal priorities are determined by class-ranking indices, under restricted linear performance objectives (partial indexability). This framework extends that of Bertsimas and Niño-Mora (1996), which explained the optimality of priority-index policies under all linear objectives (general indexability). We show that, if performance measures satisfy partial conservation laws (with respect to the greedoid), which extend previous generalized conservation laws, then the problem admits a strong LP relaxation over a so-called extended greedoid polytope, which has strong structural and algorithmic properties. We present an adaptive-greedy algorithm (which extends Klimov's) taking as input the linear objective coefficients, which (1) determines whether the optimal LP solution is achievable by a policy in the given family; and (2) if so, computes a set of class-ranking indices that characterize optimal priority policies in the family. In the special case of project scheduling, we show that, under additional conditions, the optimal indices can be computed separately for each project (index decomposition). We further apply the framework to the important restless bandit model (two-action Markov decision chains), obtaining new index policies, that extend Whittle's (1988), and simple sufficient conditions for their validity. These results highlight the power of polyhedral methods (the so-called achievable region approach) in dynamic and stochastic optimization.

Relevance:

20.00%

Abstract:

Substantial complexity has been introduced into treatment regimens for patients with human immunodeficiency virus (HIV) infection. Many drug-related problems (DRPs) are detected in these patients, such as low adherence, therapeutic inefficacy, and safety issues. We evaluated the impact of pharmacist interventions on CD4+ T-lymphocyte count, HIV viral load, and DRPs in patients with HIV infection. In this 18-month prospective controlled study, 90 outpatients were selected by convenience sampling from the Hospital Dia-University of Campinas Teaching Hospital (Brazil). Forty-five patients comprised the pharmacist intervention group and 45 the control group; all patients had HIV infection with or without acquired immunodeficiency syndrome. Pharmaceutical appointments were conducted based on the Pharmacotherapy Workup method, although DRPs and pharmacist intervention classifications were modified for applicability to institutional service limitations and research requirements. Pharmacist interventions were performed immediately after detection of DRPs. The main outcome measures were DRPs, CD4+ T-lymphocyte count, and HIV viral load. After pharmacist intervention, DRPs decreased from 5.2 (95% confidence interval [CI] = 4.1-6.2) to 4.2 (95% CI = 3.3-5.1) per patient (P = 0.043). A total of 122 pharmacist interventions were proposed, with an average of 2.7 interventions per patient. All the pharmacist interventions were accepted by physicians, and among patients, the interventions were well accepted during the appointments, but compliance with the interventions was not measured. A statistically significant increase in CD4+ T-lymphocyte count in the intervention group was found (260.7 cells/mm³ [95% CI = 175.8-345.6] to 312.0 cells/mm³ [95% CI = 23.5-40.6], P = 0.015), which was not observed in the control group. There was no statistical difference between the groups regarding HIV viral load. 
This study suggests that pharmacist interventions in patients with HIV infection can cause an increase in CD4+ T-lymphocyte counts and a decrease in DRPs, demonstrating the importance of an optimal pharmaceutical care plan.

Relevance:

20.00%

Abstract:

In this study, transmission-line modeling (TLM) applied to bio-thermal problems was improved by incorporating several novel computational techniques. These include graded meshes, which ran 9 times faster and used only a fraction (16%) of the computational resources required by regular meshes when analyzing heat flow through heterogeneous media. Graded meshes, unlike regular meshes, allow heat sources to be modeled in all segments of the mesh. A new boundary condition that accounts for thermal properties, and thus yields more realistic modeling of complex problems, is introduced. A new way of calculating an error parameter is also introduced. The calculated temperatures between nodes were compared against results from the literature and agreed to within 1%. It is therefore reasonable to conclude that the improved TLM model described herein has great potential for modeling heat transfer in biological systems.

Relevance:

20.00%

Abstract:

The aim of this study is to identify the prevalence of access to information about how to prevent oral problems among schoolchildren in the public school network, as well as the factors associated with such access. This is a cross-sectional and analytical study conducted among 12-year-old schoolchildren in a Brazilian municipality with a large population. The examinations were performed by 24 trained and calibrated dentists with the aid of 24 recorders. Data collection occurred in 36 public schools selected from the 89 public schools of the city. Descriptive, univariate and multiple analyses were conducted. Of the 2510 schoolchildren included in the study, 2211 reported having received information about how to prevent oral problems. Access to such information was greater among those who used private dental services, and lower among those who used the service for treatment, who evaluated the service as regular or bad/awful, who used a toothbrush only or a toothbrush and tongue scrubbing as a means of oral hygiene, and who reported not being satisfied with the appearance of their teeth. The conclusion drawn is that the majority of schoolchildren had access to information about how to prevent oral problems, though access was associated with the characteristics of health services, health behavior and outcomes.

Relevance:

20.00%

Abstract:

Efforts made by the scientific community in recent years to develop numerous green chemical processes and wastewater treatment technologies are presented and discussed. In the light of these approaches, environmentally friendly technologies, as well as the key role played by the well-known advanced oxidation processes, are discussed, with special attention to those involving ozone applications. Fundamental and applied aspects of ozone technology and its application are also presented.

Relevance:

20.00%

Abstract:

PURPOSE: To compare parents' reports of youth problems (PRYP) with adolescent problems self-reports (APSR) before and after behavioral treatment of nocturnal enuresis (NE) based on the use of a urine alarm. MATERIALS AND METHODS: Adolescents (N = 19) with mono-symptomatic (primary or secondary) nocturnal enuresis received group treatment for 40 weeks. The discharge criterion was established as 8 weeks of consecutive dry nights. PRYP and APSR were scored by the Child Behavior Checklist (CBCL) and Youth Self-Report (YSR). RESULTS: Pre-treatment data: 1) Higher number of clinical cases based on parent report than on self-report for Internalizing Problems (IP) (13/19 vs. 4/19), Externalizing Problems (EP) (7/19 vs. 5/19) and Total Problems (TP) (11/19 vs. 5/19); 2) Mean PRYP scores for IP (60.8) and TP (61) were within the deviant range (T score ≥ 60), while mean PRYP scores for EP (57.4) and mean APSR scores (IP = 52.4, EP = 49.5, TP = 52.4) were within the normal range. The difference between PRYP and APSR scores was significant. Post-treatment data: 1) Discharge for the majority of the participants (16/19); 2) Reduction in the number of clinical cases on parental evaluation: 9/19 adolescents remained within the clinical range for IP, 2/19 for EP, and 7/19 for TP; 3) All post-treatment mean scores were within the normal range; the difference between pre- and post-treatment evaluation scores was significant for PRYP. CONCLUSIONS: Behavioral treatment based on the use of a urine alarm is effective for adolescents with mono-symptomatic (primary and secondary) nocturnal enuresis. The study favors the hypothesis that enuresis is a cause, not a consequence, of other behavioral problems.

Relevance:

20.00%

Abstract:

This work develops a method for solving ordinary differential equations, that is, initial-value problems, with solutions approximated using Legendre polynomials. An iterative procedure for adjusting the polynomial coefficients is developed, based on the genetic algorithm. This procedure is applied to several examples, comparing its results with the best polynomial fit when numerical solutions by the traditional Runge-Kutta or Adams methods are available. The resulting algorithm provides reliable solutions even if the numerical solutions are not available, that is, when the mass matrix is singular or the equation produces unstable running processes.

Relevance:

20.00%

Abstract:

We investigate the performance of a variant of Axelrod's model for dissemination of culture, the Adaptive Culture Heuristic (ACH), on solving an NP-Complete optimization problem, namely, the classification of binary input patterns of size F by a Boolean Binary Perceptron. In this heuristic, N agents, characterized by binary strings of length F which represent possible solutions to the optimization problem, are fixed at the sites of a square lattice and interact with their nearest neighbors only. The interactions are such that the agents' strings (or cultures) become more similar to the low-cost strings of their neighbors, resulting in the dissemination of these strings across the lattice. Eventually the dynamics freezes into a homogeneous absorbing configuration in which all agents exhibit identical solutions to the optimization problem. We find through extensive simulations that the probability of finding the optimal solution is a function of the reduced variable F/N^(1/4), so that the number of agents must increase with the fourth power of the problem size, N proportional to F^4, to guarantee a fixed probability of success. In this case, we find that the relaxation time to reach an absorbing configuration scales with F^6, which can be interpreted as the overall computational cost of the ACH to find an optimal set of weights for a Boolean binary perceptron, given a fixed probability of success.

Relevance:

20.00%

Abstract:

Given a prime power q, define c(q) as the minimum cardinality of a subset H of F_q^3 which satisfies the following property: every vector in this space differs in at most 1 coordinate from a multiple of a vector in H. In this work, we introduce two extremal problems in combinatorial number theory aiming to discuss a known connection between the corresponding coverings and sum-free sets. Also, we provide several bounds on these maps which yield new classes of coverings, improving the previous upper bound on c(q).
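To make the definition of c(q) concrete, the following brute-force sketch searches for the smallest covering set H when q is a small prime. It is an illustration only, not the method of the paper: the function names `covers` and `c` are hypothetical, arithmetic is done modulo q (so prime powers that are not prime are out of scope), and it assumes that "a multiple of a vector" allows every scalar in the field, zero included.

```python
from itertools import combinations, product


def hamming(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def covers(H, q):
    """True if every vector of the 3-dimensional space over the integers mod q
    differs in at most one coordinate from some scalar multiple of a vector in H.
    Assumption: scalars range over all residues 0..q-1, and q is prime."""
    multiples = {tuple((lam * x) % q for x in h) for h in H for lam in range(q)}
    return all(
        any(hamming(v, w) <= 1 for w in multiples)
        for v in product(range(q), repeat=3)
    )


def c(q):
    """Smallest size of a covering set, found by exhaustive search.
    Only practical for very small prime q."""
    nonzero = [v for v in product(range(q), repeat=3) if any(v)]
    for size in range(1, len(nonzero) + 1):
        for H in combinations(nonzero, size):
            if covers(H, q):
                return size


# For q = 2 the single vector (1, 1, 1) suffices: vectors of weight at most 1
# are close to 0 * (1, 1, 1), and those of weight at least 2 are close to (1, 1, 1).
smallest_for_2 = c(2)
```

The search grows combinatorially with q, which is one reason bounds of the kind mentioned in the abstract are of interest.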

Relevance:

20.00%

Abstract:

A model where agents show discrete behavior regarding their actions, but have continuous opinions that are updated by interacting with other agents is presented. This new updating rule is applied to both the voter and Sznajd models for interaction between neighbors, and its consequences are discussed. The appearance of extremists is naturally observed and it seems to be a characteristic of this model.

Relevance:

20.00%

Abstract:

Objectives: Main Objective: to identify ethical problems in primary care according to nurses' and doctors' perceptions. Secondary Objective: to understand ethical issues of patient-professional relationships in primary care. Design: Synthesis to integrate and reinterpret primary results of qualitative studies. Setting: Primary healthcare centers, Sao Paulo, SP, Brazil. Participants and/or context: Incidental sample of 34 nurses and 36 medical doctors working in primary healthcare centers, selected by convenience. Methods: Individual, semi-structured interviews to identify situations considered as sources of ethical problems. The sample is socially representative of primary care health centers and professionals. Data collection assured discourse saturation. Hermeneutic-dialectical discourse analysis was used to study the results. Results: Patient-professional relationships and team work were the main sources of ethical problems. The most important problems were patient information, privacy, confidentiality, interpersonal relationship, linkage and patient autonomy. These issues reflect the recent changes in clinical relationships and show the peculiarities of primary care, with its continuous care lasting a long time. Healthcare involves multiprofessional team work in the midst of patients' claims for autonomy. Good patient care requires a relationship based on communication and cooperation, and includes feelings and values, with communication skills. Conclusions: Ethical problems in primary care are common situations. For quality and humane primary care the relationship should consist of dialogue, trust and cooperation. (C) 2009 Elsevier Espana, S.L. All rights reserved.