974 resultados para DDAP Dock Door Assignment Problem


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is likely to involve the integration of many synaptic events in space and time. So in using a single reinforcement signal to modulate synaptic plasticity a twofold problem arises. Different synapses will have contributed differently to the behavioral decision and, even for one and the same synapse, releases at different times may have had different effects. Here we present a plasticity rule which solves this spatio-temporal credit assignment problem in a population of spiking neurons. The learning rule is spike time dependent and maximizes the expected reward by following its stochastic gradient. Synaptic plasticity is modulated not only by the reward but by a population feedback signal as well. While this additional signal solves the spatial component of the problem, the temporal one is solved by means of synaptic eligibility traces. In contrast to temporal difference based approaches to reinforcement learning, our rule is explicit with regard to the assumed biophysical mechanisms. Neurotransmitter concentrations determine plasticity and learning occurs fully online. Further, it works even if the task to be learned is non-Markovian, i.e. when reinforcement is not determined by the current state of the system but may also depend on past events. The performance of the model is assessed by studying three non-Markovian tasks. In the first task the reward is delayed beyond the last action with non-related stimuli and actions appearing in between. The second one involves an action sequence which is itself extended in time and reward is only delivered at the last action, as is the case in any type of board-game. The third is the inspection game that has been studied in neuroeconomics. It only has a mixed Nash equilibrium and exemplifies that the model also copes with stochastic reward delivery and the learning of mixed strategies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a model for plasticity induction in reinforcement learning which is based on a cascade of synaptic memory traces. In the cascade of these so called eligibility traces presynaptic input is first corre lated with postsynaptic events, next with the behavioral decisions and finally with the external reinforcement. A population of leaky integrate and fire neurons endowed with this plasticity scheme is studied by simulation on different tasks. For operant co nditioning with delayed reinforcement, learning succeeds even when the delay is so large that the delivered reward reflects the appropriateness, not of the immediately preceeding response, but of a decision made earlier on in the stimulus - decision sequence . So the proposed model does not rely on the temporal contiguity between decision and pertinent reward and thus provides a viable means of addressing the temporal credit assignment problem. In the same task, learning speeds up with increasing population si ze, showing that the plasticity cascade simultaneously addresses the spatial problem of assigning credit to the different population neurons. Simulations on other task such as sequential decision making serve to highlight the robustness of the proposed sch eme and, further, contrast its performance to that of temporal difference based approaches to reinforcement learning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

n learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Physical distribution plays an imporant role in contemporary logistics management. Both satisfaction level of of customer and competitiveness of company can be enhanced if the distribution problem is solved optimally. The multi-depot vehicle routing problem (MDVRP) belongs to a practical logistics distribution problem, which consists of three critical issues: customer assignment, customer routing, and vehicle sequencing. According to the literatures, the solution approaches for the MDVRP are not satisfactory because some unrealistic assumptions were made on the first sub-problem of the MDVRP, ot the customer assignment problem. To refine the approaches, the focus of this paper is confined to this problem only. This paper formulates the customer assignment problem as a minimax-type integer linear programming model with the objective of minimizing the cycle time of the depots where setup times are explicitly considered. Since the model is proven to be MP-complete, a genetic algorithm is developed for solving the problem. The efficiency and effectiveness of the genetic algorithm are illustrated by a numerical example.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Purpose – This paper sets out to study a production-planning problem for printed circuit board (PCB) assembly. A PCB assembly company may have a number of assembly lines for production of several product types in large volume. Design/methodology/approach – Pure integer linear programming models are formulated for assigning the product types to assembly lines, which is the line assignment problem, with the objective of minimizing the total production cost. In this approach, unrealistic assignment, which was suffered by previous researchers, is avoided by incorporating several constraints into the model. In this paper, a genetic algorithm is developed to solve the line assignment problem. Findings – The procedure of the genetic algorithm to the problem and a numerical example for illustrating the models are provided. It is also proved that the algorithm is effective and efficient in dealing with the problem. Originality/value – This paper studies the line assignment problem arising in a PCB manufacturing company in which the production volume is high.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The aim of this paper is to explore a new approach to obtain better traffic demand (Origin-Destination, OD matrices) for dense urban networks. From reviewing existing methods, from static to dynamic OD matrix evaluation, possible deficiencies in the approach could be identified: traffic assignment details for complex urban network and lacks in dynamic approach. To improve the global process of traffic demand estimation, this paper is focussing on a new methodology to determine dynamic OD matrices for urban areas characterized by complex route choice situation and high level of traffic controls. An iterative bi-level approach will be used, the Lower level (traffic assignment) problem will determine, dynamically, the utilisation of the network by vehicles using heuristic data from mesoscopic traffic simulator and the Upper level (matrix adjustment) problem will proceed to an OD estimation using optimization Kalman filtering technique. In this way, a full dynamic and continuous estimation of the final OD matrix could be obtained. First results of the proposed approach and remarks are presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper recasts the multiple data path assignment problem solved by Torng and Wilhelm by the dynamic programming method [1] into a minimal covering problem following a switching theoretic approach. The concept of bus compatibility for the data transfers is used to obtain the various ways of interconnecting the circuit modules with the minimum number of buses that allow concurrent data transfers. These have been called the feasible solutions of the problem. The minimal cost solutions are obtained by assigning weights to the bus-compatible sets present in the feasible solutions. Minimization of the cost of the solution by increasing the number of buses is also discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper recasts the multiple data path assignment problem solved by Torng and Wilhelm by the dynamic programming method [1] into a minimal covering problem following a switching theoretic approach. The concept of bus compatibility for the data transfers is used to obtain the various ways of interconnecting the circuit modules with the minimum number of buses that allow concurrent data transfers. These have been called the feasible solutions of the problem. The minimal cost solutions are obtained by assigning weights to the bus-compatible sets present in the feasible solutions. Minimization of the cost of the solution by increasing the number of buses is also discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The controllability grammian is important in many control applications. Given a set of closed-loop eigenvalues the corresponding controllability grammian can be obtained by computing the controller which assigns the eigenvalues and then by solving the Lyapunov equation that defines the grammian. The relationship between the controllability grammian, resulting from state feedback, and the closed-loop eigenvalues of a single input linear time invariant (LTI) system is obtained. The proposed methodology does not require the computation of the controller that assigns the specified eigenvalues. The closed-loop system matrix is obtained from the knowledge of the open-loop system matrix, control influence matrix and the specified closed-loop eigenvalues. Knowing the closed-loop system matrix, the grammian is then obtained from the solution of the Lyapunov equation that defines it. Finally the proposed idea is extended to find the state covariance matrix for a specified set of closed-loop eigenvalues (without computing the controller), due to impulsive input in the disturbance channel and to solve the eigenvalue assignment problem for the single input case.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We apply the objective method of Aldous to the problem of finding the minimum-cost edge cover of the complete graph with random independent and identically distributed edge costs. The limit, as the number of vertices goes to infinity, of the expected minimum cost for this problem is known via a combinatorial approach of Hessler and Wastlund. We provide a proof of this result using the machinery of the objective method and local weak convergence, which was used to prove the (2) limit of the random assignment problem. A proof via the objective method is useful because it provides us with more information on the nature of the edge's incident on a typical root in the minimum-cost edge cover. We further show that a belief propagation algorithm converges asymptotically to the optimal solution. This can be applied in a computational linguistics problem of semantic projection. The belief propagation algorithm yields a near optimal solution with lesser complexity than the known best algorithms designed for optimality in worst-case settings.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

O surgimento de novos serviços de telecomunicações tem provocado um enorme aumento no tráfego de dados nas redes de transmissão. Para atender a essa demanda crescente, novas tecnologias foram desenvolvidas e implementadas ao longo dos anos, sendo que um dos principais avanços está na área de transmissão óptica, devido à grande capacidade de transporte de informação da fibra óptica. A tecnologia que melhor explora a capacidade desse meio de transmissão atualmente é a multiplexação por divisão de comprimento de onda ou Wavelength Division Multiplexing (WDM) que permite a transmissão de diversos sinais utilizando apenas uma fibra óptica. Redes ópticas WDM se tornaram muito complexas, com enorme capacidade de transmissão de informação (terabits por segundo), para atender à explosão de necessidade por largura de banda. Nesse contexto, é de extrema importância que os recursos dessas redes sejam utilizados de forma inteligente e otimizada. Um dos maiores desafios em uma rede óptica é a escolha de uma rota e a seleção de um comprimento de onda disponível na rede para atender uma solicitação de conexão utilizando o menor número de recursos possível. Esse problema é bastante complexo e ficou conhecido como problema de roteamento e alocação de comprimento de onda ou, simplesmente, problema RWA (Routing and Wavelentgh Assignment problem). Muitos estudos foram realizados com o objetivo de encontrar uma solução eficiente para esse problema, mas nem sempre é possível aliar bom desempenho com baixo tempo de execução, requisito fundamental em redes de telecomunicações. A técnica de algoritmo genético (AG) tem sido utilizada para encontrar soluções de problemas de otimização, como é o caso do problema RWA, e tem obtido resultados superiores quando comparada com soluções heurísticas tradicionais encontradas na literatura. Esta dissertação apresenta, resumidamente, os conceitos de redes ópticas e de algoritmos genéticos, e descreve uma formulação do problema RWA adequada à solução por algoritmo genético.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Object matching is a fundamental operation in data analysis. It typically requires the definition of a similarity measure between the classes of objects to be matched. Instead, we develop an approach which is able to perform matching by requiring a similarity measure only within each of the classes. This is achieved by maximizing the dependency between matched pairs of observations by means of the Hilbert Schmidt Independence Criterion. This problem can be cast as one of maximizing a quadratic assignment problem with special structure and we present a simple algorithm for finding a locally optimal solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

建立了极大极小任务分配问题的混合整数线性规划模型,提出一种矩阵作业解答,并与穷举解及混合整数线性规划解的计算复杂度进行了比较.理论分析和数值试验表明矩阵作业法对两类任务分配问题,极大极小和总体极小任务分配问题,有效地提供最优解.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

本文将2类方阵指派问题——极大极小和总体极小指派问题——的矩阵作业解法推广到非方阵情形,即求解任务与人员数目不等的指派问题,且维持矩阵作业法的效率.假定m>n,则按本文行优先选取算法求解m×n非方阵指派问题的最大逻辑运算量为O(mn2),其效率通常与执行一轮覆盖的矩阵作业法相当。

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Google AdSense Program is a successful internet advertisement program where Google places contextual adverts on third-party websites and shares the resulting revenue with each publisher. Advertisers have budgets and bid on ad slots while publishers set reserve prices for the ad slots on their websites. Following previous modelling efforts, we model the program as a two-sided market with advertisers on one side and publishers on the other. We show a reduction from the Generalised Assignment Problem (GAP) to the problem of computing the revenue maximising allocation and pricing of publisher slots under a first-price auction. GAP is APX-hard but a (1-1/e) approximation is known. We compute truthful and revenue-maximizing prices and allocation of ad slots to advertisers under a second-price auction. The auctioneer's revenue is within (1-1/e) second-price optimal.