This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach when using RL has been to apply value-function-based algorithms, the system detailed here is characterized by the use of direct policy search methods. Rather than approximating a value function, these methods approximate a policy using an independent function approximator with its own parameters, trying to maximize the expected future reward. The policy-based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target-reaching task.
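
The idea of direct policy search can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes a toy three-action problem with fixed rewards (the values in `toy_rewards` are invented for illustration), a softmax policy whose parameters `theta` are the only thing being learned, and exact gradient ascent on the expected reward instead of sampled gradient estimates. No value function is approximated at any point.

```python
import math

def softmax(theta):
    """Turn policy parameters into action probabilities."""
    m = max(theta)  # subtract the maximum for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_ascent(rewards, steps=500, lr=0.5):
    """Gradient ascent on J(theta) = sum_a pi(a) * r(a) for a softmax policy.

    For a softmax policy the exact gradient is dJ/dtheta_a = pi(a) * (r(a) - J).
    """
    theta = [0.0] * len(rewards)  # start from a uniform policy
    for _ in range(steps):
        pi = softmax(theta)
        expected = sum(p * r for p, r in zip(pi, rewards))
        grad = [p * (r - expected) for p, r in zip(pi, rewards)]
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return softmax(theta)

# Hypothetical rewards for three actions; action 1 is the best one.
toy_rewards = [0.2, 1.0, 0.5]
learned_policy = policy_gradient_ascent(toy_rewards)
```

After training, `learned_policy` concentrates most of its probability on the action with the highest reward. In a real control task the gradient would have to be estimated from sampled trajectories, which this sketch deliberately leaves out.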


This paper presents an Optimised Search Heuristic that combines a tabu search method with the verification of violated valid inequalities. The solution delivered by the tabu search is partially destroyed by a randomised greedy procedure, and then the valid inequalities are used to guide the reconstruction of a complete solution. An application of the new method to the Job-Shop Scheduling problem is presented.


Iterated Local Search has many of the desirable features of a metaheuristic: it is simple, easy to implement, robust, and highly effective. The essential idea of Iterated Local Search lies in focusing the search not on the full space of solutions but on a smaller subspace defined by the solutions that are locally optimal for a given optimization engine. The success of Iterated Local Search lies in the biased sampling of this set of local optima. How effective this approach turns out to be depends mainly on the choice of the local search, the perturbations, and the acceptance criterion. So far, in spite of its conceptual simplicity, it has led to a number of state-of-the-art results without the use of too much problem-specific knowledge. But with further work so that the different modules are well adapted to the problem at hand, Iterated Local Search can often become a competitive or even state-of-the-art algorithm. The purpose of this review is both to give a detailed description of this metaheuristic and to show where it stands in terms of performance.
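
The three modules named above (local search, perturbation, acceptance criterion) can be sketched as follows. This is only an illustrative skeleton under stated assumptions: the one-dimensional toy landscape `toy_cost`, the neighborhood `toy_neighbors`, the random-jump perturbation `toy_perturb`, and the "accept if not worse" rule are all hypothetical choices made for the example, not taken from the review.

```python
import random

def local_search(x, cost, neighbors):
    """Descend to a local optimum by moving to the best strictly improving neighbor."""
    while True:
        best = min(neighbors(x), key=cost)
        if cost(best) >= cost(x):
            return x
        x = best

def iterated_local_search(x0, cost, neighbors, perturb, iterations, rng):
    """Walk over local optima: perturb, re-run local search, then accept or reject."""
    current = local_search(x0, cost, neighbors)
    for _ in range(iterations):
        candidate = local_search(perturb(current, rng), cost, neighbors)
        if cost(candidate) <= cost(current):  # acceptance criterion: not worse
            current = candidate
    return current

# Hypothetical toy landscape on the integers 0..99: every multiple of 7 is a
# local optimum, and the global optimum is x = 70 with cost 0.
def toy_cost(x):
    return abs(x - 70) + (0 if x % 7 == 0 else 15)

def toy_neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 99]

def toy_perturb(x, rng):
    # Random jump of up to 10 positions, clipped to the search range.
    return min(99, max(0, x + rng.randint(-10, 10)))

rng = random.Random(0)
first_optimum = local_search(3, toy_cost, toy_neighbors)
result = iterated_local_search(3, toy_cost, toy_neighbors, toy_perturb, 200, rng)
```

Plain local search started at 3 stops at the first local optimum it meets, while the iterated version keeps sampling nearby local optima and moves toward the global one. The perturbation has to be strong enough to leave the current basin of attraction but weak enough not to degenerate into a random restart.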


To study the short-run and long-run implications on wage inequality, we introduce directed technical change into a Ricardian model of offshoring. A unique final good is produced by combining a skilled and an unskilled product, each produced from a continuum of intermediates (tasks). Some of these tasks can be transferred from a skill-abundant West to a skill-scarce East. Profit maximization determines both the extent of offshoring and technological progress. Offshoring induces skill-biased technical change because it increases the relative price of skill-intensive products and induces technical change favoring unskilled workers because it expands the market size for technologies complementing unskilled labor. In the empirically more relevant case, starting from low levels, an increase in offshoring opportunities triggers a transition with falling real wages for unskilled workers in the West, skill-biased technical change and rising skill premia worldwide. However, when the extent of offshoring becomes sufficiently large, further increases in offshoring induce technical change now biased in favor of unskilled labor because offshoring closes the gap between unskilled wages in the West and the East, thus limiting the power of the price effect fueling skill-biased technical change. The unequalizing impact of offshoring is thus greatest at the beginning. Transitional dynamics reveal that offshoring and technical change are substitutes in the short run but complements in the long run. Finally, though offshoring improves the welfare of workers in the East, it may benefit or harm unskilled workers in the West depending on elasticities and the equilibrium growth rate.


In this paper we present an algorithm to assign proctors to exams. This NP-hard problem is related to the generalized assignment problem with multiple objectives. The problem consists of assigning teaching assistants to proctor final exams at a university. We formulate this problem as a multiobjective integer program (IP) with a preference function and a workload-fairness function. We then also consider a weighted objective that combines both functions. We develop a scatter search procedure and compare its outcome with solutions found by solving the IP model with CPLEX 6.5. Our test problems are real instances from a university in Spain.


Firms compete by choosing both a price and a design from a family of designs that can be represented as demand rotations. Consumers engage in costly sequential search among firms. Each time a consumer pays a search cost he observes a new offering. An offering consists of a price quote and a new good, where goods might vary in the extent to which they are good matches for the consumer. In equilibrium, only two design-styles arise: either the most niche where consumers are likely to either love or loathe the product, or the broadest where consumers are likely to have similar valuations. In equilibrium, different firms may simultaneously offer both design-styles. We perform comparative statics on the equilibrium and show that a fall in search costs can lead to higher industry prices and profits and lower consumer surplus. Our analysis is related to discussions of how the internet has led to the prevalence of niche goods and the "long tail" phenomenon.


In a world where poor countries provide weak protection for intellectual property rights, market integration shifts technical change in favor of rich nations. Through this channel, free trade may amplify international income differences. At the same time, integration with countries where intellectual property rights are weakly protected can slow down the world growth rate. A crucial implication of these results is that protection of intellectual property is most beneficial in open countries. This prediction, which is novel in the literature, finds support in the data on a panel of 53 countries observed in the years 1965-1990.


We propose a stylized model of a problem-solving organization whose internal communication structure is given by a fixed network. Problems arrive randomly anywhere in this network and must find their way to their respective specialized solvers by relying on local information alone. The organization handles multiple problems simultaneously. For this reason, the process may be subject to congestion. We provide a characterization of the threshold of collapse of the network and of the stock of floating problems (or average delay) that prevails below that threshold. We build upon this characterization to address a design problem: the determination of what kind of network architecture optimizes performance for any given problem arrival rate. We conclude that, for low arrival rates, the optimal network is very polarized (i.e., star-like or centralized), whereas it is largely homogeneous (or decentralized) for high arrival rates. We also show that, if an auxiliary assumption holds, the transition between these two opposite structures is sharp and they are the only ones to ever qualify as optimal.


This paper advances a highly tractable model with search theoretic foundations for money and neoclassical growth. In the model, manufacturing and commerce are distinct and separate activities. In manufacturing, goods are efficiently produced combining capital and labor. In commerce, goods are exchanged in bilateral meetings. The model is applied to study the effects of inflation on capital accumulation and welfare. With realistic parameters, inflation has large negative effects on welfare even though it raises capital and output. In contrast, with cash-in-advance, a device informally motivated with bilateral trading, inflation depresses capital and output and has a negligible effect on welfare.


This paper presents a simple Optimised Search Heuristic for the Job Shop Scheduling problem that combines a GRASP heuristic with a branch-and-bound algorithm. The proposed method is compared with similar approaches and leads to better results in terms of solution quality and computing times.


Our task in this paper is to analyze the organization of trading in the era of quantitative finance. To do so, we conduct an ethnography of arbitrage, the trading strategy that best exemplifies finance in the wake of the quantitative revolution. In contrast to value and momentum investing, we argue, arbitrage involves an art of association - the construction of equivalence (comparability) of properties across different assets. In place of essential or relational characteristics, the peculiar valuation that takes place in arbitrage is based on an operation that makes something the measure of something else - associating securities to each other. The process of recognizing opportunities and the practices of making novel associations are shaped by the specific socio-spatial and socio-technical configurations of the trading room. Calculation is distributed across persons and instruments as the trading room organizes interaction among diverse principles of valuation.


This paper considers a job search model where the environment is not stationary along the unemployment spell and where jobs do not last forever. Under these circumstances, reservation wages can be lower than without separations, as in a stationary environment, but they can also be initially higher because of the non-stationarity of the model. Moreover, the time-dependence of reservation wages is stronger than with no separations. The model is estimated structurally using Spanish data for the period 1985-1996. The main finding is that, although the decrease in reservation wages is the main determinant of the change in the exit rate from unemployment for the first four months, later on the only effect comes from the job offer arrival rate, given that acceptance probabilities are roughly equal to one.


A welfare analysis of unemployment insurance (UI) is performed in a general equilibrium job search model. Finitely-lived, risk-averse workers smooth consumption over time by accumulating assets, choose search effort when unemployed, and suffer disutility from work. Firms hire workers, purchase capital, and pay taxes to finance worker benefits; their equity is the asset accumulated by workers. A matching function relates unemployment, hiring expenditure, and search effort to the formation of jobs. The model is calibrated to US data; the parameters relating job search effort to the probability of job finding are chosen to match microeconomic studies of unemployment spells. Under logarithmic utility, numerical simulation shows rather small welfare gains from UI. Even without UI, workers smooth consumption effectively through asset accumulation. Greater risk aversion leads to substantially larger welfare gains from UI; however, even in this case much of its welfare impact is due not to consumption smoothing effects, but rather to decreased work disutility, or to a variety of externalities.


In this paper I show how borrowing constraints and job search interact. I fit a dynamic model to data from the National Longitudinal Survey (1979 cohort) and show that borrowing constraints are significant. Agents with more initial assets and more access to credit attain higher wages for several periods after high school graduation. The unemployed maintain their consumption by running down their assets, while the employed save to buffer against future unemployment spells. I also show that, unlike in models with exogenous income streams, unemployment transfers, by allowing agents to attain higher wages, do not 'crowd out' but increase saving.


Since the beginnings of computers as programmable machines, people have tried to endow them with a certain intelligence so that they think or reason as much like humans as possible. One of these attempts has been to make the machine capable of thinking in such a way that it studies moves and wins chess games. Today, with current multitasking, object-oriented systems and memory access, and thanks to the powerful hardware available, we have a wide variety of programs dedicated to playing chess. But there are not only small programs; there are even entire machines dedicated to calculating and studying moves in order to beat the best players in the world. The aim of my work is to carry out a study and an implementation of one of these programs, and it is therefore divided into two parts. The theoretical part, or study, consists of a study of the artificial intelligence systems that play chess, the study of and search for a valid evaluation function, and a study of search algorithms. The practical part of the work is based on the implementation of an intelligent system capable of playing chess with a certain logic. This implementation is carried out with the help of the SDL libraries, using the minimax algorithm with alpha-beta pruning and C++ code. As a conclusion to the project, I would like to note that the study showed me that creating a chess game was not as easy as I had thought, but it gave me the satisfaction of applying everything I learned during my degree and of discovering many other new things.
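
As an illustration of minimax with alpha-beta pruning, here is a minimal sketch. It does not reproduce the project's C++ implementation and contains no chess logic: it assumes a hypothetical game tree given as nested Python lists whose leaves are already-evaluated positions, and the `visited` argument exists only to make the pruning visible.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"), visited=None):
    """Return the minimax value of `node`, skipping branches that cannot matter.

    A node is either a number (a leaf holding an evaluation score) or a list
    of child nodes. `visited`, if given, records the leaves actually evaluated.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizing player will never allow this branch
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizing player already has a better option
            break
    return value

# Hypothetical two-ply tree: the maximizing player moves first, then the minimizer.
toy_tree = [[3, 5], [2, 9]]
evaluated_leaves = []
best_value = alphabeta(toy_tree, True, visited=evaluated_leaves)
# best_value == 3 and evaluated_leaves == [3, 5, 2]: the leaf 9 is never examined.
```

In a chess program the leaves would come from an evaluation function applied at a fixed search depth, and the children of a node would be generated from the legal moves of the position.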