213 resultados para Space problem
Resumo:
In this article we present a hybrid approach for automatic summarization of Spanish medical texts. There are a lot of systems for automatic summarization using statistics or linguistics, but only a few of them combining both techniques. Our idea is that to reach a good summary we need to use linguistic aspects of texts, but as well we should benefit of the advantages of statistical techniques. We have integrated the Cortex (Vector Space Model) and Enertex (statistical physics) systems coupled with the Yate term extractor, and the Disicosum system (linguistics). We have compared these systems and afterwards we have integrated them in a hybrid approach. Finally, we have applied this hybrid system over a corpora of medical articles and we have evaluated their performances obtaining good results.
Resumo:
From a managerial point of view, the more effcient, simple, and parameter-free (ESP) an algorithm is, the more likely it will be used in practice for solving real-life problems. Following this principle, an ESP algorithm for solving the Permutation Flowshop Sequencing Problem (PFSP) is proposed in this article. Using an Iterated Local Search (ILS) framework, the so-called ILS-ESP algorithm is able to compete in performance with other well-known ILS-based approaches, which are considered among the most effcient algorithms for the PFSP. However, while other similar approaches still employ several parameters that can affect their performance if not properly chosen, our algorithm does not require any particular fine-tuning process since it uses basic "common sense" rules for the local search, perturbation, and acceptance criterion stages of the ILS metaheuristic. Our approach defines a new operator for the ILS perturbation process, a new acceptance criterion based on extremely simple and transparent rules, and a biased randomization process of the initial solution to randomly generate different alternative initial solutions of similar quality -which is attained by applying a biased randomization to a classical PFSP heuristic. This diversification of the initial solution aims at avoiding poorly designed starting points and, thus, allows the methodology to take advantage of current trends in parallel and distributed computing. A set of extensive tests, based on literature benchmarks, has been carried out in order to validate our algorithm and compare it against other approaches. These tests show that our parameter-free algorithm is able to compete with state-of-the-art metaheuristics for the PFSP. Also, the experiments show that, when using parallel computing, it is possible to improve the top ILS-based metaheuristic by just incorporating to it our biased randomization process with a high-quality pseudo-random number generator.
Resumo:
The past four decades have witnessed an explosive growth in the field of networkbased facility location modeling. This is not at all surprising since location policy is one of the most profitable areas of applied systems analysis in regional science and ample theoretical and applied challenges are offered. Location-allocation models seek the location of facilities and/or services (e.g., schools, hospitals, and warehouses) so as to optimize one or several objectives generally related to the efficiency of the system or to the allocation of resources. This paper concerns the location of facilities or services in discrete space or networks, that are related to the public sector, such as emergency services (ambulances, fire stations, and police units), school systems and postal facilities. The paper is structured as follows: first, we will focus on public facility location models that use some type of coverage criterion, with special emphasis in emergency services. The second section will examine models based on the P-Median problem and some of the issues faced by planners when implementing this formulation in real world locational decisions. Finally, the last section will examine new trends in public sector facility location modeling.
Resumo:
When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using co-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix - the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories which have highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix. In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.
Resumo:
One of the disadvantages of old age is that there is more past than future: this,however, may be turned into an advantage if the wealth of experience and, hopefully,wisdom gained in the past can be reflected upon and throw some light on possiblefuture trends. To an extent, then, this talk is necessarily personal, certainly nostalgic,but also self critical and inquisitive about our understanding of the discipline ofstatistics. A number of almost philosophical themes will run through the talk: searchfor appropriate modelling in relation to the real problem envisaged, emphasis onsensible balances between simplicity and complexity, the relative roles of theory andpractice, the nature of communication of inferential ideas to the statistical layman, theinter-related roles of teaching, consultation and research. A list of keywords might be:identification of sample space and its mathematical structure, choices betweentransform and stay, the role of parametric modelling, the role of a sample spacemetric, the underused hypothesis lattice, the nature of compositional change,particularly in relation to the modelling of processes. While the main theme will berelevance to compositional data analysis we shall point to substantial implications forgeneral multivariate analysis arising from experience of the development ofcompositional data analysis…
Resumo:
The standard one-machine scheduling problem consists in schedulinga set of jobs in one machine which can handle only one job at atime, minimizing the maximum lateness. Each job is available forprocessing at its release date, requires a known processing timeand after finishing the processing, it is delivery after a certaintime. There also can exists precedence constraints between pairsof jobs, requiring that the first jobs must be completed beforethe second job can start. An extension of this problem consistsin assigning a time interval between the processing of the jobsassociated with the precedence constrains, known by finish-starttime-lags. In presence of this constraints, the problem is NP-hardeven if preemption is allowed. In this work, we consider a specialcase of the one-machine preemption scheduling problem with time-lags, where the time-lags have a chain form, and propose apolynomial algorithm to solve it. The algorithm consist in apolynomial number of calls of the preemption version of the LongestTail Heuristic. One of the applicability of the method is to obtainlower bounds for NP-hard one-machine and job-shop schedulingproblems. We present some computational results of thisapplication, followed by some conclusions.
Resumo:
We start with a generalization of the well-known three-door problem:the n-door problem. The solution of this new problem leads us toa beautiful representation system for real numbers in (0,1] as alternated series, known in the literature as Pierce expansions. A closer look to Pierce expansions will take us to some metrical properties of sets defined through the Pierce expansions of its elements. Finally, these metrical properties will enable us to present 'strange' sets, similar to the classical Cantor set.
Resumo:
One of the assumptions of the Capacitated Facility Location Problem (CFLP) is thatdemand is known and fixed. Most often, this is not the case when managers take somestrategic decisions such as locating facilities and assigning demand points to thosefacilities. In this paper we consider demand as stochastic and we model each of thefacilities as an independent queue. Stochastic models of manufacturing systems anddeterministic location models are put together in order to obtain a formula for thebacklogging probability at a potential facility location.Several solution techniques have been proposed to solve the CFLP. One of the mostrecently proposed heuristics, a Reactive Greedy Adaptive Search Procedure, isimplemented in order to solve the model formulated. We present some computationalexperiments in order to evaluate the heuristics performance and to illustrate the use ofthis new formulation for the CFLP. The paper finishes with a simple simulationexercise.
Resumo:
Revenue management practices often include overbooking capacity to account for customerswho make reservations but do not show up. In this paper, we consider the network revenuemanagement problem with no-shows and overbooking, where the show-up probabilities are specificto each product. No-show rates differ significantly by product (for instance, each itinerary andfare combination for an airline) as sale restrictions and the demand characteristics vary byproduct. However, models that consider no-show rates by each individual product are difficultto handle as the state-space in dynamic programming formulations (or the variable space inapproximations) increases significantly. In this paper, we propose a randomized linear program tojointly make the capacity control and overbooking decisions with product-specific no-shows. Weestablish that our formulation gives an upper bound on the optimal expected total profit andour upper bound is tighter than a deterministic linear programming upper bound that appearsin the existing literature. Furthermore, we show that our upper bound is asymptotically tightin a regime where the leg capacities and the expected demand is scaled linearly with the samerate. We also describe how the randomized linear program can be used to obtain a bid price controlpolicy. Computational experiments indicate that our approach is quite fast, able to scale to industrialproblems and can provide significant improvements over standard benchmarks.
Resumo:
The problems arising in commercial distribution are complex and involve several players and decision levels. One important decision is relatedwith the design of the routes to distribute the products, in an efficient and inexpensive way.This article deals with a complex vehicle routing problem that can beseen as a new extension of the basic vehicle routing problem. The proposed model is a multi-objective combinatorial optimization problemthat considers three objectives and multiple periods, which models in a closer way the real distribution problems. The first objective is costminimization, the second is balancing work levels and the third is amarketing objective. An application of the model on a small example, with5 clients and 3 days, is presented. The results of the model show the complexity of solving multi-objective combinatorial optimization problems and the contradiction between the several distribution management objective.
Resumo:
Iterated Local Search has many of the desirable features of a metaheuristic: it is simple, easy to implement, robust, and highly effective. The essential idea of Iterated Local Search lies in focusing the search not on the full space of solutions but on a smaller subspace defined by the solutions that are locally optimal for a given optimization engine. The success of Iterated Local Search lies in the biased sampling of this set of local optima. How effective this approach turns out to be depends mainly on the choice of the local search, the perturbations, and the acceptance criterion. So far, in spite of its conceptual simplicity, it has lead to a number of state-of-the-art results without the use of too much problem-specific knowledge. But with further work so that the different modules are well adapted to the problem at hand, Iterated Local Search can often become a competitive or even state of the artalgorithm. The purpose of this review is both to give a detailed description of this metaheuristic and to show where it stands in terms of performance.
Resumo:
We obtain minimax lower bounds on the regret for the classicaltwo--armed bandit problem. We provide a finite--sample minimax version of the well--known log $n$ asymptotic lower bound of Lai and Robbins. Also, in contrast to the log $n$ asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for {\sl every} allocation rule and for {\sl every} $n$, there is a configuration such that the regret at time $n$ is at least 1 -- $\epsilon$ times the regret of random guessing, where $\epsilon$ is any small positive constant.
Resumo:
The network revenue management (RM) problem arises in airline, hotel, media,and other industries where the sale products use multiple resources. It can be formulatedas a stochastic dynamic program but the dynamic program is computationallyintractable because of an exponentially large state space, and a number of heuristicshave been proposed to approximate it. Notable amongst these -both for their revenueperformance, as well as their theoretically sound basis- are approximate dynamic programmingmethods that approximate the value function by basis functions (both affinefunctions as well as piecewise-linear functions have been proposed for network RM)and decomposition methods that relax the constraints of the dynamic program to solvesimpler dynamic programs (such as the Lagrangian relaxation methods). In this paperwe show that these two seemingly distinct approaches coincide for the network RMdynamic program, i.e., the piecewise-linear approximation method and the Lagrangianrelaxation method are one and the same.