950 resultados para Railways, Scheduling, Heuristics, Search Algorithms
Resumo:
Projecte d'adaptació del programa GNU Chess al sistema de grid computing 'Condor'. I amb això, es planteja un estudi sobre els algorismes de cerca i la seva aplicació en entorns distribuïts. Una sèrie de proves sobre unes mostres de una partida d'escacs contra el propi GNU Chess ens ajuden a posar de relleu els avantatges i inconvenients de cada un dels algorismes proposats.
Resumo:
Abstract : In the subject of fingerprints, the rise of computers tools made it possible to create powerful automated search algorithms. These algorithms allow, inter alia, to compare a fingermark to a fingerprint database and therefore to establish a link between the mark and a known source. With the growth of the capacities of these systems and of data storage, as well as increasing collaboration between police services on the international level, the size of these databases increases. The current challenge for the field of fingerprint identification consists of the growth of these databases, which makes it possible to find impressions that are very similar but coming from distinct fingers. However and simultaneously, this data and these systems allow a description of the variability between different impressions from a same finger and between impressions from different fingers. This statistical description of the withinand between-finger variabilities computed on the basis of minutiae and their relative positions can then be utilized in a statistical approach to interpretation. The computation of a likelihood ratio, employing simultaneously the comparison between the mark and the print of the case, the within-variability of the suspects' finger and the between-variability of the mark with respect to a database, can then be based on representative data. Thus, these data allow an evaluation which may be more detailed than that obtained by the application of rules established long before the advent of these large databases or by the specialists experience. The goal of the present thesis is to evaluate likelihood ratios, computed based on the scores of an automated fingerprint identification system when the source of the tested and compared marks is known. These ratios must support the hypothesis which it is known to be true. Moreover, they should support this hypothesis more and more strongly with the addition of information in the form of additional minutiae. For the modeling of within- and between-variability, the necessary data were defined, and acquired for one finger of a first donor, and two fingers of a second donor. The database used for between-variability includes approximately 600000 inked prints. The minimal number of observations necessary for a robust estimation was determined for the two distributions used. Factors which influence these distributions were also analyzed: the number of minutiae included in the configuration and the configuration as such for both distributions, as well as the finger number and the general pattern for between-variability, and the orientation of the minutiae for within-variability. In the present study, the only factor for which no influence has been shown is the orientation of minutiae The results show that the likelihood ratios resulting from the use of the scores of an AFIS can be used for evaluation. Relatively low rates of likelihood ratios supporting the hypothesis known to be false have been obtained. The maximum rate of likelihood ratios supporting the hypothesis that the two impressions were left by the same finger when the impressions came from different fingers obtained is of 5.2 %, for a configuration of 6 minutiae. When a 7th then an 8th minutia are added, this rate lowers to 3.2 %, then to 0.8 %. In parallel, for these same configurations, the likelihood ratios obtained are on average of the order of 100,1000, and 10000 for 6,7 and 8 minutiae when the two impressions come from the same finger. These likelihood ratios can therefore be an important aid for decision making. Both positive evolutions linked to the addition of minutiae (a drop in the rates of likelihood ratios which can lead to an erroneous decision and an increase in the value of the likelihood ratio) were observed in a systematic way within the framework of the study. Approximations based on 3 scores for within-variability and on 10 scores for between-variability were found, and showed satisfactory results. Résumé : Dans le domaine des empreintes digitales, l'essor des outils informatisés a permis de créer de puissants algorithmes de recherche automatique. Ces algorithmes permettent, entre autres, de comparer une trace à une banque de données d'empreintes digitales de source connue. Ainsi, le lien entre la trace et l'une de ces sources peut être établi. Avec la croissance des capacités de ces systèmes, des potentiels de stockage de données, ainsi qu'avec une collaboration accrue au niveau international entre les services de police, la taille des banques de données augmente. Le défi actuel pour le domaine de l'identification par empreintes digitales consiste en la croissance de ces banques de données, qui peut permettre de trouver des impressions très similaires mais provenant de doigts distincts. Toutefois et simultanément, ces données et ces systèmes permettent une description des variabilités entre différentes appositions d'un même doigt, et entre les appositions de différents doigts, basées sur des larges quantités de données. Cette description statistique de l'intra- et de l'intervariabilité calculée à partir des minuties et de leurs positions relatives va s'insérer dans une approche d'interprétation probabiliste. Le calcul d'un rapport de vraisemblance, qui fait intervenir simultanément la comparaison entre la trace et l'empreinte du cas, ainsi que l'intravariabilité du doigt du suspect et l'intervariabilité de la trace par rapport à une banque de données, peut alors se baser sur des jeux de données représentatifs. Ainsi, ces données permettent d'aboutir à une évaluation beaucoup plus fine que celle obtenue par l'application de règles établies bien avant l'avènement de ces grandes banques ou par la seule expérience du spécialiste. L'objectif de la présente thèse est d'évaluer des rapports de vraisemblance calcul és à partir des scores d'un système automatique lorsqu'on connaît la source des traces testées et comparées. Ces rapports doivent soutenir l'hypothèse dont il est connu qu'elle est vraie. De plus, ils devraient soutenir de plus en plus fortement cette hypothèse avec l'ajout d'information sous la forme de minuties additionnelles. Pour la modélisation de l'intra- et l'intervariabilité, les données nécessaires ont été définies, et acquises pour un doigt d'un premier donneur, et deux doigts d'un second donneur. La banque de données utilisée pour l'intervariabilité inclut environ 600000 empreintes encrées. Le nombre minimal d'observations nécessaire pour une estimation robuste a été déterminé pour les deux distributions utilisées. Des facteurs qui influencent ces distributions ont, par la suite, été analysés: le nombre de minuties inclus dans la configuration et la configuration en tant que telle pour les deux distributions, ainsi que le numéro du doigt et le dessin général pour l'intervariabilité, et la orientation des minuties pour l'intravariabilité. Parmi tous ces facteurs, l'orientation des minuties est le seul dont une influence n'a pas été démontrée dans la présente étude. Les résultats montrent que les rapports de vraisemblance issus de l'utilisation des scores de l'AFIS peuvent être utilisés à des fins évaluatifs. Des taux de rapports de vraisemblance relativement bas soutiennent l'hypothèse que l'on sait fausse. Le taux maximal de rapports de vraisemblance soutenant l'hypothèse que les deux impressions aient été laissées par le même doigt alors qu'en réalité les impressions viennent de doigts différents obtenu est de 5.2%, pour une configuration de 6 minuties. Lorsqu'une 7ème puis une 8ème minutie sont ajoutées, ce taux baisse d'abord à 3.2%, puis à 0.8%. Parallèlement, pour ces mêmes configurations, les rapports de vraisemblance sont en moyenne de l'ordre de 100, 1000, et 10000 pour 6, 7 et 8 minuties lorsque les deux impressions proviennent du même doigt. Ces rapports de vraisemblance peuvent donc apporter un soutien important à la prise de décision. Les deux évolutions positives liées à l'ajout de minuties (baisse des taux qui peuvent amener à une décision erronée et augmentation de la valeur du rapport de vraisemblance) ont été observées de façon systématique dans le cadre de l'étude. Des approximations basées sur 3 scores pour l'intravariabilité et sur 10 scores pour l'intervariabilité ont été trouvées, et ont montré des résultats satisfaisants.
Resumo:
The objective of this thesis is to develop and generalize further the differential evolution based data classification method. For many years, evolutionary algorithms have been successfully applied to many classification tasks. Evolution algorithms are population based, stochastic search algorithms that mimic natural selection and genetics. Differential evolution is an evolutionary algorithm that has gained popularity because of its simplicity and good observed performance. In this thesis a differential evolution classifier with pool of distances is proposed, demonstrated and initially evaluated. The differential evolution classifier is a nearest prototype vector based classifier that applies a global optimization algorithm, differential evolution, to determine the optimal values for all free parameters of the classifier model during the training phase of the classifier. The differential evolution classifier applies the individually optimized distance measure for each new data set to be classified is generalized to cover a pool of distances. Instead of optimizing a single distance measure for the given data set, the selection of the optimal distance measure from a predefined pool of alternative measures is attempted systematically and automatically. Furthermore, instead of only selecting the optimal distance measure from a set of alternatives, an attempt is made to optimize the values of the possible control parameters related with the selected distance measure. Specifically, a pool of alternative distance measures is first created and then the differential evolution algorithm is applied to select the optimal distance measure that yields the highest classification accuracy with the current data. After determining the optimal distance measures for the given data set together with their optimal parameters, all determined distance measures are aggregated to form a single total distance measure. The total distance measure is applied to the final classification decisions. The actual classification process is still based on the nearest prototype vector principle; a sample belongs to the class represented by the nearest prototype vector when measured with the optimized total distance measure. During the training process the differential evolution algorithm determines the optimal class vectors, selects optimal distance metrics, and determines the optimal values for the free parameters of each selected distance measure. The results obtained with the above method confirm that the choice of distance measure is one of the most crucial factors for obtaining higher classification accuracy. The results also demonstrate that it is possible to build a classifier that is able to select the optimal distance measure for the given data set automatically and systematically. After finding optimal distance measures together with optimal parameters from the particular distance measure results are then aggregated to form a total distance, which will be used to form the deviation between the class vectors and samples and thus classify the samples. This thesis also discusses two types of aggregation operators, namely, ordered weighted averaging (OWA) based multi-distances and generalized ordered weighted averaging (GOWA). These aggregation operators were applied in this work to the aggregation of the normalized distance values. The results demonstrate that a proper combination of aggregation operator and weight generation scheme play an important role in obtaining good classification accuracy. The main outcomes of the work are the six new generalized versions of previous method called differential evolution classifier. All these DE classifier demonstrated good results in the classification tasks.
Resumo:
The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the most optimal subset of relevant features from a high dimensional dataset is NP hard. Sub–optimal population based stochastic algorithms such as GP and GA are good choices for searching through large search spaces, and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population based stochastic algorithms often suffer from premature convergence on mediocre sub–optimal solutions. The Age Layered Population Structure (ALPS) is a novel metaheuristic for overcoming the problem of premature convergence in evolutionary algorithms, and for improving search in the fitness landscape. The ALPS paradigm uses an age–measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal–symbol selection for the construction of GP trees/sub-trees. The FSALPS metaheuristic continuously refines the feature subset selection process whiles simultaneously evolving efficient classifiers through a non–converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high–dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey’s HSD ANOVA test at a 95% confidence interval, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat expressions. FSALPS significantly outperformed canonical GP and ALPS and some reported feature selection strategies in related literature on dimensionality reduction.
Resumo:
The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the most optimal subset of relevant features from a high dimensional dataset is NP hard. Sub–optimal population based stochastic algorithms such as GP and GA are good choices for searching through large search spaces, and are usually more feasible than exhaustive and determinis- tic search algorithms. On the other hand, population based stochastic algorithms often suffer from premature convergence on mediocre sub–optimal solutions. The Age Layered Population Structure (ALPS) is a novel meta–heuristic for overcoming the problem of premature convergence in evolutionary algorithms, and for improving search in the fitness landscape. The ALPS paradigm uses an age–measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal–symbol selection for the construction of GP trees/sub-trees. The FSALPS meta–heuristic continuously refines the feature subset selection process whiles simultaneously evolving efficient classifiers through a non–converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high–dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey’s HSD ANOVA test at a 95% confidence interval, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat expressions. FSALPS significantly outperformed canonical GP and ALPS and some reported feature selection strategies in related literature on dimensionality reduction.
Resumo:
Mesh generation is an important step inmany numerical methods.We present the “HierarchicalGraphMeshing” (HGM)method as a novel approach to mesh generation, based on algebraic graph theory.The HGM method can be used to systematically construct configurations exhibiting multiple hierarchies and complex symmetry characteristics. The hierarchical description of structures provided by the HGM method can be exploited to increase the efficiency of multiscale and multigrid methods. In this paper, the HGMmethod is employed for the systematic construction of super carbon nanotubes of arbitrary order, which present a pertinent example of structurally and geometrically complex, yet highly regular, structures. The HGMalgorithm is computationally efficient and exhibits good scaling characteristics. In particular, it scales linearly for super carbon nanotube structures and is working much faster than geometry-based methods employing neighborhood search algorithms. Its modular character makes it conducive to automatization. For the generation of a mesh, the information about the geometry of the structure in a given configuration is added in a way that relates geometric symmetries to structural symmetries. The intrinsically hierarchic description of the resulting mesh greatly reduces the effort of determining mesh hierarchies for multigrid and multiscale applications and helps to exploit symmetry-related methods in the mechanical analysis of complex structures.
Resumo:
We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability performances.
Resumo:
This thesis describes design methodologies for frequency selective surfaces (FSSs) composed of periodic arrays of pre-fractals metallic patches on single-layer dielectrics (FR4, RT/duroid). Shapes presented by Sierpinski island and T fractal geometries are exploited to the simple design of efficient band-stop spatial filters with applications in the range of microwaves. Initial results are discussed in terms of the electromagnetic effect resulting from the variation of parameters such as, fractal iteration number (or fractal level), fractal iteration factor, and periodicity of FSS, depending on the used pre-fractal element (Sierpinski island or T fractal). The transmission properties of these proposed periodic arrays are investigated through simulations performed by Ansoft DesignerTM and Ansoft HFSSTM commercial softwares that run full-wave methods. To validate the employed methodology, FSS prototypes are selected for fabrication and measurement. The obtained results point to interesting features for FSS spatial filters: compactness, with high values of frequency compression factor; as well as stable frequency responses at oblique incidence of plane waves. This thesis also approaches, as it main focus, the application of an alternative electromagnetic (EM) optimization technique for analysis and synthesis of FSSs with fractal motifs. In application examples of this technique, Vicsek and Sierpinski pre-fractal elements are used in the optimal design of FSS structures. Based on computational intelligence tools, the proposed technique overcomes the high computational cost associated to the full-wave parametric analyzes. To this end, fast and accurate multilayer perceptron (MLP) neural network models are developed using different parameters as design input variables. These neural network models aim to calculate the cost function in the iterations of population-based search algorithms. Continuous genetic algorithm (GA), particle swarm optimization (PSO), and bees algorithm (BA) are used for FSSs optimization with specific resonant frequency and bandwidth. The performance of these algorithms is compared in terms of computational cost and numerical convergence. Consistent results can be verified by the excellent agreement obtained between simulations and measurements related to FSS prototypes built with a given fractal iteration
Resumo:
We have investigated and extensively tested three families of non-convex optimization approaches for solving the transmission network expansion planning problem: simulated annealing (SA), genetic algorithms (GA), and tabu search algorithms (TS). The paper compares the main features of the three approaches and presents an integrated view of these methodologies. A hybrid approach is then proposed which presents performances which are far better than the ones obtained with any of these approaches individually. Results obtained in tests performed with large scale real-life networks are summarized.
Resumo:
We have investigated and extensively tested three families of non-convex optimization approaches for solving the transmission network expansion planning problem: simulated annealing (SA), genetic algorithms (GA), and tabu search algorithms (TS). The paper compares the main features of the three approaches and presents an integrated view of these methodologies. A hybrid approach is then proposed which presents performances which are far better than the ones obtained with any of these approaches individually. Results obtained in tests performed with large scale real-life networks are summarized.
Resumo:
In this work, a heuristic model for integrated planning of primary distribution network and secondary distribution circuits is proposed. A Tabu Search (TS) algorithm is employed to solve the planning of primary distribution networks. Evolutionary Algorithms (EA) are used to solve the planning model of secondary networks. The planning integration of both networks is carried out by means a constructive heuristic taking into account a set of integration alternatives between these networks. These integration alternatives are treated in a hierarchical way. The planning of primary networks and secondary distribution circuits is carried out based on assessment of the effects of the alternative solutions in the expansion costs of both networks simultaneously. In order to evaluate this methodology, tests were performed for a real-life distribution system taking into account the primary and secondary networks.
Resumo:
This work proposes a methodology for optimized allocation of switches for automatic load transfer in distribution systems in order to improve the reliability indexes by restoring such systems which present voltage classes of 23 to 35 kV and radial topology. The automatic switches must be allocated on the system in order to transfer load remotely among the sources at the substations. The problem of switch allocation is formulated as nonlinear constrained mixed integer programming model subject to a set of economical and physical constraints. A dedicated Tabu Search (TS) algorithm is proposed to solve this model. The proposed methodology is tested for a large real-life distribution system. © 2011 IEEE.
Resumo:
Consider a one-dimensional environment with N randomly distributed sites. An agent explores this random medium moving deterministically with a spatial memory μ. A crossover from local to global exploration occurs in one dimension at a well-defined memory value μ1=log2N. In its stochastic version, the dynamics is ruled by the memory and by temperature T, which affects the hopping displacement. This dynamics also shows a crossover in one dimension, obtained computationally, between exploration schemes, characterized yet by the trajectory size (Np) (aging effect). In this paper we provide an analytical approach considering the modified stochastic version where the parameter T plays the role of a maximum hopping distance. This modification allows us to obtain a general analytical expression for the crossover, as a function of the parameters μ, T, and Np. Differently from what has been proposed by previous studies, we find that the crossover occurs in any dimension d. These results have been validated by numerical experiments and may be of great value for fixing optimal parameters in search algorithms. © 2013 American Physical Society.
Resumo:
This paper presents an optimum user-steered boundary tracking approach for image segmentation, which simulates the behavior of water flowing through a riverbed. The riverbed approach was devised using the image foresting transform with a never-exploited connectivity function. We analyze its properties in the derived image graphs and discuss its theoretical relation with other popular methods such as live wire and graph cuts. Several experiments show that riverbed can significantly reduce the number of user interactions (anchor points), as compared to live wire for objects with complex shapes. This paper also includes a discussion about how to combine different methods in order to take advantage of their complementary strengths.