875 resultados para constructive heuristic algorithms


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Les algorithmes d'apprentissage profond forment un nouvel ensemble de méthodes puissantes pour l'apprentissage automatique. L'idée est de combiner des couches de facteurs latents en hierarchies. Cela requiert souvent un coût computationel plus elevé et augmente aussi le nombre de paramètres du modèle. Ainsi, l'utilisation de ces méthodes sur des problèmes à plus grande échelle demande de réduire leur coût et aussi d'améliorer leur régularisation et leur optimization. Cette thèse adresse cette question sur ces trois perspectives. Nous étudions tout d'abord le problème de réduire le coût de certains algorithmes profonds. Nous proposons deux méthodes pour entrainer des machines de Boltzmann restreintes et des auto-encodeurs débruitants sur des distributions sparses à haute dimension. Ceci est important pour l'application de ces algorithmes pour le traitement de langues naturelles. Ces deux méthodes (Dauphin et al., 2011; Dauphin and Bengio, 2013) utilisent l'échantillonage par importance pour échantilloner l'objectif de ces modèles. Nous observons que cela réduit significativement le temps d'entrainement. L'accéleration atteint 2 ordres de magnitude sur plusieurs bancs d'essai. Deuxièmement, nous introduisont un puissant régularisateur pour les méthodes profondes. Les résultats expérimentaux démontrent qu'un bon régularisateur est crucial pour obtenir de bonnes performances avec des gros réseaux (Hinton et al., 2012). Dans Rifai et al. (2011), nous proposons un nouveau régularisateur qui combine l'apprentissage non-supervisé et la propagation de tangente (Simard et al., 1992). Cette méthode exploite des principes géometriques et permit au moment de la publication d'atteindre des résultats à l'état de l'art. Finalement, nous considérons le problème d'optimiser des surfaces non-convexes à haute dimensionalité comme celle des réseaux de neurones. Tradionellement, l'abondance de minimum locaux était considéré comme la principale difficulté dans ces problèmes. Dans Dauphin et al. (2014a) nous argumentons à partir de résultats en statistique physique, de la théorie des matrices aléatoires, de la théorie des réseaux de neurones et à partir de résultats expérimentaux qu'une difficulté plus profonde provient de la prolifération de points-selle. Dans ce papier nous proposons aussi une nouvelle méthode pour l'optimisation non-convexe.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dans des contextes de post-urgence tels que le vit la partie occidentale de la République Démocratique du Congo (RDC), l’un des défis cruciaux auxquels font face les hôpitaux ruraux est de maintenir un niveau de médicaments essentiels dans la pharmacie. Sans ces médicaments pour traiter les maladies graves, l’impact sur la santé de la population est significatif. Les hôpitaux encourent également des pertes financières dues à la péremption lorsque trop de médicaments sont commandés. De plus, les coûts du transport des médicaments ainsi que du superviseur sont très élevés pour les hôpitaux isolés ; les coûts du transport peuvent à eux seuls dépasser ceux des médicaments. En utilisant la province du Bandundu, RDC pour une étude de cas, notre recherche tente de déterminer la faisabilité (en termes et de la complexité du problème et des économies potentielles) d’un problème de routage synchronisé pour la livraison de médicaments et pour les visites de supervision. Nous proposons une formulation du problème de tournées de véhicules avec capacité limitée qui gère plusieurs exigences nouvelles, soit la synchronisation des activités, la préséance et deux fréquences d’activités. Nous mettons en œuvre une heuristique « cluster first, route second » avec une base de données géospatiales qui permet de résoudre le problème. Nous présentons également un outil Internet qui permet de visualiser les solutions sur des cartes. Les résultats préliminaires de notre étude suggèrent qu’une solution synchronisée pourrait offrir la possibilité aux hôpitaux ruraux d’augmenter l’accessibilité des services médicaux aux populations rurales avec une augmentation modique du coût de transport actuel.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Extensive use of the Internet coupled with the marvelous growth in e-commerce and m-commerce has created a huge demand for information security. The Secure Socket Layer (SSL) protocol is the most widely used security protocol in the Internet which meets this demand. It provides protection against eaves droppings, tampering and forgery. The cryptographic algorithms RC4 and HMAC have been in use for achieving security services like confidentiality and authentication in the SSL. But recent attacks against RC4 and HMAC have raised questions in the confidence on these algorithms. Hence two novel cryptographic algorithms MAJE4 and MACJER-320 have been proposed as substitutes for them. The focus of this work is to demonstrate the performance of these new algorithms and suggest them as dependable alternatives to satisfy the need of security services in SSL. The performance evaluation has been done by using practical implementation method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Internet today has become a vital part of day to day life, owing to the revolutionary changes it has brought about in various fields. Dependence on the Internet as an information highway and knowledge bank is exponentially increasing so that a going back is beyond imagination. Transfer of critical information is also being carried out through the Internet. This widespread use of the Internet coupled with the tremendous growth in e-commerce and m-commerce has created a vital need for infonnation security.Internet has also become an active field of crackers and intruders. The whole development in this area can become null and void if fool-proof security of the data is not ensured without a chance of being adulterated. It is, hence a challenge before the professional community to develop systems to ensure security of the data sent through the Internet.Stream ciphers, hash functions and message authentication codes play vital roles in providing security services like confidentiality, integrity and authentication of the data sent through the Internet. There are several ·such popular and dependable techniques, which have been in use widely, for quite a long time. This long term exposure makes them vulnerable to successful or near successful attempts for attacks. Hence it is the need of the hour to develop new algorithms with better security.Hence studies were conducted on various types of algorithms being used in this area. Focus was given to identify the properties imparting security at this stage. By making use of a perception derived from these studies, new algorithms were designed. Performances of these algorithms were then studied followed by necessary modifications to yield an improved system consisting of a new stream cipher algorithm MAJE4, a new hash code JERIM- 320 and a new message authentication code MACJER-320. Detailed analysis and comparison with the existing popular schemes were also carried out to establish the security levels.The Secure Socket Layer (SSL) I Transport Layer Security (TLS) protocol is one of the most widely used security protocols in Internet. The cryptographic algorithms RC4 and HMAC have been in use for achieving security services like confidentiality and authentication in the SSL I TLS. But recent attacks on RC4 and HMAC have raised questions about the reliability of these algorithms. Hence MAJE4 and MACJER-320 have been proposed as substitutes for them. Detailed studies on the performance of these new algorithms were carried out; it has been observed that they are dependable alternatives.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Motivation for Speaker recognition work is presented in the first part of the thesis. An exhaustive survey of past work in this field is also presented. A low cost system not including complex computation has been chosen for implementation. Towards achieving this a PC based system is designed and developed. A front end analog to digital convertor (12 bit) is built and interfaced to a PC. Software to control the ADC and to perform various analytical functions including feature vector evaluation is developed. It is shown that a fixed set of phrases incorporating evenly balanced phonemes is aptly suited for the speaker recognition work at hand. A set of phrases are chosen for recognition. Two new methods are adopted for the feature evaluation. Some new measurements involving a symmetry check method for pitch period detection and ACE‘ are used as featured. Arguments are provided to show the need for a new model for speech production. Starting from heuristic, a knowledge based (KB) speech production model is presented. In this model, a KB provides impulses to a voice producing mechanism and constant correction is applied via a feedback path. It is this correction that differs from speaker to speaker. Methods of defining measurable parameters for use as features are described. Algorithms for speaker recognition are developed and implemented. Two methods are presented. The first is based on the model postulated. Here the entropy on the utterance of a phoneme is evaluated. The transitions of voiced regions are used as speaker dependent features. The second method presented uses features found in other works, but evaluated differently. A knock—out scheme is used to provide the weightage values for the selection of features. Results of implementation are presented which show on an average of 80% recognition. It is also shown that if there are long gaps between sessions, the performance deteriorates and is speaker dependent. Cross recognition percentages are also presented and this in the worst case rises to 30% while the best case is 0%. Suggestions for further work are given in the concluding chapter.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Extensive use of the Internet coupled with the marvelous growth in e-commerce and m-commerce has created a huge demand for information security. The Secure Socket Layer (SSL) protocol is the most widely used security protocol in the Internet which meets this demand. It provides protection against eaves droppings, tampering and forgery. The cryptographic algorithms RC4 and HMAC have been in use for achieving security services like confidentiality and authentication in the SSL. But recent attacks against RC4 and HMAC have raised questions in the confidence on these algorithms. Hence two novel cryptographic algorithms MAJE4 and MACJER-320 have been proposed as substitutes for them. The focus of this work is to demonstrate the performance of these new algorithms and suggest them as dependable alternatives to satisfy the need of security services in SSL. The performance evaluation has been done by using practical implementation method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An Overview of known spatial clustering algorithms The space of interest can be the two-dimensional abstraction of the surface of the earth or a man-made space like the layout of a VLSI design, a volume containing a model of the human brain, or another 3d-space representing the arrangement of chains of protein molecules. The data consists of geometric information and can be either discrete or continuous. The explicit location and extension of spatial objects define implicit relations of spatial neighborhood (such as topological, distance and direction relations) which are used by spatial data mining algorithms. Therefore, spatial data mining algorithms are required for spatial characterization and spatial trend analysis. Spatial data mining or knowledge discovery in spatial databases differs from regular data mining in analogous with the differences between non-spatial data and spatial data. The attributes of a spatial object stored in a database may be affected by the attributes of the spatial neighbors of that object. In addition, spatial location, and implicit information about the location of an object, may be exactly the information that can be extracted through spatial data mining

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We develop several algorithms for computations in Galois extensions of p-adic fields. Our algorithms are based on existing algorithms for number fields and are exact in the sense that we do not need to consider approximations to p-adic numbers. As an application we describe an algorithmic approach to prove or disprove various conjectures for local and global epsilon constants.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report gives a detailed discussion on the system, algorithms, and techniques that we have applied in order to solve the Web Service Challenges (WSC) of the years 2006 and 2007. These international contests are focused on semantic web service composition. In each challenge of the contests, a repository of web services is given. The input and output parameters of the services in the repository are annotated with semantic concepts. A query to a semantic composition engine contains a set of available input concepts and a set of wanted output concepts. In order to employ an offered service for a requested role, the concepts of the input parameters of the offered operations must be more general than requested (contravariance). In contrast, the concepts of the output parameters of the offered service must be more specific than requested (covariance). The engine should respond to a query by providing a valid composition as fast as possible. We discuss three different methods for web service composition: an uninformed search in form of an IDDFS algorithm, a greedy informed search based on heuristic functions, and a multi-objective genetic algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this report, we discuss the application of global optimization and Evolutionary Computation to distributed systems. We therefore selected and classified many publications, giving an insight into the wide variety of optimization problems which arise in distributed systems. Some interesting approaches from different areas will be discussed in greater detail with the use of illustrative examples.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Concept lattices are used in formal concept analysis to represent data conceptually so that the original data are still recognizable. Their line diagrams should reflect the semantical relationships within the data. Up to now, no satisfactory automatic drawing programs for this task exist. The geometrical heuristic is the most successful tool for drawing concept lattices manually. It ueses a geometric representation as intermediate step between the list of upper covers and the line diagram of the lattice.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In dieser Arbeit werden Algorithmen zur Untersuchung der äquivarianten Tamagawazahlvermutung von Burns und Flach entwickelt. Zunächst werden Algorithmen angegeben mit denen die lokale Fundamentalklasse, die globale Fundamentalklasse und Tates kanonische Klasse berechnet werden können. Dies ermöglicht unter anderem Berechnungen in Brauergruppen von Zahlkörpererweiterungen. Anschließend werden diese Algorithmen auf die Tamagawazahlvermutung angewendet. Die Epsilonkonstantenvermutung kann dadurch für alle Galoiserweiterungen L|K bewiesen werden, bei denen L in einer Galoiserweiterung E|Q vom Grad kleiner gleich 15 eingebettet werden kann. Für die Tamagawazahlvermutung an der Stelle 1 wird ein Algorithmus angegeben, der die Vermutung für ein gegebenes Fallbeispiel L|Q numerischen verifizieren kann. Im Spezialfall, dass alle Charaktere rational oder abelsch sind, kann dieser Algorithmus die Vermutung für L|Q sogar beweisen.