951 results for random search algorithms


Relevance:

80.00%

Abstract:

This project deals mainly with web scraping of HTML documents on Android. As a result, a methodology is proposed for performing web scraping in applications built for this operating system, and an application based on this methodology is developed that is useful to the students of the school. Web scraping can be defined as a technique based on a set of content search algorithms that extract specific information from web pages and discard whatever is not relevant. A central part of the work was the study of web browsers and servers, of the HTML language present in almost all web pages today, and of the mechanisms used for communication between client and server, since these are the pillars on which the technique rests. The necessary techniques and tools were studied, all the required theoretical concepts are provided, and a possible methodology for implementation is proposed. Finally, the UPMdroid application was coded, both to exemplify the implementation of the proposed methodology and to give ETSIST students mobile support on Android that eases access to and display of the most important data of the academic year: the class timetable and the grades for the subjects in which they are enrolled. Besides implementing the proposed methodology, the application is a useful tool for students, since it lets them use many of the school's services in a simple and intuitive way, which addresses the problems of viewing web content on mobile devices.
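The idea of extracting only the relevant content from an HTML page can be illustrated with a minimal sketch. It is not UPMdroid's code, which the abstract does not show: Python's standard `html.parser` stands in for whatever Android parsing library the project used, and the page markup, the `TableScraper` class and the `scrape_table` helper are all hypothetical.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, row by row, ignoring other markup."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell (banners, menus, ...) is discarded.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table cells found in `html` as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical page: a timetable surrounded by content we do not care about.
SAMPLE_PAGE = """
<html><body>
  <div class="banner">Welcome to the portal</div>
  <table id="schedule">
    <tr><th>Day</th><th>Time</th><th>Subject</th></tr>
    <tr><td>Monday</td><td>09:00</td><td>Networks</td></tr>
    <tr><td>Tuesday</td><td>11:00</td><td>Signal <b>Processing</b></td></tr>
  </table>
</body></html>
"""

rows = scrape_table(SAMPLE_PAGE)
for row in rows:
    print(row)
```

In a real client the HTML would first be fetched over HTTP, typically after authenticating against the school's server; that step is omitted here because the abstract gives no details about it.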

Relevance:

80.00%

Abstract:

In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.
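As an illustration of the quantity M that maximum parsimony minimizes, the sketch below counts the minimum number of changes for each candidate topology with Fitch's algorithm and picks the topology with the smallest count. The four-taxon alignment and the names `fitch_changes` and `parsimony_score` are invented for this example; the abstract's simulations are not reproduced.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes on `tree` for a single site.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees;
    `states` maps each leaf name to its observed character.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # The children can agree on a state: no extra change needed here.
            return common, left_cost + right_cost
        # The children disagree: one change is required on this branch pair.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_score(tree, alignment):
    """Sum the per-site change counts over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy data (an assumption for illustration only).
ALIGNMENT = {"A": "AAGT", "B": "AAGC", "C": "ACTT", "D": "ACTC"}
TOPOLOGIES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(tree, ALIGNMENT) for name, tree in TOPOLOGIES.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

With only four sites, a single column is enough to shift the ranking of the topologies, which gives a feel for why small n makes the optimum unreliable.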

Relevance:

80.00%

Abstract:

The emotif database is a collection of more than 170 000 highly specific and sensitive protein sequence motifs representing conserved biochemical properties and biological functions. These protein motifs are derived from 7697 sequence alignments in the BLOCKS+ database (released on June 23, 2000) and all 8244 protein sequence alignments in the PRINTS database (version 27.0) using the emotif-maker algorithm developed by Nevill-Manning et al. (Nevill-Manning,C.G., Wu,T.D. and Brutlag,D.L. (1998) Proc. Natl Acad. Sci. USA, 95, 5865–5871; Nevill-Manning,C.G., Sethi,K.S., Wu,T.D. and Brutlag,D.L. (1997) ISMB-97, 5, 202–209). Since the amino acids and the groups of amino acids in these sequence motifs represent critical positions conserved in evolution, search algorithms employing the emotif patterns can identify and classify more widely divergent sequences than methods based on global sequence similarity. The emotif protein pattern database is available at http://motif.stanford.edu/emotif/.
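A motif built from amino acids and groups of amino acids at conserved positions behaves much like a regular expression. The following sketch assumes that analogy and uses a made-up pattern and made-up sequences; it does not use real emotif entries or the emotif-maker algorithm.

```python
import re

# Hypothetical motif: each bracket lists the residues allowed at one conserved
# position, and "." accepts any residue.
MOTIF = re.compile(r"[ILV]G[DE].[ST]")

# Hypothetical protein sequences in one-letter code.
SEQUENCES = {
    "seq1": "MKTAIGDASLLK",
    "seq2": "MKTVGEPTQQ",
    "seq3": "MKTAAGGASLLK",
}


def find_motif(pattern, sequences):
    """Return {name: (start, matched_text)} for sequences containing the motif."""
    hits = {}
    for name, seq in sequences.items():
        match = pattern.search(seq)
        if match:
            hits[name] = (match.start(), match.group())
    return hits


hits = find_motif(MOTIF, SEQUENCES)
print(hits)
```

Here `seq1` and `seq2` share little beyond the conserved positions, yet both match, while `seq3` differs from `seq1` by only two residues and does not.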

Relevance:

80.00%

Abstract:

Is the pathway of protein folding determined by the relative stability of folding intermediates, or by the relative height of the activation barriers leading to these intermediates? This is a fundamental question for resolving the Levinthal paradox, which stated that protein folding by a random search mechanism would require a time too long to be plausible. To answer this question, we have studied the guanidinium chloride (GdmCl)-induced folding/unfolding of staphylococcal nuclease (SNase; formerly EC 3.1.4.7, now called microbial nuclease or endonuclease, EC 3.1.31.1) by stopped-flow circular dichroism (CD) and differential scanning microcalorimetry (DSC). The data show that while the equilibrium transition is a quasi-two-state process, kinetics in the 2-ms to 500-s time range are triphasic. Data support the sequential mechanism for SNase folding: U3 <--> U2 <--> U1 <--> N0, where U1, U2, and U3 are substates of the unfolded protein and N0 is the native state. Analysis of the relative population of the U1, U2, and U3 species in 2.0 M GdmCl gives delta-G values for the U3 --> U2 reaction of +0.1 kcal/mol and for the U2 --> U1 reaction of -0.49 kcal/mol. The delta-G value for the U1 --> N0 reaction is calculated to be -4.5 kcal/mol from DSC data. The activation energy, enthalpy, and entropy for each kinetic step are also determined. These results allow us to draw the following four conclusions. (i) Although the U1, U2, and U3 states are nearly isoenergetic, no random walk occurs among them during the folding. The pathway of folding is unique and sequential. In other words, the relative stability of the folding intermediates does not dictate the folding pathway. Instead, the folding is a descent toward the global free-energy minimum of the native state via the least activation path in the vast energy landscape. Barrier avoidance leads the way, and barrier height limits the rate. Thus, the Levinthal paradox is not applicable to the protein-folding problem. 
(ii) The main folding reaction (U1 --> N0), in which the peptide chain acquires most of its free energy (via van der Waals' contacts, hydrogen bonding, and electrostatic interactions), is a highly concerted process. These energy-acquiring events take place in a single kinetic phase. (iii) U1 appears to be a compact unfolded species; the rate of conversion of U2 to U1 depends on the viscosity of solution. (iv) All four relaxation times reported here depend on GdmCl concentrations: it is likely that none involve the cis/trans isomerization of prolines. Finally, a mechanism is presented in which formation of sheet-like chain conformations and a hydrophobic condensation event precede the main-chain folding reaction.
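To see what "nearly isoenergetic" means for the unfolded substates, the quoted delta-G values can be converted into equilibrium ratios through K = exp(-delta-G / RT). The sketch below is only a worked calculation: the temperature is an assumption (25 °C, which the abstract does not state), and the function name is invented.

```python
import math

R_KCAL = 1.987e-3        # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15   # ASSUMPTION: 25 degrees C; not given in the abstract


def equilibrium_constant(delta_g, temperature=TEMPERATURE_K):
    """K = [product]/[reactant] for a step with free-energy change delta_g (kcal/mol)."""
    return math.exp(-delta_g / (R_KCAL * temperature))


k_u3_u2 = equilibrium_constant(+0.1)    # [U2]/[U3]
k_u2_u1 = equilibrium_constant(-0.49)   # [U1]/[U2]

# Relative populations of the three unfolded substates, normalised to 1.
weights = {"U3": 1.0, "U2": k_u3_u2, "U1": k_u3_u2 * k_u2_u1}
total = sum(weights.values())
populations = {state: w / total for state, w in weights.items()}

print(f"[U2]/[U3] = {k_u3_u2:.2f}, [U1]/[U2] = {k_u2_u1:.2f}")
print({state: round(p, 2) for state, p in populations.items()})
```

Under this temperature assumption every substate keeps a population of the same order of magnitude, so stability differences alone would not single out one route through U3, U2 and U1.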

Relevance:

80.00%

Abstract:

In the past decade, tremendous advances in the state of the art of automatic speech recognition by machine have taken place. A reduction in the word error rate by more than a factor of 5 and an increase in recognition speeds by several orders of magnitude (brought about by a combination of faster recognition search algorithms and more powerful computers), have combined to make high-accuracy, speaker-independent, continuous speech recognition for large vocabularies possible in real time, on off-the-shelf workstations, without the aid of special hardware. These advances promise to make speech recognition technology readily available to the general public. This paper focuses on the speech recognition advances made through better speech modeling techniques, chiefly through more accurate mathematical modeling of speech sounds.

Relevance:

80.00%

Abstract:

This article investigates automatic techniques for finding an optimal feature model for a transition-based dependency parser. We present a comparative study of search algorithms, validation schemes, and decision rules, and show that our methods can produce complex models that give better results than models following default configurations.
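One simple search algorithm for this kind of feature-model optimization is greedy forward selection, sketched below. It is a generic illustration rather than the article's method: the feature names, the gain table and the `toy_validation_score` function are hypothetical stand-ins for training a parser and scoring it on validation data.

```python
def greedy_forward_selection(candidates, score):
    """Add one feature at a time, keeping it only if the validation score improves.

    `score` takes a list of feature names and returns a number (higher is better).
    """
    selected = []
    best = score(selected)
    improved = True
    while improved:
        improved = False
        best_feature = None
        for feature in candidates:
            if feature in selected:
                continue
            trial = score(selected + [feature])
            if trial > best:
                best, best_feature, improved = trial, feature, True
        if improved:
            selected.append(best_feature)
    return selected, best


# Hypothetical stand-in for "train the parser and measure validation accuracy":
# each feature adds a fixed gain, and every feature costs a small penalty that
# mimics overfitting as the model grows.
GAIN = {"word": 0.30, "pos": 0.25, "lemma": 0.02, "suffix": 0.05}


def toy_validation_score(features):
    return 0.40 + sum(GAIN[f] for f in features) - 0.03 * len(features)


selected, best_score = greedy_forward_selection(list(GAIN), toy_validation_score)
print(selected, round(best_score, 2))
```

The decision rule here is "accept any strict improvement"; requiring a minimum improvement, or scoring by cross-validation instead of a single held-out set, are the kinds of alternatives such a comparative study would weigh.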

Relevance:

80.00%

Abstract:

We introduce a novel way of measuring the entropy of a set of values undergoing changes. Such a measure becomes useful when analyzing the temporal development of an algorithm designed to numerically update a collection of values such as artificial neural network weights undergoing adjustments during learning. We measure the entropy as a function of the phase-space of the values, i.e. their magnitude and velocity of change, using a method based on the abstract measure of entropy introduced by the philosopher Rudolf Carnap. By constructing a time-dynamic two-dimensional Voronoi diagram using Voronoi cell generators with coordinates of value- and value-velocity (change of magnitude), the entropy becomes a function of the cell areas. We term this measure teleonomic entropy since it can be used to describe changes in any end-directed (teleonomic) system. The usefulness of the method is illustrated when comparing the different approaches of two search algorithms, a learning artificial neural network and a population of discovering agents. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

80.00%

Abstract:

Human perception is finely tuned to extract structure about the 4D world of time and space as well as properties such as color and texture. Developing intuitions about spatial structure beyond 4D requires exploiting other perceptual and cognitive abilities. One of the most natural ways to explore complex spaces is for a user to actively navigate through them, using local explorations and global summaries to develop intuitions about structure, and then testing the developing ideas by further exploration. This article provides a brief overview of a technique for visualizing surfaces defined over moderate-dimensional binary spaces, by recursively unfolding them onto a 2D hypergraph. We briefly summarize the uses of a freely available Web-based visualization tool, Hyperspace Graph Paper (HSGP), for exploring fitness landscapes and search algorithms in evolutionary computation. HSGP provides a way for a user to actively explore a landscape, from simple tasks such as mapping the neighborhood structure of different points, to seeing global properties such as the size and distribution of basins of attraction or how different search algorithms interact with landscape structure. It has been most useful for exploring recursive and repetitive landscapes, and its strength is that it allows intuitions to be developed through active navigation by the user, and exploits the visual system's ability to detect pattern and texture. The technique is most effective when applied to continuous functions over Boolean variables using 4 to 16 dimensions.

Relevance:

80.00%

Abstract:

In electronic support, receivers must maintain surveillance over the very wide portion of the electromagnetic spectrum in which threat emitters operate. A common approach is to use a receiver with a relatively narrow bandwidth that sweeps its centre frequency over the threat bandwidth to search for emitters. The sequence and timing of changes in the centre frequency constitute a search strategy. The search can be expedited, if there is intelligence about the operational parameters of the emitters that are likely to be found. However, it can happen that the intelligence is deficient, untrustworthy or absent. In this case, what is the best search strategy to use? A random search strategy based on a continuous-time Markov chain (CTMC) is proposed. When the search is conducted for emitters with a periodic scan, it is shown that there is an optimal configuration for the CTMC. It is optimal in the sense that the expected time to intercept an emitter approaches linearity most quickly with respect to the emitter's scan period. A fast and smooth approach to linearity is important, as other strategies can exhibit considerable and abrupt variations in the intercept time as a function of scan period. In theory and numerical examples, the optimum CTMC strategy is compared with other strategies to demonstrate its superior properties.
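A random search strategy of this general kind can be sketched by sampling a receiver schedule from a continuous-time Markov chain. The sketch assumes the simplest possible configuration (the same mean dwell time in every band and uniform jumps to any other band); it does not reproduce the optimal CTMC derived in the paper, and `sample_ctmc_schedule` and its parameters are invented names.

```python
import random


def sample_ctmc_schedule(n_bands, mean_dwell, horizon, rng):
    """Return [(start_time, band), ...] for one random sweep schedule.

    The receiver holds each band for an exponentially distributed time with
    mean `mean_dwell`, then jumps uniformly at random to a different band.
    Requires n_bands >= 2.
    """
    t = 0.0
    band = rng.randrange(n_bands)
    schedule = []
    while t < horizon:
        schedule.append((t, band))
        t += rng.expovariate(1.0 / mean_dwell)   # exponential holding time
        band = rng.choice([b for b in range(n_bands) if b != band])
    return schedule


rng = random.Random(1)
schedule = sample_ctmc_schedule(n_bands=5, mean_dwell=0.2, horizon=3.0, rng=rng)
for start, band in schedule[:5]:
    print(f"t = {start:.3f} s: tune to band {band}")
```

Because the dwell times are random, such a schedule cannot stay synchronised with any particular emitter scan period, which is the intuition behind preferring it when intelligence about the emitters is missing.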

Relevance:

80.00%

Abstract:

A system for predicting the development of unstable processes is presented. It is based on a decision-tree method. A technique for processing expert information is proposed; it is required for constructing and processing the decision tree. In particular, the data are specified in fuzzy form. Original algorithms that search for optimal development paths of the forecast process are described. They are oriented toward processing large trees whose arcs carry vector estimates.

Relevance:

80.00%

Abstract:

* The research was supported by INTAS 00-397 and 00-626 Projects.

Relevance:

80.00%

Abstract:

Real world search problems, characterised by nonlinearity, noise and multidimensionality, are often best solved by hybrid algorithms. Techniques embodying different necessary features are triggered at specific iterations, in response to the current state of the problem space. In the existing literature, this alternation is managed either statically (through pre-programmed policies) or dynamically, at the cost of high coupling with algorithm inner representation. We extract two design patterns for hybrid metaheuristic search algorithms, the All-Seeing Eye and the Commentator patterns, which we argue should be replaced by the more flexible and loosely coupled Simple Black Box (Two-B) and Utility-based Black Box (Three-B) patterns that we propose here. We recommend the Two-B pattern for purely fitness based hybridisations and the Three-B pattern for more generic search quality evaluation based hybridisations.

Relevance:

80.00%

Abstract:

The ultimate problem considered in this thesis is modeling a high-dimensional joint distribution over a set of discrete variables. For this purpose, we consider classes of context-specific graphical models and the main emphasis is on learning the structure of such models from data. Traditional graphical models compactly represent a joint distribution through a factorization justified by statements of conditional independence which are encoded by a graph structure. Context-specific independence is a natural generalization of conditional independence that only holds in a certain context, specified by the conditioning variables. We introduce context-specific generalizations of both Bayesian networks and Markov networks by including statements of context-specific independence which can be encoded as a part of the model structures. For the purpose of learning context-specific model structures from data, we derive score functions, based on results from Bayesian statistics, by which the plausibility of a structure is assessed. To identify high-scoring structures, we construct stochastic and deterministic search algorithms designed to exploit the structural decomposition of our score functions. Numerical experiments on synthetic and real-world data show that the increased flexibility of context-specific structures can more accurately emulate the dependence structure among the variables and thereby improve the predictive accuracy of the models.

Relevance:

80.00%

Abstract:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encode high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. 
Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
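How binary codes make search and storage cheap can be shown with a small sketch. It deliberately swaps in the simplest unsupervised scheme, random-hyperplane (sign of random projection) hashing, instead of the supervised and incremental methods described above; the descriptors are random toy vectors, and `make_hasher`, `hamming` and `nearest` are invented names.

```python
import random


def make_hasher(dim, n_bits, rng):
    """Build a function mapping a `dim`-dimensional vector to an `n_bits`-bit integer.

    Each bit records on which side of one random hyperplane the vector lies.
    """
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_vector(vector):
        code = 0
        for plane in planes:
            dot = sum(p * x for p, x in zip(plane, vector))
            code = (code << 1) | (dot >= 0)
        return code

    return hash_vector


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def nearest(query_code, codes):
    """Index of the stored code closest to `query_code` in Hamming distance."""
    return min(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))


rng = random.Random(0)
hash_vector = make_hasher(dim=8, n_bits=16, rng=rng)

# Toy "image descriptors": random vectors standing in for real features.
database = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(100)]
codes = [hash_vector(v) for v in database]

query = database[42]
index = nearest(hash_vector(query), codes)
print(index, hamming(hash_vector(query), codes[index]))
```

Each descriptor is reduced to 16 bits, and comparing two items costs one XOR plus a bit count. A learned hash function aims to make small Hamming distances reflect semantic similarity, which random hyperplanes do not attempt.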

Relevance:

80.00%

Abstract:

This thesis addresses the problem of production scheduling and optimization in a multi-machine environment subject to constraints on material resources in a plastic extrusion plant. The economic criterion around which the study is built is the minimization of the total weighted tardiness, because it is a very important criterion for meeting deadlines. We propose an exact approach, based on a mathematical formulation capable of giving optimal solutions, and a heuristic approach that relies on two solution construction methods, serial and parallel, together with a set of neighborhood search methods (simulated annealing, tabu search, GRASP, and a genetic algorithm) with five neighborhood variants. To conform fully to the reality of the plastics industry, we took into account certain very common characteristics, such as tool changeover times on the machines when one production order follows another on a given machine. The availability of extruders and extrusion dies is the bottleneck in this scheduling problem. Series of experiments on test problems were carried out to evaluate the quality of the solutions obtained with the different proposed algorithms. Analysis of the results showed that the solution construction methods alone are not sufficient to ensure good results, and that the neighborhood search methods give solutions of very good quality. The choice of neighborhood is important for refining the quality of the solution obtained. Keywords: scheduling, optimization, extrusion, mathematical formulation, heuristic, simulated annealing, tabu search, GRASP, genetic algorithm
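The combination of a construction method with a neighborhood search can be sketched on a much simpler problem than the one in the thesis. The example below is a deliberate simplification: a single machine, no changeover times and no resource constraints. It builds an initial sequence by earliest due date and improves it with simulated annealing over a swap neighborhood. The job data, parameter values and function names are all assumptions.

```python
import math
import random


def weighted_tardiness(order, jobs):
    """Total weighted tardiness of processing `jobs` in the given order.

    Each job is a tuple (processing_time, due_date, weight).
    """
    clock = 0
    total = 0
    for j in order:
        processing, due, weight = jobs[j]
        clock += processing
        total += weight * max(0, clock - due)
    return total


def simulated_annealing(jobs, rng, t_start=50.0, t_end=0.1, cooling=0.95,
                        moves_per_temp=50):
    """Improve an earliest-due-date sequence by annealing over pairwise swaps."""
    current = sorted(range(len(jobs)), key=lambda j: jobs[j][1])  # construction step
    current_cost = weighted_tardiness(current, jobs)
    best, best_cost = current[:], current_cost
    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            i, k = rng.sample(range(len(jobs)), 2)
            current[i], current[k] = current[k], current[i]       # swap move
            new_cost = weighted_tardiness(current, jobs)
            delta = new_cost - current_cost
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current_cost = new_cost                           # accept the move
                if current_cost < best_cost:
                    best, best_cost = current[:], current_cost
            else:
                current[i], current[k] = current[k], current[i]   # undo the move
        temp *= cooling
    return best, best_cost


# Hypothetical instance: (processing_time, due_date, weight) per job.
JOBS = [(4, 5, 3), (2, 3, 1), (6, 8, 2), (3, 4, 4), (5, 14, 1)]

edd_order = sorted(range(len(JOBS)), key=lambda j: JOBS[j][1])
edd_cost = weighted_tardiness(edd_order, JOBS)
best_order, best_cost = simulated_annealing(JOBS, random.Random(0))
print("EDD cost:", edd_cost, "annealed cost:", best_cost, "order:", best_order)
```

Replacing the swap move with an insertion or block move changes the neighborhood without touching the rest of the loop.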