138 results for Evolutionary algorithm (EA)
at Université de Lausanne, Switzerland
Abstract:
In recent years, protein-ligand docking has become a powerful tool for drug development. Although several approaches suitable for high-throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are needed when the binding mode of lead compounds is not determined experimentally but is needed for structure-based lead optimization. We present here a new docking software, called EADock, that aims at this goal. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure and, in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure, excluding the latter. This validation illustrates the efficiency of our sampling strategy, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account. Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes in a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the alphaVbeta3 integrin, starting far away from the binding pocket.
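The success criterion used in this benchmark (a pose counts as correct when its RMSD to the crystal structure is below 2 Å) can be sketched in a few lines. This is only an illustration, not EADock code: the function names `rmsd` and `is_correct_pose` are hypothetical, and the sketch assumes that both poses list the same atoms in the same order and share the protein's coordinate frame, so no superposition is applied.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two poses.

    Assumption: both arguments are lists of (x, y, z) tuples in Å,
    with atoms in the same order; no superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_correct_pose(pose, crystal, threshold=2.0):
    """Apply the abstract's criterion: RMSD to the crystal pose below 2 Å."""
    return rmsd(pose, crystal) < threshold
```

For example, a two-atom ligand shifted rigidly by 1 Å has an RMSD of 1 Å and would count as a correct binding mode, whereas a 3 Å shift would not.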
Abstract:
3 Summary
3.1 English
The pharmaceutical industry has been facing several challenges in recent years, and the optimization of its drug discovery pipeline is believed to be the only viable solution. High-throughput techniques contribute actively to this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information that they can produce. In silico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both rely heavily on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several software packages are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal is presented, EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure and, in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K ATPase, and in the activation of the nuclear hormone peroxisome proliferator-activated receptor α (PPARα). It also helped to understand the action of common pollutants (phthalates) on PPARγ, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the α5β1 integrin and for the human PPARα. In both cases, the designed peptides presented activities comparable to that of well-established ligands such as the anticancer drug Cilengitide and Wy14,643, respectively.
3.2 French (English translation)
The recent difficulties of the pharmaceutical industry seem solvable only by optimizing its drug development process. This optimization increasingly involves so-called "high-throughput" techniques, which are particularly effective when coupled with computational tools able to manage the mass of data produced. In silico approaches such as virtual screening or the rational design of new molecules are now routinely used. Both rely on the ability to predict the details of the molecular interaction between a molecule resembling an active pharmaceutical ingredient (API) and a target protein of therapeutic interest. Benchmarks of software tackling this prediction are flattering, but several problems remain. The recent literature tends to question their reliability, asserting an emerging need for more accurate approaches to the binding mode. This accuracy is essential for computing the binding free energy, which is directly linked to the affinity of the potential API for the target protein, and indirectly linked to its biological activity. An accurate prediction is of particular importance for the discovery and optimization of new active molecules. This thesis presents a new software, EADock, which emphasizes such accuracy. This hybrid evolutionary algorithm uses two selection pressures, combined with a sophisticated management of diversity. EADock relies on CHARMM for energy calculations and the handling of atomic coordinates. Its validation was carried out on 37 crystallized protein-ligand complexes, including 11 different proteins. The search space was extended to a sphere of 15 Å radius around the center of mass of the crystallized ligand and, unlike in usual benchmarks, the algorithm started from optimized solutions with an RMSD of up to 10 Å from the crystal structure. This validation highlighted the efficiency of our search heuristic, since binding modes with an RMSD lower than 2 Å from the crystal structure were ranked first for 68% of the complexes. When the five best solutions are taken into account, the success rate rises to 78%, and to 92% when the whole of the last generation is taken into account. Most prediction errors can be attributed to the presence of crystal contacts. Since then, EADock has been used to understand the molecular mechanisms involved in the regulation of the Na,K ATPase and in the activation of the peroxisome proliferator-activated receptor α (PPARα). It also made it possible to describe the interaction of commonly encountered pollutants with PPARγ, as well as the influence of the metabolism of Imatinib (an anticancer API) on its binding to the Bcr-Abl kinase. An approach based on predicting the interactions of molecular fragments with the target protein is also proposed. It led to the discovery of new peptidic ligands of PPARα and of the α5β1 integrin. In both cases, the activity of these new peptides is comparable to that of well-established ligands, such as Wy14,643 for the former and Cilengitide (an anticancer API) for the latter.
Abstract:
Optimizing collective behavior in multiagent systems requires algorithms to find not only appropriate individual behaviors but also a suitable composition of agents within a team. Over the last two decades, evolutionary methods have emerged as a promising approach for the design of agents and their compositions into teams. The choice of a crossover operator that facilitates the evolution of optimal team composition is recognized to be crucial, but so far, it has never been thoroughly quantified. Here, we highlight the limitations of two different crossover operators that exchange entire agents between teams: restricted agent swapping (RAS), which exchanges only corresponding agents between teams, and free agent swapping (FAS), which allows an arbitrary exchange of agents. Our results show that RAS suffers from premature convergence, whereas FAS entails insufficient convergence. Consequently, in both cases, the exploration and exploitation aspects of the evolutionary algorithm are not well balanced, resulting in the evolution of suboptimal team compositions. To overcome this problem, we propose combining the two methods. Our approach first applies FAS to explore the search space and then RAS to exploit it. This mixed approach is a much more efficient strategy for the evolution of team compositions than either strategy on its own. Our results suggest that such a mixed agent-swapping algorithm should always be preferred whenever the optimal composition of individuals in a multiagent system is unknown.
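The difference between the two operators can be made concrete with a minimal sketch. It rests on assumptions that the abstract does not state: a team is represented as a Python list of agents, each operator exchanges a single agent per call, and the mixed strategy switches from FAS to RAS at a fixed generation (`switch_generation` is a hypothetical parameter). The function names are illustrative and are not taken from the paper.

```python
import random


def restricted_agent_swap(team_a, team_b, rng):
    """RAS: exchange the agents that occupy the same position in both teams."""
    child_a, child_b = list(team_a), list(team_b)
    i = rng.randrange(len(child_a))
    child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def free_agent_swap(team_a, team_b, rng):
    """FAS: exchange an agent of one team with an arbitrary agent of the other."""
    child_a, child_b = list(team_a), list(team_b)
    i = rng.randrange(len(child_a))
    j = rng.randrange(len(child_b))
    child_a[i], child_b[j] = child_b[j], child_a[i]
    return child_a, child_b


def mixed_swap(team_a, team_b, generation, switch_generation, rng):
    """Mixed strategy: FAS early (exploration), RAS afterwards (exploitation).

    Assumption: the switch happens at a fixed generation; the abstract only
    says that FAS is applied first and RAS second.
    """
    if generation < switch_generation:
        return free_agent_swap(team_a, team_b, rng)
    return restricted_agent_swap(team_a, team_b, rng)
```

Under RAS an agent always stays in its original slot, so the arrangement of roles within a team is preserved; under FAS an agent can land in any slot of the other team, which is what lets the search reach new team compositions.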
Abstract:
Abstract
The main objective of this work is to show how the choice of the temporal dimension and of the spatial structure of the population influences an artificial evolutionary process. In the field of Artificial Evolution we can observe a common trend of synchronously evolving panmictic populations, i.e., populations in which any individual can be recombined with any other individual. As early as the 1990s, the works of Spiessens and Manderick, Sarma and De Jong, and Gorges-Schleuter pointed out that, if a population is structured according to a mono- or bi-dimensional regular lattice, the evolutionary process shows a different dynamic with respect to the panmictic case. In particular, Sarma and De Jong have studied the selection pressure (i.e., the diffusion of a best individual when only the selection operator is active) induced by a regular bi-dimensional structure of the population, proposing a logistic modeling of the selection pressure curves. This model assumes that the diffusion of a best individual in a population follows an exponential law. We show that such a model is inadequate to describe the process, since the growth speed must be quadratic or sub-quadratic in the case of a bi-dimensional regular lattice. New linear and sub-quadratic models are proposed for modeling the selection pressure curves in, respectively, mono- and bi-dimensional regular structures. These models are extended to describe the process when asynchronous evolutions are employed. Different dynamics of the populations imply different search strategies of the resulting algorithm when the evolutionary process is used to solve optimisation problems. A benchmark of both discrete and continuous test problems is used to study the search characteristics of the different topologies and updates of the populations. In the last decade, the pioneering studies of Watts and Strogatz have shown that most real networks, both in the biological and sociological worlds as well as in man-made structures, have mathematical properties that set them apart from regular and random structures. In particular, they introduced the concept of small-world graphs, and they showed that this new family of structures has interesting computing capabilities. Populations structured according to these new topologies are proposed, and their evolutionary dynamics are studied and modeled. We also propose asynchronous evolutions for these structures, and the resulting evolutionary behaviors are investigated. Many man-made networks have grown, and are still growing, incrementally, and explanations have been proposed for their actual shape, such as Albert and Barabási's preferential attachment growth rule. However, many actual networks seem to have undergone some kind of Darwinian variation and selection. Thus, how these networks might have come to be selected is an interesting yet unanswered question. In the last part of this work, we show how a simple evolutionary algorithm can enable the emergence of these kinds of structures for two prototypical problems of the automata networks world, the majority classification and the synchronisation problems.
Synopsis (English translation)
The main objective of this work is to show the influence of the choice of the temporal dimension and of the spatial structure of a population on an artificial evolutionary process. In the field of Artificial Evolution, one can observe a tendency to evolve panmictic populations synchronously, where each individual can be recombined with any other individual in the population. As early as the 1990s, Spiessens and Manderick, Sarma and De Jong, and Gorges-Schleuter observed that, if a population has a regular mono- or bi-dimensional structure, the evolutionary process shows a dynamic different from that of a panmictic population. In particular, Sarma and De Jong studied the selection pressure (i.e., the diffusion of an optimal individual when only the selection operator is active) induced by a regular bi-dimensional structure of the population, proposing a logistic model of the selection pressure curves. This model assumes that the diffusion of an optimal individual follows an exponential law. We show that this model is inadequate to describe this phenomenon, since the growth speed must follow a quadratic or sub-quadratic law in the case of a regular bi-dimensional structure. New linear and sub-quadratic models are proposed for mono- and bi-dimensional structures. These models are extended to describe asynchronous evolutionary processes. Different population dynamics imply different search strategies of the resulting algorithm when the evolutionary process is used to solve optimisation problems. A set of discrete and continuous problems is used to study the search characteristics of the different topologies and update policies of the populations. In recent years, the studies of Watts and Strogatz have shown that many networks, in the biological and sociological worlds as well as in man-made structures, have mathematical properties that set them apart from both regular and random structures. In particular, they introduced the notion of small-world graphs and showed that this new family of structures has interesting dynamical properties. Populations with these new topologies are proposed, and their evolutionary dynamics are studied and modeled. For populations with these structures, asynchronous evolution methods are proposed, and the resulting dynamics are studied. Many man-made networks have formed incrementally, and explanations for their current shape have been proposed, such as the preferential attachment of Albert and Barabási. However, many existing networks must be the product of a process of Darwinian variation and selection. Thus, how these structures might have been selected is an interesting question that remains unanswered. In the last part of this work, we show how a simple artificial evolutionary process allows this type of topology to emerge for two prototypical problems of automata networks, the density and synchronisation tasks.
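The claim that the spread of a best individual is at most linear on a mono-dimensional lattice and at most quadratic on a bi-dimensional one can be checked with a small idealized simulation. The sketch below is not the thesis's model: it assumes synchronous updates, a ring with radius-1 neighbourhoods, a torus with von Neumann neighbourhoods, and a deterministic rule in which every cell adopts the best individual as soon as any neighbour holds it, which gives the fastest possible diffusion. The function names are hypothetical.

```python
def takeover_curve_ring(size, steps):
    """Number of copies of the best individual per step on a ring (1-D lattice)."""
    best = [False] * size
    best[0] = True  # a single best individual at the start
    counts = [1]
    for _ in range(steps):
        best = [
            best[i] or best[(i - 1) % size] or best[(i + 1) % size]
            for i in range(size)
        ]
        counts.append(sum(best))
    return counts


def takeover_curve_torus(side, steps):
    """Same count on a side x side torus with von Neumann neighbourhoods."""
    best = [[False] * side for _ in range(side)]
    best[0][0] = True
    counts = [1]
    for _ in range(steps):
        best = [
            [
                best[r][c]
                or best[(r - 1) % side][c]
                or best[(r + 1) % side][c]
                or best[r][(c - 1) % side]
                or best[r][(c + 1) % side]
                for c in range(side)
            ]
            for r in range(side)
        ]
        counts.append(sum(sum(row) for row in best))
    return counts
```

On a ring of 21 cells the counts grow as 1, 3, 5, 7 (linear in the step number t), while on a 21 x 21 torus they grow as 1, 5, 13, 25, i.e. 2t² + 2t + 1 (quadratic in t). Neither curve can be exponential, in line with the argument against the logistic model.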
Abstract:
The genus Prunus L. is large and economically important. However, phylogenetic relationships within Prunus at low taxonomic levels, particularly in the subgenus Amygdalus L. s.l., remain poorly investigated. This paper attempts to document the evolutionary history of Amygdalus s.l. and to establish a temporal framework by assembling molecular data from conservative and variable molecular markers. The nuclear s6pdh gene, in combination with the plastid trnSG spacer, is analyzed with Bayesian and maximum likelihood methods. Since previous phylogenetic analyses with these markers lacked resolution, we additionally analyzed 13 nuclear SSR loci with the (δμ)² distance, followed by clustering with the unweighted pair group method with arithmetic mean (UPGMA). Our phylogenetic analysis with both sequence and SSR loci confirms the split between sections Amygdalus and Persica, comprising almonds and peaches, respectively. This result is in agreement with biogeographic data showing that each of the two sections is naturally distributed on each side of the Central Asian Massif chain. Using coalescent-based estimations, divergence times between the two sections varied strongly depending on whether sequence data were considered alone or combined with SSR data. The sequence-only estimate (5 million years ago) was congruent with the Central Asian Massif orogeny and subsequent climate change. Given the low level of differentiation within the two sections using both marker types, the utility of combining microsatellites and sequence data to address phylogenetic relationships at low taxonomic levels within Amygdalus is discussed. The recent evolutionary histories of almond and peach are discussed in view of the domestication processes that arose in these two phenotypically diverging gene pools: almonds and peaches were domesticated from the Amygdalus s.s. and Persica sections, respectively. Such economically important crops may serve as a good model to study divergent domestication processes in closely related gene pools.
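As an illustration of the SSR-based step, the (δμ)² distance between two populations is commonly computed as the squared difference in mean allele size at each locus, averaged over loci. The sketch below follows that standard definition; the data layout (a dictionary mapping a locus name to a list of allele sizes in repeat units) and the function name `delta_mu_squared` are assumptions made for the example and do not come from the paper. The resulting pairwise distances would then be passed to a UPGMA clustering routine, which is not shown.

```python
def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2 distance between two populations.

    Assumption: each population is a dict mapping a locus name to a list of
    allele sizes (in repeat units); only loci present in both are compared.
    """
    loci = sorted(set(pop_a) & set(pop_b))
    if not loci:
        raise ValueError("the two populations share no locus")
    total = 0.0
    for locus in loci:
        mean_a = sum(pop_a[locus]) / len(pop_a[locus])
        mean_b = sum(pop_b[locus]) / len(pop_b[locus])
        total += (mean_a - mean_b) ** 2  # squared difference of mean allele sizes
    return total / len(loci)
```

With made-up allele sizes, two loci whose mean sizes differ by 4 and 2 repeats give a distance of (16 + 4) / 2 = 10.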
Abstract:
Rubisco is responsible for the fixation of CO2 into organic compounds through photosynthesis and thus has great agronomic importance. It is well established that this enzyme suffers from slow catalysis, and its low specificity results in photorespiration, which is considered a waste of energy for the plant. However, natural variations exist, and some Rubisco lineages, such as in C4 plants, exhibit higher catalytic efficiencies coupled to lower specificities. These C4 kinetics could have evolved as an adaptation to the higher CO2 concentration present in C4 photosynthetic cells. In this study, using phylogenetic analyses on a large data set of C3 and C4 monocots, we showed that the rbcL gene, which encodes the large subunit of Rubisco, evolved under positive selection in independent C4 lineages. This confirms that selective pressures on Rubisco have been switched in C4 plants by the high CO2 environment prevailing in their photosynthetic cells. Eight rbcL codons evolving under positive selection in C4 clades were involved in parallel changes among the 23 independent monocot C4 lineages included in this study. These amino acids are potentially responsible for the C4 kinetics, and their identification opens new avenues for human-directed Rubisco engineering. The introgression of C4-like high-efficiency Rubisco would strongly enhance C3 crop yields in the future CO2-enriched atmosphere.
Abstract:
Pleistocene glacial and interglacial periods have moulded the evolutionary history of European cold-adapted organisms. The role of the different mountain massifs has, however, not been accurately investigated in the case of high-altitude insect species. Here, we focus on three closely related species of non-flying leaf beetles of the genus Oreina (Coleoptera, Chrysomelidae), which are often found in sympatry within the mountain ranges of Europe. After showing that the species concept as currently applied does not match barcoding results, we show, based on more than 700 sequences from one nuclear and three mitochondrial genes, the role of biogeography in shaping the phylogenetic hypothesis. Dating the phylogeny using an insect molecular clock, we show that the earliest lineages diverged more than 1 Mya and that the main shift in diversification rate occurred between 0.36 and 0.18 Mya. By using a probabilistic approach on the parsimony-based dispersal/vicariance framework (MP-DIVA) as well as a direct likelihood method of state change optimization, we show that the Alps acted as a crossroads with multiple events of dispersal to and reinvasion from neighbouring mountains. However, the relative importance of vicariance vs. dispersal events in the process of rapid diversification remains difficult to evaluate because of a bias towards overestimation of vicariance in the DIVA algorithm. Parallels are drawn with recent studies of cold-adapted species, although our study reveals novel patterns in diversity and genetic links between European mountains, and highlights the importance of neglected regions, such as the Jura and the Balkan range.
Abstract:
The evolution of grasses using C4 photosynthesis and their sudden rise to ecological dominance 3 to 8 million years ago is among the most dramatic examples of biome assembly in the geological record. A growing body of work suggests that the patterns and drivers of C4 grassland expansion were considerably more complex than originally assumed. Previous research has benefited substantially from dialog between geologists and ecologists, but current research must now integrate fully with phylogenetics. A synthesis of grass evolutionary biology with grassland ecosystem science will further our knowledge of the evolution of traits that promote dominance in grassland systems and will provide a new context in which to evaluate the relative importance of C4 photosynthesis in transforming ecosystems across large regions of Earth.
Abstract:
Trioecy is an uncommon sexual system in which males, females, and hermaphrodites co-occur as three clearly different gender classes. The evolutionary stability of trioecy is unclear, but would depend on factors such as hermaphroditic sex allocation and rates of outcrossing vs. selfing. Here, trioecious populations of Mercurialis annua are described for the first time. We examined the frequencies of females, males and hermaphrodites across ten natural populations and evaluated the association between the frequency of females and plant densities. Previous studies have shown that selfing rates in this species are density-dependent and are reduced in the presence of males, which produce substantially more pollen than hermaphrodites. Accordingly, we examined the evolutionary stability of trioecy using an experiment in which we (a) indirectly manipulated selfing rates by altering plant densities and the frequency of males in a fully factorial manner across 20 experimental plots and (b) examined the effect of these manipulations on the frequency of the three sex phenotypes in the next generation of plants. In the parental generation, we measured the seed and pollen allocations of hermaphrodites and compared them with allocations by unisexual plants. In natural populations, females occurred at higher frequencies in denser patches, a finding consistent with our expectations. Under our experimental conditions, however, no combination of plant densities and male frequencies was associated with increased frequencies of females. Our results suggest that the factors that regulate female frequencies in trioecious populations of M. annua are independent of those regulating male frequencies (density), and that the stable co-existence of all three sex phenotypes within populations is unlikely.
Abstract:
Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy-directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Abstract:
PURPOSE OF REVIEW: To provide updated insights into innate antiviral immunity and highlight prototypical evolutionary features of well-characterized HIV restriction factors. RECENT FINDINGS: Recently, a new HIV restriction factor, Myxovirus resistance 2, has been discovered and the region/residue responsible for its activity identified using an evolutionary approach. Furthermore, IFI16, an innate immunity protein known to sense several viruses, has been shown to contribute to the defense against HIV-1 by causing cell death upon sensing HIV-1 DNA. SUMMARY: Restriction factors against HIV show characteristic signatures of positive selection. Different patterns of accelerated sequence evolution can distinguish antiviral strategies (offense or defense) as well as the level of specificity of the antiviral properties. Sequence analysis of primate orthologs of restriction factors serves to localize functional domains and sites responsible for antiviral action. We use recent discoveries to illustrate how evolutionary genomic analyses help identify new antiviral genes and their mechanisms of action.
Abstract:
The heat- and odour-producing genus Arum (Araceae) has interested scientists for centuries. This long-term interest has allowed a deep knowledge of some complex processes, such as the physiology and dynamics of its characteristic lure-and-trap pollination system, to be built up. However, mainly because of its large distributional range and high degree of morphological variation, species limits and relationships are still under discussion. Today, the genus comprises 28 species subdivided into two subgenera, two sections and six subsections. In this study, the phylogeny of the genus is inferred on the basis of four plastid regions, and the evolution of several morphological characters is investigated. Our phylogenetic hypothesis is not in agreement with the current infrageneric classification of the genus and challenges the monophyly of several species. This demonstrates the need for a new infrageneric classification based on characters reflecting the evolution of this enigmatic genus. To investigate the biogeography of Arum in depth, further spatiotemporal analyses were performed, addressing the importance of the Mediterranean basin in the diversification of Arum. Our results suggest that its centre of origin was the European-Aegean region, and that major diversification happened during the last 10 Myr.
Abstract:
The question of why some social systems have evolved close inbreeding is particularly intriguing given expected short- and long-term negative effects of this breeding system. Using social spiders as a case study, we quantitatively show that the potential costs of avoiding inbreeding through dispersal and solitary living could have outweighed the costs of inbreeding depression in the origin of inbred spider sociality. We further review the evidence that despite being favored in the short term, inbred spider sociality may constitute in the long run an evolutionary dead end. We also review other cases, such as the naked mole rats and some bark and ambrosia beetles, mites, psocids, thrips, parasitic ants, and termites, in which inbreeding and sociality are associated and the evidence for and against this breeding system being, in general, an evolutionary dead end.