972 results for Evolutionary multiobjective optimization
Abstract:
3 Summary 3.1 English The pharmaceutical industry has faced several challenges in recent years, and the optimization of its drug discovery pipeline is believed to be the only viable solution. High-throughput techniques contribute actively to this optimization, especially when complemented by computational approaches that aim to rationalize the enormous amount of information they can produce. In silico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both rely heavily on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several software packages are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal, EADock, is presented. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. 
The search space was defined as a sphere of 15 Å radius around the center of mass of the ligand position in the crystal structure and, in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K-ATPase, and in the activation of the nuclear hormone peroxisome proliferator-activated receptor α (PPARα). It also helped to understand the action of common pollutants (phthalates) on PPARγ, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the α5β1 integrin and for the human PPARα. In both cases, the designed peptides presented activities comparable to that of well-established ligands such as the anticancer drug Cilengitide and Wy14,643, respectively. 3.2 French (translated into English) The recent difficulties of the pharmaceutical industry seem solvable only by optimizing its drug development process. This optimization increasingly involves so-called "high-throughput" techniques, which are particularly effective when coupled with computational tools able to manage the mass of data produced. 
In silico approaches such as virtual screening or the rational design of new molecules are now in routine use. Both rely on the ability to predict the details of the molecular interaction between a molecule resembling an active pharmaceutical ingredient (API) and a target protein of therapeutic interest. Benchmarks of software tackling this prediction are flattering, but several problems remain. The recent literature tends to question their reliability and points to an emerging need for approaches that predict the binding mode more precisely. This precision is essential for computing the binding free energy, which is directly linked to the affinity of the potential API for the target protein and indirectly linked to its biological activity. Accurate prediction is of particular importance for the discovery and optimization of new active molecules. This thesis presents a new software package, EADock, that emphasizes such precision. This hybrid evolutionary algorithm uses two selection pressures, combined with a sophisticated management of diversity. EADock relies on CHARMM for energy calculations and the handling of atomic coordinates. It was validated on 37 crystallized protein-ligand complexes, including 11 different proteins. The search space was extended to a sphere of 15 Å radius around the center of mass of the crystallized ligand and, unlike in the usual benchmarks, the algorithm started from optimized solutions with an RMSD of up to 10 Å from the crystal structure. This validation demonstrated the efficiency of our search heuristic, since binding modes with an RMSD below 2 Å from the crystal structure were ranked first for 68% of the complexes. 
When the five best solutions are taken into account, the success rate rises to 78%, and to 92% when the entire last generation is taken into account. Most prediction errors are attributable to the presence of crystal contacts. Since then, EADock has been used to understand the molecular mechanisms involved in the regulation of the Na,K-ATPase and in the activation of the peroxisome proliferator-activated receptor α (PPARα). It also made it possible to describe the interaction of commonly encountered pollutants with PPARγ, as well as the influence of the metabolism of Imatinib (an anticancer API) on its binding to the Bcr-Abl kinase. An approach based on predicting the interactions of molecular fragments with the target protein is also proposed. It led to the discovery of new peptidic ligands of PPARα and of the α5β1 integrin. In both cases, the activity of these new peptides is comparable to that of well-established ligands, such as Wy14,643 for the former and Cilengitide (an anticancer API) for the latter.
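The 2 Å success criterion used in the validation can be illustrated with a minimal sketch. It is not taken from EADock: the function names are hypothetical, and it assumes that both poses list the same atoms in the same order, that coordinates are already in the receptor frame (no superposition), and that ligand symmetry is ignored.

```python
import math

def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses given as lists of (x, y, z) in Å.

    Assumes identical atom ordering and a shared coordinate frame (no fitting).
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

def is_correct_binding_mode(predicted, crystal, threshold=2.0):
    """True when the predicted pose lies within `threshold` Å RMSD of the crystal pose."""
    return rmsd(predicted, crystal) < threshold

# Made-up two-atom poses, for illustration only.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0)]
print(is_correct_binding_mode(predicted, crystal))
```

A success rate such as the 68% reported above would then be the fraction of complexes whose top-ranked pose passes this check.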
Abstract:
Mixture materials, mix design, and pavement construction are not isolated steps in the concrete paving process. Each affects the other in ways that determine overall pavement quality and long-term performance. However, equipment and procedures commonly used to test concrete materials and concrete pavements have not changed in decades, leaving gaps in our ability to understand and control the factors that determine concrete durability. The concrete paving community needs tests that will adequately characterize the materials, predict interactions, and monitor the properties of the concrete. The overall objectives of this study are (1) to evaluate conventional and new methods for testing concrete and concrete materials to prevent material and construction problems that could lead to premature concrete pavement distress and (2) to examine and refine a suite of tests that can accurately evaluate concrete pavement properties. The project included three phases. In Phase I, the research team contacted each of 16 participating states to gather information about concrete and concrete material tests. A preliminary suite of tests to ensure long-term pavement performance was developed. The tests were selected to provide useful and easy-to-interpret results that can be performed reasonably and routinely in terms of time, expertise, training, and cost. The tests examine concrete pavement properties in five focal areas critical to the long life and durability of concrete pavements: (1) workability, (2) strength development, (3) air system, (4) permeability, and (5) shrinkage. The tests were relevant at three stages in the concrete paving process: mix design, preconstruction verification, and construction quality control. In Phase II, the research team conducted field testing in each participating state to evaluate the preliminary suite of tests and demonstrate the testing technologies and procedures using local materials. 
A Mobile Concrete Research Lab was designed and equipped to facilitate the demonstrations. This report documents the results of the 16 state projects. Phase III refined and finalized lab and field tests based on state project test data. The results of the overall project are detailed herein. The final suite of tests is detailed in the accompanying testing guide.
Abstract:
Abstract Phenotypic polymorphism is an ideal system to study natural selection in wild populations, because it allows tracking population genetic changes by means of phenotypic changes. A wide variety of polymorphic traits have been studied in numerous animals and plants, for example colour patterns in moths, snails and birds, human laterality, male reproductive strategies, plant morphology or mating systems. This thesis focused on Dactylorhiza sambucina, a rewardless European orchid species showing a striking flower colour polymorphism, with yellow-flowered and red-flowered individuals co-occurring in natural populations. Several studies have investigated its evolutionary ecology since Nilsson's seminal paper in 1980, with a particular emphasis on the evolution and maintenance of its colour polymorphism. One of the main selective forces proposed to maintain this colour polymorphism was pollinator-driven negative frequency-dependent selection (NFDS), in which each morph is advantaged when rare and comparatively disadvantaged when common. However, other investigators have recently questioned the occurrence of NFDS, and proposed instead that fluctuating selection may maintain this colour polymorphism. In this thesis, we aimed at reviewing and synthesizing these different studies, and also made our own contribution on D. sambucina reproductive ecology. Because numerous hypotheses have still to be tested, we concluded that we are a long way from understanding the evolution and dynamics of colour polymorphism in natural D. sambucina populations. Besides the debated question of colour polymorphism maintenance, one question remained to be tested: what are the consequences of polymorphism per se? We experimentally addressed this question using artificial populations of D. sambucina, and found no relationship between population phenotypic diversity and orchid pollination success. 
This finding suggests that polymorphism itself was not an advantage for deceptive species such as D. sambucina, contrary to expectations. Finally, we suggest potential research perspectives that could allow a better understanding of the evolutionary ecology of this species. Résumé (translated from French) Phenotypic polymorphism is an ideal biological system for studying the action of selection in natural populations, because genetic changes in a population can be tracked by studying the phenotypes of individuals. Numerous studies have documented phenotypic polymorphism in animals, for example laterality in humans or the colouration of snails and birds. In the plant kingdom, polymorphism is often associated with traits of the reproductive system. This thesis focuses on a European orchid species that produces no nectar, Dactylorhiza sambucina. This species has yellow-flowered and red-flowered individuals, which generally co-occur in natural populations. Several studies have investigated the evolutionary ecology of this species over the past 25 years, with the evolution and maintenance of this polymorphism as their central theme. The main selective force proposed to maintain this colour polymorphism is frequency-dependent selection exerted by pollinator behaviour. Each of the two colour variants is favoured when it is rare and disfavoured when it becomes common. Although this mechanism seems to operate, some authors doubt its importance and have proposed that temporal or spatial variation in selective forces could maintain colour polymorphism in D. sambucina. In this thesis, we set out to summarize and synthesize the results of these different studies, and also to present new data on the reproduction of this species. 
In light of these results, it appears that many points require further experiments and that the understanding of this biological system is still fragmentary. We also addressed a question left open in the literature: does colour polymorphism in itself confer an advantage on the species, as proposed by some authors? By constructing artificial populations of D. sambucina, we were able to show that colour polymorphism does not increase the reproductive success of the species. We conclude this work by proposing several lines of research that could lead to a better understanding of the ecology and evolution of this species.
Abstract:
Many questions in evolutionary biology require an estimate of divergence times but, for groups with a sparse fossil record, such estimates rely heavily on molecular dating methods. The accuracy of these methods depends on both an adequate underlying model and the appropriate implementation of fossil evidence as calibration points. We explore the effect of these in Poaceae (grasses), a diverse plant lineage with a very limited fossil record, focusing particularly on dating the early divergences in the group. We show that molecular dating based on a data set of plastid markers is strongly dependent on the model assumptions. In particular, an acceleration of evolutionary rates at the base of Poaceae followed by a deceleration in the descendants strongly biases methods that assume an autocorrelation of rates. This problem can be circumvented by using markers that have lower rate variation, and we show that phylogenetic markers extracted from complete nuclear genomes can be a useful complement to the more commonly used plastid markers. However, estimates of divergence times remain strongly affected by different implementations of fossil calibration points. Analyses calibrated with only macrofossils lead to estimates for the age of core Poaceae ∼51-55 Ma, but the inclusion of microfossil evidence pushes this age to 74-82 Ma and leads to lower estimated evolutionary rates in grasses. These results emphasize the importance of considering markers from multiple genomes and alternative fossil placements when addressing evolutionary issues that depend on ages estimated for important groups.
Abstract:
MOTIVATION: The analysis of molecular coevolution provides information on the potential functional and structural implication of positions along DNA sequences, and several methods are available to identify coevolving positions using probabilistic or combinatorial approaches. The specific nucleotide or amino acid profile associated with the coevolution process is, however, not estimated; only known profiles, such as the Watson-Crick constraint, are usually considered a priori in current measures of coevolution. RESULTS: Here, we propose a new probabilistic model, Coev, to identify coevolving positions and their associated profile in DNA sequences while incorporating the underlying phylogenetic relationships. The process of coevolution is modeled by a 16 × 16 instantaneous rate matrix that includes rates of transition as well as a profile of coevolution. We used simulated, empirical and illustrative data to evaluate our model and to compare it with a model of 'independent' evolution using the Akaike Information Criterion. We showed that the Coev model is able to discriminate between coevolving and non-coevolving positions and provides better specificity and sensitivity than other available approaches. We further demonstrate that the identification of the profile of coevolution can shed new light on the process of dependent substitution during lineage evolution.
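The model comparison mentioned above relies on the Akaike Information Criterion, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood. The sketch below shows only that comparison step. The log-likelihoods and parameter counts are made-up placeholders, not values from the Coev study, and the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def preferred_model(models):
    """Return the name of the model with the lowest AIC.

    `models` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(models, key=lambda name: aic(*models[name]))

# Hypothetical fits for one pair of positions (illustrative numbers only).
fits = {
    "independent": (-1250.0, 2),
    "coev": (-1240.0, 4),
}
print(preferred_model(fits))
```

With these placeholder numbers the extra parameters of the dependent model are justified because its AIC (2488) is lower than that of the independent model (2504).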
Abstract:
We examined the sequence variation of the mitochondrial DNA control region and cytochrome b gene of the house mouse (Mus musculus sensu lato) drawn from ca. 200 localities, with 286 new samples drawn primarily from previously unsampled portions of its Eurasian distribution, with the objective of further clarifying evolutionary episodes of this species before and after the onset of human-mediated long-distance dispersals. Phylogenetic analysis of the expanded data detected five equally distinct clades, with geographic ranges of northern Eurasia (musculus, MUS), India and Southeast Asia (castaneus, CAS), Nepal (unspecified, NEP), western Europe (domesticus, DOM) and Yemen (gentilulus). Our results confirm previous suggestions of Southwestern Asia as the likely place of origin of M. musculus and, specifically, of the region of Iran, Afghanistan, Pakistan and northern India as the ancestral homeland of CAS. The divergence of the subspecies lineages and the internal sublineage differentiation within CAS were estimated at 0.37-0.47 and 0.14-0.23 million years ago (mya), respectively, assuming a split of M. musculus and Mus spretus at 1.7 mya. Of the four CAS sublineages detected, only one extends to eastern parts of India, Southeast Asia, Indonesia, the Philippines, South China, Northeast China, Primorye, Sakhalin and Japan, implying a dramatic range expansion of CAS out of its homeland during an evolutionarily short time, perhaps associated with the spread of agricultural practices. Multiple and non-coincident eastward dispersal events of MUS sublineages to distant geographic areas, such as northern China, Russia and Korea, are inferred, with the possibility of several different routes.
Abstract:
Letter to the Editor on Wang M, Wang Q, Wang Z, Zhang X, Pan Y. The molecular evolutionary patterns of the insulin/FOXO signaling pathway
Abstract:
As modern molecular biology moves towards the analysis of biological systems as opposed to their individual components, the need for appropriate mathematical and computational techniques for understanding the dynamics and structure of such systems is becoming more pressing. For example, the modeling of biochemical systems using ordinary differential equations (ODEs) based on high-throughput, time-dense profiles is becoming more commonplace, which necessitates the development of improved techniques to estimate model parameters from such data. Due to the high dimensionality of this estimation problem, straightforward optimization strategies rarely produce correct parameter values, and hence current methods tend to utilize genetic/evolutionary algorithms to perform non-linear parameter fitting. Here, we describe a completely deterministic approach, which is based on interval analysis. This allows us to examine entire sets of parameters, and thus to exhaust the global search within a finite number of steps. In particular, we show how our method may be applied to a generic class of ODEs used for modeling biochemical systems called Generalized Mass Action Models (GMAs). In addition, we show that for GMAs our method is amenable to the technique in interval arithmetic called constraint propagation, which allows great improvement of its efficiency. To illustrate the applicability of our method we apply it to some networks of biochemical reactions appearing in the literature, showing in particular that, in addition to estimating system parameters in the absence of noise, our method may also be used to recover the topology of these networks.
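The idea of examining entire sets of parameters can be sketched with a toy branch-and-prune search. This is not the authors' algorithm: it handles a single power-law rate term v = k · x^g rather than a full GMA system, it uses ordinary floating point instead of outward-rounded interval arithmetic (so it is not rigorous), and all names and data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def width(self) -> float:
        return self.hi - self.lo

    def intersects(self, other: "Interval") -> bool:
        return self.lo <= other.hi and other.lo <= self.hi

    def halves(self):
        mid = 0.5 * (self.lo + self.hi)
        return Interval(self.lo, mid), Interval(mid, self.hi)


def rate_range(k: Interval, g: Interval, x: float) -> Interval:
    """Enclose k * x**g over the parameter box, assuming k.lo >= 0 and x > 0."""
    powers = (x ** g.lo, x ** g.hi)  # x**g is monotone in g for fixed x > 0
    return Interval(k.lo * min(powers), k.hi * max(powers))


def consistent_boxes(data, k, g, tol, min_width):
    """Return the small (k, g) boxes that cannot be ruled out by any data point."""
    stack, kept = [(k, g)], []
    while stack:
        kb, gb = stack.pop()
        feasible = all(
            rate_range(kb, gb, x).intersects(Interval(v - tol, v + tol))
            for x, v in data
        )
        if not feasible:
            continue  # the whole box contradicts a measurement: discard it
        if max(kb.width(), gb.width()) <= min_width:
            kept.append((kb, gb))
        elif kb.width() >= gb.width():
            stack.extend((half, gb) for half in kb.halves())
        else:
            stack.extend((kb, half) for half in gb.halves())
    return kept


# Noise-free toy data generated from k = 2, g = 0.5 (made up for illustration).
data = [(1.0, 2.0), (4.0, 4.0), (9.0, 6.0)]
boxes = consistent_boxes(data, Interval(0.5, 4.0), Interval(0.0, 1.0),
                         tol=0.05, min_width=0.05)
print(len(boxes), "candidate boxes remain")
```

Because a box is discarded only when its enclosure misses a measurement, the true parameters always survive, and the search terminates after finitely many bisections.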
Abstract:
MOTIVATION: The detection of positive selection is widely used to study gene and genome evolution, but its application remains limited by the high computational cost of existing implementations. We present a series of computational optimizations for more efficient estimation of the likelihood function on large-scale phylogenetic problems. We illustrate our approach using the branch-site model of codon evolution. RESULTS: We introduce novel optimization techniques that substantially outperform both CodeML from the PAML package and our previously optimized sequential version SlimCodeML. These techniques can also be applied to other likelihood-based phylogeny software. Our implementation scales well for large numbers of codons and/or species. It can therefore analyse substantially larger datasets than CodeML. We evaluated FastCodeML on different platforms and measured average sequential speedups of FastCodeML (single-threaded) versus CodeML of up to 5.8, average speedups of FastCodeML (multi-threaded) versus CodeML on a single node (shared memory) of up to 36.9 for 12 CPU cores, and average speedups of the distributed FastCodeML versus CodeML of up to 170.9 on eight nodes (96 CPU cores in total). AVAILABILITY AND IMPLEMENTATION: ftp://ftp.vital-it.ch/tools/FastCodeML/. CONTACT: selectome@unil.ch or nicolas.salamin@unil.ch.
Abstract:
In this thesis, different genetic tools are used to investigate both natural variation and speciation in the Ficedula flycatcher system: pied (Ficedula hypoleuca) and collared (F. albicollis) flycatchers. The molecular evolution of a gene involved in postnatal body growth, GH, has shown a high degree of conservation of the mature protein between birds and mammals, whereas the variation observed in its signal peptide seems to be adaptive in the pied flycatcher (I & II). Speciation is the process by which reproductive barriers to gene flow evolve between populations, and the mechanisms involved in pre- and post-zygotic isolation have been investigated in Ficedula flycatchers. The Z chromosome has been suggested to be the hotspot for genes involved in speciation; thus 13 Z-linked coding genes from the two species in allopatry and sympatry were sequenced (III). Surprisingly, the majority of Z-linked genes seemed to be highly conserved, suggesting instead a potential involvement of regulatory regions. Previous studies have shown that genes involved in hybrid fitness, female preferences and male plumage colouration are sex-linked. Hence, three pigmentation genes have been investigated: MC1R, AGRP, and TYRP1. Of these three genes, TYRP1 was identified as a strong candidate to be associated with black-brown plumage variation in sympatric populations, and hence is a strong candidate for a gene contributing to pre-zygotic isolation (IV). In sympatric areas, where pied and collared flycatchers have overlapping breeding areas, hybridization sometimes occurs, leading to the production of unfit hybrids. By using a proteomic approach, a novel expression pattern in hybrids compared to the parental species was revealed (V), and differentially expressed proteins were subsequently identified by sequence similarity (VI). In conclusion, the Z chromosome appears to play an important role in flycatcher speciation, but probably not at the coding level. 
In addition, the novel expression patterns might give new insights into the maladaptive hybrids.
Abstract:
Phylogenetic trees representing the evolutionary relationships of homologous genes are the entry point for many evolutionary analyses. For instance, the use of a phylogenetic tree can aid in the inference of orthology and paralogy relationships, and in the detection of relevant evolutionary events such as gene family expansions and contractions, horizontal gene transfer, recombination or incomplete lineage sorting. Similarly, given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes). Here, we introduce a new release of PhylomeDB (http://phylomedb.org), a public repository of phylomes. Currently, PhylomeDB hosts 120 public phylomes, comprising >1.5 million maximum likelihood trees and multiple sequence alignments. In the current release, phylogenetic trees are annotated with taxonomic, protein-domain arrangement, functional and evolutionary information. PhylomeDB is also a major source for phylogeny-based predictions of orthology and paralogy, covering >10 million proteins across 1059 sequenced species. Here we describe newly implemented PhylomeDB features, and discuss a benchmark of the orthology predictions provided by the database, the impact of proteome updates and the use of the phylome approach in the analysis of newly sequenced genomes and transcriptomes.
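How a gene tree can yield orthology and paralogy calls may be illustrated with a species-overlap heuristic: an internal node whose two subtrees share at least one species is treated as a duplication, otherwise as a speciation. This sketch is an assumption for illustration. The abstract does not state which algorithm PhylomeDB applies, and the nested-tuple tree format, the `species_gene` leaf naming and the function names are all hypothetical.

```python
def leaves(tree):
    """Collect leaf names from a binary tree written as nested 2-tuples."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def species_of(leaf):
    """Leaf names are assumed to look like 'species_gene'."""
    return leaf.split("_")[0]


def classify_pairs(tree, calls=None):
    """Label each gene pair by the type of node where their lineages split."""
    if calls is None:
        calls = {}
    if isinstance(tree, str):
        return calls
    left, right = tree
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = {species_of(l) for l in left_leaves} & {species_of(r) for r in right_leaves}
    # Shared species on both sides imply a duplication node, hence paralogs.
    label = "paralogs" if shared else "orthologs"
    for a in left_leaves:
        for b in right_leaves:
            calls[frozenset((a, b))] = label
    classify_pairs(left, calls)
    classify_pairs(right, calls)
    return calls


# Toy gene family: a duplication followed by a human/mouse speciation in each copy.
toy_tree = (("human_A1", "mouse_A1"), ("human_A2", "mouse_A2"))
calls = classify_pairs(toy_tree)
print(calls[frozenset(("human_A1", "mouse_A1"))])
```

In the toy tree the root is a duplication, so genes from different copies (A1 versus A2) are called paralogs, while human and mouse genes within one copy are called orthologs.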
Abstract:
Detecting the action of selection in natural populations can be achieved using the QST-FST comparison that relies on the estimation of FST with neutral markers, and QST using quantitative traits potentially under selection. QST higher than FST suggests the action of directional selection and thus potential local adaptation. In this article, we apply the QST-FST comparison to four populations of the hermaphroditic freshwater snail Radix balthica located in a floodplain habitat. In contrast to most studies published so far, we did not detect evidence of directional selection for local optima for any of the traits we measured: QST calculated using three different methods was never higher than FST. A strong inbreeding depression was also detected, indicating that outcrossing is probably predominant over selfing in the studied populations. Our results suggest that in this floodplain habitat, local adaptation of R. balthica populations may be hindered by genetic drift, and possibly altered by uneven gene flow linked to flood frequency.
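The comparison described above can be sketched with the textbook expression for randomly mating diploids, QST = V_between / (V_between + 2 · V_within), where both terms are additive genetic variance components. This is an assumption for illustration: the abstract does not specify the three estimation methods used, the numbers below are made up, and a real analysis would also account for sampling error in both statistics.

```python
def qst(var_between, var_within):
    """Textbook QST for outcrossing diploids (assumed form, not the study's estimators)."""
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + 2.0 * var_within
    if total == 0:
        raise ValueError("at least one variance component must be positive")
    return var_between / total


def suggests_directional_selection(qst_value, fst_value):
    """QST above FST points to divergent (directional) selection among populations."""
    return qst_value > fst_value


# Hypothetical variance components and neutral FST, for illustration only.
trait_qst = qst(var_between=0.2, var_within=0.9)
print(suggests_directional_selection(trait_qst, fst_value=0.15))
```

With these placeholder values QST is about 0.10, below the assumed FST of 0.15, which mirrors the pattern reported in the abstract where QST never exceeded FST.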
Abstract:
Abstract: The objective of this work was to evaluate 41 microsatellite markers for heterologous amplifications in piracanjuba (Brycon orbignyanus). Some markers were tested for the first time. Loci were optimized for PCR conditions and applied to a sample of 49 individuals. Thirty-one loci resulted in PCR product formation, whereas ten loci yielded intelligible polymorphic patterns in the evaluated sample and can be used for amplifications in this species. From the evaluated markers, four loci (BoM1, BoM13, Bh6, and Bh16) are valid to be applied in the study of piracanjuba.