33 resultados para Combinatorial optimization algorithms
em Université de Lausanne, Switzerland
Resumo:
Combinatorial optimization involves finding an optimal solution in a finite set of options; many everyday life problems are of this kind. However, the number of options grows exponentially with the size of the problem, such that an exhaustive search for the best solution is practically infeasible beyond a certain problem size. When efficient algorithms are not available, a practical approach to obtain an approximate solution to the problem at hand, is to start with an educated guess and gradually refine it until we have a good-enough solution. Roughly speaking, this is how local search heuristics work. These stochastic algorithms navigate the problem search space by iteratively turning the current solution into new candidate solutions, guiding the search towards better solutions. The search performance, therefore, depends on structural aspects of the search space, which in turn depend on the move operator being used to modify solutions. A common way to characterize the search space of a problem is through the study of its fitness landscape, a mathematical object comprising the space of all possible solutions, their value with respect to the optimization objective, and a relationship of neighborhood defined by the move operator. The landscape metaphor is used to explain the search dynamics as a sort of potential function. The concept is indeed similar to that of potential energy surfaces in physical chemistry. Borrowing ideas from that field, we propose to extend to combinatorial landscapes the notion of the inherent network formed by energy minima in energy landscapes. In our case, energy minima are the local optima of the combinatorial problem, and we explore several definitions for the network edges. At first, we perform an exhaustive sampling of local optima basins of attraction, and define weighted transitions between basins by accounting for all the possible ways of crossing the basins frontier via one random move. Then, we reduce the computational burden by only counting the chances of escaping a given basin via random kick moves that start at the local optimum. Finally, we approximate network edges from the search trajectory of simple search heuristics, mining the frequency and inter-arrival time with which the heuristic visits local optima. Through these methodologies, we build a weighted directed graph that provides a synthetic view of the whole landscape, and that we can characterize using the tools of complex networks science. We argue that the network characterization can advance our understanding of the structural and dynamical properties of hard combinatorial landscapes. We apply our approach to prototypical problems such as the Quadratic Assignment Problem, the NK model of rugged landscapes, and the Permutation Flow-shop Scheduling Problem. We show that some network metrics can differentiate problem classes, correlate with problem non-linearity, and predict problem hardness as measured from the performances of trajectory-based local search heuristics.
Resumo:
The paper presents an approach for mapping of precipitation data. The main goal is to perform spatial predictions and simulations of precipitation fields using geostatistical methods (ordinary kriging, kriging with external drift) as well as machine learning algorithms (neural networks). More practically, the objective is to reproduce simultaneously both the spatial patterns and the extreme values. This objective is best reached by models integrating geostatistics and machine learning algorithms. To demonstrate how such models work, two case studies have been considered: first, a 2-day accumulation of heavy precipitation and second, a 6-day accumulation of extreme orographic precipitation. The first example is used to compare the performance of two optimization algorithms (conjugate gradients and Levenberg-Marquardt) of a neural network for the reproduction of extreme values. Hybrid models, which combine geostatistical and machine learning algorithms, are also treated in this context. The second dataset is used to analyze the contribution of radar Doppler imagery when used as external drift or as input in the models (kriging with external drift and neural networks). Model assessment is carried out by comparing independent validation errors as well as analyzing data patterns.
Resumo:
The goal of the present work was assess the feasibility of using a pseudo-inverse and null-space optimization approach in the modeling of the shoulder biomechanics. The method was applied to a simplified musculoskeletal shoulder model. The mechanical system consisted in the arm, and the external forces were the arm weight, 6 scapulo-humeral muscles and the reaction at the glenohumeral joint, which was considered as a spherical joint. The muscle wrapping was considered around the humeral head assumed spherical. The dynamical equations were solved in a Lagrangian approach. The mathematical redundancy of the mechanical system was solved in two steps: a pseudo-inverse optimization to minimize the square of the muscle stress and a null-space optimization to restrict the muscle force to physiological limits. Several movements were simulated. The mathematical and numerical aspects of the constrained redundancy problem were efficiently solved by the proposed method. The prediction of muscle moment arms was consistent with cadaveric measurements and the joint reaction force was consistent with in vivo measurements. This preliminary work demonstrated that the developed algorithm has a great potential for more complex musculoskeletal modeling of the shoulder joint. In particular it could be further applied to a non-spherical joint model, allowing for the natural translation of the humeral head in the glenoid fossa.
Resumo:
Tractography is a class of algorithms aiming at in vivo mapping the major neuronal pathways in the white matter from diffusion magnetic resonance imaging (MRI) data. These techniques offer a powerful tool to noninvasively investigate at the macroscopic scale the architecture of the neuronal connections of the brain. However, unfortunately, the reconstructions recovered with existing tractography algorithms are not really quantitative even though diffusion MRI is a quantitative modality by nature. As a matter of fact, several techniques have been proposed in recent years to estimate, at the voxel level, intrinsic microstructural features of the tissue, such as axonal density and diameter, by using multicompartment models. In this paper, we present a novel framework to reestablish the link between tractography and tissue microstructure. Starting from an input set of candidate fiber-tracts, which are estimated from the data using standard fiber-tracking techniques, we model the diffusion MRI signal in each voxel of the image as a linear combination of the restricted and hindered contributions generated in every location of the brain by these candidate tracts. Then, we seek for the global weight of each of them, i.e., the effective contribution or volume, such that they globally fit the measured signal at best. We demonstrate that these weights can be easily recovered by solving a global convex optimization problem and using efficient algorithms. The effectiveness of our approach has been evaluated both on a realistic phantom with known ground-truth and in vivo brain data. Results clearly demonstrate the benefits of the proposed formulation, opening new perspectives for a more quantitative and biologically plausible assessment of the structural connectivity of the brain.
Resumo:
In the context of Systems Biology, computer simulations of gene regulatory networks provide a powerful tool to validate hypotheses and to explore possible system behaviors. Nevertheless, modeling a system poses some challenges of its own: especially the step of model calibration is often difficult due to insufficient data. For example when considering developmental systems, mostly qualitative data describing the developmental trajectory is available while common calibration techniques rely on high-resolution quantitative data. Focusing on the calibration of differential equation models for developmental systems, this study investigates different approaches to utilize the available data to overcome these difficulties. More specifically, the fact that developmental processes are hierarchically organized is exploited to increase convergence rates of the calibration process as well as to save computation time. Using a gene regulatory network model for stem cell homeostasis in Arabidopsis thaliana the performance of the different investigated approaches is evaluated, documenting considerable gains provided by the proposed hierarchical approach.
Resumo:
Microstructure imaging from diffusion magnetic resonance (MR) data represents an invaluable tool to study non-invasively the morphology of tissues and to provide a biological insight into their microstructural organization. In recent years, a variety of biophysical models have been proposed to associate particular patterns observed in the measured signal with specific microstructural properties of the neuronal tissue, such as axon diameter and fiber density. Despite very appealing results showing that the estimated microstructure indices agree very well with histological examinations, existing techniques require computationally very expensive non-linear procedures to fit the models to the data which, in practice, demand the use of powerful computer clusters for large-scale applications. In this work, we present a general framework for Accelerated Microstructure Imaging via Convex Optimization (AMICO) and show how to re-formulate this class of techniques as convenient linear systems which, then, can be efficiently solved using very fast algorithms. We demonstrate this linearization of the fitting problem for two specific models, i.e. ActiveAx and NODDI, providing a very attractive alternative for parameter estimation in those techniques; however, the AMICO framework is general and flexible enough to work also for the wider space of microstructure imaging methods. Results demonstrate that AMICO represents an effective means to accelerate the fit of existing techniques drastically (up to four orders of magnitude faster) while preserving accuracy and precision in the estimated model parameters (correlation above 0.9). We believe that the availability of such ultrafast algorithms will help to accelerate the spread of microstructure imaging to larger cohorts of patients and to study a wider spectrum of neurological disorders.
Resumo:
In recent years, protein-ligand docking has become a powerful tool for drug development. Although several approaches suitable for high throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are needed when the binding mode of lead compounds is not determined experimentally but is needed for structure-based lead optimization. We present here a new docking software, called EADock, that aims at this goal. It uses an hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 A around the center of mass of the ligand position in the crystal structure, and on the contrary to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 A root mean square deviation (RMSD) from the crystal structure, excluding the latter. This validation illustrates the efficiency of our sampling strategy, as correct binding modes, defined by a RMSD to the crystal structure lower than 2 A, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes on a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the alphaVbeta3 integrin, starting far away from the binding pocket.
Resumo:
Pharmacokinetic variability in drug levels represent for some drugs a major determinant of treatment success, since sub-therapeutic concentrations might lead to toxic reactions, treatment discontinuation or inefficacy. This is true for most antiretroviral drugs, which exhibit high inter-patient variability in their pharmacokinetics that has been partially explained by some genetic and non-genetic factors. The population pharmacokinetic approach represents a very useful tool for the description of the dose-concentration relationship, the quantification of variability in the target population of patients and the identification of influencing factors. It can thus be used to make predictions and dosage adjustment optimization based on Bayesian therapeutic drug monitoring (TDM). This approach has been used to characterize the pharmacokinetics of nevirapine (NVP) in 137 HIV-positive patients followed within the frame of a TDM program. Among tested covariates, body weight, co-administration of a cytochrome (CYP) 3A4 inducer or boosted atazanavir as well as elevated aspartate transaminases showed an effect on NVP elimination. In addition, genetic polymorphism in the CYP2B6 was associated with reduced NVP clearance. Altogether, these factors could explain 26% in NVP variability. Model-based simulations were used to compare the adequacy of different dosage regimens in relation to the therapeutic target associated with treatment efficacy. In conclusion, the population approach is very useful to characterize the pharmacokinetic profile of drugs in a population of interest. The quantification and the identification of the sources of variability is a rational approach to making optimal dosage decision for certain drugs administered chronically.
Resumo:
Les cellules CD8? T cytolytiques (CTL) sont les principaux effecteurs du système immunitaire adaptatif contre les infections et les tumeurs. La récente identification d?antigènes tumoraux humains reconnus par des cellules T cytolytiques est la base pour le, développement des vaccins antigène spécifiques contre le cancer. Le nombre d?antigènes tumoraux reconnus par des CTL que puisse être utilisé comme cible pour la vaccination des patients atteints du cancer est encore limité. Une nouvelle technique, simple et rapide, vient d?être proposée pour l?identification d?antigènes reconnus par des CTL. Elle se base sur l?utilisation de librairies combinatoriales de peptides arrangées en un format de "scanning" ou balayage par position (PS-SCL). La première partie de cette étude a consisté à valider cette nouvelle technique par une analyse détaillée de la reconnaissance des PS-SCL par différents clones de CTL spécifiques pour des antigènes associés à la tumeur (TAA) connus ainsi que par des clones de spécificité inconnue. Les résultats de ces analyses révèlent que pour tous les clones, la plupart des acides aminés qui composent la séquence du peptide antigénique naturel ont été identifiés par l?utilisation des PS-SCL. Les résultats obtenus ont permis d?identifier des peptides analogues ayant une antigènicité augmentée par rapport au peptide naturel, ainsi que des peptides comportant de multiples modifications de séquence, mais présentant la même réactivité que le peptide naturel. La deuxième partie de cette étude a consisté à effectuer des analyses biométriques des résultats complexes générés par la PS-SCL. Cette approche a permis l?identification des séquences correspondant aux épitopes naturels à partir de bases de données de peptides publiques. Parmi des milliers de peptides, les séquences naturelles se trouvent comprises dans les 30 séquences ayant les scores potentiels de stimulation les plus élevés pour chaque TAA étudié. Mais plus important encore, l?utilisation des PS-SCL avec un clone réactif contre des cellules tumorales mais de spécificité inconnue nous a permis d?identifier I?epitope reconnu par ce clone. Les données présentées ici encouragent l?utilisation des PS-SCL pour l?identification et l?optimisation d?épitopes pour des CTL réactifs anti-tumoraux, ainsi que pour l?étude de la reconnaissance dégénérée d?antigènes par les CTL.<br/><br/>CD8+ cytolytic T lymphocytes (CTL) are the main effector cells of the adaptive immune system against infection and tumors. The recent identification of moleculariy defined human tumor Ags recognized by autologous CTL has opened new opportunities for the development of Ag-specific cancer vaccines. Despite extensive work, however, the number of CTL-defined tumor Ags that are suitable targets for the vaccination of cancer patients is still limited, especially because of the laborious and time consuming nature of the procedures currentiy used for their identification. The use of combinatorial peptide libraries in positionai scanning format (Positional Scanning Synthetic Combinatorial Libraries, PS-SCL)' has recently been proposed as an alternative approach for the identification of these epitopes. To validate this approach, we analyzed in detail the recognition of PS-SCL by tumor-reactive CTL clones specific for multiple well-defined tumor-associated Ags (TAA) as well as by tumor-reactive CTL clones of unknown specificity. The results of these analyses revealed that for all the TAA-specific clones studied most of the amino acids composing the native antigenic peptide sequences could be identified through the use of PS-SCL. Based on the data obtained from the screening of PS-SCL, we could design peptide analogs of increased antigenicity as well as cross-reactive analog peptides containing multiple amino acid substitutions. In addition, the resuits of PS-SCL-screening combined with a recently developed biometric data analysis (PS-SCL-based biometric database analysis) allowed the identification of the native peptides in public protein databases among the 30 most active sequences, and this was the case for all the TAA studied. More importantiy, the screening of PS- SCL with a tumor-reactive CTL clone of unknown specificity resulted in the identification of the actual epitope. Overall, these data encourage the use of PS-SCL not oniy for the identification and optimization of tumor-associated CTL epitopes, but also for the analysis of degeneracy in T lymphocyte receptor (TCR) recognition of tumor Ags.<br/><br/>Les cellules T CD8? cytolytiques font partie des globules blancs du sang et sont les principales responsables de la lutte contre les infections et les tumeurs. Les immunologistes cherchent depuis des années à identifier des molécules exprimées et présentées à la surface des tumeurs qui puissent être reconnues par des cellules T CD8? cytolytiques capables ensuite de tuer ces tumeurs de façon spécifique. Ce type de molécules représente la base pour le développement de vaccins contre le cancer puisqu?elles pourraient être injectées aux patients afin d?induire une réponse anti- tumorale. A présent, il y a très peu de molécules capables de stimuler le système immunitaire contre les tumeurs qui sont connues parce que les techniques développées à ce jour pour leur identification sont complexes et longues. Une nouvelle technique vient d?être proposée pour l?identification de ce type de molécules qui se base sur l?utilisation de librairies de peptides. Ces librairies représentent toutes les combinaisons possibles des composants de base des molécules recherchées. La première partie de cette étude a consisté à valider cette nouvelle technique en utilisant des cellules T CD8? cytolytiques capables de tuer des cellules tumorales en reconnaissant une molécule connue présente à leur surface. On a démontré que l?utilisation des librairies permet d?identifier la plupart des composants de base de la molécule reconnue par les cellules T CD8? cytolytiques utilisées. La deuxième partie de cette étude a consisté à effectuer une recherche des molécules potentiellement actives dans des protéines présentes dans des bases des données en utilisant un programme informatique qui permet de classer les molécules sur la base de leur activité biologique. Parmi des milliers de molécules de la base de données, celles reconnues par nos cellules T CD8? cytolytiques ont été trouvées parmi les plus actives. Plus intéressant encore, la combinaison de ces deux techniques nous a permis d?identifier la molécule reconnue par une population de cellules T CD8? cytolytiques ayant une activité anti-tumorale, mais pour laquelle on ne connaissait pas la spécificité. Nos résultats encouragent l?utilisation des librairies pour trouver et optimiser des molécules reconnues spécifiquement par des cellules T CD8? cytolytiques capables de tuer des tumeurs.
Resumo:
Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for nonmodel organisms, potentially revolutionizing the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (i) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (ii) quantify error rates for loci, alleles and single-nucleotide polymorphisms. As an empirical example, we use a double-digest RAD data set of a nonmodel plant species, Berberis alpina, collected from high-altitude mountains in Mexico.
Resumo:
The algorithmic approach to data modelling has developed rapidly these last years, in particular methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise in situations at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge on the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
Resumo:
Purpose: To evaluate the feasibility, determine the optimal b-value, and assess the utility of 3-T diffusion-weighted MR imaging (DWI) of the spine in differentiating benign from pathologic vertebral compression fractures.Methods and Materials: Twenty patients with 38 vertebral compression fractures (24 benign, 14 pathologic) and 20 controls (total: 23 men, 17 women, mean age 56.2years) were included from December 2010 to May 2011 in this IRB-approved prospective study. MR imaging of the spine was performed on a 3-T unit with T1-w, fat-suppressed T2-w, gadolinium-enhanced fat-suppressed T1-w and zoomed-EPI (2D RF excitation pulse combined with reduced field-of-view single-shot echo-planar readout) diffusion-w (b-values: 0, 300, 500 and 700s/mm2) sequences. Two radiologists independently assessed zoomed-EPI image quality in random order using a 4-point scale: 1=excellent to 4=poor. They subsequently measured apparent diffusion coefficients (ADCs) in normal vertebral bodies and compression fractures, in consensus.Results: Lower b-values correlated with better image quality scores, with significant differences between b=300 (mean±SD=2.6±0.8), b=500 (3.0±0.7) and b=700 (3.6±0.6) (all p<0.001). Mean ADCs of normal vertebral bodies (n=162) were 0.23, 0.17 and 0.11×10-3mm2/s with b=300, 500 and 700s/mm2, respectively. In contrast, mean ADCs were 0.89, 0.70 and 0.59×10-3mm2/s for benign vertebral compression fractures and 0.79, 0.66 and 0.51×10-3mm2/s for pathologic fractures with b=300, 500 and 700s/mm2, respectively. No significant difference was found between ADCs of benign and pathologic fractures.Conclusion: 3-T DWI of the spine is feasible and lower b-values (300s/mm2) are recommended. However, our preliminary results show no advantage of DWI in differentiating benign from pathologic vertebral compression fractures.