187 results for Computational modeling
Abstract:
Protein-ligand docking has made important progress during the last decade and has become a powerful tool for drug development, opening the way to virtual high-throughput screening and in silico structure-based ligand design. Despite the flattering picture that has been drawn, recent publications have shown that the docking problem is far from being solved, and that more developments are still needed to achieve high prediction success rates and accuracy. Introducing an accurate description of the solvation effect upon binding is thought to be essential to achieve this goal. In particular, EADock uses the Generalized Born Molecular Volume 2 (GBMV2) solvent model, which has been shown to accurately reproduce the desolvation energies calculated by solving the Poisson equation. Here, the implementation of the Fast Analytical Continuum Treatment of Solvation (FACTS) as an implicit solvation model in small-molecule docking calculations has been assessed using the EADock docking program. Our results strongly support the use of FACTS for docking. The success rates of EADock/FACTS and EADock/GBMV2 are similar, i.e. around 75% for local docking and 65% for blind docking. However, these results come at a much lower computational cost: FACTS is 10 times faster than GBMV2 in calculating the total electrostatic energy, and allows a speed-up of EADock by a factor of 4. This study also supports the EADock development strategy relying on the CHARMM package for energy calculations, which enables straightforward implementation and testing of the latest developments in the field of Molecular Modeling.
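The two speed-up figures quoted above (10 times faster for the electrostatic term, 4 times faster overall) can be related with Amdahl's law. The sketch below is only an illustration and rests on an assumption the abstract does not state: that the solvation/electrostatic term is the only part of the run that gets faster and everything else is unchanged. The function names are hypothetical.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speed-up when a fraction of the run time
    is accelerated by local_speedup and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def accelerated_fraction(overall, local_speedup):
    """Invert Amdahl's law to get the fraction of the original run time
    that must have been spent in the accelerated part."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / local_speedup)


# Figures taken from the abstract: 10x on the electrostatic term, 4x overall.
# ASSUMPTION: only the electrostatic term is accelerated.
fraction = accelerated_fraction(overall=4.0, local_speedup=10.0)
print(f"implied share of original run time: {fraction:.3f}")
print(f"round-trip overall speed-up: {overall_speedup(fraction, 10.0):.2f}")
```

Under that assumption, roughly five sixths of the original run time would have been spent on the accelerated term. If other parts of the calculation also changed, the estimate does not hold.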
Abstract:
Summary: The specific CD8+ T cell immune response against tumors relies on the recognition by the T cell receptor (TCR) on cytotoxic T lymphocytes (CTL) of antigenic peptides bound to the class I major histocompatibility complex (MHC) molecule. Such tumor-associated antigenic peptides are the focus of tumor immunotherapy with peptide vaccines. The strategy for obtaining an improved immune response often involves the design of modified tumor-associated antigenic peptides. Such modifications aim at creating higher-affinity and/or degradation-resistant peptides and require precise structures of the peptide-MHC class I complex. In addition, the modified peptide must be cross-recognized by CTLs specific for the parental peptide, i.e. preserve the structure of the epitope. Detailed structural information on the modified peptide in complex with MHC is necessary for such predictions. In this thesis, the main focus is the development of theoretical in silico methods for prediction of both structure and cross-reactivity of peptide-MHC class I complexes. Applications of these methods in the context of immunotherapy are also presented. First, a theoretical method for structure prediction of peptide-MHC class I complexes is developed and validated. The approach is based on a molecular dynamics protocol to sample the conformational space of the peptide in its MHC environment. The sampled conformers are evaluated using conformational free energy calculations. The method, which is evaluated for its ability to reproduce 41 X-ray crystallographic structures of different peptide-MHC class I complexes, shows an overall prediction success of 83%. Importantly, in the clinically highly relevant subset of peptide-HLA-A*0201 complexes, the prediction success is 100%. Based on these structure predictions, a theoretical approach for prediction of cross-reactivity is developed and validated.
This method involves the generation of quantitative structure-activity relationships using three-dimensional molecular descriptors and a genetic neural network. The generated relationships are highly predictive, as shown by high cross-validated correlation coefficients (0.78-0.79). Together, the theoretical methods developed here open the door to efficient rational design of improved peptides for use in immunotherapy.
Résumé: The specific immune response against tumors depends on the recognition, by CD8+ T cell receptors, of antigenic peptides presented by class I major histocompatibility complexes (MHC). These peptides are used as targets in immunotherapy with peptide vaccines. To increase the immune response, the peptides are modified so as to improve their affinity and/or resistance to degradation. This requires knowledge of the three-dimensional structure of peptide-MHC complexes. In addition, the modified peptides must be recognized by T cells specific for the native peptide. The structure of the epitope must therefore be preserved, and detailed structures of peptide-MHC complexes are necessary. In this thesis, the central theme is the development of computational methods for predicting the structures of peptide-MHC class I complexes and cross-recognition. Applications of these prediction methods to immunotherapy are also presented. First, a theoretical method for predicting the structures of peptide-MHC class I complexes is developed and validated. This method is based on sampling the conformational space of the peptide in the context of the MHC class I receptor by molecular dynamics. The conformations are evaluated by their conformational free energies. The method is validated by its ability to reproduce 41 structures of peptide-MHC class I complexes obtained by X-ray crystallography. The overall prediction success is 83%. For the HLA-A*0201 subgroup of complexes, which is of great importance for immunotherapy, the success rate is 100%. Second, based on these structures predicted in silico, a theoretical method for predicting cross-recognition is developed and validated. It consists of generating quantitative structure-activity relationships using three-dimensional molecular descriptors and a neural network coupled to a genetic algorithm. The generated relationships show remarkable predictive ability, with high cross-validated correlation coefficients (0.78-0.79). The theoretical methods developed in this thesis pave the way for the design of improved peptide vaccines.
Abstract:
Cefepime is a broad-spectrum cephalosporin indicated for in-hospital treatment of severe infections. Acute neurotoxicity, an increasingly recognized adverse effect of this drug in an overdose, predominantly affects patients with reduced renal function. Although dialytic approaches have been advocated to treat this condition, their role in this indication remains unclear. We report the case of an 88-year-old female patient with impaired renal function who developed life-threatening neurologic symptoms during cefepime therapy. She was treated with two intermittent 3-hour high-flux, high-efficiency hemodialysis sessions. Serial pre-, post-, and peridialytic (pre- and postfilter) serum cefepime concentrations were measured. Pharmacokinetic modeling showed that this dialytic strategy allowed for serum cefepime concentrations to return to the estimated nontoxic range 15 hours earlier than would have been the case without an intervention. The patient made a full clinical recovery over the next 48 hours. We conclude that at least 1 session of intermittent hemodialysis may shorten the time to return to the nontoxic range in severe clinically patent intoxication. It should be considered early in its clinical course pending chemical confirmation, even in frail elderly patients. Careful dosage adjustment and a high index of suspicion are essential in this population.
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in the exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed when dealing with the special characteristics of these data, for instance their complex structures, large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering: the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis: the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets the processes can be distributed, reducing the computational cost.
(3) Only three parameters are necessary to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers). Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third was our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of those techniques when dealing with spatio-temporal geospatial data, as is presented in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies.
In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
Abstract:
BACKGROUND: The goal of this paper is to investigate the respective influence of work characteristics, the effort-reward ratio, and overcommitment on the poor mental health of out-of-hospital care providers. METHODS: 333 out-of-hospital care providers answered a questionnaire that included queries on mental health (GHQ-12), demographics, health-related information and work characteristics, questions from the Effort-Reward Imbalance Questionnaire, and items about overcommitment. A two-level multiple regression was performed between mental health (the dependent variable) and the effort-reward ratio, the overcommitment score, weekly number of interventions, percentage of non-prehospital transport of patients out of total missions, gender, and age. Participants were first-level units, and ambulance services were second-level units. We also shadowed ambulance personnel for a total of 416 hr. RESULTS: With cutoff points of 2/3 and 3/4 positive answers on the GHQ-12, the percentages of potential cases with poor mental health were 20% and 15%, respectively. The effort-reward ratio was associated with poor mental health (P < 0.001), irrespective of age or gender. Overcommitment was associated with poor mental health; this association was stronger in women (β = 0.054) than in men (β = 0.020). The percentage of prehospital missions out of total missions was only associated with poor mental health at the individual level. CONCLUSIONS: Emergency medical services should pay attention to the way employees perceive their efforts and the rewarding aspects of their work: an imbalance of those aspects is associated with poor mental health. Low perceived esteem appeared particularly associated with poor mental health. This suggests that supervisors of emergency medical services should enhance the value of their employees' work. Employees with overcommitment should also receive appropriate consideration. 
Preventive measures should target individual perceptions of effort and reward in order to improve mental health in prehospital care providers.
Abstract:
The potential ecological impact of ongoing climate change has been much discussed. High mountain ecosystems were identified early on as potentially very sensitive areas. Scenarios of upward species movement and vegetation shift are commonly discussed in the literature. Mountains being characteristically conic in shape, impact scenarios usually assume that a smaller surface area will be available as species move up. However, as the frequency distribution of additional physiographic factors (e.g., slope angle) changes with increasing elevation (e.g., with few gentle slopes available at higher elevation), species migrating upslope may encounter increasingly unsuitable conditions. As a result, many species could suffer severe reduction of their habitat surface, which could in turn affect patterns of biodiversity. In this paper, results from static plant distribution modeling are used to derive climate change impact scenarios in a high mountain environment. Models are adjusted with presence/absence of species. Environmental predictors used are: annual mean air temperature, slope, indices of topographic position, geology, rock cover, modeled permafrost and several indices of solar radiation and snow cover duration. Potential Habitat Distribution maps were drawn for 62 higher plant species, from which three separate climate change impact scenarios were derived. These scenarios show a great range of response, depending on the species and the degree of warming. Alpine species would be at greatest risk of local extinction, whereas species with a large elevation range would run the lowest risk. Limitations of the models and scenarios are further discussed.
Abstract:
With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to phylogenetic inference of the evolutionary history of species. Among the different types of genome sequences, protein-coding regions are particularly interesting due to their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triplets of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. The mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are required to overcome the burden of computational complexity. The main assumptions applied in current mechanistic codon models are that (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one, which holds all the assumptions, to the most general one, which relaxes all of them.
The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplifying assumptions on real data sets, as well as to find the best model for the underlying characteristics of the data sets. -- With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential element in the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to phylogenetic inference of the evolutionary history of species. Among the different types of genome sequences, protein-coding regions are particularly interesting because of their impact on proteins. The building blocks of proteins, namely amino acids, are coded by triplets of nucleotides, called codons. Consequently, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three main groups: mechanistic codon models, empirical codon models and hybrid ones. Mechanistic models attract particular attention because of the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are needed to overcome the burden of computational complexity. The main assumptions made in current mechanistic codon models are: (a) double and triple nucleotide substitutions within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon, and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level.
In this thesis, I pursue two main objectives. The first objective is to develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest, which holds all the assumptions, to the most general, which relaxes all of them. The models derived from the proposed framework allow us to investigate the biological plausibility of the three simplifying assumptions on real data, as well as to find the best model for the underlying characteristics of the data sets. Our experiments show that in none of the real data sets is holding all three assumptions realistic. This means that using simple models that hold these assumptions can be misleading and can result in inaccurate parameter estimates. The second objective is to develop a generalized mechanistic codon model that relaxes the three simplifying assumptions while remaining computationally efficient, using a matrix operation called the Kronecker product. Our experiments show that, on randomly chosen data sets, the proposed generalized mechanistic codon model outperforms other codon models with respect to the AICc metric in about half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
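The count of eight models follows from three assumptions that are each either held or relaxed (2^3 = 8). A minimal sketch of that enumeration follows. The assumption labels are shorthand invented for this illustration and are not the thesis's own model names.

```python
from itertools import product

# Hypothetical short labels for assumptions (a), (b) and (c) in the abstract.
ASSUMPTIONS = (
    "single_substitutions_only",      # (a) double/triple substitutions negligible
    "uniform_mutation_within_codon",  # (b) no mutation variation among codon positions
    "hky_nucleotide_model",           # (c) HKY is sufficient at the nucleotide level
)


def enumerate_models():
    """Return one dict per model; True means the assumption is held."""
    return [
        dict(zip(ASSUMPTIONS, held))
        for held in product((True, False), repeat=len(ASSUMPTIONS))
    ]


models = enumerate_models()
for index, model in enumerate(models, start=1):
    relaxed = [name for name, held in model.items() if not held]
    print(index, "relaxes:", ", ".join(relaxed) if relaxed else "nothing")
```

The first entry holds all three assumptions (the simplest model) and the last relaxes all of them (the most general one).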
Abstract:
Modeling concentration-response functions has become extremely popular in ecotoxicology during the last decade. Indeed, modeling makes it possible to determine the complete response pattern of a given substance. However, reliable modeling is data-intensive, which conflicts with the current trend in ecotoxicology of reducing, for cost and ethical reasons, the amount of data produced during an experiment. It is therefore crucial to determine the experimental design in a cost-effective manner. In this paper, we propose to use the theory of locally D-optimal designs to determine the set of concentrations to be tested so that the parameters of the concentration-response function can be estimated with high precision. We illustrate this approach by determining the locally D-optimal designs to estimate the toxicity of the herbicide dinoseb on daphnids and algae. The results show that the number of concentrations to be tested is often equal to the number of parameters and that these concentrations are often related to the parameters' meaning, i.e. they are located close to the parameters. Furthermore, the results show that the locally D-optimal design often has the minimal number of support points and is not very sensitive to small changes in the nominal values of the parameters. In order to reduce the experimental cost and the use of test organisms, especially in the case of long-term studies, reliable nominal values may therefore be fixed based on prior knowledge and literature research instead of on preliminary experiments.
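To make the idea of a locally D-optimal design concrete, the sketch below searches a grid of candidate concentrations for the pair that maximizes the determinant of the information matrix at fixed nominal parameter values. Everything specific here is an assumption for illustration and is not taken from the paper: the two-parameter log-logistic model, homoscedastic errors, the restriction to equal-weight two-point designs, the candidate grid, and the nominal values.

```python
import math
from itertools import combinations


def gradient(x, ec50, slope):
    """Partial derivatives of f(x) = 1 / (1 + (x / ec50) ** slope)
    with respect to (ec50, slope)."""
    u = (x / ec50) ** slope
    denom = (1.0 + u) ** 2
    d_ec50 = slope * u / (ec50 * denom)
    d_slope = -u * math.log(x / ec50) / denom
    return d_ec50, d_slope


def d_criterion(points, ec50, slope):
    """Determinant of the 2x2 information matrix of an equal-weight design."""
    m11 = m12 = m22 = 0.0
    for x in points:
        g1, g2 = gradient(x, ec50, slope)
        m11 += g1 * g1
        m12 += g1 * g2
        m22 += g2 * g2
    n = len(points)
    return (m11 * m22 - m12 * m12) / n ** 2


def best_two_point_design(candidates, ec50, slope):
    """Exhaustive search over pairs of candidate concentrations."""
    return max(
        combinations(candidates, 2),
        key=lambda pair: d_criterion(pair, ec50, slope),
    )


# Hypothetical nominal values and a log-spaced candidate grid around EC50.
EC50, SLOPE = 1.0, 2.0
candidates = [EC50 * 2.0 ** (k / 4) for k in range(-16, 17)]
low, high = best_two_point_design(candidates, EC50, SLOPE)
print(f"selected concentrations: {low:.3f} and {high:.3f}")
```

In this toy setting the selected design has as many support points as the model has parameters, and both points sit on either side of the nominal EC50. That matches the qualitative pattern described in the abstract, but it is not a reproduction of the paper's results. A real application would also optimize the design weights and the number of support points.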
Abstract:
The likelihood of significant exposure to drugs in infants through breast milk is poorly defined, given the difficulties of conducting pharmacokinetics (PK) studies. Using fluoxetine (FX) as an example, we conducted a proof-of-principle study applying population PK (popPK) modeling and simulation to estimate drug exposure in infants through breast milk. We simulated data for 1,000 mother-infant pairs, assuming conservatively that the FX clearance in an infant is 20% of the allometrically adjusted value in adults. The model-generated estimate of the milk-to-plasma ratio for FX (mean: 0.59) was consistent with those reported in other studies. The median infant-to-mother ratio of FX steady-state plasma concentrations predicted by the simulation was 8.5%. Although the disposition of the active metabolite, norfluoxetine, could not be modeled, popPK-informed simulation may be valid for other drugs, particularly those without active metabolites, thereby providing a practical alternative to conventional PK studies for exposure risk assessment in this population.
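The quantities named in this abstract can be tied together with a simple steady-state calculation. The sketch below is a deterministic point estimate, not the population simulation the authors ran. Only the milk-to-plasma ratio (0.59) and the 20% clearance fraction come from the abstract. The 0.75 allometric exponent, 70 kg reference weight, adult clearance, infant weight, milk intake and complete oral bioavailability are all placeholder assumptions.

```python
def infant_clearance(adult_cl_l_per_h, infant_weight_kg, fraction_of_allometric=0.20,
                     adult_weight_kg=70.0, exponent=0.75):
    """Allometrically scaled adult clearance, reduced to a fraction of that value.
    ASSUMPTION: exponent 0.75 and a 70 kg reference adult."""
    scaled = adult_cl_l_per_h * (infant_weight_kg / adult_weight_kg) ** exponent
    return fraction_of_allometric * scaled


def infant_to_mother_css_ratio(milk_to_plasma, milk_intake_l_per_h,
                               infant_cl_l_per_h, bioavailability=1.0):
    """Ratio of infant to maternal steady-state plasma concentration.
    Infant dose rate = milk intake * milk concentration
                     = milk intake * milk_to_plasma * maternal Css,
    and infant Css = bioavailability * dose rate / infant clearance."""
    return bioavailability * milk_intake_l_per_h * milk_to_plasma / infant_cl_l_per_h


# Placeholder inputs (hypothetical, for illustration only).
ADULT_CL = 10.0                        # L/h, not a measured fluoxetine value
INFANT_WEIGHT = 5.0                    # kg
MILK_INTAKE = 0.15 * INFANT_WEIGHT / 24.0  # L/h, assuming 150 mL/kg/day

cl_infant = infant_clearance(ADULT_CL, INFANT_WEIGHT)
ratio = infant_to_mother_css_ratio(0.59, MILK_INTAKE, cl_infant)
print(f"infant clearance: {cl_infant:.3f} L/h, Css ratio: {ratio:.1%}")
```

The ratio is inversely proportional to infant clearance, which is why the conservative 20% assumption raises the predicted exposure. A population approach would sample these inputs from distributions across simulated mother-infant pairs instead of fixing them.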
Abstract:
A haplotype is an m-long binary vector. The XOR-genotype of two haplotypes is the m-vector of their coordinate-wise XOR. We study the following problem: Given a set of XOR-genotypes, reconstruct their haplotypes so that the set of resulting haplotypes can be mapped onto a perfect phylogeny (PP) tree. The question is motivated by studying population evolution in human genetics, and is a variant of the perfect phylogeny haplotyping problem that has received intensive attention recently. Unlike the latter problem, in which the input is "full" genotypes, here we assume less informative input, which may be more economical to obtain experimentally. Building on ideas of Gusfield, we show how to solve the problem in polynomial time, by a reduction to the graph realization problem. The actual haplotypes are not uniquely determined by the tree they map onto, and the tree itself may or may not be unique. We show that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and give a sufficient condition for the uniqueness. To actually determine the haplotypes given the tree, additional information is necessary. We show that two or three full genotypes suffice to reconstruct all the haplotypes, and present a linear algorithm for identifying those genotypes.
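The definition of an XOR-genotype is easy to state in code. The sketch below also shows one reason the input is less informative than full genotypes: flipping the same positions in both haplotypes leaves the XOR-genotype unchanged. Whether this is the full set of "inherent degrees of freedom" the abstract refers to is not something the sketch claims. The example vectors are arbitrary.

```python
def xor_genotype(h1, h2):
    """Coordinate-wise XOR of two equal-length binary haplotypes."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return tuple(a ^ b for a, b in zip(h1, h2))


# Arbitrary example haplotypes of length m = 5.
h1 = (0, 1, 1, 0, 1)
h2 = (1, 1, 0, 0, 1)
genotype = xor_genotype(h1, h2)
print("XOR-genotype:", genotype)

# Flipping the same positions in both haplotypes gives a different pair
# with exactly the same XOR-genotype.
mask = (1, 0, 0, 1, 1)
h1_flipped = xor_genotype(h1, mask)
h2_flipped = xor_genotype(h2, mask)
print("same genotype after flipping:", xor_genotype(h1_flipped, h2_flipped) == genotype)
```

A 1 in the XOR-genotype marks a site where the two haplotypes differ, without saying which haplotype carries which allele.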