976 results for computational modeling
Abstract:
The dynamical analysis of large biological regulatory networks requires the development of scalable methods for mathematical modeling. Following the approach initially introduced by Thomas, we formalize the interactions between the components of a network in terms of discrete variables, functions, and parameters. Model simulations result in directed graphs, called state transition graphs. We are particularly interested in reachability properties and asymptotic behaviors, which correspond to terminal strongly connected components (or "attractors") in the state transition graph. A well-known problem is the exponential increase of the size of state transition graphs with the number of network components, in particular when using the biologically realistic asynchronous updating assumption. To address this problem, we have developed several complementary methods enabling the analysis of the behavior of large and complex logical models: (i) the definition of transition priority classes to simplify the dynamics; (ii) a model reduction method preserving essential dynamical properties; and (iii) a novel algorithm to compact state transition graphs and directly generate compressed representations, emphasizing relevant transient and asymptotic dynamical properties. The power of an approach combining these different methods is demonstrated by applying them to a recent multilevel logical model for the network controlling CD4+ T helper cell response to antigen presentation and to a dozen cytokines. This model accounts for the differentiation of canonical Th1 and Th2 lymphocytes, as well as of inflammatory Th17 and regulatory T cells, along with many hybrid subtypes. All these methods have been implemented in the GINsim software, which enables the definition, analysis, and simulation of logical regulatory graphs.
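As a minimal illustration of the terms used in this abstract, the sketch below builds the asynchronous state transition graph of a tiny Boolean network and lists its attractors (terminal strongly connected components). The two-component mutual-inhibition network `TOGGLE` and all function names are hypothetical, and the brute-force reachability search only suits very small models; it is not the GINsim implementation or any of the scalable methods described above.

```python
from itertools import product

def successors(state, rules):
    """Asynchronous updating: each successor differs in exactly one component."""
    result = []
    for i, rule in enumerate(rules):
        target = int(rule(state))
        if target != state[i]:
            result.append(state[:i] + (target,) + state[i + 1:])
    return result

def state_transition_graph(rules):
    """Map every Boolean state to the list of its asynchronous successors."""
    return {s: successors(s, rules) for s in product((0, 1), repeat=len(rules))}

def reachable(graph, start):
    """Set of states reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def attractors(graph):
    """Terminal SCCs: sets whose members all reach exactly the same set."""
    reach = {s: reachable(graph, s) for s in graph}
    return {reach[s] for s in graph
            if all(reach[t] == reach[s] for t in reach[s])}

# Hypothetical example: A is on when B is off, and B is on when A is off.
TOGGLE = (lambda s: not s[1], lambda s: not s[0])

graph = state_transition_graph(TOGGLE)
found = attractors(graph)
for attractor in sorted(found, key=sorted):
    print(sorted(attractor))
```

The graph has 2^n states for n Boolean components, which is the exponential growth the abstract refers to.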
Abstract:
This PhD project aims to study paraphrasing, initially understood as the different ways in which the same content is expressed linguistically. We will go into that concept in depth trying to define and delimit its scope more accurately. In that sense, we also aim to discover which kind of structures and phenomena it covers. Although there exist some paraphrasing typologies, the great majority of them only apply to English, and focus on lexical and syntactic transformations. Our intention is to go further into this subject and propose a paraphrasing typology for Spanish and Catalan combining lexical, syntactic, semantic and pragmatic knowledge. We apply a bottom-up methodology trying to collect evidence of this phenomenon from the data. For this purpose, we are initially using the Spanish Wikipedia as our corpus. The internal structure of this encyclopedia makes it a good resource for extracting paraphrasing examples for our investigation. This empirical approach will be complemented with the use of linguistic knowledge, and by comparing and contrasting our results to previously proposed paraphrasing typologies in order to enlarge the possible paraphrasing forms found in our corpus. The fact that the same content can be expressed in many different ways presents a major challenge for Natural Language Processing (NLP) applications. Thus, research on paraphrasing has recently been attracting increasing attention in the fields of NLP and Computational Linguistics. The results obtained in this investigation would be of great interest in many of these applications.
Multimodel inference and multimodel averaging in empirical modeling of occupational exposure levels.
Abstract:
Empirical modeling of exposure levels has been popular for identifying exposure determinants in occupational hygiene. Traditional data-driven methods used to choose a model on which to base inferences have typically not accounted for the uncertainty linked to the process of selecting the final model. Several new approaches propose making statistical inferences from a set of plausible models rather than from a single model regarded as 'best'. This paper introduces the multimodel averaging approach described in the monograph by Burnham and Anderson. In their approach, a set of plausible models is defined a priori by taking into account the sample size and previous knowledge of the variables that influence exposure levels. The Akaike information criterion is then calculated to evaluate the relative support of the data for each model, expressed as an Akaike weight, to be interpreted as the probability of the model being the best approximating model given the model set. The model weights can then be used to rank models, quantify the evidence favoring one over another, perform multimodel prediction, estimate the relative influence of the potential predictors and estimate multimodel-averaged effects of determinants. The whole approach is illustrated with the analysis of a data set of 1500 volatile organic compound exposure levels collected by the Institute for Work and Health (Lausanne, Switzerland) over 20 years, each concentration having been divided by the relevant Swiss occupational exposure limit and log-transformed before analysis. Multimodel inference represents a promising procedure for modeling exposure levels: it incorporates the notion that several models can be supported by the data and makes it possible to evaluate, to a certain extent, model selection uncertainty, which is seldom mentioned in current practice.
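The Akaike weights mentioned above follow the standard formula w_i = exp(-D_i / 2) / sum_j exp(-D_j / 2), where D_i is the difference between the AIC of model i and the smallest AIC in the set. The sketch below computes them and a simple weighted average of per-model estimates; the AIC values, the effect estimates and the function names are invented for illustration and are not taken from the paper's data set.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

def model_averaged_estimate(estimates, weights):
    """Weighted average of one parameter's estimates across the model set."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical AIC values for three candidate models.
aics = [100.0, 102.0, 110.0]
weights = akaike_weights(aics)

# Hypothetical estimates of the same determinant's effect in each model.
effects = [0.50, 0.40, 0.10]
averaged = model_averaged_estimate(effects, weights)

for aic, w in zip(aics, weights):
    print(f"AIC={aic:.1f}  weight={w:.3f}")
print(f"model-averaged effect: {averaged:.3f}")
```

Subtracting the smallest AIC before exponentiating leaves the weights unchanged and avoids numerical underflow when the AIC values are large.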
Abstract:
We evaluate the performance of different optimization techniques developed in the context of optical flow computation with different variational models. In particular, building on truncated Newton (TN) methods, which have been an effective approach for large-scale unconstrained optimization, we develop efficient multilevel schemes for computing the optical flow. More precisely, we compare the performance of a standard unidirectional multilevel algorithm, called multiresolution optimization (MR/OPT), with that of a bidirectional multilevel algorithm, called full multigrid optimization (FMG/OPT). The FMG/OPT algorithm treats the coarse grid correction as an optimization search direction and eventually scales it using a line search. Experimental results on different image sequences using four models of optical flow computation show that the FMG/OPT algorithm outperforms both the TN and MR/OPT algorithms in terms of the computational work and the quality of the optical flow estimation.
Abstract:
A mathematical model is proposed to analyze the effects of acquired immunity on the transmission of schistosomiasis in the human host. From this model, a prevalence curve that depends on four parameters can be obtained. These parameters were estimated by fitting the data using the maximum likelihood method. The model reproduced well the real data from two areas where schistosomiasis is endemic: Touros, Brazil (Schistosoma mansoni) and Misungwi, Tanzania (S. haematobium). The average worm burden per person and the dispersion of parasites per person in the community can also be obtained from the model. In this paper, the stabilizing effects of the acquired immunity assumption in the model are assessed in terms of the epidemiological variables as follows. For the prevalence curve, we calculate the confidence interval; for the average worm burden and the worm dispersion in the community, we perform a sensitivity analysis (the range of variation) of both variables with respect to their parameters.
Abstract:
Recently, the introduction of second-generation sequencing and further advancements in confocal microscopy have enabled system-level studies for the functional characterization of genes. The degree of complexity intrinsic to these approaches calls for the development of bioinformatics methodologies and computational models for extracting meaningful biological knowledge from the enormous amount of experimental data that is continuously generated. This PhD thesis presents several novel bioinformatics methods and computational models to address specific biological questions in plant biology, using the plant Arabidopsis thaliana as a model system. First, a spatio-temporal qualitative analysis of quantitative transcript and protein profiles is applied to show the role of the BREVIS RADIX (BRX) protein in the auxin-cytokinin crosstalk for root meristem growth. The core of this PhD work is the functional characterization of the interplay between the BRX protein and the plant hormone auxin in the root meristem, using a computational model based on experimental evidence. Hypotheses generated by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the auxin transcriptional response via the plasma membrane to nucleus partitioning of BRX. This positional information system creates an auxin transcriptional pattern that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of BRX-dependent auxin-responsive genes to drive root meristem growth. In the second part of this PhD thesis, we characterized the genome-wide impact of large-scale deletions on four divergent Arabidopsis natural strains, through the integration of ultra-high-throughput sequencing data with data from genomic hybridizations on tiling arrays. Analysis of the identified deletions revealed that a considerable portion of protein-coding genes was affected and supported a history of genomic rearrangements shaped by evolution.
In the last part of the thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a novel approach for the analysis of mRNA-Seq data from random-primed mRNA. Altogether, this PhD research contains major advancements in the study of natural genomic variation in plants and in the application of computational morphodynamics models for the functional characterization of biological pathways essential for the plant. - Recently, the introduction of second-generation sequencing and advances in confocal microscopy have enabled studies at the scale of entire cellular systems for the functional characterization of genes. The degree of complexity intrinsic to these approaches has required the development of bioinformatics methodologies and mathematical models to extract meaningful biological information from the mass of experimental data generated. This thesis presents both original bioinformatics methods and mathematical models to answer specific questions in plant biology, using the plant Arabidopsis thaliana as a model. First, a spatio-temporal qualitative analysis of quantitative transcript and protein profiles is used to show the role of the BREVIS RADIX (BRX) protein in the crosstalk between the phytohormones auxin and cytokinin in root meristem growth. The core of this thesis work is the functional characterization of the interaction between the BRX protein and the phytohormone auxin in the root meristem, using computational models based on experimental evidence. The hypotheses produced by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the transcriptional response to auxin through the partitioning of BRX from the plasma membrane to the cell nucleus.
This positional information creates a transcriptional response to auxin that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of BRX-dependent auxin-responsive genes that drive meristem growth. In the second part of this doctoral thesis, we characterized the genome-wide impact of large-scale deletions in four divergent natural strains of Arabidopsis, by integrating ultra-high-throughput sequencing with genomic hybridization on DNA arrays. Analysis of the identified deletions revealed that a considerable proportion of protein-coding genes was affected, supporting the idea of a history of genomic rearrangement shaped during evolution. In the last part of this thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a new approach to the analysis of mRNA sequencing data from randomly amplified transcripts. Taken together, this doctoral research contains major advances in the study of natural genomic variation in plants and in the application of computational morphodynamic models for the characterization of biological networks essential to the plant. - Plant development is written in the genetic code of plants. To understand how plants are able to adapt to environmental changes, it is essential to study how their genes govern their formation. The more we try to understand how a plant works, the more we realize the complexity of the biological mechanisms, to the point that the use of mathematical tools and models becomes indispensable.
In this work, using the model plant Arabidopsis thaliana, we solved specific biological problems through the development and application of concrete computational methods. First, we investigated how the BREVIS RADIX (BRX) gene regulates root development by controlling the response to two hormones: auxin and cytokinin. We applied a statistical analysis to quantitative measurements of transcripts and gene products to demonstrate that BRX plays an antagonistic role in the crosstalk between these two hormones. When this molecular dialogue is disrupted, the length of the primary root is dramatically reduced. To understand how BRX responds to auxin, we developed a computational model based on experimental results. Successive simulations led to the discovery of a positional signal that controls the root's response to auxin by regulating the intracellular movement of BRX. In the second part of this thesis, we analyzed the whole genome of four natural strains of Arabidopsis and found that a large fraction of their genes was missing compared with the reference strain. This result indicates that the history of genomic modifications driven by evolution determines a differential availability of functional genes in these plants. In the last part of this work, we analyzed transcriptome data from plants in which the VIP3 gene was non-functional. This allowed us to discover the dual role of VIP3 in regulating transcription initiation and in transcript degradation. This dual role had previously been demonstrated only in humans. This doctoral work supports the development and application of computational methodologies as invaluable tools for resolving the complexity of biological problems in plant research.
The integration of plant biology and computer science has become increasingly important for advancing our knowledge of how plants function and develop.
Abstract:
A factor limiting preliminary rockfall hazard mapping at regional scale is often the lack of knowledge of potential source areas. Nowadays, high-resolution topographic data (LiDAR) can account for realistic landscape details even at large scale. With such fine-scale morphological variability, quantitative geomorphometric analyses become a relevant approach for delineating potential rockfall instabilities. Using the digital elevation model (DEM)-based "slope families" concept over areas of similar lithology, together with the cliff and scree zones available from the 1:25,000 topographic map, a rockfall hazard susceptibility map was drawn up for the canton of Vaud, Switzerland, in order to provide a relevant hazard overview. Slope surfaces steeper than morphometrically defined threshold angles were considered as rockfall source zones. 3D modelling (CONEFALL) was then applied to each of the estimated source zones in order to assess the maximum runout length. The results are in good agreement with known events and other rockfall hazard assessments, showing that it is possible to assess rockfall activity over large areas from DEM-based parameters and topographical elements.
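The source-zone step described above (cells whose slope exceeds a threshold angle) can be sketched as follows. This is only an illustration under simple assumptions: the DEM is a small regular grid, slope is estimated with central differences on interior cells, and the threshold value is supplied by the caller. The function names, the toy grid and the threshold are hypothetical; the abstract's slope-family analysis and the CONEFALL runout modelling are not reproduced here.

```python
import math

def slope_degrees(dem, cell_size):
    """Slope angle in degrees for each interior cell of a regular-grid DEM.

    Border cells have no two-sided neighbours and are left at 0.0.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            slopes[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slopes

def source_cells(slopes, threshold_deg):
    """Grid indices (row, col) of cells at least as steep as the threshold."""
    return [(r, c)
            for r, row in enumerate(slopes)
            for c, s in enumerate(row)
            if s >= threshold_deg]

# Toy DEM (elevations in metres): a plane rising 10 m per 10 m cell.
dem = [[0.0, 10.0, 20.0],
       [0.0, 10.0, 20.0],
       [0.0, 10.0, 20.0]]
slopes = slope_degrees(dem, cell_size=10.0)
print(source_cells(slopes, threshold_deg=40.0))
```

In practice the threshold would differ between lithological units, which is why the function takes it as a parameter rather than fixing a value.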
Abstract:
Retroelements are important evolutionary forces but can be deleterious if left uncontrolled. Members of the human APOBEC3 family of cytidine deaminases can inhibit a wide range of endogenous, as well as exogenous, retroelements. These enzymes are structurally organized in one or two domains comprising a zinc-coordinating motif. APOBEC3G contains two such domains, only the C terminal of which is endowed with editing activity, while its N-terminal counterpart binds RNA, promotes homo-oligomerization, and is necessary for packaging into human immunodeficiency virus type 1 (HIV-1) virions. Here, we performed a large-scale mutagenesis-based analysis of the APOBEC3G N terminus, testing mutants for (i) inhibition of vif-defective HIV-1 infection and Alu retrotransposition, (ii) RNA binding, and (iii) oligomerization. Furthermore, in the absence of structural information on this domain, we used homology modeling to examine the positions of functionally important residues and of residues found to be under positive selection by phylogenetic analyses of primate APOBEC3G genes. Our results reveal the importance of a predicted RNA binding dimerization interface both for packaging into HIV-1 virions and inhibition of both HIV-1 infection and Alu transposition. We further found that the HIV-1-blocking activity of APOBEC3G N-terminal mutants defective for packaging can be almost entirely rescued if their virion incorporation is forced by fusion with Vpr, indicating that the corresponding region of APOBEC3G plays little role in other aspects of its action against this pathogen. Interestingly, residues forming the APOBEC3G dimer interface are highly conserved, contrasting with the rapid evolution of two neighboring surface-exposed amino acid patches, one targeted by the Vif protein of primate lentiviruses and the other of yet-undefined function.
Abstract:
Recognition by the T-cell receptor (TCR) of immunogenic peptides (p) presented by Class I major histocompatibility complexes (MHC) is the key event in the immune response against virus-infected cells or tumor cells. A study of the 2C TCR/SIYR/H-2K(b) system using computational alanine scanning and a much faster binding free energy decomposition based on the Molecular Mechanics-Generalized Born Surface Area (MM-GBSA) method is presented. The results show that the TCR-p-MHC binding free energy decomposition using this approach and including entropic terms provides a detailed and reliable description of the interactions between the molecules at an atomistic level. Comparison of the decomposition results with experimentally determined activity differences for alanine mutants yields a correlation of 0.67 when the entropy is neglected and 0.72 when the entropy is taken into account. Similarly, comparison of experimental activities with variations in binding free energies determined by computational alanine scanning yields correlations of 0.72 and 0.74 when the entropy is neglected or taken into account, respectively. Some key interactions for the TCR-p-MHC binding are analyzed and some possible side-chain replacements are proposed in the context of TCR protein engineering. In addition, a comparison of the two theoretical approaches for estimating the role of each side chain in the complexation is given, and a new ad hoc approach to decompose the vibrational entropy term into atomic contributions, the linear decomposition of the vibrational entropy (LDVE), is introduced. The latter allows the rapid calculation of the entropic contribution of interesting side chains to the binding. This new method is based on the idea that the most important contributions to the vibrational entropy of a molecule originate from residues that contribute most to the vibrational amplitude of the normal modes.
The LDVE approach is shown to provide results very similar to those of the exact but highly computationally demanding method.
Abstract:
Summary: Global warming has led to an average earth surface temperature increase of about 0.7 °C in the 20th century, according to the 2007 IPCC report. In Switzerland, the temperature increase in the same period was even higher: 1.3 °C in the Northern Alps and 1.7 °C in the Southern Alps. The impacts of this warming on ecosystems, especially on climatically sensitive systems like the treeline ecotone, are already visible today. Alpine treeline species show increased growth rates, more establishment of young trees in forest gaps is observed in many locations, and treelines are migrating upwards. With the forecasted warming, this globally visible phenomenon is expected to continue. This PhD thesis aimed to develop a set of methods and models to investigate current and future climatic treeline positions and treeline shifts in the Swiss Alps in a spatial context. The focus was therefore on: 1) the quantification of current treeline dynamics and its potential causes, 2) the evaluation and improvement of temperature-based treeline indicators and 3) the spatial analysis and projection of past, current and future climatic treeline positions and their respective elevational shifts. The methods used involved a combination of field temperature measurements, statistical modeling and spatial modeling in a geographical information system. To determine treeline shifts and assign the respective drivers, neighborhood relationships between forest patches were analyzed using moving window algorithms. Time series regression modeling was used in the development of an air-to-soil temperature transfer model to calculate thermal treeline indicators. The indicators were then applied spatially to delineate the climatic treeline, based on interpolated temperature data. Observation of recent forest dynamics in the Swiss treeline ecotone showed that changes were mainly due to forest in-growth, but also partly to upward altitudinal shifts.
The recent reduction in agricultural land-use was found to be the dominant driver of these changes. Climate-driven changes were identified only at the uppermost limits of the treeline ecotone. Seasonal mean temperature indicators were found to be the best for predicting climatic treelines. Applying dynamic seasonal delimitations and the air-to-soil temperature transfer model improved the indicators' applicability for spatial modeling. Reproducing the climatic treelines of the past 45 years revealed regionally different altitudinal shifts, the largest being located near the highest mountain mass. Modeling climatic treelines based on two IPCC climate warming scenarios predicted major shifts in treeline altitude. However, the currently observed treeline is not expected to reach this limit easily, due to lagged reaction, possible climate feedback effects and other limiting factors. Summary: According to the 2007 IPCC report, global warming led to an average increase in the Earth's surface temperature of 0.7 °C during the 20th century. In Switzerland, the increase over the same period was larger: 1.3 °C in the Northern Alps and 1.7 °C in the Southern Alps. The impacts of this warming on ecosystems, in particular on sensitive systems such as the treeline ecotone, are already visible today. Alpine treeline species show higher growth rates, an increase in the number of young trees establishing in forest gaps is observed in many places, and the treeline is migrating upwards. Given the forecasted warming, this globally visible phenomenon is expected to persist. This doctoral thesis aimed to develop a set of methods and models to study, in a spatial context, the current and future position of the climatic treeline, as well as its shifts, within the Swiss Alps.
The study therefore focused on: 1) the quantification of current treeline dynamics and their potential causes, 2) the evaluation and improvement of temperature-based treeline indicators, and 3) the spatial analysis and projection of the past, current and future climatic treeline position and of the elevational shifts of that position. The methods used combine field temperature measurements, statistical modeling and spatial modeling in a geographical information system. Neighborhood relationships between forest patches were analyzed using moving-window algorithms in order to measure treeline shifts and determine their causes. An air-to-soil temperature transfer model, based on time series regression models, was developed to calculate thermal treeline indicators. The indicators were then applied spatially to delineate the climatic treeline on the basis of interpolated temperature data. Observation of recent forest dynamics in the Swiss treeline ecotone showed that the changes were mainly due to the closing of forest gaps, but also partly to shifts towards higher elevations. The recent abandonment of agricultural land was shown to be the main cause of these changes. Climate-driven changes were identified only at the upper limits of the treeline ecotone. Seasonal mean temperature indicators proved best suited to predicting the climatic treeline. Applying dynamic seasonal delimitations and the air-to-soil temperature transfer model improved the indicators' applicability for spatial modeling.
Reproducing the climatic treelines of the past 45 years revealed regionally different elevational changes, the largest being located near the highest mountain massif. Modeling climatic treelines according to two IPCC climate warming scenarios predicted major changes in treeline elevation. However, the currently observed treeline is not expected to reach this limit easily, because of delayed response, climate feedback effects and other limiting factors.
Abstract:
Recent progress in the experimental determination of protein structures makes it possible to understand, at a very detailed level, the molecular recognition mechanisms that are at the basis of living matter. This level of understanding makes it possible to design rational therapeutic approaches, in which effector molecules are adapted or created de novo to perform a given function. An example of such an approach is drug design, where small inhibitory molecules are designed using in silico simulations and tested in vitro. In this article, we present a similar approach to rationally optimize the sequence of killer T lymphocyte receptors to make them more efficient against melanoma cells. The architecture of this translational research project is presented together with its implications both for basic research and for the clinic.
Abstract:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (such as predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. Useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
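To make the decoding task concrete, here is a minimal sketch of Viterbi decoding, a standard way to find the most probable state path of an HMM. It is written in Python for illustration and is not MAMOT's code, input format or command-line interface; the two states (`bg`, `site`) and all probabilities are invented, and every probability is assumed to be strictly positive so that logarithms are defined.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable state path for `observations`, computed in log space."""
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = []
    for symbol in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            prev = max(states,
                       key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][symbol]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical two-state model: background DNA versus a G-rich binding site.
STATES = ("bg", "site")
START = {"bg": 0.9, "site": 0.1}
TRANS = {"bg": {"bg": 0.9, "site": 0.1},
         "site": {"bg": 0.2, "site": 0.8}}
EMIT = {"bg": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
        "site": {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05}}

SEQUENCE = "ACTGGGGGGTCA"
path = viterbi(SEQUENCE, STATES, START, TRANS, EMIT)
print(list(zip(SEQUENCE, path)))
```

Working in log space keeps long sequences from underflowing, at the cost of requiring non-zero probabilities; pseudocounts, which the abstract lists among MAMOT's features, are one common way to avoid zeros in estimated parameters.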