909 results for Data-Driven Behavior Modeling
Abstract:
The present dissertation is entitled "Development and Application of Computational Methodologies in Qualitative Modeling". It encompasses the diverse projects that were undertaken during my time as a PhD student. Instead of a systematic implementation of a framework defined a priori, this thesis should be considered as an exploration of the methods that can help us infer the blueprint of regulatory and signaling processes. This exploration was driven by concrete biological questions, rather than theoretical investigation. Even though the projects involved divergent systems (gene regulatory networks of the cell cycle, signaling networks in lung cells), as well as organisms (fission yeast, budding yeast, rat, human), our goals were complementary and coherent. The main project of the thesis is the modeling of the Septation Initiation Network (SIN) in S. pombe. Cytokinesis in fission yeast is controlled by the SIN, a protein kinase signaling network that uses the spindle pole body as a scaffold. In order to describe the qualitative behavior of the system and predict unknown mutant behaviors, we decided to adopt a Boolean modeling approach. In this thesis, we report the construction of an extended Boolean model of the SIN, comprising most SIN components and regulators as individual, experimentally testable nodes. The model uses CDK activity levels as control nodes for the simulation of SIN-related events in different stages of the cell cycle. The model was optimized using single knock-out experiments of known phenotypic effect as a training set, and was able to correctly predict a double knock-out test set. Moreover, the model has made in silico predictions that have been validated in vivo, providing new insights into the regulation and hierarchical organization of the SIN.
Another cell-cycle-related project that is part of this thesis was to create a qualitative, minimal model of cyclin interplay in S. cerevisiae. Clb proteins in budding yeast present a characteristic, sequential activation and decay during the cell cycle, commonly referred to as Clb waves. This event is coordinated with the inverse activation curve of Sic1, which has an inhibitory role in the system. To generate minimal qualitative models that can explain this phenomenon, we selected well-defined experiments and constructed all possible minimal models that, when simulated, reproduce the expected results. The models were filtered using standardized qualitative ODE simulations; only the ones reproducing the wave-like phenotype were kept. The set of minimal models can be used to suggest regulatory relations among the participating molecules, which can subsequently be tested experimentally.
Finally, during my PhD I participated in the SBV Improver Challenge. The goal was to infer species-specific (human and rat) networks, using phosphoprotein, gene expression and cytokine data and a reference network provided as prior knowledge. Our solution to the challenge took third place. The approach used is explained in detail in the final chapter of the thesis.
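The Boolean modeling approach described in this abstract can be illustrated with a minimal sketch. The node names, update rules and wiring below are hypothetical and are not the SIN model from the thesis; the sketch only shows how a synchronous Boolean network with a CDK control node can be simulated to a fixed point, and how a knock-out can be imposed by clamping a node to False.

```python
from typing import Callable, Dict, FrozenSet, Optional

State = Dict[str, bool]

# Hypothetical toy wiring, NOT the thesis's SIN model: node names and rules
# are invented purely to illustrate the technique.
RULES: Dict[str, Callable[[State], bool]] = {
    "CDK": lambda s: s["CDK"],               # control node: keeps its input value
    "Inhibitor": lambda s: s["CDK"],         # active while CDK activity is high
    "Kinase": lambda s: not s["Inhibitor"],  # active once the inhibitor is off
    "Septation": lambda s: s["Kinase"],      # readout node
}


def step(state: State, knockouts: FrozenSet[str] = frozenset()) -> State:
    """Apply one synchronous update; knocked-out nodes are clamped to False."""
    new_state = {node: rule(state) for node, rule in RULES.items()}
    for node in knockouts:
        new_state[node] = False
    return new_state


def steady_state(
    initial: State,
    knockouts: FrozenSet[str] = frozenset(),
    max_steps: int = 50,
) -> Optional[State]:
    """Iterate until a fixed point is reached; return None if none is found."""
    state = dict(initial)
    for node in knockouts:
        state[node] = False
    for _ in range(max_steps):
        nxt = step(state, knockouts)
        if nxt == state:
            return state
        state = nxt
    return None


LOW_CDK: State = {"CDK": False, "Inhibitor": False, "Kinase": False, "Septation": False}
HIGH_CDK: State = dict(LOW_CDK, CDK=True)

if __name__ == "__main__":
    print(steady_state(LOW_CDK))
    print(steady_state(LOW_CDK, knockouts=frozenset({"Kinase"})))
    print(steady_state(HIGH_CDK))
```

In a setting like the one described, the predicted steady state of each simulated knock-out would be compared with the known phenotype; how the thesis scores that comparison is not stated in the abstract.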
Abstract:
We use data from about 700 GPS stations in the Euro-Mediterranean region to investigate the present-day behavior of the Calabrian subduction zone within Mediterranean-scale plate kinematics and to perform local-scale studies of strain accumulation on active structures. We focus on the Messina Straits and Crati Valley faults, where GPS data show extensional velocity gradients of ∼3 mm/yr and ∼2 mm/yr, respectively. We use a dislocation model and a non-linear constrained optimization algorithm to invert for fault geometric parameters and slip rates, and we evaluate the associated uncertainties with a bootstrap approach. Our analysis suggests the presence of two partially locked normal faults. To investigate the impact of elastic strain contributions from other nearby active faults on the observed velocity gradient, we use a block modeling approach. Our models show that the inferred slip rates on the two analyzed structures are strongly affected by the assumed locking width of the Calabrian subduction thrust. To frame the observed local deformation features within present-day central Mediterranean kinematics, we carry out a statistical analysis testing the independent motion (with respect to the African and Eurasian plates) of the Adriatic, Calabrian and Sicilian blocks. Our preferred model confirms microplate-like behaviour for all the investigated blocks. Within these kinematic boundary conditions we further investigate the Calabrian slab interface geometry using a combined approach of block modeling and the reduced chi-square (χ²ν) statistic. Almost no information is obtained using only the horizontal GPS velocities, which prove to be an insufficient dataset for a multi-parametric inversion approach. To constrain the slab geometry more tightly, we estimate the predicted vertical velocities by running suites of forward models of elastic dislocations with varying fault locking depth.
Comparison with the observed field suggests a maximum resolved locking depth of 25 km.
Abstract:
Synthetic Biology is a relatively new discipline, born at the beginning of the new millennium, that brings the typical engineering approach (abstraction, modularity and standardization) to biotechnology. These principles aim to tame the extreme complexity of the various components and aid the construction of artificial biological systems with specific functions, usually by means of synthetic genetic circuits implemented in bacteria or simple eukaryotes like yeast. The cell becomes a programmable machine and its low-level programming language is made of strings of DNA. This work was performed in collaboration with researchers of the Department of Electrical Engineering of the University of Washington in Seattle and also with a student of the Corso di Laurea Magistrale in Ingegneria Biomedica at the University of Bologna: Marilisa Cortesi. During the collaboration I contributed to a Synthetic Biology project already started in the Klavins Laboratory. In particular, I modeled and subsequently simulated a synthetic genetic circuit that was designed to implement a multicellular behavior in a growing bacterial microcolony. The first chapter introduces the foundations of molecular biology: structure of the nucleic acids, transcription, translation and methods to regulate gene expression. An introduction to Synthetic Biology completes the section. The second chapter describes the synthetic genetic circuit that was conceived to make two different groups of cells, termed leaders and followers, emerge spontaneously from an isogenic microcolony of bacteria. The circuit exploits the intrinsic stochasticity of gene expression and intercellular communication via small molecules to break the symmetry in the phenotype of the microcolony. The four modules of the circuit (coin flipper, sender, receiver and follower) and their interactions are then illustrated.
The third chapter derives the mathematical representation of the various components of the circuit and makes the several simplifying assumptions explicit. Transcription and translation are modeled as a single step, and gene expression is a function of the intracellular concentration of the various transcription factors that act on the different promoters of the circuit. A list of the various parameters and a justification for their values closes the chapter. The fourth chapter describes the main characteristics of the gro simulation environment, developed by the Self Organizing Systems Laboratory of the University of Washington. Then, a sensitivity analysis performed to pinpoint the desirable characteristics of the various genetic components is detailed. The sensitivity analysis makes use of a cost function based on the fraction of cells in each of the possible states at the end of the simulation and on the desired outcome. The parameters are ranked with the help of a particular kind of scatter plot. Starting from an initial condition in which all the parameters assume their nominal values, the ranking suggests which parameter to tune in order to reach the goal. Obtaining a microcolony in which almost all the cells are in the follower state and only a few are in the leader state seems to be the most difficult task. A small number of leader cells struggle to produce enough signal to turn the rest of the microcolony to the follower state. It is possible to obtain a microcolony in which the majority of cells are followers by increasing the production of signal as much as possible. Reaching the goal of a microcolony that is split in half between leaders and followers is comparatively easy. The best strategy seems to be to slightly increase the production of the enzyme. To end up with a majority of leaders, instead, it is advisable to increase the basal expression of the coin flipper module.
At the end of the chapter, a possible future application of the leader election circuit, the spontaneous formation of spatial patterns in a microcolony, is modeled with the finite state machine formalism. The gro simulations provide insights into the genetic components that are needed to implement the behavior. In particular, since both examples of pattern formation rely on a local version of Leader Election, a short-range communication system is essential. Moreover, new synthetic components that can reliably downregulate the growth rate in specific cells without side effects need to be developed. The appendix lists the gro code used to simulate the model of the circuit, a Python script that was used to split the simulations across a Linux cluster, and the Matlab code developed to analyze the data.
Abstract:
A feature represents a functional requirement fulfilled by a system. Since many maintenance tasks are expressed in terms of features, it is important to establish the correspondence between a feature and its implementation in source code. Traditional approaches to establishing this correspondence exercise features to generate a trace of runtime events, which is then processed by post-mortem analysis. These approaches typically generate large amounts of data to analyze. Due to their static nature, these approaches do not support incremental and interactive analysis of features. We propose a radically different approach called live feature analysis, which provides a model of features at runtime. Our approach analyzes features on a running system, makes it possible to grow feature representations by exercising different scenarios of the same feature, and identifies execution elements down to the sub-method level. We describe how live feature analysis is implemented effectively by annotating structural representations of code based on abstract syntax trees. We illustrate our live analysis with a case study in which we achieve a more complete feature representation by exercising and merging variants of feature behavior, and we demonstrate the efficiency of our technique with benchmarks.
Abstract:
The respiratory central pattern generator is a collection of medullary neurons that generates the rhythm of respiration. The respiratory central pattern generator feeds phrenic motor neurons, which, in turn, drive the main muscle of respiration, the diaphragm. The purpose of this thesis is to understand the neural control of respiration through mathematical models of the respiratory central pattern generator and phrenic motor neurons. We first designed and validated a Hodgkin-Huxley type model that mimics the behavior of phrenic motor neurons under a wide range of electrical and pharmacological perturbations. This model was constrained by physiological data from the literature. Next, we designed and validated a model of the respiratory central pattern generator by connecting four Hodgkin-Huxley type models of medullary respiratory neurons in a mutually inhibitory network. This network was in turn driven by a simple model of an endogenously bursting neuron, which acted as the pacemaker for the respiratory central pattern generator. Finally, the respiratory central pattern generator and phrenic motor neuron models were connected and their interactions studied. Our study of the models has provided a number of insights into the behavior of the respiratory central pattern generator and phrenic motor neurons. These include the suggestion of a role for the T-type and N-type calcium channels during single spikes and repetitive firing in phrenic motor neurons, as well as a better understanding of network properties underlying respiratory rhythm generation. We also utilized an existing model of lung mechanics to study the interactions between the respiratory central pattern generator and ventilation.
Abstract:
The 1-diode/2-resistor electric circuit equivalent to a photovoltaic system is analyzed. The equations at particular points of the I–V curve are studied considering the maximum number of terms. The maximum power point as a boundary condition is given special attention. A new analytical method is developed based on a reduced amount of information, consisting of the data normally provided by manufacturers. Results indicate that this new method is faster than numerical methods and has accuracy similar to (or better than) that of other existing methods, numerical or analytical.
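For readers unfamiliar with the circuit, the standard 1-diode/2-resistor model relates current and voltage implicitly as I = I_ph - I_0 [exp((V + I R_s) / (n N_s V_t)) - 1] - (V + I R_s) / R_sh. The sketch below solves this equation for I by bisection. It is a numerical baseline of the kind the abstract compares against, not the analytical method of the paper, and every parameter value in it is an invented placeholder, not manufacturer data.

```python
import math
from dataclasses import dataclass

BOLTZMANN = 1.380649e-23            # J/K
ELECTRON_CHARGE = 1.602176634e-19   # C


@dataclass(frozen=True)
class ModuleParams:
    """Illustrative placeholder values, not taken from any datasheet."""
    i_ph: float = 8.0       # photocurrent, A
    i_0: float = 1e-9       # diode saturation current, A
    n: float = 1.3          # diode ideality factor
    n_s: int = 60           # number of cells in series
    r_s: float = 0.2        # series resistance, ohm
    r_sh: float = 300.0     # shunt resistance, ohm
    temp_k: float = 298.15  # cell temperature, K


def residual(i: float, v: float, p: ModuleParams) -> float:
    """Right-hand side minus left-hand side of the implicit I-V equation."""
    a = p.n * p.n_s * BOLTZMANN * p.temp_k / ELECTRON_CHARGE
    v_d = v + i * p.r_s  # voltage across the diode and the shunt resistor
    return p.i_ph - p.i_0 * math.expm1(v_d / a) - v_d / p.r_sh - i


def current(v: float, p: ModuleParams, tol: float = 1e-12) -> float:
    """Solve for I at voltage V, assuming 0 <= V <= open-circuit voltage."""
    lo, hi = 0.0, p.i_ph
    if v < 0.0 or residual(lo, v, p) < 0.0:
        raise ValueError("voltage outside the range 0 <= V <= V_oc")
    # The residual decreases monotonically in I, so bisection converges.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid, v, p) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    params = ModuleParams()
    for volts in (0.0, 10.0, 20.0, 30.0, 40.0):
        print(volts, current(volts, params))
```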
Abstract:
We model nongraphitized carbon black surfaces and investigate adsorption of argon on these surfaces by using grand canonical Monte Carlo simulation. In this model, the nongraphitized surface is modeled as a stack of graphene layers with some carbon atoms of the top graphene layer being randomly removed. The percentage of the surface carbon atoms being removed and the effective size of the defect (created by the removal) are the key parameters to characterize the nongraphitized surface. The patterns of adsorption isotherm and isosteric heat are studied in particular, as a function of these surface parameters as well as pressure and temperature. It is shown that the adsorption isotherm shows a steplike behavior on a perfect graphite surface and becomes smoother on nongraphitized surfaces. Regarding the isosteric heat versus loading, we observe for the case of graphitized thermal carbon black an increase of heat in the submonolayer coverage and then a sharp decline in the heat when the second layer is starting to form, beyond which it increases slightly. On the other hand, the isosteric heat versus loading for a highly nongraphitized surface shows a general decline with respect to loading, which is due to the energetic heterogeneity of the surface. It is only when the fluid-fluid interaction is greater than the surface energetic factor that we see a minimum-maximum in the isosteric heat versus loading. These simulation results of isosteric heat agree well with the experimental results of graphitization of Spheron 6 (Polley, M. H.; Schaeffer, W. D.; Smith, W. R. J. Phys. Chem. 1953, 57, 469; Beebe, R. A.; Young, D. M. J. Phys. Chem. 1954, 58, 93). Adsorption isotherms and isosteric heat in pores whose walls have defects are also studied from the simulation, and the pattern of isotherm and isosteric heat could be used to identify the fingerprint of the surface.
Abstract:
The possibility to analyze, quantify and forecast epidemic outbreaks is fundamental when devising effective disease containment strategies. Policy makers are faced with the intricate task of drafting realistically implementable policies that strike a balance between risk management and cost. Two major techniques policy makers have at their disposal are epidemic modeling and contact tracing. Models are used to forecast the evolution of the epidemic both globally and regionally, while contact tracing is used to reconstruct the chain of people who have been potentially infected, so that they can be tested, isolated and treated immediately. However, both techniques might provide limited information, especially during an already advanced crisis when the need for action is urgent. In this paper we propose an alternative approach that goes beyond epidemic modeling and contact tracing, and leverages behavioral data generated by mobile carrier networks to evaluate contagion risk on a per-user basis. The individual risk represents the loss incurred by not isolating or treating a specific person, both in terms of how likely it is for this person to spread the disease and how many secondary infections they will cause. To this aim, we develop a model, named Progmosis, which quantifies this risk based on movement and regional aggregated statistics about infection rates. We develop and release an open-source tool that calculates this risk based on cellular network events. We simulate a realistic epidemic scenario, based on an Ebola virus outbreak; we find that gradually restricting the mobility of a subset of individuals reduces the number of infected people after 30 days by 24%.
Abstract:
Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity as well as temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection.
The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model.
The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can be used to give insight into submarine groundwater discharge and to guide data collection.
Abstract:
A class of multi-process models is developed for collections of time-indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.
The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.
The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.
All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
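One simple way to build a Gaussian approximation to a Pólya-Gamma variable is moment matching, using the closed-form mean and variance of PG(b, c) given by Polson et al. (2013). The sketch below assumes that construction; the abstract does not say which approximation the dissertation actually develops, so this is an illustration of the idea and not a reproduction of it. Function names are hypothetical.

```python
import math
import random
from typing import Tuple


def pg_moments(b: float, c: float) -> Tuple[float, float]:
    """Mean and variance of a Polya-Gamma PG(b, c) random variable."""
    if abs(c) < 1e-4:
        # Limits as c -> 0, used to avoid cancellation in sinh(c) - c.
        return b / 4.0, b / 24.0
    mean = b / (2.0 * c) * math.tanh(c / 2.0)
    var = b / (4.0 * c ** 3) * (math.sinh(c) - c) / math.cosh(c / 2.0) ** 2
    return mean, var


def sample_pg_gaussian(b: float, c: float, rng: random.Random) -> float:
    """Draw from the moment-matched normal (an assumed approximation).

    The approximation is only sensible for large b, where the PG(b, c)
    variable is a sum of many independent terms; for small b the normal
    draw can even be negative, which a true PG variable never is.
    """
    mean, var = pg_moments(b, c)
    return rng.gauss(mean, math.sqrt(var))


if __name__ == "__main__":
    rng = random.Random(0)
    print(pg_moments(500.0, 1.2))
    print(sample_pg_gaussian(500.0, 1.2, rng))
```

The approximation matters in the high-count regime because b equals the number of trials in the binomial likelihood, so exact Pólya-Gamma sampling becomes costly exactly where a normal approximation becomes reasonable.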
Abstract:
Water use efficiency (WUE) is considered a determinant of yield under stress and a component of crop drought resistance. Stomatal behavior regulates both transpiration rate and net assimilation and has been suggested to be crucial for improving crop WUE. In this work, a dynamic model was used to examine the impact of dynamic properties of stomata on WUE. The model includes sub-models of stomatal conductance dynamics, solute accumulation in the mesophyll, mesophyll water content, and water flow to the mesophyll. Using the instantaneous value of stomatal conductance, photosynthesis and transpiration rate were simulated using a biochemical model and the Penman-Monteith equation, respectively. The model was parameterized for a cucumber leaf and model outputs were evaluated using climatic data. Our simulations revealed that WUE was higher on a cloudy than a sunny day. Fast stomatal reaction to light decreased WUE during the period of increasing light (e.g., in the morning) by up to 10.2% and increased WUE during the period of decreasing light (afternoon) by up to 6.25%. Sensitivity of daily WUE to stomatal parameters and mesophyll conductance to CO2 was tested for sunny and cloudy days. Increasing mesophyll conductance to CO2 was more likely to increase WUE for all climatic conditions (up to 5.5% on the sunny day) than modifications of stomatal reaction speed to light and maximum stomatal conductance.
Abstract:
American tegumentary leishmaniasis (ATL) is a disease transmitted to humans by female sandflies of the genus Lutzomyia. Several factors are involved in the disease transmission cycle. In this work only rainfall and deforestation were considered to assess the variability in the incidence of ATL. To this end, monthly recorded data of the incidence of ATL in Orán, Salta, Argentina, for the period 1985-2007 were used. The square root of the relative incidence of ATL and the corresponding variance were formulated as time series, and these data were smoothed by moving averages of 12 and 24 months, respectively. The same procedure was applied to the rainfall data. Typical months, namely April, August, and December, were identified and allowed us to describe the dynamical behavior of ATL outbreaks. These results were tested at the 95% confidence level. We concluded that the variability of rainfall would not be enough to justify the epidemic outbreaks of ATL in the period 1997-2000, but it consistently explains the situation observed in the years 2002 and 2004. Deforestation activities that occurred in this region could explain the epidemic peaks observed in both years and also during the entire time of observation except in 2005-2007.
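The smoothing step described above (square-root transform followed by a 12-month moving average) can be sketched as follows. The monthly values are synthetic placeholders, not the Orán data, and a trailing window is assumed because the abstract does not say whether the average was trailing or centered.

```python
import math
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed


if __name__ == "__main__":
    # Synthetic relative incidence for 36 months (made-up numbers).
    incidence = [4.0 + 3.0 * math.sin(2.0 * math.pi * m / 12.0) for m in range(36)]
    transformed = [math.sqrt(x) for x in incidence]  # square-root transform
    smoothed = moving_average(transformed, window=12)
    print(len(smoothed), smoothed[:3])
```

A 12-month window removes the annual cycle, which is presumably why it is paired with monthly data here; the 24-month window for the variance series would be obtained by changing the `window` argument.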
Abstract:
The caffeine solubility in supercritical CO2 was studied by assessing the effects of pressure and temperature on the extraction of green coffee oil (GCO). The Peng-Robinson¹ equation of state was used to correlate the solubility of caffeine with a thermodynamic model, and two mixing rules were evaluated: the classical mixing rule of van der Waals with two adjustable parameters (PR-VDW) and a density-dependent one, proposed by Mohamed and Holder², with two (PR-MH, two parameters adjusted to the attractive term) and three (PR-MH3, two parameters adjusted to the attractive term and one to the repulsive term) adjustable parameters. The best results were obtained with the mixing rule of Mohamed and Holder² with three parameters.
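As background for the models compared above, the sketch below evaluates the Peng-Robinson equation of state for a pure component and the classical van der Waals one-fluid mixing rule with two binary parameters (k_ij on the attractive term, l_ij on the co-volume). The CO2 critical constants are approximate literature values; no caffeine properties or fitted interaction parameters are included because the abstract does not report them, and the density-dependent Mohamed and Holder rule is not shown.

```python
import math
from typing import List, Sequence, Tuple

R = 8.314462618  # gas constant, J/(mol K)


def pr_pure_params(t: float, tc: float, pc: float, omega: float) -> Tuple[float, float]:
    """Peng-Robinson a(T) and b for a pure component (SI units)."""
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega ** 2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(t / tc))) ** 2
    a = 0.45724 * R ** 2 * tc ** 2 / pc * alpha
    b = 0.07780 * R * tc / pc
    return a, b


def pr_pressure(t: float, v: float, a: float, b: float) -> float:
    """Pressure in Pa from temperature (K) and molar volume (m^3/mol)."""
    return R * t / (v - b) - a / (v * v + 2.0 * b * v - b * b)


def mix_vdw(
    x: Sequence[float],
    a: Sequence[float],
    b: Sequence[float],
    k: List[List[float]],
    l: List[List[float]],
) -> Tuple[float, float]:
    """Classical van der Waals mixing rule with parameters k_ij and l_ij."""
    n = len(x)
    a_mix = sum(
        x[i] * x[j] * math.sqrt(a[i] * a[j]) * (1.0 - k[i][j])
        for i in range(n) for j in range(n)
    )
    b_mix = sum(
        x[i] * x[j] * 0.5 * (b[i] + b[j]) * (1.0 - l[i][j])
        for i in range(n) for j in range(n)
    )
    return a_mix, b_mix


# Approximate literature constants for CO2 (not taken from the abstract).
CO2_TC, CO2_PC, CO2_OMEGA = 304.13, 7.377e6, 0.2239

if __name__ == "__main__":
    a_co2, b_co2 = pr_pure_params(323.15, CO2_TC, CO2_PC, CO2_OMEGA)
    print(pr_pressure(323.15, 1.0e-4, a_co2, b_co2))
```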
Abstract:
study-specific results, their findings should be interpreted with caution
Abstract:
This study aimed to describe and compare the ventilation behavior during an incremental test utilizing three mathematical models and to compare the features of the ventilation curve fitted by the best mathematical model between aerobically trained (TR) and untrained (UT) men. Thirty-five subjects underwent a treadmill test with 1 km·h⁻¹ increases every minute until exhaustion. Ventilation averages of 20 seconds were plotted against time and fitted by: bi-segmental regression model (2SRM); three-segmental regression model (3SRM); and growth exponential model (GEM). Residual sum of squares (RSS) and mean square error (MSE) were calculated for each model. The correlations between peak VO2 (VO2PEAK), peak speed (Speed(PEAK)), ventilatory threshold identified by the best model (VT2SRM) and the first derivative calculated for workloads below (moderate intensity) and above (heavy intensity) VT2SRM were calculated. The RSS and MSE for GEM were significantly higher (p < 0.01) than for 2SRM and 3SRM in pooled data and in UT, but no significant difference was observed among the mathematical models in TR. In the pooled data, the first derivative of moderate intensities showed significant negative correlations with VT2SRM (r = -0.58; p < 0.01) and Speed(PEAK) (r = -0.46; p < 0.05) while the first derivative of heavy intensities showed significant negative correlation with VT2SRM (r = -0.43; p < 0.05). In UT group the first derivative of moderate intensities showed significant negative correlations with VT2SRM (r = -0.65; p < 0.05) and Speed(PEAK) (r = -0.61; p < 0.05), while the first derivative of heavy intensities showed significant negative correlation with VT2SRM (r = -0.73; p < 0.01), Speed(PEAK) (r = -0.73; p < 0.01) and VO2PEAK (r = -0.61; p < 0.05) in TR group. The ventilation behavior during incremental treadmill test tends to show only one threshold.
UT subjects showed a slower ventilation increase during moderate intensities while TR subjects showed a slower ventilation increase during heavy intensities.
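A bi-segmental regression of the kind named above can be sketched as a brute-force search over candidate breakpoints, fitting an ordinary least-squares line on each side and keeping the split with the lowest total RSS. This is an assumed, simplified formulation: the two segments are fitted independently and are not forced to join at the breakpoint, and the study's actual fitting procedure is not described in the abstract. The data are synthetic.

```python
from typing import List, Sequence, Tuple

Line = Tuple[float, float, float]  # slope, intercept, residual sum of squares


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss


def two_segment_fit(
    xs: Sequence[float], ys: Sequence[float], min_points: int = 3
) -> Tuple[float, float, Line, Line]:
    """Return (total RSS, breakpoint x, left line, right line)."""
    if len(xs) < 2 * min_points:
        raise ValueError("not enough points for two segments")
    best = None
    for k in range(min_points, len(xs) - min_points + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, xs[k], left, right)
    return best


if __name__ == "__main__":
    # Synthetic ventilation-like curve: slow rise, then a steeper rise.
    time: List[float] = [float(t) for t in range(12)]
    vent: List[float] = [t if t < 6 else 10.0 + 3.0 * (t - 6.0) for t in time]
    rss, breakpoint_x, left, right = two_segment_fit(time, vent)
    mse = rss / len(time)  # mean square error over all points
    print(breakpoint_x, left[0], right[0], rss, mse)
```

The slopes of the left and right segments play the role of the first derivative below and above the threshold that the abstract correlates with fitness measures.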