973 results for Computational biology
Abstract:
Recent experiments revealed that the fruit fly Drosophila melanogaster has a dedicated mechanism for forgetting: blocking the G-protein Rac leads to slower and activating Rac to faster forgetting. This active form of forgetting lacks a satisfactory functional explanation. We investigated optimal decision making for an agent adapting to a stochastic environment where a stimulus may switch between being indicative of reward or punishment. Like Drosophila, an optimal agent shows forgetting with a rate that is linked to the time scale of changes in the environment. Moreover, to reduce the odds of missing future reward, an optimal agent may trade the risk of immediate pain for information gain and thus forget faster after aversive conditioning. A simple neuronal network reproduces these features. Our theory shows that forgetting in Drosophila appears as an optimal adaptive behavior in a changing environment. This is in line with the view that forgetting is adaptive rather than a consequence of limitations of the memory system.
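The link between forgetting rate and environmental time scale can be illustrated with a minimal sketch. It assumes, purely for illustration, a symmetric two-state Markov environment in which the stimulus switches between signalling reward and punishment with a fixed probability per time step; the names `relax_belief`, `closed_form` and `switch_prob` are hypothetical and this is not the authors' model. Without new observations, the agent's belief relaxes toward 0.5 at a rate set by the switching probability.

```python
def relax_belief(p_reward, switch_prob, steps):
    """Propagate the belief that the stimulus signals reward through
    `steps` time steps with no new observations.

    Assumes a symmetric two-state Markov chain: at each step the meaning
    of the stimulus flips with probability `switch_prob`.
    """
    p = p_reward
    for _ in range(steps):
        # Stay rewarding with prob (1 - switch_prob), or flip from punishing.
        p = p * (1.0 - switch_prob) + (1.0 - p) * switch_prob
    return p


def closed_form(p_reward, switch_prob, steps):
    """Closed-form solution of the same recursion: the distance of the
    belief from 0.5 shrinks by a factor (1 - 2 * switch_prob) per step."""
    return 0.5 + (p_reward - 0.5) * (1.0 - 2.0 * switch_prob) ** steps


if __name__ == "__main__":
    for r in (0.05, 0.2):
        print(r, [round(relax_belief(0.95, r, t), 3) for t in range(0, 6)])
```

Under these assumptions, a faster-changing environment (larger `switch_prob`) makes the belief decay toward indifference more quickly, which is the qualitative behavior the abstract attributes to an optimal agent.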
Abstract:
Current models of embryological development focus on intracellular processes such as gene expression and protein networks, rather than on the complex relationship between subcellular processes and the collective cellular organization these processes support. We have explored this collective behavior in the context of neocortical development, by modeling the expansion of a small number of progenitor cells into a laminated cortex with layer- and cell-type-specific projections. The developmental process is steered by a formal language analogous to genomic instructions, and takes place in a physically realistic three-dimensional environment. A common genome inserted into individual cells controls their individual behaviors, and thereby gives rise to collective developmental sequences in a biologically plausible manner. The simulation begins with a single progenitor cell containing the artificial genome. This progenitor then gives rise through a lineage of offspring to distinct populations of neuronal precursors that migrate to form the cortical laminae. The precursors differentiate by extending dendrites and axons, which reproduce the experimentally determined branching patterns of a number of different neuronal cell types observed in the cat visual cortex. This result is the first comprehensive demonstration of the principles of self-construction whereby the cortical architecture develops. In addition, our model makes several testable predictions concerning cell migration and branching mechanisms.
Abstract:
HIV-1-infected cells in peripheral blood can be grouped into different transcriptional subclasses. Quantifying the turnover of these cellular subclasses can provide important insights into the viral life cycle and the generation and maintenance of latently infected cells. We used previously published data from five patients chronically infected with HIV-1 who initiated combination antiretroviral therapy (cART). Patient-matched PCR for unspliced and multiply spliced viral RNAs combined with limiting dilution analysis provided measurements of transcriptional profiles at the single cell level. Furthermore, measurement of intracellular transcripts and extracellular virion-enclosed HIV-1 RNA allowed us to distinguish productive from non-productive cells. We developed a mathematical model describing the dynamics of plasma virus and the transcriptional subclasses of HIV-1-infected cells. Fitting the model to the data allowed us to better understand the phenotype of different transcriptional subclasses and their contribution to the overall turnover of HIV-1 before and during cART. The average number of virus-producing cells in peripheral blood is small during chronic infection. We find that a substantial fraction of cells can become defectively infected. Assuming that the infection is homogeneous throughout the body, we estimate an average in vivo viral burst size on the order of 10^4 virions per cell. Our study provides novel quantitative insights into the turnover and development of different subclasses of HIV-1-infected cells, and indicates that cells containing solely unspliced viral RNA are a good marker for viral latency. The model illustrates how the pool of latently infected cells becomes rapidly established during the first months of acute infection and continues to increase slowly during the first years of chronic infection. Having a detailed understanding of this process will be useful for the evaluation of viral eradication strategies that aim to deplete the latent reservoir of HIV-1.
Abstract:
The prenatal development of neural circuits must provide sufficient configuration to support at least a set of core postnatal behaviors. Although knowledge of various genetic and cellular aspects of development is accumulating rapidly, there is less systematic understanding of how these various processes play together in order to construct such functional networks. Here we make some steps toward such understanding by demonstrating through detailed simulations how a competitive co-operative ('winner-take-all', WTA) network architecture can arise by development from a single precursor cell. This precursor is granted a simplified gene regulatory network that directs cell mitosis, differentiation, migration, neurite outgrowth and synaptogenesis. Once initial axonal connection patterns are established, their synaptic weights undergo homeostatic unsupervised learning that is shaped by wave-like input patterns. We demonstrate how this autonomous genetically directed developmental sequence can give rise to self-calibrated WTA networks, and compare our simulation results with biological data.
Abstract:
People make numerous decisions every day including perceptual decisions such as walking through a crowd, decisions over primary rewards such as what to eat, and social decisions that require balancing own and others’ benefits. The unifying principles behind choices in various domains are, however, still not well understood. Mathematical models that describe choice behavior in specific contexts have provided important insights into the computations that may underlie decision making in the brain. However, a critical and largely unanswered question is whether these models generalize from one choice context to another. Here we show that a model adapted from the perceptual decision-making domain and estimated on choices over food rewards accurately predicts choices and reaction times in four independent sets of subjects making social decisions. The robustness of the model across domains provides behavioral evidence for a common decision-making process in perceptual, primary reward, and social decision making.
Abstract:
Glioblastoma multiforme (GBM) tumors are the most common malignant primary brain tumors in adults. The current theory is that these tumors are caused by self-renewing glioblastoma-derived stem cells (GSCs). At present, the mechanisms that regulate self-renewal and other oncogenic properties of GSCs remain unknown. Recently, we found that the transcriptional repressor REST maintains self-renewal in neural stem cells (NSCs) and in GSCs. REST also regulates other oncogenic properties, such as apoptosis, invasion and proliferation. However, the mechanisms by which REST regulates these oncogenic properties are unknown. To determine these mechanisms, we performed loss- and gain-of-function experiments and genome-wide mRNA expression analysis in GSCs, and identified REST-regulated genes in GSCs. We did so by screening for genes concordantly regulated in NSCs and GSCs, and using two RE1 databases and a two-fold expression threshold as filters on the resulting genes. These results were further validated by qRT-PCR. Ingenuity Pathway Analysis (IPA) further revealed that the top REST target genes in GSCs were downstream targets of REST and/or involved in other cancers in other cell lines. IPA also revealed that many of the differentially regulated genes identified in this study are involved in oncogenic properties seen in GBM, which we believe are related to REST expression.
Abstract:
My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how the sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves as sequencing depth increases. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model, with a lower false discovery rate. Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression at the gene level, modeling at the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases at the base pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming. We suggest a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific to each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
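The overdispersion that the dissertation models can be made concrete with the textbook beta-binomial variance. The sketch below is an illustration under stated assumptions, not the dissertation's modified model: it uses the standard beta-binomial with shape parameters `a` and `b` (hypothetical values), and does not include the additional depth-dependence parameter the abstract mentions, whose form is not given here.

```python
def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1.0 - p)


def beta_binomial_variance(n, a, b):
    """Variance of a BetaBinomial(n, a, b) count (standard formula)."""
    return n * a * b * (a + b + n) / ((a + b) ** 2 * (a + b + 1.0))


def overdispersion_factor(n, a, b):
    """Ratio of beta-binomial to binomial variance at the same mean.

    Equals 1 + (n - 1) * rho, where rho = 1 / (a + b + 1) is the
    intra-class correlation of the beta-binomial.
    """
    rho = 1.0 / (a + b + 1.0)
    return 1.0 + (n - 1) * rho


if __name__ == "__main__":
    n, a, b = 100, 2.0, 8.0           # hypothetical depth and shape parameters
    p = a / (a + b)                   # mean proportion, 0.2
    print(binomial_variance(n, p))            # binomial variance
    print(beta_binomial_variance(n, a, b))    # inflated variance
    print(overdispersion_factor(n, a, b))     # inflation factor
```

With these illustrative values the beta-binomial variance is ten times the binomial variance, which is why a plain binomial model would understate the variability of such counts.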
Abstract:
Purely data-driven approaches for machine learning present difficulties when data are scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data-driven modeling with a physical model of the system. We show how different, physically inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology, and geostatistics.
Abstract:
A gene expression atlas is an essential resource to quantify and understand the multiscale processes of embryogenesis in time and space. The automated reconstruction of a prototypic 4D atlas for vertebrate early embryos, using multicolor fluorescence in situ hybridization with nuclear counterstain, requires dedicated computational strategies. To this end, we designed an original methodological framework implemented in a software tool called Match-IT. With only minimal human supervision, our system is able to gather gene expression patterns observed in different analyzed embryos with phenotypic variability and map them onto a series of common 3D templates over time, creating a 4D atlas. This framework was used to construct an atlas composed of 6 gene expression templates from a cohort of zebrafish early embryos spanning 6 developmental stages from 4 to 6.3 hpf (hours post fertilization). They included 53 specimens, 181,415 detected cell nuclei and the segmentation of 98 gene expression patterns observed in 3D for 9 different genes. In addition, interactive visualization software, Atlas-IT, was developed to inspect, supervise and analyze the atlas. Match-IT and Atlas-IT, including user manuals, representative datasets and video tutorials, are publicly and freely available online. We also propose computational methods and tools for the quantitative assessment of the gene expression templates at the cellular scale, with the identification, visualization and analysis of coexpression patterns, synexpression groups and their dynamics through developmental stages.
Abstract:
Computational Biology has developed algorithms applied to relevant problems in Biology. One of these problems is Protein Structure Prediction (PSP). Several methods have been developed in the literature to address this problem. However, reproducing and comparing results has not been an easy task. In this regard, the Critical Assessment of protein Structure Prediction (CASP) includes among its goals carrying out such comparisons. Moreover, the systems developed for this problem generally lack a user-friendly interface, which discourages use by those who are not computing specialists. To reduce these difficulties, this work proposes Koala, a web-based system that integrates several methods for predicting and analyzing protein structures, enabling complex experiments to be run using workflows. The available prediction methods can be integrated to analyze results using the RMSD, GDT-TS or TM-Score metrics. In addition, the Sort by front dominance method (based on the Pareto optimality criterion), proposed in this work, can evaluate predictions without a reference structure. The results obtained, using target proteins from recent articles and from CASP11, indicate that Koala can carry out a relatively large set of structured experiments, benefiting the determination of better protein structures as well as the development of new approaches for prediction and analysis by means of workflows.
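The idea of ranking predictions by Pareto dominance, without a reference structure, can be sketched with a generic non-dominated sort. This is an illustration only and not Koala's Sort by front dominance implementation: the function names are hypothetical, each prediction is assumed to be scored on several objectives that are all to be minimized (for example, energy terms), and the sample scores are made up.

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b` (all objectives minimized):
    `a` is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def sort_into_fronts(scores):
    """Group indices of `scores` into successive Pareto fronts.

    Front 0 holds the non-dominated items; front 1 holds the items that
    become non-dominated once front 0 is removed; and so on.
    """
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


if __name__ == "__main__":
    # Hypothetical (objective_1, objective_2) scores for five predictions.
    scores = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
    print(sort_into_fronts(scores))
```

Predictions in earlier fronts are preferable under this criterion; items within one front are mutually incomparable and would need a further tie-breaking rule, which this sketch leaves out.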
Abstract:
Cyclic peptides containing oxazole and thiazole heterocycles have been examined for their capacity to be used as scaffolds in larger, more complex, protein-like structures. Both the macrocyclic scaffolds and the supramolecular structures derived therefrom have been visualised by molecular modelling techniques. These molecules are too symmetrical to examine structurally by NMR spectroscopy. The cyclic hexapeptide ([Aaa-Thz](3), [Aaa-Oxz](3)) and cyclic octapeptide ([Aaa-Thz](4), [Aaa-Oxz](4)) analogues are composed of dipeptide surrogates (Aaa: amino acid, Thz: thiazole, Oxz: oxazole) derived from intramolecular condensation of cysteine or serine/threonine side chains in dipeptides like Aaa-Cys, Aaa-Ser and Aaa-Thr. The five-membered heterocyclic rings, like thiazole, oxazole and reduced analogues like thiazoline, thiazolidine and oxazoline have profound influences on the structures and bioactivities of cyclic peptides derived therefrom. This work suggests that such constrained cyclic peptides can be used as scaffolds to create a range of novel protein-like supramolecular structures (e.g. cylinders, troughs, cones, multi-loop structures, helix bundles) that are comparable in size, shape and composition to bioactive surfaces of proteins. They may therefore represent interesting starting points for the design of novel artificial proteins and artificial enzymes. (C) 2002 Elsevier Science Inc. All rights reserved.
Abstract:
A recent development of the Markov chain Monte Carlo (MCMC) technique is the emergence of MCMC samplers that allow transitions between different models. Such samplers make possible a range of computational tasks involving models, including model selection, model evaluation, model averaging and hypothesis testing. An example of this type of sampler is the reversible jump MCMC sampler, which is a generalization of the Metropolis-Hastings algorithm. Here, we present a new MCMC sampler of this type. The new sampler is a generalization of the Gibbs sampler, but somewhat surprisingly, it also turns out to encompass as particular cases all of the well-known MCMC samplers, including those of Metropolis, Barker, and Hastings. Moreover, the new sampler generalizes the reversible jump MCMC. It therefore appears to be a very general framework for MCMC sampling. This paper describes the new sampler and illustrates its use in three applications in Computational Biology, specifically determination of consensus sequences, phylogenetic inference and delineation of isochores via multiple change-point analysis.
Abstract:
Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
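The assignment rule described above, allocating each observation to the component with the highest estimated posterior probability, can be sketched for a univariate normal mixture. The sketch assumes the mixture has already been fitted; the function names and the parameter values in the example are hypothetical, and no fitting step (such as EM) is shown.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_probabilities(x, weights, means, sds):
    """Posterior probability that x belongs to each mixture component,
    given mixing proportions `weights` and component parameters.

    Note: for x extremely far from every component the densities can
    underflow to zero; a log-space version would be needed in that case.
    """
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


def assign_cluster(x, weights, means, sds):
    """Index of the component with the highest posterior probability for x."""
    post = posterior_probabilities(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    # Hypothetical fitted parameters for g = 2 components.
    weights, means, sds = [0.5, 0.5], [0.0, 10.0], [1.0, 1.0]
    for x in (1.0, 4.0, 9.0):
        print(x, assign_cluster(x, weights, means, sds))
```

The ith cluster is then the set of observations for which `assign_cluster` returns i; the posterior probabilities themselves can be kept as a measure of how confident each assignment is.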
Abstract:
We present two methods of estimating the trend, seasonality and noise in time series of coronary heart disease events. In contrast to previous work we use a non-linear trend, allow multiple seasonal components, and carefully examine the residuals from the fitted model. We show the importance of estimating these three aspects of the observed data to gain insight into the underlying process, although our major focus is on the seasonal components. For one method we allow the seasonal effects to vary over time and show how this helps the understanding of the association between coronary heart disease and varying temperature patterns. Copyright (C) 2004 John Wiley & Sons, Ltd.