21 resultados para Computational biology
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
Mathematical models, as instruments for understanding the workings of nature, are a traditional tool of physics, but they also play an ever increasing role in biology - in the description of fundamental processes as well as that of complex systems. In this review, the authors discuss two examples of the application of group theoretical methods, which constitute the mathematical discipline for a quantitative description of the idea of symmetry, to genetics. The first one appears, in the form of a pseudo-orthogonal (Lorentz like) symmetry, in the stochastic modelling of what may be regarded as the simplest possible example of a genetic network and, hopefully, a building block for more complicated ones: a single self-interacting or externally regulated gene with only two possible states: ` on` and ` off`. The second is the algebraic approach to the evolution of the genetic code, according to which the current code results from a dynamical symmetry breaking process, starting out from an initial state of complete symmetry and ending in the presently observed final state of low symmetry. In both cases, symmetry plays a decisive role: in the first, it is a characteristic feature of the dynamics of the gene switch and its decay to equilibrium, whereas in the second, it provides the guidelines for the evolution of the coding rules.
Resumo:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
Resumo:
The main goal of this paper is to investigate a cure rate model that comprehends some well-known proposals found in the literature. In our work the number of competing causes of the event of interest follows the negative binomial distribution. The model is conveniently reparametrized through the cured fraction, which is then linked to covariates by means of the logistic link. We explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis in the proposed model. The procedure is illustrated with a numerical example.
Resumo:
We have studied an agent model which presents the emergence of sexual barriers through the onset of assortative mating, a condition that might lead to sympatric speciation. In the model, individuals are characterized by two traits, each determined by a single locus A or B. Heterozygotes on A are penalized by introducing an adaptive difference from homozygotes. Two niches are available. Each A homozygote is adapted to one of the niches. The second trait, called the marker trait has no bearing on the fitness. The model includes mating preferences, which are inherited from the mother and subject to random variations. A parameter controlling recombination probabilities of the two loci is also introduced. We study the phase diagram by means of simulations, in the space of parameters (adaptive difference, carrying capacity, recombination probability). Three phases are found, characterized by (i) assortative mating, (ii) extinction of one of the A alleles and (iii) Hardy-Weinberg like equilibrium. We also make perturbations of these phases to see how robust they are. Assortative mating can be gained or lost with changes that present hysteresis loops, showing the resulting equilibrium to have partial memory of the initial state and that the process of going from a polymorphic panmictic phase to a phase where assortative mating acts as sexual barrier can be described as a first-order transition. (C) 2009 Published by Elsevier Ltd.
Resumo:
Human parasitic diseases are the foremost threat to human health and welfare around the world. Trypanosomiasis is a very serious infectious disease against which the currently available drugs are limited and not effective. Therefore, there is an urgent need for new chemotherapeutic agents. One attractive drug target is the major cysteine protease from Trypanosoma cruzi, cruzain. In the present work, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) studies were conducted on a series of thiosemicarbazone and semicarbazone derivatives as inhibitors of cruzain. Molecular modeling studies were performed in order to identify the preferred binding mode of the inhibitors into the enzyme active site, and to generate structural alignments for the three-dimensional quantitative structure-activity relationship (3D QSAR) investigations. Statistically significant models were obtained (CoMFA. r(2) = 0.96 and q(2) = 0.78; CoMSIA, r(2) = 0.91 and q(2) = 0.73), indicating their predictive ability for untested compounds. The models were externally validated employing a test set, and the predicted values were in good agreement with the experimental results. The final QSAR models and the information gathered from the 3D CoMFA and CoMSIA contour maps provided important insights into the chemical and structural basis involved in the molecular recognition process of this family of cruzain inhibitors, and should be useful for the design of new structurally related analogs with improved potency. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
Hypercycles are information integration systems which are thought to overcome the information crisis of prebiotic evolution by ensuring the coexistence of several short templates. For imperfect template replication, we derive a simple expression for the maximum number of distinct templates n(m). that can coexist in a hypercycle and show that it is a decreasing function of the length L of the templates. In the case of high replication accuracy we find that the product n(m)L tends to a constant value, limiting thus the information content of the hypercycle. Template coexistence is achieved either as a stationary equilibrium (stable fixed point) or a stable periodic orbit in which the total concentration of functional templates is nonzero. For the hypercycle system studied here we find numerical evidence that the existence of an unstable fixed point is a necessary condition for the presence of periodic orbits. (C) 2008 Elsevier Ltd. All rights reserved.
Resumo:
The study of pharmacokinetic properties (PK) is of great importance in drug discovery and development. In the present work, PK/DB (a new freely available database for PK) was designed with the aim of creating robust databases for pharmacokinetic studies and in silico absorption, distribution, metabolism and excretion (ADME) prediction. Comprehensive, web-based and easy to access, PK/DB manages 1203 compounds which represent 2973 pharmacokinetic measurements, including five models for in silico ADME prediction (human intestinal absorption, human oral bioavailability, plasma protein binding, bloodbrain barrier and water solubility).
Resumo:
The coexistence between different types of templates has been the choice solution to the information crisis of prebiotic evolution, triggered by the finding that a single RNA-like template cannot carry enough information to code for any useful replicase. In principle, confining d distinct templates of length L in a package or protocell, whose Survival depends on the coexistence of the templates it holds in, could resolve this crisis provided that d is made sufficiently large. Here we review the prototypical package model of Niesert et al. [1981. Origin of life between Scylla and Charybdis. J. Mol. Evol. 17, 348-353] which guarantees the greatest possible region of viability of the protocell population, and show that this model, and hence the entire package approach, does not resolve the information crisis. In particular, we show that the total information stored in a viable protocell (Ld) tends to a constant value that depends only on the spontaneous error rate per nucleotide of the template replication mechanism. As a result, an increase of d must be followed by a decrease of L, so that the net information gain is null. (C) 2008 Elsevier Ltd. All rights reserved.
Resumo:
Several gene regulatory network models containing concepts of directionality at the edges have been proposed. However, only a few reports have an interpretable definition of directionality. Here, differently from the standard causality concept defined by Pearl, we introduce the concept of contagion in order to infer directionality at the edges, i.e., asymmetries in gene expression dependences of regulatory networks. Moreover, we present a bootstrap algorithm in order to test the contagion concept. This technique was applied in simulated data and, also, in an actual large sample of biological data. Literature review has confirmed some genes identified by contagion as actually belonging to the TP53 pathway.
Resumo:
We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model does allow for a correlation between observations made from the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinic measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based in Markov Chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. These DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustful for this kind of complex model.
Resumo:
Regression models for the mean quality-adjusted survival time are specified from hazard functions of transitions between two states and the mean quality-adjusted survival time may be a complex function of covariates. We discuss a regression model for the mean quality-adjusted survival (QAS) time based on pseudo-observations, which has the advantage of directly modeling the effect of covariates in the QAS time. Both Monte Carlo Simulations and a real data set are studied. Copyright (C) 2009 John Wiley & Sons, Ltd.
Resumo:
In many epidemiological studies it is common to resort to regression models relating incidence of a disease and its risk factors. The main goal of this paper is to consider inference on such models with error-prone observations and variances of the measurement errors changing across observations. We suppose that the observations follow a bivariate normal distribution and the measurement errors are normally distributed. Aggregate data allow the estimation of the error variances. Maximum likelihood estimates are computed numerically via the EM algorithm. Consistent estimation of the asymptotic variance of the maximum likelihood estimators is also discussed. Test statistics are proposed for testing hypotheses of interest. Further, we implement a simple graphical device that enables an assessment of the model`s goodness of fit. Results of simulations concerning the properties of the test statistics are reported. The approach is illustrated with data from the WHO MONICA Project on cardiovascular disease. Copyright (C) 2008 John Wiley & Sons, Ltd.
Resumo:
We analyze data obtained from a study designed to evaluate training effects on the performance of certain motor activities of Parkinson`s disease patients. Maximum likelihood methods were used to fit beta-binomial/Poisson regression models tailored to evaluate the effects of training on the numbers of attempted and successful specified manual movements in 1 min periods, controlling for disease stage and use of the preferred hand. We extend models previously considered by other authors in univariate settings to account for the repeated measures nature of the data. The results suggest that the expected number of attempts and successes increase with training, except for patients with advanced stages of the disease using the non-preferred hand. Copyright (c) 2008 John Wiley & Sons, Ltd.
Resumo:
We propose a likelihood ratio test ( LRT) with Bartlett correction in order to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared to a previously published bootstrapbased approach. LRT is shown to be significantly faster and statistically powerful even within non- Normal distributions. An R package named gGranger containing an implementation for both Granger causality identification tests is also provided.
Resumo:
We describe AMIN (Amidase N-terminal domain), a novel protein domain found specifically in bacterial periplasmic proteins. AMIN domains are widely distributed among peptidoglycan hydrolases and transporter protein families. Based on experimental data, contextual information and phyletic profiles, we suggest that AMIN domains mediate the targeting of periplasmic or extracellular proteins to specific regions of the bacterial envelope.