992 results for Computational Identification


Relevance:

30.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

30.00%

Publisher:

Abstract:

In silico analyses of Leishmania spp. genome data are a powerful resource for improving the understanding of these pathogens' biology. Trypanosomatids such as Leishmania spp. have their protein-coding genes grouped in long polycistronic units of functionally unrelated genes. Gene expression is controlled by a variety of posttranscriptional mechanisms. The high degree of synteny among Leishmania species is accompanied by highly conserved coding sequences (CDS) and poorly conserved intercoding untranslated sequences. To identify the elements involved in the control of gene expression, we conducted an in silico investigation to find conserved intercoding sequences (CICS) in the genomes of L. major, L. infantum, and L. braziliensis. We used a combination of computational tools, such as Linux-Shell, PERL and R languages, BLAST, MSPcrunch, SSAKE, and Pred-A-Term algorithms, to construct a pipeline that was able to: (i) search for conservation in target regions, (ii) eliminate CICS redundancy and mask repeat elements, (iii) predict the mRNA's extremities, (iv) analyze the distribution of orthologous genes within the generated LeishCICS-clusters, (v) assign GO terms to the LeishCICS-clusters, and (vi) provide statistical support for the gene-enrichment annotation. We associated the LeishCICS-cluster data, generated at the end of the pipeline, with the expression profile of L. donovani genes during promastigote-amastigote differentiation, as previously evaluated by others (GEO accession: GSE21936). A Pearson's correlation coefficient greater than 0.5 was observed for 730 LeishCICS-clusters containing from 2 to 17 genes. The designed computational pipeline is a useful tool, and its application identified potential regulatory cis elements and putative regulons in Leishmania. (C) 2012 Elsevier B.V. All rights reserved.
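The correlation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how a single coefficient was obtained per cluster, so averaging Pearson's r over all gene pairs is an assumption made here for illustration, and the gene names and expression values are hypothetical (they are not taken from GSE21936).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def mean_pairwise_correlation(profiles):
    """Average Pearson r over all gene pairs of one cluster (assumed summary)."""
    genes = list(profiles)
    pairs = [(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]]
    if not pairs:
        raise ValueError("a cluster needs at least two genes")
    return sum(pearson(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)


# Hypothetical expression values over four time points.
cluster = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [0.5, 1.4, 2.6, 3.5],
}
r = mean_pairwise_correlation(cluster)
print("cluster passes the 0.5 cutoff:", r > 0.5)
```

A cluster would be retained under this sketch when the averaged coefficient exceeds the 0.5 cutoff quoted in the abstract.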

Relevance:

30.00%

Publisher:

Abstract:

A common goal in gene expression data analysis is to identify, from a large pool of candidate genes, those that show significant changes in expression level between a treatment and a control biological condition. Usually, this is done using a test statistic and a cutoff value that separate differentially from nondifferentially expressed genes. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes in the study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using the small sample sizes that are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the bacterium Escherichia coli.
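The "statistic plus cutoff" baseline that this abstract compares against can be sketched as follows. This is not the proposed Bayesian method; it is a minimal illustration of a per-gene t statistic, assuming Welch's unequal-variance form (the abstract does not say which t-test variants were used). The gene names, expression values, and cutoff are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(treatment, control):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(treatment), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two replicates")
    v1, v2 = variance(treatment), variance(control)
    se2 = v1 / n1 + v2 / n2
    if se2 == 0:
        raise ValueError("both groups are constant; t is undefined")
    t = (mean(treatment) - mean(control)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


def flag_genes(data, cutoff):
    """Return genes whose |t| exceeds a user-chosen cutoff."""
    return [g for g, (trt, ctl) in data.items() if abs(welch_t(trt, ctl)[0]) > cutoff]


# Hypothetical log-expression values with three replicates per condition.
data = {
    "gene1": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "gene2": ([2.1, 1.9, 2.0], [2.0, 2.2, 1.8]),
}
print(flag_genes(data, cutoff=4.0))
```

With only three replicates per group the variance estimates are noisy, which is the small-sample setting the abstract highlights.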

Relevance:

30.00%

Publisher:

Abstract:

We recently showed that oxadiazoles have anti-Trypanosoma cruzi activity at micromolar concentrations. These compounds are easy to synthesize and show a number of clear and interpretable structure-activity relationships (SAR), features that make them attractive to pursue potency enhancement. We present here the structural design, synthesis, and anti-T. cruzi evaluation of new oxadiazoles denoted 5a-h and 6a-h. The design of these compounds was based on a previous model of computational docking of oxadiazoles on the T. cruzi protease cruzain. We tested the ability of these compounds to inhibit catalytic activity of cruzain, but we found no correlation between the enzyme inhibition and the antiparasitic activity of the compounds. However, we found reliable SAR data when we tested these compounds against the whole parasite. While none of these oxadiazoles showed toxicity for mammalian cells, oxadiazoles 6c (fluorine), 6d (chlorine), and 6e (bromine) reduced epimastigote proliferation and were cidal for trypomastigotes of T. cruzi Y strain. Oxadiazoles 6c and 6d have IC50 of 9.5 ± 2.8 and 3.5 ± 1.8 μM for trypomastigotes, while Benznidazole, which is the currently used drug for Chagas disease treatment, showed an IC50 of 11.3 ± 2.8 μM. Compounds 6c and 6d impair trypomastigote development and invasion in macrophages, and also induce ultrastructural alterations in trypomastigotes. Finally, compound 6d given orally at 50 mg/kg substantially reduces the parasitemia in T. cruzi-infected BALB/c mice. Our drug design resulted in potency enhancement of oxadiazoles as anti-Chagas disease agents, and culminated with the identification of oxadiazole 6d, a trypanosomicidal compound in an animal model of infection. (C) 2012 Elsevier Ltd. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

Motivation: A current issue of great interest, from both a theoretical and an applied perspective, is the analysis of biological sequences to disclose the information that they encode. The development of new genome sequencing technologies in recent years has opened new fundamental problems, since huge amounts of biological data still await interpretation. Indeed, sequencing is only the first step of the genome annotation process, which consists in the assignment of biological information to each sequence. Hence, given the large amount of available data, in silico methods have become useful and necessary for extracting relevant information from sequences. The availability of data from Genome Projects gave rise to new strategies for tackling the basic problems of computational biology, such as the determination of the three-dimensional structures of proteins, their biological function and their reciprocal interactions. Results: The aim of this work has been the implementation of predictive methods that allow the extraction of information on the properties of genomes and proteins starting from the nucleotide and amino acid sequences, by taking advantage of the information provided by the comparison of the genome sequences from different species. In the first part of the work a comprehensive large-scale genome comparison of 599 organisms is described. 2.6 million sequences from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover, the adopted similarity threshold produced clusters that are homogeneous from the structural point of view and that can be used for structural annotation of uncharacterized sequences.
The second part of the work focuses on the characterization of thermostable proteins and on the development of tools able to predict the thermostability of a protein starting from its sequence. By means of Principal Component Analysis, the codon composition of a non-redundant database comprising 116 prokaryotic genomes has been analyzed, and it has been shown that a cross-genomic approach can allow the extraction of common determinants of thermostability at the genome level, leading to an overall accuracy in discriminating thermophilic coding sequences equal to 95%. This result outperforms those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in the field of protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts in technological applications. A Support Vector Machine based method has been trained to predict whether a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
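The codon composition that feeds the Principal Component Analysis can be pictured as a 64-dimensional frequency vector per coding sequence. The sketch below shows one plausible way to build such a vector; the function name, the handling of ambiguous bases, and the toy sequence are assumptions for illustration and are not taken from the thesis.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every vector uses the same layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each of the 64 codons in a coding sequence.

    Trailing bases that do not fill a codon, and codons containing
    characters other than A, C, G, T, are ignored (an assumed convention).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous codon")
    return [counts[c] / total for c in CODONS]


# Toy sequence: ATG GCC GCC TAA
vector = codon_frequencies("ATGGCCGCCTAA")
print(len(vector), vector[CODONS.index("GCC")])
```

Stacking one such vector per sequence or per genome gives the matrix on which a PCA would then be run.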

Relevance:

30.00%

Publisher:

Abstract:

The inherent stochastic character of most of the physical quantities involved in engineering models has led to an ever-increasing interest in probabilistic analysis. Many approaches to stochastic analysis have been proposed. However, it is widely acknowledged that the only universal method available to solve accurately any kind of stochastic mechanics problem is Monte Carlo Simulation. One of the key parts in the implementation of this technique is the accurate and efficient generation of samples of the random processes and fields involved in the problem at hand. In the present thesis an original method for the simulation of homogeneous, multi-dimensional, multi-variate, non-Gaussian random fields is proposed. The algorithm has proved to be very accurate in matching both the target spectrum and the marginal probability. The computational efficiency and robustness are very good too, even when dealing with strongly non-Gaussian distributions. What is more, the resulting samples possess all the relevant, well-defined and desired properties of “translation fields”, including crossing rates and distributions of extremes. The topic of the second part of the thesis lies in the field of non-destructive parametric structural identification. Its objective is to evaluate the mechanical characteristics of constituent bars in existing truss structures, using static loads and strain measurements. In the cases of missing data and of damage that affects only a small portion of the bar, Genetic Algorithms have proved to be an effective tool to solve the problem.
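The idea behind a translation field is that a non-Gaussian value is obtained by pushing a standard Gaussian value through the Gaussian CDF and then through the inverse CDF of the target marginal. The sketch below shows only this pointwise mapping, assuming an exponential target marginal chosen for illustration; it uses independent Gaussian draws, so it does not reproduce the spectrum matching that the thesis addresses.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit standard deviation


def exponential_inv_cdf(u, rate=1.0):
    """Inverse CDF of an exponential distribution (illustrative target)."""
    return -math.log1p(-u) / rate


def translate(gaussian_values, inv_cdf):
    """Map standard Gaussian values to the target marginal: y = F^-1(Phi(g))."""
    out = []
    for g in gaussian_values:
        u = STD_NORMAL.cdf(g)
        # Keep u strictly inside (0, 1) so the inverse CDF stays finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        out.append(inv_cdf(u))
    return out


rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
field = translate(gaussian, exponential_inv_cdf)
print("sample mean of translated values:", sum(field) / len(field))
```

Because the mapping is monotone, crossings of a level in the Gaussian field correspond to crossings of the mapped level in the translated field, which is why crossing rates and extremes stay tractable.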

Relevance:

30.00%

Publisher:

Abstract:

The structural peculiarities of a protein are related to its biological function. In the fatty acid elongation cycle, one small carrier protein shuttles and delivers the acyl intermediates from one enzyme to the other. The carrier has to recognize several enzymatic counterparts, specifically interact with each of them, and finally transiently deliver the carried substrate to the active site. Carrying out such a complex game requires the players to be flexible and to adapt their structure efficiently to the interacting protein or substrate. In a drug discovery effort, the structure-function relationships of a target system should be taken into account to optimistically interfere with its biological function. In this doctoral work, the essential role of structural plasticity in key steps of fatty acid biosynthesis in Plasmodium falciparum is investigated by means of molecular simulations. The key steps considered include the delivery of acyl substrates and the structural rearrangements of catalytic pockets upon ligand binding. The ground-level bases for carrier/enzyme recognition and interaction are also put forward. The structural features of the target have driven the selection of proper drug discovery tools, which captured the dynamics of biological processes and could allow the rational design of novel inhibitors. The model may be used prospectively for the identification of novel pathway-based antimalarial compounds.

Relevance:

30.00%

Publisher:

Abstract:

Proper functioning of ion channels is a prerequisite for a normal cell, and disorders involving ion channels, or channelopathies, underlie many human diseases. Long QT syndromes (LQTS), for example, may arise from the malfunctioning of the hERG channel, caused either by the binding of drugs or by mutations in the HERG gene. In the first part of this thesis I present a framework to investigate the mechanism of ion conduction through the hERG channel. The free energy profile governing the elementary steps of ion translocation in the pore was computed by means of umbrella sampling simulations. Compared to previous studies, we detected a different dynamic behavior: according to our data, hERG is more likely to mediate a conduction mechanism which has been referred to as “single-vacancy-like” by Roux and coworkers (2001), rather than a “knock-on” mechanism. The same protocol was applied to a model of hERG presenting the Gly628Ser mutation, found to be a cause of congenital LQTS. The results provided interesting insights into the reason for the malfunctioning of the mutant channel. Since they have critical functions in the viral life cycle, viral ion channels, such as the M2 proton channel, are considered attractive targets for antiviral therapy. A deep knowledge of the mechanisms that the virus employs to survive in the host cell is of primary importance in the identification of new antiviral strategies. In the second part of this thesis I shed light on the role that M2 plays in the control of the electrical potential inside the virus, charge equilibration being a condition required to allow proton influx. The ion conduction through M2 was simulated using the metadynamics technique. Based on our results, we suggest that a potential anion-mediated cation-proton exchange, as well as a direct anion-proton exchange, could both contribute to explaining the activity of the M2 channel.

Relevance:

30.00%

Publisher:

Abstract:

Ketamine is an anesthetic and analgesic regularly used in veterinary patients. As ketamine is almost always administered in combination with other drugs, interactions between ketamine and other drugs bear the risk of either adverse effects or diminished efficacy. Since cytochrome P450 enzymes (CYPs) play a pivotal role in the phase I metabolism of the majority of all marketed drugs, drug-drug interactions often occur at the active site of these enzymes. CYPs have been thoroughly examined in humans and laboratory animals, but little is known about equine CYPs. The characterization of equine CYPs is essential for a better understanding of drug metabolism in horses. We report annotation, cloning and heterologous expression of the equine CYP2B6 in V79 Chinese hamster fibroblasts. After computational annotation of all CYP2B genes, the coding sequence (CDS) of equine CYP2B6 was amplified by RT-PCR from horse liver total RNA and revealed an amino acid sequence identity of 77% and a similarity of 93.7% to its human ortholog. A non-synonymous variant c.226G>A in exon 2 of the equine CYP2B6 was detected in 97 horses. The mutant A-allele showed an allele frequency of 82%. Two further variants in exon 3 were detected in one and two horses of this group, respectively. Transfected V79 cells were incubated with racemic ketamine and norketamine as probe substrates to determine metabolic activity. The recombinant equine CYP2B6 N-demethylated ketamine to norketamine and produced metabolites of norketamine, such as hydroxylated norketamines and 5,6-dehydronorketamine. Vmax for S- and R-norketamine formation was 0.49 and 0.45 nmol/h/mg cellular protein and Km was 3.41 and 2.66 μM, respectively. The N-demethylation of S-/R-ketamine was inhibited concentration-dependently with clopidogrel showing an IC50 of 5.63 and 6.26 μM, respectively. The functional importance of the recorded genetic variants remains to be explored.
Equine CYP2B6 was determined to be a CYP enzyme involved in ketamine and norketamine metabolism, thus confirming results from inhibition studies with horse liver microsomes. Clopidogrel seems to be a feasible inhibitor for equine CYP2B6. The specificity still needs to be established with other single equine CYPs. Heterologous expression of single equine CYP enzymes opens new possibilities to substantially improve the understanding of drug metabolism and drug interactions in horses.
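To make the reported kinetic constants concrete, the sketch below evaluates a Michaelis-Menten rate law with the maximal velocities and Michaelis constants quoted in the abstract. That the authors fitted a simple Michaelis-Menten model is an assumption (the abstract reports only the two constants per enantiomer), and the substrate concentrations are hypothetical.

```python
def michaelis_menten(substrate_um, vmax, km_um):
    """Reaction rate v = Vmax * [S] / (Km + [S]); rate is in the units of vmax."""
    if substrate_um < 0 or km_um <= 0 or vmax < 0:
        raise ValueError("concentration and constants must be non-negative, Km > 0")
    return vmax * substrate_um / (km_um + substrate_um)


# Constants as reported in the abstract (nmol/h/mg cellular protein, micromolar).
S_NORKETAMINE = {"vmax": 0.49, "km_um": 3.41}
R_NORKETAMINE = {"vmax": 0.45, "km_um": 2.66}

# Hypothetical substrate concentrations in micromolar.
for conc in (1.0, 3.41, 10.0, 100.0):
    v_s = michaelis_menten(conc, **S_NORKETAMINE)
    v_r = michaelis_menten(conc, **R_NORKETAMINE)
    print(f"[S] = {conc:6.2f} uM  S-form: {v_s:.3f}  R-form: {v_r:.3f}")
```

At a substrate concentration equal to the Michaelis constant the rate is half the maximal velocity, which is a quick sanity check on any fitted pair of constants.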

Relevance:

30.00%

Publisher:

Abstract:

Stepwise uncertainty reduction (SUR) strategies aim at constructing a sequence of points for evaluating a function f in such a way that the residual uncertainty about a quantity of interest progressively decreases to zero. Using such strategies in the framework of Gaussian process modeling has been shown to be efficient for estimating the volume of excursion of f above a fixed threshold. However, SUR strategies remain cumbersome to use in practice because of their high computational complexity and the fact that they deliver a single point at each iteration. In this article we introduce several multipoint sampling criteria, allowing the selection of batches of points at which f can be evaluated in parallel. Such criteria are of particular interest when f is costly to evaluate and several CPUs are simultaneously available. We also manage to drastically reduce the computational cost of these strategies through the use of closed-form formulas. We illustrate their performance in various numerical experiments, including a nuclear safety test case. Basic notions about kriging, auxiliary problems, complexity calculations, R code, and data are available online as supplementary materials.
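The quantity of interest here, the volume of excursion above a threshold, can be pictured with a small sketch. Under a Gaussian process model, the posterior at each point is Gaussian with a kriging mean and standard deviation, so the probability of exceeding the threshold has a closed form. The code below computes that probability on a grid and two summaries of it; the summaries and the numbers are illustrative assumptions, and this is not the article's multipoint criteria.

```python
from statistics import NormalDist


def excursion_probability(mean, sd, threshold):
    """P(f(x) > threshold) when f(x) is Gaussian with the given mean and sd."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if sd == 0:
        # No posterior uncertainty left at this point.
        return 1.0 if mean > threshold else 0.0
    return 1.0 - NormalDist(mean, sd).cdf(threshold)


def summarize(means, sds, threshold):
    """Expected excursion fraction and a residual-uncertainty measure on a grid."""
    probs = [excursion_probability(m, s, threshold) for m, s in zip(means, sds)]
    expected_fraction = sum(probs) / len(probs)
    # p * (1 - p) is largest where exceeding the threshold is most uncertain.
    uncertainty = sum(p * (1.0 - p) for p in probs) / len(probs)
    return expected_fraction, uncertainty


# Hypothetical kriging means and standard deviations at five grid points.
means = [-2.0, -0.5, 0.1, 0.8, 2.5]
sds = [0.3, 0.6, 0.9, 0.6, 0.2]
print(summarize(means, sds, threshold=0.0))
```

A sequential strategy would add evaluations where they are expected to shrink the uncertainty measure most; the batch criteria described in the abstract extend that choice to several points at once.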

Relevance:

30.00%

Publisher:

Abstract:

Automated identification of vertebrae from X-ray image(s) is an important step for various medical image computing tasks such as 2D/3D rigid and non-rigid registration. In this chapter we present a graphical model-based solution for automated vertebra identification from X-ray image(s). Our solution does not require a training process using training data and can automatically determine the number of vertebrae visible in the image(s). This is achieved by combining a graphical model-based maximum a posteriori probability (MAP) estimate with mean-shift based clustering. Experiments conducted on simulated X-ray images as well as on a low-dose, low-quality X-ray spinal image of a scoliotic patient verified its performance.

Relevance:

30.00%

Publisher:

Abstract:

Establishment of phylogenetic relationships remains a challenging task because it is based on computational analysis of genomic hot spots that display species-specific sequence variations. Here, we identify a species-specific thymine-to-guanine sequence variation in the Glrb gene which gives rise to species-specific splice donor sites in the Glrb genes of mouse and bushbaby. The resulting splice insert in the receptor for the inhibitory neurotransmitter glycine (GlyR) conveys synaptic receptor clustering and specific association with a particular synaptic plasticity-related splice variant of the postsynaptic scaffold protein gephyrin. This study identifies a new genomic hot spot which contributes to phylogenetic diversification of protein function and advances our understanding of phylogenetic relationships.

Relevance:

30.00%

Publisher:

Abstract:

Borrelia burgdorferi is the etiological agent of Lyme disease, the most common tick-borne disease in the United States. Although the most frequently reported symptom is arthritis, patients can also experience severe cardiac, neurologic, and dermatologic abnormalities. The identification of virulence determinants in infectious B. burgdorferi strains has been limited by their slow growth rate, poor transformability, and general lack of genetic tools. The present study demonstrates the use of transposon mutagenesis for the identification of infectivity-related factors in infectious B. burgdorferi, examines the potential role for chemotaxis in mammalian infection, and describes the development of a novel method for the analysis of recombination events at the vls antigenic variation locus. A pool of Himar1 mutants was isolated using an infectious B. burgdorferi clone and the transposon vector pMarGent. Clones exhibiting reduced infectivity in mice possessed insertions in virulence determinants putatively involved in host survival and dissemination. These results demonstrated the feasibility of extensive transposon mutagenesis studies for the identification of additional infectivity-related factors. mcp-5 mutants were chosen for further study to determine the role of chemotaxis during infection. Animal studies indicated that mcp-5 mutants exhibited a reduced infectivity potential, and suggested a role for mcp-5 during the early stages of infection. An in vitro phenotype for an mcp-5 mutant was not detected. Genetic complementation of an mcp-5 mutant resulted in restoration of Mcp-5 expression in the complemented clone, as demonstrated by western blotting, but the organisms were not infectious in mice. We believe this result is a consequence of differences in expression between genes located on the linear chromosome and genes present on the circular plasmid used for trans-complementation.
Overall, this work implicates mcp-5 as an important determinant of mammalian infectivity. Finally, the development of a computer-assisted method for the analysis of recombination events occurring at the B. burgdorferi vls antigenic variation locus has proven highly valuable for the detailed examination of vls gene conversion. The studies described here provide evidence for the importance of chemotaxis during infection in mice and demonstrate advances in both genetic and computational approaches for the further characterization of the Lyme disease spirochete.

Relevance:

30.00%

Publisher:

Abstract:

My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has reduced dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However it remains elusive how the sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves along with the increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model with a lower false discovery rate. Section 2: Although a number of methods have been proposed in order to accurately analyze differential RNA expression on the gene level, modeling on the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases on the base pair level. Also, we propose four models and compare them with each other. As expected, our beta binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe an undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. Also, we find that this influence is related to the local sequence of the random hexamer that is used in priming. 
We suggest a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific for each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
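The overdispersion discussed in Sections 1 and 2 can be made concrete by comparing the variance of a binomial count with that of a beta-binomial count, whose variance is inflated by the factor 1 + (n - 1) * rho, where rho is the intra-class correlation. The sketch below uses that standard formula. The function that lets rho shrink with sequencing depth is purely hypothetical: the dissertation's actual parameterization is not given in the abstract, and the names rho0 and decay are invented for illustration.

```python
def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1.0 - p)


def beta_binomial_variance(n, p, rho):
    """Variance of a beta-binomial count with mean n*p and correlation rho.

    rho = 1 / (alpha + beta + 1) for the underlying Beta(alpha, beta);
    rho = 0 recovers the binomial variance.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return n * p * (1.0 - p) * (1.0 + (n - 1) * rho)


def depth_dependent_rho(n, rho0=0.1, decay=0.5):
    """Hypothetical link: overdispersion decreasing with sequencing depth n."""
    return rho0 / n ** decay


for depth in (10, 100, 1000):
    rho = depth_dependent_rho(depth)
    ratio = beta_binomial_variance(depth, 0.3, rho) / binomial_variance(depth, 0.3)
    print(f"depth {depth:5d}: rho = {rho:.4f}, variance inflation = {ratio:.2f}")
```

Any decreasing link between depth and rho could be substituted for the placeholder; the point of the sketch is only that the inflation factor, and hence the false discovery behavior, depends on how rho changes with depth.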