931 resultados para bioinformatics
Resumo:
Neuronal dynamics are fundamentally constrained by the underlying structural network architecture, yet much of the details of this synaptic connectivity are still unknown even in neuronal cultures in vitro. Here we extend a previous approach based on information theory, the Generalized Transfer Entropy, to the reconstruction of connectivity of simulated neuronal networks of both excitatory and inhibitory neurons. We show that, due to the model-free nature of the developed measure, both kinds of connections can be reliably inferred if the average firing rate between synchronous burst events exceeds a small minimum frequency. Furthermore, we suggest, based on systematic simulations, that even lower spontaneous inter-burst rates could be raised to meet the requirements of our reconstruction algorithm by applying a weak spatially homogeneous stimulation to the entire network. By combining multiple recordings of the same in silico network before and after pharmacologically blocking inhibitory synaptic transmission, we show then how it becomes possible to infer with high confidence the excitatory or inhibitory nature of each individual neuron.
Resumo:
Background: Optimization methods allow designing changes in a system so that specific goals are attained. These techniques are fundamental for metabolic engineering. However, they are not directly applicable for investigating the evolution of metabolic adaptation to environmental changes. Although biological systems have evolved by natural selection and result in well-adapted systems, we can hardly expect that actual metabolic processes are at the theoretical optimum that could result from an optimization analysis. More likely, natural systems are to be found in a feasible region compatible with global physiological requirements. Results: We first present a new method for globally optimizing nonlinear models of metabolic pathways that are based on the Generalized Mass Action (GMA) representation. The optimization task is posed as a nonconvex nonlinear programming (NLP) problem that is solved by an outer- approximation algorithm. This method relies on solving iteratively reduced NLP slave subproblems and mixed-integer linear programming (MILP) master problems that provide valid upper and lower bounds, respectively, on the global solution to the original NLP. The capabilities of this method are illustrated through its application to the anaerobic fermentation pathway in Saccharomyces cerevisiae. We next introduce a method to identify the feasibility parametric regions that allow a system to meet a set of physiological constraints that can be represented in mathematical terms through algebraic equations. This technique is based on applying the outer-approximation based algorithm iteratively over a reduced search space in order to identify regions that contain feasible solutions to the problem and discard others in which no feasible solution exists. As an example, we characterize the feasible enzyme activity changes that are compatible with an appropriate adaptive response of yeast Saccharomyces cerevisiae to heat shock Conclusion: Our results show the utility of the suggested approach for investigating the evolution of adaptive responses to environmental changes. The proposed method can be used in other important applications such as the evaluation of parameter changes that are compatible with health and disease states.
Resumo:
Background: Current advances in genomics, proteomics and other areas of molecular biology make the identification and reconstruction of novel pathways an emerging area of great interest. One such class of pathways is involved in the biogenesis of Iron-Sulfur Clusters (ISC). Results: Our goal is the development of a new approach based on the use and combination of mathematical, theoretical and computational methods to identify the topology of a target network. In this approach, mathematical models play a central role for the evaluation of the alternative network structures that arise from literature data-mining, phylogenetic profiling, structural methods, and human curation. As a test case, we reconstruct the topology of the reaction and regulatory network for the mitochondrial ISC biogenesis pathway in S. cerevisiae. Predictions regarding how proteins act in ISC biogenesis are validated by comparison with published experimental results. For example, the predicted role of Arh1 and Yah1 and some of the interactions we predict for Grx5 both matches experimental evidence. A putative role for frataxin in directly regulating mitochondrial iron import is discarded from our analysis, which agrees with also published experimental results. Additionally, we propose a number of experiments for testing other predictions and further improve the identification of the network structure. Conclusion: We propose and apply an iterative in silico procedure for predictive reconstruction of the network topology of metabolic pathways. The procedure combines structural bioinformatics tools and mathematical modeling techniques that allow the reconstruction of biochemical networks. Using the Iron Sulfur cluster biogenesis in S. cerevisiae as a test case we indicate how this procedure can be used to analyze and validate the network model against experimental results. Critical evaluation of the obtained results through this procedure allows devising new wet lab experiments to confirm its predictions or provide alternative explanations for further improving the models.
Resumo:
Background: Understanding the relationship between gene expression changes, enzyme activity shifts, and the corresponding physiological adaptive response of organisms to environmental cues is crucial in explaining how cells cope with stress. For example, adaptation of yeast to heat shock involves a characteristic profile of changes to the expression levels of genes coding for enzymes of the glycolytic pathway and some of its branches. The experimental determination of changes in gene expression profiles provides a descriptive picture of the adaptive response to stress. However, it does not explain why a particular profile is selected for any given response. Results: We used mathematical models and analysis of in silico gene expression profiles (GEPs) to understand how changes in gene expression correlate to an efficient response of yeast cells to heat shock. An exhaustive set of GEPs, matched with the corresponding set of enzyme activities, was simulated and analyzed. The effectiveness of each profile in the response to heat shock was evaluated according to relevant physiological and functional criteria. The small subset of GEPs that lead to effective physiological responses after heat shock was identified as the result of the tuning of several evolutionary criteria. The experimentally observed transcriptional changes in response to heat shock belong to this set and can be explained by quantitative design principles at the physiological level that ultimately constrain changes in gene expression. Conclusion: Our theoretical approach suggests a method for understanding the combined effect of changes in the expression of multiple genes on the activity of metabolic pathways, and consequently on the adaptation of cellular metabolism to heat shock. This method identifies quantitative design principles that facilitate understating the response of the cell to stress.
Resumo:
Background: Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of the large-scale sequence alignments. It can be run on distributed memory clusters allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results: In this paper we show how PTC can be easily deployed and controlled on a super computer architecture using a web portal developed using Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. Conclusions: The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution.
Resumo:
The Pseudomonas aeruginosa toxin L-2-amino-4-methoxy-trans-3-butenoic acid (AMB) is a non-proteinogenic amino acid which is toxic for prokaryotes and eukaryotes. Production of AMB requires a five-gene cluster encoding a putative LysE-type transporter (AmbA), two non-ribosomal peptide synthetases (AmbB and AmbE), and two iron(II)/α-ketoglutarate-dependent oxygenases (AmbC and AmbD). Bioinformatics analysis predicts one thiolation (T) domain for AmbB and two T domains (T1 and T2) for AmbE, suggesting that AMB is generated by a processing step from a precursor tripeptide assembled on a thiotemplate. Using a combination of ATP-PPi exchange assays, aminoacylation assays, and mass spectrometry-based analysis of enzyme-bound substrates and pathway intermediates, the AmbB substrate was identified to be L-alanine (L-Ala), while the T1 and T2 domains of AmbE were loaded with L-glutamate (L-Glu) and L-Ala, respectively. Loading of L-Ala at T2 of AmbE occurred only in the presence of AmbB, indicative of a trans loading mechanism. In vitro assays performed with AmbB and AmbE revealed the dipeptide L-Glu-L-Ala at T1 and the tripeptide L-Ala-L-Glu-L-Ala attached at T2. When AmbC and AmbD were included in the assay, these peptides were no longer detected. Instead, an L-Ala-AMB-L-Ala tripeptide was found at T2. These data are in agreement with a biosynthetic model in which L-Glu is converted into AMB by the action of AmbC, AmbD, and tailoring domains of AmbE. The importance of the flanking L-Ala residues in the precursor tripeptide is discussed.
Resumo:
Background: Reconstruction of genes and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user friendly tool that simultaneously analyzes the latest set of scientific documents available on line and reconstructs the set of genes referenced in those documents is available. Results: This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other user-friendly applications (iHOP, STRING) that are widely used. Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user friendly tools. Furthermore, analysis of full text documents provides more complete reconstructions than those that result from using only the abstract of the document. Conclusions: Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio- MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy to use environment for researchers to reconstruct their networks of interest from an always up to date set of scientific documents.
Resumo:
DnaSP, DNA Sequence Polymorphism, is a software package for the analysis of nucleotide polymorphism from aligned DNA sequence data. DnaSP can estimate several measures of DNA sequence variation within and between populations (in noncoding, synonymous or nonsynonymous sites, or in various sorts of codon positions), as well as linkage disequilibrium, recombination, gene flow and gene conversion parameters. DnaSP can also carry out several tests of neutrality: Hudson, Kreitman and Aguadé (1987), Tajima (1989), McDonald and Kreitman (1991), Fu and Li (1993), and Fu (1997) tests. Additionally, DnaSP can estimate the confidence intervals of some test-statistics by the coalescent. The results of the analyses are displayed on tabular and graphic form.
Resumo:
VariScan is a software package for the analysis of DNA sequence polymorphisms at the whole genome scale. Among other features, the software:(1) can conduct many population genetic analyses; (2) incorporates a multiresolution wavelet transform-based method that allows capturing relevant information from DNA polymorphism data; and (3) it facilitates the visualization of the results in the most commonly used genome browsers.
Resumo:
Abstract Background: Micro RNAs are small, non-coding, single-stranded RNAs that negatively regulate gene expression at the post-transcriptional level. Since miR-143 was found to be down-regulated in prostate cancer cells, we wanted to analyze its expression in human prostate cancer, and test the ability of miR-43 to arrest prostate cancer cell growth in vitro and in vivo. Results: Expression of miR-143 was analyzed in human prostate cancers by quantitative PCR, and by in situ hybridization. miR-143 was introduced in cancer cells in vivo by electroporation. Bioinformatics analysis and luciferase-based assays were used to determine miR-143 targets. We show in this study that miR-143 levels are inversely correlated with advanced stages of prostate cancer. Rescue of miR-143 expression in cancer cells results in the arrest of cell proliferation and the abrogation of tumor growth in mice. Furthermore, we show that the effects of miR-143 are mediated, at least in part by the inhibition of extracellular signal-regulated kinase-5 (ERK5) activity. We show here that ERK5 is a miR-143 target in prostate cancer. Conclusions: miR-143 is as a new target for prostate cancer treatment.
Resumo:
Evolutionary developmental biology has grown historically from the capacity to relate patterns of evolution in anatomy to patterns of evolution of expression of specific genes, whether between very distantly related species, or very closely related species or populations. Scaling up such studies by taking advantage of modern transcriptomics brings promising improvements, allowing us to estimate the overall impact and molecular mechanisms of convergence, constraint or innovation in anatomy and development. But it also presents major challenges, including the computational definitions of anatomical homology and of organ function, the criteria for the comparison of developmental stages, the annotation of transcriptomics data to proper anatomical and developmental terms, and the statistical methods to compare transcriptomic data between species to highlight significant conservation or changes. In this article, we review these challenges, and the ongoing efforts to address them, which are emerging from bioinformatics work on ontologies, evolutionary statistics, and data curation, with a focus on their implementation in the context of the development of our database Bgee (http://bgee.org). J. Exp. Zool. (Mol. Dev. Evol.) 324B: 372-382, 2015. © 2015 Wiley Periodicals, Inc.
Resumo:
Background Nowadays, combining the different sources of information to improve the biological knowledge available is a challenge in bioinformatics. One of the most powerful methods for integrating heterogeneous data types are kernel-based methods. Kernel-based data integration approaches consist of two basic steps: firstly the right kernel is chosen for each data set; secondly the kernels from the different data sources are combined to give a complete representation of the available data for a given statistical task. Results We analyze the integration of data from several sources of information using kernel PCA, from the point of view of reducing dimensionality. Moreover, we improve the interpretability of kernel PCA by adding to the plot the representation of the input variables that belong to any dataset. In particular, for each input variable or linear combination of input variables, we can represent the direction of maximum growth locally, which allows us to identify those samples with higher/lower values of the variables analyzed. Conclusions The integration of different datasets and the simultaneous representation of samples and variables together give us a better understanding of biological knowledge.
Resumo:
MOTIVATION: The functional impact of small molecules is increasingly being assessed in different eukaryotic species through large-scale phenotypic screening initiatives. Identifying the targets of these molecules is crucial to mechanistically understand their function and uncover new therapeutically relevant modes of action. However, despite extensive work carried out in model organisms and human, it is still unclear to what extent one can use information obtained in one species to make predictions in other species. RESULTS: Here, for the first time, we explore and validate at a large scale the use of protein homology relationships to predict the targets of small molecules across different species. Our results show that exploiting target homology can significantly improve the predictions, especially for molecules experimentally tested in other species. Interestingly, when considering separately orthology and paralogy relationships, we observe that mapping small molecule interactions among orthologs improves prediction accuracy, while including paralogs does not improve and even sometimes worsens the prediction accuracy. Overall, our results provide a novel approach to integrate chemical screening results across multiple species and highlight the promises and remaining challenges of using protein homology for small molecule target identification. AVAILABILITY AND IMPLEMENTATION: Homology-based predictions can be tested on our website http://www.swisstargetprediction.ch. CONTACT: david.gfeller@unil.ch or vincent.zoete@isb-sib.ch. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Resumo:
Mitochondrial DNA (mtDNA), a maternally inherited 16.6-Kb molecule crucial for energy production, is implicated in numerous human traits and disorders. It has been hypothesized that the presence of mutations in the mtDNA may contribute to the complex genetic basis of schizophreniadisease, due to the evidence of maternal inheritance and the presence of schizophrenia symptoms in patients affected of a mitochondrial disorder related to a mtDNA mutation. The present project aims to study the association of variants of mitochondrial DNA (mtDNA), and an increased risk of schizophrenia in a cohort of patients and controls from the same population. The entire mtDNA of 55 schizophrenia patients with an apparent maternal transmission of the disease and 38 controls was sequenced by Next Generation Sequencing (Ion Torrent PGM, Life Technologies) and compared to the reference sequence. The current method for establishing mtDNA haplotypes is Sanger sequencing, which is laborious, timeconsuming, and expensive. With the emergence of Next Generation Sequencing technologies, this sequencing process can be much more quickly and cost-efficiently. We have identified 14 variants that have not been previously reported. Two of them were missense variants: MTATP6 p.V113M and MTND5 p.F334L ,and also three variants encoding rRNA and one variant encoding tRNA. Not significant differences have been found in the number of variants between the two groups. We found that the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of the bioinformatics analysis and annotation step would be desirable to facilitate the application of NGS in mtDNA analysis.
Resumo:
MOTIVATION: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it. RESULTS: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology-SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge. AVAILABILITY: SwissLipids is freely available at http://www.swisslipids.org/. CONTACT: alan.bridge@isb-sib.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.