7 results for universal in silico predictor of protein protein interaction
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The vast majority of known proteins have not yet been experimentally characterized, and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods that can computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances, related to the specific type of amino acid pair, that are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied the extent to which the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps.
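As an illustration of the two network measures named above, the following minimal sketch computes the characteristic path length and the average clustering coefficient of a contact network built from a binary contact map. It is not the thesis's implementation: the function names, the list-of-lists input format and the toy map are assumptions made for this example only.

```python
from collections import deque
from itertools import combinations


def contact_graph(contact_map):
    """Build an adjacency dict from a symmetric 0/1 contact map (list of lists)."""
    n = len(contact_map)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if contact_map[i][j]:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected residue pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than the source
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average over residues of the fraction of neighbour pairs that are in contact."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # convention assumed here for degree < 2
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0


# Hypothetical 4-residue toy map with contacts 0-1, 1-2, 2-3 and 0-2.
toy_map = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_graph = contact_graph(toy_map)
```

A short characteristic path length combined with a high clustering coefficient, relative to a random graph of the same size and density, is what is usually meant by small-world behavior.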
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still await interpretation. Nevertheless, these data are the basis for the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, only about 20% of annotated proteins in the Homo sapiens genome have currently been experimentally characterized.
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences that have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the present databases of molecular functions and structures.
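The clustering step described above, with joint constraints on sequence identity and alignment coverage, can be sketched as follows. This is an illustrative sketch only: the abstract does not state the thresholds, the exact coverage definition or the linkage rule, so the values `min_identity=40.0` and `min_coverage=0.9`, the coverage computed on the longer sequence, and the single-linkage grouping via union-find are all assumptions. The hit tuples are a hypothetical, simplified stand-in for parsed BLAST output.

```python
def cluster_by_identity_and_coverage(lengths, hits, min_identity=40.0, min_coverage=0.9):
    """Group sequences linked by hits that pass both constraints.

    lengths: dict mapping sequence name -> sequence length
    hits: iterable of (query, subject, percent_identity, alignment_length)
    Thresholds are placeholder values, not those used in the thesis.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, aln_len in hits:
        # Assumed definition: coverage relative to the LONGER sequence, so a
        # short protein matching one domain of a long one is not linked.
        coverage = aln_len / max(lengths[query], lengths[subject])
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(query)] = find(subject)

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda members: sorted(members))


# Hypothetical example: C is a long multi-domain protein sharing one domain with A.
lengths = {"A": 100, "B": 105, "C": 300, "D": 98}
hits = [
    ("A", "B", 55.0, 98),   # passes both constraints
    ("A", "C", 80.0, 100),  # high identity, but coverage is only 100/300
    ("B", "D", 35.0, 97),   # good coverage, identity below threshold
]
clusters = cluster_by_identity_and_coverage(lengths, hits)
```

Requiring coverage on the longer sequence is one way to keep clusters homogeneous in sequence length, which is the property the abstract attributes to its measure.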
Abstract:
Due to the growing attention consumers pay to their food, improving the quality of animal products has become one of the main focuses of research. To this aim, the application of modern molecular genetics approaches has proved extremely useful and effective. This innovative drive covers production in all livestock species, including pork. The Italian pig breeding industry is unique because it needs heavy pigs slaughtered at about 160 kg for the production of high-quality processed products. For this reason, it requires precise meat quality and carcass characteristics. Two aspects have been considered in this thesis: the application of transcriptome analysis in post mortem pig muscles as a possible method to evaluate meat quality parameters related to the pre mortem status of the animals, including health, nutrition and welfare, with potential applications for product traceability (chapters 3 and 4); and the study of candidate genes for obesity-related traits in order to identify markers associated with fatness in pigs that could be applied to improve carcass quality (chapters 5, 6, and 7). Chapter three addresses the first issue from a methodological point of view. When we considered this issue, it was not obvious that post mortem skeletal muscle could be useful for transcriptomic analysis. Therefore we demonstrated that the quality of RNA extracted from skeletal muscle of pigs sampled at different post mortem intervals (20 minutes, 2 hours, 6 hours, and 24 hours) is good for downstream applications. Degradation occurred starting from 48 h post mortem, even if at this time it is still possible to use some RNA products. In the fourth chapter, in order to demonstrate the potential use of RNA obtained up to 24 hours post mortem, we present the results of RNA analysis with the Affymetrix microarray platform, which made it possible to assess the level of expression of more than 24,000 mRNAs.
We did not identify any significant differences between the different post mortem times, suggesting that this technique could be applied to retrieve information from the transcriptome of skeletal muscle samples not collected just after slaughtering. This study represents the first contribution of this kind applied to pork. In the fifth chapter, we investigated the TBC1D1 [TBC1 (tre-2/USP6, BUB2, cdc16) domain family, member 1] gene as a candidate for fat deposition. This gene is involved in mechanisms regulating energy homeostasis in skeletal muscle and is associated with predisposition to obesity in humans. By resequencing a fragment of the TBC1D1 gene we identified three synonymous mutations localized in exon 2 (g.40A>G, g.151C>T, and g.172T>C) and two polymorphisms localized in intron 2 (g.219G>A and g.252G>A). One of these polymorphisms (g.219G>A) was genotyped by high resolution melting (HRM) analysis and PCR-RFLP. Moreover, this gene sequence was mapped by radiation hybrid analysis on porcine chromosome 8. The association study was conducted in 756 performance-tested pigs of the Italian Large White and Italian Duroc breeds. Significant results were obtained for lean meat content, back fat thickness, visible intermuscular fat and ham weight. In chapter six, a second candidate gene (tribbles homolog 3, TRIB3) is analyzed in a study of association with carcass and meat quality traits. The TRIB3 gene is involved in energy metabolism of skeletal muscle and plays a role as a suppressor of adipocyte differentiation. We identified two polymorphisms in the first coding exon of the porcine TRIB3 gene: one is a synonymous SNP (c.132T>C), the other a missense mutation (c.146C>T, p.P49L). The two polymorphisms appear to be in complete linkage disequilibrium between and within breeds. The in silico analysis of the p.P49L substitution suggests that it might have a functional effect.
The association study in about 650 pigs indicates that this marker is associated with back fat thickness in Italian Large White and Italian Duroc breeds in two different experimental designs. This polymorphism is also associated with lactate content of the semimembranosus muscle in Italian Large White pigs. Expression analysis indicated that this gene is transcribed in skeletal muscle and adipose tissue as well as in other tissues. In the seventh chapter, we report the genotyping results for 677 SNPs in extreme divergent groups of pigs chosen according to their extreme estimated breeding values for back fat thickness. SNPs were identified by resequencing, literature mining and in silico database mining of 60 candidate genes for obesity. Genotyping was carried out using the GoldenGate (Illumina) platform. Of the analyzed SNPs, more than 300 were polymorphic in the genotyped population and had a minor allele frequency (MAF) >0.05. Of these SNPs, 65 were associated (P<0.10) with back fat thickness. One of the most significant gene markers was the same TBC1D1 SNP reported in chapter 5, confirming the role of this gene in fat deposition in pigs. These results could be important to better define the pig as a model for human obesity, in addition to their use in marker-assisted selection to improve carcass characteristics.
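The minor allele frequency filter mentioned above (MAF >0.05) can be illustrated with a minimal sketch for a biallelic SNP. The function names and the genotype counts are hypothetical and chosen only for this example; they are not data from the thesis.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from counts of AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each AA animal carries two A alleles, each AB animal carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def passes_maf_filter(n_aa, n_ab, n_bb, threshold=0.05):
    """True if the SNP is polymorphic with MAF strictly above the threshold."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > threshold


# Hypothetical counts: 80 AA, 18 AB and 2 BB animals give a MAF of about 0.11.
example_maf = minor_allele_frequency(80, 18, 2)
```

A monomorphic SNP has a MAF of 0 and is excluded by the same test, which is why the abstract reports polymorphism and the MAF threshold together.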
Abstract:
Mutations in the OPA1 gene have been identified in the majority of patients with Dominant Optic Atrophy (DOA), a blinding disease, and the syndromic form DOA-plus. The OPA1 protein is a mitochondrial GTPase involved in various mitochondrial functions, present in humans in eight isoforms resulting from alternative splicing and proteolytic processing. In this study we investigated the specific role of each isoform through expression in OPA1-/- MEFs, by evaluating their ability to improve the defective mitochondrial phenotypes. All isoforms were able to rescue the energetic efficiency, mitochondrial DNA (mtDNA) content and cristae integrity, but only the presence of both long and short forms could recover the mitochondrial morphology. In order to identify the OPA1 protein domains crucial for its functions, we selected and modified isoform 1, shown to be one of the most efficient in preserving the mitochondrial phenotype, to express three specific OPA1 variants, namely: one with a different N-terminal portion, one unable to generate the short form owing to deletion of the S1 cleavage site, and one with a defective GTPase domain. We demonstrated that the simultaneous presence of the N- and C-termini of OPA1 was essential for mtDNA maintenance; a cleavable isoform generating s-forms was necessary to completely rescue the energetic competence; and the presence of the C-terminus was sufficient to partially recover the cristae ultrastructure. Lastly, several pathogenic OPA1 mutations were inserted in MEF clones and their biochemical features investigated, to correlate the defective phenotypes with the clinical severity of patients. Our results clearly indicate that this cell model reflects the clinical characteristics of the patients very well, and can therefore be proposed as a useful tool to shed light on the pathomechanism underlying DOA.
Abstract:
We present a study of the metal sites of different proteins through X-ray Absorption Fine Structure (XAFS) spectroscopy. First of all, the capabilities of XAFS analysis have been improved by ab initio simulation of the near-edge region of the spectra, and an original analysis method has been proposed. The method subsequently served as a tool to treat diverse biophysical problems, such as the inhibition of proton-translocating proteins by metal ions and the matrix effect exerted on photosynthetic proteins (the bacterial Reaction Center, RC) by strongly dehydrated sugar matrices. A time-resolved study of the Fe site of the RC with μs resolution has also been attempted. Finally, a further step aimed at improving the reliability of XAFS analysis has been performed by calculating the dynamical parameters of the metal binding cluster by means of DFT methods, and the theoretical result obtained for MbCO has been successfully compared with experimental data.
Abstract:
Nano(bio)science and nano(bio)technology are attracting growing and tremendous interest in both academia and industry. They are undergoing rapid developments on many fronts such as genomics, proteomics, systems biology, and medical applications. However, the lack of characterization tools for nano(bio)systems is currently considered a major limiting factor to the final establishment of nano(bio)technologies. Flow Field-Flow Fractionation (FlFFF) is a separation technique that is clearly emerging in the bioanalytical field, and the number of applications to nano(bio)analytes such as high molar-mass proteins and protein complexes, sub-cellular units, viruses, and functionalized nanoparticles is constantly increasing. This can be ascribed to the intrinsic advantages of FlFFF for the separation of nano(bio)analytes. FlFFF is ideally suited to separate particles over a broad size range (1 nm-1 μm) according to their hydrodynamic radius (rh). The fractionation is carried out in an empty channel by a flow stream of a mobile phase of any composition. As a result, fractionation proceeds without surface interaction of the analyte with packing or gel media, and there is no stationary phase able to induce mechanical or shear stress on nanosized analytes, which are therefore kept in their native state. Characterization of nano(bio)analytes is made possible after fractionation by interfacing the FlFFF system with detection techniques for morphological, optical or mass characterization. For instance, FlFFF coupling with multi-angle light scattering (MALS) detection allows for absolute molecular weight and size determination, and mass spectrometry has brought FlFFF into the field of proteomics. The potential of FlFFF coupling with multi-detection systems is discussed in the first section of this dissertation.
The second and the third sections are dedicated to new methods that have been developed for the analysis and characterization of different samples of interest in the fields of diagnostics, pharmaceutics, and nanomedicine. The second section focuses on biological samples such as protein complexes and protein aggregates. In particular, it focuses on FlFFF methods developed to give new insights into: a) the chemical composition and morphological features of blood serum lipoprotein classes, b) the time-dependent aggregation pattern of the amyloid protein Aβ1-42, and c) the aggregation state of antibody therapeutics in their formulation buffers. The third section is dedicated to the analysis and characterization of structured nanoparticles designed for nanomedicine applications. The results discussed indicate that FlFFF with on-line MALS and fluorescence detection (FD) may become the unparalleled methodology for the analysis and characterization of new, structured, fluorescent nanomaterials.
Abstract:
Chromatography is the most widely used technique for high-resolution separation and analysis of proteins. This technique is very useful for the purification of delicate compounds, e.g. pharmaceuticals, because it is usually performed under milder conditions than the separation processes typically used by the chemical industry. This thesis focuses on affinity chromatography. Chromatographic processes are traditionally performed using columns packed with porous resin. However, these supports have several limitations, including the dependence on intra-particle diffusion, a slow mass transfer mechanism, for the transport of solute molecules to the binding sites within the pores, and a high pressure drop through the packed bed. These limitations can be overcome by using chromatographic supports such as membranes or monoliths. Dye-ligands are considered important alternatives to natural ligands. Several reactive dyes, particularly Cibacron Blue F3GA, are used as affinity ligands for protein purification. Cibacron Blue F3GA is a triazine dye that interacts specifically and reversibly with albumin. The aim of this study is to prepare dye-affinity membranes and monoliths for efficient removal of albumin and to compare three different affinity supports: membranes, monoliths and a commercial column, HiTrap™ Blue HP, produced by GE Healthcare. The three supports were compared in terms of binding capacity at saturation (DBC100%) and dynamic binding capacity at 10% breakthrough (DBC10%) using solutions of pure BSA. The results obtained show that the CB-RC membranes and CB-Epoxy monoliths are comparable to the commercial support, the HiTrap™ Blue HP column, for the separation of albumin. These results encourage further characterization of the new supports examined.
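As a rough illustration of how a dynamic binding capacity at 10% breakthrough might be read off a breakthrough curve, the sketch below uses the commonly used definition DBC = C0 × (V_breakthrough − V_dead) / V_support, with linear interpolation between sampled points. The abstract does not give the formula or any data, so the definition chosen, the function names, and all numbers are assumptions for this example only.

```python
def volume_at_breakthrough(volumes, c_over_c0, fraction=0.10):
    """Loaded volume at which outlet/feed concentration first reaches `fraction`.

    volumes and c_over_c0 are parallel lists sampled along the breakthrough
    curve; the crossing point is found by linear interpolation.
    """
    points = list(zip(volumes, c_over_c0))
    for (v0, y0), (v1, y1) in zip(points, points[1:]):
        if y0 < fraction <= y1:
            return v0 + (fraction - y0) * (v1 - v0) / (y1 - y0)
    raise ValueError("curve never crosses the requested breakthrough fraction")


def dynamic_binding_capacity(volumes, c_over_c0, feed_conc, dead_volume,
                             support_volume, fraction=0.10):
    """Protein bound per unit support volume at the given breakthrough fraction.

    Assumed definition: feed_conc * (V_breakthrough - dead_volume) / support_volume.
    With feed_conc in mg/mL and volumes in mL the result is in mg/mL of support.
    """
    v_bt = volume_at_breakthrough(volumes, c_over_c0, fraction)
    return feed_conc * (v_bt - dead_volume) / support_volume


# Hypothetical BSA breakthrough curve (mL loaded vs. C/C0), not measured data.
volumes = [0.0, 2.0, 4.0, 6.0, 8.0]
c_over_c0 = [0.0, 0.0, 0.05, 0.15, 0.60]
dbc10 = dynamic_binding_capacity(volumes, c_over_c0, feed_conc=1.0,
                                 dead_volume=0.5, support_volume=1.0)
```

Normalizing by support volume is what allows membranes, monoliths and a packed column of different sizes to be compared on the same scale.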
Abstract:
Nowadays, in developed countries, excessive food intake, in conjunction with decreased physical activity, has led to an increase in lifestyle-related diseases, such as obesity, cardiovascular diseases, type 2 diabetes, a range of cancer types and arthritis. The socio-economic importance of such lifestyle-related diseases has encouraged countries to increase their research efforts, and many projects focusing on the relationship between food and health have been initiated recently. Thanks to these efforts and to the growing availability of technologies, food companies are beginning to develop healthier food. The need for rapid and affordable methods that help the food industry in ingredient selection has stimulated the development of in vitro systems that simulate the physiological functions to which food components are subjected when administered in vivo. One of the most promising tools now available appears to be in vitro digestion, which aims at predicting, in a comparative way among analogous food products, the bioaccessibility of the nutrients of interest. The foodomics approach has been adopted in this work to evaluate the modifications occurring during the in vitro digestion of selected protein-rich food products. Protein breakdown was measured via NMR spectroscopy, the only technique capable of observing, directly in the simulated gastric and duodenal fluids, the soluble oligo- and polypeptides released during the in vitro digestion process. The overall approach pioneered in this PhD work has been discussed and promoted in a large scientific community, with specialists networked under the INFOGEST COST Action, which recently released a harmonized protocol for in vitro digestion.
NMR spectroscopy, when used in tandem with in vitro digestion, generates a new concept that provides an additional attribute to describe food quality: comparative digestibility, which measures the improvement of nutrient bioaccessibility.