926 results for structure-function map
Abstract:
Communication within and across proteins is crucial for the biological functioning of proteins. Experiments such as mutational studies on proteins provide important information on the amino acids that are crucial for their function. However, protein structures are complex and it is unlikely that the entire responsibility for function rests on only a few amino acids. A large fraction of the protein is expected to participate in its function at some level. Thus, it is relevant to consider protein structures as completely connected networks and then deduce the properties that are related to the global network features. In this direction, our laboratory has been engaged in representing the protein structure as a network of non-covalent connections and we have investigated a variety of problems in structural biology, such as the identification of functional and folding clusters, determinants of quaternary association and characterization of the network properties of protein structures. We have also addressed a few important issues related to protein dynamics, such as the process of oligomerization in multimers, the mechanism of protein folding, and ligand-induced communication (allosteric effect). In this review we highlight some of the investigations that we have carried out in the recent past. A review on protein structure graphs was presented earlier, in which the focus was on the graphs and graph spectral properties and their implementation in the study of protein structure graphs/networks (PSN). In this article, we briefly summarize the relevant parts of the methodology, and the focus is on the advances in the understanding of protein structure-function relationships obtained through structure networks. The investigations of structural/biological problems are divided into two parts, of which the first deals with the analysis of PSNs based on static structures obtained from X-ray crystallography.
The second part highlights the changes in the network, associated with biological functions, which are deduced from the network analysis on the structures obtained from molecular dynamics simulations.
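The idea of treating a structure as a network of non-covalent contacts can be illustrated with a minimal sketch. It is not the authors' PSN construction: the single representative point per residue, the 6.5 Å distance cutoff, the exclusion of sequence neighbours, and the function names `contact_network` and `hubs` are all assumptions made for illustration.

```python
import math
from itertools import combinations

def contact_network(coords, cutoff=6.5):
    """Build an undirected contact graph from one 3D point per residue.

    Assumption: two residues are connected when their representative
    points lie within `cutoff` (same units as the coordinates) and they
    are not adjacent in sequence, so covalent neighbours are ignored.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if j - i < 2:
            continue  # skip sequence neighbours (covalently linked)
        if math.dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency

def hubs(adjacency, min_degree=4):
    """Return residues with at least `min_degree` contacts, sorted by index."""
    return sorted(n for n, nbrs in adjacency.items() if len(nbrs) >= min_degree)

# Toy coordinates (hypothetical, not taken from any real structure).
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.0, 0.0)]
network = contact_network(toy)
print(network)
print(hubs(network, min_degree=2))
```

Global properties such as the degree distribution or cluster membership can then be read from the adjacency map. A real analysis would choose the contact definition and the cutoff to suit the question being asked.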
Abstract:
This thesis studies human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text-mining and decision-tree-based method for automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability, and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset, was ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of the global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Molecular dynamics simulations have been carried out on all the jacalin-carbohydrate complexes of known structure, models of unliganded molecules derived from the complexes and also models of relevant complexes where X-ray structures are not available. Results of the simulations and the available crystal structures involving jacalin permit delineation of the relatively rigid and flexible regions of the molecule and the dynamical variability of the hydrogen bonds involved in stabilizing the structure. Local flexibility appears to be related to solvent accessibility. Hydrogen bonds involving side chains and water bridges involving buried water molecules appear to be important in the stabilization of loop structures. The lectin-carbohydrate interactions observed in crystal structures, the average parameters pertaining to them derived from simulations, energetic contribution of the stacking residue estimated from quantum mechanical calculations, and the scatter of the locations of carbohydrate and carbohydrate-binding residues are consistent with the known thermodynamic parameters of jacalin-carbohydrate interactions. The simulations, along with X-ray results, provide a fuller picture of carbohydrate binding by jacalin than provided by crystallographic analysis alone. The simulations confirm that in the unliganded structures water molecules tend to occupy the positions occupied by carbohydrate oxygens in the lectin-carbohydrate complexes. Population distributions in simulations of the free lectin, the ligands, and the complexes indicate a combination of conformational selection and induced fit. Proteins 2009; 77:760-777.
Abstract:
Background: The members of the cupin superfamily exhibit large variations in their sequences, functions, organization of domains, quaternary associations and the nature of the bound metal ion, despite having a conserved beta-barrel structural scaffold. Here, an attempt has been made to understand structure-function relationships among the members of this diverse superfamily and identify the principles governing functional diversity. The cupin superfamily also contains proteins for which structures are available through world-wide structural genomics initiatives but which are characterized as "hypothetical". We have explored the feasibility of obtaining clues to the functions of such proteins by means of comparative analysis with cupins of known structure and function. Methodology/Principal Findings: A 3-D structure-based phylogenetic approach was undertaken. Interestingly, a dendrogram generated solely on the basis of a structural dissimilarity measure at the level of domain folds was found to cluster functionally similar members. This clustering also reflects an independent evolution of the two domains in bicupins. Close examination of structural superposition of members across various functional clusters reveals structural variations in regions that not only form the active site pocket but are also involved in interaction with another domain in the same polypeptide or in the oligomer. Conclusions/Significance: Structure-based phylogeny of cupins can influence identification of functions of proteins of yet unknown function with the cupin fold. This approach can be extended to other proteins with a common fold that show high evolutionary divergence. This approach is expected to have an influence on function annotation in structural genomics initiatives.
Solution structure of O-glycosylated C-terminal leucine zipper domain of human salivary mucin (MUC7)
Abstract:
Solution structures of a 23 residue glycopeptide II (KIS* RFLLYMKNLLNRIIDDMVEQ, where * denotes the glycan Gal-beta-(1-3)-alpha-GalNAc) and its deglycosylated counterpart I derived from the C-terminal leucine zipper domain of low molecular weight human salivary mucin (MUC7) were studied using CD, NMR spectroscopy and molecular modeling. The peptide I was synthesized using Fmoc chemistry following the conventional procedure, and the glycopeptide II was synthesized by incorporating the O-glycosylated building block (N alpha-Fmoc-Ser-[Ac-4,-beta-D-Gal-(1,3)-Ac(2)alpha-D-GalN(3)]-OPfp) at the appropriate position in the stepwise assembly of the peptide chain. Solution structures of these glycosylated and nonglycosylated peptides were studied in water and in the presence of 50% of an organic cosolvent, trifluoroethanol (TFE), using circular dichroism (CD), and in 50% TFE using two-dimensional proton nuclear magnetic resonance (2D H-1 NMR) spectroscopy. CD spectra in aqueous medium indicate that the apopeptide I adopts mostly a beta-sheet conformation, whereas the glycopeptide II assumes a helical structure. This transition in the secondary structure upon glycosylation demonstrates that the carbohydrate moiety exerts a significant effect on the peptide backbone conformation. However, in 50% TFE both peptides show pronounced helical structure. Sequential and medium range NOEs, C alpha H chemical shift perturbations, (3)J(NH:C alpha H) couplings and deuterium exchange rates of the amide proton resonances in water containing 50% TFE indicate that the peptide I adopts an alpha-helical structure from Ile2-Val21 and the glycopeptide II adopts an alpha-helical structure from Ser3-Glu22. The observation of a continuous stretch of helix in both peptides by both NMR and CD spectroscopy strongly suggests that the C-terminal domain of MUC7, with heptad repeats of leucine or methionine residues, may be stabilized by a dimeric leucine zipper motif.
The results reported herein may be invaluable in understanding the aggregation (or dimerization) of MUC7 glycoprotein which would eventually have implications in determining its structure-function relationship.
Abstract:
A phenomenological model of spin sharing by the constituents of a proton is constructed, based on the recent EMC measurement of the spin dependent structure function and knowledge of the unpolarized parton densities.
Abstract:
Experimental studies have observed significant changes in both structure and function of lysozyme (and other proteins) on addition of a small amount of dimethyl sulfoxide (DMSO) in aqueous solution. Our atomistic molecular dynamics simulations of lysozyme in water-DMSO reveal the following sequence of changes on increasing DMSO concentration. (i) At the initial stage (around 5% DMSO concentration), the protein's conformational flexibility is markedly suppressed. From a study of radial distribution functions, we attribute this to the preferential solvation of exposed protein hydrophobic residues by the methyl groups of DMSO. (ii) In the next stage (10-15% DMSO concentration range), lysozyme partially unfolds, accompanied by an increase both in fluctuation and in exposed protein surface area. (iii) In the 15-20% concentration range, both conformational fluctuation and solvent accessible protein surface area suddenly decrease again, indicating the formation of an intermediate collapsed state. These results are in good agreement with near-UV circular dichroism (CD) and fluorescence studies. We explain this apparently surprising behavior in terms of a structural transformation which involves clustering among the methyl groups of DMSO. (iv) Beyond 20% concentration of DMSO, the protein starts its final sojourn towards the unfolded state with a further increase in conformational fluctuation and loss of native contacts. Most importantly, analysis of the contact map and of fluctuation near the active site reveals that both partial unfolding and conformational fluctuations are centered mostly on the hydrophobic core of the active site of lysozyme. Our results could offer a general explanation and universal picture of the anomalous behavior of protein structure-function observed in the presence of cosolvents (DMSO, ethanol, tertiary butyl alcohol, dioxane) at low concentrations. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3694268]
Abstract:
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used structural alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures, followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. (C) 2012 Elsevier Masson SAS. All rights reserved.
Abstract:
The Ramachandran map clearly delineates the regions of accessible conformational (phi-psi) space for amino acid residues in proteins. Experimental distributions of phi, psi values in high-resolution protein structures reveal sparsely populated zones within fully allowed regions and distinct clusters in apparently disallowed regions. Conformational space has been divided into 14 distinct bins. Residues adopting these relatively rare conformations are presented and amino acid propensities for these regions are estimated. Inspection of specific examples in a completely arid, fully allowed region in the top left quadrant establishes that side-chain and backbone interactions may provide the energetic compensation necessary for populating this region of phi-psi space. Asn, Asp, and His residues showed the highest propensities in this region. The two distinct clusters in the bottom right quadrant, which are formally disallowed on strict steric considerations, correspond to the gamma turn (C7 axial) conformation (Bin 12) and the i + 1 position of Type II turns (Bin 13). Of the 516 non-Gly residues in Bin 13, 384 occupied the i + 1 position of Type II turns. Further examination of these turn segments revealed a high propensity to occur at the N-terminus of helices and as a tight turn in hairpins. The strand-helix motif with the Type II turn as a connecting element was also found in as many as 57 examples. Proteins 2014; 82:1101-1112. (c) 2013 Wiley Periodicals, Inc.
Abstract:
Inference of the molecular function of proteins is a fundamental task in the quest to understand cellular processes. The task is becoming increasingly difficult, with thousands of new proteins discovered each day. The difficulty arises primarily from the lack of a high-throughput experimental technique for assessing protein molecular function, a lacuna that computational approaches are trying hard to fill. These too face a major bottleneck in the absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through a structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 μs coarse-grained (CG) molecular dynamics trajectories were used to compute normalized root-mean-square-fluctuation graphs and select mobile segments, which were thereafter matched for all pairs using unweighted three-dimensional autocorrelation vectors. Our in-house custom-built forcefield (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence between the dynamics signature of protein segments and function revealed an 87% true positive rate and a 93.5% true negative rate on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins for a negative test gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidence retrieved therein. This is the first proof-of-principle of the generalized use of structural dynamics for inferring protein molecular function, leveraging our custom-made CG FF, useful to all. (C) 2014 Wiley Periodicals, Inc.
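The screening step described above (per-residue fluctuation, normalization, selection of mobile segments) can be sketched as follows. This is an illustrative reading, not the published protocol: the assumption that frames are already superposed, normalization by the maximum value, the 0.5 threshold, the minimum segment length, and all function names are hypothetical choices.

```python
import math

def rmsf(trajectory):
    """Per-residue root-mean-square fluctuation.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    tuples, one per residue. Frames are assumed to be superposed already.
    """
    n_frames = len(trajectory)
    n_res = len(trajectory[0])
    result = []
    for r in range(n_res):
        mean = [sum(f[r][k] for f in trajectory) / n_frames for k in range(3)]
        msd = sum(
            sum((f[r][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

def normalize(values):
    """Scale values to [0, 1] by the maximum (an assumed normalization)."""
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]

def mobile_segments(norm_rmsf, threshold=0.5, min_len=2):
    """Return (start, end) index pairs of runs at or above a positive `threshold`."""
    segments, start = [], None
    for i, v in enumerate(norm_rmsf + [0.0]):  # sentinel closes a trailing run
        if v >= threshold and start is None:
            start = i
        elif v < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    return segments

# Two-frame toy trajectory with four residues (hypothetical values).
frames = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 0, 0)],
    [(0, 0, 0), (-1, 0, 0), (-2, 0, 0), (5, 0, 0)],
]
profile = normalize(rmsf(frames))
print(profile)
print(mobile_segments(profile))
```

Matching the selected segments between proteins, which the abstract does with autocorrelation vectors, is not covered by this sketch.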
Abstract:
Background: In the post-genomic era where sequences are being determined at a rapid rate, we are highly reliant on computational methods for their tentative biochemical characterization. The Pfam database currently contains 3,786 families corresponding to "Domains of Unknown Function" (DUF) or "Uncharacterized Protein Family" (UPF), of which 3,087 families have no reported three-dimensional structure, constituting almost one-fourth of the known protein families in search of both structure and function. Results: We applied a "computational structural genomics" approach using five state-of-the-art remote similarity detection methods to detect the relationship between uncharacterized DUFs and domain families of known structures. The association with a structural domain family could serve as a starting point in elucidating the function of a DUF. Amongst these five methods, searches in the SCOP-NrichD database have been applied for the first time. Predictions were classified into high, medium and low confidence based on the consensus of results from various approaches and also annotated with enzyme and Gene Ontology terms. 614 uncharacterized DUFs could be associated with a known structural domain, of which high confidence predictions, involving at least four methods, were made for 54 families. These structure-function relationships for the 614 DUF families can be accessed on-line at http://proline.biochem.iisc.ernet.in/RHD_DUFS/. For potential enzymes in this set, we assessed their compatibility with the associated fold and performed detailed structural and functional annotation by examining alignments and the extent of conservation of functional residues. Detailed discussion is provided for interesting assignments for DUF3050, DUF1636, DUF1572, DUF2092 and DUF659. Conclusions: This study provides insights into the structure and potential function for nearly 20% of the DUFs.
Use of different computational approaches enables us to reliably recognize distant relationships, especially when they converge to a common assignment, because the methods are often complementary. We observe that while pointers to the structural domain can offer the right clues to the function of a protein, recognition of its precise functional role is still "non-trivial", with many DUF domains conserving only some of the critical residues. It is not clear whether these are functional vestiges or instances involving alternate substrates and interacting partners. Reviewers: This article was reviewed by Drs Eugene Koonin, Frank Eisenhaber and Srikrishna Subramanian.
Abstract:
The longitudinal fluctuating velocity of a turbulent boundary layer was measured in a water channel at a moderate Reynolds number. The extended self-similar scaling law of the structure function proposed by Benzi was verified. The longitudinal fluctuating velocity in the turbulent boundary layer was decomposed into many multi-scale eddy structures by wavelet transform. The extended self-similar scaling law of the structure function for each scale of eddy velocity was investigated. The conclusions are: 1) The statistical properties of turbulence could be self-similar not only at high Reynolds number, but also at moderate and low Reynolds number, and they could be characterized by the same set of scaling exponents xi (1)(n) = n/3 and xi (2)(n) = n/3 of the fully developed regime. 2) The range of scales where the extended self-similarity is valid is much larger than the inertial range and extends deep into the dissipation range, with the same set of scaling exponents. 3) The extended self-similarity is applicable not only to homogeneous turbulence, but also to shear turbulence such as turbulent boundary layers.
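To make the quantities above concrete, here is a minimal sketch of how a structure function and an extended self-similarity (ESS) exponent might be estimated from a sampled velocity signal. It assumes the usual ESS procedure of plotting the order-n structure function against the third-order one on logarithmic axes. The function names, the use of absolute increments, and the least-squares fit are illustrative choices, not the measurement pipeline of the study.

```python
import math

def structure_function(u, r, n):
    """S_n(r): mean of |u[i + r] - u[i]|**n over all valid samples i."""
    diffs = [abs(u[i + r] - u[i]) ** n for i in range(len(u) - r)]
    return sum(diffs) / len(diffs)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def ess_exponent(u, n, separations):
    """Relative exponent: slope of log S_n(r) against log S_3(r).

    Requires nonzero structure functions at every separation, since the
    logarithm is taken.
    """
    log_s3 = [math.log(structure_function(u, r, 3)) for r in separations]
    log_sn = [math.log(structure_function(u, r, n)) for r in separations]
    return slope(log_s3, log_sn)

# Sanity check on a linear ramp (not turbulence data): every increment
# equals r, so S_n = r**n and the relative exponent is exactly n / 3.
ramp = [float(i) for i in range(64)]
print(ess_exponent(ramp, 6, [1, 2, 4, 8]))
```

With measured data, the separations would span the range over which the log-log plot is straight, which according to the abstract extends well beyond the inertial range.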
Abstract:
The longitudinal structure function (LSF) and the transverse structure function (TSF) in isotropic turbulence are calculated using a vortex model. The vortex model is composed of the Rankine and Burgers vortices which have the exponential distributions in the vortex Reynolds number and vortex radii. This model exhibits a power law in the inertial range and satisfies the minimal condition of isotropy that the second-order exponent of the LSF in the inertial range is equal to that of the TSF. Also observed are differences between longitudinal and transverse structure functions caused by intermittency. These differences are related to their scaling differences which have been previously observed in experiments and numerical simulations.
Abstract:
A novel short neurotoxin, cobrotoxin c (CBT C) was isolated from the venom of monocellate cobra (Naja kaouthia) using a combination of ion-exchange chromatography and FPLC. Its primary structure was determined by Edman degradation. CBT C is composed of 61 amino acid residues. It differs from cobrotoxin b (CBT B) by only two amino acid substitutions, Thr/Ala11 and Arg/Thr56, which are not located on the functionally important regions by sequence similarity. However, the LD50 is 0.08 mg/g to mice, i.e. approximately five-fold higher than for CBT B. Strikingly, a structure-function relationship analysis suggests the existence of a functionally important domain on the outside of Loop III of CBT C. The functionally important basic residues on the outside of Loop III might have a pairwise interaction with alpha subunit, instead of gamma or delta subunits of the nicotinic acetylcholine receptor (nAChR). (C) 2002 Elsevier Science Inc. All rights reserved.
Abstract:
Kolmogorov's two-thirds, $\langle(\Delta v)^2\rangle \sim \varepsilon^{2/3} r^{2/3}$, and five-thirds, $E \sim \varepsilon^{2/3} k^{-5/3}$, laws are formally equivalent in the limit of vanishing viscosity, $\nu \to 0$. However, for most Reynolds numbers encountered in laboratory scale experiments, or numerical simulations, it is invariably easier to observe the five-thirds law. By creating artificial fields of isotropic turbulence composed of a random sea of Gaussian eddies whose size and energy distribution can be controlled, we show why this is the case. The energy of eddies of scale $s$ is shown to vary as $s^{2/3}$, in accordance with Kolmogorov's 1941 law, and we vary the range of scales, $\gamma = s_{\max}/s_{\min}$, in any one realisation from $\gamma = 25$ to $\gamma = 800$. This is equivalent to varying the Reynolds number in an experiment from $R_\lambda = 60$ to $R_\lambda = 600$. While there is some evidence of a five-thirds law for $\gamma > 50$ ($R_\lambda > 100$), the two-thirds law only starts to become apparent when $\gamma$ approaches 200 ($R_\lambda \sim 240$). The reason for this discrepancy is that the second-order structure function is a poor filter, mixing information about energy and enstrophy, and from scales larger and smaller than $r$. In particular, in the inertial range, $\langle(\Delta v)^2\rangle$ takes the form of a mixed power law, $a_1 + a_2 r^2 + a_3 r^{2/3}$, where $a_2 r^2$ tracks the variation in enstrophy and $a_3 r^{2/3}$ the variation in energy. These findings are shown to be consistent with experimental data where the pollution of the $r^{2/3}$ law by the enstrophy contribution, $a_2 r^2$, is clearly evident. We show that higher-order structure functions (of even order) suffer from a similar deficiency.