5 results for Structural Biology
in Bucknell University Digital Commons - Pennsylvania - USA
Abstract:
Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. Many classification methods aim at accurate classification but fail to explain which sequence features actually contribute to the accuracy. We hypothesize that short subsequences (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function first harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Because the dataset is unbalanced, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences.
Our method fared well compared to an existing motif-finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The use of amino acid substitution scores for similarity detection and of the dampening factor to normalize the unbalanced datasets has a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with a potential application to detecting proteome-specific motifs of different organisms.
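The harvesting, normalization, and thresholding steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the IDF-like dampening formula, the per-sequence frequency normalization, and the default threshold are all assumptions chosen for illustration, and the BLOSUM62-based merging of similar n-grams is omitted for brevity.

```python
from collections import Counter
from math import log


def harvest_ngrams(sequence, n_min=4, n_max=8):
    """Yield every contiguous n-gram of length n_min..n_max in a sequence."""
    for n in range(n_min, n_max + 1):
        for i in range(len(sequence) - n + 1):
            yield sequence[i:i + n]


def discriminative_ngrams(classes, threshold=1.0):
    """Score n-grams per class and keep those at or above the threshold.

    classes: dict mapping a class name to a list of protein sequences.
    Returns a dict mapping each class name to {ngram: score}.
    """
    # Raw n-gram counts for each class.
    counts = {
        name: Counter(g for seq in seqs for g in harvest_ngrams(seq))
        for name, seqs in classes.items()
    }
    n_classes = len(counts)
    # Number of classes in which each n-gram occurs at least once.
    class_presence = Counter(g for c in counts.values() for g in c)

    result = {}
    for name, class_counts in counts.items():
        n_seqs = len(classes[name])
        scored = {}
        for ngram, count in class_counts.items():
            # Assumed dampening factor (IDF-like): n-grams present in
            # fewer classes receive a larger weight.
            damp = log(1 + n_classes / class_presence[ngram])
            # Assumed normalization: occurrences per sequence in the class.
            score = (count / n_seqs) * damp
            if score >= threshold:
                scored[ngram] = score
        result[name] = scored
    return result
```

Calling `discriminative_ngrams` on a dictionary of labelled sequences returns, for each class, only the n-grams that are both frequent in that class and rare across classes. Dividing by the number of sequences is one simple way to offset unequal class sizes; the abstract does not specify the exact normalization used.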
Performance Tuning Non-Uniform Sampling for Sensitivity Enhancement of Signal-Limited Biological NMR
Abstract:
Non-uniform sampling (NUS) has been established as a route to obtaining true sensitivity enhancements when recording indirect dimensions of decaying signals in the same total experimental time as traditional uniform incrementation of the indirect evolution period. Theory and experiments have shown that NUS can yield up to two-fold improvements in the intrinsic signal-to-noise ratio (SNR) of each dimension, while even conservative protocols can yield 20-40% improvements in the intrinsic SNR of NMR data. Applications of biological NMR that can benefit from these improvements are emerging, and in this work we develop some practical aspects of applying NUS nD-NMR to studies that approach the traditional detection limit of nD-NMR spectroscopy. Conditions for obtaining high NUS sensitivity enhancements are considered here in the context of enabling 1H,15N-HSQC experiments on natural abundance protein samples and 1H,13C-HMBC experiments on a challenging natural product. Through systematic studies we arrive at more precise guidelines to contrast sensitivity enhancements with reduced line shape constraints, and report an alternative sampling density based on a quarter-wave sinusoidal distribution that returns the highest fidelity we have seen to date in line shapes obtained by maximum entropy processing of non-uniformly sampled data.
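A sampling schedule of the kind described in this abstract can be sketched in Python. The abstract does not give the exact density, so this sketch assumes a weight of cos(pi*i/(2N)) for increment i out of N, which falls from 1 toward 0 over a quarter of a cosine wave. The function name, the weighted-sampling method (Efraimidis-Spirakis keys), and the seed handling are likewise illustrative assumptions, not the authors' procedure.

```python
import math
import random


def quarter_wave_schedule(n_total, n_sampled, seed=0):
    """Pick n_sampled of n_total indirect-dimension increments.

    Earlier increments, where a decaying signal is strongest, are
    favoured by an assumed quarter-wave cosine weighting.
    Returns a sorted list of unique increment indices.
    """
    if not 0 < n_sampled <= n_total:
        raise ValueError("n_sampled must be between 1 and n_total")
    rng = random.Random(seed)
    # Assumed density: cos(pi * i / (2 * n_total)), positive for all i < n_total.
    weights = [math.cos(math.pi * i / (2 * n_total)) for i in range(n_total)]
    # Weighted sampling without replacement: draw u in [0, 1), form the key
    # u ** (1 / w), and keep the indices with the largest keys.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    top = sorted(keys, reverse=True)[:n_sampled]
    return sorted(i for _, i in top)
```

For example, `quarter_wave_schedule(64, 32)` draws a 50% schedule from a 64-point grid. Fixing the seed makes the schedule reproducible, which matters when the same schedule has to be reused for acquisition and processing.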
Abstract:
Phosphatidylinositol-specific phospholipases C (PI-PLC) are known to participate in many eukaryotic signal transduction pathways and act as virulence factors in lower organisms. Glycerophosphoryl diester phosphodiesterase (GDPD) enzymes are involved in phosphate homeostasis and phospholipid catabolism for energy production. Streptomyces antibioticus phosphatidylinositol-specific phospholipase C (SaPLC1) is a 38 kDa enzyme that displays characteristics of both enzyme superfamilies, representing an evolutionary link between these divergent enzyme classes. SaPLC1 also boasts a unique catalytic mechanism that involves a trans 1,6-cyclic inositol phosphate intermediate instead of the typical cis 1,2-cyclic inositol phosphate. The mechanism by which this occurs is still unclear. To attack this problem, we established a wide mutagenesis scan of the active site and measured activities of alanine mutants. A chemical rescue assay was developed to verify that the activity loss was due to the removal of the functional role of the mutated residue. 31P-NMR was employed in characterizing and quantifying intermediates in mutants that slowed the reaction sufficiently. We found that the H37A and H76A mutations support the hypothesis that these structurally conserved residues are also conserved in terms of their catalytic roles. H37 was found to be the general base (GB), while H76 plays the role of general acid (GA). K131 was identified as a semi-conserved key positive charge donor found at the entrance of the active site. By elucidating the SaPLC1 mechanism in relation to its active site architecture, we have increased our understanding of the structure-function relations that support catalysis in the PI-PLC/GDPD superfamily. These findings provide groundwork for in vivo studies of SaPLC1 function and its possible role in novel signaling or metabolism in Streptomyces.
Abstract:
The mechanical properties of cytoskeletal networks are intimately involved in determining how forces and cellular processes are generated, directed, and transmitted in living cells. However, determining the mechanical properties of subcellular molecular complexes in vivo has proven to be difficult. Here, we combine in vivo measurements by optical microscopy, X-ray diffraction, and transmission electron microscopy with theoretical modeling to decipher the mechanical properties of the magnetosome chain system encountered in magnetotactic bacteria. We exploit the magnetic properties of the endogenous intracellular nanoparticles to apply a force on the filament-connector pair involved in the backbone formation and stabilization. We show that the magnetosome chain can be broken by applying an external magnetic field stronger than 30 mT and suggest that this originates from the rupture of the magnetosome connector MamJ. In addition, we calculate that the biological determinants can withstand a force of 25 pN in vivo. This quantitative understanding provides insights for the design of functional materials such as actuators and sensors using cellular components.
Abstract:
Recently, we have demonstrated that considerable inherent sensitivity gains are attained in MAS NMR spectra acquired by nonuniform sampling (NUS) and introduced maximum entropy interpolation (MINT) processing that assures the linearity of transformation between the time and frequency domains. In this report, we examine the utility of the NUS/MINT approach in multidimensional datasets possessing high dynamic range, such as homonuclear 13C-13C correlation spectra. We demonstrate on model compounds and on 1-73-(U-13C,15N)/74-108-(U-15N) E. coli thioredoxin reassembly that, with appropriately constructed 50% NUS schedules, inherent sensitivity gains of 1.7-2.1-fold are readily reached in such datasets. We show that both linearity and line width are retained under these experimental conditions throughout the entire dynamic range of the signals. Furthermore, we demonstrate that the reproducibility of the peak intensities is excellent in the NUS/MINT approach when experiments are repeated multiple times and identical experimental and processing conditions are employed. Finally, we discuss the principles for design and implementation of random, exponentially biased NUS schedules for homonuclear 13C-13C MAS correlation experiments that yield high-quality artifact-free datasets.