4 results for attribute-based signature
in National Center for Biotechnology Information - NCBI
Abstract:
We present a general approach to forming structure-activity relationships (SARs). This approach is based on representing chemical structure by atoms and their bond connectivities in combination with the inductive logic programming (ILP) algorithm PROGOL. Existing SAR methods describe chemical structure by using attributes which are general properties of an object. It is not possible to map chemical structure directly to attribute-based descriptions, as such descriptions have no internal organization. A more natural and general way to describe chemical structure is to use a relational description, where the internal construction of the description maps that of the object described. Our atom and bond connectivities representation is a relational description. ILP algorithms can form SARs with relational descriptions. We have tested the relational approach by investigating the SARs of 230 aromatic and heteroaromatic nitro compounds. These compounds had been split previously into two subsets, 188 compounds that were amenable to regression and 42 that were not. For the 188 compounds, a SAR was found that was as accurate as the best statistical or neural network-generated SARs. The PROGOL SAR has the advantages that it did not need the use of any indicator variables handcrafted by an expert, and the generated rules were easily comprehensible. For the 42 compounds, PROGOL formed a SAR that was significantly (P < 0.025) more accurate than linear regression, quadratic regression, and back-propagation. This SAR is based on an automatically generated structural alert for mutagenicity.
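The contrast between attribute-based and relational descriptions can be illustrated with a minimal sketch. This is plain Python, not PROGOL or any ILP system, and the toy molecule, the `Atom` and `Bond` records, and both function names are hypothetical choices made only for illustration. The flat attribute vector keeps element counts but loses connectivity, whereas the atom-and-bond description lets a structural query (here, a nitrogen bonded to two oxygens) be expressed directly.

```python
from collections import namedtuple

# Hypothetical relational records: atoms plus the bonds that connect them.
Atom = namedtuple("Atom", ["id", "element"])
Bond = namedtuple("Bond", ["a", "b", "order"])

# Toy example (assumption): heavy atoms of a C-NO2 fragment, hydrogens omitted.
atoms = [Atom(1, "C"), Atom(2, "N"), Atom(3, "O"), Atom(4, "O")]
bonds = [Bond(1, 2, 1), Bond(2, 3, 2), Bond(2, 4, 1)]


def attribute_description(atoms):
    """Flat attribute vector: element counts only, so connectivity is lost."""
    counts = {}
    for atom in atoms:
        counts[atom.element] = counts.get(atom.element, 0) + 1
    return counts


def has_nitro_like_group(atoms, bonds):
    """Relational query: is some nitrogen bonded to at least two oxygens?"""
    element = {atom.id: atom.element for atom in atoms}
    for atom in atoms:
        if atom.element != "N":
            continue
        # Collect the atoms at the other end of every bond touching this nitrogen.
        neighbours = [
            bond.b if bond.a == atom.id else bond.a
            for bond in bonds
            if atom.id in (bond.a, bond.b)
        ]
        if sum(element[n] == "O" for n in neighbours) >= 2:
            return True
    return False


print(attribute_description(atoms))
print(has_nitro_like_group(atoms, bonds))
```

Two molecules with the same element counts but different bonding would be indistinguishable to `attribute_description`, while the relational query separates them.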
Abstract:
We describe a fluorescence-based directed termination PCR (fluorescent DT–PCR) that allows accurate determination of actual sequence changes without dideoxy DNA sequencing. This is achieved using near infrared dye-labeled primers and performing two PCR reactions under low and unbalanced dNTP concentrations. Visualization of resulting termination fragments is accomplished with a dual dye Li-cor DNA sequencer. As each DT–PCR reaction generates two sets of terminating fragments, a pair of complementary reactions with limiting dATP and dCTP collectively provide information on the entire sequence of a target DNA, allowing an accurate determination of any base change. Blind analysis of 78 mutants of the supF reporter gene using fluorescent DT–PCR not only correctly determined the nature and position of all types of substitution mutations in the supF gene, but also allowed rapid scanning of the signature sequences among identical mutations. The method provides simplicity in the generation of terminating fragments and 100% accuracy in mutation characterization. Fluorescent DT–PCR was successfully used to generate a UV-induced spectrum of mutations in the supF gene following replication on a single plate of human DNA repair-deficient cells. We anticipate that the automated DT–PCR method will serve as a cost-effective alternative to dideoxy sequencing in studies involving large-scale analysis for nucleotide sequence changes.
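Why two complementary reactions can cover the whole sequence may be easier to see in a simplified model. The sketch below assumes, as an idealization not stated in the abstract, that extension stops wherever the limiting nucleotide would have to be incorporated, and that the two labeled primers report on the two strands separately. Under that assumption, limiting dATP marks A positions on one strand and T positions (read through the complementary strand) on the other, and limiting dCTP does the same for C and G. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def termination_positions(top_strand, limiting):
    """Idealized model: 1-based top-strand positions marked by each labeled strand.

    'top' collects positions where the top strand itself carries the limiting
    base; 'bottom' collects positions where the complementary strand does.
    """
    top = [i + 1 for i, base in enumerate(top_strand) if base == limiting]
    bottom = [
        i + 1 for i, base in enumerate(top_strand) if COMPLEMENT[base] == limiting
    ]
    return {"top": top, "bottom": bottom}


def covered_positions(top_strand):
    """Union of positions marked by the dATP-limited and dCTP-limited reactions."""
    covered = set()
    for limiting in ("A", "C"):
        fragment_sets = termination_positions(top_strand, limiting)
        covered.update(fragment_sets["top"])
        covered.update(fragment_sets["bottom"])
    return covered


sequence = "GATTACA"  # hypothetical example, not the supF gene
print(termination_positions(sequence, "A"))
print(termination_positions(sequence, "C"))
print(covered_positions(sequence) == set(range(1, len(sequence) + 1)))
```

In this model every position is marked exactly once across the four fragment sets, which is consistent with the abstract's statement that the pair of reactions collectively provides information on the entire target sequence.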
Abstract:
As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is available on the web at http://bioinfo.mbb.yale.edu/partslist and http://www.partslist.org. The system is based on the existing fold classifications and functions as a form of companion annotation for them, providing ‘global views’ of many already completed fold surveys. The central idea in the system is that of comparison through ranking; PartsList will rank the approximately 420 folds based on more than 180 attributes. These include: (i) occurrence in a number of completely sequenced genomes (e.g. it will show the most common folds in the worm versus yeast); (ii) occurrence in the structure databank (e.g. most common folds in the PDB); (iii) both absolute and relative gene expression information (e.g. most changing folds in expression over the cell cycle); (iv) protein–protein interactions, based on experimental data in yeast and comprehensive PDB surveys (e.g. most interacting fold); (v) sensitivity to inserted transposons; (vi) the number of functions associated with the fold (e.g. most multi-functional folds); (vii) amino acid composition (e.g. most Cys-rich folds); (viii) protein motions (e.g. most mobile folds); and (ix) the level of similarity based on a comprehensive set of structural alignments (e.g. most structurally variable folds). The integration of whole-genome expression and protein–protein interaction data with structural information is a particularly novel feature of our system. 
We provide three ways of visualizing the rankings: a profiler emphasizing the progression of high and low ranks across many pre-selected attributes, a dynamic comparer for custom comparisons and a numerical rankings correlator. These allow one to directly compare very different attributes of a fold (e.g. expression level, genome occurrence and maximum motion) in the uniform numerical format of ranks. This uniform framework, in turn, highlights the way that the frequency of many of the attributes falls off with approximate power-law behavior (i.e. according to V^(-b), for attribute value V and constant exponent b), with a few folds having large values and most having small values.
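The power-law fall-off can be made concrete with a small sketch. If the frequency follows f(V) = C * V^(-b), then log f is linear in log V with slope -b, so b can be estimated by an ordinary least-squares line on the log-log values. The data below are synthetic and generated from an assumed exponent; they are not taken from PartsList, and the function name is hypothetical.

```python
import math


def fit_power_law_exponent(values, frequencies):
    """Estimate b in f(V) = C * V**(-b) by least squares on log-log data."""
    xs = [math.log(v) for v in values]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return -slope  # the slope of the log-log line is -b


# Synthetic, noise-free data with an assumed exponent b = 1.5 and C = 100.
attribute_values = list(range(1, 21))
frequencies = [100.0 * v ** -1.5 for v in attribute_values]

b_estimate = fit_power_law_exponent(attribute_values, frequencies)
print(round(b_estimate, 6))
```

With real, noisy counts a log-log regression is only a rough estimate, which fits the abstract's wording of "approximate" power-law behavior.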
Abstract:
Ligand transport through myoglobin (Mb) has been observed by using optically heterodyne-detected transient grating spectroscopy. Experimental implementation using diffractive optics has provided unprecedented sensitivity for the study of protein motions by enabling the passive phase locking of the four beams that constitute the experiment, and an unambiguous separation of the Real and Imaginary parts of the signal. Ligand photodissociation of carboxymyoglobin (MbCO) induces a sequence of events involving the relaxation of the protein structure to accommodate ligand escape. These motions show up in the Real part of the signal. The ligand (CO) transport process involves an initial, small amplitude, change in volume, reflecting the transit time of the ligand through the protein, followed by a significantly larger volume change with ligand escape to the surrounding water. The latter process is well described by a single exponential process of 725 ± 15 ns at room temperature. The overall dynamics provide a distinctive signature that can be understood in the context of segmental protein fluctuations that aid ligand escape via a few specific cavities, and they suggest the existence of discrete escape pathways.