21 results for K-Fold Accuracy

in CentAUR: Central Archive University of Reading - UK


Relevance: 30.00%

Abstract:

The nuclear magnetic resonance (NMR) structure of a central segment of the previously annotated severe acute respiratory syndrome (SARS)-unique domain (SUD-M, for "middle of the SARS-unique domain") in SARS coronavirus (SARS-CoV) nonstructural protein 3 (nsp3) has been determined. SUD-M(513-651) exhibits a macrodomain fold containing the nsp3 residues 528 to 648, and there is a flexibly extended N-terminal tail with the residues 513 to 527 and a C-terminal flexible tail of residues 649 to 651. As a follow-up to this initial result, we also solved the structure of a construct representing only the globular domain of residues 527 to 651 [SUD-M(527-651)]. NMR chemical shift perturbation experiments showed that SUD-M(527-651) binds single-stranded poly(A) and identified the contact area with this RNA on the protein surface, and electrophoretic mobility shift assays then confirmed that SUD-M has higher affinity for purine bases than for pyrimidine bases. In a further search for clues to the function, we found that SUD-M(527-651) has the closest three-dimensional structure homology with another domain of nsp3, the ADP-ribose-1''-phosphatase nsp3b, although the two proteins share only 5% sequence identity in the homologous sequence regions. SUD-M(527-651) also shows three-dimensional structure homology with several helicases and nucleoside triphosphate-binding proteins, but it does not contain the motifs of catalytic residues found in these structural homologues. The combined results from NMR screening of potential substrates and the structure-based homology studies now form a basis for more focused investigations on the role of the SARS-unique domain in viral infection.

Relevance: 30.00%

Abstract:

The NMR structure of a central segment of the previously annotated "SARS-unique domain" (SUD-M; "middle of the SARS-unique domain") in the SARS coronavirus (SARS-CoV) non-structural protein 3 (nsp3) has been determined. SUD-M(513-651) exhibits a macrodomain fold containing the nsp3-residues 528-648, and there is a flexibly extended N-terminal tail with the residues 513-527 and a C-terminal flexible tail of residues 649-651. As a follow-up to this initial result, we also solved the structure of a construct representing only the globular domain of residues 527-651 [SUD-M(527-651)]. NMR chemical shift perturbation experiments showed that SUD-M(527-651) binds single-stranded poly-A and identified the contact area with this RNA on the protein surface, and electrophoretic mobility shift assays then confirmed that SUD-M has higher affinity for purine bases than for pyrimidine bases. In a further search for clues to the function, we found that SUD-M(527-651) has the closest three-dimensional structure homology with another domain of nsp3, the ADP-ribose-1''-phosphatase nsp3b, although the two proteins share only 5% sequence identity in the homologous sequence regions. SUD-M(527-651) also shows 3D structure homology with several helicases and NTP-binding proteins, but it does not contain the motifs of catalytic residues found in these structural homologues. The combined results from NMR screening of potential substrates and the structure-based homology studies now form a basis for more focused investigations on the role of the SARS-unique domain in viral infection.

Relevance: 30.00%

Abstract:

Background: Selecting the highest quality 3D model of a protein structure from a number of alternatives remains an important challenge in the field of structural bioinformatics. Many Model Quality Assessment Programs (MQAPs) have been developed which adopt various strategies in order to tackle this problem, ranging from the so-called "true" MQAPs capable of producing a single energy score based on a single model, to methods which rely on structural comparisons of multiple models or additional information from meta-servers. However, it is clear that no current method can separate the highest accuracy models from the lowest consistently. In this paper, a number of the top performing MQAP methods are benchmarked in the context of the potential value that they add to protein fold recognition. Two novel methods are also described: ModSSEA, which is based on the alignment of predicted secondary structure elements, and ModFOLD, which combines several true MQAP methods using an artificial neural network. Results: The ModSSEA method is found to be an effective model quality assessment program for ranking multiple models from many servers; however, further accuracy can be gained by using the consensus approach of ModFOLD. The ModFOLD method is shown to significantly outperform the true MQAPs tested and is competitive with methods which make use of clustering or additional information from multiple servers. Several of the true MQAPs are also shown to add value to most individual fold recognition servers by improving model selection, when applied as a post filter in order to re-rank models. Conclusion: MQAPs should be benchmarked appropriately for the practical context in which they are intended to be used. Clustering-based methods are the top performing MQAPs where many models are available from many servers; however, they often do not add value to individual fold recognition servers when limited models are available. 
Conversely, the true MQAP methods tested can often be used as effective post filters for re-ranking few models from individual fold recognition servers and further improvements can be achieved using a consensus of these methods.
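
The idea of re-ranking a server's models with a consensus of several quality scores can be sketched as follows. This is only an illustration: ModFOLD, as described above, combines its inputs with an artificial neural network, whereas this sketch swaps in a plain mean of per-method z-scores. The function name `consensus_rerank`, the model names, the score values, and the "higher is better" convention are all assumptions made for the example.

```python
from statistics import mean, pstdev


def consensus_rerank(scores):
    """Rank models by the mean of their per-method z-scores (higher is better).

    `scores` maps a model name to a list of quality scores, one per method.
    This is a simple stand-in for a consensus, not the ModFOLD neural network.
    """
    names = list(scores)
    n_methods = len(next(iter(scores.values())))
    consensus = {name: 0.0 for name in names}
    for m in range(n_methods):
        column = [scores[name][m] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            # A method that scores every model identically carries no ranking signal.
            z = 0.0 if sigma == 0 else (scores[name][m] - mu) / sigma
            consensus[name] += z / n_methods
    return sorted(names, key=lambda name: consensus[name], reverse=True)


# Hypothetical scores from two hypothetical single-model methods.
scores = {
    "model_1": [0.2, 0.3],
    "model_2": [0.8, 0.6],
    "model_3": [0.5, 0.7],
}
ranking = consensus_rerank(scores)
print(ranking)
```

Standardising each method's scores before averaging keeps a method with a wide numeric range from dominating the consensus; other combinations are equally plausible.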

Relevance: 30.00%

Abstract:

BACKGROUND: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. RESULTS: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. CONCLUSION: This study clearly demonstrates the feasibility of carrying out on-demand, high-quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.

Relevance: 30.00%

Abstract:

An increasing number of neuroscience experiments are using virtual reality to provide a more immersive and less artificial experimental environment. This is particularly useful to navigation and three-dimensional scene perception experiments. Such experiments require accurate real-time tracking of the observer's head in order to render the virtual scene. Here, we present data on the accuracy of a commonly used six degrees of freedom tracker (Intersense IS900) when it is moved in ways typical of virtual reality applications. We compared the reported location of the tracker with its location computed by an optical tracking method. When the tracker was stationary, the root mean square error in spatial accuracy was 0.64 mm. However, we found that errors increased over ten-fold (up to 17 mm) when the tracker moved at speeds common in virtual reality applications. We demonstrate that the errors we report here are predominantly due to inaccuracies of the IS900 system rather than the optical tracking against which it was compared. (c) 2006 Elsevier B.V. All rights reserved.
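
The root mean square error quoted above can be computed from paired position samples roughly as in the sketch below. It assumes the tracker and the optical reference report 3D positions in the same units and at matching time points; the function name `rms_error` and the sample values are illustrative and are not data from the study.

```python
import math


def rms_error(reported, reference):
    """Root mean square of the Euclidean distances between paired 3D points."""
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty sequences of equal length")
    squared_distances = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(reported, reference)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical samples in millimetres (not measurements from the paper).
reported = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (10.0, 3.0, 4.0), (20.0, 0.0, 5.0)]
print(rms_error(reported, reference))
```

Because the distances are squared before averaging, a few large excursions during fast movement raise the RMS error more than they would raise a plain mean error.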

Relevance: 30.00%

Abstract:

This work compares and contrasts results of classifying time-domain ECG signals with pathological conditions taken from the MIT-BIH arrhythmia database. Linear discriminant analysis and a multi-layer perceptron were used as classifiers. The neural network was trained by two different methods, namely back-propagation and a genetic algorithm. Converting the time-domain signal into the wavelet domain reduced the dimensionality of the problem at least 10-fold. This was achieved using wavelets from the db6 family as well as using adaptive wavelets generated using two different strategies. The wavelet transforms used in this study were limited to two decomposition levels. A neural network with evolved weights proved to be the best classifier with a maximum of 99.6% accuracy when optimised wavelet-transform ECG data was presented to its input and 95.9% accuracy when the signals presented to its input were decomposed using db6 wavelets. The linear discriminant analysis achieved a maximum classification accuracy of 95.7% when presented with optimised and 95.5% with db6 wavelet coefficients. It is shown that the much simpler signal representation of a few wavelet coefficients obtained through an optimised discrete wavelet transform considerably facilitates the task of classifying non-stationary, time-variant signals. In addition, the results indicate that wavelet optimisation may improve the classification ability of a neural network. (c) 2005 Elsevier B.V. All rights reserved.
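
A two-level discrete wavelet decomposition of the kind mentioned above can be sketched in a few lines. The sketch swaps in the Haar wavelet for brevity, not the db6 or adaptive wavelets used in the study, and it keeps only the level-2 approximation coefficients, which gives a 4-fold reduction. The abstract does not say which coefficients were retained to reach the reported 10-fold reduction, so that part is not reproduced. The names `haar_step` and `two_level_features` and the synthetic signal are assumptions.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def two_level_features(signal):
    """Keep only the level-2 approximation coefficients as a feature vector."""
    level1_approx, _ = haar_step(signal)
    level2_approx, _ = haar_step(level1_approx)
    return level2_approx


# Synthetic stand-in for one ECG segment (not MIT-BIH data).
beat = [math.sin(2.0 * math.pi * i / 64.0) for i in range(256)]
features = two_level_features(beat)
print(len(beat), len(features))  # 256 64
```

The shortened vector would then be passed to a classifier such as linear discriminant analysis or a multi-layer perceptron in place of the raw time-domain samples.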

Relevance: 30.00%

Abstract:

The correlated k-distribution (CKD) method is widely used in the radiative transfer schemes of atmospheric models and involves dividing the spectrum into a number of bands and then reordering the gaseous absorption coefficients within each one. The fluxes and heating rates for each band may then be computed by discretizing the reordered spectrum into of order 10 quadrature points per major gas and performing a monochromatic radiation calculation for each point. In this presentation it is shown that for clear-sky longwave calculations, sufficient accuracy for most applications can be achieved without the need for bands: reordering may be performed on the entire longwave spectrum. The resulting full-spectrum correlated k (FSCK) method requires significantly fewer monochromatic calculations than standard CKD to achieve a given accuracy. The concept is first demonstrated by comparing with line-by-line calculations for an atmosphere containing only water vapor, in which it is shown that the accuracy of heating-rate calculations improves approximately in proportion to the square of the number of quadrature points. For more than around 20 points, the root-mean-squared error flattens out at around 0.015 K/day due to the imperfect rank correlation of absorption spectra at different pressures in the profile. The spectral overlap of m different gases is treated by considering an m-dimensional hypercube where each axis corresponds to the reordered spectrum of one of the gases. This hypercube is then divided up into a number of volumes, each approximated by a single quadrature point, such that the total number of quadrature points is slightly fewer than the sum of the number that would be required to treat each of the gases separately. 
The gaseous absorptions for each quadrature point are optimized such that they minimize a cost function expressing the deviation of the heating rates and fluxes calculated by the FSCK method from line-by-line calculations for a number of training profiles. This approach is validated for atmospheres containing water vapor, carbon dioxide, and ozone, in which it is found that in the troposphere and most of the stratosphere, heating-rate errors of less than 0.2 K/day can be achieved using a total of 23 quadrature points, decreasing to less than 0.1 K/day for 32 quadrature points. It would be relatively straightforward to extend the method to include other gases.
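
The reordering step at the heart of a k-distribution can be illustrated with a toy calculation. The sketch below assumes a single gas on a homogeneous path, a synthetic absorption spectrum, equal-weight quadrature intervals, and the interval-mean absorption coefficient as the representative value; it computes band transmittance, not heating rates. It does not reproduce the optimized absorptions, the pressure dependence, or the multi-gas hypercube described above, and all names and numbers are illustrative.

```python
import math


def exact_transmittance(k, u):
    """Reference value: average exp(-k*u) over every spectral point."""
    return sum(math.exp(-ki * u) for ki in k) / len(k)


def reordered_transmittance(k, u, n_points):
    """Sort k, split it into n_points equal-weight intervals, and perform
    one monochromatic calculation per interval using the interval-mean k."""
    if len(k) % n_points:
        raise ValueError("len(k) must be divisible by n_points")
    k_sorted = sorted(k)
    size = len(k_sorted) // n_points
    total = 0.0
    for j in range(n_points):
        interval = k_sorted[j * size:(j + 1) * size]
        k_rep = sum(interval) / size
        total += math.exp(-k_rep * u)
    return total / n_points


# Synthetic, rapidly varying absorption spectrum (arbitrary units).
k = [math.exp(3.0 * math.sin(0.37 * i) + 2.0 * math.sin(0.011 * i))
     for i in range(1024)]
u = 0.5  # hypothetical absorber amount
exact = exact_transmittance(k, u)
for n in (2, 4, 8, 16, 32):
    print(n, abs(reordered_transmittance(k, u, n) - exact))
```

Sorting turns the jagged spectrum into a smooth, monotonic function of cumulative probability, which is why a handful of quadrature points can stand in for all 1024 spectral points in this toy case.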

Relevance: 30.00%

Abstract:

The correlated k-distribution (CKD) method is widely used in the radiative transfer schemes of atmospheric models, and involves dividing the spectrum into a number of bands and then reordering the gaseous absorption coefficients within each one. The fluxes and heating rates for each band may then be computed by discretizing the reordered spectrum into of order 10 quadrature points per major gas, and performing a pseudo-monochromatic radiation calculation for each point. In this paper it is first argued that for clear-sky longwave calculations, sufficient accuracy for most applications can be achieved without the need for bands: reordering may be performed on the entire longwave spectrum. The resulting full-spectrum correlated k (FSCK) method requires significantly fewer pseudo-monochromatic calculations than standard CKD to achieve a given accuracy. The concept is first demonstrated by comparing with line-by-line calculations for an atmosphere containing only water vapor, in which it is shown that the accuracy of heating-rate calculations improves approximately in proportion to the square of the number of quadrature points. For more than around 20 points, the root-mean-squared error flattens out at around 0.015 K d−1 due to the imperfect rank correlation of absorption spectra at different pressures in the profile. The spectral overlap of m different gases is treated by considering an m-dimensional hypercube where each axis corresponds to the reordered spectrum of one of the gases. This hypercube is then divided up into a number of volumes, each approximated by a single quadrature point, such that the total number of quadrature points is slightly fewer than the sum of the number that would be required to treat each of the gases separately. 
The gaseous absorptions for each quadrature point are optimized such that they minimize a cost function expressing the deviation of the heating rates and fluxes calculated by the FSCK method from line-by-line calculations for a number of training profiles. This approach is validated for atmospheres containing water vapor, carbon dioxide and ozone, in which it is found that in the troposphere and most of the stratosphere, heating-rate errors of less than 0.2 K d−1 can be achieved using a total of 23 quadrature points, decreasing to less than 0.1 K d−1 for 32 quadrature points. It would be relatively straightforward to extend the method to include other gases.

Relevance: 30.00%

Abstract:

Radial basis function networks can be trained quickly using linear optimisation once centres and other associated parameters have been initialised. The authors propose a small adjustment to a well-accepted initialisation algorithm which improves the network accuracy over a range of problems. The algorithm is described and results are presented.

Relevance: 30.00%

Abstract:

A number of new and newly improved methods for predicting protein structure developed by the Jones–University College London group were used to make predictions for the CASP6 experiment. Structures were predicted with a combination of fold recognition methods (mGenTHREADER, nFOLD, and THREADER) and a substantially enhanced version of FRAGFOLD, our fragment assembly method. Attempts at automatic domain parsing were made using DomPred and DomSSEA, which are based on a secondary structure parsing algorithm and additionally for DomPred, a simple local sequence alignment scoring function. Disorder prediction was carried out using a new SVM-based version of DISOPRED. Attempts were also made at domain docking and “microdomain” folding in order to build complete chain models for some targets.

Relevance: 30.00%

Abstract:

If secondary structure predictions are to be incorporated into fold recognition methods, an assessment of the effect of specific types of errors in predicted secondary structures on the sensitivity of fold recognition should be carried out. Here, we present a systematic comparison of different secondary structure prediction methods by measuring frequencies of specific types of error. We carry out an evaluation of the effect of specific types of error on secondary structure element alignment (SSEA), a baseline fold recognition method. The results of this evaluation indicate that missing out whole helix or strand elements, or predicting the wrong type of element, is more detrimental than predicting the wrong lengths of elements or overpredicting helix or strand. We also suggest that SSEA scoring is an effective method for assessing accuracy of secondary structure prediction and perhaps may also provide a more appropriate assessment of the “usefulness” and quality of predicted secondary structure, if secondary structure alignments are to be used in fold recognition.

Relevance: 30.00%

Abstract:

What constitutes a baseline level of success for protein fold recognition methods? As fold recognition benchmarks are often presented without any thought to the results that might be expected from a purely random set of predictions, an analysis of fold recognition baselines is long overdue. Given varying amounts of basic information about a protein—ranging from the length of the sequence to a knowledge of its secondary structure—to what extent can the fold be determined by intelligent guesswork? Can simple methods that make use of secondary structure information assign folds more accurately than purely random methods and could these methods be used to construct viable hierarchical classifications?
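
One way to make the "purely random" baseline concrete is to compute the accuracy expected from label-free guessing on a given benchmark. The sketch below covers three simple strategies: guessing uniformly among the fold classes, guessing in proportion to class frequency, and always guessing the most common class. The function name and the toy label list are assumptions; the abstract does not state which baselines the authors analysed.

```python
from collections import Counter


def random_baselines(labels):
    """Expected accuracy of three guessing strategies that ignore the query."""
    counts = Counter(labels)
    n = len(labels)
    freqs = [c / n for c in counts.values()]
    return {
        # Pick any class with equal probability: 1 / number of classes.
        "uniform": 1.0 / len(counts),
        # Pick class i with probability p_i: sum of p_i squared.
        "proportional": sum(p * p for p in freqs),
        # Always pick the largest class: max p_i.
        "majority": max(freqs),
    }


# Hypothetical benchmark: 10 targets spread over three fold classes.
labels = ["fold_a"] * 5 + ["fold_b"] * 3 + ["fold_c"] * 2
baselines = random_baselines(labels)
print(baselines)
```

On an unbalanced benchmark the majority-class figure can be far above the uniform one, so a reported fold recognition accuracy is best read against the strongest of these baselines.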