973 resultados para kernel method


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we aim at predicting protein structural classes for low-homology data sets based on predicted secondary structures. We propose a new and simple kernel method, named as SSEAKSVM, to predict protein structural classes. The secondary structures of all protein sequences are obtained by using the tool PSIPRED and then a linear kernel on the basis of secondary structure element alignment scores is constructed for training a support vector machine classifier without parameter adjusting. Our method SSEAKSVM was evaluated on two low-homology datasets 25PDB and 1189 with sequence homology being 25% and 40%, respectively. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies on these two data sets are 86.3% and 84.5%, respectively, which are higher than those obtained by other existing methods. Especially, our method achieves higher accuracies (88.1% and 88.5%) for differentiating the α + β class and the α/β class compared to other methods. This suggests that our method is valuable to predict protein structural classes particularly for low-homology protein sequences. The source code of the method in this paper can be downloaded at http://math.xtu.edu.cn/myphp/math/research/source/SSEAK_source_code.rar.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Error estimates for the error reproducing kernel method (ERKM) are provided. The ERKM is a mesh-free functional approximation scheme [A. Shaw, D. Roy, A NURBS-based error reproducing kernel method with applications in solid mechanics, Computational Mechanics (2006), to appear (available online)], wherein a targeted function and its derivatives are first approximated via non-uniform rational B-splines (NURBS) basis function. Errors in the NURBS approximation are then reproduced via a family of non-NURBS basis functions, constructed using a polynomial reproduction condition, and added to the NURBS approximation of the function obtained in the first step. In addition to the derivation of error estimates, convergence studies are undertaken for a couple of test boundary value problems with known exact solutions. The ERKM is next applied to a one-dimensional Burgers equation where, time evolution leads to a breakdown of the continuous solution and the appearance of a shock. Many available mesh-free schemes appear to be unable to capture this shock without numerical instability. However, given that any desired order of continuity is achievable through NURBS approximations, the ERKM can even accurately approximate functions with discontinuous derivatives. Moreover, due to the variation diminishing property of NURBS, it has advantages in representing sharp changes in gradients. This paper is focused on demonstrating this ability of ERKM via some numerical examples. Comparisons of some of the results with those via the standard form of the reproducing kernel particle method (RKPM) demonstrate the relative numerical advantages and accuracy of the ERKM.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Background - Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, causing ambiguity in the positional alignment between the groove and peptide, as well as creating uncertainty as to what parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that can handle variable length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. Results - The kernel approach presented here shows increased prediction accuracy with a significantly higher number of true positives and negatives on multiple MHC class II alleles, when testing data sets from MHCPEP [1], MCHBN [2], and MHCBench [3]. Evaluation by cross validation, when segregating binders and non-binders, produced an average of 0.824 AROC for the MHCBench data sets (up from 0.756), and an average of 0.96 AROC for multiple alleles of the MHCPEP database. Conclusion - The method improves performance over existing state-of-the-art methods of MHC class II peptide binding predictions by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other dynamic sequence problems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

With the advent of Service Oriented Architecture, Web Services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. By considering the semantic relationships of words used in describing the services as well as the use of input and output parameters can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on the large quantity of text documents covering diverse areas of domain of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an allpair shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase which is the system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine which is an integral part of the system integration phase makes the final recommendations including individual and composite Web services to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web services compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study investigates the potential of Relevance Vector Machine (RVM)-based approach to predict the ultimate capacity of laterally loaded pile in clay. RVM is a sparse approximate Bayesian kernel method. It can be seen as a probabilistic version of support vector machine. It provides much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. RVM model outperforms the two other models based on root-mean-square-error (RMSE) and mean-absolute-error (MAE) performance criteria. It also stimates the prediction variance. The results presented in this paper clearly highlight that the RVM is a robust tool for prediction Of ultimate capacity of laterally loaded piles in clay.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A method was developed for relative radiometric calibration of single multitemporal Landsat TM image, several multitemporal images covering each others, and several multitemporal images covering different geographic locations. The radiometricly calibrated difference images were used for detecting rapid changes on forest stands. The nonparametric Kernel method was applied for change detection. The accuracy of the change detection was estimated by inspecting the image analysis results in field. The change classification was applied for controlling the quality of the continuously updated forest stand information. The aim was to ensure that all the manmade changes and any forest damages were correctly updated including the attribute and stand delineation information. The image analysis results were compared with the registered treatments and the stand information base. The stands with discrepancies between these two information sources were recommended to be field inspected.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

在深入分析敏感信息过滤任务的特点和难点的基础上,针对现有一般的信息过滤方法的不足,提出了一种利用敏感词的组合信息来改进过滤效果的思想.进而,研究了在核方法的框架下特征共现行为建模的原则并提出了复合ANOVA核来刻画特征组合行为.通过真实信息过滤环境中的测试评估,显示了此敏感信息过滤方法的有效性.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

For two multinormal populations with equal covariance matrices the likelihood ratio discriminant function, an alternative allocation rule to the sample linear discriminant function when n1 ≠ n2 ,is studied analytically. With the assumption of a known covariance matrix its distribution is derived and the expectation of its actual and apparent error rates evaluated and compared with those of the sample linear discriminant function. This comparison indicates that the likelihood ratio allocation rule is robust to unequal sample sizes. The quadratic discriminant function is studied, its distribution reviewed and evaluation of its probabilities of misclassification discussed. For known covariance matrices the distribution of the sample quadratic discriminant function is derived. When the known covariance matrices are proportional exact expressions for the expectation of its actual and apparent error rates are obtained and evaluated. The effectiveness of the sample linear discriminant function for this case is also considered. Estimation of true log-odds for two multinormal populations with equal or unequal covariance matrices is studied. The estimative, Bayesian predictive and a kernel method are compared by evaluating their biases and mean square errors. Some algebraic expressions for these quantities are derived. With equal covariance matrices the predictive method is preferable. Where it derives this superiority is investigated by considering its performance for various levels of fixed true log-odds. It is also shown that the predictive method is sensitive to n1 ≠ n2. For unequal but proportional covariance matrices the unbiased estimative method is preferred. Product Normal kernel density estimates are used to give a kernel estimator of true log-odds. The effect of correlation in the variables with product kernels is considered. With equal covariance matrices the kernel and parametric estimators are compared by simulation. For moderately correlated variables and large dimension sizes the product kernel method is a good estimator of true log-odds.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Ce mémoire porte sur la présentation des estimateurs de Bernstein qui sont des alternatives récentes aux différents estimateurs classiques de fonctions de répartition et de densité. Plus précisément, nous étudions leurs différentes propriétés et les comparons à celles de la fonction de répartition empirique et à celles de l'estimateur par la méthode du noyau. Nous déterminons une expression asymptotique des deux premiers moments de l'estimateur de Bernstein pour la fonction de répartition. Comme pour les estimateurs classiques, nous montrons que cet estimateur vérifie la propriété de Chung-Smirnov sous certaines conditions. Nous montrons ensuite que l'estimateur de Bernstein est meilleur que la fonction de répartition empirique en terme d'erreur quadratique moyenne. En s'intéressant au comportement asymptotique des estimateurs de Bernstein, pour un choix convenable du degré du polynôme, nous montrons que ces estimateurs sont asymptotiquement normaux. Des études numériques sur quelques distributions classiques nous permettent de confirmer que les estimateurs de Bernstein peuvent être préférables aux estimateurs classiques.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The estimation of the long-term wind resource at a prospective site based on a relatively short on-site measurement campaign is an indispensable task in the development of a commercial wind farm. The typical industry approach is based on the measure-correlate-predict �MCP� method where a relational model between the site wind velocity data and the data obtained from a suitable reference site is built from concurrent records. In a subsequent step, a long-term prediction for the prospective site is obtained from a combination of the relational model and the historic reference data. In the present paper, a systematic study is presented where three new MCP models, together with two published reference models �a simple linear regression and the variance ratio method�, have been evaluated based on concurrent synthetic wind speed time series for two sites, simulating the prospective and the reference site. The synthetic method has the advantage of generating time series with the desired statistical properties, including Weibull scale and shape factors, required to evaluate the five methods under all plausible conditions. In this work, first a systematic discussion of the statistical fundamentals behind MCP methods is provided and three new models, one based on a nonlinear regression and two �termed kernel methods� derived from the use of conditional probability density functions, are proposed. All models are evaluated by using five metrics under a wide range of values of the correlation coefficient, the Weibull scale, and the Weibull shape factor. Only one of all models, a kernel method based on bivariate Weibull probability functions, is capable of accurately predicting all performance metrics studied.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Although most raptor species are found mainly in the tropics, information on their home range and spatial requirements in the Neotropics is still scarce. In this study, we used radio telemetry to evaluate the home range and the habitat use and selection of five Roadside hawks, Rupornis magnirostris (Gmelin, 1788) in a heterogeneous landscape in southeastern Brazil. The average home range size calculated using the adaptive kernel method (95% isopleth) was 126.1ha (47.4-266.7ha), but using the minimum convex polygon method (95% isopleth) it was 143.54ha (32.6-382.3ha). The roadside hawk explored a wide variety of habitats, most of them opportunistically, as suggested in the literature. Despite this, habitat quality could influence home range size and promote habitat selection. The observation of habitat use as expected, as well as the relatively small home range size, could be related to the generalist/opportunistic behaviour of the roadside hawk.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The subtracted kernel method is implemented recursively to solve scattering equations for the S-1(0) phase-shifts considering the leading and the next-to-leading order NN interaction.