53 resultados para SVM (Support Vector Machine)

em QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Support vector machine (SVM) is a powerful technique for data classification. Despite of its good theoretic foundations and high classification accuracy, normal SVM is not suitable for classification of large data sets, because the training complexity of SVM is highly dependent on the size of data set. This paper presents a novel SVM classification approach for large data sets by using minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the centers of the clusters are used for the first time SVM classification. Then we use the clusters whose centers are support vectors or those clusters which have different classes to perform the second time SVM classification. In this stage most data are removed. Several experimental results show that the approach proposed in this paper has good classification accuracy compared with classic SVM while the training is significantly faster than several other SVM classifiers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a new hierarchical learning structure, namely the holistic triple learning (HTL), for extending the binary support vector machine (SVM) to multi-classification problems. For an N-class problem, a HTL constructs a decision tree up to a depth of A leaf node of the decision tree is allowed to be placed with a holistic triple learning unit whose generalisation abilities are assessed and approved. Meanwhile, the remaining nodes in the decision tree each accommodate a standard binary SVM classifier. The holistic triple classifier is a regression model trained on three classes, whose training algorithm is originated from a recently proposed implementation technique, namely the least-squares support vector machine (LS-SVM). A major novelty with the holistic triple classifier is the reduced number of support vectors in the solution. For the resultant HTL-SVM, an upper bound of the generalisation error can be obtained. The time complexity of training the HTL-SVM is analysed, and is shown to be comparable to that of training the one-versus-one (1-vs.-1) SVM, particularly on small-scale datasets. Empirical studies show that the proposed HTL-SVM achieves competitive classification accuracy with a reduced number of support vectors compared to the popular 1-vs-1 alternative.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Retinopathy of prematurity (ROP) is a rare disease in which retinal blood vessels of premature infants fail to develop normally, and is one of the major causes of childhood blindness throughout the world. The Discrete Conditional Phase-type (DC-Ph) model consists of two components, the conditional component measuring the inter-relationships between covariates and the survival component which models the survival distribution using a Coxian phase-type distribution. This paper expands the DC-Ph models by introducing a support vector machine (SVM), in the role of the conditional component. The SVM is capable of classifying multiple outcomes and is used to identify the infant's risk of developing ROP. Class imbalance makes predicting rare events difficult. A new class decomposition technique, which deals with the problem of multiclass imbalance, is introduced. Based on the SVM classification, the length of stay in the neonatal ward is modelled using a 5, 8 or 9 phase Coxian distribution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main objective of the study presented in this paper was to investigate the feasibility using support vector machines (SVM) for the prediction of the fresh properties of self-compacting concrete. The radial basis function (RBF) and polynomial kernels were used to predict these properties as a function of the content of mix components. The fresh properties were assessed with the slump flow, T50, T60, V-funnel time, Orimet time, and blocking ratio (L-box). The retention of these tests was also measured at 30 and 60 min after adding the first water. The water dosage varied from 188 to 208 L/m3, the dosage of superplasticiser (SP) from 3.8 to 5.8 kg/m3, and the volume of coarse aggregates from 220 to 360 L/m3. In total, twenty mixes were used to measure the fresh state properties with different mixture compositions. RBF kernel was more accurate compared to polynomial kernel based support vector machines with a root mean square error (RMSE) of 26.9 (correlation coefficient of R2 = 0.974) for slump flow prediction, a RMSE of 0.55 (R2 = 0.910) for T50 (s) prediction, a RMSE of 1.71 (R2 = 0.812) for T60 (s) prediction, a RMSE of 0.1517 (R2 = 0.990) for V-funnel time prediction, a RMSE of 3.99 (R2 = 0.976) for Orimet time prediction, and a RMSE of 0.042 (R2 = 0.988) for L-box ratio prediction, respectively. A sensitivity analysis was performed to evaluate the effects of the dosage of cement and limestone powder, the water content, the volumes of coarse aggregate and sand, the dosage of SP and the testing time on the predicted test responses. The analysis indicates that the proposed SVM RBF model can gain a high precision, which provides an alternative method for predicting the fresh properties of SCC.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The peptides derived from envelope proteins have been shown to inhibit the protein-protein interactions in the virus membrane fusion process and thus have a great potential to be developed into effective antiviral therapies. There are three types of envelope proteins each exhibiting distinct structure folds. Although the exact fusion mechanism remains elusive, it was suggested that the three classes of viral fusion proteins share a similar mechanism of membrane fusion. The common mechanism of action makes it possible to correlate the properties of self-derived peptide inhibitors with their activities. Here we developed a support vector machine model using sequence-based statistical scores of self-derived peptide inhibitors as input features to correlate with their activities. The model displayed 92% prediction accuracy with the Matthew’s correlation coefficient of 0.84, obviously superior to those using physicochemical properties and amino acid decomposition as input. The predictive support vector machine model for self- derived peptides of envelope proteins would be useful in development of antiviral peptide inhibitors targeting the virus fusion process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research presents a fast algorithm for projected support vector machines (PSVM) by selecting a basis vector set (BVS) for the kernel-induced feature space, the training points are projected onto the subspace spanned by the selected BVS. A standard linear support vector machine (SVM) is then produced in the subspace with the projected training points. As the dimension of the subspace is determined by the size of the selected basis vector set, the size of the produced SVM expansion can be specified. A two-stage algorithm is derived which selects and refines the basis vector set achieving a locally optimal model. The model expansion coefficients and bias are updated recursively for increase and decrease in the basis set and support vector set. The condition for a point to be classed as outside the current basis vector and selected as a new basis vector is derived and embedded in the recursive procedure. This guarantees the linear independence of the produced basis set. The proposed algorithm is tested and compared with an existing sparse primal SVM (SpSVM) and a standard SVM (LibSVM) on seven public benchmark classification problems. Our new algorithm is designed for use in the application area of human activity recognition using smart devices and embedded sensors where their sometimes limited memory and processing resources must be exploited to the full and the more robust and accurate the classification the more satisfied the user. Experimental results demonstrate the effectiveness and efficiency of the proposed algorithm. This work builds upon a previously published algorithm specifically created for activity recognition within mobile applications for the EU Haptimap project [1]. The algorithms detailed in this paper are more memory and resource efficient making them suitable for use with bigger data sets and more easily trained SVMs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper formulates a linear kernel support vector machine (SVM) as a regularized least-squares (RLS) problem. By defining a set of indicator variables of the errors, the solution to the RLS problem is represented as an equation that relates the error vector to the indicator variables. Through partitioning the training set, the SVM weights and bias are expressed analytically using the support vectors. It is also shown how this approach naturally extends to Sums with nonlinear kernels whilst avoiding the need to make use of Lagrange multipliers and duality theory. A fast iterative solution algorithm based on Cholesky decomposition with permutation of the support vectors is suggested as a solution method. The properties of our SVM formulation are analyzed and compared with standard SVMs using a simple example that can be illustrated graphically. The correctness and behavior of our solution (merely derived in the primal context of RLS) is demonstrated using a set of public benchmarking problems for both linear and nonlinear SVMs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As a promising method for pattern recognition and function estimation, least squares support vector machines (LS-SVM) express the training in terms of solving a linear system instead of a quadratic programming problem as for conventional support vector machines (SVM). In this paper, by using the information provided by the equality constraint, we transform the minimization problem with a single equality constraint in LS-SVM into an unconstrained minimization problem, then propose reduced formulations for LS-SVM. By introducing this transformation, the times of using conjugate gradient (CG) method, which is a greatly time-consuming step in obtaining the numerical solution, are reduced to one instead of two as proposed by Suykens et al. (1999). The comparison on computational speed of our method with the CG method proposed by Suykens et al. and the first order and second order SMO methods on several benchmark data sets shows a reduction of training time by up to 44%. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

N-gram analysis is an approach that investigates the structure of a program using bytes, characters, or text strings. A key issue with N-gram analysis is feature selection amidst the explosion of features that occurs when N is increased. The experiments within this paper represent programs as operational code (opcode) density histograms gained through dynamic analysis. A support vector machine is used to create a reference model, which is used to evaluate two methods of feature reduction, which are 'area of intersect' and 'subspace analysis using eigenvectors.' The findings show that the relationships between features are complex and simple statistics filtering approaches do not provide a viable approach. However, eigenvector subspace analysis produces a suitable filter.