867 results for least square-support vector machine


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose a unified data modeling approach that is equally applicable to supervised regression and classification applications, as well as to unsupervised probability density function estimation. A particle swarm optimization (PSO) aided orthogonal forward regression (OFR) algorithm based on leave-one-out (LOO) criteria is developed to construct parsimonious radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines the center vector and diagonal covariance matrix of one RBF node by minimizing the LOO statistics. For regression applications, the LOO criterion is chosen to be the LOO mean square error, while the LOO misclassification rate is adopted in two-class classification applications. By adopting the Parzen window estimate as the desired response, the unsupervised density estimation problem is transformed into a constrained regression problem. This PSO aided OFR algorithm for tunable-node RBF networks is capable of constructing very parsimonious RBF models that generalize well, and our analysis and experimental results demonstrate that the algorithm is computationally even simpler than the efficient regularization assisted orthogonal least square algorithm based on LOO criteria for selecting fixed-node RBF models. Another significant advantage of the proposed learning procedure is that it does not have learning hyperparameters that have to be tuned using costly cross validation. The effectiveness of the proposed PSO aided OFR construction procedure is illustrated using several examples taken from regression and classification, as well as density estimation applications.
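To make the LOO mean square error criterion concrete, here is a minimal sketch in Python with NumPy. It assumes a Gaussian RBF model with diagonal covariance and plain (unregularised) least-squares weights; the function names `rbf_design`, `loo_mse`, and `loo_mse_brute_force` are hypothetical and not taken from the paper. It relies on the standard identity that, for a linear-in-the-weights model, the LOO residual equals the ordinary residual divided by one minus the leverage, so no refitting is required. The PSO search and the orthogonal forward regression steps of the abstract are not reproduced here.

```python
import numpy as np

def rbf_design(X, centers, widths):
    """Gaussian RBF design matrix; each node has its own diagonal covariance.

    X: (n, d) inputs, centers: (m, d), widths: (m, d) per-dimension std devs.
    """
    diff = (X[:, None, :] - centers[None, :, :]) / widths[None, :, :]
    return np.exp(-0.5 * np.sum(diff ** 2, axis=2))

def loo_mse(P, y):
    """LOO mean square error of the least-squares fit y ~ P w, without refitting."""
    w, *_ = np.linalg.lstsq(P, y, rcond=None)
    residuals = y - P @ w
    # Leverages h_ii are the squared row norms of Q in the reduced QR of P.
    Q, _ = np.linalg.qr(P)
    leverage = np.sum(Q ** 2, axis=1)
    loo_residuals = residuals / (1.0 - leverage)
    return float(np.mean(loo_residuals ** 2))

def loo_mse_brute_force(P, y):
    """Reference implementation: refit n times, each time leaving one sample out."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        w, *_ = np.linalg.lstsq(P[keep], y[keep], rcond=None)
        errors.append(y[i] - P[i] @ w)
    return float(np.mean(np.square(errors)))
```

In a construction loop of the kind the abstract describes, a search procedure would propose a centre and width vector for a candidate node and keep the candidate with the smallest `loo_mse`; that loop is left out of this sketch.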

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We develop a particle swarm optimisation (PSO) aided orthogonal forward regression (OFR) approach for constructing radial basis function (RBF) classifiers with tunable nodes. At each stage of the OFR construction process, the centre vector and diagonal covariance matrix of one RBF node are determined efficiently by minimising the leave-one-out (LOO) misclassification rate (MR) using a PSO algorithm. Compared with the state-of-the-art regularisation assisted orthogonal least square algorithm based on the LOO MR for selecting fixed-node RBF classifiers, the proposed PSO aided OFR algorithm for constructing tunable-node RBF classifiers offers significant advantages in terms of better generalisation performance and smaller model size, and it imposes lower computational complexity in the classifier construction process. Moreover, the proposed algorithm does not have any hyperparameter that requires costly tuning based on cross validation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A very efficient learning algorithm for model subset selection is introduced based on a new composite cost function that simultaneously optimizes the model approximation ability and model robustness and adequacy. The derived model parameters are estimated via forward orthogonal least squares, but the model subset selection cost function includes a D-optimality design criterion that maximizes the determinant of the design matrix of the subset to ensure the robustness, adequacy, and parsimony of the final model. The proposed approach is based on the forward orthogonal least square (OLS) algorithm, such that a new D-optimality-based cost function is constructed based on the orthogonalization process to gain computational advantages and hence to maintain the inherent advantage of computational efficiency associated with the conventional forward OLS approach. Illustrative examples are included to demonstrate the effectiveness of the new approach.
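The reason a D-optimality term is cheap to evaluate inside an orthogonalization process can be illustrated with a short Python sketch. It assumes that "the determinant of the design matrix" refers to det(PᵀP) for a regressor matrix P, which is an interpretation rather than something the abstract states; the helper names are hypothetical. After Gram-Schmidt orthogonalization P = WA with A unit upper triangular, so det(PᵀP) equals the product of the squared norms of the orthogonal columns of W, and its logarithm is a running sum that grows by one term per selected regressor.

```python
import numpy as np

def gram_schmidt(P):
    """Classical Gram-Schmidt: returns W with mutually orthogonal (not normalised) columns."""
    n, m = P.shape
    W = np.zeros((n, m))
    for k in range(m):
        w = P[:, k].copy()
        for j in range(k):
            # Remove the component of column k that lies along orthogonal column j.
            w -= (W[:, j] @ P[:, k]) / (W[:, j] @ W[:, j]) * W[:, j]
        W[:, k] = w
    return W

def log_det_design(P):
    """log det(P^T P) computed as the sum of log squared norms of the orthogonal columns."""
    W = gram_schmidt(P)
    return float(np.sum(np.log(np.sum(W ** 2, axis=0))))
```

A forward selection step could therefore score each candidate regressor by its error reduction plus a weighted `log(w @ w)` term without ever forming a determinant explicitly; the exact weighting used in the paper is not given in the abstract and is not assumed here.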

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recently, major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for distributed memory systems of the frequent subgraph mining problem.
This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and the customization of software packages to facilitate a top-down parallelism in the tuning of Support Vector Machines (SVM), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a novel adaptive noise cancellation (ANC) system with a fast tunable radial basis function (RBF) network. The weight coefficients of the RBF network are adapted by the multi-innovation recursive least square (MRLS) algorithm. If the RBF network performs poorly despite the weight adaptation, an insignificant node with little contribution to the overall performance is replaced with a new node without changing the model size. Otherwise, the RBF network structure remains unchanged and only the weight vector is adapted. The simulation results show that the proposed approach can effectively cancel the noise in both stationary and nonstationary ANC systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose a novel online modeling algorithm for nonlinear and nonstationary systems using a radial basis function (RBF) neural network with a fixed number of hidden nodes. Each of the RBF basis functions has a tunable center vector and an adjustable diagonal covariance matrix. A multi-innovation recursive least square (MRLS) algorithm is applied to update the weights of the RBF network online, while the modeling performance is monitored. When the modeling residual of the RBF network becomes large in spite of the weight adaptation, a node identified as insignificant is replaced with a new node, for which the tunable center vector and diagonal covariance matrix are optimized using the quantum particle swarm optimization (QPSO) algorithm. The major contribution is to combine the MRLS weight adaptation and QPSO node structure optimization in an innovative way, so that the model can closely track the local characteristics of the nonstationary system while remaining very sparse. Simulation results show that the proposed algorithm has significantly better performance than existing approaches.
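The online weight adaptation can be illustrated with a short Python sketch. Note that it implements the ordinary single-innovation recursive least squares update with a forgetting factor, not the multi-innovation variant (MRLS) named in the abstract, whose details are not given there. The class name, the forgetting factor, and the initial covariance scale `delta` are all assumptions made for illustration. In the setting of the abstract, `phi` would be the vector of RBF node outputs for the current input.

```python
import numpy as np

class RecursiveLeastSquares:
    """Standard RLS with exponential forgetting (a stand-in, not the paper's MRLS)."""

    def __init__(self, n_weights, forgetting=0.99, delta=1e3):
        self.w = np.zeros(n_weights)       # weight vector
        self.P = delta * np.eye(n_weights)  # inverse correlation matrix estimate
        self.lam = forgetting

    def update(self, phi, y):
        """Process one sample (phi, y) and return the a priori prediction error."""
        error = y - phi @ self.w
        gain = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.w = self.w + gain * error
        # P is symmetric, so phi @ P is the row vector phi^T P.
        self.P = (self.P - np.outer(gain, phi @ self.P)) / self.lam
        return error
```

Monitoring the returned error over a sliding window is one simple way to decide when weight adaptation alone is no longer sufficient and a node should be replaced; the actual criterion used by the authors is not specified in the abstract.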

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a novel on-line learning approach for radial basis function (RBF) neural networks. Based on an RBF network with individually tunable nodes and a fixed small model size, the weight vector is adjusted on-line using the multi-innovation recursive least square algorithm. When the residual error of the RBF network becomes large despite the weight adaptation, an insignificant node with little contribution to the overall system is replaced by a new node. Structural parameters of the new node are optimized by the proposed fast algorithms in order to significantly improve the modeling performance. The proposed scheme offers a novel, flexible, and fast approach to on-line system identification problems. Simulation results show that the proposed approach can significantly outperform existing ones, for nonstationary systems in particular.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Various popular machine learning techniques, like support vector machines, were originally conceived for the solution of two-class (binary) classification problems. However, a large number of real problems present more than two classes. A common approach to generalize binary learning techniques to solve problems with more than two classes, also known as multiclass classification problems, consists of hierarchically decomposing the multiclass problem into multiple binary sub-problems, whose outputs are combined to define the predicted class. This strategy results in a tree of binary classifiers, where each internal node corresponds to a binary classifier distinguishing two groups of classes and the leaf nodes correspond to the problem classes. This paper investigates how measures of the separability between classes can be employed in the construction of binary-tree-based multiclass classifiers, adapting the decompositions performed to each particular multiclass problem. (C) 2010 Elsevier B.V. All rights reserved.
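A minimal Python sketch of a binary-tree multiclass decomposition follows. Everything specific in it is an assumption made for illustration: the separability measure is the Euclidean distance between class centroids, classes are grouped around the two most distant centroids, and each internal node uses a nearest-group-mean rule as a stand-in for a real binary classifier such as an SVM. The function names are hypothetical, and the paper's own separability measures are not reproduced.

```python
import numpy as np

def split_classes(centroids):
    """Partition class labels into two groups around the two most distant centroids."""
    labels = sorted(centroids)
    # Seeds: the pair of classes whose centroids are farthest apart.
    a, b = max(
        ((p, q) for i, p in enumerate(labels) for q in labels[i + 1:]),
        key=lambda pq: np.linalg.norm(centroids[pq[0]] - centroids[pq[1]]),
    )
    left, right = [a], [b]
    for c in labels:
        if c in (a, b):
            continue
        da = np.linalg.norm(centroids[c] - centroids[a])
        db = np.linalg.norm(centroids[c] - centroids[b])
        (left if da <= db else right).append(c)
    return left, right

def build_tree(X, y, classes=None):
    """Recursively build the tree; leaves hold a single class label."""
    classes = sorted(set(y.tolist())) if classes is None else classes
    if len(classes) == 1:
        return {"leaf": classes[0]}
    centroids = {c: X[y == c].mean(axis=0) for c in classes}
    left, right = split_classes(centroids)
    return {
        # Stand-in binary classifier: compare distances to the two group means.
        "left_mean": X[np.isin(y, left)].mean(axis=0),
        "right_mean": X[np.isin(y, right)].mean(axis=0),
        "left": build_tree(X, y, left),
        "right": build_tree(X, y, right),
    }

def predict(tree, x):
    """Descend from the root, applying one binary decision per internal node."""
    while "leaf" not in tree:
        d_left = np.linalg.norm(x - tree["left_mean"])
        d_right = np.linalg.norm(x - tree["right_mean"])
        tree = tree["left"] if d_left <= d_right else tree["right"]
    return tree["leaf"]
```

With k classes the tree contains k - 1 binary classifiers, and a prediction evaluates only the classifiers along one root-to-leaf path.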

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Several real problems involve the classification of data into categories or classes. Given a data set containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques were originally conceived for the solution of problems with only two classes, also named binary classification problems. However, many problems require the discrimination of examples into more than two categories or classes. This paper presents a survey of the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided into multiple binary subproblems, whose results are combined. Generally, the performance of SVM classifiers is affected by the selection of values for their parameters. This paper investigates the use of genetic algorithms (GAs) to tune the parameters of the binary SVMs in common multiclass decompositions. The developed GA may search for a set of parameter values common to all binary classifiers or for differentiated values for each binary classifier. (C) 2008 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Several popular Machine Learning techniques were originally designed for the solution of two-class problems. However, several classification problems have more than two classes. One approach to deal with multiclass problems using binary classifiers is to decompose the multiclass problem into multiple binary sub-problems arranged in a binary tree. This approach requires a binary partition of the classes for each node of the tree, which defines the tree structure. This paper presents two algorithms to determine the tree structure, taking into account information collected from the dataset used. This approach allows the tree structure to be determined automatically for any multiclass dataset.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Robotic mapping is the process of automatically constructing an environment representation using mobile robots. We address the problem of semantic mapping, which consists of using mobile robots to create maps that represent not only metric occupancy but also other properties of the environment. Specifically, we develop techniques to build maps that represent activity and navigability of the environment. Our approach to semantic mapping is to combine machine learning techniques with standard mapping algorithms. Supervised learning methods are used to automatically associate properties of space to the desired classification patterns. We present two methods, the first based on hidden Markov models and the second on support vector machines. Both approaches have been tested and experimentally validated in two problem domains: terrain mapping and activity-based mapping.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In recent decades, the public sector has come under pressure to improve its performance, and Information Technology (IT) has increasingly been used as a tool to reach that goal. It has therefore become important for public organizations, particularly institutions of higher education, to determine which factors influence the acceptance and use of technology, since these affect the success of its implementation and the desired organizational results. The Technology Acceptance Model (TAM), which is based on the constructs perceived usefulness and perceived ease of use, was used as the basis for this study. However, because integrated management systems are complex to implement, organizational factors were added to better explain the acceptance of such systems. Thus, five constructs related to critical success factors in implementing ERP systems were added to the TAM: top management support, communication, training, cooperation, and technological complexity (BUENO and SALMERON, 2008). This leads to the following research problem: what factors influence the acceptance and use of the SIE academic module at the Federal University of Pará, from the perspective of teaching and technical staff users? The purpose of this study was to identify the influence of organizational factors and behavioral antecedents on the behavioral intention to use the SIE academic module at UFPA, from the perspective of teaching and technical staff users. This is applied, exploratory, descriptive, and quantitative research based on a survey; data were collected through a structured questionnaire applied to a sample of 229 teachers and 30 technical and administrative staff. Data analysis was carried out through descriptive statistics and structural equation modeling with the partial least squares (PLS) technique.
First, the measurement model was assessed, verifying reliability and convergent and discriminant validity for all indicators and constructs. Then the structural model was analyzed using the bootstrap resampling technique. In assessing statistical significance, all hypotheses were supported. The coefficient of determination (R²) was high or medium for five of the six endogenous variables, and the model explains 47.3% of the variation in behavioral intention. Among the antecedents of behavioral intention (BI) analyzed in this study, perceived usefulness has the greatest effect on behavioral intention, followed by perceived ease of use (PEU) and attitude (AT). Among the organizational aspects (critical success factors) studied, technological complexity (TC) and training (ERT) had the greatest effect on behavioral intention to use, although these effects were lower than those produced by the behavioral factors originating from the TAM. Top management support (TMS) showed the least effect of all variables on the intention to use (BI), followed by communication (COM) and cooperation (CO), which also exert a low effect on behavioral intention (BI). Therefore, as in other studies, the TAM constructs were adequate for the present research. The study thus contributes evidence that the Technology Acceptance Model can be applied to predict the acceptance of integrated management systems, even in the public sector. Keywords: Technology

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences in worldwide databases. Gene expression in prokaryotes initiates when the RNA polymerase enzyme interacts with DNA regions called promoters, where the main regulatory elements of the transcription process are located. Despite the improvement of in vitro techniques for molecular biology analysis, characterizing and identifying a great number of promoters in a genome is a complex task. The main drawback is the absence of a large set of promoters from which to identify conserved patterns among species. Hence, an in silico method to predict them in any species is a challenge. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques such as Naïve Bayes, Decision Trees, Support Vector Machines, Neural Networks, Voted Perceptron, PART, k-NN, and ensemble approaches (Bagging and Boosting) on the task of predicting Bacillus subtilis promoters. To do so, we first built two data sets of promoter and nonpromoter sequences: one for B. subtilis and a hybrid one. To evaluate the ML methods, a cross-validation procedure is applied. Good results were obtained with ML methods such as SVM and Naïve Bayes using the B. subtilis data set. However, we have not reached good results on the hybrid database