984 results for regression algorithm


Relevance:

30.00%

Publisher:

Abstract:

Gaussian processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problems using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. Unlike the SVR-based algorithm, it gives a sparse solution directly. Also, the hyperparameters are estimated easily without resorting to expensive cross-validation. Using a sparse GPR model helps make the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents an optimization algorithm for an ammonia reactor based on a regression model relating the yield to several parameters, control inputs and disturbances. This model is derived from data generated by hybrid simulation of the steady-state equations describing the reactor behaviour. The simplicity of the optimization program, along with its ability to take into account constraints on flow variables, makes it well suited to supervisory control applications.

Relevance:

30.00%

Publisher:

Abstract:

This paper introduces a scheme for classification of online handwritten characters based on polynomial regression of the sampled points of the sub-strokes in a character. The segmentation is based on the velocity profile of the written character, which requires smoothing of the velocity profile. We propose a novel scheme for smoothing the velocity profile curve and identifying the critical points used to segment the character. We also propose another method for segmentation based on human eye perception. We then extract two sets of features for recognition of handwritten characters. Each sub-stroke is a simple curve, a part of the character, and is represented by the distance of each point from the first point. This forms the first feature vector for each character. The second feature vector consists of the coefficients obtained from the B-splines fitted to the control knots obtained from the segmentation algorithm. The feature vector is fed to the SVM classifier, which indicates an efficiency of 68% using the polynomial regression technique and 74% using the spline fitting method.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we propose a novel, scalable, clustering-based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more efficiently than general-purpose solvers. Another main contribution of the paper is to pose the problem of focused crawling as a large-scale Ordinal Regression problem and solve it using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing the problem of focused crawling as an Ordinal Regression problem avoids the need for a negative class and topic hierarchy, which are the main drawbacks of the existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR. Experiments also show that the proposed focused crawler outperforms the state of the art.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a method for partially automating specification-based regression testing, which we call ESSE (Explicit State Space Enumeration). The first step in the ESSE method is the extraction of a finite state model of the system, making use of an already tested version of the system under test (SUT). The finite state model thus obtained is then used to compute good test sequences that can be used to regression test subsequent versions of the system. We present two new algorithms for test sequence computation, both based on the finite state model generated by the above method. We also provide the details and results of the experimental evaluation of the ESSE method. Comparison with a random-testing algorithm used in practice has shown substantial improvements.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we present a novel algorithm for piecewise linear regression which can learn continuous as well as discontinuous piecewise linear functions. The main idea is to repeatedly partition the data and learn a linear model in each partition. The proposed algorithm is similar in spirit to the k-means clustering algorithm. We show that our algorithm can also be viewed as a special case of an EM algorithm for maximum likelihood estimation under a reasonable probability model. We empirically demonstrate the effectiveness of our approach by comparing its performance with that of state-of-the-art algorithms on various datasets. (C) 2014 Elsevier Inc. All rights reserved.
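The partition-and-fit idea described in this abstract can be illustrated with a minimal one-dimensional sketch. This is not the paper's algorithm: the function names `fit_line` and `piecewise_linear_fit`, the use of ordinary least squares per group, and the rule of reassigning each point to the line with the smallest squared residual are all assumptions made for illustration.

```python
def fit_line(points):
    """Ordinary least squares fit y = a*x + b for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        # All x values coincide: fall back to a horizontal line.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def piecewise_linear_fit(points, init_labels, max_iter=50):
    """Alternate between fitting one line per group and reassigning points.

    Hypothetical k-means-style loop: the assignment rule (smallest squared
    residual) is an assumption, not taken from the abstract.
    """
    labels = list(init_labels)
    k = max(labels) + 1
    lines = [(0.0, 0.0)] * k
    for _ in range(max_iter):
        # Fit step: one linear model per non-empty partition.
        for j in range(k):
            group = [p for p, lab in zip(points, labels) if lab == j]
            if group:
                lines[j] = fit_line(group)
        # Assignment step: move each point to its best-fitting line.
        new_labels = [
            min(range(k), key=lambda j: (y - (lines[j][0] * x + lines[j][1])) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return lines, labels


# Usage: two line segments, with one point deliberately mislabelled at the start.
data = [(x, 2.0 * x) for x in range(5)] + [(x, 20.0 - x) for x in range(5, 10)]
lines, labels = piecewise_linear_fit(data, [0] * 6 + [1] * 4)
```

Like k-means, a loop of this kind depends on the initial partition and may stop at a local optimum.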

Relevance:

30.00%

Publisher:

Abstract:

We analyzed catches per unit of effort (CPUE) from the Japanese longline fishery for bigeye tuna (Thunnus obesus) in the central and eastern Pacific Ocean (EPO) with regression tree methods. Regression trees have not previously been used to estimate time series of abundance indices from CPUE data. The "optimally sized" tree had 139 parameters; year, month, latitude, and longitude interacted to affect bigeye CPUE. The trend in tree-based abundance indices for the EPO was similar to trends estimated from a generalized linear model and from an empirical model that combines oceanographic data with information on the distribution of fish relative to environmental conditions. The regression tree was more parsimonious and would be easier to implement than the other two models, but the tree provided no information about the mechanisms that caused bigeye CPUEs to vary in time and space. Bigeye CPUEs increased sharply during the mid-1980s and were more variable at the northern and southern edges of the fishing grounds. Both of these results can be explained by changes in actual abundance and changes in catchability. Results from a regression tree that was fitted to a subset of the data indicated that, in the EPO, bigeye are about equally catchable with regular and deep longlines. This is not consistent with observations that bigeye are more abundant at depth and indicates that classification by gear type (regular or deep longline) may not provide a good measure of capture depth. A simulated annealing algorithm was used to summarize the tree-based results by partitioning the fishing grounds into regions where trends in bigeye CPUE were similar. Simulated annealing can be useful for designing spatial strata in future sampling programs. (PDF contains 45 pages.)

Relevance:

30.00%

Publisher:

Abstract:

The alternate combinational approach of genetic algorithm and neural network (AGANN) has been presented to correct the systematic error of density functional theory (DFT) calculations. It treats the DFT as a black box and models the error through external statistical information. As a demonstration, the AGANN method has been applied to the correction of the lattice energies from DFT calculations for 72 metal halides and hydrides. Through the AGANN correction, the mean absolute value of the relative errors of the calculated lattice energies with respect to the experimental values decreases from 4.93% to 1.20% in the testing set. For comparison, the neural network approach reduces the mean value to 2.56%, and the common combinational approach of genetic algorithm and neural network reduces it to 2.15%. The multiple linear regression method has almost no correction effect here.
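The error measure quoted in this abstract can be written out as follows. The symbols are assumptions introduced here for illustration: $N$ is the number of compounds in the testing set, and $E_i^{\mathrm{calc}}$ and $E_i^{\mathrm{exp}}$ are the calculated and experimental lattice energies of compound $i$.

```latex
\mathrm{MARE} = \frac{100\%}{N} \sum_{i=1}^{N}
  \left| \frac{E_i^{\mathrm{calc}} - E_i^{\mathrm{exp}}}{E_i^{\mathrm{exp}}} \right|
```

Under this reading, the reported values (4.93%, 2.56%, 2.15%, 1.20%) are all values of the same quantity, computed on the testing set after each correction method.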

Relevance:

30.00%

Publisher:

Abstract:

The present study reports an application of the searching combination moving window partial least squares (SCMWPLS) algorithm to the determination of ethenzamide and acetoaminophen in quaternary powdered samples by near infrared (NIR) spectroscopy. Another purpose of the study was to examine the instrumentation effects of spectral resolution and signal-to-noise ratio of the Buchi NIRLab N-200 FT-NIR spectrometer equipped with an InGaAs detector. The informative spectral intervals of NIR spectra of a series of quaternary powdered mixture samples were first located for ethenzamide and acetoaminophen by use of moving window partial least squares regression (MWPLSR). Then, these located spectral intervals were further optimised by SCMWPLS for subsequent partial least squares (PLS) model development. The improved results are attributed to both the less complex PLS models and to higher accuracy of predicted concentrations of ethenzamide and acetoaminophen in the optimised informative spectral intervals that are featured by NIR bands. At the same time, SCMWPLS is also demonstrated as a viable route for wavelength selection.

Relevance:

30.00%

Publisher:

Abstract:

Elliott, G. N., Worgan, H., Broadhurst, D. I., Draper, J. H., Scullion, J. (2007). Soil differentiation using fingerprint Fourier transform infrared spectroscopy, chemometrics and genetic algorithm-based feature selection. Soil Biology & Biochemistry, 39 (11), 2888-2896. Sponsorship: BBSRC / NERC RAE2008

Relevance:

30.00%

Publisher:

Abstract:

This paper investigates the two-stage stepwise identification for a class of nonlinear dynamic systems that can be described by linear-in-the-parameters models, and the model has to be built from a very large pool of basis functions or model terms. The main objective is to improve the compactness of the model that is obtained by the forward stepwise methods, while retaining the computational efficiency. The proposed algorithm first generates an initial model using a forward stepwise procedure. The significance of each selected term is then reviewed at the second stage and all insignificant ones are replaced, resulting in an optimised compact model with significantly improved performance. The main contribution of this paper is that these two stages are performed within a well-defined regression context, leading to significantly reduced computational complexity. The efficiency of the algorithm is confirmed by the computational complexity analysis, and its effectiveness is demonstrated by the simulation results.
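The first stage mentioned in this abstract, a forward stepwise procedure over a pool of candidate terms, can be sketched generically as below. This is a plain greedy orthogonal forward selection, not the paper's method: the second (review and replace) stage is not shown, and the names `dot` and `forward_select`, the gain criterion, and the tolerance `1e-12` are assumptions chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def forward_select(columns, y, n_terms):
    """Greedily pick up to n_terms candidate columns.

    At each step the candidate is orthogonalised against the terms already
    chosen, and the one giving the largest reduction in squared residual
    error is added. Returns the chosen indices and the final squared error.
    """
    selected, basis = [], []
    residual = list(y)
    for _ in range(n_terms):
        best = None
        for idx, col in enumerate(columns):
            if idx in selected:
                continue
            # Gram-Schmidt: remove the part explained by chosen terms.
            w = list(col)
            for q in basis:
                coef = dot(w, q) / dot(q, q)
                w = [wi - coef * qi for wi, qi in zip(w, q)]
            ww = dot(w, w)
            if ww < 1e-12:
                continue  # candidate is (numerically) redundant
            gain = dot(w, residual) ** 2 / ww
            if best is None or gain > best[0]:
                best = (gain, idx, w)
        if best is None:
            break
        _, idx, w = best
        selected.append(idx)
        basis.append(w)
        coef = dot(w, residual) / dot(w, w)
        residual = [r - coef * wi for r, wi in zip(residual, w)]
    return selected, dot(residual, residual)


# Usage: y is built from candidates 1 and 2 only.
pool = [[1, 1, 1, 1], [1, 2, 3, 4], [1, -1, 1, -1], [0.5, 0.1, -0.3, 0.2]]
target = [3 * a + 2 * b for a, b in zip(pool[1], pool[2])]
chosen, error = forward_select(pool, target, 2)
```

A purely forward procedure never revisits earlier choices, which is the limitation the abstract's second stage is meant to address.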

Relevance:

30.00%

Publisher:

Abstract:

A forward and backward least angle regression (LAR) algorithm is proposed to construct the nonlinear autoregressive model with exogenous inputs (NARX) that is widely used to describe a large class of nonlinear dynamic systems. The main objective of this paper is to improve model sparsity and generalization performance of the original forward LAR algorithm. This is achieved by introducing a replacement scheme using an additional backward LAR stage. The backward stage replaces insignificant model terms selected by forward LAR with more significant ones, leading to an improved model in terms of the model compactness and performance. A numerical example to construct four types of NARX models, namely polynomials, radial basis function (RBF) networks, neuro fuzzy and wavelet networks, is presented to illustrate the effectiveness of the proposed technique in comparison with some popular methods.

Relevance:

30.00%

Publisher:

Abstract:

Long-term contractual decisions are the basis of efficient risk management. However, such decisions have to be supported by a robust price forecast methodology. This paper reports a different approach to long-term price forecasting that tries to answer that need. Making use of regression models, the proposed methodology aims to find the maximum and minimum Market Clearing Price (MCP) for a specific programming period with a desired confidence level α. Due to the complexity of the problem, the meta-heuristic Particle Swarm Optimization (PSO) was used to find the best regression parameters, and the results were compared with those obtained using a Genetic Algorithm (GA). To validate these models, results from realistic data are presented and discussed in detail.

Relevance:

30.00%

Publisher:

Abstract:

Using the classical Parzen window estimate as the target function, the kernel density estimation is formulated as a regression problem and the orthogonal forward regression technique is adopted to construct sparse kernel density estimates. The proposed algorithm incrementally minimises a leave-one-out test error score to select a sparse kernel model, and a local regularisation method is incorporated into the density construction process to further enforce sparsity. The kernel weights are finally updated using the multiplicative nonnegative quadratic programming algorithm, which has the ability to reduce the model size further. Except for the kernel width, the proposed algorithm has no other parameters that need tuning, and the user is not required to specify any additional criterion to terminate the density construction procedure. Two examples are used to demonstrate the ability of this regression-based approach to effectively construct a sparse kernel density estimate with comparable accuracy to that of the full-sample optimised Parzen window density estimate.
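The target function named in this abstract, the full-sample Parzen window estimate, can be sketched in one dimension as below. The Gaussian kernel and the name `parzen_window_density` are assumptions made for illustration; the sparse construction itself (orthogonal forward regression, leave-one-out scoring, weight updates) is not shown.

```python
import math


def parzen_window_density(x, samples, width):
    """Full-sample Parzen window estimate at x.

    Places one Gaussian kernel of the given width on every sample and
    averages them, so every sample carries equal weight 1/N.
    """
    norm = 1.0 / (len(samples) * width * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / width) ** 2) for s in samples)


# Usage: evaluate the estimate on a small grid.
samples = [-1.0, 0.0, 2.0]
grid = [-2.0 + 0.5 * i for i in range(11)]
density = [parzen_window_density(x, samples, 0.5) for x in grid]
```

Because this estimate uses one kernel per sample, its evaluation cost grows with the sample size, which is the motivation for seeking a sparse approximation with comparable accuracy.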