32 results for regularization
Abstract:
Nonlinear system identification is considered using a generalized kernel regression model. Unlike the standard kernel model, which employs a fixed common variance for all the kernel regressors, each kernel regressor in the generalized kernel model has an individually tuned diagonal covariance matrix that is determined by maximizing the correlation between the training data and the regressor using a repeated guided random search based on boosting optimization. An efficient construction algorithm based on orthogonal forward regression with leave-one-out (LOO) test statistic and local regularization (LR) is then used to select a parsimonious generalized kernel regression model from the resulting full regression matrix. The proposed modeling algorithm is fully automatic and the user is not required to specify any criterion to terminate the construction procedure. Experimental results involving two real data sets demonstrate the effectiveness of the proposed nonlinear system identification approach.
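As a rough illustration of the generalized kernel model described above, the sketch below (Python with NumPy; the data, centers, and covariances are hypothetical placeholders) builds a regression matrix in which each Gaussian regressor carries its own diagonal covariance and then fits the weights by ordinary least squares. It omits the boosting-based covariance tuning and the OFR/LOO/LR subset selection that the paper actually proposes.

    import numpy as np

    def gk_design(X, centers, diag_covs):
        # Gaussian kernel regressors, each with its own diagonal covariance (one variance per input)
        Phi = np.empty((X.shape[0], centers.shape[0]))
        for j, (c, d) in enumerate(zip(centers, diag_covs)):
            diff = X - c
            Phi[:, j] = np.exp(-0.5 * np.sum(diff ** 2 / d, axis=1))
        return Phi

    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(200, 2))                  # hypothetical input data
    y = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1]) + 0.05 * rng.standard_normal(200)

    centers = X[rng.choice(200, size=20, replace=False)]       # candidate regressor centers
    diag_covs = rng.uniform(0.05, 0.5, size=(20, 2))           # individually "tuned" variances
    Phi = gk_design(X, centers, diag_covs)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)            # weights of the full regression matrix
    print("training RMSE:", np.sqrt(np.mean((Phi @ theta - y) ** 2)))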
Abstract:
In this correspondence, new robust nonlinear model construction algorithms for a large class of linear-in-the-parameters models are introduced to enhance model robustness via combined parameter regularization and new robust structural selection criteria. In parallel to parameter regularization, we use two classes of robust model selection criteria: experimental design criteria that optimize model adequacy, and the predicted residual sums of squares (PRESS) statistic that optimizes model generalization capability. Three robust identification algorithms are introduced, i.e., the A-optimality and D-optimality criteria, respectively, combined with the regularized orthogonal least squares algorithm, and the PRESS statistic combined with the regularized orthogonal least squares algorithm. A common characteristic of these algorithms is that the inherent computational efficiency associated with the orthogonalization scheme in orthogonal least squares or regularized orthogonal least squares carries over, so that the new algorithms are computationally efficient. Numerical examples are included to demonstrate the effectiveness of the algorithms.
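The PRESS statistic mentioned above can be evaluated without refitting the model n times. A minimal sketch, assuming a ridge-style regularized least squares model on synthetic data (not the paper's A-/D-optimality criteria or its orthogonalized construction), uses the standard leave-one-out residual identity e_i / (1 - h_ii):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 5))
    y = X @ np.array([1.0, -0.5, 0.0, 2.0, 0.0]) + 0.1 * rng.standard_normal(100)

    lam = 1e-2                                                    # illustrative regularization parameter
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(5), X.T)       # regularized hat matrix
    residuals = y - H @ y
    press = np.sum((residuals / (1.0 - np.diag(H))) ** 2)         # PRESS without any refitting
    print("PRESS:", press)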
Abstract:
We propose a unified data modeling approach that is equally applicable to supervised regression and classification applications, as well as to unsupervised probability density function estimation. A particle swarm optimization (PSO)-aided orthogonal forward regression (OFR) algorithm based on leave-one-out (LOO) criteria is developed to construct parsimonious radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines the center vector and diagonal covariance matrix of one RBF node by minimizing the LOO statistics. For regression applications, the LOO criterion is chosen to be the LOO mean square error, while the LOO misclassification rate is adopted in two-class classification applications. By adopting the Parzen window estimate as the desired response, the unsupervised density estimation problem is transformed into a constrained regression problem. This PSO-aided OFR algorithm for tunable-node RBF networks is capable of constructing very parsimonious RBF models that generalize well, and our analysis and experimental results demonstrate that the algorithm is computationally even simpler than the efficient regularization-assisted orthogonal least squares algorithm based on LOO criteria for selecting fixed-node RBF models. Another significant advantage of the proposed learning procedure is that it does not have learning hyperparameters that have to be tuned using costly cross-validation. The effectiveness of the proposed PSO-aided OFR construction procedure is illustrated using several examples taken from regression and classification, as well as density estimation applications.
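To illustrate how a density estimation problem can be recast as regression, the sketch below (hypothetical one-dimensional data, with fixed RBF centers and width rather than the PSO-tuned nodes of the paper) uses a Parzen window estimate as the desired response and fits a small RBF expansion to it by least squares:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.standard_normal(300)                          # samples from an unknown density

    # Parzen window (Gaussian kernel) estimate, used here as the desired response
    h = 0.3
    parzen = np.mean(np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2), axis=1) / (h * np.sqrt(2 * np.pi))

    # Least-squares fit of a fixed 10-node RBF expansion to the Parzen targets
    centers = np.linspace(x.min(), x.max(), 10)
    width = 0.5
    Phi = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)
    w, *_ = np.linalg.lstsq(Phi, parzen, rcond=None)

    xq = np.linspace(-3.0, 3.0, 7)                        # query points
    Phi_q = np.exp(-0.5 * ((xq[:, None] - centers[None, :]) / width) ** 2)
    print("estimated density on grid:", np.round(Phi_q @ w, 3))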
Abstract:
New ways of combining observations with numerical models are discussed in which the size of the state space can be very large, and the model can be highly nonlinear. Also, the observations of the system can be related to the model variables in highly nonlinear ways, making this data-assimilation (or inverse) problem highly nonlinear. First we discuss the connection between data assimilation and inverse problems, including regularization. We explore the choice of proposal density in a Particle Filter and show how the 'curse of dimensionality' might be beaten. In the standard Particle Filter, ensembles of model runs are propagated forward in time until observations are encountered, rendering it a pure Monte Carlo method. In large-dimensional systems this is very inefficient, and very large numbers of model runs are needed to solve the data-assimilation problem realistically. In our approach we steer all model runs towards the observations, resulting in a much more efficient method. By further 'ensuring almost equal weight' we avoid performing model runs that are useless in the end. Results are shown for the 40- and 1000-dimensional Lorenz 1995 model.
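For context, a single analysis step of the standard (bootstrap) Particle Filter referred to above can be sketched for a scalar toy state as follows; the proposed proposal-density steering and equal-weight construction are not reproduced here, and all numbers are illustrative:

    import numpy as np

    rng = np.random.default_rng(3)
    N = 1000                                     # ensemble (particle) size
    particles = rng.normal(0.0, 1.0, N)          # forecast ensemble for a scalar state
    y_obs, obs_std = 1.2, 0.5                    # observation and its error standard deviation

    # Standard particle filter: weight by observation likelihood, then resample
    logw = -0.5 * ((y_obs - particles) / obs_std) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    resampled = particles[rng.choice(N, size=N, p=w)]
    print("effective sample size:", 1.0 / np.sum(w ** 2))

The low effective sample size in higher dimensions is exactly the inefficiency ('curse of dimensionality') that the proposed steering of model runs is designed to avoid.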
Abstract:
This paper surveys numerical techniques for the regularization of descriptor (generalized state-space) systems by proportional and derivative feedback. We review generalizations of controllability and observability to descriptor systems along with definitions of regularity and index in terms of the Weierstraß canonical form. Three condensed forms display the controllability and observability properties of a descriptor system. The condensed forms are obtained through orthogonal equivalence transformations and rank decisions, so they may be computed by numerically stable algorithms. In addition, the condensed forms display whether a descriptor system is regularizable, i.e., when the system pencil can be made regular by derivative and/or proportional output feedback, and, if so, what index can be achieved. Also included is a new characterization of descriptor systems that can be made regular with index 1 by proportional and derivative output feedback.
Abstract:
In this paper we explore classification techniques for ill-posed problems. Two classes are linearly separable in some Hilbert space X if they can be separated by a hyperplane. We investigate stable separability, i.e., the case where we have a positive distance between two separating hyperplanes. When the data in the space Y is generated by a compact operator A applied to the system states in X, we will show that in general we do not obtain stable separability in Y even if the problem in X is stably separable. In particular, we show this for the case where a nonlinear classification is generated from a non-convergent family of linear classes in X. We apply our results to the problem of quality control of fuel cells, where we classify fuel cells according to their efficiency. We can potentially classify a fuel cell using either some external measured magnetic field or some internal current. However, we cannot measure the current directly, since we cannot access the fuel cell in operation. The first approach applies discrimination techniques directly to the measured magnetic fields. The second approach first reconstructs currents and then carries out the classification on the current distributions. We show that both approaches need regularization and that the regularized classifications are not equivalent in general. Finally, we investigate a widely used linear classification algorithm, Fisher's linear discriminant, with respect to its ill-posedness when applied to data generated via a compact integral operator. We show that the method cannot remain stable when the number of measurement points becomes large.
Abstract:
In this paper we propose an efficient two-level model identification method for a large class of linear-in-the-parameters models from the observational data. A new elastic net orthogonal forward regression (ENOFR) algorithm is employed at the lower level to carry out simultaneous model selection and elastic net parameter estimation. The two regularization parameters in the elastic net are optimized using a particle swarm optimization (PSO) algorithm at the upper level by minimizing the leave one out (LOO) mean square error (LOOMSE). Illustrative examples are included to demonstrate the effectiveness of the new approaches.
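A rough two-level sketch under stated assumptions: scikit-learn's ElasticNet stands in for the ENOFR lower level, and a simple random search stands in for the PSO upper level, with the two regularization parameters scored by the leave-one-out mean square error. Data, parameter ranges, and the search itself are illustrative only.

    import numpy as np
    from sklearn.linear_model import ElasticNet
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(4)
    X = rng.standard_normal((60, 8))
    beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0])
    y = X @ beta + 0.2 * rng.standard_normal(60)

    # Random search over the two elastic-net parameters, scored by the LOO mean square error
    best = (np.inf, None)
    for _ in range(30):
        alpha = 10 ** rng.uniform(-3, 0)          # overall penalty strength
        l1_ratio = rng.uniform(0.05, 1.0)         # balance between the l1 and l2 penalties
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
        mse = -cross_val_score(model, X, y, cv=LeaveOneOut(),
                               scoring="neg_mean_squared_error").mean()
        if mse < best[0]:
            best = (mse, (alpha, l1_ratio))
    print("best LOOMSE %.4f at (alpha, l1_ratio)=%s" % best)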
Abstract:
Logistic models are studied as a tool to convert dynamical forecast information (deterministic and ensemble) into probability forecasts. A logistic model is obtained by setting the logarithmic odds ratio equal to a linear combination of the inputs. As with any statistical model, logistic models will suffer from overfitting if the number of inputs is comparable to the number of forecast instances. Computational approaches to avoid overfitting by regularization are discussed, and efficient techniques for model assessment and selection are presented. A logit version of the lasso (originally a linear regression technique) is discussed. In lasso models, less important inputs are identified and the corresponding coefficients are set to zero, providing an efficient and automatic model reduction procedure. For the same reason, lasso models are particularly appealing for diagnostic purposes.
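A minimal sketch of the logit-lasso idea using scikit-learn (synthetic predictors and a hypothetical penalty strength C, not the computational procedures of the paper itself): the l1 penalty drives the coefficients of unimportant inputs exactly to zero.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(5)
    X = rng.standard_normal((500, 10))           # e.g. ensemble-derived predictors
    logit = 1.5 * X[:, 0] - 2.0 * X[:, 3]        # only two inputs actually matter
    y = (rng.uniform(size=500) < 1 / (1 + np.exp(-logit))).astype(int)

    # Logit version of the lasso: l1-penalized logistic regression
    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.2)
    model.fit(X, y)
    print("nonzero coefficients:", np.flatnonzero(model.coef_[0]))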
Abstract:
We study inverse problems in neural field theory, i.e., the construction of synaptic weight kernels yielding a prescribed neural field dynamics. We address the issues of existence, uniqueness, and stability of solutions to the inverse problem for the Amari neural field equation as a special case, and prove that these problems are generally ill-posed. In order to construct solutions to the inverse problem, we first recast the Amari equation into a linear perceptron equation in an infinite-dimensional Banach or Hilbert space. In a second step, we construct sets of biorthogonal function systems allowing the approximation of synaptic weight kernels by a generalized Hebbian learning rule. Numerically, this construction is implemented by the Moore–Penrose pseudoinverse method. We demonstrate the instability of these solutions and use the Tikhonov regularization method for stabilization and to prevent numerical overfitting. We illustrate the stable construction of kernels by means of three instructive examples.
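The stabilizing effect of Tikhonov regularization on this kind of ill-posed linear problem can be illustrated with a generic, severely ill-conditioned system (a smoothing Gaussian matrix as a stand-in for the operators arising from the Amari equation; the noise level and the choice of alpha are arbitrary):

    import numpy as np

    rng = np.random.default_rng(6)
    n = 80
    s = np.linspace(0.0, 1.0, n)
    A = np.exp(-((s[:, None] - s[None, :]) ** 2) / 0.01) / n     # smoothing, compact-operator-like matrix
    w_true = np.sin(2 * np.pi * s)
    b = A @ w_true + 1e-4 * rng.standard_normal(n)               # slightly noisy data

    w_pinv = np.linalg.pinv(A) @ b                                   # Moore-Penrose solution: unstable
    alpha = 1e-6
    w_tik = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)    # Tikhonov-regularized solution
    print("error (pseudoinverse):", np.linalg.norm(w_pinv - w_true))
    print("error (Tikhonov):     ", np.linalg.norm(w_tik - w_true))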
Abstract:
We show that the four-dimensional variational data assimilation method (4DVar) can be interpreted as a form of Tikhonov regularization, a very familiar method for solving ill-posed inverse problems. It is known from image restoration problems that L1-norm penalty regularization recovers sharp edges in the image more accurately than Tikhonov, or L2-norm, penalty regularization. We apply this idea from stationary inverse problems to 4DVar, a dynamical inverse problem, and give examples for an L1-norm penalty approach and a mixed total variation (TV) L1–L2-norm penalty approach. For problems with model error where sharp fronts are present and the background and observation error covariances are known, the mixed TV L1–L2-norm penalty performs better than either the L1-norm method or the strong constraint 4DVar (L2-norm) method. A strength of the mixed TV L1–L2-norm regularization is that, in the case where a simplified form of the background error covariance matrix is used, it produces a much more accurate analysis than 4DVar. The method thus has the potential in numerical weather prediction to overcome operational problems with poorly tuned background error covariance matrices.
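For reference, the strong-constraint 4DVar cost function has exactly the Tikhonov (L2-penalty) structure, sketched below in generic notation (the paper's precise formulation of the L1 and mixed TV L1–L2 variants may differ in detail):

    J(x_0) = \tfrac{1}{2}\,\|x_0 - x_b\|^2_{B^{-1}}
           + \tfrac{1}{2}\sum_{k=0}^{K} \|y_k - \mathcal{H}_k(\mathcal{M}_{0\to k}(x_0))\|^2_{R_k^{-1}}

Here the background term plays the role of the Tikhonov penalty; an L1-penalty variant replaces it by \lambda \|D(x_0 - x_b)\|_1 for some differencing operator D, and a mixed TV L1–L2 approach retains both an L1 (total variation) term and an L2 term.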
Abstract:
A two-stage linear-in-the-parameter model construction algorithm is proposed, aimed at noisy two-class classification problems. The purpose of the first stage is to produce a prefiltered signal that is used as the desired output for the second stage, which constructs a sparse linear-in-the-parameter classifier. The prefiltering stage is a two-level process aimed at maximizing a model's generalization capability, in which a new elastic-net model identification algorithm using singular value decomposition is employed at the lower level, and then two regularization parameters are optimized using a particle-swarm-optimization algorithm at the upper level by minimizing the leave-one-out (LOO) misclassification rate. It is shown that the LOO misclassification rate based on the resultant prefiltered signal can be analytically computed without splitting the data set, and the associated computational cost is minimal due to orthogonality. The second stage of sparse classifier construction is based on orthogonal forward regression with the D-optimality algorithm. Extensive simulations on noisy data sets illustrate the competitiveness of this approach for noisy classification problems.
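The claim that the LOO misclassification rate can be computed analytically without splitting the data can be illustrated with a simple regularized least-squares classifier on +/-1 labels (a stand-in for the paper's prefiltered elastic-net model; the data and ridge parameter are hypothetical), using the identity y_loo_i = (y_hat_i - h_ii * y_i) / (1 - h_ii):

    import numpy as np

    rng = np.random.default_rng(7)
    X = np.hstack([np.ones((200, 1)), rng.standard_normal((200, 2))])
    y = np.where(X[:, 1] + 0.5 * X[:, 2] + 0.3 * rng.standard_normal(200) > 0, 1.0, -1.0)

    lam = 1e-2
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(3), X.T)   # hat matrix of the linear classifier
    y_hat = H @ y
    h = np.diag(H)
    y_loo = (y_hat - h * y) / (1.0 - h)                       # LOO outputs without refitting
    print("LOO misclassification rate:", np.mean(np.sign(y_loo) != y))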
Abstract:
In this paper, various types of fault detection methods for fuel cells are compared, for example, those that use a model-based approach, a data-driven approach, or a combination of the two. The potential advantages and drawbacks of each method are discussed and comparisons between methods are made. In particular, classification algorithms, which separate a data set into classes or clusters based on some prior knowledge or measure of similarity, are investigated. Specifically, the application of classification methods either to vectors of currents reconstructed by magnetic tomography or directly to vectors of magnetic field measurements is explored. Bases are simulated using the finite integration technique (FIT) and regularization techniques are employed to overcome ill-posedness. Fisher's linear discriminant is used to illustrate these concepts. Numerical experiments show that the ill-posedness of the magnetic tomography problem is also present in the classification problem on magnetic field measurements. This is independent of the particular working mode of the cell but is influenced by the type of faulty behavior that is studied. The numerical results demonstrate this ill-posedness through the exponential decay of the singular values for three examples of fault classes.
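A minimal sketch of Fisher's linear discriminant on simulated measurement vectors (the distributions are hypothetical stand-ins for magnetic field or reconstructed current data; a small ridge term regularizes the within-class scatter):

    import numpy as np

    rng = np.random.default_rng(8)
    class0 = rng.multivariate_normal([0.0, 0.0, 0.0], 0.2 * np.eye(3), 100)
    class1 = rng.multivariate_normal([1.0, 0.5, -0.5], 0.2 * np.eye(3), 100)

    m0, m1 = class0.mean(axis=0), class1.mean(axis=0)
    Sw = np.cov(class0, rowvar=False) + np.cov(class1, rowvar=False)   # within-class scatter
    w = np.linalg.solve(Sw + 1e-6 * np.eye(3), m1 - m0)                # regularized Fisher direction
    threshold = w @ (m0 + m1) / 2.0
    pred = (np.vstack([class0, class1]) @ w > threshold).astype(int)
    truth = np.r_[np.zeros(100), np.ones(100)]
    print("training accuracy:", np.mean(pred == truth))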
Abstract:
Global NDVI data are routinely derived from the AVHRR, SPOT-VGT, and MODIS/Terra earth observation records for a range of applications from terrestrial vegetation monitoring to climate change modeling. This has led to a substantial interest in the harmonization of multisensor records. Most evaluations of the internal consistency and continuity of global multisensor NDVI products have focused on time-series harmonization in the spectral domain, often neglecting the spatial domain. We fill this void by applying variogram modeling (a) to evaluate the differences in spatial variability between 8-km AVHRR, 1-km SPOT-VGT, and 1-km, 500-m, and 250-m MODIS NDVI products over eight EOS (Earth Observing System) validation sites, and (b) to characterize the decay of spatial variability as a function of pixel size (i.e. data regularization) for spatially aggregated Landsat ETM+ NDVI products and a real multisensor dataset. First, we demonstrate that the conjunctive analysis of two variogram properties – the sill and the mean length scale metric – provides a robust assessment of the differences in spatial variability between multiscale NDVI products that are due to spatial (nominal pixel size, point spread function, and view angle) and non-spatial (sensor calibration, cloud clearing, atmospheric corrections, and length of multi-day compositing period) factors. Next, we show that as the nominal pixel size increases, the decay of spatial information content follows a logarithmic relationship with a stronger fit for the spatially aggregated NDVI products (R2 = 0.9321) than for the native-resolution AVHRR, SPOT-VGT, and MODIS NDVI products (R2 = 0.5064). This relationship serves as a reference for evaluating the differences in spatial variability and length scales in multiscale datasets at native or aggregated spatial resolutions. The outcomes of this study suggest that multisensor NDVI records cannot be integrated into a long-term data record without proper consideration of all factors affecting their spatial consistency. Hence, we propose an approach for selecting the spatial resolution at which differences in spatial variability between NDVI products from multiple sensors are minimized. This approach provides practical guidance for the harmonization of long-term multisensor datasets.
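As a rough illustration of the variogram properties used above, the sketch below computes an empirical semivariogram for a synthetic one-dimensional transect (a stand-in for 2-D NDVI imagery) and reads off approximate sill and length-scale values; the actual study fits variogram models to real multisensor products.

    import numpy as np

    rng = np.random.default_rng(9)
    ndvi = np.convolve(rng.standard_normal(600), np.ones(15) / 15, mode="same")  # smooth 1-D "NDVI" transect

    def semivariogram(z, max_lag):
        # Empirical semivariance gamma(h) = 0.5 * mean((z[i+h] - z[i])^2) for lags 1..max_lag
        return np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2) for h in range(1, max_lag + 1)])

    gamma = semivariogram(ndvi, 50)
    sill = gamma[-10:].mean()                             # plateau of the variogram
    length_scale = np.argmax(gamma > 0.95 * sill) + 1     # lag at which the sill is (roughly) reached
    print("sill ~ %.4f, length scale ~ %d pixels" % (sill, length_scale))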
Abstract:
Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three classes, which are not mutually exclusive, consisting of best subset selection methods, projection techniques, and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.
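The ridge-regression idea can be sketched as follows (in the spirit of regression-based summary construction; the model, summaries, and penalty are invented for illustration): the parameter is regressed on a vector of candidate summaries, and the fitted linear combination is then used as a single low-dimensional summary statistic.

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(10)
    theta = rng.uniform(0.0, 2.0, size=(1000, 1))                    # parameters drawn from the prior
    S = np.hstack([theta + 0.1 * rng.standard_normal((1000, 1)),     # one informative summary
                   rng.standard_normal((1000, 4))])                  # four uninformative summaries

    # Ridge regression of the parameter on the candidate summaries
    reg = Ridge(alpha=1.0).fit(S, theta.ravel())
    s_obs = np.array([1.3, 0.0, 0.0, 0.0, 0.0])                      # summaries of the "observed" data
    print("projected observed summary:", reg.predict(s_obs[None, :])[0])
    print("weights on candidate summaries:", np.round(reg.coef_, 3))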
Abstract:
We propose a new sparse model construction method aimed at maximizing a model's generalisation capability for a large class of linear-in-the-parameters models. The coordinate descent optimization algorithm is employed with a modified l1-penalized least squares cost function in order to estimate a single parameter and its regularization parameter simultaneously based on the leave-one-out mean square error (LOOMSE). Our original contribution is to derive a closed form for the optimal LOOMSE regularization parameter of a single-term model, for which we show that the LOOMSE can be analytically computed without actually splitting the data set, leading to a very simple parameter estimation method. We then integrate the new results within the coordinate descent optimization algorithm to update model parameters one at a time for linear-in-the-parameters models. Consequently, a fully automated procedure is achieved without resorting to any other validation data set for iterative model evaluation. Illustrative examples are included to demonstrate the effectiveness of the new approaches.
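A minimal sketch of the closed-form LOOMSE for a one-term regularized least squares model (synthetic regressor and a simple grid over the regularization parameter; the paper's own closed-form optimum and its integration into coordinate descent are not reproduced here):

    import numpy as np

    rng = np.random.default_rng(11)
    phi = rng.standard_normal(100)                 # a single candidate regressor
    y = 0.8 * phi + 0.3 * rng.standard_normal(100)

    def loomse_single_term(phi, y, lam):
        # Closed-form LOO mean square error for a one-term regularized LS model
        denom = phi @ phi + lam
        theta = (phi @ y) / denom
        h = phi ** 2 / denom                        # leverages of the one-term model
        e_loo = (y - theta * phi) / (1.0 - h)       # LOO residuals without splitting the data
        return np.mean(e_loo ** 2)

    lams = np.logspace(-4, 2, 50)
    scores = [loomse_single_term(phi, y, l) for l in lams]
    print("best lambda by LOOMSE:", lams[int(np.argmin(scores))])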