891 results for Lanczos, Linear systems, Generalized cross validation
Abstract:
We consider conjugate-gradient like methods for solving block symmetric indefinite linear systems that arise from saddle-point problems or, in particular, regularizations thereof. Such methods require preconditioners that preserve certain sub-blocks from the original systems but allow considerable flexibility for the remaining blocks. We construct a number of families of implicit factorizations that are capable of reproducing the required sub-blocks and (some) of the remainder. These generalize known implicit factorizations for the unregularized case. Improved eigenvalue clustering is possible if additionally some of the noncrucial blocks are reproduced. Numerical experiments confirm that these implicit-factorization preconditioners can be very effective in practice.
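The eigenvalue clustering mentioned above can be illustrated numerically for the unregularized case. The sketch below is not the authors' implicit factorization: it builds a small random saddle-point matrix, swaps the (1,1) block H for its diagonal G (an arbitrary choice made here for illustration) while keeping the constraint block A exactly, and counts how many eigenvalues of the preconditioned matrix sit at 1. All names and sizes are assumptions.

```python
import numpy as np

def saddle_matrix(top_left, A):
    """Assemble [[top_left, A^T], [A, 0]], i.e. the unregularized case."""
    m = A.shape[0]
    return np.block([[top_left, A.T], [A, np.zeros((m, m))]])

rng = np.random.default_rng(0)
n, m = 6, 2                      # illustrative sizes only
B = rng.standard_normal((n, n))
H = B @ B.T + n * np.eye(n)      # symmetric positive definite (1,1) block
A = rng.standard_normal((m, n))  # constraint block, full row rank with probability 1
G = np.diag(np.diag(H))          # stand-in for H; the preconditioner is free to choose it

K = saddle_matrix(H, A)          # original system matrix
P = saddle_matrix(G, A)          # preconditioner that reproduces the A blocks exactly

# Eigenvalues of P^{-1} K; those equal to 1 are defective, so allow a loose tolerance.
eigs = np.linalg.eigvals(np.linalg.solve(P, K))
near_one = int(np.sum(np.abs(eigs - 1.0) < 1e-4))
print(f"{near_one} of {n + m} eigenvalues lie within 1e-4 of 1")
```

For this kind of preconditioner at least 2m eigenvalues are expected at 1; the remaining n - m depend on how well G approximates H on the null space of A.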
Abstract:
A fundamental principle in practical nonlinear data modeling is the parsimonious principle of constructing the minimal model that explains the training data well. Leave-one-out (LOO) cross validation is often used to estimate generalization error when choosing amongst different network architectures (M. Stone, "Cross validatory choice and assessment of statistical predictions", J. R. Stat. Soc., Ser. B, 36, pp. 117-147, 1974). Based upon the minimization of LOO criteria, either the mean squares of LOO errors or the LOO misclassification rate, we present two backward elimination algorithms as model post-processing procedures for regression and classification problems respectively. The proposed backward elimination procedures exploit an orthogonalization procedure to make the subspace spanned by the pruned model orthogonal to the deleted regressor. Subsequently, it is shown that the LOO criteria used in both algorithms can be calculated via an analytic recursive formula, derived in this contribution, without actually splitting the estimation data set, which reduces computational expense. Compared to most other model construction methods, the proposed algorithms are advantageous in several respects: (i) there are no tuning parameters to be optimized through an extra validation data set; (ii) the procedure is fully automatic, with no additional stopping criterion; and (iii) the model structure selection is directly based on model generalization performance. Illustrative examples on regression and classification demonstrate that the proposed algorithms are viable post-processing methods to prune a model to gain extra sparsity and improved generalization.
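The idea of computing LOO errors without splitting the data can be illustrated with the standard identity for ordinary least squares, where the i-th LOO residual equals the ordinary residual divided by (1 - h_ii), with h_ii the leverage. This is a minimal sketch of that textbook identity, not the recursive formula derived in the paper; the data and function names are made up.

```python
import numpy as np

def loo_residuals(X, y):
    """Leave-one-out residuals from a single least-squares fit: e_i / (1 - h_ii)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    # Leverages h_ii are the diagonal of the hat matrix X (X^T X)^{-1} X^T,
    # obtained here from the thin QR factor Q as the squared row norms of Q.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    return residuals / (1.0 - leverage)

def loo_residuals_brute_force(X, y):
    """Reference implementation: refit n times, holding out one sample each time."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        out[i] = y[i] - X[i] @ beta
    return out

# Synthetic regression problem (illustrative values only).
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.standard_normal((30, 3))])
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.standard_normal(30)

fast = loo_residuals(X, y)
slow = loo_residuals_brute_force(X, y)
loo_mse = np.mean(fast**2)  # mean square of LOO errors, usable as a selection criterion
```

The two functions agree to rounding error, but the first needs one fit instead of n, which is what makes LOO criteria practical inside a pruning loop.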
Abstract:
A new parameter-estimation algorithm, which minimises the cross-validated prediction error for linear-in-the-parameter models, is proposed, based on stacked regression and an evolutionary algorithm. It is initially shown that cross-validation is very important for prediction in linear-in-the-parameter models using a criterion called the mean dispersion error (MDE). Stacked regression, which can be regarded as a sophisticated type of cross-validation, is then introduced based on an evolutionary algorithm, to produce a new parameter-estimation algorithm, which preserves the parsimony of a concise model structure that is determined using the forward orthogonal least-squares (OLS) algorithm. The PRESS prediction errors are used for cross-validation, and the sunspot and Canadian lynx time series are used to demonstrate the new algorithms.
Abstract:
This study investigated the potential application of mid-infrared spectroscopy (MIR 4,000–900 cm−1) for the determination of milk coagulation properties (MCP), titratable acidity (TA), and pH in Brown Swiss milk samples (n = 1,064). Because MCP directly influence the efficiency of the cheese-making process, there is strong industrial interest in developing a rapid method for their assessment. Currently, the determination of MCP involves time-consuming laboratory-based measurements, and it is not feasible to carry out these measurements on the large numbers of milk samples associated with milk recording programs. Mid-infrared spectroscopy is an objective and nondestructive technique providing rapid real-time analysis of food compositional and quality parameters. Analysis of milk rennet coagulation time (RCT, min), curd firmness (a30, mm), TA (SH°/50 mL; SH° = Soxhlet-Henkel degree), and pH was carried out, and MIR data were recorded over the spectral range of 4,000 to 900 cm−1. Models were developed by partial least squares regression using untreated and pretreated spectra. The MCP, TA, and pH prediction models were improved by using the combined spectral ranges of 1,600 to 900 cm−1, 3,040 to 1,700 cm−1, and 4,000 to 3,470 cm−1. The root mean square errors of cross-validation for the developed models were 2.36 min (RCT, range 24.9 min), 6.86 mm (a30, range 58 mm), 0.25 SH°/50 mL (TA, range 3.58 SH°/50 mL), and 0.07 (pH, range 1.15). The most successfully predicted attributes were TA, RCT, and pH. The model for the prediction of TA provided approximate prediction (R2 = 0.66), whereas the predictive models developed for RCT and pH could discriminate between high and low values (R2 = 0.59 to 0.62). It was concluded that, although the models require further development to improve their accuracy before their application in industry, MIR spectroscopy has potential application for the assessment of RCT, TA, and pH during routine milk analysis in the dairy industry. 
The implementation of such models could be a means of improving MCP through phenotypic-based selection programs and to amend milk payment systems to incorporate MCP into their payment criteria.
Abstract:
Current methods for estimating vegetation parameters are generally sub-optimal in the way they exploit information and do not generally consider uncertainties. We look forward to a future where operational data assimilation schemes improve estimates by tracking land surface processes and exploiting multiple types of observations. Data assimilation schemes seek to combine observations and models in a statistically optimal way taking into account uncertainty in both, but have not yet been much exploited in this area. The EO-LDAS scheme and prototype, developed under ESA funding, is designed to exploit the anticipated wealth of data that will be available under GMES missions, such as the Sentinel family of satellites, to provide improved mapping of land surface biophysical parameters. This paper describes the EO-LDAS implementation, and explores some of its core functionality. EO-LDAS is a weak constraint variational data assimilation system. The prototype provides a mechanism for constraint based on a prior estimate of the state vector, a linear dynamic model, and Earth Observation data (top-of-canopy reflectance here). The observation operator is a non-linear optical radiative transfer model for a vegetation canopy with a soil lower boundary, operating over the range 400 to 2500 nm. Adjoint codes for all model and operator components are provided in the prototype by automatic differentiation of the computer codes. In this paper, EO-LDAS is applied to the problem of daily estimation of six of the parameters controlling the radiative transfer operator over the course of a year (> 2000 state vector elements). Zero and first order process model constraints are implemented and explored as the dynamic model. The assimilation estimates all state vector elements simultaneously.
This is performed in the context of a typical Sentinel-2 MSI operating scenario, using synthetic MSI observations simulated with the observation operator, with uncertainties typical of those achieved by the optical sensors assumed for the data. The experiments consider a baseline state vector estimation case where dynamic constraints are applied, and assess the impact of dynamic constraints on the a posteriori uncertainties. The results demonstrate that reductions in uncertainty by a factor of up to two might be obtained by applying the sorts of dynamic constraints used here. The hyperparameters (dynamic model uncertainty) required to control the assimilation are estimated by a cross-validation exercise. The result of the assimilation is seen to be robust to missing observations with quite large data gaps.
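The effect of a dynamic constraint on a posteriori uncertainty can be sketched with a toy linear-Gaussian problem. This is not EO-LDAS: the observation operator here is the identity rather than a radiative transfer model, the state is a single parameter per day, and every number is invented. The dynamic term penalizes day-to-day differences, which corresponds to a zero-order process model (tomorrow equals today plus noise).

```python
import numpy as np

def posterior_variances(n_days, observed_days, obs_sigma, prior_sigma, dyn_sigma=None):
    """Diagonal of the posterior covariance for a linear-Gaussian toy problem.

    The posterior precision is the sum of the prior, observation and
    (optionally) dynamic-model precisions.
    """
    precision = np.eye(n_days) / prior_sigma**2
    for day in observed_days:
        precision[day, day] += 1.0 / obs_sigma**2   # direct observation of that day
    if dyn_sigma is not None:
        # First-difference operator: row t is x[t+1] - x[t].
        D = np.diff(np.eye(n_days), axis=0)
        precision += D.T @ D / dyn_sigma**2
    return np.diag(np.linalg.inv(precision))

n_days = 60
# Observations every 5 days, with a deliberate gap between day 15 and day 40.
observed = list(range(0, 20, 5)) + list(range(40, 60, 5))

without_dyn = posterior_variances(n_days, observed, obs_sigma=0.05, prior_sigma=0.5)
with_dyn = posterior_variances(n_days, observed, obs_sigma=0.05, prior_sigma=0.5,
                               dyn_sigma=0.1)

gap_day = 30  # inside the data gap
print("std without dynamics:", np.sqrt(without_dyn[gap_day]))
print("std with dynamics:   ", np.sqrt(with_dyn[gap_day]))
```

Without the dynamic term, unobserved days fall back to the prior uncertainty. With it, information propagates from neighbouring observations into the gap, and dyn_sigma plays the role of the hyperparameter that the abstract says is chosen by cross-validation.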
Abstract:
We present an efficient graph-based algorithm for quantifying the similarity of household-level energy use profiles, using a notion of similarity that allows for small time shifts when comparing profiles. Experimental results on a real smart meter data set demonstrate that in cases of practical interest our technique is far faster than the existing method for computing the same similarity measure. Having a fast algorithm for measuring profile similarity improves the efficiency of tasks such as clustering of customers and cross-validation of forecasting methods using historical data. Furthermore, we apply a generalisation of our algorithm to produce substantially better household-level energy use forecasts from historical smart meter data.
Abstract:
We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability specifically for imbalanced data classification problems based on leave-one-out (LOO) cross validation. The algorithms have two stages: first, an initial rule base is constructed by estimating the Gaussian mixture model with analysis of variance decomposition from input data; the second stage carries out joint weighted least squares parameter estimation and rule selection using an orthogonal forward subspace selection (OFSS) procedure. We show how different LOO based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under the curve of the receiver operating characteristics, or maximizing the leave-one-out F-measure if the data sets exhibit imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.
Abstract:
This paper presents the development and evaluation of a method for enabling quantitative and automatic scoring of alternating tapping performance of patients with Parkinson’s disease (PD). Ten healthy elderly subjects and 95 patients in different clinical stages of PD have utilized a touch-pad handheld computer to perform alternate tapping tests in their home environments. First, a neurologist used a web-based system to visually assess impairments in four tapping dimensions (‘speed’, ‘accuracy’, ‘fatigue’ and ‘arrhythmia’) and a global tapping severity (GTS). Second, tapping signals were processed with time series analysis and statistical methods to derive 24 quantitative parameters. Third, principal component analysis was used to reduce the dimensions of these parameters and to obtain scores for the four dimensions. Finally, a logistic regression classifier was trained using a 10-fold stratified cross-validation to map the reduced parameters to the corresponding visually assessed GTS scores. Results showed that the computed scores correlated well to visually assessed scores and were significantly different across Unified Parkinson’s Disease Rating Scale scores of upper limb motor performance. In addition, they had good internal consistency, had good ability to discriminate between healthy elderly and patients in different disease stages, had good sensitivity to treatment interventions and could reflect the natural disease progression over time. In conclusion, the automatic method can be useful to objectively assess the tapping performance of PD patients and can be included in telemedicine tools for remote monitoring of tapping.
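The stratified 10-fold split used for the classifier can be sketched in a few lines. This is only an illustration of what "stratified" means: each fold receives roughly the same share of every class. The label values and class sizes below are invented (the abstract does not give the distribution of GTS scores), and no shuffling is done.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign each sample to one of k folds, dealing samples class by class in
    round-robin order so that class proportions are roughly equal across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [None] * len(labels)
    next_fold = 0  # keeps running across classes so fold sizes stay balanced
    for label in sorted(by_class):
        for idx in by_class[label]:
            fold_of[idx] = next_fold % k
            next_fold += 1
    return fold_of

# Hypothetical score levels for 105 subjects (made-up class sizes).
labels = [0] * 10 + [1] * 30 + [2] * 30 + [3] * 20 + [4] * 15
folds = stratified_folds(labels, k=10)

fold_sizes = Counter(folds)
# Per-fold count of each class, e.g. class_counts[3][0] = samples of class 3 in fold 0.
class_counts = defaultdict(Counter)
for label, fold in zip(labels, folds):
    class_counts[label][fold] += 1
```

Each fold then serves once as the test set while the classifier is trained on the other nine. In practice a library routine with shuffling would normally be preferred over this hand-rolled version.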
Abstract:
In this study the effect of the cultivar on the volatile profile of five different banana varieties was evaluated and determined by dynamic headspace solid-phase microextraction (dHS-SPME) combined with one-dimensional gas chromatography–mass spectrometry (1D-GC–qMS). This approach allowed the definition of a volatile metabolite profile for each banana variety, which can be used as a pertinent criterion of differentiation. The investigated banana varieties (Dwarf Cavendish, Prata, Maçã, Ouro and Platano) have certified botanical origin and belong to the Musaceae family, the most common genomic group cultivated in Madeira Island (Portugal). The influence of dHS-SPME experimental factors, namely, fibre coating, extraction time and extraction temperature, on the equilibrium headspace analysis was investigated and optimised using univariate optimisation design. A total of 68 volatile organic metabolites (VOMs) were tentatively identified and used to profile the volatile composition in different banana cultivars, thus emphasising the sensitivity and applicability of SPME for establishment of the volatile metabolomic pattern of plant secondary metabolites. Ethyl esters were found to comprise the largest chemical class, accounting for 80.9%, 86.5%, 51.2%, 90.1% and 6.1% of total peak area for the Dwarf Cavendish, Prata, Ouro, Maçã and Platano volatile fractions, respectively. Gas chromatographic peak areas were submitted to multivariate statistical analysis (principal component and stepwise linear discriminant analysis) in order to visualise clusters within samples and to detect the volatile metabolites able to differentiate banana cultivars. The application of the multivariate analysis on the VOMs data set resulted in predictive abilities of 90% as evaluated by the cross-validation procedure.
Abstract:
This work presents the study and implementation of an adaptive bilinear compensated generalized predictive controller. It uses conventional predictive control techniques and adds adaptive control techniques for better results. In order to solve control problems frequently found in the chemical industry, bilinear models are considered to represent the dynamics of the studied systems. Bilinear models are simpler than general nonlinear models, yet they can represent the intrinsic nonlinearities of industrial processes. The model is linearized using the time-step quasilinear approach so that the equations of the generalized predictive controller (GPC) can be applied. This linearization, however, generates a prediction error, which is minimized through a compensation term. The term under study is implemented in adaptive form because of the nonlinear relationship between the input signal and the prediction error. Simulation results show the efficiency of the adaptive bilinear predictive controller in comparison with the conventional one.
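The source of the prediction error that the compensation term targets can be seen in a one-line model. The sketch below assumes a hypothetical first-order bilinear model y(k+1) = a*y(k) + b*u(k) + c*y(k)*u(k), and assumes that the quasilinear approximation freezes the input inside the bilinear term at its last known value. Both the model and the coefficient values are illustrative choices, not taken from the paper.

```python
def bilinear_step(y, u, a, b, c):
    """One step of the hypothetical bilinear model."""
    return a * y + b * u + c * y * u

def quasilinear_step(y, u, u_frozen, a, b, c):
    """Same step with the bilinear term linearized around a frozen input:
    the product c*y*u is replaced by c*y*u_frozen, giving a linear model
    whose pole a_tilde is held constant over the step."""
    a_tilde = a + c * u_frozen
    return a_tilde * y + b * u

a, b, c = 0.8, 0.5, 0.1           # illustrative coefficients
y_now, u_prev, u_new = 2.0, 1.0, 1.5

exact = bilinear_step(y_now, u_new, a, b, c)
approx = quasilinear_step(y_now, u_new, u_prev, a, b, c)
prediction_error = exact - approx  # equals c * y_now * (u_new - u_prev)
print(prediction_error)
```

The error vanishes when the input does not change and otherwise grows with both the output level and the input move, which is the nonlinear dependence on the input signal that motivates making the compensation term adaptive.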
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
This work presents a new approach to rainfall measurement that uses weather radar data, for real-time application to the radar systems operated by the Institute of Meteorological Research (IPMET) - UNESP - Bauru - SP - Brazil. Several real-time adjustment techniques have been presented, most of them based on surface rain-gauge networks. However, some of these methods do not take into account the effects of the integration area, the integration time and the rainfall-radar distance. In this paper, artificial neural networks are applied to generate a radar reflectivity-rain relationship that accounts for all of the effects described above. To evaluate the prediction procedure, cross validation was performed using data from the IPMET Doppler weather radar and the rain-gauge network under the radar umbrella. The preliminary results were acceptable for rainfall prediction. The small errors observed result from the spatial density and the time resolution of the rain-gauge networks used to calibrate the radar.
Abstract:
This paper presents necessary and sufficient conditions for the following problem: given a linear time invariant plant $G(s) = N(s)D(s)^{-1} = C(sI - A)^{-1}B$, with $m$ inputs, $p$ outputs, $p > m$, $\operatorname{rank}(C) = p$, $\operatorname{rank}(B) = \operatorname{rank}(CB) = m$, find a tandem dynamic controller $G_c(s) = D_c(s)^{-1}N_c(s) = C_c(sI - A_c)^{-1}B_c + D_c$, with $p$ inputs and $m$ outputs, and a constant output feedback matrix $K_o \in \mathbb{R}^{m \times p}$ such that the feedback system is Strictly Positive Real (SPR). It is shown that this problem has a solution if and only if all transmission zeros of the plant have negative real parts. When a solution exists, the proposed method first obtains $G_c(s)$ such that all transmission zeros of $G_c(s)G(s)$ have negative real parts, and then $K_o$ is found as the solution of some Linear Matrix Inequalities (LMIs). Then, taking this result into account, a new LMI based design for output Variable Structure Control (VSC) of uncertain dynamic plants is presented. The method can consider the following design specifications: matched disturbances or nonlinearities of the plant, output constraints, decay rate, and matched and nonmatched plant uncertainties. © 2006 IEEE.