134 resultados para Error estimate.
em Queensland University of Technology - ePrints Archive
Resumo:
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
Resumo:
We study Krylov subspace methods for approximating the matrix-function vector product φ(tA)b where φ(z) = [exp(z) - 1]/z. This product arises in the numerical integration of large stiff systems of differential equations by the Exponential Euler Method, where A is the Jacobian matrix of the system. Recently, this method has found application in the simulation of transport phenomena in porous media within mathematical models of wood drying and groundwater flow. We develop an a posteriori upper bound on the Krylov subspace approximation error and provide a new interpretation of a previously published error estimate. This leads to an alternative Krylov approximation to φ(tA)b, the so-called Harmonic Ritz approximant, which we find does not exhibit oscillatory behaviour of the residual error.
Resumo:
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A3 √((log n)/m) (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
Resumo:
For the timber industry, the ability to simulate the drying of wood is invaluable for manufacturing high quality wood products. Mathematically, however, modelling the drying of a wet porous material, such as wood, is a diffcult task due to its heterogeneous and anisotropic nature, and the complex geometry of the underlying pore structure. The well{ developed macroscopic modelling approach involves writing down classical conservation equations at a length scale where physical quantities (e.g., porosity) can be interpreted as averaged values over a small volume (typically containing hundreds or thousands of pores). This averaging procedure produces balance equations that resemble those of a continuum with the exception that effective coeffcients appear in their deffnitions. Exponential integrators are numerical schemes for initial value problems involving a system of ordinary differential equations. These methods differ from popular Newton{Krylov implicit methods (i.e., those based on the backward differentiation formulae (BDF)) in that they do not require the solution of a system of nonlinear equations at each time step but rather they require computation of matrix{vector products involving the exponential of the Jacobian matrix. Although originally appearing in the 1960s, exponential integrators have recently experienced a resurgence in interest due to a greater undertaking of research in Krylov subspace methods for matrix function approximation. One of the simplest examples of an exponential integrator is the exponential Euler method (EEM), which requires, at each time step, approximation of φ(A)b, where φ(z) = (ez - 1)/z, A E Rnxn and b E Rn. For drying in porous media, the most comprehensive macroscopic formulation is TransPore [Perre and Turner, Chem. Eng. J., 86: 117-131, 2002], which features three coupled, nonlinear partial differential equations. The focus of the first part of this thesis is the use of the exponential Euler method (EEM) for performing the time integration of the macroscopic set of equations featured in TransPore. In particular, a new variable{ stepsize algorithm for EEM is presented within a Krylov subspace framework, which allows control of the error during the integration process. The performance of the new algorithm highlights the great potential of exponential integrators not only for drying applications but across all disciplines of transport phenomena. For example, when applied to well{ known benchmark problems involving single{phase liquid ow in heterogeneous soils, the proposed algorithm requires half the number of function evaluations than that required for an equivalent (sophisticated) Newton{Krylov BDF implementation. Furthermore for all drying configurations tested, the new algorithm always produces, in less computational time, a solution of higher accuracy than the existing backward Euler module featured in TransPore. Some new results relating to Krylov subspace approximation of '(A)b are also developed in this thesis. Most notably, an alternative derivation of the approximation error estimate of Hochbruck, Lubich and Selhofer [SIAM J. Sci. Comput., 19(5): 1552{1574, 1998] is provided, which reveals why it performs well in the error control procedure. Two of the main drawbacks of the macroscopic approach outlined above include the effective coefficients must be supplied to the model, and it fails for some drying configurations, where typical dual{scale mechanisms occur. In the second part of this thesis, a new dual{scale approach for simulating wood drying is proposed that couples the porous medium (macroscale) with the underlying pore structure (microscale). The proposed model is applied to the convective drying of softwood at low temperatures and is valid in the so{called hygroscopic range, where hygroscopically held liquid water is present in the solid phase and water exits only as vapour in the pores. Coupling between scales is achieved by imposing the macroscopic gradient on the microscopic field using suitably defined periodic boundary conditions, which allows the macroscopic ux to be defined as an average of the microscopic ux over the unit cell. This formulation provides a first step for moving from the macroscopic formulation featured in TransPore to a comprehensive dual{scale formulation capable of addressing any drying configuration. Simulation results reported for a sample of spruce highlight the potential and flexibility of the new dual{scale approach. In particular, for a given unit cell configuration it is not necessary to supply the effective coefficients prior to each simulation.
Resumo:
In this paper, a new alternating direction implicit Galerkin--Legendre spectral method for the two-dimensional Riesz space fractional nonlinear reaction-diffusion equation is developed. The temporal component is discretized by the Crank--Nicolson method. The detailed implementation of the method is presented. The stability and convergence analysis is strictly proven, which shows that the derived method is stable and convergent of order $2$ in time. An optimal error estimate in space is also obtained by introducing a new orthogonal projector. The present method is extended to solve the fractional FitzHugh--Nagumo model. Numerical results are provided to verify the theoretical analysis.
Resumo:
Objectives Currently, there are no studies combining electromyography (EMG) and sonography to estimate the absolute and relative strength values of erector spinae (ES) muscles in healthy individuals. The purpose of this study was to establish whether the maximum voluntary contraction (MVC) of the ES during isometric contractions could be predicted from the changes in surface EMG as well as in fiber pennation and thickness as measured by sonography. Methods Thirty healthy adults performed 3 isometric extensions at 45° from the vertical to calculate the MVC force. Contractions at 33% and 100% of the MVC force were then used during sonographic and EMG recordings. These measurements were used to observe the architecture and function of the muscles during contraction. Statistical analysis was performed using bivariate regression and regression equations. Results The slope for each regression equation was statistically significant (P < .001) with R2 values of 0.837 and 0.986 for the right and left ES, respectively. The standard error estimate between the sonographic measurements and the regression-estimated pennation angles for the right and left ES were 0.10 and 0.02, respectively. Conclusions Erector spinae muscle activation can be predicted from the changes in fiber pennation during isometric contractions at 33% and 100% of the MVC force. These findings could be essential for developing a regression equation that could estimate the level of muscle activation from changes in the muscle architecture.
Resumo:
An algorithm based on the concept of combining Kalman filter and Least Error Square (LES) techniques is proposed in this paper. The algorithm is intended to estimate signal attributes like amplitude, frequency and phase angle in the online mode. This technique can be used in protection relays, digital AVRs, DGs, DSTATCOMs, FACTS and other power electronics applications. The Kalman filter is modified to operate on a fictitious input signal and provides precise estimation results insensitive to noise and other disturbances. At the same time, the LES system has been arranged to operate in critical transient cases to compensate the delay and inaccuracy identified because of the response of the standard Kalman filter. Practical considerations such as the effect of noise, higher order harmonics, and computational issues of the algorithm are considered and tested in the paper. Several computer simulations and a laboratory test are presented to highlight the usefulness of the proposed method. Simulation results show that the proposed technique can simultaneously estimate the signal attributes, even if it is highly distorted due to the presence of non-linear loads and noise.
Resumo:
We consider complexity penalization methods for model selection. These methods aim to choose a model to optimally trade off estimation and approximation errors by minimizing the sum of an empirical risk term and a complexity penalty. It is well known that if we use a bound on the maximal deviation between empirical and true risks as a complexity penalty, then the risk of our choice is no more than the approximation error plus twice the complexity penalty. There are many cases, however, where complexity penalties like this give loose upper bounds on the estimation error. In particular, if we choose a function from a suitably simple convex function class with a strictly convex loss function, then the estimation error (the difference between the risk of the empirical risk minimizer and the minimal risk in the class) approaches zero at a faster rate than the maximal deviation between empirical and true risks. In this paper, we address the question of whether it is possible to design a complexity penalized model selection method for these situations. We show that, provided the sequence of models is ordered by inclusion, in these cases we can use tight upper bounds on estimation error as a complexity penalty. Surprisingly, this is the case even in situations when the difference between the empirical risk and true risk (and indeed the error of any estimate of the approximation error) decreases much more slowly than the complexity penalty. We give an oracle inequality showing that the resulting model selection method chooses a function with risk no more than the approximation error plus a constant times the complexity penalty.
Resumo:
Bayesian networks (BNs) are graphical probabilistic models used for reasoning under uncertainty. These models are becoming increasing popular in a range of fields including ecology, computational biology, medical diagnosis, and forensics. In most of these cases, the BNs are quantified using information from experts, or from user opinions. An interest therefore lies in the way in which multiple opinions can be represented and used in a BN. This paper proposes the use of a measurement error model to combine opinions for use in the quantification of a BN. The multiple opinions are treated as a realisation of measurement error and the model uses the posterior probabilities ascribed to each node in the BN which are computed from the prior information given by each expert. The proposed model addresses the issues associated with current methods of combining opinions such as the absence of a coherent probability model, the lack of the conditional independence structure of the BN being maintained, and the provision of only a point estimate for the consensus. The proposed model is applied an existing Bayesian Network and performed well when compared to existing methods of combining opinions.