45 results for Supervector kernel
Abstract:
Obtaining wind vectors over the ocean is important for weather forecasting and ocean modelling. Several satellite systems used operationally by meteorological agencies employ scatterometers to infer wind vectors over the oceans. In this paper we present the results of using novel neural-network-based techniques to estimate wind vectors from such data. The problem is partitioned into estimating wind speed and wind direction. Wind speed is modelled using a multi-layer perceptron (MLP) with a sum-of-squares error function. Wind direction is a periodic variable and a multi-valued function for a given set of inputs; a conventional MLP fails at this task, so we model the full periodic probability density of direction, conditioned on the satellite-derived inputs, using a Mixture Density Network (MDN) with periodic kernel functions. A committee of the resulting MDNs is shown to improve the results.
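Below is a minimal sketch, not the authors' code, of evaluating a periodic kernel mixture of the kind mentioned above: the conditional density of wind direction is written as a mixture of von Mises (circular) kernels whose mixing weights, mean directions and concentrations would in practice be produced by an MLP from the scatterometer-derived inputs; all parameter values here are illustrative.

```python
import numpy as np
from scipy.special import i0  # modified Bessel function of the first kind, order 0

def von_mises_mixture_pdf(theta, weights, means, concentrations):
    """Density of a mixture of von Mises kernels at angles theta (radians)."""
    theta = np.atleast_1d(theta)[:, None]                 # shape (T, 1)
    kernels = np.exp(concentrations * np.cos(theta - means)) / (
        2.0 * np.pi * i0(concentrations))                 # shape (T, M)
    return kernels @ weights                              # shape (T,)

# A bimodal direction density, reflecting the directional ambiguity of
# scatterometer data; weights, means and concentrations are illustrative only.
weights = np.array([0.6, 0.4])
means = np.array([0.5, 0.5 + np.pi])
concentrations = np.array([8.0, 8.0])
print(von_mises_mixture_pdf(np.linspace(-np.pi, np.pi, 7), weights, means, concentrations))
```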
Abstract:
Based on a simple convexity lemma, we develop bounds for different types of Bayesian prediction errors for regression with Gaussian processes. The basic bounds are formulated for a fixed training set. Simpler expressions are obtained for sampling from an input distribution which equals the weight function of the covariance kernel, yielding asymptotically tight results. The results are compared with numerical experiments.
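For orientation only (standard GP regression notation introduced here, not the paper's specific bounds): for a fixed training set $x_1,\dots,x_n$, covariance kernel $K$ and noise variance $\sigma^2$, the Bayesian prediction error at a test input $x$ is the posterior variance

$$\varepsilon(x) \;=\; K(x,x) \;-\; \mathbf{k}(x)^{\top}\bigl(\mathbf{K} + \sigma^{2}I\bigr)^{-1}\mathbf{k}(x), \qquad \mathbf{k}(x)_i = K(x,x_i), \quad \mathbf{K}_{ij} = K(x_i,x_j),$$

and the bounds described above concern such errors and their averages over training inputs drawn from a distribution matched to the kernel's weight function.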
Abstract:
Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.
Abstract:
We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on combining a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques based on the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for the propagation of both predictions and Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.
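A minimal sketch, with illustrative kernel, tolerance and data (not the authors' exact algorithm), of the sparsification idea described above: a new input is admitted to the representing subsample only if its RKHS "novelty" (the squared distance between its feature-space image and the span of the current basis) exceeds a tolerance; otherwise its contribution would be projected onto the existing basis.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between row-stacked inputs a and b."""
    a, b = np.atleast_2d(a), np.atleast_2d(b)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def select_basis(X, tol=1e-2):
    """Greedy basis-vector selection by an RKHS novelty test (illustrative)."""
    basis = [X[0]]
    for x in X[1:]:
        B = np.array(basis)
        Kbb = rbf(B, B) + 1e-10 * np.eye(len(basis))       # jitter for stability
        kbx = rbf(B, x[None, :])[:, 0]
        # novelty: gamma = k(x, x) - k_b(x)^T Kbb^{-1} k_b(x)
        gamma = rbf(x[None, :], x[None, :])[0, 0] - kbx @ np.linalg.solve(Kbb, kbx)
        if gamma > tol:
            basis.append(x)    # admit x as a new basis vector
        # else: in the full algorithm, project the update onto the current basis
    return np.array(basis)

X = np.random.default_rng(4).uniform(-3, 3, size=(500, 1))
print(f"{len(select_basis(X))} basis vectors retained out of {len(X)} inputs")
```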
Abstract:
In recent years there has been an increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation replaces the posterior with a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input, thus approximating a batch solution. The resulting sparse learning algorithm is a generic one: for different problems we only change the likelihood. The algorithm is applied to a variety of problems and we examine its performance both on classical regression and classification tasks and on data assimilation and a simple density estimation problem.
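A sketch of the parametrisation referred to above, written in notation introduced here rather than taken from the thesis: the posterior moments are expressed through combinations of the kernel at the training inputs,

$$\langle f(x)\rangle \;=\; \sum_{i=1}^{n} \alpha_i\, k(x, x_i), \qquad \operatorname{cov}\bigl(f(x), f(x')\bigr) \;=\; k(x, x') + \sum_{i,j=1}^{n} C_{ij}\, k(x, x_i)\, k(x', x_j),$$

so the effective parameters are the vector $\alpha$ and the matrix $C$, whose size grows quadratically with the number of training points; the BV-set sparsification restricts these sums to the basis vectors only.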
Abstract:
We have shown previously that a template model for edge perception successfully predicts perceived blur for a variety of edge profiles (Georgeson, 2001 Journal of Vision 1 438a; Barbieri-Hesse and Georgeson, 2002 Perception 31 Supplement, 54). This study concerns the perceived contrast of edges. Our model spatially differentiates the luminance profile, half-wave rectifies this first derivative, and then differentiates again to create the edge's 'signature'. The spatial scale of the signature is evaluated by filtering it with a set of Gaussian derivative operators. This process finds the correlation between the signature and each operator kernel at each position. These kernels therefore act as templates, and the position and scale of the best-fitting template indicate the position and blur of the edge. Our previous finding, that reducing edge contrast reduces perceived blur, can be explained by replacing the half-wave rectifier with a smooth, biased rectifier function (May and Georgeson, 2003 Perception 32 388; May and Georgeson, 2003 Perception 32 Supplement, 46). With the half-wave rectifier, the peak template response R to a Gaussian edge with contrast C and scale s is given by: R = Cπ^(-1/4)s^(-3/2). Hence, edge contrast can be estimated from response magnitude and blur: C = Rπ^(1/4)s^(3/2). Use of this equation with the modified rectifier predicts that perceived contrast will decrease with increasing blur, particularly at low contrasts. Contrast-matching experiments supported this prediction. In addition, the model correctly predicts the perceived contrast of Gaussian edges modified either by spatial truncation or by the addition of a ramp.
Abstract:
Typical properties of sparse random matrices over finite (Galois) fields are studied, in the limit of large matrices, using techniques from the physics of disordered systems. For the case of a finite field GF(q) with prime order q, we present results for the average kernel dimension, average dimension of the eigenvector spaces and the distribution of the eigenvalues. The number of matrices for a given distribution of entries is also calculated for the general case. The significance of these results to error-correcting codes and random graphs is also discussed.
Abstract:
This thesis is concerned with the measurement of the characteristics of nonlinear systems by crosscorrelation, using pseudorandom input signals based on m sequences. The systems are characterised by Volterra series, and analytical expressions relating the rth order Volterra kernel to r-dimensional crosscorrelation measurements are derived. It is shown that the two-dimensional crosscorrelation measurements are related to the corresponding second order kernel values by a set of equations which may be structured into a number of independent subsets. The m sequence properties determine how the maximum order of the subsets for off-diagonal values is related to the upper bound of the arguments for nonzero kernel values. The upper bound of the arguments is used as a performance index, and the performance of antisymmetric pseudorandom binary, ternary and quinary signals is investigated. The performance indices obtained above are small in relation to the periods of the corresponding signals. To achieve higher performance with ternary signals, a method is proposed for combining the estimates of the second order kernel values so that the effects of some of the undesirable nonzero values in the fourth order autocorrelation function of the input signal are removed. The identification of the dynamics of two-input, single-output systems with multiplicative nonlinearity is investigated. It is shown that the characteristics of such a system may be determined by crosscorrelation experiments using phase-shifted versions of a common signal as inputs. The effects of nonlinearities on the estimates of system weighting functions obtained by crosscorrelation are also investigated. Results obtained by correlation testing of an industrial process are presented, and the differences between theoretical and experimental results are discussed for this case.
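A minimal sketch of kernel measurement by crosscorrelation. The thesis works with antisymmetric pseudorandom (m-sequence based) inputs; here, purely for illustration, a white Gaussian input and the Lee-Schetzen correlation formulas are used to recover the first- and second-order kernels of a known quadratic system, so the signal design and the handling of undesirable autocorrelation terms discussed above are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, sigma2 = 200_000, 8, 1.0          # samples, kernel memory, input variance
x = rng.normal(0.0, np.sqrt(sigma2), N)

# "Unknown" system: y(t) = sum_i h1[i] x(t-i) + (sum_i g[i] x(t-i))**2
h1_true = np.array([1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0])
g = np.array([0.8, -0.4, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])
X = np.stack([np.roll(x, i) for i in range(M)], axis=1)   # lagged inputs
y = X @ h1_true + (X @ g) ** 2

# First-order kernel:  h1(tau)    ~ E[y(t) x(t-tau)] / sigma^2
h1_est = (y @ X) / (N * sigma2)

# Second-order kernel: h2(t1,t2)  ~ E[(y - E[y]) x(t-t1) x(t-t2)] / (2 sigma^4)
yc = y - y.mean()
h2_est = np.einsum('n,ni,nj->ij', yc, X, X) / (2 * N * sigma2**2)

print(np.round(h1_est, 2))              # compare with h1_true
print(np.round(h2_est, 2))              # compare with np.outer(g, g)
```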
Abstract:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscatter microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snap-shots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network, that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal, is developed. The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.
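One way to write the pragmatic Bayesian combination described above, in notation introduced here: with $\mathbf{u}_i$ the local wind vector at node $i$, $\mathbf{s}_i$ the corresponding scatterometer observations, and $\mathbf{U}$, $\mathbf{S}$ the full field and data, the posterior over the wind field is taken to be

$$p(\mathbf{U} \mid \mathbf{S}) \;\propto\; p(\mathbf{U}) \prod_{i} \frac{p(\mathbf{u}_i \mid \mathbf{s}_i)}{p_0(\mathbf{u}_i)},$$

where the $p(\mathbf{u}_i \mid \mathbf{s}_i)$ are the mixture density network outputs, $p_0$ is the prior implicit in their training, and $p(\mathbf{U})$ is the vector Gaussian process prior that supplies geophysical consistency.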
Abstract:
This paper provides the most comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two non-linear techniques, namely recurrent neural networks and kernel recursive least squares regression - techniques that are new to macroeconomics. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite-memory predictor. The two methodologies compete to find the best-fitting US inflation forecasting models and are then compared with forecasts from a naive random-walk model. The best models were non-linear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation.
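A minimal sketch of a finite-memory kernel forecaster in the spirit described above (this is not the paper's exact kernel recursive least squares estimator; the data, window sizes and hyperparameters are illustrative): at each step a kernel ridge regression is fitted on a sliding window of lagged values and used to predict one step ahead, with a random walk as the benchmark.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)
T, n_lags, window = 400, 4, 120
y = np.cumsum(rng.normal(0, 0.1, T)) + 0.5 * np.sin(np.arange(T) / 10)  # synthetic series

def lag_matrix(series, n_lags):
    """Rows of n_lags consecutive values paired with the value that follows them."""
    X = np.stack([series[i:len(series) - n_lags + i] for i in range(n_lags)], axis=1)
    return X, series[n_lags:]

preds, actuals, naive_errs = [], [], []
for t in range(window + n_lags, T - 1):
    X_tr, y_tr = lag_matrix(y[t - window - n_lags:t], n_lags)
    model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5).fit(X_tr, y_tr)
    preds.append(model.predict(y[t - n_lags:t][None, :])[0])
    actuals.append(y[t])
    naive_errs.append(y[t] - y[t - 1])          # random-walk ("no change") forecast error

rmse = np.sqrt(np.mean((np.array(preds) - np.array(actuals)) ** 2))
naive_rmse = np.sqrt(np.mean(np.array(naive_errs) ** 2))
print(f"kernel forecaster RMSE: {rmse:.4f}   random-walk RMSE: {naive_rmse:.4f}")
```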
Abstract:
We present an assessment of the practical value of existing traditional and non-standard measures for discriminating healthy people from people with Parkinson's disease (PD) by detecting dysphonia. We introduce a new measure of dysphonia, Pitch Period Entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected 10 highly uncorrelated measures, and an exhaustive search of all possible combinations of these measures finds four that in combination lead to overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that non-standard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected non-standard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well-suited to telemonitoring applications.
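A minimal sketch of the selection-plus-classification step described above, using synthetic data and illustrative settings (not the study's features or recordings): every 4-feature subset of 10 candidate measures is scored with an RBF-kernel support vector machine under cross-validation and the best subset is reported.

```python
from itertools import combinations
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_subjects, n_features = 60, 10
X = rng.normal(size=(n_subjects, n_features))                  # stand-in dysphonia measures
y = (X[:, 0] + 0.8 * X[:, 3] - 0.6 * X[:, 7]
     + rng.normal(0, 0.5, n_subjects) > 0).astype(int)         # stand-in PD / healthy labels

best_score, best_subset = -np.inf, None
for subset in combinations(range(n_features), 4):              # exhaustive 4-feature search
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    score = cross_val_score(clf, X[:, list(subset)], y, cv=5).mean()
    if score > best_score:
        best_score, best_subset = score, subset

print(f"best 4-feature subset: {best_subset}, cross-validated accuracy: {best_score:.3f}")
```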
Abstract:
The literature discusses several methods to control for self-selection effects but provides little guidance on which method to use in a setting with a limited number of variables. The authors theoretically compare and empirically assess the performance of different matching methods and instrumental variable and control function methods in this type of setting by investigating the effect of online banking on product usage. Hybrid matching in combination with the Gaussian kernel algorithm outperforms the other methods with respect to predictive validity. The empirical finding of large self-selection effects indicates the importance of controlling for these effects when assessing the effectiveness of marketing activities.
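A minimal sketch, on synthetic data, of Gaussian-kernel matching on the propensity score, the kind of kernel-based matching referred to above (the bandwidth, model and data-generating process are illustrative only): each treated unit's counterfactual outcome is a kernel-weighted average of control outcomes, with weights decaying in propensity-score distance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=(n, 3))                                      # observed covariates
treat = (x @ np.array([0.8, -0.5, 0.3]) + rng.normal(0, 1, n) > 0).astype(int)
outcome = 1.5 * treat + x @ np.array([1.0, 0.5, -0.7]) + rng.normal(0, 1, n)

ps = LogisticRegression().fit(x, treat).predict_proba(x)[:, 1]   # propensity scores
bandwidth = 0.06                                                 # illustrative choice

treated, controls = np.where(treat == 1)[0], np.where(treat == 0)[0]
att_terms = []
for i in treated:
    w = np.exp(-0.5 * ((ps[i] - ps[controls]) / bandwidth) ** 2)  # Gaussian kernel weights
    att_terms.append(outcome[i] - np.average(outcome[controls], weights=w))

print(f"kernel-matching ATT estimate: {np.mean(att_terms):.3f} (true effect in this simulation: 1.5)")
```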
Abstract:
Agent-based technology is playing an increasingly important role in today’s economy. Usually a multi-agent system is needed to model an economic system such as a market system, in which heterogeneous trading agents interact with each other autonomously. Two questions often need to be answered regarding such systems: 1) How to design an interaction mechanism that facilitates efficient resource allocation among usually self-interested trading agents? 2) How to design an effective strategy, within a specific market mechanism, for an agent to maximise its economic returns? For automated market systems, auctions are the most popular mechanism for solving resource allocation problems among their participants. However, auctions come in hundreds of different formats, some of which are better than others in terms of not only allocative efficiency but also other properties, e.g., whether they generate high revenue for the auctioneer and whether they induce stable bidder behaviour. In addition, different strategies result in very different performance under the same auction rules. Against this background, we investigate auction mechanism and strategy design for agent-based economics. The international Trading Agent Competition (TAC) Ad Auction (AA) competition provides a very useful platform to develop and test agent strategies in Generalised Second Price (GSP) auctions. AstonTAC, the runner-up of TAC AA 2009, is a successful advertiser agent designed for GSP-based keyword auctions. In particular, AstonTAC generates adaptive bid prices according to the Market-based Value Per Click and selects the set of keyword queries with the highest expected profit to bid on, maximising its expected profit under the limit of conversion capacity. Through evaluation experiments, we show that AstonTAC performs well and stably not only in the competition but also across a broad range of environments. The TAC CAT tournament provides an environment for investigating the optimal design of mechanisms for double auction markets. AstonCAT-Plus is the post-tournament version of the specialist developed for CAT 2010. In our experiments, AstonCAT-Plus not only outperforms most specialist agents designed by other institutions but also achieves high allocative efficiencies, transaction success rates and average trader profits. Moreover, we reveal some insights into the CAT game: 1) successful markets should maintain a stable and high market share of intra-marginal traders; 2) a specialist’s performance depends on the distribution of trading strategies. However, typical double auction models assume trading agents have a fixed trading direction of either buy or sell. With this limitation they cannot directly reflect the fact that traders in financial markets (the most popular application of the double auction) decide their trading directions dynamically. To address this issue, we introduce the Bi-directional Double Auction (BDA) market, which is populated by two-way traders. Experiments are conducted under both dynamic and static settings of the continuous BDA market. We find that the allocative efficiency of a continuous BDA market mainly comes from rational selection of trading directions. Furthermore, we introduce a high-performance Kernel trading strategy in the BDA market which uses a kernel probability density estimator built on historical transaction data to decide optimal order prices. The Kernel trading strategy outperforms some popular intelligent double auction trading strategies, including ZIP, GD and RE, in the continuous BDA market by making the highest profit in static games and obtaining the best wealth in dynamic games.
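A minimal sketch of a KDE-based order-pricing rule in the spirit of the Kernel strategy described above (the objective below is an illustrative proxy introduced here, not the thesis's exact formulation): the density of recent transaction prices is estimated with a Gaussian kernel density estimator and, for a buyer with private value `limit`, the bid maximising (limit - price) weighted by the estimated transaction density at that price is chosen.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
history = rng.normal(loc=100.0, scale=3.0, size=500)      # past transaction prices (synthetic)
kde = gaussian_kde(history)                               # kernel density estimate of prices

limit = 104.0                                             # buyer's private value
candidates = np.linspace(history.min(), limit, 200)       # bids never exceed the limit
expected_gain = (limit - candidates) * kde(candidates)    # proxy: surplus x transaction density
bid = candidates[np.argmax(expected_gain)]
print(f"chosen bid: {bid:.2f} (limit price {limit})")
```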