19 results for Priors

in Aston University Research Archive


Relevance:

20.00%

Abstract:

We propose a Bayesian framework for regression problems, which covers areas which are usually dealt with by function approximation. An online learning algorithm is derived which solves regression problems with a Kalman filter. Its solution always improves with increasing model complexity, without the risk of over-fitting. In the infinite dimension limit it approaches the true Bayesian posterior. The issues of prior selection and over-fitting are also discussed, showing that some of the commonly held beliefs are misleading. The practical implementation is summarised. Simulations using 13 popular publicly available data sets are used to demonstrate the method and highlight important issues concerning the choice of priors.
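As a rough illustration of online regression with a Kalman filter, the sketch below assumes a linear-in-parameters model with fixed basis functions, static weights, a Gaussian prior and Gaussian observation noise. These modelling choices and the function name `kalman_regression_step` are assumptions made for illustration; the abstract does not specify the model actually used.

```python
import numpy as np

def kalman_regression_step(mean, cov, phi, y, noise_var):
    """One Kalman update of the Gaussian belief over regression weights.

    mean, cov : current posterior mean (d,) and covariance (d, d) of the weights
    phi       : basis-function vector (d,) for the new input
    y         : observed scalar target
    noise_var : assumed observation-noise variance
    """
    # Predictive variance of the new observation under the current belief.
    s = phi @ cov @ phi + noise_var
    # Kalman gain: how strongly the prediction error moves the weights.
    gain = cov @ phi / s
    # Correct the mean by the gain times the prediction error.
    new_mean = mean + gain * (y - phi @ mean)
    # Shrink the covariance along the direction just observed.
    new_cov = cov - np.outer(gain, phi @ cov)
    return new_mean, new_cov
```

Under these assumptions, processing the data one point at a time gives the same posterior as a single batch Bayesian linear regression, which is why no separate training pass is needed.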

Relevance:

20.00%

Abstract:

We explore the dependence of performance measures, such as the generalization error and generalization consistency, on the structure and the parameterization of the prior on 'rules', instanced here by the noisy linear perceptron. Using a statistical mechanics framework, we show how one may assign values to the parameters of a model for a 'rule' on the basis of data instancing the rule. Information about the data, such as the input distribution, the noise distribution and other 'rule' characteristics, may be embedded in the form of general Gaussian priors for improving net performance. We examine explicitly two types of general Gaussian priors which are useful in some simple cases. We calculate the optimal values for the parameters of these priors and show their effect in modifying the most probable (MAP) values for the rules.

Relevance:

20.00%

Abstract:

This report seeks to make concrete some of the ideas we have been discussing about sensible priors for winds over the ocean. In particular, random field models are reviewed, as are permissible covariance functions. The criteria which these covariance functions must satisfy in order that vorticity and divergence exist and are continuous are defined. The use of the Helmholtz theorem is discussed, and possible choices for the covariances are suggested.

Relevance:

20.00%

Abstract:

Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursuit, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.

Relevance:

20.00%

Abstract:

The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we contribute to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Indeed, ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks which require the ability to describe and explore the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time t + 1, using only data available at time t. Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers belonging to the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.

Relevance:

10.00%

Abstract:

Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be put under this unified framework, by assuming special priors and special divergences.
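One familiar special case may help make this concrete. For multinomial counts with a Dirichlet prior, the estimate that minimises the posterior-averaged Kullback-Leibler divergence from the true distribution to the estimate is the posterior mean. The sketch below computes it; the function name is hypothetical, and the choice of this particular divergence is an assumption, since the abstract covers a wider family.

```python
def dirichlet_posterior_mean(counts, alpha):
    """Posterior-mean estimate of multinomial probabilities.

    counts : observed count for each category
    alpha  : Dirichlet prior parameter (pseudo-count) for each category
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    # The posterior is Dirichlet(alpha + counts); its mean normalises
    # the combined real counts and prior pseudo-counts.
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]
```

With every `alpha` equal to 1 this reduces to Laplace's add-one smoothing, an example of a commonly used method that corresponds to a particular prior.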

Relevance:

10.00%

Abstract:

The Bayesian analysis of neural networks is difficult because the prior over functions has a complex form, leading to implementations that either make approximations or use Monte Carlo integration techniques. In this paper I investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis to be carried out exactly using matrix operations. The method has been tested on two challenging problems and has produced excellent results.
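The "exact matrix operations" can be sketched as follows for one-dimensional inputs. The squared-exponential covariance, the fixed hyperparameter values and the function names are assumptions made for illustration; the abstract does not say which covariance function was used.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    """Predictive mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train, length_scale)
    K += noise_var * np.eye(len(x_train))          # noisy training covariance
    K_s = rbf_kernel(x_train, x_test, length_scale)
    K_ss = rbf_kernel(x_test, x_test, length_scale)

    # A Cholesky factorisation gives stable solves without forming K^{-1}.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))

    mean = K_s.T @ alpha                           # k_*^T K^{-1} y
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                           # k_** - k_*^T K^{-1} k_*
    return mean, cov
```

The cost is dominated by the Cholesky factorisation, which scales cubically with the number of training points.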

Relevance:

10.00%

Abstract:

The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters, have been tested on a number of challenging problems and have produced excellent results.

Relevance:

10.00%

Abstract:

For neural networks with a wide class of weight-priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to compute with infinite networks than finite ones.

Relevance:

10.00%

Abstract:

The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.
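The change of viewpoint can be checked numerically. The sketch below assumes a two-feature linear model (bias and slope) with an isotropic Gaussian prior on the weights; the feature map, hyperparameter values and function names are illustrative assumptions rather than the tutorial's own notation.

```python
import numpy as np

def features(x):
    """Map 1-D inputs to the assumed basis functions [1, x]."""
    return np.stack([np.ones_like(x), x], axis=1)

def weight_space_mean(x_train, y_train, x_test, prior_var=1.0, noise_var=0.1):
    """Predictive mean from the Gaussian posterior over the weights."""
    Phi = features(x_train)
    # Posterior precision of the weights: data term plus prior term.
    A = Phi.T @ Phi / noise_var + np.eye(Phi.shape[1]) / prior_var
    w_mean = np.linalg.solve(A, Phi.T @ y_train / noise_var)
    return features(x_test) @ w_mean

def function_space_mean(x_train, y_train, x_test, prior_var=1.0, noise_var=0.1):
    """The same prediction written as a GP with a linear covariance."""
    Phi, Phi_s = features(x_train), features(x_test)
    # Covariance induced by the weight prior: k(x, x') = prior_var * phi(x) . phi(x')
    K = prior_var * Phi @ Phi.T
    K_s = prior_var * Phi_s @ Phi.T
    return K_s @ np.linalg.solve(K + noise_var * np.eye(len(x_train)), y_train)
```

Both functions return the same predictive mean up to rounding error. The first solves a system the size of the feature vector, the second a system the size of the training set.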

Relevance:

10.00%

Abstract:

For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units, the prior over functions tends to a Gaussian process. In this article, analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units and shows, somewhat paradoxically, that it may be easier to carry out Bayesian prediction with infinite networks rather than finite ones.

Relevance:

10.00%

Abstract:

Models of visual motion processing that introduce priors for low speed through Bayesian computations are sometimes treated with scepticism by empirical researchers because of the convenient way in which parameters of the Bayesian priors have been chosen. Using the effects of motion adaptation on motion perception to illustrate, we show that the Bayesian prior, far from being convenient, may be estimated on-line and therefore represents a useful tool by which visual motion processes may be optimized in order to extract the motion signals commonly encountered in everyday experience. The prescription for optimization, when combined with system constraints on the transmission of visual information, may lead to an exaggeration of perceptual bias through the process of adaptation. Our approach extends the Bayesian model of visual motion proposed by Weiss et al. [Weiss, Y., Simoncelli, E., & Adelson, E. (2002). Motion illusions as optimal perception. Nature Neuroscience, 5:598-604.], in suggesting that perceptual bias reflects a compromise taken by a rational system in the face of uncertain signals and system constraints. © 2007.

Relevance:

10.00%

Abstract:

This study examines the forecasting accuracy of alternative vector autoregressive models, each in a seven-variable system that comprises, in turn, daily, weekly and monthly foreign exchange (FX) spot rates. The vector autoregressions (VARs) are in non-stationary, stationary and error-correction forms and are estimated using OLS. The imposition of Bayesian priors in the OLS estimations also allowed us to obtain another set of results. We find that there is some tendency for the Bayesian estimation method to generate superior forecast measures relative to the OLS method. This result holds whether or not the data sets contain outliers. Also, the best forecasts under the non-stationary specification outperformed those of the stationary and error-correction specifications, particularly at long forecast horizons, while the best forecasts under the stationary and error-correction specifications are generally similar. The findings for the OLS forecasts are consistent with recent simulation results. The predictive ability of the VARs is very weak.

Relevance:

10.00%

Abstract:

The retrieval of wind vectors from satellite scatterometer observations is a non-linear inverse problem. A common approach to solving inverse problems is to adopt a Bayesian framework and to infer the posterior distribution of the parameters of interest given the observations by using a likelihood model relating the observations to the parameters, and a prior distribution over the parameters. We show how Gaussian process priors can be used efficiently with a variety of likelihood models, using local forward (observation) models and direct inverse models for the scatterometer. We present an enhanced Markov chain Monte Carlo method to sample from the resulting multimodal posterior distribution. We go on to show how the computational complexity of the inference can be controlled by using a sparse, sequential Bayes algorithm for estimation with Gaussian processes. This helps to overcome the most serious barrier to the use of probabilistic, Gaussian process methods in remote sensing inverse problems, which is the prohibitively large size of the data sets. We contrast the sampling results with the approximations that are found by using the sparse, sequential Bayes algorithm.

Relevance:

10.00%

Abstract:

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called the joint sentiment-topic (JST) model, based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. Besides, unlike supervised approaches to sentiment classification, which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains, where the JST model even outperforms existing semi-supervised approaches on some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.