964 resultados para prediction problems


Relevância:

70.00% 70.00%

Publicador:

Resumo:

In many prediction problems, including those that arise in computer security and computational finance, the process generating the data is best modelled as an adversary with whom the predictor competes. Even decision problems that are not inherently adversarial can be usefully modeled in this way, since the assumptions are sufficiently weak that effective prediction strategies for adversarial settings are very widely applicable.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In many prediction problems, including those that arise in computer security and computational finance, the process generating the data is best modelled as an adversary with whom the predictor competes. Even decision problems that are not inherently adversarial can be usefully modeled in this way, since the assumptions are sufficiently weak that effective prediction strategies for adversarial settings are very widely applicable.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both the log-linear and max-margin cases corresponds to minimizing a convex function with simplex constraints. We study both batch and online variants of the algorithm, and provide rates of convergence for both cases. In the max-margin case, O(1/ε) EG updates are required to reach a given accuracy ε in the dual; in contrast, for log-linear models only O(log(1/ε)) updates are required. For both the max-margin and log-linear cases, our bounds suggest that the online EG algorithm requires a factor of n less computation to reach a desired accuracy than the batch EG algorithm, where n is the number of training examples. Our experiments confirm that the online algorithms are much faster than the batch algorithms in practice. We describe how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to problems such as sequence learning or natural language parsing. We perform extensive evaluation of the algorithms, comparing them to L-BFGS and stochastic gradient descent for log-linear models, and to SVM-Struct for max-margin models. The algorithms are applied to a multi-class problem as well as to a more complex large-scale parsing task. In all these settings, the EG algorithms presented here outperform the other methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Online learning algorithms have recently risen to prominence due to their strong theoretical guarantees and an increasing number of practical applications for large-scale data analysis problems. In this paper, we analyze a class of online learning algorithms based on fixed potentials and nonlinearized losses, which yields algorithms with implicit update rules. We show how to efficiently compute these updates, and we prove regret bounds for the algorithms. We apply our formulation to several special cases where our approach has benefits over existing online learning methods. In particular, we provide improved algorithms and bounds for the online metric learning problem, and show improved robustness for online linear prediction problems. Results over a variety of data sets demonstrate the advantages of our framework.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the case where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the \ell_2 ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In many real world prediction problems the output is a structured object like a sequence or a tree or a graph. Such problems range from natural language processing to compu- tational biology or computer vision and have been tackled using algorithms, referred to as structured output learning algorithms. We consider the problem of structured classifi- cation. In the last few years, large margin classifiers like sup-port vector machines (SVMs) have shown much promise for structured output learning. The related optimization prob -lem is a convex quadratic program (QP) with a large num-ber of constraints, which makes the problem intractable for large data sets. This paper proposes a fast sequential dual method (SDM) for structural SVMs. The method makes re-peated passes over the training set and optimizes the dual variables associated with one example at a time. The use of additional heuristics makes the proposed method more efficient. We present an extensive empirical evaluation of the proposed method on several sequence learning problems.Our experiments on large data sets demonstrate that the proposed method is an order of magnitude faster than state of the art methods like cutting-plane method and stochastic gradient descent method (SGD). Further, SDM reaches steady state generalization performance faster than the SGD method. The proposed SDM is thus a useful alternative for large scale structured output learning.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Although it is widely believed that reinforcement learning is a suitable tool for describing behavioral learning, the mechanisms by which it can be implemented in networks of spiking neurons are not fully understood. Here, we show that different learning rules emerge from a policy gradient approach depending on which features of the spike trains are assumed to influence the reward signals, i.e., depending on which neural code is in effect. We use the framework of Williams (1992) to derive learning rules for arbitrary neural codes. For illustration, we present policy-gradient rules for three different example codes - a spike count code, a spike timing code and the most general "full spike train" code - and test them on simple model problems. In addition to classical synaptic learning, we derive learning rules for intrinsic parameters that control the excitability of the neuron. The spike count learning rule has structural similarities with established Bienenstock-Cooper-Munro rules. If the distribution of the relevant spike train features belongs to the natural exponential family, the learning rules have a characteristic shape that raises interesting prediction problems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART and supervised fuzzy ARTMAP synthesize fuzzy logic and ART networks by exploiting the formal similarity between the computations of fuzzy subsethood and the dynamics of ART category choice, search, and learning. Fuzzy ART self-organizes stable recognition categories in response to arbitrary sequences of analog or binary input patterns. It generalizes the binary ART 1 model, replacing the set-theoretic: intersection (∩) with the fuzzy intersection (∧), or component-wise minimum. A normalization procedure called complement coding leads to a symmetric: theory in which the fuzzy inter:>ec:tion and the fuzzy union (∨), or component-wise maximum, play complementary roles. Complement coding preserves individual feature amplitudes while normalizing the input vector, and prevents a potential category proliferation problem. Adaptive weights :otart equal to one and can only decrease in time. A geometric interpretation of fuzzy AHT represents each category as a box that increases in size as weights decrease. A matching criterion controls search, determining how close an input and a learned representation must be for a category to accept the input as a new exemplar. A vigilance parameter (p) sets the matching criterion and determines how finely or coarsely an ART system will partition inputs. High vigilance creates fine categories, represented by small boxes. Learning stops when boxes cover the input space. With fast learning, fixed vigilance, and an arbitrary input set, learning stabilizes after just one presentation of each input. A fast-commit slow-recode option allows rapid learning of rare events yet buffers memories against recoding by noisy inputs. Fuzzy ARTMAP unites two fuzzy ART networks to solve supervised learning and prediction problems. A Minimax Learning Rule controls ARTMAP category structure, conjointly minimizing predictive error and maximizing code compression. Low vigilance maximizes compression but may therefore cause very different inputs to make the same prediction. When this coarse grouping strategy causes a predictive error, an internal match tracking control process increases vigilance just enough to correct the error. ARTMAP automatically constructs a minimal number of recognition categories, or "hidden units," to meet accuracy criteria. An ARTMAP voting strategy improves prediction by training the system several times using different orderings of the input set. Voting assigns confidence estimates to competing predictions given small, noisy, or incomplete training sets. ARPA benchmark simulations illustrate fuzzy ARTMAP dynamics. The chapter also compares fuzzy ARTMAP to Salzberg's Nested Generalized Exemplar (NGE) and to Simpson's Fuzzy Min-Max Classifier (FMMC); and concludes with a summary of ART and ARTMAP applications.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Incorporating a prediction into future planning and decision making is advisable only if we have judged the prediction’s credibility. This is notoriously difficult and controversial in the case of predictions of future climate. By reviewing epistemic arguments about climate model performance, we discuss how to make and justify judgments about the credibility of climate predictions. We propose a new bounding argument that justifies basing such judgments on the past performance of possibly dissimilar prediction problems. This encourages a more explicit use of data in making quantitative judgments about the credibility of future climate predictions, and in training users of climate predictions to become better judges of credibility. We illustrate the approach using decadal predictions of annual mean, global mean surface air temperature.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this dissertation, different ways of combining neural predictive models or neural-based forecasts are discussed. The proposed approaches consider mostly Gaussian radial basis function networks, which can be efficiently identified and estimated through recursive/adaptive methods. Two different ways of combining are explored to get a final estimate – model mixing and model synthesis –, with the aim of obtaining improvements both in terms of efficiency and effectiveness. In the context of model mixing, the usual framework for linearly combining estimates from different models is extended, to deal with the case where the forecast errors from those models are correlated. In the context of model synthesis, and to address the problems raised by heavily nonstationary time series, we propose hybrid dynamic models for more advanced time series forecasting, composed of a dynamic trend regressive model (or, even, a dynamic harmonic regressive model), and a Gaussian radial basis function network. Additionally, using the model mixing procedure, two approaches for decision-making from forecasting models are discussed and compared: either inferring decisions from combined predictive estimates, or combining prescriptive solutions derived from different forecasting models. Finally, the application of some of the models and methods proposed previously is illustrated with two case studies, based on time series from finance and from tourism.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Reports 2 and 3 by E. Isaacson, J. J. Stoker, and A. Troesch.