823 results for Information display systems.
Abstract:
We study the rates of growth of the regret in online convex optimization. First, we show that a simple extension of the algorithm of Hazan et al. eliminates the need for a priori knowledge of the lower bound on the second derivatives of the observed functions. We then provide an algorithm, Adaptive Online Gradient Descent, which interpolates between the results of Zinkevich for linear functions and of Hazan et al. for strongly convex functions, achieving intermediate rates between √T and log T. Furthermore, we show strong optimality of the algorithm. Finally, we provide an extension of our results to general norms.
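To make the adaptive step-size idea concrete, the following is a minimal sketch of online projected gradient descent whose per-round step size adapts to the curvature observed so far, interpolating between a 1/√t schedule for (near-)linear losses and a 1/(cumulative curvature) schedule for strongly convex ones. The function names, the projection onto a Euclidean ball, and the toy quadratic losses are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def adaptive_ogd(loss_grads, curvatures, dim, radius=1.0):
    """Online gradient descent with a step size adapted to observed curvature.

    loss_grads : list of callables, loss_grads[t](w) -> gradient of f_t at w
    curvatures : list of floats, lower bounds H_t on the curvature of f_t
                 (H_t = 0 gives the ~sqrt(T) regime, H_t > 0 the ~log T regime)
    """
    w = np.zeros(dim)
    iterates = []
    cum_curv = 0.0
    for t, (grad, H_t) in enumerate(zip(loss_grads, curvatures), start=1):
        iterates.append(w.copy())
        g = grad(w)
        cum_curv += H_t
        # Illustrative step size: 1/(cumulative curvature) when curvature is present,
        # falling back to a 1/sqrt(t) schedule for (near-)linear losses.
        eta = 1.0 / cum_curv if cum_curv > 0 else radius / np.sqrt(t)
        w = w - eta * g
        # Project back onto the Euclidean ball of the given radius.
        norm = np.linalg.norm(w)
        if norm > radius:
            w = w * (radius / norm)
    return iterates

# Toy usage: strongly convex quadratic losses f_t(w) = ||w - x_t||^2 / 2.
rng = np.random.default_rng(0)
xs = rng.normal(size=(100, 3)) * 0.1
grads = [lambda w, x=x: w - x for x in xs]
curvs = [1.0] * len(xs)
iterates = adaptive_ogd(grads, curvs, dim=3)
```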
Abstract:
Most standard algorithms for prediction with expert advice depend on a parameter called the learning rate. This learning rate needs to be large enough to fit the data well, but small enough to prevent overfitting. For the exponential weights algorithm, a sequence of prior work has established theoretical guarantees for higher and higher data-dependent tunings of the learning rate, which allow for increasingly aggressive learning. But in practice such theoretical tunings often still perform worse (as measured by their regret) than ad hoc tuning with an even higher learning rate. To close the gap between theory and practice we introduce an approach to learn the learning rate. Up to a factor that is at most (poly)logarithmic in the number of experts and the inverse of the learning rate, our method performs as well as if we had known the empirically best learning rate from a large range that includes both conservative small values and values that are much higher than those for which formal guarantees were previously available. Our method employs a grid of learning rates, yet runs in linear time regardless of the size of the grid.
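As a toy illustration of the tuning problem (not the linear-time method of the abstract), the sketch below runs the exponential weights algorithm separately for each learning rate on a grid and reports the empirically best one in hindsight. The loss range, the grid, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def hedge_loss(loss_matrix, eta):
    """Cumulative loss of exponential weights with a fixed learning rate eta.

    loss_matrix : (T, K) array of expert losses in [0, 1].
    """
    T, K = loss_matrix.shape
    log_w = np.zeros(K)
    total = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        total += p @ loss_matrix[t]      # learner's expected loss this round
        log_w -= eta * loss_matrix[t]    # exponential-weights update
    return total

# Toy data: 200 rounds, 5 experts with Bernoulli losses of different means.
rng = np.random.default_rng(1)
losses = rng.binomial(1, [0.2, 0.35, 0.5, 0.5, 0.5], size=(200, 5)).astype(float)

# Grid of learning rates, from conservative to aggressive.
grid = [2.0 ** k for k in range(-6, 5)]
results = {eta: hedge_loss(losses, eta) for eta in grid}
best_eta = min(results, key=results.get)
print(f"empirically best learning rate on this sequence: {best_eta}")
```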
Abstract:
We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the cases where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the \ell_2 ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.
Abstract:
Cloud computing has significantly impacted a broad range of industries, but these technologies and services have been absorbed throughout the marketplace unevenly. Some industries have moved aggressively towards cloud computing, while others have moved much more slowly. For the most part, the energy sector has approached cloud computing in a measured and cautious way, with progress often in the form of private cloud solutions rather than public ones, or hybridized information technology systems that combine cloud and existing non-cloud architectures. By moving towards cloud computing in a very slow and tentative way, however, the energy industry may prevent itself from reaping the full benefit that a more complete migration to the public cloud has brought about in several other industries. This short communication is accordingly intended to offer a high-level overview of cloud computing, and to put forward the argument that the energy sector should make a more complete migration to the public cloud in order to unlock the major system-wide efficiencies that cloud computing can provide. Also, assets within the energy sector should be designed with as much modularity and flexibility as possible so that they are not locked out of cloud-friendly options in the future.
Abstract:
Through ubiquitous computing and location-based social media, information is spreading outside the traditional domains of home and work into the urban environment. Digital technologies have changed the way people relate to the urban form, supporting discussion on multiple levels and allowing more citizens to be heard in new ways (Fredericks et al. 2013; Houghton et al. 2014; Caldwell et al. 2013). Face-to-face and digitally mediated discussions, facilitated by tangible and hybrid interaction such as multi-touch screens and media façades, are initiated through a telephone-booth-inspired portable structure: the InstaBooth. The InstaBooth prototype employs a multidisciplinary approach to engage local communities in a situated debate on the future of their urban environment. With it, we capture citizens’ past stories and opinions on the use and design of public places. The way public consultations are currently conducted often engages only a section of the population involved in a proposed development; the most vocal citizens are not necessarily the most representative of their communities (Jenkins 2006). Alternative ways to engage urban dwellers in the debate about the built environment are currently being explored, including the use of social media and online tools (Foth 2009). This project fosters innovation by providing pathways for communities to participate in the decision-making process that informs the urban form. The InstaBooth promotes dialogue and mediation between bottom-up and top-down approaches to urban design, with the aim of promoting community connectedness with the urban environment. The InstaBooth provides an engagement and discussion platform that leverages a number of locally developed display and interaction technologies in order to facilitate a dialogue of ideas and commentary. The InstaBooth combines multiple interaction techniques into a hybrid (digital and analogue) media space. Through the InstaBooth, urban design and architectural proposals are displayed, encouraging commentary from visitors. Inside the InstaBooth, visitors can activate a multi-touch screen to browse media, write a note, or draw a picture to provide feedback. The purpose of the InstaBooth is to engage a broader section of society, including those who are often marginalised. The specific design of the internal and external interfaces, the mutual relationship between these interfaces with regard to information display and interaction, and the question of how visitors can engage with the system are part of the research agenda of the project.
Abstract:
Cognitive scientists were not quick to embrace the functional neuroimaging technologies that emerged during the late 20th century. In this new century, cognitive scientists continue to question, not unreasonably, the relevance of functional neuroimaging investigations that fail to address questions of interest to cognitive science. However, some ultra-cognitive scientists assert that these experiments can never be of relevance to the study of cognition. Their reasoning reflects an adherence to a functionalist philosophy that arbitrarily and purposefully distinguishes mental information-processing systems from brain or brain-like operations. This article addresses whether data from properly conducted functional neuroimaging studies can inform and subsequently constrain the assumptions of theoretical cognitive models. The article commences with a focus upon the functionalist philosophy espoused by the ultra-cognitive scientists, contrasting it with the materialist philosophy that motivates both cognitive neuroimaging investigations and connectionist modelling of cognitive systems. Connectionism and cognitive neuroimaging share many features, including an emphasis on unified cognitive and neural models of systems that combine localist and distributed representations. The utility of designing cognitive neuroimaging studies to test (primarily) connectionist models of cognitive phenomena is illustrated using data from functional magnetic resonance imaging (fMRI) investigations of language production and episodic memory.
Abstract:
In this paper we present a robust method to detect handwritten text in unconstrained drawings on ordinary whiteboards. Unlike printed text in documents, free-form handwritten text has no fixed pattern in terms of size, orientation, or font, and it is often mixed with other drawings such as lines and shapes. Unlike handwriting on paper, handwriting on a whiteboard cannot be scanned, so detection has to be based on photos. Our method traces straight edges in photos of the whiteboard and builds a graph representation of the connected components. We use geometric properties such as edge density, graph density, aspect ratio and neighborhood similarity to differentiate handwritten text from other drawings. Experimental results show that our method achieves satisfactory precision and recall. Furthermore, the method is robust and efficient enough to be deployed on a mobile device. This is an important enabler of business applications that support whiteboard-centric visual meetings in enterprise scenarios.
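The following is a small sketch of the kind of geometric filtering the abstract describes: compute simple features of a connected component (aspect ratio and a crude "edge density" over its bounding box) and apply threshold rules to decide whether it looks like handwritten text. The feature definitions and thresholds are illustrative assumptions, not the paper's actual values.

```python
import numpy as np

def component_features(mask):
    """Simple geometric features of a connected component given as a boolean mask."""
    ys, xs = np.nonzero(mask)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    aspect_ratio = w / h
    # Fraction of the bounding box covered by foreground pixels ("edge density").
    edge_density = mask.sum() / float(h * w)
    return aspect_ratio, edge_density

def looks_like_text(mask, ar_range=(0.2, 15.0), density_range=(0.05, 0.5)):
    """Illustrative rule: handwritten words tend to have a moderate aspect ratio and
    moderate edge density, whereas long ruled lines and filled shapes fall outside
    these ranges. The thresholds here are assumptions, not the paper's values."""
    ar, dens = component_features(mask)
    return ar_range[0] <= ar <= ar_range[1] and density_range[0] <= dens <= density_range[1]
```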
Abstract:
Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2008] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, classification-calibrated loss and prove that it is SLN-robust. The loss avoids the Long and Servedio [2008] result by virtue of being negatively unbounded. The loss is a modification of the hinge loss, where one does not clamp at zero; hence, we call it the unhinged loss. We show that the optimal unhinged solution is equivalent to that of a strongly regularised SVM, and is the limiting solution for any convex potential; this implies that strong \ell_2 regularisation makes most standard learners SLN-robust. Experiments confirm the unhinged loss's SLN-robustness.
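Since the unhinged loss is just the hinge loss without the clamp at zero, its \ell_2-regularised linear minimiser has a closed form: setting the gradient of the regularised empirical risk to zero gives a scaled sum of y_i x_i, i.e. a difference of class means. The sketch below (plain numpy, linear scorer without a bias term, toy data) shows both losses and this closed-form fit; the data and regularisation strength are illustrative assumptions.

```python
import numpy as np

def hinge_loss(margins):
    return np.maximum(0.0, 1.0 - margins)

def unhinged_loss(margins):
    # Same as the hinge loss but without clamping at zero, so it is negatively unbounded.
    return 1.0 - margins

def unhinged_linear_fit(X, y, lam=1.0):
    """Minimise (1/n) sum_i unhinged(y_i * w.x_i) + (lam/2) ||w||^2.

    Setting the gradient to zero gives the closed form w = (1/(lam*n)) sum_i y_i x_i,
    i.e. a scaled difference of class means (a "centroid" classifier).
    """
    n = X.shape[0]
    return (y[:, None] * X).sum(axis=0) / (lam * n)

# Toy usage with labels in {-1, +1}.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(+1, 1, size=(50, 2)), rng.normal(-1, 1, size=(50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])
w = unhinged_linear_fit(X, y, lam=0.1)
accuracy = np.mean(np.sign(X @ w) == y)
```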
Abstract:
In this paper we consider the problem of learning an n × n kernel matrix from m (≥ 1) similarity matrices under a general convex loss. Past research has extensively studied the m = 1 case and has derived several algorithms which require sophisticated techniques like ACCP, SOCP, etc. The existing algorithms do not apply if one uses arbitrary losses and often cannot handle the m > 1 case. We present several provably convergent iterative algorithms, where each iteration requires either an SVM or a Multiple Kernel Learning (MKL) solver for the m > 1 case. One of the major contributions of the paper is to extend the well-known Mirror Descent (MD) framework to handle the Cartesian product of psd matrices. This novel extension leads to an algorithm, called EMKL, which solves the problem in O(m² log n²) iterations; each iteration solves an MKL problem involving m kernels and m eigen-decompositions of n × n matrices. By suitably defining a restriction on the objective function, a faster version of EMKL is proposed, called REKL, which avoids the eigen-decompositions. An alternative to both EMKL and REKL is also suggested which requires only an SVM solver. Experimental results on a real-world protein data set involving several similarity matrices illustrate the efficacy of the proposed algorithms.
Abstract:
We propose a randomized algorithm for large-scale SVM learning which solves the problem by iterating over random subsets of the data. Crucial to the scalability of the algorithm is the size of the subsets chosen. In the context of text classification we show that, by using ideas from random projections, a sample size of O(log n) can be used to obtain a solution which is close to the optimal with high probability. Experiments on synthetic and real-life data sets demonstrate that the algorithm scales up SVM learners without loss in accuracy.
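A rough sketch of the subset-iteration idea, assuming scikit-learn's SVC as the inner solver: repeatedly sample a random subset of roughly O(log n) points, add the current support vectors, retrain, and carry the new support vectors forward. The constant in the subset size, the number of iterations, and the assumption that every sampled subset contains both classes are illustrative choices, not the paper's analysis.

```python
import numpy as np
from sklearn.svm import SVC

def random_subset_svm(X, y, n_iters=20, c_factor=20, random_state=0):
    """Illustrative sketch: train a linear SVM by iterating over small random subsets,
    carrying the current support vectors into the next subset."""
    rng = np.random.default_rng(random_state)
    n = X.shape[0]
    subset_size = max(2, int(c_factor * np.log(n)))   # ~O(log n) points per round
    keep = np.array([], dtype=int)                    # indices of current support vectors
    model = None
    for _ in range(n_iters):
        sample = rng.choice(n, size=min(subset_size, n), replace=False)
        idx = np.unique(np.concatenate([keep, sample]))
        model = SVC(kernel="linear", C=1.0).fit(X[idx], y[idx])
        keep = idx[model.support_]                    # carry support vectors forward
    return model
```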
Abstract:
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
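For readers new to the actor-critic setup, the sketch below is a minimal tabular one-step actor-critic: a TD(0) critic estimates state values, and a softmax actor is updated with the TD error as the (vanilla, not natural) gradient signal. The tabular setting, constant step sizes, and the toy MDP interface (P[s][a] a next-state distribution, R[s][a] a reward) are simplifying assumptions, not the four algorithms or the convergence analysis of the paper.

```python
import numpy as np

def actor_critic(P, R, n_states, n_actions, gamma=0.95,
                 alpha_v=0.05, alpha_pi=0.01, n_steps=50_000, seed=0):
    """Minimal one-step actor-critic on a tabular MDP (illustrative only).

    Critic: TD(0) estimate of the state values.
    Actor:  softmax policy updated by the TD error times the score function.
    """
    rng = np.random.default_rng(seed)
    V = np.zeros(n_states)                      # critic parameters (state values)
    theta = np.zeros((n_states, n_actions))     # actor parameters (softmax preferences)
    s = 0
    for _ in range(n_steps):
        prefs = theta[s] - theta[s].max()
        pi = np.exp(prefs) / np.exp(prefs).sum()
        a = rng.choice(n_actions, p=pi)
        s_next = rng.choice(n_states, p=P[s][a])
        r = R[s][a]
        td_error = r + gamma * V[s_next] - V[s]          # temporal-difference error
        V[s] += alpha_v * td_error                       # critic update
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0                            # d/d theta[s] of log pi(a|s)
        theta[s] += alpha_pi * td_error * grad_log_pi    # vanilla policy-gradient update
        s = s_next
    return V, theta
```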
Abstract:
We study consistency properties of surrogate loss functions for general multiclass classification problems, defined by a general loss matrix. We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient conditions for a surrogate loss to be classification calibrated with respect to a loss matrix in this setting. We then introduce the notion of \emph{classification calibration dimension} of a multiclass loss matrix, which measures the smallest `size' of a prediction space for which it is possible to design a convex surrogate that is classification calibrated with respect to the loss matrix. We derive both upper and lower bounds on this quantity, and use these results to analyze various loss matrices. In particular, as one application, we provide a different route from the recent result of Duchi et al.\ (2010) for analyzing the difficulty of designing `low-dimensional' convex surrogates that are consistent with respect to pairwise subset ranking losses. We anticipate the classification calibration dimension may prove to be a useful tool in the study and design of surrogate losses for general multiclass learning problems.
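As a concrete (and standard) special case of classification calibration, consider the multiclass 0-1 loss with the softmax cross-entropy surrogate: the minimiser of the conditional surrogate risk reproduces the class-probability vector, so its argmax matches the Bayes prediction. The small numeric check below illustrates only this textbook case, not the general loss-matrix framework of the abstract.

```python
import numpy as np

def expected_surrogate_risk_grad(u, p):
    """Gradient in u of E_{Y~p}[ -log softmax(u)_Y ], i.e. softmax(u) - p."""
    z = np.exp(u - u.max())
    return z / z.sum() - p

def conditional_surrogate_minimiser(p, lr=0.5, n_steps=2000):
    """Minimise the conditional softmax cross-entropy risk by gradient descent."""
    u = np.zeros_like(p)
    for _ in range(n_steps):
        u -= lr * expected_surrogate_risk_grad(u, p)
    return u

# For any class-probability vector p with positive entries, the surrogate minimiser
# satisfies softmax(u*) = p, so argmax(u*) equals the Bayes prediction for 0-1 loss.
p = np.array([0.2, 0.5, 0.3])
u_star = conditional_surrogate_minimiser(p)
assert np.argmax(u_star) == np.argmax(p)
```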
Abstract:
The Lovász θ function of a graph is a fundamental tool in combinatorial optimization and approximation algorithms. Computing θ involves solving an SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovász θ function is equivalent to a kernel learning problem related to the one-class SVM. This interesting connection opens up many opportunities for bridging graph-theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM-θ graphs, on which the Lovász θ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques to solve algorithmic problems in large graphs, e.g. identifying a planted clique of size Θ(√n) in a random graph G(n, 1/2). A classic approach to this problem involves computing the θ function; however, it is not scalable due to the SDP computation. We show that a random graph with a planted clique is an example of an SVM-θ graph, and as a consequence an SVM-based approach easily identifies the clique in large graphs and is competitive with the state of the art. Further, we introduce the notion of a "common orthogonal labelling", which extends the notion of an orthogonal labelling of a single graph (used in defining the θ function) to multiple graphs. The problem of finding the optimal common orthogonal labelling is cast as a Multiple Kernel Learning problem and is used to identify a large common dense region in multiple graphs. The proposed algorithm achieves an order-of-magnitude improvement in scalability over the state of the art.