821 results for sparse representation


Relevance:

20.00%

Abstract:

We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method combines a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques that use the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for the propagation of both predictions and Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.
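To make the idea of a sequentially constructed "relevant subsample" concrete, here is a minimal sketch. It assumes a squared-exponential kernel on scalar inputs and a fixed tolerance, neither of which is stated in the abstract, and it shows only the selection step: an input is kept when the squared RKHS distance between its kernel function and the span of the already selected kernel functions exceeds the tolerance. The function names are hypothetical, and the update of the posterior parameters is left out.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel on scalars (an illustrative assumption)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z

def novelty(x, basis, kernel=rbf_kernel):
    """Squared RKHS distance from k(x, .) to the span of the basis kernels."""
    if not basis:
        return kernel(x, x)
    gram = [[kernel(a, b) for b in basis] for a in basis]
    k_x = [kernel(a, x) for a in basis]
    coeffs = solve(gram, k_x)  # projection coefficients onto the basis
    return kernel(x, x) - sum(c * k for c, k in zip(coeffs, k_x))

def select_basis(inputs, tolerance=1e-2):
    """Scan the inputs once, keeping only those that are sufficiently novel."""
    basis = []
    for x in inputs:
        if novelty(x, basis) > tolerance:
            basis.append(x)
    return basis
```

For example, `select_basis([0.0, 0.0001, 3.0, 3.0001, 6.0])` keeps one representative from each cluster of near-duplicate inputs, because the near-duplicates are almost exactly representable by the points already selected.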

Relevance:

20.00%

Abstract:

In recent years there has been increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) because of their greater flexibility compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, which makes the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation uses a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input, thus approximating a batch solution.
The resulting sparse learning algorithm is generic: for different problems we change only the likelihood. The algorithm is applied to a variety of problems, and we examine its performance on classical regression and classification tasks as well as on data assimilation and a simple density estimation problem.
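The abstract does not give the KL-divergence between GPs explicitly. As a point of reference only, when two GPs are evaluated at the same finite set of d inputs they induce two d-dimensional Gaussians, and the standard closed form for the divergence between those is shown below. The symbols are notation introduced here for illustration, not taken from the thesis.

```latex
% KL-divergence between two d-dimensional Gaussians
% p = N(\mu_1, \Sigma_1) and q = N(\mu_2, \Sigma_2).
\mathrm{KL}(p \,\|\, q)
  = \frac{1}{2}\left[
      \operatorname{tr}\!\left(\Sigma_2^{-1}\Sigma_1\right)
      + (\mu_2 - \mu_1)^{\top}\Sigma_2^{-1}(\mu_2 - \mu_1)
      - d
      + \ln\frac{\det\Sigma_2}{\det\Sigma_1}
    \right]
```

The divergence is asymmetric, so the order of its two arguments matters when it is used as a "distance" between an approximation and the true posterior.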

Relevance:

20.00%

Abstract:

Resource allocation in sparsely connected networks, a representative problem of systems with real variables, is studied using the replica and Bethe approximation methods. An efficient distributed algorithm is devised on the basis of insights gained from the analysis and is examined using numerical simulations, showing excellent performance and full agreement with the theoretical results. The physical properties of the resource allocation model are discussed.

Relevance:

20.00%

Abstract:

The problem of resource allocation in sparse graphs with real variables is studied using methods of statistical physics. An efficient distributed algorithm is devised on the basis of insight gained from the analysis and is examined using numerical simulations, showing excellent performance and full agreement with the theoretical results.

Relevance:

20.00%

Abstract:

The optimization of resource allocation in sparse networks with real variables is studied using methods of statistical physics. Efficient distributed algorithms are devised on the basis of insight gained from the analysis and are examined using numerical simulations, showing excellent performance and full agreement with the theoretical results.

Relevance:

20.00%

Abstract:

Graphic depiction is an established method for academics to present concepts about theories of innovation. These expressions have been adopted by policy-makers, the media and businesses. However, there has been little research on the extent of their usage or effectiveness ex-academia. In addition, innovation theorists have ignored this area of study, despite the communication of information about innovation being acknowledged as a major determinant of success for corporate enterprise. The thesis explores some major themes in the theories of innovation and compares how graphics are used to represent them. The thesis examines the contribution of visual sociology and graphic theory to an investigation of a sample of graphics. The methodological focus is a modified content analysis. The following expressions are explored: check lists, matrices, maps and mapping in the management of innovation; models, flow charts, organisational charts and networks in the innovation process; and curves and cycles in the representation of performance and progress. The main conclusion is that academia is leading the way in usage as well as novelty. The graphic message is switching from prescription to description. The computerisation of graphics has created a major role for the information designer. It is recommended that use of the graphic representation of innovation should be increased in all domains, though it is conceded that its content and execution need to improve, too. Education of graphic 'producers', 'intermediaries' and 'consumers' will play a part in this, as will greater exploration of diversity, novelty and convention. Work has begun to tackle this and suggestions for future research are made.

Relevance:

20.00%

Abstract:

This paper explores the representation of the first African World Cup in the British and South African press. Drawing on the output of a variety of media outlets between 2004, when South Africa was awarded the right to host the 2010 event, and the culmination of the tournament in July 2010, this paper contends that a range of representations of Africa have been put forward by the British and South African media. These can be interpreted as alarmist, sensationalist and even racist in certain extreme instances, and hypernationalist and overly defensive in other cases. © 2012 Copyright Taylor and Francis Group, LLC.

Relevance:

20.00%

Abstract:

In many Environmental Information Systems the actual observations arise from a discrete monitoring network which might be rather heterogeneous in both location and types of measurements made. In this paper we describe the architecture and infrastructure for a system, developed as part of the EU FP6 funded INTAMAP project, to provide a service oriented solution that allows the construction of an interoperable, automatic, interpolation system. This system will be based on the Open Geospatial Consortium’s Web Feature Service (WFS) standard. The essence of our approach is to extend the GML3.1 observation feature to include information about the sensor using SensorML, and to further extend this to incorporate observation error characteristics. Our extended WFS will accept observations, and will store them in a database. The observations will be passed to our R-based interpolation server, which will use a range of methods, including a novel sparse, sequential kriging method (only briefly described here) to produce an internal representation of the interpolated field resulting from the observations currently uploaded to the system. The extended WFS will then accept queries, such as ‘What is the probability distribution of the desired variable at a given point’, ‘What is the mean value over a given region’, or ‘What is the probability of exceeding a certain threshold at a given location’. To support information-rich transfer of complex and uncertain predictions we are developing schema to represent probabilistic results in a GML3.1 (object-property) style. The system will also offer more easily accessible Web Map Service and Web Coverage Service interfaces to allow users to access the system at the level of complexity they require for their specific application. 
Such a system will offer a very valuable contribution to the next generation of Environmental Information Systems in the context of real time mapping for monitoring and security, particularly for systems that employ a service oriented architecture.
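One of the example queries above, the probability of exceeding a threshold at a given location, can be sketched as follows. This assumes the interpolation server returns a Gaussian predictive distribution (a mean and a variance) at the query point, which the abstract does not state, and the function names are hypothetical rather than part of the INTAMAP interface.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exceedance_probability(mean, variance, threshold):
    """P(value > threshold) under an assumed Gaussian prediction."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    z = (threshold - mean) / math.sqrt(variance)
    return 1.0 - normal_cdf(z)
```

With a predicted mean of 10, a variance of 4 and a threshold of 12, the threshold lies one standard deviation above the mean, so `exceedance_probability(10.0, 4.0, 12.0)` is about 0.159.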

Relevance:

20.00%

Abstract:

A content analysis examined the way majorities and minorities are represented in the British press. An analysis of the headlines of five British newspapers, over a period of five years, revealed that the words ‘majority’ and ‘minority’ appeared 658 times. Majority headlines were most frequent (66%), more likely to emphasize the numerical size of the majority, to link majority status with political groups, to be described with positive evaluations, and to cover political issues. By contrast, minority headlines were less frequent (34%), more likely to link minority status with ethnic groups and to other social issues, and less likely to be described with positive evaluations. The implications of examining how real-life majorities and minorities are represented for our understanding of experimental research are discussed.

Relevance:

20.00%

Abstract:

Very large spatially-referenced datasets, for example, those derived from satellite-based sensors which sample across the globe or large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process, and in emergency situations the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time-critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly in the case where maximum likelihood is used. Although the storage requirements scale only linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed scales quadratically and cubically, respectively. Most modern commodity hardware has at least two processor cores. Other mechanisms for parallel computation, such as Grid-based systems, are also becoming increasingly available. However, there currently seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics.
By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms we show that computational time can be significantly reduced. We demonstrate this with both sparsely sampled data and densely sampled data on a variety of architectures ranging from the common dual-core processor, found in many modern desktop computers, to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic data sets and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake data set.
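The following sketch illustrates why approximating the data likelihood opens the door to parallelism. It is a deliberately simplified stand-in, not the method of Vecchia or Tresp and not the paper's algorithm: the data are split into consecutive blocks that are treated as independent, so the cubic-cost factorisation is applied only to small matrices and each block could be handed to a separate worker. The zero-mean model, squared-exponential covariance, noise variance and function names are all assumptions made for illustration, and the loop is kept serial for clarity.

```python
import math

def rbf(x, y, length_scale=1.0):
    """Squared-exponential covariance on scalar locations (an assumption)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def gaussian_log_likelihood(xs, ys, noise=0.1):
    """Exact zero-mean Gaussian log-likelihood; cost is cubic in len(xs)."""
    n = len(xs)
    cov = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
           for i, a in enumerate(xs)]
    low = cholesky(cov)
    w = []  # forward substitution: solve low @ w = ys
    for i in range(n):
        s = sum(low[i][k] * w[k] for k in range(i))
        w.append((ys[i] - s) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in w)
    return -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))

def block_log_likelihood(xs, ys, block_size, noise=0.1):
    """Sum of per-block log-likelihoods, ignoring correlation between blocks."""
    total = 0.0
    for start in range(0, len(xs), block_size):
        stop = start + block_size
        total += gaussian_log_likelihood(xs[start:stop], ys[start:stop], noise)
    return total
```

The approximation is accurate when blocks are separated by much more than the correlation range and degrades when strongly correlated observations fall into different blocks, which is one reason the choice of scheme depends on how densely the data are sampled.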

Relevance:

20.00%

Abstract:

Typical properties of sparse random matrices over finite (Galois) fields are studied, in the limit of large matrices, using techniques from the physics of disordered systems. For the case of a finite field GF(q) with prime order q, we present results for the average kernel dimension, average dimension of the eigenvector spaces and the distribution of the eigenvalues. The number of matrices for a given distribution of entries is also calculated for the general case. The significance of these results to error-correcting codes and random graphs is also discussed.
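For a single matrix, the kernel dimension over GF(q) with q prime can be computed directly by Gaussian elimination in modular arithmetic, as in the sketch below. This is only a numerical illustration of the quantity being averaged, not the statistical-physics calculation the abstract describes. The function names are hypothetical, and the modular inverse via `pow(x, -1, q)` requires Python 3.8 or later.

```python
def rank_mod_q(matrix, q):
    """Rank of an integer matrix over GF(q), q prime, by Gaussian elimination."""
    rows = [[entry % q for entry in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inverse = pow(rows[rank][col], -1, q)  # exists because q is prime
        rows[rank] = [(inverse * entry) % q for entry in rows[rank]]
        for r in range(n_rows):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % q
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def kernel_dimension(matrix, q):
    """Dimension of the null space over GF(q): columns minus rank."""
    n_cols = len(matrix[0]) if matrix else 0
    return n_cols - rank_mod_q(matrix, q)
```

The same integer matrix can have different kernel dimensions over different fields: `[[1, 1, 0], [0, 1, 1], [1, 0, 1]]` has a one-dimensional kernel over GF(2), where the third row is the sum of the first two, but is invertible over GF(3).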

Relevance:

20.00%

Abstract:

A multi-chromosome GA (Multi-GA) was developed, based upon concepts from the natural world, allowing improved flexibility in a number of areas including representation, genetic operators, their parameter rates and real world multi-dimensional applications. A series of experiments were conducted, comparing the performance of the Multi-GA to a traditional GA on a number of recognised and increasingly complex test optimisation surfaces, with promising results. Further experiments demonstrated the Multi-GA's flexibility through the use of non-binary chromosome representations and its applicability to dynamic parameterisation. A number of alternative and new methods of dynamic parameterisation were investigated, in addition to a new non-binary 'Quotient crossover' mechanism. Finally, the Multi-GA was applied to two real world problems, demonstrating its ability to handle mixed type chromosomes within an individual, the limited use of a chromosome level fitness function, the introduction of new genetic operators for structural self-adaptation and its viability as a serious real world analysis tool. The first problem involved optimum placement of computers within a building, allowing the Multi-GA to use multiple chromosomes with different type representations and different operators in a single individual. The second problem, commonly associated with Geographical Information Systems (GIS), required a spatial analysis location of the optimum number and distribution of retail sites over two different population grids. In applying the Multi-GA, two new genetic operators (addition and deletion) were developed and explored, resulting in the definition of a mechanism for self-modification of genetic material within the Multi-GA structure and a study of this behaviour.
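As a rough sketch of what an individual with mixed-type chromosomes might look like, the example below gives each chromosome its own representation, mutation operator and mutation rate. The chromosome names, gene lengths, rates and operators are hypothetical choices made for illustration. The abstract does not describe the Multi-GA's actual data structures, and crossover, fitness evaluation and the addition and deletion operators are omitted.

```python
import random

def mutate_bits(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def mutate_reals(values, rate, rng, sigma=0.1):
    """Add Gaussian noise to each value with probability `rate`."""
    return [v + rng.gauss(0.0, sigma) if rng.random() < rate else v
            for v in values]

def make_individual(rng):
    """One individual holding a binary and a real-valued chromosome."""
    return {
        "layout": {  # hypothetical binary chromosome
            "genes": [rng.randint(0, 1) for _ in range(8)],
            "mutate": mutate_bits,
            "rate": 0.05,
        },
        "coords": {  # hypothetical real-valued chromosome
            "genes": [rng.uniform(0.0, 1.0) for _ in range(4)],
            "mutate": mutate_reals,
            "rate": 0.25,
        },
    }

def mutate_individual(individual, rng):
    """Apply each chromosome's own operator at its own rate; returns a copy."""
    return {
        name: {**chrom, "genes": chrom["mutate"](chrom["genes"], chrom["rate"], rng)}
        for name, chrom in individual.items()
    }
```

Keeping the operator and rate alongside the genes means a new representation can be added to an individual without changing the mutation loop.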

Relevance:

20.00%

Abstract:

Since much knowledge is tacit, eliciting knowledge is a common bottleneck during the development of knowledge-based systems. Visual interactive simulation (VIS) has been proposed as a means for eliciting experts’ decision-making by getting them to interact with a visual simulation of the real system in which they work. In order to explore the effectiveness and efficiency of VIS based knowledge elicitation, an experiment has been carried out with decision-makers in a Ford Motor Company engine assembly plant. The model properties under investigation were the level of visual representation (2-dimensional, 2½-dimensional and 3-dimensional) and the model parameter settings (unadjusted and adjusted to represent more uncommon and extreme situations). The conclusion from the experiment is that using a 2-dimensional representation with adjusted parameter settings provides the better simulation-based means for eliciting knowledge, at least for the case modelled.