71 results for sparse URAs

in Aston University Research Archive


Relevance: 20.00%

Abstract:

We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world datasets indicate the efficiency of the approach.

Relevance: 20.00%

Abstract:

We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques that use the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for the propagation of both predictions and Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.

Relevance: 20.00%

Abstract:

In recent years there has been an increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation uses a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on the KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input and thus approximates a batch solution.
The resulting sparse learning algorithm is a generic one: for different problems we only change the likelihood. The algorithm is applied to a variety of problems, and we examine its performance both on more classical regression and classification tasks and on data assimilation and a simple density estimation problem.
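
As a point of reference for the distance measure named in this abstract: restricted to a finite set of inputs, a GP reduces to a multivariate Gaussian, and the KL-divergence between two such Gaussians has a standard closed form. The symbols below (means \mu_0, \mu_1, covariances \Sigma_0, \Sigma_1, dimension d) are notation assumed here for illustration and are not taken from the thesis, whose GP-level construction may differ in detail.

```latex
\mathrm{KL}\bigl(\mathcal{N}(\mu_0,\Sigma_0)\,\|\,\mathcal{N}(\mu_1,\Sigma_1)\bigr)
  = \tfrac{1}{2}\Bigl[
      \operatorname{tr}\bigl(\Sigma_1^{-1}\Sigma_0\bigr)
      + (\mu_1-\mu_0)^{\top}\Sigma_1^{-1}(\mu_1-\mu_0)
      - d
      + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
    \Bigr]
```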

Relevance: 20.00%

Abstract:

Resource allocation in sparsely connected networks, a representative problem of systems with real variables, is studied using the replica and Bethe approximation methods. An efficient distributed algorithm is devised on the basis of insights gained from the analysis and is examined using numerical simulations, showing excellent performance and full agreement with the theoretical results. The physical properties of the resource allocation model are discussed.

Relevance: 20.00%

Abstract:

The problem of resource allocation in sparse graphs with real variables is studied using methods of statistical physics. An efficient distributed algorithm is devised on the basis of insight gained from the analysis and is examined using numerical simulations, showing excellent performance and full agreement with the theoretical results.

Relevance: 20.00%

Abstract:

The optimization of resource allocation in sparse networks with real variables is studied using methods of statistical physics. Efficient distributed algorithms are devised on the basis of insight gained from the analysis and are examined using numerical simulations, showing excellent performance and full agreement with the theoretical results.

Relevance: 20.00%

Abstract:

Very large spatially-referenced datasets, for example, those derived from satellite-based sensors which sample across the globe or large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process, and in emergency situations the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time-critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly where maximum likelihood methods are used. Although the storage requirements scale only linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed scales quadratically and cubically, respectively. Most modern commodity hardware has at least two processor cores. Other mechanisms for allowing parallel computation, such as Grid-based systems, are also becoming increasingly available. However, currently there seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics.
By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms, we show that computational time can be significantly reduced. We demonstrate this with both sparsely sampled data and densely sampled data on a variety of architectures ranging from the common dual-core processor, found in many modern desktop computers, to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic data sets and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake data set.

Relevance: 20.00%

Abstract:

Using methods of statistical physics, we study the average number and kernel size of general sparse random matrices over GF(q), with a given connectivity profile, in the thermodynamical limit of large matrices. We introduce a mapping of GF(q) matrices onto spin systems using the representation of the cyclic group of order q as the q-th complex roots of unity. This representation facilitates the derivation of the average kernel size of random matrices using the replica approach, under the replica symmetric ansatz, resulting in saddle point equations for general connectivity distributions. Numerical solutions are then obtained for particular cases by population dynamics. Similar techniques also allow us to obtain an expression for the exact and average number of random matrices for any general connectivity profile. We present numerical results for particular distributions.
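
The mapping onto q-th roots of unity mentioned in this abstract can be illustrated with a small sketch. It assumes q is prime, so that the additive group of GF(q) is cyclic; the function names `character` and `delta_mod_q` are hypothetical and chosen here only for illustration. The sketch shows the two properties that make such a representation convenient: addition modulo q becomes multiplication of complex numbers, and averaging over all q roots gives an indicator of "a is zero modulo q".

```python
import cmath


def character(a: int, q: int) -> complex:
    """Map an element a of Z_q to the q-th root of unity exp(2*pi*i*a/q)."""
    return cmath.exp(2j * cmath.pi * (a % q) / q)


def delta_mod_q(a: int, q: int) -> float:
    """Return (1/q) * sum_k omega^(k*a), which is 1 if a = 0 (mod q) and 0 otherwise."""
    total = sum(character(k * a, q) for k in range(q))
    return (total / q).real


q = 5
# Addition modulo q corresponds to multiplication of the roots of unity.
lhs = character(3 + 4, q)
rhs = character(3, q) * character(4, q)
print(abs(lhs - rhs) < 1e-12)
# The averaged sum acts as a Kronecker delta on Z_q (up to rounding error).
print(round(delta_mod_q(0, q), 12), round(delta_mod_q(3, q), 12))
```

Sums of this kind are one common way to turn a linear constraint over a finite field into a sum over spin-like variables. Whether the paper uses exactly this form is not stated in the abstract.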

Relevance: 20.00%

Abstract:

Typical properties of sparse random matrices over finite (Galois) fields are studied, in the limit of large matrices, using techniques from the physics of disordered systems. For the case of a finite field GF(q) with prime order q, we present results for the average kernel dimension, average dimension of the eigenvector spaces and the distribution of the eigenvalues. The number of matrices for a given distribution of entries is also calculated for the general case. The significance of these results to error-correcting codes and random graphs is also discussed.
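
For a single finite matrix, the kernel dimension studied in this abstract can be computed directly by Gaussian elimination modulo a prime q. The sketch below is a brute-force numerical check of that kind, not the analytical method of the paper. The helper `sparse_random_matrix` is a hypothetical ensemble (a fixed number of nonzero entries per row) chosen only for illustration.

```python
import random


def rank_mod_p(matrix, p):
    """Rank of an integer matrix over GF(p), p prime, by Gauss-Jordan elimination."""
    rows = [[x % p for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(n_rows):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


def kernel_dimension(matrix, p):
    """Dimension of the null space over GF(p): number of columns minus rank."""
    return len(matrix[0]) - rank_mod_p(matrix, p)


def sparse_random_matrix(n, per_row, p, rng):
    """Hypothetical ensemble: n x n matrix with `per_row` nonzero entries in each row."""
    m = [[0] * n for _ in range(n)]
    for row in m:
        for col in rng.sample(range(n), per_row):
            row[col] = rng.randrange(1, p)
    return m


rng = random.Random(0)
sample = sparse_random_matrix(20, 3, 3, rng)
print(kernel_dimension(sample, 3))
```

Averaging `kernel_dimension` over many sampled matrices would give a numerical estimate to compare against analytical predictions for a given connectivity profile.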

Relevance: 20.00%

Abstract:

Inference and optimization of real-valued edge variables in sparse graphs are studied using the Bethe approximation and replica method of statistical physics. Equilibrium states of general energy functions involving a large set of real edge variables that interact at the network nodes are obtained in various cases. When applied to the representative problem of network resource allocation, efficient distributed algorithms are also devised. Scaling properties with respect to the network connectivity and the resource availability are found, and links to probabilistic Bayesian approximation methods are established. Different cost measures are considered and algorithmic solutions in the various cases are devised and examined numerically. Simulation results are in full agreement with the theory. © 2007 The American Physical Society.

Relevance: 20.00%

Abstract:

This paper presents a fast part-based subspace selection algorithm, termed the binary sparse nonnegative matrix factorization (B-SNMF). Both the training process and the testing process of B-SNMF are much faster than those of binary principal component analysis (B-PCA). Besides, B-SNMF is more robust to occlusions in images. Experimental results on face images demonstrate the effectiveness and the efficiency of the proposed B-SNMF.

Relevance: 20.00%

Abstract:

Non-uniform B-spline dictionaries on a compact interval are discussed in the context of sparse signal representation. For each given partition, dictionaries of B-spline functions for the corresponding spline space are built up by dividing the partition into subpartitions and joining together the bases for the concomitant subspaces. The resulting slightly redundant dictionaries are composed of B-spline functions of broader support than those corresponding to the B-spline basis for the identical space. Such dictionaries are meant to assist in the construction of adaptive sparse signal representation through a combination of stepwise optimal greedy techniques.
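
The B-spline functions from which such dictionaries are assembled can be evaluated on a non-uniform partition with the standard Cox-de Boor recursion. The sketch below shows only that building block; it does not reproduce the paper's construction of dictionaries from subpartitions. The knot vector and the order are example values assumed here.

```python
def bspline_basis(i, k, knots, x):
    """Cox-de Boor recursion: value at x of the i-th B-spline of order k (degree k - 1)."""
    if k == 1:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = 0.0
    d1 = knots[i + k - 1] - knots[i]
    if d1 > 0:  # skip terms with zero-length support (repeated knots)
        left = (x - knots[i]) / d1 * bspline_basis(i, k - 1, knots, x)
    right = 0.0
    d2 = knots[i + k] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k] - x) / d2 * bspline_basis(i + 1, k - 1, knots, x)
    return left + right


# Example: quadratic B-splines (order 3) on a non-uniform partition of [0, 3],
# with the end knots repeated so that the basis spans the whole interval.
order = 3
knots = [0.0, 0.0, 0.0, 0.5, 2.0, 3.0, 3.0, 3.0]
n_basis = len(knots) - order

for x in (0.25, 0.5, 1.5, 2.9):
    values = [bspline_basis(i, order, knots, x) for i in range(n_basis)]
    print(x, sum(values))  # the basis functions sum to 1 inside [0, 3), up to rounding
```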

Relevance: 20.00%

Abstract:

Optimal paths connecting randomly selected network nodes and fixed routers are studied analytically in the presence of a nonlinear overlap cost that penalizes congestion. Routing becomes more difficult as the number of selected nodes increases and exhibits ergodicity breaking in the case of multiple routers. The ground state of such systems reveals nonmonotonic complex behaviors in average path length and algorithmic convergence, depending on the network topology, and densities of communicating nodes and routers. A distributed linearly scalable routing algorithm is also devised. © 2012 American Physical Society.

Relevance: 20.00%

Abstract:

Sparse representation of astronomical images is discussed. It is shown that a significant gain in sparsity is achieved when particular mixed dictionaries are used for approximating these types of images with greedy selection strategies. Experiments are conducted to confirm (i) the effectiveness at producing sparse representations and (ii) competitiveness, with respect to the time required to process large images. The latter is a consequence of the suitability of the proposed dictionaries for approximating images in partitions of small blocks. This feature makes it possible to apply the effective greedy selection technique called orthogonal matching pursuit, up to some block size. For blocks exceeding that size, a refinement of the original matching pursuit approach is considered. The resulting method is termed "self-projected matching pursuit," because it is shown to be effective for implementing, via matching pursuit itself, the optional backprojection intermediate steps in that approach. © 2013 Optical Society of America.
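
For orientation, the plain matching pursuit iteration that the methods in this abstract build on can be sketched as follows. This is the basic greedy scheme only, not orthogonal matching pursuit and not the authors' self-projected variant. The function name `matching_pursuit` and the toy dictionary are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(atom):
    norm = math.sqrt(dot(atom, atom))
    return [a / norm for a in atom]


def matching_pursuit(signal, dictionary, n_iter, tol=1e-12):
    """Greedy approximation of `signal` by unit-norm atoms.

    Returns (coefficients, residual), where coefficients[i] multiplies the
    normalised i-th atom of `dictionary`.
    """
    atoms = [normalise(a) for a in dictionary]
    coeffs = [0.0] * len(atoms)
    residual = list(signal)
    for _ in range(n_iter):
        # Select the atom most correlated with the current residual.
        products = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(products[i]))
        if abs(products[best]) <= tol:
            break
        # Accumulate its coefficient and remove its contribution from the residual.
        coeffs[best] += products[best]
        residual = [r - products[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Toy redundant dictionary in R^3: the standard basis plus one extra atom.
dictionary = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], dictionary, n_iter=5)
print(coeffs, residual)
```

Orthogonal matching pursuit additionally re-fits all selected coefficients by least squares after each selection, which is the step that becomes costly for large blocks.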