2 resultados para Projected models
em Aston University Research Archive
Resumo:
Large monitoring networks are becoming increasingly common and can generate large datasets from thousands to millions of observations in size, often with high temporal resolution. Processing large datasets using traditional geostatistical methods is prohibitively slow and in real world applications different types of sensor can be found across a monitoring network. Heterogeneities in the error characteristics of different sensors, both in terms of distribution and magnitude, presents problems for generating coherent maps. An assumption in traditional geostatistics is that observations are made directly of the underlying process being studied and that the observations are contaminated with Gaussian errors. Under this assumption, sub–optimal predictions will be obtained if the error characteristics of the sensor are effectively non–Gaussian. One method, model based geostatistics, assumes that a Gaussian process prior is imposed over the (latent) process being studied and that the sensor model forms part of the likelihood term. One problem with this type of approach is that the corresponding posterior distribution will be non–Gaussian and computationally demanding as Monte Carlo methods have to be used. An extension of a sequential, approximate Bayesian inference method enables observations with arbitrary likelihoods to be treated, in a projected process kriging framework which is less computationally intensive. The approach is illustrated using a simulated dataset with a range of sensor models and error characteristics.
Resumo:
Heterogeneous datasets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such datasets can be heterogeneous in terms of the error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale rather badly with the size of dataset, and require computationally expensive Monte Carlo based inference. Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time which avoids the need for high dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced which implements projected, sequential estimation and adds several novel features. In particular the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real datasets. Limitations and extensions are discussed. © 2010 Elsevier Ltd.