82 results for Dirichlet series.
Abstract:
Variational methods are a key component of the approximate inference and learning toolbox. These methods fill an important middle ground, retaining distributional information about uncertainty in latent variables, unlike maximum a posteriori (MAP) methods, and yet generally requiring less computational time than Markov chain Monte Carlo (MCMC) methods. In particular, the variational Expectation Maximisation (vEM) and variational Bayes algorithms, both involving variational optimisation of a free-energy, are widely used in time-series modelling. Here, we investigate the success of vEM in simple probabilistic time-series models. First, we consider the inference step of vEM, and show that a consequence of the well-known compactness property of variational inference is a failure to propagate uncertainty in time, thus limiting the usefulness of the retained distributional information. In particular, the uncertainty may appear to be smallest precisely when the approximation is poorest. Second, we consider parameter learning and analytically reveal systematic biases in the parameters found by vEM. Surprisingly, simpler variational approximations (such as mean-field) can lead to less bias than more complicated structured approximations.
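The compactness property mentioned in this abstract can be illustrated with a small static example. This is not taken from the paper: it uses the standard textbook result that, for a Gaussian target with covariance Σ, the factorised approximation minimising KL(q‖p) gives each factor a variance of 1/Λ_ii, where Λ = Σ⁻¹. The function name `mean_field_variances` and the chosen correlation value are assumptions for illustration.

```python
import numpy as np

def mean_field_variances(cov):
    """Variances of the fully factorised Gaussian q minimising KL(q || p)
    for a Gaussian p with covariance `cov`: 1 / diag(inverse covariance)."""
    precision = np.linalg.inv(cov)
    return 1.0 / np.diag(precision)

# Hypothetical two-variable target with strong correlation.
rho = 0.9
cov = np.array([[1.0, rho],
                [rho, 1.0]])

true_var = np.diag(cov)            # true marginal variances
mf_var = mean_field_variances(cov) # mean-field variances, equal to 1 - rho**2 here

print("true marginal variances:", true_var)
print("mean-field variances:   ", mf_var)
```

The stronger the correlation, the more the factorised approximation understates the marginal uncertainty, which is the kind of over-confidence the abstract describes.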
Abstract:
We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which can be downloaded from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper. These are available from https://sites.google.com/site/randomisedbhc/.
Abstract:
The accurate prediction of time-changing covariances is an important problem in the modeling of multivariate financial data. However, some of the most popular models suffer from a) overfitting problems and multiple local optima, b) failure to capture shifts in market conditions and c) large computational costs. To address these problems we introduce a novel dynamic model for time-changing covariances. Overfitting and local optima are avoided by following a Bayesian approach instead of computing point estimates. Changes in market conditions are captured by assuming a diffusion process in parameter values, and finally computationally efficient and scalable inference is performed using particle filters. Experiments with financial data show excellent performance of the proposed method with respect to current standard models.
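To make the particle-filter idea concrete, here is a deliberately simplified scalar analogue, not the model proposed in the paper. It assumes a single return series with zero mean, a log-variance that follows a Gaussian random walk (standing in for the diffusion the abstract mentions), and a plain bootstrap filter with multinomial resampling. The function name, the prior, and all parameter values are hypothetical.

```python
import numpy as np

def bootstrap_volatility_filter(returns, n_particles=500, step_std=0.1, seed=0):
    """Bootstrap particle filter for r_t ~ N(0, exp(h_t)),
    with h_t = h_{t-1} + N(0, step_std**2). Returns filtered variance estimates."""
    rng = np.random.default_rng(seed)
    h = rng.normal(0.0, 1.0, size=n_particles)  # assumed prior on log-variance
    estimates = np.empty(len(returns))
    for t, r in enumerate(returns):
        # Propagate particles through the random-walk diffusion.
        h = h + rng.normal(0.0, step_std, size=n_particles)
        # Gaussian log-likelihood of the observed return, up to a constant.
        log_w = -0.5 * (h + r**2 * np.exp(-h))
        log_w -= log_w.max()  # stabilise before exponentiating
        w = np.exp(log_w)
        w /= w.sum()
        # Posterior-mean estimate of the variance at time t.
        estimates[t] = np.sum(w * np.exp(h))
        # Multinomial resampling to avoid weight degeneracy.
        h = h[rng.choice(n_particles, size=n_particles, p=w)]
    return estimates

# Synthetic data with a shift in volatility halfway through.
data_rng = np.random.default_rng(1)
returns = np.concatenate([data_rng.normal(0.0, 1.0, 200),
                          data_rng.normal(0.0, 5.0, 200)])
variance_path = bootstrap_volatility_filter(returns)
```

Because the filter carries a whole population of candidate log-variances rather than one point estimate, it can follow the regime change in the synthetic data without being refitted.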
Abstract:
This paper presents flow field measurements for the turbulent stratified burner introduced in two previous publications in which high resolution scalar measurements were made by Sweeney et al. [1,2] for model validation. The flow fields of the series of premixed and stratified methane/air flames are investigated under turbulent, globally lean conditions (φg = 0.75). Velocity data acquired with laser Doppler anemometry (LDA) and particle image velocimetry (PIV) are presented and discussed. Pairwise 2-component LDA measurements provide profiles of axial velocity, radial velocity, tangential velocity and corresponding fluctuating velocities. The LDA measurements of axial and tangential velocities enable the swirl number to be evaluated and the degree of swirl characterized. Power spectral density and autocorrelation functions derived from the LDA data acquired at 10 kHz are optimized to calculate the integral time scales. Flow patterns are obtained using a 2-component PIV system operated at 7 Hz. Velocity profiles and spatial correlations derived from the PIV and LDA measurements are shown to be in very good agreement, thus offering 3D mapping of the velocities. A strong correlation was observed between the shape of the recirculation zones above the central bluff body and the effects of heat release, stoichiometry and swirl. Detailed analyses of the LDA data further demonstrate that the flow behavior changes significantly with the levels of swirl and stratification, which combines the contributions of dilatation, recirculation and swirl. Key turbulence parameters are derived from the total velocity components, combining axial, radial and tangential velocities. © 2013 The Combustion Institute.
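One common way to obtain an integral time scale from a velocity record is to integrate the normalised autocorrelation coefficient up to its first zero crossing. The sketch below follows that convention; the abstract does not say which definition or optimisation the authors used, so the cut-off rule, the function names, and the exponential test signal are all assumptions.

```python
import numpy as np

def autocorrelation(u, max_lag):
    """Normalised autocorrelation coefficient of u for lags 0..max_lag."""
    fluct = np.asarray(u, dtype=float) - np.mean(u)
    var = np.mean(fluct**2)
    n = len(fluct)
    return np.array([np.mean(fluct[:n - k] * fluct[k:]) / var
                     for k in range(max_lag + 1)])

def integral_time_scale(rho, dt):
    """Trapezoidal integral of rho from lag 0 up to (not including)
    the first lag where rho <= 0, or over all lags if it never crosses."""
    rho = np.asarray(rho, dtype=float)
    crossings = np.nonzero(rho <= 0.0)[0]
    end = crossings[0] if crossings.size else len(rho)
    segment = rho[:end]
    return dt * (segment.sum() - 0.5 * (segment[0] + segment[-1]))

# Check against a model autocorrelation exp(-t / T) sampled at 10 kHz,
# whose integral time scale is T by construction.
dt = 1e-4        # sampling interval for a 10 kHz record
T_true = 5e-3    # hypothetical time scale in seconds
lags = np.arange(1000) * dt
T_est = integral_time_scale(np.exp(-lags / T_true), dt)
```

For measured data, `autocorrelation` would be applied to the velocity fluctuation record first and its output passed to `integral_time_scale`.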
Abstract:
This work applies a variety of multilinear function factorisation techniques to extract appropriate features or attributes from high dimensional multivariate time series for classification. Recently, a great deal of work has centred around designing time series classifiers using more and more complex feature extraction and machine learning schemes. This paper argues that complex learners and domain specific feature extraction schemes of this type are not necessarily needed for time series classification, as excellent classification results can be obtained by simply applying a number of existing matrix factorisation or linear projection techniques, which are simple and computationally inexpensive. We highlight this using a geometric separability measure and classification accuracies obtained through experiments on four different high dimensional multivariate time series datasets. © 2013 IEEE.
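As a rough illustration of the argument that a linear projection plus a simple classifier can suffice, the sketch below flattens each multivariate series into a vector, projects it with PCA (computed via SVD), and classifies by nearest class centroid. PCA and the nearest-centroid rule are stand-ins chosen for this example; the abstract does not name the specific techniques or datasets, and the synthetic sine-wave data here is hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the training mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each test point to the class with the closest mean feature vector."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic data: 3-channel series of 50 time steps, flattened to 150 features.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)

def make_class(freq, n):
    clean = np.tile(np.sin(2 * np.pi * freq * t), 3)
    return clean + rng.normal(0.0, 0.5, size=(n, clean.size))

X_train = np.vstack([make_class(1.0, 20), make_class(2.0, 20)])
X_test = np.vstack([make_class(1.0, 20), make_class(2.0, 20)])
y_train = y_test = np.repeat([0, 1], 20)

mean, comps = fit_pca(X_train, n_components=2)
pred = nearest_centroid_predict(project(X_train, mean, comps), y_train,
                                project(X_test, mean, comps))
accuracy = np.mean(pred == y_test)
```

The projection is fitted on the training series only and then reused for the test series, so no information from the test set leaks into the features.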