41 results for Latent variables
Abstract:
In this paper we compare Multi-Layer Perceptrons (a type of neural network) with Multivariate Linear Regression in predicting birthweight from nine perinatal variables which are thought to be related. Results show that seven of the nine variables, i.e., gestational age, mother's body-mass index (BMI), sex of the baby, mother's height, smoking, parity and gravidity, are related to birthweight. We found no significant relationship between birthweight and either of the remaining two variables, i.e., maternal age and social class.
Abstract:
We define a copula process which describes the dependencies between arbitrarily many random variables independently of their marginal distributions. As an example, we develop a stochastic volatility model, Gaussian Copula Process Volatility (GCPV), to predict the latent standard deviations of a sequence of random variables. To make predictions we use Bayesian inference, with the Laplace approximation, and with Markov chain Monte Carlo as an alternative. We find both methods comparable. We also find our model can outperform GARCH on simulated and financial data. And unlike GARCH, GCPV can easily handle missing data, incorporate covariates other than time, and model a rich class of covariance structures.
Abstract:
Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets. © 2010 Springer-Verlag.
Abstract:
Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a "flat" clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single layer hierarchy over-simplify real networks.
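The simplest model class described above, in which nodes fall into disjoint clusters and the link probability depends only on the two cluster labels, can be sketched in a few lines. This is a minimal illustration of that baseline, not the hierarchical model the abstract proposes; the function name `sample_block_model`, the cluster labels and the probability table are all hypothetical values chosen for the example.

```python
import random

def sample_block_model(assignments, link_prob, rng):
    """Sample an undirected network whose link probabilities
    depend only on the cluster labels of the two endpoints."""
    n = len(assignments)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Look up the probability for this pair of clusters.
            p = link_prob[assignments[i]][assignments[j]]
            if rng.random() < p:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency

# Hypothetical example: six nodes in two disjoint clusters,
# dense links within a cluster and sparse links between clusters.
assignments = [0, 0, 0, 1, 1, 1]
link_prob = [[0.9, 0.1],
             [0.1, 0.9]]
adjacency = sample_block_model(assignments, link_prob, random.Random(0))
```

A model of this kind can only express the "flat" structure the abstract refers to, since every pair of nodes in the same two clusters shares a single link probability.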
Abstract:
Copulas allow marginal distributions to be learned separately from the multivariate dependence structure (copula) that links them together into a density function. Vine factorizations ease the learning of high-dimensional copulas by constructing a hierarchy of conditional bivariate copulas. However, to simplify inference, it is common to assume that each of these conditional bivariate copulas is independent of its conditioning variables. In this paper, we relax this assumption by discovering the latent functions that specify the shape of a conditional copula given its conditioning variables. We learn these functions by following a Bayesian approach based on sparse Gaussian processes with expectation propagation for scalable, approximate inference. Experiments on real-world datasets show that, when modeling all conditional dependencies, we obtain better estimates of the underlying copula of the data.
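The separation of marginals from dependence mentioned in the first sentence can be illustrated with rank-based pseudo-observations, a standard preprocessing step in copula modelling. The sketch below is not the vine or Gaussian process method of the abstract; the helper name `pseudo_observations` and the sample values are made up for illustration, and ties are assumed absent.

```python
import math

def pseudo_observations(values):
    """Map a sample to the open interval (0, 1) via rank / (n + 1).
    Assumes the sample has no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

# Hypothetical paired sample.
x = [0.3, 1.7, -0.4, 2.2, 0.9]
y = [1.1, 2.9, 0.2, 2.5, 1.8]

u_x = pseudo_observations(x)
u_y = pseudo_observations(y)

# A strictly increasing transform changes the marginal of x
# but leaves the ranks, and hence the pseudo-observations, unchanged.
u_x_exp = pseudo_observations([math.exp(v) for v in x])
```

Because `u_x_exp` equals `u_x`, any dependence model fitted to the pairs `(u_x, u_y)` is unaffected by how the marginals are scaled or warped, which is the property that lets the two parts be learned separately.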
Abstract:
The design of wind turbine blades is a true multi-objective engineering task. The aerodynamic effectiveness of the turbine needs to be balanced with the system loads introduced by the rotor. Moreover, the problem does not depend on a single geometric property but, among other parameters, on a combination of aerofoil family and various blade functions. The aim of this paper is therefore to present a tool which can help designers to gain a deeper insight into the complexity of the design space and to find a blade design which is likely to have a low cost of energy. For the research we use a Computational Blade Optimisation and Load Deflation Tool (CoBOLDT) to investigate the three extreme point designs obtained from a multi-objective optimisation of turbine thrust, annual energy production and mass for a horizontal axis wind turbine blade. The optimisation algorithm utilised is based on Multi-Objective Tabu Search, which constitutes the core of CoBOLDT. The methodology can parametrise the spanning aerofoils with two-dimensional Free Form Deformation and the blade functions with two tangentially connected cubic splines. After geometry generation we use a panel code to create aerofoil polars and a stationary Blade Element Momentum code to evaluate turbine performance. Finally, the obtained loads are fed into a structural layout module to estimate the mass and stiffness of the current blade by means of a fully stressed design. For the presented test case we chose post-optimisation analysis with parallel coordinates to reveal geometrical features of the extreme point designs and to select a compromise design from the Pareto set. The research revealed that a blade with a feasible laminate layout can be obtained that increases the energy capture and lowers the steady-state system loads. The reduced aerofoil camber and an increased L/D ratio could be identified as the main drivers. This statement could not previously be made with other tools of the research community. © 2013 Elsevier Ltd.
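The Pareto set from which the compromise design is selected can be illustrated with a simple non-dominated filter over the three objectives named above. This is only a sketch of the dominance test, not CoBOLDT or Multi-Objective Tabu Search; the design tuples are invented toy numbers, and annual energy production is negated so that all three objectives are minimised.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and
    strictly better in at least one (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_set(points):
    """Keep only the designs that no other design dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Hypothetical designs as (thrust, -annual energy production, mass).
designs = [
    (1.0, -5.0, 3.0),
    (1.2, -6.0, 3.5),
    (1.1, -4.5, 3.2),  # worse than the first design in all objectives
    (0.9, -4.0, 2.8),
]
front = pareto_set(designs)
```

In this toy set the third design is dropped and the other three remain, each being best in at least one objective, which mirrors the idea of extreme point designs.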
Abstract:
We demonstrate how a prior assumption of smoothness can be used to enhance the reconstruction of free energy profiles from multiple umbrella sampling simulations using the Bayesian Gaussian process regression approach. The method we derive allows the concurrent use of histograms and free energy gradients and can easily be extended to include further data. In Part I we review the necessary theory and test the method for one collective variable. We demonstrate improved performance with respect to the weighted histogram analysis method and obtain meaningful error bars without any significant additional computation. In Part II we consider the case of multiple collective variables and compare to a reconstruction using least squares fitting of radial basis functions. We find substantial improvements in the regimes of spatially sparse data or short sampling trajectories. A software implementation is made available on www.libatoms.org.
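To make the role of the smoothness prior concrete, the following is a minimal sketch of plain Gaussian process regression in one variable with a squared-exponential kernel. It fits function values only and does not reproduce the paper's combined use of histograms and free energy gradients; the kernel parameters, noise level and the toy profile are assumptions chosen for illustration.

```python
import math

def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential kernel, which encodes a smoothness prior."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at each test point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_cholesky(L, y_train)
    means, variances = [], []
    for xs in x_test:
        k_star = [rbf(xs, xt) for xt in x_train]
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        v = solve_cholesky(L, k_star)
        variances.append(rbf(xs, xs) - sum(k * w for k, w in zip(k_star, v)))
    return means, variances

# Toy profile: a quadratic well sampled at five points (illustrative only).
x_train = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_train = [x * x for x in x_train]
means, variances = gp_predict(x_train, y_train, x_train + [10.0])
```

The posterior variance comes out of the same factorisation as the mean, which is one way to see why error bars of the kind the abstract mentions need little extra computation; far from the data (here at 10.0) the variance returns to the prior value.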