25 resultados para MANIFOLD DECOMPOSITIONS
em Aston University Research Archive
Resumo:
In this paper, we investigate the use of manifold learning techniques to enhance the separation properties of standard graph kernels. The idea stems from the observation that when we perform multidimensional scaling on the distance matrices extracted from the kernels, the resulting data tends to be clustered along a curve that wraps around the embedding space, a behavior that suggests that long range distances are not estimated accurately, resulting in an increased curvature of the embedding space. Hence, we propose to use a number of manifold learning techniques to compute a low-dimensional embedding of the graphs in an attempt to unfold the embedding manifold, and increase the class separation. We perform an extensive experimental evaluation on a number of standard graph datasets using the shortest-path (Borgwardt and Kriegel, 2005), graphlet (Shervashidze et al., 2009), random walk (Kashima et al., 2003) and Weisfeiler-Lehman (Shervashidze et al., 2011) kernels. We observe the most significant improvement in the case of the graphlet kernel, which fits with the observation that neglecting the locational information of the substructures leads to a stronger curvature of the embedding manifold. On the other hand, the Weisfeiler-Lehman kernel partially mitigates the locality problem by using the node labels information, and thus does not clearly benefit from the manifold learning. Interestingly, our experiments also show that the unfolding of the space seems to reduce the performance gap between the examined kernels.
Resumo:
The quantum Jensen-Shannon divergence kernel [1] was recently introduced in the context of unattributed graphs where it was shown to outperform several commonly used alternatives. In this paper, we study the separability properties of this kernel and we propose a way to compute a low-dimensional kernel embedding where the separation of the different classes is enhanced. The idea stems from the observation that the multidimensional scaling embeddings on this kernel show a strong horseshoe shape distribution, a pattern which is known to arise when long range distances are not estimated accurately. Here we propose to use Isomap to embed the graphs using only local distance information onto a new vectorial space with a higher class separability. The experimental evaluation shows the effectiveness of the proposed approach. © 2013 Springer-Verlag.
Resumo:
We present in this paper ideas to tackle the problem of analysing and forecasting nonstationary time series within the financial domain. Accepting the stochastic nature of the underlying data generator we assume that the evolution of the generator's parameters is restricted on a deterministic manifold. Therefore we propose methods for determining the characteristics of the time-localised distribution. Starting with the assumption of a static normal distribution we refine this hypothesis according to the empirical results obtained with the methods anc conclude with the indication of a dynamic non-Gaussian behaviour with varying dependency for the time series under consideration.
Resumo:
It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis ¸iteBishop98a in several directions: bf(1) We allow for em non-linear projection manifolds. The basic building block is the Generative Topographic Mapping. bf(2) We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. bf(3) Using tools from differential geometry we derive expressions for local directional curvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the parent visualization plot which are captured by a child model. We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set and apply our system to two more complex 12- and 19-dimensional data sets.
Resumo:
In data visualization, characterizing local geometric properties of non-linear projection manifolds provides the user with valuable additional information that can influence further steps in the data analysis. We take advantage of the smooth character of GTM projection manifold and analytically calculate its local directional curvatures. Curvature plots are useful for detecting regions where geometry is distorted, for changing the amount of regularization in non-linear projection manifolds, and for choosing regions of interest when constructing detailed lower-level visualization plots.
Resumo:
It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis ¸iteBishop98a in several directions: bf(1) We allow for em non-linear projection manifolds. The basic building block is the Generative Topographic Mapping (GTM). bf(2) We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. bf(3) Using tools from differential geometry we derive expressions for local directional curvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the ancestor visualization plots which are captured by a child model. We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set and apply our system to two more complex 12- and 18-dimensional data sets.
Resumo:
Hierarchical visualization systems are desirable because a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex high-dimensional data sets. We extend an existing locally linear hierarchical visualization system PhiVis [1] in several directions: bf(1) we allow for em non-linear projection manifolds (the basic building block is the Generative Topographic Mapping -- GTM), bf(2) we introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree, bf(3) we describe folding patterns of low-dimensional projection manifold in high-dimensional data space by computing and visualizing the manifold's local directional curvatures. Quantities such as magnification factors [3] and directional curvatures are helpful for understanding the layout of the nonlinear projection manifold in the data space and for further refinement of the hierarchical visualization plot. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. We demonstrate the visualization system principle of the approach on a complex 12-dimensional data set and mention possible applications in the pharmaceutical industry.
Resumo:
This paper assesses the extent to which the equity markets of Hungary, Poland the Czech Republic and Russia have become less segmented. Using a variety of tests it is shown there has been a consistent increase in the co-movement of some Eastern European markets and developed markets. Using the variance decompositions from a vector autoregressive representation of returns it is shown that for Poland and Hungary global factors are having an increasing influence on equity returns, suggestive of increased equity market integration. In this paper we model a system of bivariate equity market correlations as a smooth transition logistic trend model in order to establish how rapidly the countries of Eastern Europe are moving away from market segmentation. We find that Hungary is the country which is becoming integrated the most quickly. © 2005 ELsevier Ltd. All rights reserved.
Resumo:
It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis (Bishop98a) in several directions: 1. We allow for em non-linear projection manifolds. The basic building block is the Generative Topographic Mapping. 2. We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. 3. Using tools from differential geometry we derive expressions for local directionalcurvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the parent visualization plot which are captured by a child model.We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set andapply our system to two more complex 12- and 19-dimensional data sets.
Resumo:
This paper develops a productivity index applicable when producers are cost minimisers and input prices are known. The index is inspired by the Malmquist index as extended to productivity measurement. The index developed here is defined in terms of input cost rather than input quantity distance functions. Hence, productivity change is decomposed into overall efficiency and cost technical change. Furthermore, overall efficiency change is decomposed into technical and allocative efficiency change and cost technical change into a part capturing shifts of input quantities and shifts of relative input prices. These decompositions provide a clearer picture of the root sources of productivity change. They are illustrated here in a sample of hospitals; results are computed using non-parametric mathematical programming. © 2003 Elsevier B.V. All rights reserved.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, for improved understanding of a complex large high-dimensional dataset, there is a need for an effective projection of such a dataset onto a lower-dimension (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with the projection results from other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Resumo:
The aim of this project was to carry out an investigastion into suitable alternatives to gasoline for use in modern automobiles. The fuel would provide the western world with a means of extending the natural gasoline resources and the third world a way of cutting down their dependence on the oil producing countries for their energy supply. Alcohols, namely methanol and ethanol, provide this solution. They can be used as gasoline extenders or as fuels on their own.In order to fulfil the aims of the project a literature study was carried out to investigate methods and costs of producing these fuels. An experimental programme was then set up in which the performance of the alcohols was studied on a conventional engine. The engine used for this purpose was the Fiat 127 930cc four cylinder engine. This engine was used because of its popularity in the European countries. The Weber fixed jet carburettor, since it was designed to be used with gasoline, was adapted so that the alcohol fuels and the blends could be used in the most efficient way. This was mainly to take account of the lower heat content of the alcohols. The adaptation of the carburettor was in the form of enlarging the main metering jet. Allowances for the alcohol's lower specfic gravity were made during fuel metering.Owing to the low front end volatility of methanol and ethanol, it was expected that `start up' problems would occur. An experimental programme was set up to determine the temperature range for a minimum required percentage `take off' that would ease start-up since it was determined that a `take off' of about 5% v/v liquid in the vapour phase would be sufficient for starting. Additions such as iso-pentane and n-pentane were used to improve the front end volatility. This proved to be successful.The lower heat content of the alcohol fuels also meant that a greater charge of fuel would be required. This was seen to pose further problems with fuel distribution from the carburettor to the individual cylinders on a multicylinder engine. Since it was not possible to modify the existing manifold on the Fiat 127 engine, experimental tests on manifold geometry were carried out using the Ricardo E6 single cylinder variable compression engine. Results from these tests showed that the length, shape and cross-sectional area of the manifold play an important part in the distribution of the fuel entering the cylinder, ie. vapour phase, vapour/small liquid droplet/liquid film phase, vapour/large liquid droplet/liquid film phase etc.The solvent properties of the alcohols and their greater electrical conductivity suggested that the materials used on the engine would be prone to chemical attack. In order to determine the type and rate of chemical attack, an experimental programme was set up whereby carburettor and other components were immersed in the alcohols and in blends of alcohol with gasoline. The test fuels were aerated and in some instances kept at temperatures ranging from 50oC to 90oC. Results from these tests suggest that not all materials used in the conventional engine are equally suitable for use with alcohols and alcohol/gasoline blends. Aluminium for instance was severely attacked by methanol causing pitting and pin-holing in the surface.In general this whole experimental programme gave valuable information on the acceptability of substitute fuels. While the long term effects of alcohol use merit further study, it is clear that methanol and ethanol will be increasingly used in place of gasoline.
Resumo:
Satellite-borne scatterometers are used to measure backscattered micro-wave radiation from the ocean surface. This data may be used to infer surface wind vectors where no direct measurements exist. Inherent in this data are outliers owing to aberrations on the water surface and measurement errors within the equipment. We present two techniques for identifying outliers using neural networks; the outliers may then be removed to improve models derived from the data. Firstly the generative topographic mapping (GTM) is used to create a probability density model; data with low probability under the model may be classed as outliers. In the second part of the paper, a sensor model with input-dependent noise is used and outliers are identified based on their probability under this model. GTM was successfully modified to incorporate prior knowledge of the shape of the observation manifold; however, GTM could not learn the double skinned nature of the observation manifold. To learn this double skinned manifold necessitated the use of a sensor model which imposes strong constraints on the mapping. The results using GTM with a fixed noise level suggested the noise level may vary as a function of wind speed. This was confirmed by experiments using a sensor model with input-dependent noise, where the variation in noise is most sensitive to the wind speed input. Both models successfully identified gross outliers with the largest differences between models occurring at low wind speeds. © 2003 Elsevier Science Ltd. All rights reserved.
Resumo:
Quantitative evidence that establishes the existence of the hairpin vortex state (HVS) in plane Couette flow (PCF) is provided in this work. The evidence presented in this paper shows that the HVS can be obtained via homotopy from a flow with a simple geometrical configuration, namely, the laterally heated flow (LHF). Although the early stages of bifurcations of LHF have been previously investigated, our linear stability analysis reveals that the root in the LHF yields multiple branches via symmetry breaking. These branches connect to the PCF manifold as steady nonlinear amplitude solutions. Moreover, we show that the HVS has a direct bifurcation route to the Rayleigh-Bénard convection. © 2010 The American Physical Society.
Resumo:
This paper is one of the first comprehensive attempts to compare earnings in urban China and India over the recent period. While both economies have grown considerably, we illustrate significant cross-country differences in wage growth since the late 1980s. For this purpose, we make use of comparable datasets, estimate Mincer equations and perform Oaxaca–Blinder decompositions at the mean and at different points of the wage distribution. The initial wage differential in favor of Indian workers, observed in the middle and upper part of the distribution, partly disappears over time. While the 1980s Indian premium is mainly due to higher returns to education and experience, a combination of price and endowment effects explains why Chinese wages have caught up, especially since the mid-1990s. The price effect is only partly explained by the observed convergence in returns to education; the endowment effect is driven by faster increase in education levels in China and significantly accentuates the reversal of the wage gap in favor of this country for the first half of the wage distribution.