942 resultados para high dimensional imaginal geometry


Relevância:

100.00% 100.00%

Publicador:

Resumo:

We establish a fundamental equivalence between singular value decomposition (SVD) and functional principal components analysis (FPCA) models. The constructive relationship allows to deploy the numerical efficiency of SVD to fully estimate the components of FPCA, even for extremely high-dimensional functional objects, such as brain images. As an example, a functional mixed effect model is fitted to high-resolution morphometric (RAVENS) images. The main directions of morphometric variation in brain volumes are identified and discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Let P be a probability distribution on q -dimensional space. The so-called Diaconis-Freedman effect means that for a fixed dimension d<dimensional projections of P look like a scale mixture of spherically symmetric Gaussian distributions. The present paper provides necessary and sufficient conditions for this phenomenon in a suitable asymptotic framework with increasing dimension q . It turns out, that the conditions formulated by Diaconis and Freedman (1984) are not only sufficient but necessary as well. Moreover, letting P ^ be the empirical distribution of n independent random vectors with distribution P , we investigate the behavior of the empirical process n √ (P ^ −P) under random projections, conditional on P ^ .

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An Internet portal accessible at www.gdb.unibe.ch has been set up to automatically generate color-coded similarity maps of the ChEMBL database in relation to up to two sets of active compounds taken from the enhanced Directory of Useful Decoys (eDUD), a random set of molecules, or up to two sets of user-defined reference molecules. These maps visualize the relationships between the selected compounds and ChEMBL in six different high dimensional chemical spaces, namely MQN (42-D molecular quantum numbers), SMIfp (34-D SMILES fingerprint), APfp (20-D shape fingerprint), Xfp (55-D pharmacophore fingerprint), Sfp (1024-bit substructure fingerprint), and ECfp4 (1024-bit extended connectivity fingerprint). The maps are supplied in form of Java based desktop applications called “similarity mapplets” allowing interactive content browsing and linked to a “Multifingerprint Browser for ChEMBL” (also accessible directly at www.gdb.unibe.ch) to perform nearest neighbor searches. One can obtain six similarity mapplets of ChEMBL relative to random reference compounds, 606 similarity mapplets relative to single eDUD active sets, 30 300 similarity mapplets relative to pairs of eDUD active sets, and any number of similarity mapplets relative to user-defined reference sets to help visualize the structural diversity of compound series in drug optimization projects and their relationship to other known bioactive compounds.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional indexes attempting to scale up to high dimensions. One well-known approach to overcome degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them are challenging tasks. In this paper, we present an adaptive Multi-level Mahalanobis-based Dimensionality Reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has four notable features compared to existing methods. First, it discovers elliptical clusters for more effective dimensionality reduction by using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension. Finally, it is also dynamic and adaptive to insertions. An extensive performance study was conducted using both real and synthetic datasets, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently. Copyright Springer-Verlag 2005

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Indexing high dimensional datasets has attracted extensive attention from many researchers in the last decade. Since R-tree type of index structures are known as suffering curse of dimensionality problems, Pyramid-tree type of index structures, which are based on the B-tree, have been proposed to break the curse of dimensionality. However, for high dimensional data, the number of pyramids is often insufficient to discriminate data points when the number of dimensions is high. Its effectiveness degrades dramatically with the increase of dimensionality. In this paper, we focus on one particular issue of curse of dimensionality; that is, the surface of a hypercube in a high dimensional space approaches 100% of the total hypercube volume when the number of dimensions approaches infinite. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid tree technology is a special case of our method. The results of our experiments demonstrate clear priority of our novel method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose a novel high-dimensional index method, the BM+-tree, to support efficient processing of similarity search queries in high-dimensional spaces. The main idea of the proposed index is to improve data partitioning efficiency in a high-dimensional space by using a rotary binary hyperplane, which further partitions a subspace and can also take advantage of the twin node concept used in the M+-tree. Compared with the key dimension concept in the M+-tree, the binary hyperplane is more effective in data filtering. High space utilization is achieved by dynamically performing data reallocation between twin nodes. In addition, a post processing step is used after index building to ensure effective filtration. Experimental results using two types of real data sets illustrate a significantly improved filtering efficiency.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hierarchical visualization systems are desirable because a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex high-dimensional data sets. We extend an existing locally linear hierarchical visualization system PhiVis [1] in several directions: bf(1) we allow for em non-linear projection manifolds (the basic building block is the Generative Topographic Mapping -- GTM), bf(2) we introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree, bf(3) we describe folding patterns of low-dimensional projection manifold in high-dimensional data space by computing and visualizing the manifold's local directional curvatures. Quantities such as magnification factors [3] and directional curvatures are helpful for understanding the layout of the nonlinear projection manifold in the data space and for further refinement of the hierarchical visualization plot. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. We demonstrate the visualization system principle of the approach on a complex 12-dimensional data set and mention possible applications in the pharmaceutical industry.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A framework that connects computational mechanics and molecular dynamics has been developed and described. As the key parts of the framework, the problem of symbolising molecular trajectory and the associated interrelation between microscopic phase space variables and macroscopic observables of the molecular system are considered. Following Shalizi and Moore, it is shown that causal states, the constituent parts of the main construct of computational mechanics, the e-machine, define areas of the phase space that are optimal in the sense of transferring information from the micro-variables to the macro-observables. We have demonstrated that, based on the decay of their Poincare´ return times, these areas can be divided into two classes that characterise the separation of the phase space into resonant and chaotic areas. The first class is characterised by predominantly short time returns, typical to quasi-periodic or periodic trajectories. This class includes a countable number of areas corresponding to resonances. The second class includes trajectories with chaotic behaviour characterised by the exponential decay of return times in accordance with the Poincare´ theorem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Leu-Enkephalin in explicit water is simulated using classical molecular dynamics. A ß-turn transition is investigated by calculating the topological complexity (in the "computational mechanics" framework [J. P. Crutchfield and K. Young, Phys. Rev. Lett., 63, 105 (1989)]) of the dynamics of both the peptide and the neighbouring water molecules. The complexity of the atomic trajectories of the (relatively short) simulations used in this study reflect the degree of phase space mixing in the system. It is demonstrated that the dynamic complexity of the hydrogen atoms of the peptide and almost all of the hydrogens of the neighbouring waters exhibit a minimum precisely at the moment of the ß-turn transition. This indicates the appearance of simplified periodic patterns in the atomic motion, which could correspond to high-dimensional tori in the phase space. It is hypothesized that this behaviour is the manifestation of the effect described in the approach to molecular transitions by Komatsuzaki and Berry [T. Komatsuzaki and R.S. Berry, Adv. Chem. Phys., 123, 79 (2002)], where a "quasi-regular" dynamics at the transition is suggested. Therefore, for the first time, the less chaotic character of the folding transition in a realistic molecular system is demonstrated. © Springer-Verlag Berlin Heidelberg 2006.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ is less than one. We explore the statistical and computational challenges inherent in these high dimensional low sample size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored in this research using seven datasets where κ < 1. These techniques require special attention to tuning necessitating several extensions of cross-validation to be investigated to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Peer reviewed

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Peer reviewed