871 results for High Dimensional Space
Abstract:
We describe a new model based on the concept of cognition theory. The method identifies subsets of the data that are embedded in arbitrarily oriented lower-dimensional subspaces. We define the k-means covering and study its properties. Covering subsets of points are repeatedly sampled to construct trial subspaces of various dimensions. The sampling whose feature space shows the best cognition ability, separating a mode of residuals near zero from the rest, is selected, and the data points are partitioned on that basis. The repeated sampling then continues recursively on each block of the data. We propose this algorithm based on cognition models. Experimental results for face recognition demonstrate that the correct rejection rate for test samples that do not belong to any of the training classes is very high.
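The recursive sampling loop described above can be illustrated as follows. This is a minimal RANSAC-style sketch under assumed details (the function names, the residual threshold eps, and the use of PCA to fit trial subspaces are all illustrative; the paper's k-means covering construction is not reproduced):

```python
import numpy as np

def fit_affine_subspace(points, dim):
    """Fit a dim-dimensional affine subspace to points via PCA."""
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:dim]          # origin and orthonormal basis

def residuals(data, mean, basis):
    """Distance of every data point to the affine subspace."""
    centered = data - mean
    proj = centered @ basis.T @ basis
    return np.linalg.norm(centered - proj, axis=1)

def best_trial_subspace(data, dim, n_trials=200, eps=0.1, rng=None):
    """Repeatedly sample small subsets, fit trial subspaces, and keep the
    one that puts the largest fraction of residuals in a mode near zero."""
    rng = rng or np.random.default_rng(0)
    best, best_score = None, -1.0
    for _ in range(n_trials):
        idx = rng.choice(len(data), size=dim + 2, replace=False)
        mean, basis = fit_affine_subspace(data[idx], dim)
        score = np.mean(residuals(data, mean, basis) < eps)
        if score > best_score:
            best, best_score = (mean, basis), score
    return best, best_score  # points near the winner form one block; recurse on the rest
```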
Abstract:
A new theoretical model of pattern recognition is proposed, based on "matter cognition" instead of the "matter classification" of traditional statistical pattern recognition. The new model is closer to the recognition function of human beings than traditional statistical pattern recognition, which takes "optimal separating" as its main principle, so it is called Biomimetic Pattern Recognition (BPR) (1). Its mathematical basis is the topological analysis of the sample set in a high-dimensional feature space; it is therefore also called Topological Pattern Recognition (TPR). The fundamental idea of the model rests on the continuity, in the feature space, of the samples of any given class. We implemented BPR with artificial neural networks that cover the high-dimensional geometrical distribution of the sample set in the feature space. Omnidirectional cognition tests were carried out on various animal and vehicle models of rather similar shapes. Across a total of 8800 tests, the correct recognition rate was 99.87% and the rejection rate was 0.13%; under the condition of a zero error rate, the correct rate of BPR was much better than that of RBF-SVM.
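The covering principle behind BPR can be illustrated with a deliberately simplified sketch in which unions of hyperspheres stand in for the paper's neural-network covering units (the names and the single global radius per class are illustrative assumptions):

```python
import numpy as np

def cover(train_samples, radius):
    """Represent a class by a union of hyperspheres centered on its samples,
    a crude stand-in for covering its continuous distribution in feature space."""
    return train_samples, radius

def recognize(x, coverings):
    """Return the label whose covering contains x, or None (rejection).
    Cognition by covering, unlike 'optimal separating' classifiers,
    can reject samples belonging to none of the trained classes."""
    for label, (centers, radius) in coverings.items():
        if np.min(np.linalg.norm(centers - x, axis=1)) <= radius:
            return label
    return None  # x lies outside every class covering: reject
```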
Abstract:
The self-assembly behavior of symmetric ABA rod-coil-rod triblock copolymer melts is studied by applying self-consistent-field lattice techniques in three-dimensional space. The phase diagram is constructed to understand the effects of chain architecture on the self-assembly behavior. Four stable structures are observed for the ABA rod-coil-rod triblock: spherelike, lamellar, gyroidlike, and cylindrical. Unlike in AB rod-coil diblock and BAB coil-rod-coil triblock copolymers, the lamellar structure observed in ABA rod-coil-rod triblock copolymer melts is not stable at a high volume fraction of the rod component (f(rod)=0.8), which is attributed to the intramolecular interactions between the two rod blocks of the polymer chain.
Abstract:
The self-assembly of symmetric coil-rod-coil ABA-type triblock copolymer melts is studied by applying self-consistent-field lattice techniques in three-dimensional space. The self-assembled ordered structures differ significantly with the volume fraction of the rod component and include lamellar, wavy lamellar, gyroid, perforated lamellar, cylindrical, and spherelike phases. To understand the physical essence of these phases and their regimes of occurrence, we construct the phase diagram, which matches qualitatively with existing experimental results. Compared with the coil-rod AB diblock copolymer, our results reveal that the interfacial grafting density of the separating rod and coil segments has an important influence on the self-assembly behavior of symmetric coil-rod-coil ABA triblock copolymer melts. We find that the order-disorder transition point changes from f(rod)=0.5 for AB diblock copolymers to f(rod)=0.6 for ABA triblock copolymers. Our results also show that the spherelike and cylindrical phases occupy most of the phase diagram, while the lamellar phase is stable only at high volume fractions of the rod.
Abstract:
Five variables for phenol derivatives were calculated by molecular projection in three-dimensional space and combined with eight quantum-chemical parameters and three Am indices. These variables were selected using leaps-and-bounds regression analysis. Multiple linear regression analysis and artificial neural networks were then applied; the results obtained with artificial neural networks are superior to those obtained with multiple linear regression.
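A minimal sketch of this kind of model comparison, assuming scikit-learn and a descriptor matrix X with activity values y (the hidden-layer size and cross-validation setup are illustrative, not taken from the paper):

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def compare_models(X, y, cv=5):
    """Compare multiple linear regression with a small feed-forward ANN
    on the same descriptor matrix (variable selection assumed done)."""
    mlr = LinearRegression()
    ann = make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(8,),
                                     max_iter=5000, random_state=0))
    return {name: cross_val_score(m, X, y, cv=cv, scoring='r2').mean()
            for name, m in [('MLR', mlr), ('ANN', ann)]}
```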
Abstract:
Generally speaking, outliers are data points that lie far from the rest of the data, yet they may carry extremely important information. A new outlier fuzzy kernel clustering algorithm is proposed to detect the outliers in a sample set. The original data space is mapped into a feature space through a Mercer kernel, and each vector in the feature space is assigned a dynamic weight. Building on the classical FCM fuzzy clustering algorithm, a new clustering objective function is formulated in the feature space; optimizing this objective function yields the weight of each data point, and the outliers in the sample set are identified according to the magnitudes of these weights. Simulation results demonstrate the feasibility and effectiveness of the proposed outlier fuzzy kernel clustering algorithm.
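A rough sketch of a weighted kernel fuzzy c-means of this kind, assuming a Gaussian kernel and an illustrative weight-update rule (the paper's exact objective function and weight update are not reproduced here):

```python
import numpy as np

def kfcm_outliers(X, c=2, m=2.0, sigma=1.0, iters=50, seed=0):
    """Kernel fuzzy c-means with per-point weights.  Points whose final
    weight is small sit far from every cluster in the kernel-induced
    feature space and are flagged as outlier candidates."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), c, replace=False)]          # prototypes
    w = np.ones(len(X))                                  # dynamic weights
    for _ in range(iters):
        K = np.exp(-((X[:, None] - V[None]) ** 2).sum(-1) / (2 * sigma**2))
        d2 = 2.0 * (1.0 - K) + 1e-12                     # feature-space dist^2
        U = d2 ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)                # fuzzy memberships
        g = (U ** m) * K * w[:, None]
        V = (g.T @ X) / g.sum(axis=0)[:, None]           # prototype update
        w = 1.0 / (1.0 + d2.min(axis=1))                 # down-weight far points
    return w  # small weight => likely outlier
```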
Abstract:
In low-level vision, the representations of scene properties such as shape and albedo are very high dimensional, as they have to describe complicated structures. The approach proposed here is to let the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related, and it is possible to find a functional relationship between them. The scene information can then be represented in reference to the image, where the functional specifies how to translate the image into the associated scene. We illustrate the use of this representation for encoding shape information and show that it has appealing properties such as locality and slow variation across space and scale. These properties provide a way of improving shape estimates coming from other sources of information, such as stereo.
Abstract:
Methods are presented (1) to partition or decompose a visual scene into the bodies forming it; (2) to position these bodies in three-dimensional space, by combining two scenes that make a stereoscopic pair; (3) to find the regions or zones of a visual scene that belong to its background; (4) to carry out the isolation of objects in (1) when the input has inaccuracies. Running computer programs implement the methods, and many examples illustrate their behavior. The input is a two-dimensional line-drawing of the scene, assumed to contain three-dimensional bodies possessing flat faces (polyhedra); some of them may be partially occluded. Suggestions are made for extending the work to curved objects. Some comparisons are made with human visual perception. The main conclusion is that it is possible to separate a picture or scene into the constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.
Abstract:
Q. Shen and R. Jensen, "Approximation-based feature selection and application for algae population estimation," Applied Intelligence, vol. 28, no. 2, pp. 167-181, 2008.
Abstract:
Similarly to protein folding, the association of two proteins is driven by a free energy funnel, determined by favorable interactions in some neighborhood of the native state. We describe a docking method based on stochastic global minimization of funnel-shaped energy functions in the space of rigid body motions (SE(3)) while accounting for flexibility of the interface side chains. The method, called semi-definite programming-based underestimation (SDU), employs a general quadratic function to underestimate a set of local energy minima and uses the resulting underestimator to bias further sampling. While SDU effectively minimizes functions with funnel-shaped basins, its application to docking in the rotational and translational space SE(3) is not straightforward due to the geometry of that space. We introduce a strategy that uses separate independent variables for side-chain optimization, center-to-center distance of the two proteins, and five angular descriptors of the relative orientations of the molecules. The removal of the center-to-center distance turns out to vastly improve the efficiency of the search, because the five-dimensional space now exhibits a well-behaved energy surface suitable for underestimation. This algorithm explores the free energy surface spanned by encounter complexes that correspond to local free energy minima and shows similarity to the model of macromolecular association that proceeds through a series of collisions. Results for standard protein docking benchmarks establish that in this space the free energy landscape is a funnel in a reasonably broad neighborhood of the native state and that the SDU strategy can generate docking predictions with less than 5 Å ligand interface Cα root-mean-square deviation while achieving an approximately 20-fold efficiency gain compared to Monte Carlo methods.
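The underestimation step can be illustrated in a much-simplified form: fitting the tightest separable convex quadratic lying below a set of sampled local minima reduces to a linear program (the published SDU method fits a general quadratic via semidefinite programming; all names here are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def quadratic_underestimator(X, f):
    """Fit the tightest separable convex quadratic
        q(x) = sum_k a_k x_k^2 + b.x + c,  with a_k >= 0,
    lying below all sampled local minima (X[i], f[i])."""
    n, d = X.shape
    # Decision vector z = [a (d entries), b (d entries), c (1 entry)].
    rows = np.hstack([X**2, X, np.ones((n, 1))])   # q(x_i) = rows[i] @ z
    res = linprog(c=-rows.sum(axis=0),             # maximize sum_i q(x_i)
                  A_ub=rows, b_ub=f,               # subject to q(x_i) <= f_i
                  bounds=[(0, None)] * d + [(None, None)] * (d + 1))
    a, b, c0 = res.x[:d], res.x[d:2*d], res.x[-1]
    return a, b, c0, lambda x: (a * x**2).sum() + b @ x + c0

# The minimizer of q, x* = -b / (2a) (for a > 0 componentwise),
# then biases where the next round of sampling is concentrated.
```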
Abstract:
A method is proposed that can generate a ranked list of plausible three-dimensional hand configurations that best match an input image. Hand pose estimation is formulated as an image database indexing problem, where the closest matches for an input hand image are retrieved from a large database of synthetic hand images. In contrast to previous approaches, the system can function in the presence of clutter, thanks to two novel clutter-tolerant indexing methods. First, a computationally efficient approximation of the image-to-model chamfer distance is obtained by embedding binary edge images into a high-dimensional Euclidean space. Second, a general-purpose, probabilistic line matching method identifies those line segment correspondences between model and input images that are the least likely to have occurred by chance. The performance of this clutter-tolerant approach is demonstrated in quantitative experiments with hundreds of real hand images.
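For reference, the exact chamfer distance that the embedding approximates can be computed with a distance transform; a short sketch assuming binary (boolean) edge images with at least one edge pixel each:

```python
from scipy.ndimage import distance_transform_edt

def directed_chamfer(edges_a, edges_b):
    """Directed chamfer distance between two binary edge images: the mean
    distance from each edge pixel of A to the nearest edge pixel of B.
    distance_transform_edt gives each pixel's distance to the nearest zero,
    so we pass the inverted edge map of B."""
    dt_b = distance_transform_edt(~edges_b)
    return dt_b[edges_a].mean()

def chamfer(edges_a, edges_b):
    """Symmetric chamfer distance.  This exact version costs a distance
    transform per comparison; embedding edge images into a Euclidean space
    lets a large database be indexed and searched far more cheaply."""
    return directed_chamfer(edges_a, edges_b) + directed_chamfer(edges_b, edges_a)
```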
Abstract:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy.

The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations.

The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space.

The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems.

The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
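A minimal sketch of the embed-then-refine retrieval pattern the thesis builds on, using simple reference-object (Lipschitz) embeddings in place of the learned BoostMap embedding (function names and the shortlist size are illustrative):

```python
import numpy as np

def build_embedding(references, dist):
    """One-dimensional reference-object embeddings F_r(x) = dist(x, r).
    Each such map is 1-Lipschitz, so |F_r(x) - F_r(q)| <= dist(x, q);
    BoostMap learns a weighted combination of many of these with boosting,
    which this sketch omits."""
    return lambda x: np.array([dist(x, r) for r in references])

def filter_and_refine(query, database, embed, dist, k=1, shortlist=50):
    """Approximate nearest-neighbor retrieval: rank the whole database by
    the cheap Euclidean distance in the embedded space, then re-rank only
    a short candidate list with the expensive original distance."""
    eq = embed(query)
    cheap = [np.linalg.norm(embed(x) - eq) for x in database]
    candidates = np.argsort(cheap)[:shortlist]
    exact = sorted(candidates, key=lambda i: dist(database[i], query))
    return exact[:k]
```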
Abstract:
We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation is to represent the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible new representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate k-way posterior probabilities of matches across records, and propagate the uncertainty of record linkage into later analyses. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previously proposed methods of record linkage, despite the high dimensional parameter space. We assess our results on real and simulated data.
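The bipartite linkage structure makes k-way match probabilities straightforward to read off posterior samples; a small illustrative sketch (the MCMC sampler itself is not shown):

```python
def coreference_probability(samples, record_ids):
    """Posterior probability that the given records all refer to the same
    latent individual.  Each MCMC sample is a sequence Lambda with
    Lambda[i] = latent-entity index of record i, i.e., the bipartite
    linkage structure: records on one side, latent individuals on the other."""
    hits = sum(len({s[i] for i in record_ids}) == 1 for s in samples)
    return hits / len(samples)

# e.g., a 3-way match query over records 3, 17, and 42 (possibly in
# different files): coreference_probability(mcmc_samples, [3, 17, 42])
```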
Abstract:
In this paper we propose a generalisation of the k-nearest neighbour (k-NN) retrieval method based on an error function using distance metrics in the solution and problem space. It is an interpolative method which is proposed to be effective for sparse case bases. The method applies equally to nominal, continuous and mixed domains, and does not depend upon an embedding in an n-dimensional space. In continuous Euclidean problem domains, the method is shown to be a generalisation of the Shepard's Interpolation method. We term the retrieval algorithm the Generalised Shepard Nearest Neighbour (GSNN) method. A novel aspect of GSNN is that it provides a general method for interpolation over nominal solution domains. The performance of the retrieval method is examined with reference to the Iris classification problem, and to a simulated sparse nominal value test problem. The introduction of a solution-space metric is shown to outperform conventional nearest-neighbour methods on sparse case bases.
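A sketch of the GSNN prediction step under assumed details (the exact error function and neighbour weighting in the paper may differ; candidates is the set of admissible nominal solutions):

```python
def gsnn_predict(query, cases, d_problem, d_solution, candidates, k=5, p=2):
    """Generalised-Shepard-style retrieval: pick, from a set of candidate
    solutions, the one minimizing the distance-weighted sum of
    solution-space distances to the k retrieved cases.  With a Euclidean
    solution space and all observed solutions as candidates, this tends
    toward classical Shepard interpolation."""
    neighbors = sorted(cases, key=lambda c: d_problem(query, c[0]))[:k]
    weights = [1.0 / (d_problem(query, x) ** p + 1e-12) for x, _ in neighbors]

    def error(y):
        # Interpolated error of candidate solution y over the neighbors.
        return sum(w * d_solution(y, yi) ** 2
                   for w, (_, yi) in zip(weights, neighbors))

    return min(candidates, key=error)
```

Because the error is evaluated over a candidate set rather than by averaging solution vectors, the same code works when solutions are nominal labels with only a solution-space metric defined on them.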
Abstract:
Surrogate-based optimization methods provide a means to achieve high-fidelity design optimization at reduced computational cost by using a high-fidelity model in combination with lower-fidelity models that are less expensive to evaluate. This paper presents a provably convergent trust-region model-management methodology for variable-parameterization design models, that is, models for which the design parameters are defined over different spaces. Corrected space mapping is introduced as a method to map between the variable-parameterization design spaces. It is then used with a sequential-quadratic-programming-like trust-region method for two aerospace-related design optimization problems. Results for a wing design problem and a flapping-flight problem show that the method outperforms direct optimization in the high-fidelity space. On the wing design problem, the new method achieves 76% savings in high-fidelity function calls. On a bat-flight design problem, it achieves approximately 45% time savings, although it converges to a different local minimum than did the benchmark.
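A generic trust-region model-management loop with an additively corrected low-fidelity model gives the flavor of the approach, though the paper's corrected space mapping between differently parameterized design spaces is not reproduced here (all names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def trust_region_vfm(f_hi, f_lo, x0, delta=1.0, iters=20, eta=0.2):
    """Trust-region model management with a zeroth-order additive
    correction: the surrogate matches the high-fidelity model at the
    current iterate and is trusted only within a box of radius delta."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        shift = f_hi(x) - f_lo(x)                     # correction at x
        model = lambda s: f_lo(x + s) + shift         # matches f_hi at s = 0
        res = minimize(model, np.zeros_like(x),
                       bounds=[(-delta, delta)] * len(x))
        s = res.x
        actual = f_hi(x) - f_hi(x + s)                # true improvement
        predicted = model(np.zeros_like(x)) - model(s)
        rho = actual / predicted if predicted > 0 else 0.0
        if rho >= eta:                                # step accepted
            x = x + s
            delta *= 2.0 if rho > 0.75 else 1.0       # possibly expand
        else:
            delta *= 0.5                              # shrink trust region
    return x
```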