871 results for High Dimensional Space
Abstract:
This paper presents a framework for performing real-time recursive estimation of landmarks’ visual appearance. Imaging data in its original high dimensional space is probabilistically mapped to a compressed low dimensional space through the definition of likelihood functions. The likelihoods are subsequently fused with prior information using a Bayesian update. This process produces a probabilistic estimate of the low dimensional representation of the landmark visual appearance. The overall filtering provides information complementary to the conventional position estimates which is used to enhance data association. In addition to robotics observations, the filter integrates human observations in the appearance estimates. The appearance tracks as computed by the filter allow landmark classification. The set of labels involved in the classification task is thought of as an observation space where human observations are made by selecting a label. The low dimensional appearance estimates returned by the filter allow for low cost communication in low bandwidth sensor networks. Deployment of the filter in such a network is demonstrated in an outdoor mapping application involving a human operator, a ground and an air vehicle.
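The Bayesian fusion step described in this abstract can be illustrated with a minimal sketch. It assumes a discrete belief over a small set of labels rather than the paper's low dimensional appearance space; the label names, the probabilities and the function `bayes_update` are hypothetical and are not taken from the paper.

```python
def bayes_update(prior, likelihood):
    """Fuse a prior over labels with an observation likelihood (discrete Bayes rule)."""
    unnormalized = {label: prior[label] * likelihood[label] for label in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return {label: p / total for label, p in unnormalized.items()}


# Hypothetical label set and numbers, chosen only for illustration.
belief = {"tree": 1 / 3, "shed": 1 / 3, "rock": 1 / 3}

# A robot sensor observation, expressed as a likelihood over the labels.
robot_likelihood = {"tree": 0.6, "shed": 0.3, "rock": 0.1}
belief = bayes_update(belief, robot_likelihood)

# A human operator selects the label "tree"; modelled here as a noisy observation.
human_likelihood = {"tree": 0.8, "shed": 0.1, "rock": 0.1}
belief = bayes_update(belief, human_likelihood)

# The most probable label after both observations have been fused.
best_label = max(belief, key=belief.get)
```

Because each update only multiplies and renormalizes a short vector, the belief that has to be communicated between nodes stays small, which is in the spirit of the low bandwidth argument made in the abstract.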
Abstract:
Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Fortunately, in high-dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation called loci. Clusters' loci are efficiently calculated using documents' ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters' loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions with promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.
Abstract:
Identifying unusual or anomalous patterns in an underlying dataset is an important but challenging task in many applications. The focus of the unsupervised anomaly detection literature has mostly been on vectorised data. However, many applications are more naturally described using higher-order tensor representations. Approaches that vectorise tensorial data can destroy the structural information encoded in the high-dimensional space, and lead to the problem of the curse of dimensionality. In this paper we present the first unsupervised tensorial anomaly detection method, along with a randomised version of our method. Our anomaly detection method, the One-class Support Tensor Machine (1STM), is a generalisation of conventional one-class Support Vector Machines to higher-order spaces. 1STM preserves the multiway structure of tensor data, while achieving significant improvement in accuracy and efficiency over conventional vectorised methods. We then leverage the theory of nonlinear random projections to propose the Randomised 1STM (R1STM). Our empirical analysis on several real and synthetic datasets shows that our R1STM algorithm delivers comparable or better accuracy to a state-of-the-art deep learning method and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
Abstract:
When a document corpus is very large, we often need to reduce the number of features. However, conventional Non-negative Matrix Factorization (NMF) cannot be applied to a billion-by-million matrix, as the matrix may not fit in memory. Here we present a novel Online NMF algorithm. Using Online NMF, we reduce the original high-dimensional space to a low-dimensional space. We then cluster all the documents in the reduced dimension using the k-means algorithm. We show experimentally that processing small subsets of documents is sufficient to achieve good performance. The proposed method outperforms existing algorithms.
Abstract:
In this paper, a new classifier for speaker identification is proposed, based on Biomimetic Pattern Recognition (BPR). In contrast to traditional speaker recognition methods such as DWT, HMM, GMM and SVM, the proposed classifier is constructed from a finite set of subspaces that reasonably cover the points in high-dimensional space according to the distribution characteristics of the speech feature points. It has been used in a speaker identification system. Experimental results show that it achieves better performance, especially with fewer samples. Furthermore, the proposed classifier employs a much simpler modeling structure than the GMM. In addition, the basic idea of "cognition" in BPR means that the old system need not be retrained when new speakers are enrolled.
Abstract:
In this paper, a novel mathematical model of the neuron, the Double Synaptic Weight Neuron (DSWN), is presented. The DSWN can simulate many kinds of neuron architectures, including Radial-Basis-Function (RBF), Hyper Sausage and Hyper Ellipsoid models. Moreover, this new model has been implemented in the new CASSANN-II neurocomputer, which can be used to form various types of neural networks with multiple mathematical models of neurons. The flexibility of the DSWN in constructing neural networks is also described. Based on the theory of Biomimetic Pattern Recognition (BPR) and high-dimensional space covering, a recognition system for omnidirectionally oriented rigid objects on a horizontal surface and a face recognition system were implemented on the CASSANN-II neurocomputer. In these two cases, the results showed that the DSWN neural network has great potential in pattern recognition.
Abstract:
The differences between connectionism and symbolicism in artificial intelligence (AI) are first illustrated in detail from several aspects. Then, after decision factors of connectionism are proposed conceptually, the commonalities between connectionism and symbolicism are verified using some typical logical-mathematical operation examples such as "parity". Finally, neuron structures are expanded by modifying neuron weights and thresholds in artificial neural networks through the adoption of high-dimensional space geometry cognition, which gives more room for overall development and further embodies both commonalities.
Abstract:
Based on biomimetic pattern recognition theory, we propose a novel speaker-independent continuous speech keyword-spotting algorithm. Without endpoint detection or segmentation, we obtain the minimum distance curve between the continuous speech samples and every keyword-training net by dynamically searching the feature-extracted continuous speech. We then count the keywords by examining the valley values and the number of valleys in the curve. Experiments on small-vocabulary continuous speech at various speaking rates yielded good recognition results and demonstrated the validity of the algorithm.
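The counting step in this abstract, where keyword occurrences are read off the low points of a minimum distance curve, might look roughly like the sketch below. It assumes a keyword occurrence is a local minimum whose value falls below a fixed threshold; the function `count_valleys`, the threshold and the sample curve are hypothetical and are not taken from the paper.

```python
def count_valleys(curve, threshold):
    """Count local minima of `curve` whose value is below `threshold`.

    The left comparison is strict and the right one is not, so a flat
    valley floor spanning several samples is counted only once.
    """
    count = 0
    for i in range(1, len(curve) - 1):
        is_local_min = curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]
        if is_local_min and curve[i] < threshold:
            count += 1
    return count


# Hypothetical distances between a speech stream and one keyword's training net.
distance_curve = [9.0, 7.5, 2.1, 6.8, 8.2, 5.9, 6.3, 1.4, 1.4, 7.0, 8.8]

# Only sufficiently deep valleys are treated as keyword occurrences.
keyword_hits = count_valleys(distance_curve, threshold=3.0)
```

The shallow dip at 5.9 is a local minimum but stays above the threshold, so it is not counted as a keyword.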
Abstract:
The accurate recognition of cancer subtypes is of great clinical significance. In particular, DNA microarray gene expression technology is applied to diagnosing and recognizing cancer types. This paper proposes a method for recognizing cancer subtypes based on geometrical learning. First, the cancer gene expression profile data were preprocessed and feature genes were selected by a conventional method; then the expression data of the feature genes in the training samples were used to construct convex hulls in the high-dimensional space with the training algorithm of geometrical learning, while the independent test set was tested with the recognition algorithm of geometrical learning. The method was applied to human acute leukemia gene expression data, where the accuracy rate reached 100%. The experiments demonstrate its efficiency and feasibility.
Abstract:
A Mandarin keyword spotting system was investigated, and a new approach was proposed based on the principle of homology continuity and on point location analysis in high-dimensional space geometry theory, both of which are parts of biomimetic pattern recognition theory. This approach constructs a hyper-polyhedron from the sample points in the training set and calculates the distance between each test point and the hyper-polyhedron. Classification is based on the values of those distances. The approach was tested on a speech database that we created. Its performance was compared with the classic HMM approach, and the results show that the new approach performs much better than the HMM approach when the training data are insufficient.
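The distance-based classification described in this abstract can be sketched under a strong simplification. Instead of a full hyper-polyhedron, each class is assumed to be covered by the line segments joining consecutive training samples, and a test point is assigned to the class whose cover is nearest. The function names and the sample coordinates are hypothetical; the paper's actual construction may differ.

```python
import math


def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b, in any dimension."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    ab_sq = sum(x * x for x in ab)
    if ab_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Projection parameter of p onto the line, clamped to the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / ab_sq
        t = max(0.0, min(1.0, t))
    closest = [ai + t * x for ai, x in zip(a, ab)]
    return math.dist(p, closest)


def distance_to_cover(p, samples):
    """Minimum distance from p to the polyline through consecutive samples.

    Assumes at least two training samples per class.
    """
    return min(point_to_segment(p, a, b) for a, b in zip(samples, samples[1:]))


def classify(p, covers):
    """Return the class name whose cover lies nearest to p."""
    return min(covers, key=lambda name: distance_to_cover(p, covers[name]))


# Hypothetical 3-dimensional feature points for two keywords.
covers = {
    "keyword_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "keyword_b": [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],
}
test_point = (1.5, 1.0, 0.0)
predicted = classify(test_point, covers)
```

A rejection threshold on the minimum distance could be added so that points far from every cover are labelled as non-keywords rather than forced into a class.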
Abstract:
Following an introduction to the traditional mathematical models of neurons in general-purpose neurocomputers, a novel all-purpose mathematical model, the Double Synaptic Weight Neuron (DSWN), is presented, which can simulate all kinds of neuron architectures, including Radial-Basis-Function (RBF) and Back-propagation (BP) models. This new model is realized in hardware and implemented in the new CASSANN-II neurocomputer, which can be used to form various types of neural networks with multiple mathematical models of neurons. The flexibility of the new model in constructing neural networks is also described. Based on the theory of Biomimetic Pattern Recognition (BPR) and high-dimensional space covering, a recognition system for omnidirectionally oriented rigid objects on a horizontal surface and a face recognition system were implemented on the CASSANN-II neurocomputer. The results showed that the DSWN neural network has great potential in pattern recognition.
Abstract:
A new model of pattern recognition principles, Biomimetic Pattern Recognition, which is based on "matter cognition" instead of "matter classification", has been proposed. As an important means of realizing Biomimetic Pattern Recognition, the mathematical model and analysis method of ANNs have achieved a breakthrough: a novel all-purpose mathematical model has been advanced, which can simulate all kinds of neuron architectures, including RBF and BP models. At the same time, this model has been realized in hardware, and the high-dimensional space geometry method, a new means of analyzing ANNs, has been studied.
Abstract:
Digitization is the main feature of modern Information Science. By identifying digits with coordinates, Information Science becomes closely related to high-dimensional space, and information problems are transformed into geometry problems in high-dimensional spaces. From this basic idea, we propose Computational Information Geometry (CIG) for information analysis and processing. Two applications of CIG are given: blurred image restoration and pattern recognition. The experimental results are satisfactory. This paper also introduces how to combine groups of simple operators in 2D planes to implement geometrical computations in high-dimensional space. Many of the algorithms have been implemented in software.
Abstract:
Anomalies are unusual and significant changes in a network's traffic levels, which can often involve multiple links. Diagnosing anomalies is critical for both network operators and end users. It is a difficult problem because one must extract and interpret anomalous patterns from large amounts of high-dimensional, noisy data. In this paper we propose a general method to diagnose anomalies. This method is based on a separation of the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions. We show that this separation can be performed effectively using Principal Component Analysis. Using only simple traffic measurements from links, we study volume anomalies and show that the method can: (1) accurately detect when a volume anomaly is occurring; (2) correctly identify the underlying origin-destination (OD) flow which is the source of the anomaly; and (3) accurately estimate the amount of traffic involved in the anomalous OD flow. We evaluate the method's ability to diagnose (i.e., detect, identify, and quantify) both existing and synthetically injected volume anomalies in real traffic from two backbone networks. Our method consistently diagnoses the largest volume anomalies, and does so with a very low false alarm rate.
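The subspace separation described in this abstract can be sketched in a few lines. This is a minimal illustration that assumes a single principal axis spans the normal subspace and that the leading axis is found by power iteration; the abstract does not say how many components are kept or how the detection threshold is chosen. The link counts, the function names and the test measurements are hypothetical.

```python
import math


def center(rows):
    """Subtract the per-link mean from every measurement row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means


def top_component(rows, iterations=200):
    """Leading principal axis of centered rows, via power iteration on X^T X."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        scores = [sum(x * vi for x, vi in zip(row, v)) for row in rows]  # X v
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def residual_energy(row, axis):
    """Squared norm of the part of `row` lying outside the normal subspace."""
    score = sum(x * a for x, a in zip(row, axis))
    residual = [x - score * a for x, a in zip(row, axis)]
    return sum(r * r for r in residual)


# Hypothetical traffic on three links that follows one shared daily pattern.
pattern = [1.0, 1.2, 1.5, 1.8, 2.0, 1.7, 1.4, 1.1]
training = [[10 * s, 20 * s, 30 * s] for s in pattern]

centered, means = center(training)
axis = top_component(centered)


def spe(measurement):
    """Residual (anomalous-subspace) energy of a new link measurement."""
    row = [x - m for x, m in zip(measurement, means)]
    return residual_energy(row, axis)


normal_score = spe([13.0, 26.0, 39.0])     # follows the shared pattern
anomalous_score = spe([15.0, 55.0, 45.0])  # extra volume on the second link
```

A measurement would be flagged when its residual energy exceeds a threshold. Choosing that threshold, and the number of principal axes assigned to the normal subspace, are design decisions that this sketch leaves open.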