930 results for nonlinear dimensionality reduction


Relevance:

80.00%

Abstract:

In this paper, we present the application of a non-linear dimensionality reduction technique for the learning and probabilistic classification of hyperspectral images. Hyperspectral image spectroscopy is an emerging technique for geological investigations from airborne or orbital sensors. It gives much greater information content per pixel on the image than a normal colour image. This should greatly help with the autonomous identification of natural and manmade objects in unfamiliar terrains for robotic vehicles. However, the large information content of such data makes interpretation of hyperspectral images time-consuming and user-intensive. We propose the use of Isomap, a non-linear manifold learning technique, combined with Expectation Maximisation in graphical probabilistic models for learning and classification. Isomap is used to find the underlying manifold of the training data. This low dimensional representation of the hyperspectral data facilitates the learning of a Gaussian Mixture Model representation, whose joint probability distributions can be calculated offline. The learnt model is then applied to the hyperspectral image at runtime and data classification can be performed.
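As a rough illustration of the manifold step described above, the following minimal sketch computes the graph-based geodesic distances that Isomap relies on. It is not the authors' implementation: the fixed-radius neighbourhood rule, the function names, and the toy U-shaped data are all assumptions made for illustration, and the final embedding and Gaussian Mixture Model stages are omitted.

```python
import math

def neighbourhood_graph(points, radius):
    """Connect every pair of points within `radius`; edge weights are Euclidean distances."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= radius:
                dist[i][j] = dist[j][i] = d
    return dist

def geodesic_distances(graph):
    """Floyd-Warshall all-pairs shortest paths over the neighbourhood graph."""
    n = len(graph)
    geo = [row[:] for row in graph]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = geo[i][k] + geo[k][j]
                if via_k < geo[i][j]:
                    geo[i][j] = via_k
    return geo

# Hypothetical toy data: seven points along a U-shaped curve.
u_curve = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
geo = geodesic_distances(neighbourhood_graph(u_curve, radius=1.1))
straight = math.dist(u_curve[0], u_curve[-1])
```

For the two ends of the curve, the path along the graph is three times longer than the straight-line distance, which is the kind of structure a linear projection would miss.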

Relevance:

80.00%

Abstract:

Eigen-based techniques and other monolithic approaches to face recognition have long been a cornerstone in the face recognition community due to the high dimensionality of face images. Eigen-face techniques provide minimal reconstruction error and limit high-frequency content while linear discriminant-based techniques (fisher-faces) allow the construction of subspaces which preserve discriminatory information. This paper presents a frequency decomposition approach for improved face recognition performance utilising three well-known techniques: Wavelets; Gabor / Log-Gabor; and the Discrete Cosine Transform. Experimentation illustrates that frequency domain partitioning prior to dimensionality reduction increases the information available for classification and greatly increases face recognition performance for both eigen-face and fisher-face approaches.

Relevance:

80.00%

Abstract:

Performance comparisons between File Signatures and Inverted Files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as Language and Probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these have not so far been linked to general purpose, signature file based search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature file based indexing and retrieval, performance that is comparable to that of state-of-the-art inverted file based systems, including Language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings and position the file signatures model in the class of Vector Space retrieval models.
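To make the idea of a signature built by dimensionality reduction concrete, here is a minimal sketch of a random-projection style bit signature. It is an illustrative assumption, not the TopSig algorithm itself: the 64-bit width, the SHA-256 based term vectors, the unweighted term sum, and all function names are hypothetical choices.

```python
import hashlib

SIGNATURE_BITS = 64  # hypothetical signature width chosen for the example

def term_vector(term, bits=SIGNATURE_BITS):
    """Derive a deterministic pseudo-random +1/-1 vector from a term's hash."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return [1 if (value >> i) & 1 else -1 for i in range(bits)]

def signature(text, bits=SIGNATURE_BITS):
    """Sum the term vectors of a document, then keep only the sign of each coordinate."""
    totals = [0] * bits
    for term in text.lower().split():
        for i, v in enumerate(term_vector(term, bits)):
            totals[i] += v
    return [1 if t > 0 else 0 for t in totals]

def hamming(sig_a, sig_b):
    """Number of bit positions in which two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))

doc_a = signature("signature files for text retrieval")
doc_b = signature("retrieval text for files signature")
```

Because the term vectors are summed, the signature ignores word order, so `doc_a` and `doc_b` are identical. Searching then amounts to ranking documents by Hamming distance to the query signature.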

Relevance:

80.00%

Abstract:

Compressive Sensing (CS) is a popular signal processing technique that can exactly reconstruct a signal given a small number of random projections of the original signal, provided that the signal is sufficiently sparse. We demonstrate the applicability of CS in the field of gait recognition as a very effective dimensionality reduction technique, using the gait energy image (GEI) as the feature extraction process. We compare the CS based approach to principal component analysis (PCA) and show that the proposed method outperforms this baseline, particularly under situations where there are appearance changes in the subject. Applying CS to the gait features also avoids the need to train the models, by using a generalised random projection.
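The training-free projection mentioned above can be sketched as follows. This is only an illustration of a Gaussian random projection: the feature length of 1000, the target dimension of 50, the seeds, and the function names are assumptions, and neither the sparse reconstruction nor the classifier from the paper is shown.

```python
import math
import random

def random_projection_matrix(n_in, n_out, seed=0):
    """Gaussian random matrix scaled so squared lengths are preserved in expectation."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)] for _ in range(n_out)]

def project(matrix, vector):
    """Multiply the projection matrix by one feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Hypothetical stand-in for a flattened gait energy image (values in [0, 1]).
rng = random.Random(42)
gei_features = [rng.random() for _ in range(1000)]

projection = random_projection_matrix(n_in=1000, n_out=50, seed=0)
reduced = project(projection, gei_features)
```

Unlike PCA, the matrix is drawn once from a random generator and does not depend on any training data, which is why no model fitting is needed.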

Relevance:

80.00%

Abstract:

This paper investigates the use of the dimensionality-reduction techniques weighted linear discriminant analysis (WLDA) and weighted median fisher discriminant analysis (WMFD) before probabilistic linear discriminant analysis (PLDA) modeling for the purpose of improving speaker verification performance in the presence of high inter-session variability. Recently it was shown that WLDA techniques can provide improvement over traditional linear discriminant analysis (LDA) for channel compensation in i-vector based speaker verification systems. We show in this paper that the speaker discriminative information that is available in the distance between pairs of speakers clustered in the development i-vector space can also be exploited in heavy-tailed PLDA modeling by using the weighted discriminant approaches prior to PLDA modeling. Based upon the results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset, we believe that WLDA and WMFD projections before PLDA modeling can provide an improved approach when compared to uncompensated PLDA modeling for i-vector based speaker verification systems.

Relevance:

80.00%

Abstract:

Hyperhomocysteinemia (hHcy) has been associated with an increased risk of cardiovascular disease and stroke. Essential hypertension (EH), a polygenic condition, has also been associated with increased risk of cardiovascular related disorders. To investigate the role of the homocysteine (Hcy) metabolism pathway in hypertension we conducted a case-control association study of Hcy pathway gene variants in a cohort of Caucasian hypertensives and age- and sex-matched normotensives. We genotyped two polymorphisms in the methylenetetrahydrofolate reductase gene (MTHFR C677T and MTHFR A1298C), one polymorphism in the methionine synthase reductase gene (MTRR A66G), and one polymorphism in the methylenetetrahydrofolate dehydrogenase 1 gene (MTHFD1 G1958A) and assessed their association with hypertension using chi-square analysis. We also performed a multifactor dimensionality reduction (MDR) analysis to investigate any potential epistatic interactions among the four polymorphisms and EH. None of the four polymorphisms was significantly associated with EH and although we found a moderate synergistic interaction between MTHFR A1298C and MTRR A66G, the association of the interaction model with EH was not statistically significant (

Relevance:

80.00%

Abstract:

Irregular atrial pressure, defective folate and cholesterol metabolism contribute to the pathogenesis of hypertension. However, little is known about the combined roles of the methylenetetrahydrofolate reductase (MTHFR), apolipoprotein-E (ApoE) and angiotensin-converting enzyme (ACE) genes, which are involved in metabolism and homeostasis. The objective of this study is to investigate the association of the MTHFR 677 C>T and 1298 A>C, ACE insertion–deletion (I/D) and ApoE genetic polymorphisms with hypertension and to further explore the epistasis interactions that are involved in these mechanisms. A total of 594 subjects, including 348 normotensive and 246 hypertensive ischemic stroke subjects, were recruited. The MTHFR 677 C>T and 1298 A>C, ACE I/D and ApoE polymorphisms were genotyped and the epistasis interactions were analyzed. The MTHFR 677 C>T and ApoE polymorphisms demonstrated significant associations with susceptibility to hypertension in multiple logistic regression models, multifactor dimensionality reduction and a classification and regression tree. In addition, the logistic regression model demonstrated that significant interactions existed between the ApoE E3E3, E2E4, E2E2 and MTHFR 677 C>T polymorphisms. In conclusion, the results of this epistasis study indicated a significant association between the ApoE and MTHFR polymorphisms and hypertension.
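The core pooling step of multifactor dimensionality reduction can be sketched as below. This is a simplified illustration under stated assumptions: the genotype coding (0 or 1 per locus), the case:control threshold of 1.0, the function names, and the toy data are all hypothetical, and the cross-validation used to compare locus combinations in a full MDR analysis is omitted.

```python
from collections import defaultdict

def mdr_high_risk_cells(genotypes, labels, threshold=1.0):
    """Mark a multilocus genotype combination high-risk when cases exceed threshold * controls."""
    cases = defaultdict(int)
    controls = defaultdict(int)
    for combo, is_case in zip(genotypes, labels):
        if is_case:
            cases[combo] += 1
        else:
            controls[combo] += 1
    high_risk = set()
    for combo in set(cases) | set(controls):
        if cases[combo] > threshold * controls[combo]:
            high_risk.add(combo)
    return high_risk

def mdr_accuracy(genotypes, labels, high_risk):
    """Fraction of subjects whose status matches the high-risk / low-risk label of their cell."""
    correct = sum((combo in high_risk) == bool(is_case)
                  for combo, is_case in zip(genotypes, labels))
    return correct / len(labels)

# Hypothetical toy data: two loci with a purely epistatic (XOR-like) effect.
genotypes = [(0, 1), (1, 0), (0, 1), (1, 0), (0, 0), (1, 1), (0, 0), (1, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = case, 0 = control

high_risk = mdr_high_risk_cells(genotypes, labels)
accuracy = mdr_accuracy(genotypes, labels, high_risk)
```

In this toy data neither locus is informative on its own, yet the two-locus cells separate cases from controls, which is the kind of interaction the method is designed to expose.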

Relevance:

80.00%

Abstract:

In recommender systems based on multidimensional data, additional metadata provides algorithms with more information for better understanding the interaction between users and items. However, most of the profiling approaches in neighbourhood-based recommendation approaches for multidimensional data merely split or project the dimensional data and do not consider the latent interaction between the dimensions of the data. In this paper, we propose a novel user/item profiling approach for Collaborative Filtering (CF) item recommendation on multidimensional data. We further present an incremental profiling method for updating the profiles. For item recommendation, we seek to delve into different types of relations in data to understand the interaction between users and items more fully, and propose three multidimensional CF recommendation approaches for top-N item recommendations based on the proposed user/item profiles. The proposed multidimensional CF approaches are capable of incorporating not only localized relations of user-user and/or item-item neighbourhoods but also latent interaction between all dimensions of the data. Experimental results show significant improvements in terms of recommendation accuracy.

Relevance:

80.00%

Abstract:

In this paper, we propose a highly reliable fault diagnosis scheme for incipient low-speed rolling element bearing failures. The scheme consists of fault feature calculation, discriminative fault feature analysis, and fault classification. The proposed approach first computes wavelet-based fault features, including the respective relative wavelet packet node energy and entropy, by applying a wavelet packet transform to an incoming acoustic emission signal. The most discriminative fault features are then filtered from the originally produced feature vector by using discriminative fault feature analysis based on a binary bat algorithm (BBA). Finally, the proposed approach employs one-against-all multiclass support vector machines to identify multiple low-speed rolling element bearing defects. This study compares the proposed BBA-based dimensionality reduction scheme with four other dimensionality reduction methodologies in terms of classification performance. Experimental results show that the proposed methodology is superior to other dimensionality reduction approaches, yielding an average classification accuracy of 94.9%, 95.8%, and 98.4% under bearing rotational speeds at 20 revolutions-per-minute (RPM), 80 RPM, and 140 RPM, respectively.
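The wavelet packet features named above can be illustrated with a minimal sketch. It is not the authors' feature extractor: the Haar wavelet, the decomposition depth of 2, the definition of entropy as the Shannon entropy of the relative node energies, and the toy signal are assumptions chosen to keep the example short. The signal length is assumed to be divisible by 2 raised to the depth.

```python
import math

def haar_step(signal):
    """One Haar analysis step: returns (approximation, detail), each half the input length."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_packet_nodes(signal, depth):
    """Full wavelet packet tree: split every node, not only the approximations, at each level."""
    nodes = [list(signal)]
    for _ in range(depth):
        nodes = [child for node in nodes for child in haar_step(node)]
    return nodes

def relative_energies(nodes):
    """Energy of each terminal node divided by the total energy over all terminal nodes."""
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]

def energy_entropy(rel):
    """Shannon entropy of the relative energy distribution (an assumed definition)."""
    return -sum(p * math.log(p) for p in rel if p > 0)

# Hypothetical toy signal: a constant level, so all energy sits in the lowest band.
nodes = wavelet_packet_nodes([1.0] * 8, depth=2)
rel = relative_energies(nodes)
entropy = energy_entropy(rel)
```

A feature selection stage such as the binary bat algorithm would then choose a subset of these per-node values; that stage is not sketched here.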

Relevance:

80.00%

Abstract:

This thesis studies document signatures, which are small representations of documents and other objects that can be stored compactly and compared for similarity. This research finds that document signatures can be effectively and efficiently used to both search and understand relationships between documents in large collections, scalable enough to search a billion documents in a fraction of a second. Deliverables arising from the research include an investigation of the representational capacity of document signatures, the publication of an open-source signature search platform and an approach for scaling signature retrieval to operate efficiently on collections containing hundreds of millions of documents.

Relevance:

80.00%

Abstract:

State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszár f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.

Relevance:

80.00%

Abstract:

An efficient and statistically robust solution for the identification of asteroids among numerous sets of astrometry is presented. In particular, numerical methods have been developed for the short-term identification of asteroids at discovery, and for the long-term identification of scarcely observed asteroids over apparitions, a task which has been lacking a robust method until now. The methods are based on the solid foundation of statistical orbital inversion properly taking into account the observational uncertainties, which allows for the detection of practically all correct identifications. Through the use of dimensionality-reduction techniques and efficient data structures, the exact methods have a loglinear, that is, O(n log n), computational complexity, where n is the number of included observation sets. The methods developed are thus suitable for future large-scale surveys which anticipate a substantial increase in the astrometric data rate. Due to the discontinuous nature of asteroid astrometry, separate sets of astrometry must be linked to a common asteroid from the very first discovery detections onwards. The reason for the discontinuity in the observed positions is the rotation of the observer with the Earth as well as the motion of the asteroid and the observer about the Sun. Therefore, the aim of identification is to find a set of orbital elements that reproduce the observed positions with residuals similar to the inevitable observational uncertainty. Unless the astrometric observation sets are linked, the corresponding asteroid is eventually lost as the uncertainty of the predicted positions grows too large to allow successful follow-up.
Whereas the presented identification theory and the numerical comparison algorithm are generally applicable, that is, also in fields other than astronomy (e.g., in the identification of space debris), the numerical methods developed for asteroid identification can immediately be applied to all objects on heliocentric orbits with negligible effects due to non-gravitational forces in the time frame of the analysis. The methods developed have been successfully applied to various identification problems. Simulations have shown that the methods developed are able to find virtually all correct linkages despite challenges such as numerous scarce observation sets, astrometric uncertainty, numerous objects confined to a limited region on the celestial sphere, long linking intervals, and substantial parallaxes. Tens of previously unknown main-belt asteroids have been identified with the short-term method in a preliminary study to locate asteroids among numerous unidentified sets of single-night astrometry of moving objects, and scarce astrometry obtained nearly simultaneously with Earth-based and space-based telescopes has been successfully linked despite a substantial parallax. Using the long-term method, thousands of realistic 3-linkages typically spanning several apparitions have so far been found among designated observation sets each spanning less than 48 hours.

Relevance:

80.00%

Abstract:

The generalization performance of the SVM classifier depends mainly on the VC dimension and the dimensionality of the data. By reducing the VC dimension of the SVM classifier, its generalization performance is expected to increase. In the present paper, we argue that the VC dimension of the SVM classifier can be reduced by applying bootstrapping and dimensionality reduction techniques. Experimental results showed that bootstrapping the original data and bootstrapping the projected (dimensionally reduced) data improved the performance of the SVM classifier.
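The bootstrapping step can be sketched as follows. This shows only the resampling with replacement; the toy data, the number of replicates, and the function name are assumptions, and the SVM training and dimensionality reduction applied to each replicate are left out.

```python
import random

def bootstrap_sample(samples, labels, rng):
    """Draw len(samples) training pairs uniformly at random, with replacement."""
    n = len(samples)
    picks = [rng.randrange(n) for _ in range(n)]
    return [samples[i] for i in picks], [labels[i] for i in picks]

# Hypothetical toy data set: 2-D points with binary labels.
samples = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.3), (0.7, 0.7), (0.2, 0.6)]
labels = [0, 1, 0, 1, 0]

rng = random.Random(0)
replicates = [bootstrap_sample(samples, labels, rng) for _ in range(10)]
```

Each replicate keeps the original sample size but may repeat some points and omit others; a classifier would be trained on every replicate in turn.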

Relevance:

80.00%

Abstract:

In this paper, we develop a game theoretic approach for clustering features in a learning problem. Feature clustering can serve as an important preprocessing step in many problems such as feature selection, dimensionality reduction, etc. In this approach, we view features as rational players of a coalitional game where they form coalitions (or clusters) among themselves in order to maximize their individual payoffs. We show how Nash Stable Partition (NSP), a well known concept in coalitional game theory, provides a natural way of clustering features. Through this approach, one can obtain some desirable properties of the clusters by choosing appropriate payoff functions. For a small number of features, the NSP based clustering can be found by solving an integer linear program (ILP). However, for a large number of features, the ILP based approach does not scale well and hence we propose a hierarchical approach. Interestingly, a key result that we prove on the equivalence between a k-size NSP of a coalitional game and the minimum k-cut of an appropriately constructed graph comes in handy for large scale problems. In this paper, we use the feature selection problem (in a classification setting) as a running example to illustrate our approach. We conduct experiments to illustrate the efficacy of our approach.
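The stability condition behind a Nash Stable Partition can be checked with a short sketch. The payoff function used here (the sum of pairwise similarities to the other members of a coalition), the similarity matrix, and the function names are assumptions for illustration; the paper's own payoff functions, ILP formulation, and hierarchical method are not reproduced.

```python
def payoff(player, coalition, similarity):
    """Hypothetical payoff: total similarity between a feature and the other coalition members."""
    return sum(similarity[player][other] for other in coalition if other != player)

def is_nash_stable(partition, similarity):
    """True when no single feature gains by moving to another coalition or to a singleton."""
    for coalition in partition:
        for player in coalition:
            current = payoff(player, coalition, similarity)
            alternatives = [set()] + [c for c in partition if c is not coalition]
            for target in alternatives:
                if payoff(player, target | {player}, similarity) > current:
                    return False
    return True

# Hypothetical similarity matrix: features 0 and 1 are alike, as are 2 and 3.
similarity = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]

stable = is_nash_stable([{0, 1}, {2, 3}], similarity)
unstable = is_nash_stable([{0, 2}, {1, 3}], similarity)
```

With this payoff, grouping similar features is stable, while pairing dissimilar features is not, because each feature would rather stand alone.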