9 results for Unsupervised Learning

at Indian Institute of Science - Bangalore - India


Relevance:

100.00%

Abstract:

In this paper, we present a methodology for identifying the best features from a large feature space. In a high-dimensional feature space, nearest neighbor search becomes meaningless, and both its quality and its performance suffer. Many data mining algorithms rely on nearest neighbor search, so instead of searching over all the features we need to select the relevant ones. We propose feature selection using Non-negative Matrix Factorization (NMF) and apply it to nearest neighbor search. A recent clustering algorithm based on Locally Consistent Concept Factorization (LCCF) achieves better document clustering quality by using the local geometrical and discriminating structure of the data. Using our feature selection method, we show a further improvement in clustering performance.
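The claim that nearest neighbor search loses meaning in high dimensions can be illustrated with a small experiment. The sketch below is not the paper's method. It assumes uniformly random points and Euclidean distance, and the function name `relative_contrast` is hypothetical. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Assumption: points are drawn uniformly from the unit hypercube and
    compared with Euclidean distance. A value near 0 means every point is
    almost equally far from the query, so "nearest" carries little meaning.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


low_dim = relative_contrast(2)     # few features: distances are well spread
high_dim = relative_contrast(500)  # many features: distances concentrate
print(low_dim, high_dim)
```

With these settings the contrast in 500 dimensions should be much smaller than in 2 dimensions, which is the usual motivation for selecting a smaller set of relevant features before searching.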

Relevance:

70.00%

Abstract:

The concept of a “mutualistic teacher” is introduced for unsupervised learning of the mean vectors of the components of a mixture of multivariate normal densities, when the number of classes is also unknown. The unsupervised learning problem is formulated here as a multi-stage quasi-supervised problem incorporating a cluster approach. The mutualistic teacher creates a quasi-supervised environment at each stage by picking out “mutual pairs” of samples and assigning identical (but unknown) labels to the individuals of each mutual pair. The number of classes, if not specified, can be determined at an intermediate stage. The risk in assigning identical labels to the individuals of mutual pairs is estimated. Results of some simulation studies are presented.
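The idea of picking out "mutual pairs" can be sketched as follows. The abstract does not define a mutual pair, so this example assumes one common reading: two samples that are each other's nearest neighbor under Euclidean distance. The helper names and the sample data are hypothetical.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def mutual_pairs(points):
    """Pairs (i, j), i < j, where i and j are each other's nearest neighbor.

    Assumption: this is one plausible definition of a "mutual pair";
    the source abstract does not spell out the exact rule.
    """
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return sorted((i, j) for i, j in enumerate(nn) if i < j and nn[j] == i)


# Illustrative samples only: two tight groups and one stray point.
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.1), (2.0, 2.5)]
pairs = mutual_pairs(samples)
print(pairs)  # [(0, 1), (2, 3)]
```

Each pair would then receive an identical, still unknown, label. The stray point at index 4 has a nearest neighbor, but that neighbor prefers another sample, so it joins no pair.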

Relevance:

70.00%

Abstract:

Outlier detection in high-dimensional categorical data has attracted much interest because qualitative features are used extensively to describe data across various application areas. Although established methods exist for handling dimensionality through feature selection on numerical data, the categorical domain is still being actively explored. Because outlier detection is generally considered an unsupervised learning problem, owing to the lack of knowledge about the nature of the various types of outliers, the related feature selection task also needs to be handled in an unsupervised manner. This motivates the development of an unsupervised feature selection algorithm for efficient detection of outliers in categorical data. To address this, we propose a novel feature selection algorithm based on the mutual information measure and entropy computation. The redundancy among the features is characterized using the mutual information measure in order to identify a suitable feature subset with less redundancy. A comparison with information gain based feature selection shows the effectiveness of the proposed algorithm for outlier detection. Its efficacy is demonstrated on various high-dimensional benchmark data sets using two existing outlier detection methods.
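The two quantities named in the abstract, entropy and mutual information, can be computed for categorical columns as sketched below. The greedy rule in `select_features` (prefer high entropy, penalize mutual information with features already chosen) is an assumption made for illustration and is not the authors' algorithm. The column names and data are hypothetical.

```python
import math
from collections import Counter


def entropy(column):
    """Shannon entropy (in bits) of a categorical column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two categorical columns."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def select_features(columns, k):
    """Greedy sketch: pick k features with high entropy and low redundancy.

    Assumption: score = entropy minus the summed mutual information with
    the features selected so far. The source does not give its exact rule.
    """
    remaining = list(columns)
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            redundancy = sum(
                mutual_information(columns[name], columns[s]) for s in selected
            )
            return entropy(columns[name]) - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Illustrative data: "colour_copy" duplicates "color"; "constant" is uninformative.
columns = {
    "color": ["r", "g", "r", "g", "r", "g", "r", "g"],
    "colour_copy": ["r", "g", "r", "g", "r", "g", "r", "g"],
    "shape": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "constant": ["x"] * 8,
}
chosen = select_features(columns, 2)
print(chosen)  # ['color', 'shape']
```

The duplicated column is skipped because its mutual information with the first pick equals its own entropy, which is the sense in which mutual information characterizes redundancy.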

Relevance:

60.00%

Abstract:

The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure of the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced, which implicitly meet the need for feature selection in the context of clustering techniques.

Relevance:

60.00%

Abstract:

The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure of the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced, which implicitly meet the need for feature selection in the context of clustering techniques.

Relevance:

60.00%

Abstract:

Carbon fiber reinforced polymer (CFRP) composite specimens with different thicknesses, geometries, and stacking sequences were subjected to fatigue spectrum loading in stages. Another set of specimens was subjected to static compression load. On-line acoustic emission (AE) monitoring was carried out during these tests. Two artificial neural networks, a Kohonen self-organizing feature map (KSOM) and a multi-layer perceptron (MLP), have been developed for AE signal analysis. AE signals from the specimens were clustered using the unsupervised learning KSOM. These clusters were correlated to the failure modes using available a priori information such as AE signal amplitude distributions, time of occurrence of signals, ultrasonic imaging, design of the laminates (stacking sequences, orientation of fibers), and AE parametric plots. Thereafter, AE signals generated from the rest of the specimens were classified by the supervised learning MLP. The network developed is suitable for on-line monitoring of AE signals in the presence of noise, and can be used for detection and identification of failure modes and their growth. The results indicate that the characteristics of AE signals from different failure modes in CFRP remain largely unaffected by the type of load, fiber orientation, and stacking sequence, because they are representative of the type of failure phenomenon. The type of loading affects only the extent of damage allowed before the specimens fail, and hence the number of AE signals during the test. The artificial neural networks (ANN) developed, together with the methods and procedures adopted, show significant success in AE signal characterization in a noisy environment (detection and identification of failure modes and their growth).

Relevance:

60.00%

Abstract:

Wetlands are the most productive and biologically diverse, yet very fragile, ecosystems. They are vulnerable to even small changes in their biotic and abiotic factors. In recent years, there has been concern over the continuous degradation of wetlands due to unplanned developmental activities. This necessitates inventorying, mapping, and monitoring of wetlands in order to implement sustainable management approaches. The principal objective of this work is to evolve a strategy to identify and monitor wetlands using temporal remote sensing (RS) data. Pattern classifiers were used to extract wetlands automatically from NIR bands of MODIS, Landsat MSS, and Landsat TM remote sensing data. MODIS provided data for 2002 to 2007, while for 1973 and 1992 IR bands of Landsat MSS and TM (79 m and 30 m spatial resolution) data were used. Principal components of IR bands of MODIS (250 m) were fused with IRS LISS-3 NIR (23.5 m). To extract wetlands, statistical unsupervised learning of IR bands for the respective temporal data was performed using a Bayesian approach based on prior probability, mean, and covariance. Temporal analysis of wetlands indicates a sharp decline of 58% in Greater Bangalore, attributable to intense urbanization processes, evident from a 466% increase in built-up area from 1973 to 2007.
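A Bayesian decision based on prior probability, mean, and covariance can be sketched for the simplest case of a single band, where the covariance reduces to a variance. This is a minimal illustration and not the study's procedure. The class names, priors, means, and variances below are made-up values, and the per-class Gaussian model is an assumption.

```python
import math


def log_posterior(x, prior, mean, var):
    """Unnormalized log posterior of a class under a 1-D Gaussian model.

    log P(class | x) = log prior + log N(x; mean, var) + constant
    """
    return (
        math.log(prior)
        - 0.5 * math.log(2 * math.pi * var)
        - (x - mean) ** 2 / (2 * var)
    )


def classify(x, classes):
    """Assign x to the class with the highest posterior."""
    return max(classes, key=lambda name: log_posterior(x, *classes[name]))


# Hypothetical class statistics as (prior, mean, variance) of a pixel value.
# These numbers are illustrative only and are not taken from the study.
classes = {
    "water": (0.2, 20.0, 25.0),
    "land": (0.8, 90.0, 400.0),
}

labels = [classify(v, classes) for v in (15.0, 30.0, 120.0)]
print(labels)  # ['water', 'water', 'land']
```

In an unsupervised setting the class statistics would themselves be estimated from the unlabeled pixels, for example by clustering, before this decision rule is applied. With several bands the variance becomes a covariance matrix and the squared difference becomes a Mahalanobis distance.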

Relevance:

60.00%

Abstract:

Some experimental results on the recognition of three-dimensional wire-frame objects are presented. To overcome the limitations of a recent model that employs radial basis function-based neural networks, we propose a hybrid learning system for object recognition, featuring an optimization strategy (simulated annealing) to avoid local minima of an energy functional, and an appropriate choice of centers of the units. Further, in an attempt to improve generalization ability and reduce training time, we invoke the principle of self-organization, which utilises an unsupervised learning algorithm.

Relevance:

30.00%

Abstract:

A new automata model, Mr,k, with a conceptually significant innovation in the form of multi-state alternatives at each instance, is proposed in this study. Computer simulations of the Mr,k model in the context of feature selection in an unsupervised environment have demonstrated the superiority of the model over similar models without this multi-state-choice innovation.