Digital Library

4 results for High dimensional

in Bulgarian Digital Mathematics Library at IMI-BAS

A Comparative Analysis of Predictive Learning Algorithms on High-Dimensional Microarray Cancer Data

Relevance: 100.00%

Abstract:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ = n/p is less than one. We explore the statistical and computational challenges inherent in these high-dimensional, low-sample-size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored using seven datasets where κ < 1. These techniques require special attention to tuning, which necessitated investigating several extensions of cross-validation to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.
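The interplay of regularization and cross-validated tuning described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it substitutes plain ridge regression with squared-error loss for the paper's classifiers, uses synthetic data rather than microarray data, and the function names (`ridge_fit`, `cv_select_lambda`), the penalty grid, and the fold count are all assumptions chosen for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients via the dual form X^T (X X^T + lam I)^{-1} y.

    The dual form solves an n x n system, which is cheap when n << p.
    """
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

def cv_select_lambda(X, y, lambdas, n_folds=5, seed=0):
    """Pick the penalty with the lowest mean K-fold squared error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    mean_errors = []
    for lam in lambdas:
        fold_errors = []
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            beta = ridge_fit(X[train], y[train], lam)
            fold_errors.append(np.mean((X[test] @ beta - y[test]) ** 2))
        mean_errors.append(float(np.mean(fold_errors)))
    return lambdas[int(np.argmin(mean_errors))], mean_errors

# Synthetic HDLSS data (hypothetical): n = 40 observations, p = 500 features,
# so kappa = n / p = 0.08 < 1.
rng = np.random.default_rng(42)
n, p = 40, 500
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0  # only five features carry signal
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, errors = cv_select_lambda(X, y, lambdas)
print("selected penalty:", best_lam)
```

With p > n the unpenalized least-squares problem has no unique solution, so some penalty is needed for the fit to be well defined, and its value has to be tuned on held-out folds.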


A Bimodality Test in High Dimensions

Relevance: 70.00%

Abstract:

We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.
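The claim that projections can suggest well-separated clusters where none exist can be illustrated with a small simulation. It is a sketch under stated assumptions, not the test proposed in the paper: it draws points from a single spherical normal distribution, runs a basic two-cluster Lloyd iteration, projects the data onto the line joining the two centroids, and reports the gap between projected cluster means in units of the pooled within-cluster standard deviation. The function names, the sample size of 50, and the initialization from the first two points are illustrative choices.

```python
import numpy as np

def two_means(X, max_iter=100):
    """Basic Lloyd iterations for k = 2, started from the first two points."""
    centroids = X[:2].copy()
    labels = None
    for _ in range(max_iter):
        # Squared distance from every point to both centroids, shape (n, 2).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(2):
            centroids[k] = X[labels == k].mean(axis=0)
    return labels, centroids

def projected_separation(X, labels, centroids):
    """Gap between projected cluster means over the pooled within-cluster std."""
    direction = centroids[1] - centroids[0]
    direction /= np.linalg.norm(direction)
    proj = X @ direction
    p0, p1 = proj[labels == 0], proj[labels == 1]
    pooled_var = (p0.var() * p0.size + p1.var() * p1.size) / proj.size
    return abs(p1.mean() - p0.mean()) / np.sqrt(pooled_var)

# One spherical normal population, so no true clusters exist in any dimension.
rng = np.random.default_rng(0)
n = 50
for d in (2, 20, 200, 2000):
    X = rng.standard_normal((n, d))
    labels, centroids = two_means(X)
    print(d, round(projected_separation(X, labels, centroids), 2))
```

With the sample size held fixed, the printed separation should tend to grow with the dimension even though the data are unimodal by construction, which is the kind of misleading picture the abstract warns about.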


The New Software Package for Dynamic Hierarchical Clustering for Circles Types of Shapes

Relevance: 60.00%

Abstract:

In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in large databases. Active research themes include the scalability of clustering methods, the effectiveness of methods for clustering complex shapes and types of data, high-dimensional clustering techniques, and methods for clustering mixed numerical and categorical data in large databases. One of the most accurate approaches, based on dynamic modeling of cluster similarity, is called Chameleon. In this paper we present a modified hierarchical clustering algorithm that uses the main idea of Chameleon, and we demonstrate the effectiveness of the suggested approach with experimental results.


Implementation of the EM Algorithm for Maximum Likelihood Estimation of a Random Effects Model for One Longitudinal Ordinal Outcome

Relevance: 60.00%

Abstract:

2010 Mathematics Subject Classification: 62J99.

