Digital Library

7 results for sample size calculation

in Bulgarian Digital Mathematics Library at IMI-BAS

Bayesian Prediction of Weibull Distribution Based on Fixed and Random Sample Size

Relevance:

100.00%

Abstract:

2000 Mathematics Subject Classification: 62E16, 65C05, 65C20.


Detection of Logical-and-Probabilistic Correlation in Time Series

Relevance:

80.00%

Abstract:

This work considers an application of the heterogeneous variables system prediction method to the time series analysis problem with respect to the sample size. A logical-and-probabilistic correlation is constructed from the class of logical decision functions. Two cases are considered: when the information about the event is preserved in the process itself, and when it is preserved in a dependent process.


Application of the Heterogeneous System Prediction Method to Pattern Recognition Problem

Relevance:

80.00%

Abstract:

* This work was financially supported by RFBR-04-01-00858.


Application of the Multivariate Prediction Method to Time Series

Relevance:

80.00%

Abstract:

* This work was financially supported by RFBR-04-01-00858.


A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

Relevance:

80.00%

Abstract:

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham’s razor non-plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.


A Comparative Analysis of Predictive Learning Algorithms on High-Dimensional Microarray Cancer Data

Relevance:

80.00%

Abstract:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ is less than one. We explore the statistical and computational challenges inherent in these high dimensional low sample size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored in this research using seven datasets where κ < 1. These techniques require special attention to tuning, which necessitated investigating several extensions of cross-validation to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.


Relationship between Extremal and Sum Processes Generated by the same Point Process

Relevance:

80.00%

Abstract:

2000 Mathematics Subject Classification: Primary 60G51, secondary 60G70, 60F17.

