12 resultados para Data Mining and its Application
em Bulgarian Digital Mathematics Library at IMI-BAS
Resumo:
An expansion formula for fractional derivatives given as in form of a series involving function and moments of its k-th derivative is derived. The convergence of the series is proved and an estimate of the reminder is given. The form of the fractional derivative given here is especially suitable in deriving restrictions, in a form of internal variable theory, following from the second law of thermodynamics, when applied to linear viscoelasticity of fractional derivative type.
Resumo:
2000 Mathematics Subject Classification: 26A33, 33C20
Resumo:
This article presents the principal results of the Ph.D. thesis Investigation and classification of doubly resolvable designs by Stela Zhelezova (Institute of Mathematics and Informatics, BAS), successfully defended at the Specialized Academic Council for Informatics and Mathematical Modeling on 22 February 2010.
Resumo:
MSC 2010: 26A33, 33E12, 35B45, 35B50, 35K99, 45K05 Dedicated to Professor Rudolf Gorenflo on the occasion of his 80th anniversary
Resumo:
2010 Mathematics Subject Classification: Primary 60J80; Secondary 92D30.
Resumo:
2000 Mathematics Subject Classification: Primary 60G55; secondary 60G25.
Resumo:
This paper presents the results of our data mining study of Pb-Zn (lead-zinc) ore assay records from a mine enterprise in Bulgaria. We examined the dataset, cleaned outliers, visualized the data, and created dataset statistics. A Pb-Zn cluster data mining model was created for segmentation and prediction of Pb-Zn ore assay data. The Pb-Zn cluster data model consists of five clusters and DMX queries. We analyzed the Pb-Zn cluster content, size, structure, and characteristics. The set of the DMX queries allows for browsing and managing the clusters, as well as predicting ore assay records. A testing and validation of the Pb-Zn cluster data mining model was developed in order to show its reasonable accuracy before beingused in a production environment. The Pb-Zn cluster data mining model can be used for changes of the mine grinding and floatation processing parameters in almost real-time, which is important for the efficiency of the Pb-Zn ore beneficiation process. ACM Computing Classification System (1998): H.2.8, H.3.3.
Resumo:
* The work is partially supported by Grant no. NIP917 of the Ministry of Science and Education – Republic of Bulgaria.
Resumo:
The concept of knowledge is the central one used when solving the various problems of data mining and pattern recognition in finite spaces of Boolean or multi-valued attributes. A special form of knowledge representation, called implicative regularities, is proposed for applying in two powerful tools of modern logic: the inductive inference and the deductive inference. The first one is used for extracting the knowledge from the data. The second is applied when the knowledge is used for calculation of the goal attribute values. A set of efficient algorithms was developed for that, dealing with Boolean functions and finite predicates represented by logical vectors and matrices.
Resumo:
Dimensionality reduction is a very important step in the data mining process. In this paper, we consider feature extraction for classification tasks as a technique to overcome problems occurring because of “the curse of dimensionality”. Three different eigenvector-based feature extraction approaches are discussed and three different kinds of applications with respect to classification tasks are considered. The summary of obtained results concerning the accuracy of classification schemes is presented with the conclusion about the search for the most appropriate feature extraction method. The problem how to discover knowledge needed to integrate the feature extraction and classification processes is stated. A decision support system to aid in the integration of the feature extraction and classification processes is proposed. The goals and requirements set for the decision support system and its basic structure are defined. The means of knowledge acquisition needed to build up the proposed system are considered.
Resumo:
AMS Subj. Classification: 62P10, 62H30, 68T01
Resumo:
Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham’s razor non-plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.