17 resultados para 080109 Pattern Recognition and Data Mining

em Bulgarian Digital Mathematics Library at IMI-BAS


Relevância:

100.00% 100.00%

Publicador:

Resumo:

* The work is partially supported by Grant no. NIP917 of the Ministry of Science and Education – Republic of Bulgaria.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents the results of our data mining study of Pb-Zn (lead-zinc) ore assay records from a mine enterprise in Bulgaria. We examined the dataset, cleaned outliers, visualized the data, and created dataset statistics. A Pb-Zn cluster data mining model was created for segmentation and prediction of Pb-Zn ore assay data. The Pb-Zn cluster data model consists of five clusters and DMX queries. We analyzed the Pb-Zn cluster content, size, structure, and characteristics. The set of the DMX queries allows for browsing and managing the clusters, as well as predicting ore assay records. A testing and validation of the Pb-Zn cluster data mining model was developed in order to show its reasonable accuracy before beingused in a production environment. The Pb-Zn cluster data mining model can be used for changes of the mine grinding and floatation processing parameters in almost real-time, which is important for the efficiency of the Pb-Zn ore beneficiation process. ACM Computing Classification System (1998): H.2.8, H.3.3.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham’s razor non-plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a modification for the high-order neural network (HONN) is presented. Third order networks are considered for achieving translation, rotation and scale invariant pattern recognition. They require however much storage and computation power for the task. The proposed modified HONN takes into account a priori knowledge of the binary patterns that have to be learned, achieving significant gain in computation time and memory requirements. This modification enables the efficient computation of HONNs for image fields of greater that 100 × 100 pixels without any loss of pattern information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Earlier the authors have suggested a logical level description of classes which allows to reduce a solution of various pattern recognition problems to solution of a sequence of one-type problems with the less dimension. Here conditions of the effectiveness of the use of such a level descriptions are proposed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Logic based Pattern Recognition extends the well known similarity models, where the distance measure is the base instrument for recognition. Initial part (1) of current publication in iTECH-06 reduces the logic based recognition models to the reduced disjunctive normal forms of partially defined Boolean functions. This step appears as a way to alternative pattern recognition instruments through combining metric and logic hypotheses and features, leading to studies of logic forms, hypotheses, hierarchies of hypotheses and effective algorithmic solutions. Current part (2) provides probabilistic conclusions on effective recognition by logic means in a model environment of binary attributes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this paper is to explain the notion of clustering and a concrete clustering method- agglomerative hierarchical clustering algorithm. It shows how a data mining method like clustering can be applied to the analysis of stocks, traded on the Bulgarian Stock Exchange in order to identify similar temporal behavior of the traded stocks. This problem is solved with the aid of a data mining tool that is called XLMiner™ for Microsoft Excel Office.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

* This work was financially supported by RFBR-04-01-00858.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The problem of decision functions quality in pattern recognition is considered. An overview of the approaches to the solution of this problem is given. Within the Bayesian framework, we suggest an approach based on the Bayesian interval estimates of quality on a finite set of events.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper is devoted to the description of hybrid pattern recognition method developed by research groups from Russia, Armenia and Spain. The method is based upon logical correction over the set of conventional neural networks. Output matrices of neural networks are processed according to the potentiality principle which allows increasing of recognition reliability.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

* The research is supported partly by INTAS: 04-77-7173 project, http://www.intas.be

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work the new pattern recognition method based on the unification of algebraic and statistical approaches is described. The main point of the method is the voting procedure upon the statistically weighted regularities, which are linear separators in two-dimensional projections of feature space. The report contains brief description of the theoretical foundations of the method, description of its software realization and the results of series of experiments proving its usefulness in practical tasks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dimensionality reduction is a very important step in the data mining process. In this paper, we consider feature extraction for classification tasks as a technique to overcome problems occurring because of “the curse of dimensionality”. Three different eigenvector-based feature extraction approaches are discussed and three different kinds of applications with respect to classification tasks are considered. The summary of obtained results concerning the accuracy of classification schemes is presented with the conclusion about the search for the most appropriate feature extraction method. The problem how to discover knowledge needed to integrate the feature extraction and classification processes is stated. A decision support system to aid in the integration of the feature extraction and classification processes is proposed. The goals and requirements set for the decision support system and its basic structure are defined. The means of knowledge acquisition needed to build up the proposed system are considered.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Categorising visitors based on their interaction with a website is a key problem in Web content usage. The clickstreams generated by various users often follow distinct patterns, the knowledge of which may help in providing customised content. This paper proposes an approach to clustering weblog data, based on ART2 neural networks. Due to the characteristics of the ART2 neural network model, the proposed approach can be used for unsupervised and self-learning data mining, which makes it adaptable to dynamically changing websites.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

AMS Subj. Classification: 62P10, 62H30, 68T01