Digital Library

7 results for text mining clusterizzazione clustering auto-organizzazione conoscenza MoK

in Bulgarian Digital Mathematics Library at IMI-BAS

Demo: Using RapidMiner for Text Mining

Relevance:

100.00%

Abstract:

This demo reviews basic text mining techniques using RapidMiner. It describes RapidMiner's basic characteristics and its text mining operators, and presents a text mining example that uses the Naive Bayes algorithm and process modeling.

A Statistical Approach for Multilingual Document Clustering and Topic Extraction from Clusters

Relevance:

100.00%

Abstract:

2000 Mathematics Subject Classification: 62H30

Automatic Generation of Titles for a Corpus of Questions

Relevance:

100.00%

Abstract:

This paper describes the methodology followed to automatically generate titles for a corpus of questions that belong to sociological opinion polls. Titles for questions have a twofold function: (1) they are the input of user searches and (2) they inform about the whole contents of the question and its possible answer options. Thus, generation of titles can be considered a case of automatic summarization. However, because summarization had to be performed over very short texts, and because of the aforementioned quality conditions imposed on newly generated titles, the authors followed knowledge-rich and domain-dependent strategies for summarization rather than the more common extractive techniques.

Using the Agglomerative Method of Hierarchical Clustering as a Data Mining Tool in Capital Market

Relevance:

40.00%

Abstract:

The purpose of this paper is to explain the notion of clustering and a concrete clustering method: the agglomerative hierarchical clustering algorithm. It shows how a data mining method such as clustering can be applied to the analysis of stocks traded on the Bulgarian Stock Exchange in order to identify similar temporal behavior among them. The problem is solved with the aid of a data mining tool called XLMiner™ for Microsoft Excel.
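As a rough illustration of the agglomerative idea mentioned in this abstract, the sketch below repeatedly merges the two closest clusters of return series. It is not the paper's XLMiner workflow: the average-linkage rule, the correlation-based distance, the function names, and the ticker data are all assumptions made up for this example.

```python
from itertools import combinations
from math import sqrt


def correlation_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def agglomerative(series, n_clusters):
    """Average-linkage agglomerative clustering; returns a list of sets of names."""
    # Start with every series in its own cluster.
    clusters = [{name} for name in series]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        dists = [correlation_distance(series[i], series[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical daily returns for four made-up tickers (not real market data).
returns = {
    "AAA": [0.01, -0.02, 0.03, -0.01, 0.02],
    "BBB": [0.02, -0.03, 0.04, -0.02, 0.03],
    "CCC": [-0.01, 0.02, -0.03, 0.01, -0.02],
    "DDD": [-0.02, 0.03, -0.05, 0.02, -0.03],
}
clusters = agglomerative(returns, n_clusters=2)
```

With this toy data, series that move together end up in the same cluster. Stopping at a fixed number of clusters is one simple choice; cutting the merge tree at a distance threshold is a common alternative.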

The New Software Package for Dynamic Hierarchical Clustering for Circles Types of Shapes

Relevance:

30.00%

Abstract:

In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in large databases. Active research themes include the scalability of clustering methods, the effectiveness of methods for clustering complex shapes and types of data, high-dimensional clustering techniques, and methods for clustering mixed numerical and categorical data in large databases. One of the most accurate approaches, based on dynamic modeling of cluster similarity, is called Chameleon. In this paper we present a modified hierarchical clustering algorithm that uses the main idea of Chameleon, and we demonstrate the effectiveness of the suggested approach with experimental results.

Data Mining for Browsing Patterns in Weblog Data by Art Neural Networks

Relevance:

30.00%

Abstract:

Categorising visitors based on their interaction with a website is a key problem in Web content usage. The clickstreams generated by various users often follow distinct patterns, the knowledge of which may help in providing customised content. This paper proposes an approach to clustering weblog data, based on ART2 neural networks. Due to the characteristics of the ART2 neural network model, the proposed approach can be used for unsupervised and self-learning data mining, which makes it adaptable to dynamically changing websites.

Analysis and Data Mining of Lead-Zinc Ore Data

Relevance:

30.00%

Abstract:

This paper presents the results of our data mining study of Pb-Zn (lead-zinc) ore assay records from a mine enterprise in Bulgaria. We examined the dataset, cleaned outliers, visualized the data, and created dataset statistics. A Pb-Zn cluster data mining model was created for segmentation and prediction of Pb-Zn ore assay data. The Pb-Zn cluster data model consists of five clusters and DMX queries. We analyzed the Pb-Zn cluster content, size, structure, and characteristics. The set of DMX queries allows for browsing and managing the clusters, as well as predicting ore assay records. Testing and validation of the Pb-Zn cluster data mining model were developed in order to show its reasonable accuracy before being used in a production environment. The Pb-Zn cluster data mining model can be used to change the mine grinding and flotation processing parameters in almost real time, which is important for the efficiency of the Pb-Zn ore beneficiation process. ACM Computing Classification System (1998): H.2.8, H.3.3.
