1000 results for Frequent mining



Abstract:

Education is a complex systems-engineering undertaking: it underpins the training of high-quality talent, helps society make full use of educational outcomes, and promotes the healthy development of education itself. Within education, students' scores are an important quantitative evaluation indicator that objectively reflects the effectiveness of the educational system and provides a basis for many decisions. This paper uses a clustering algorithm and a decision tree to analyze students' scores comprehensively and obtains useful results. The results are valuable for teaching and management.


Abstract:

This thesis examines the application of data mining techniques to predicting the pilling propensity of wool knitwear. Using real industrial data, it develops a pilling propensity prediction tool with embedded trained support vector machines that provides high-accuracy predictions for wool knitwear even before the yarn is spun.


Abstract:

This study tests a model of Brand Knowledge and Brand Equity for beer brands among new and frequent users in two populations that differ in culture and in their stage of the beer product life cycle. Using Multiple Logistic Regression (MLR) and Binomial Logistic Regression (BLR), models based on the respondents' Brand Knowledge correctly identify Chinese respondents' preferred brand of beer 56% of the time, and correctly identify the preferred brand for 77% of respondents in an Australian sample, when three top brands are tested. The model also identifies 67% of those who stay or switch in both the Australian and the Chinese samples.


Abstract:

This paper reports on the preparation and management of inconsistent data on damage to residential houses in Victoria, Australia. No specific and fully relevant database is readily available; only incomplete paper-based and electronic reports exist. Extracting from these reports all the information needed to analyze damage to residential houses founded on expansive soils is therefore complicated and time-consuming. Data mining is adopted to develop a database, and statistical and Artificial Intelligence methods are used to quantify the quality of the data. The paper concludes that such a database could enable BHC to evaluate the usefulness of the reports prepared on damaged properties for further analysis.


Abstract:

Purpose – The purpose of this study is to examine the gold price exposure of Australian gold mining firms in the highly volatile period from 1995 to 2000. This period was characterized by significant changes in the gold price due to bulk sales of gold by central banks collectively. Specifically, the paper aims to investigate several firm-specific factors that are hypothesized to have a substantial influence on gold beta.

Design/methodology/approach – To estimate gold beta, we use the following multifactor model: $R_{g,t} = \alpha + \beta_g GPR_t + \beta_x FXR_t + \beta_m R_{m,t} + \varepsilon_t$, where $R_{g,t}$ is the return on the gold stock index at time $t$, $GPR_t$ is the gold price return denominated in US dollars at time $t$, $FXR_t$ is the foreign exchange return of the Australian dollar in terms of the US dollar at time $t$, $R_{m,t}$ is the market return at time $t$, and $\varepsilon_t$ is the random error term at time $t$.

Findings – The paper finds that the values of gold beta are consistently greater than one, implying that firms' stock returns are highly sensitive to gold price changes. This also suggests that investors holding gold mining stock would receive a higher percentage increase in stock returns from a given percentage increase in gold price returns than investors holding gold bullion. Furthermore, these values have changed substantially over time with significant changes in gold price volatility. The most important and consistent relationship that we find is the impact of firms' hedging behavior on their respective gold betas, which is consistent with Tufano's study. It implies that firms that hedge a greater proportion of their gold reserves are less sensitive to movements in gold prices. The finding therefore supports the risk management theory that hedging increases shareholders' wealth. However, cash operating costs, cash reserves and the level of gold production seem to have very little influence on the firms' exposure to gold price changes.

Originality/value – This study is of interest and importance to gold mining companies and investors, because the extent to which gold price movements affect the stock returns of gold mining companies has a significant bearing on firms' risk management decisions and on investors' investment decisions.
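As a rough illustration of how the multifactor model in the Design/methodology/approach paragraph could be estimated, the sketch below fits the intercept and the three betas by ordinary least squares. The abstract does not state the estimation procedure, so OLS is an assumption here, and the function names (`solve_linear_system`, `ols`), the coefficient values, and the synthetic return series are all hypothetical and are not taken from the study.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y.

    Each row is [1, GPR_t, FXR_t, Rm_t]; y holds the gold stock returns Rg_t.
    """
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data generated from made-up coefficients
# [intercept, gold beta, FX beta, market beta]; illustration only.
rng = random.Random(0)
true_coeffs = [0.001, 1.5, -0.4, 0.8]
rows = [
    [1.0, rng.gauss(0, 0.02), rng.gauss(0, 0.01), rng.gauss(0, 0.03)]
    for _ in range(200)
]
y = [sum(c * x for c, x in zip(true_coeffs, row)) for row in rows]

estimated = ols(rows, y)
print([round(v, 4) for v in estimated])
```

Because the synthetic series contains no error term, the fit recovers the made-up coefficients almost exactly. With real return data the estimates would carry sampling error, and standard errors would be needed before interpreting the gold beta.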


Abstract:

Researchers have been endeavoring to discover concise sets of episode rules, rather than complete sets, in sequences. Existing approaches, however, cannot process complex sequences and cannot guarantee the accuracy of the resulting sets, because the frequency metric they use violates anti-monotonicity. In some real applications, episode rules need to be extracted from complex sequences in which multiple items may appear in a time slot. This paper investigates the discovery of concise episode rules in complex sequences. We define a concise representation called non-derivable episode rules and formalize the mining problem. Adopting a novel anti-monotonic frequency metric, we then develop a fast approach to discover non-derivable episode rules in complex sequences. Experimental results demonstrate that the proposed approach substantially reduces the number of rules and achieves fast processing.
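To make the terms "complex sequence" and "anti-monotonic frequency metric" concrete, the sketch below counts how often a serial episode occurs in sliding windows over a sequence whose time slots may each hold several items. This is a generic window-based frequency, used here only as an assumed stand-in; it is not the novel metric the paper proposes, and the names `occurs_in` and `window_frequency` are hypothetical.

```python
from typing import Sequence, Set


def occurs_in(window: Sequence[Set[str]], episode: Sequence[str]) -> bool:
    """True if the episode's items appear in order in strictly later slots."""
    pos = 0
    for slot in window:
        # At most one episode item is matched per time slot.
        if pos < len(episode) and episode[pos] in slot:
            pos += 1
    return pos == len(episode)


def window_frequency(
    seq: Sequence[Set[str]], episode: Sequence[str], width: int
) -> float:
    """Fraction of sliding windows of the given width that contain the episode."""
    n_windows = len(seq) - width + 1
    if n_windows <= 0:
        return 0.0
    hits = sum(occurs_in(seq[i:i + width], episode) for i in range(n_windows))
    return hits / n_windows


# A toy complex sequence: each time slot is a set of items.
seq = [{"a", "b"}, {"c"}, {"a"}, {"b", "c"}, {"d"}]

f_a = window_frequency(seq, ("a",), 3)
f_ab = window_frequency(seq, ("a", "b"), 3)
f_abd = window_frequency(seq, ("a", "b", "d"), 3)
print(f_a, f_ab, f_abd)
```

Under this metric, extending an episode can never raise its frequency (here `f_a >= f_ab >= f_abd`). That is the anti-monotonicity property that lets a miner prune every extension of an infrequent episode.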


Abstract:

Class imbalance in textual data is one important factor that affects the reliability of text mining. For imbalanced textual data, conventional classifiers tend to have a strong performance bias, which results in a high accuracy rate on the majority class but a very low rate on the minority classes. An extreme strategy for imbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily cause another type of bias, which increases the accuracy rate on the minority classes at the expense of the majority class. This chapter investigates approaches that reduce these two types of performance bias and improve the reliability of discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
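The two kinds of performance bias described above can be seen in a small sketch that compares overall accuracy with per-class rates. The label counts and the two trivial "classifiers" below are invented for illustration and do not come from the chapter. The helper names are hypothetical, and balanced accuracy (the mean of per-class rates) is used only as one assumed way to summarize how balanced the performance is.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all instances labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_rate(y_true, y_pred):
    """Accuracy rate computed separately for each true class (recall)."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class rates."""
    rates = per_class_rate(y_true, y_pred)
    return sum(rates.values()) / len(rates)


# Toy imbalanced data: 95 majority documents, 5 minority documents.
y_true = ["majority"] * 95 + ["minority"] * 5

# First bias: a classifier that always predicts the majority class.
always_majority = ["majority"] * 100
# Second bias: a classifier that always predicts the minority class.
always_minority = ["minority"] * 100

for name, pred in [("always majority", always_majority),
                   ("always minority", always_minority)]:
    print(name, accuracy(y_true, pred), per_class_rate(y_true, pred),
          balanced_accuracy(y_true, pred))
```

The always-majority predictor reaches 95% overall accuracy while getting every minority instance wrong, and the always-minority predictor does the reverse. Both score 0.5 on balanced accuracy, so a per-class view exposes biases that overall accuracy hides.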


Abstract:

The problem of extracting infrequent patterns from streams and building associations between these patterns is becoming increasingly relevant, as many events of interest, such as attacks in network data or unusual stories in news data, occur rarely. The complexity of the problem is compounded when a system is required to deal with data from multiple streams. To address these problems, we present a framework that combines time-based association mining with a pyramidal structure that allows a rolling analysis of the stream and maintains a synopsis of the data without requiring increasing memory resources. We apply the algorithms and show the usefulness of the techniques. © 2007 Crown Copyright.


Abstract:

In today’s high-speed networks, it is becoming increasingly challenging for network managers to understand the nature of the traffic carried in their network. A major problem for traffic analysis in this context is how to extract a concise yet accurate summary of the relevant aggregate traffic flows present in network traces. In this paper, we present two summarization techniques that minimize the size of the traffic flow report generated by a hierarchical cluster analysis tool. By analyzing the accuracy and compaction gain of our approach on a standard benchmark dataset, we demonstrate that it achieves more accurate summaries than those of an existing tool based on frequent itemset mining.